feat(agent-zero): route chat_model through fc-llm-bridge (ADR-088)
Flips Agent Zero's chat_model from direct local Ollama (gemma3:12b via
the 127.0.0.1:11434 sidecar proxy) to the FlowerCore LLM Bridge
(fc:balanced tier, OpenAI-compatible, Anthropic Claude Sonnet under the
hood) so chat turns are spend-tracked and can dispatch to any provider
via a single tier alias.
Scope is intentionally minimal and reversible:
- chat_model: ollama/gemma3:12b/127.0.0.1:11434
→ openai/fc:balanced/fc-llm-bridge internal service URL
- utility_model, embedding_model, browser_model: UNCHANGED
(stay on local 127.0.0.1 Ollama sidecar — no spend, low latency,
not worth routing through the bridge for small-model traffic).
Auth: new A0_SET_chat_model_api_key env var wired to the
fc-llm-bridge-api-keys Secret (field: agent-zero-k8s). The Secret is
synced by a new OnePasswordItem pointing at "FC LLM Bridge API Keys"
in the IAmWorkin vault. Bearer-token auth is now accepted by the
bridge (FlowerCore.LlmBridge@3225f1f).
Rollback: revert this commit; old image v202604231449 is still present
on all RKE2 nodes, and Agent Zero's strategy: Recreate makes the flip
atomic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -94,6 +94,19 @@ subjects:
|
||||
# Connects to a local proxy that routes to workstation Ollama first and edge1 second
|
||||
# Blue Jay profile with 21 tools, 3 prompts, 4 extensions
|
||||
|
||||
---
|
||||
# FC LLM Bridge API key for Agent Zero (ADR-088 chat_model routing).
|
||||
# Syncs from 1Password item "FC LLM Bridge API Keys" (field: agent-zero-k8s).
|
||||
# Consumed by the chat_model only; util / embedding / browser stay on local
|
||||
# Ollama via the 127.0.0.1 sidecar proxy.
|
||||
apiVersion: onepassword.com/v1
|
||||
kind: OnePasswordItem
|
||||
metadata:
|
||||
name: fc-llm-bridge-api-keys
|
||||
namespace: agent-zero
|
||||
spec:
|
||||
itemPath: "vaults/IAmWorkin/items/FC LLM Bridge API Keys"
|
||||
|
||||
---
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
@@ -239,12 +252,15 @@ spec:
|
||||
# Link Blue Jay profile from workspace into Agent Zero's expected path
|
||||
ln -sfn /a0/work/.bluejay/agents/bluejay /a0/agents/bluejay
|
||||
# Write model config BEFORE initialize.sh loads it
|
||||
# The _model_config plugin reads config.json (NOT config.yaml)
|
||||
# Default is OpenRouter; override to the local proxy, which prefers
|
||||
# the workstation and falls back to edge1 automatically.
|
||||
# The _model_config plugin reads config.json (NOT config.yaml).
|
||||
# chat_model: FlowerCore LLM Bridge (ADR-088) — OpenAI-compat,
|
||||
# spend-tracked, tier-aliased (fc:balanced → Claude Sonnet).
|
||||
# api_key comes from A0_SET_chat_model_api_key env var (overrides
|
||||
# config.json). util + embedding stay on local 127.0.0.1 Ollama
|
||||
# proxy (workstation primary, edge1 fallback).
|
||||
mkdir -p /a0/usr/plugins/_model_config
|
||||
cat > /a0/usr/plugins/_model_config/config.json << 'MODELCFG'
|
||||
{"allow_chat_override":true,"chat_model":{"provider":"ollama","name":"gemma3:12b","api_base":"http://127.0.0.1:11434","ctx_length":8192,"ctx_history":0.7,"vision":false,"kwargs":{"temperature":0,"num_ctx":8192}},"utility_model":{"provider":"ollama","name":"qwen2.5:1.5b","api_base":"http://127.0.0.1:11434","ctx_length":8192,"ctx_input":0.7,"kwargs":{"num_ctx":8192}},"embedding_model":{"provider":"ollama","name":"nomic-embed-text","api_base":"http://127.0.0.1:11434","kwargs":{}}}
|
||||
{"allow_chat_override":true,"chat_model":{"provider":"openai","name":"fc:balanced","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc.cluster.local:8080/v1","ctx_length":8192,"ctx_history":0.7,"vision":false,"kwargs":{"temperature":0,"num_ctx":8192}},"utility_model":{"provider":"ollama","name":"qwen2.5:1.5b","api_base":"http://127.0.0.1:11434","ctx_length":8192,"ctx_input":0.7,"kwargs":{"num_ctx":8192}},"embedding_model":{"provider":"ollama","name":"nomic-embed-text","api_base":"http://127.0.0.1:11434","kwargs":{}}}
|
||||
MODELCFG
|
||||
# Strip heredoc indentation
|
||||
sed -i 's/^ //' /a0/usr/plugins/_model_config/config.json
|
||||
@@ -256,13 +272,22 @@ spec:
|
||||
# Agent identity
|
||||
- name: AGENT_NAME
|
||||
value: "Blue Jay (NUC)"
|
||||
# Chat model — workstation primary, edge1 fallback via local proxy
|
||||
# Chat model — routed through FlowerCore LLM Bridge (ADR-088)
|
||||
# so spend is tracked and tier aliases (fc:cheap/fc:balanced/fc:deep)
|
||||
# dispatch to Ollama or Anthropic via a single OpenAI-compat endpoint.
|
||||
# Util / embedding / browser stay on local Ollama via 127.0.0.1 proxy
|
||||
# for zero-latency, zero-cost small-model traffic.
|
||||
- name: A0_SET_chat_model_provider
|
||||
value: "ollama"
|
||||
value: "openai"
|
||||
- name: A0_SET_chat_model_name
|
||||
value: "gemma3:12b"
|
||||
value: "fc:balanced"
|
||||
- name: A0_SET_chat_model_api_base
|
||||
value: "http://127.0.0.1:11434"
|
||||
value: "http://fc-llm-bridge.fc-llm-bridge.svc.cluster.local:8080/v1"
|
||||
- name: A0_SET_chat_model_api_key
|
||||
valueFrom:
|
||||
secretKeyRef:
|
||||
name: fc-llm-bridge-api-keys
|
||||
key: agent-zero-k8s
|
||||
- name: A0_SET_chat_model_ctx_length
|
||||
value: "8192"
|
||||
- name: A0_SET_chat_model_kwargs
|
||||
|
||||
Reference in New Issue
Block a user