Compare commits

..

13 Commits

Author SHA1 Message Date
Andrew Stoltz
fbbc07023b deploy(fc-llm-bridge): roll fc:vision image v202604300022
Source: FlowerCore.LlmBridge@8dd181c (feat: fc:vision route + image
content forwarding). Adds:

- fc:vision tier alias parsing (TryParseTier handles fc:vision,
  FC:VISION, openai/fc:vision, vision)
- Image content forwarding: OpenAi image_url shape (https URL +
  data:[mediaType];base64,... URI) and Anthropic image/source
  passthrough are now promoted to LlmContentBlocks. Text-only
  content-parts arrays still flatten to the legacy joined string.
- DefaultRoutes seeder + appsettings.json gain Vision -> Anthropic +
  claude-sonnet-4-6.

Image built on BLUEJAY-WS, podman save + ctr import to all 3 RKE2
nodes (rke2-server, rke2-agent1, rke2-agent2). Bridge tests: 62/62
green (was 51/51, +11). Backwards-compatible with current chat /
util / embed callers; existing routes unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:26:45 -05:00
Andrew Stoltz
4b0eef0fb0 deploy(fc-llm-bridge): roll alias-fix image v20260430001132 2026-04-30 00:13:48 -05:00
Andrew Stoltz
bb09a3786f fix(knowledge): pin live manifest to bundled edition path 2026-04-29 23:37:02 -05:00
Andrew Stoltz
006dbcf671 fix(agent-zero): export knowledge mcp gate to python builder 2026-04-29 23:32:55 -05:00
Andrew Stoltz
1be71d6ba7 fix(agent-zero): export mcp servers without python indent errors 2026-04-29 23:19:48 -05:00
Andrew Stoltz
0c8026c912 fix(agent-zero): avoid heredoc break in mcp bootstrap 2026-04-29 23:16:54 -05:00
Andrew Stoltz
621ae47e00 fix(agent-zero): repair fc knowledge mcp manifest 2026-04-29 23:11:57 -05:00
Andrew Stoltz
ae6b8c0142 fix(knowledge): keep mcp key env on new token secret 2026-04-29 23:06:07 -05:00
Andrew Stoltz
da55220218 feat(agent-zero): wire fc_knowledge phase1 rollout 2026-04-29 22:59:19 -05:00
Andrew Stoltz
b1ad253dd6 fix(agent-zero): prefix bridge embedding alias for litellm 2026-04-29 21:14:12 -05:00
Andrew Stoltz
ee935f6e07 fix(agent-zero): keep internal util/embed on bridge v1 2026-04-29 21:09:04 -05:00
Andrew Stoltz
2853ee2024 chore(bridge): bump fc-llm-bridge image tag v202604292028 2026-04-29 20:50:55 -05:00
Andrew Stoltz
b4a34e16ca refactor(agent-zero): drop ollama-proxy sidecar (Phase 3) 2026-04-29 20:50:55 -05:00
5 changed files with 153 additions and 125 deletions

View File

@@ -92,16 +92,17 @@ subjects:
# ============================================================================= # =============================================================================
# Agent Zero — AI Agent Web UI (NUC Edition, Blue Jay Profile) # Agent Zero — AI Agent Web UI (NUC Edition, Blue Jay Profile)
# ============================================================================= # =============================================================================
# Chat / utility / embedding lanes route through fc-llm-bridge. Browser keeps # Connects directly to fc-llm-bridge for chat + internal util/embed + browser.
# a local nginx proxy to edge1 Pi 5 + AI HAT+ until the bridge grows a live # Agent Zero's internal util/embed slots stay on the bridge's OpenAI-compatible
# Vision route and the in-pod tools stop calling Ollama directly. # /v1 surface, while browser + corpus-search use the Ollama-compatible /api/*
# Blue Jay profile with 21 tools, 3 prompts, 4 extensions # surface through OLLAMA_HOST.
# Blue Jay profile with 21 tools, 3 prompts, 4 extensions.
--- ---
# FC LLM Bridge API key for Agent Zero (ADR-088 chat / util / embed routing). # FC LLM Bridge API key for Agent Zero (ADR-088 chat/util/embed/browser routing).
# Syncs from 1Password item "FC LLM Bridge API Keys" (field: agent-zero-k8s). # Syncs from 1Password item "FC LLM Bridge API Keys" (field: agent-zero-k8s).
# Consumed by the OpenAI-compatible chat / util / embedding lanes. Browser # Consumed by chat, internal util/embed, browser, and corpus-search requests
# stays on the local Ollama sidecar until fc:vision is configured on the bridge. # that traverse fc-llm-bridge.
apiVersion: onepassword.com/v1 apiVersion: onepassword.com/v1
kind: OnePasswordItem kind: OnePasswordItem
metadata: metadata:
@@ -126,6 +127,18 @@ metadata:
spec: spec:
itemPath: "vaults/IAmWorkin/items/Print.Web API Keys" itemPath: "vaults/IAmWorkin/items/Print.Web API Keys"
---
# Knowledge MCP bearer token for the direct Agent Zero -> Knowledge.Web path.
# The 1Password item currently stores the raw token in its concealed PASSWORD
# field, which the operator syncs to Secret key `password`.
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
name: knowledge-mcp-tokens
namespace: agent-zero
spec:
itemPath: "vaults/IAmWorkin/items/FlowerCore Knowledge MCP Tokens"
--- ---
apiVersion: apps/v1 apiVersion: apps/v1
kind: Deployment kind: Deployment
@@ -137,7 +150,7 @@ metadata:
annotations: annotations:
agent-zero/deployment: "nuc" agent-zero/deployment: "nuc"
agent-zero/profile: "bluejay" agent-zero/profile: "bluejay"
agent-zero/ollama: "edge1 Pi 5 + AI HAT+ only (10.0.57.17:11434) — workstation Ollama is private dev hardware, not a cluster dependency" agent-zero/ollama: "fc-llm-bridge fronts edge1 Pi 5 + AI HAT+ Ollama for cluster browser/corpus-search traffic; internal chat/util/embed route through the bridge's authenticated OpenAI surface"
spec: spec:
replicas: 1 replicas: 1
selector: selector:
@@ -152,19 +165,18 @@ spec:
spec: spec:
serviceAccountName: agent-zero serviceAccountName: agent-zero
initContainers: initContainers:
# Wait for edge1 Ollama to be reachable before starting Agent Zero. # Wait for fc-llm-bridge to be reachable before starting Agent Zero.
# (Workstation Ollama is intentionally NOT in the cluster path.) - name: wait-for-llm-bridge
- name: wait-for-ollama
image: busybox:1.37 image: busybox:1.37
command: ["sh", "-c"] command: ["sh", "-c"]
args: args:
- | - |
echo "Waiting for edge1 Ollama (10.0.57.17:11434)..." echo "Waiting for fc-llm-bridge..."
until wget -qO- --timeout=2 http://10.0.57.17:11434/api/tags >/dev/null 2>&1; do until wget -qO- --timeout=2 http://fc-llm-bridge.fc-llm-bridge.svc:8080/healthz >/dev/null 2>&1; do
echo "edge1 Ollama not ready yet, retrying in 5s..." echo "fc-llm-bridge not ready yet, retrying in 5s..."
sleep 5 sleep 5
done done
echo "edge1 Ollama is reachable." echo "fc-llm-bridge is reachable."
# Assemble the Blue Jay profile directory structure from ConfigMaps. # Assemble the Blue Jay profile directory structure from ConfigMaps.
# ConfigMaps can't create nested dirs, so we copy into the workspace PVC. # ConfigMaps can't create nested dirs, so we copy into the workspace PVC.
- name: setup-bluejay - name: setup-bluejay
@@ -211,73 +223,6 @@ spec:
- name: bluejay-theme - name: bluejay-theme
mountPath: /tmp/bluejay-theme mountPath: /tmp/bluejay-theme
containers: containers:
- name: ollama-proxy
image: nginx:1.27-alpine
command: ["/bin/sh", "-c"]
args:
- |
cat > /etc/nginx/nginx.conf <<'NGINX'
worker_processes 1;
events { worker_connections 1024; }
http {
upstream ollama_upstream {
# edge1 Pi 5 + AI HAT+ is the SOLE upstream.
# Workstation Ollama (BLUEJAY-WS) is private dev hardware and
# MUST NOT be added back here without explicit operator decision —
# adding it would expose the workstation to cluster traffic.
server 10.0.57.17:11434 max_fails=2 fail_timeout=10s;
keepalive 16;
}
server {
listen 11434;
# Local healthcheck — proves nginx itself is alive.
# Must NOT depend on upstream so liveness doesn't restart
# the container when edge1 is slow/offline.
location = /healthz {
access_log off;
return 200 'ok\n';
default_type text/plain;
}
location / {
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $host;
proxy_connect_timeout 5s;
proxy_read_timeout 600s;
proxy_send_timeout 600s;
proxy_next_upstream error timeout invalid_header http_502 http_503 http_504;
proxy_pass http://ollama_upstream;
}
}
}
NGINX
exec nginx -g 'daemon off;'
ports:
- containerPort: 11434
# Readiness probe DOES check upstream so K8s only routes traffic
# when edge1 Ollama is reachable. timeoutSeconds=5 absorbs the Pi's
# slower TCP handshake under load (was timeoutSeconds=1 default →
# 172 historic restarts when the workstation primary path went down,
# before the cluster was repointed to edge1-only on 2026-04-27).
readinessProbe:
httpGet:
path: /api/tags
port: 11434
initialDelaySeconds: 5
periodSeconds: 15
timeoutSeconds: 5
failureThreshold: 3
# Liveness probe hits ONLY local healthz — restarts the container
# only when nginx itself is dead. Decoupling liveness from upstream
# eliminates restart-loops caused by transient upstream outages.
livenessProbe:
httpGet:
path: /healthz
port: 11434
initialDelaySeconds: 10
periodSeconds: 30
timeoutSeconds: 3
failureThreshold: 3
- name: agent-zero - name: agent-zero
image: agent0ai/agent-zero:latest image: agent0ai/agent-zero:latest
command: ["/bin/bash", "-c"] command: ["/bin/bash", "-c"]
@@ -297,26 +242,42 @@ spec:
# The _model_config plugin reads config.json (NOT config.yaml). # The _model_config plugin reads config.json (NOT config.yaml).
# chat_model: FlowerCore LLM Bridge (ADR-088) — OpenAI-compat, # chat_model: FlowerCore LLM Bridge (ADR-088) — OpenAI-compat,
# spend-tracked, tier-aliased (fc:balanced → Claude Sonnet). # spend-tracked, tier-aliased (fc:balanced → Claude Sonnet).
# api_key comes from OPENAI_API_KEY / A0_SET_chat_model_api_key. # api_key comes from A0_SET_chat_model_api_key env var (overrides
# Utility + embedding now share the same bridge surface so Agent # config.json). Utility + embedding stay on the authenticated
# Zero stops talking to Ollama directly for those model lanes. # OpenAI-compatible /v1 surface; browser and direct tool traffic
# Browser stays on the local 127.0.0.1 proxy until the bridge has # use the bridge's Ollama-compatible root via OLLAMA_HOST.
# a live Vision route and the in-pod tools stop calling Ollama.
mkdir -p /a0/usr/plugins/_model_config mkdir -p /a0/usr/plugins/_model_config
cat > /a0/usr/plugins/_model_config/config.json << 'MODELCFG' cat > /a0/usr/plugins/_model_config/config.json << 'MODELCFG'
{"allow_chat_override":true,"chat_model":{"provider":"openai","name":"fc:balanced","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_history":0.7,"vision":false,"kwargs":{"temperature":0,"num_ctx":8192}},"utility_model":{"provider":"openai","name":"fc:cheap","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_input":0.7,"kwargs":{"num_ctx":8192}},"embedding_model":{"provider":"openai","name":"fc:embedding","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","kwargs":{}}} {"allow_chat_override":true,"chat_model":{"provider":"openai","name":"fc:balanced","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_history":0.7,"vision":false,"kwargs":{"temperature":0,"num_ctx":8192}},"utility_model":{"provider":"openai","name":"fc:cheap","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_input":0.7,"kwargs":{"num_ctx":8192}},"embedding_model":{"provider":"openai","name":"openai/fc:embedding","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","kwargs":{}}}
MODELCFG MODELCFG
# Strip heredoc indentation # Strip heredoc indentation
sed -i 's/^ //' /a0/usr/plugins/_model_config/config.json sed -i 's/^ //' /a0/usr/plugins/_model_config/config.json
# Phase 0 Chat MCP pilot: Agent Zero does not interpolate env vars # Phase 0 Chat MCP pilot: Agent Zero does not interpolate env vars
# inside A0_SET_mcp_servers JSON, so build the final JSON here from # inside A0_SET_mcp_servers JSON, so build the final JSON here from
# the secret-backed CHAT_MCP_API_KEY env var before initialize.sh. # the secret-backed env vars before initialize.sh. Keep the local
# Use the in-cluster Chat service URL rather than the public # corpus_search.py tool mounted either way so outage fallback
# Traefik hostname so the pod stays off the private VIP lane that # remains available even when fc_knowledge is not advertised.
# the default egress rule blocks. export KNOWLEDGE_MCP_ENABLED=false
if [ -n "${CHAT_MCP_API_KEY:-}" ]; then if [ -n "${KNOWLEDGE_MCP_BEARER_TOKEN:-}" ]; then
export A0_SET_mcp_servers="{\"mcpServers\":{\"fc-chat\":{\"type\":\"streamable-http\",\"url\":\"http://chat-web.fc-chat.svc/mcp\",\"headers\":{\"X-Api-Key\":\"${CHAT_MCP_API_KEY}\"}}}}" if curl -sf --connect-timeout 3 "${KNOWLEDGE_MCP_HEALTH_URL}" > /dev/null && \
curl -sf --connect-timeout 5 \
-H "Authorization: Bearer ${KNOWLEDGE_MCP_BEARER_TOKEN}" \
-H "Accept: application/json, text/event-stream" \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":"fc-knowledge-bootstrap","method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"agent-zero-bootstrap","version":"1.0"}}}' \
"${KNOWLEDGE_MCP_URL}" > /dev/null; then
export KNOWLEDGE_MCP_ENABLED=true
echo "fc_knowledge enabled from ${KNOWLEDGE_MCP_URL}."
else
echo "fc_knowledge unavailable or unauthorized; keeping local corpus_search.py as the fallback path."
fi fi
else
echo "fc_knowledge token missing; keeping local corpus_search.py as the fallback path."
fi
export A0_SET_mcp_servers="$(
python3 -c 'import json, os; servers = {}; chat_key = os.getenv("CHAT_MCP_API_KEY"); knowledge_enabled = os.getenv("KNOWLEDGE_MCP_ENABLED", "false").lower() == "true"; token = os.getenv("KNOWLEDGE_MCP_BEARER_TOKEN", "") if knowledge_enabled else ""; chat_key and servers.setdefault("fc_chat", {"type": "streamable-http", "url": "http://chat-web.fc-chat.svc/mcp", "headers": {"X-Api-Key": chat_key}}); token and servers.setdefault("fc_knowledge", {"type": "streamable-http", "url": os.getenv("KNOWLEDGE_MCP_URL", "http://knowledge-web.knowledge.svc/mcp"), "headers": {"Authorization": f"Bearer {token}"}}); print(json.dumps({"mcpServers": servers}, separators=(",", ":")))'
)"
# Run the original entrypoint # Run the original entrypoint
exec /exe/initialize.sh $BRANCH exec /exe/initialize.sh $BRANCH
ports: ports:
@@ -328,9 +289,9 @@ spec:
# Chat model — routed through FlowerCore LLM Bridge (ADR-088) # Chat model — routed through FlowerCore LLM Bridge (ADR-088)
# so spend is tracked and tier aliases (fc:cheap/fc:balanced/fc:deep) # so spend is tracked and tier aliases (fc:cheap/fc:balanced/fc:deep)
# dispatch to Ollama or Anthropic via a single OpenAI-compat endpoint. # dispatch to Ollama or Anthropic via a single OpenAI-compat endpoint.
# Utility + embedding now share the bridge/auth surface too. # Internal utility + embedding use the authenticated OpenAI surface,
# Browser stays on local Ollama until the bridge has a live # while browser/corpus-search use the bridge's Ollama-compatible
# Vision route and the in-pod tools stop calling Ollama directly. # endpoints so Agent Zero no longer needs a local proxy sidecar.
- name: A0_SET_chat_model_provider - name: A0_SET_chat_model_provider
value: "openai" value: "openai"
- name: A0_SET_chat_model_name - name: A0_SET_chat_model_name
@@ -352,11 +313,16 @@ spec:
secretKeyRef: secretKeyRef:
name: fc-llm-bridge-api-keys name: fc-llm-bridge-api-keys
key: agent-zero-k8s key: agent-zero-k8s
- name: FC_LLM_BRIDGE_API_KEY
valueFrom:
secretKeyRef:
name: fc-llm-bridge-api-keys
key: agent-zero-k8s
- name: A0_SET_chat_model_ctx_length - name: A0_SET_chat_model_ctx_length
value: "8192" value: "8192"
- name: A0_SET_chat_model_kwargs - name: A0_SET_chat_model_kwargs
value: '{"temperature": 0, "num_ctx": 8192}' value: '{"temperature": 0, "num_ctx": 8192}'
# Utility model — fast small helper tier through the same proxy # Utility model — fast small helper tier through the OpenAI surface
- name: A0_SET_util_model_provider - name: A0_SET_util_model_provider
value: "openai" value: "openai"
- name: A0_SET_util_model_name - name: A0_SET_util_model_name
@@ -365,23 +331,33 @@ spec:
value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1" value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1"
- name: A0_SET_util_model_kwargs - name: A0_SET_util_model_kwargs
value: '{"num_ctx": 2048}' value: '{"num_ctx": 2048}'
# Embedding model — bridge alias to nomic-embed-text on edge1 # Embedding model — authenticated bridge alias to nomic-embed-text.
# LiteLLM's embedding() path needs an explicit provider prefix here
# even though the chat slot can use bare fc:* aliases.
- name: A0_SET_embed_model_provider - name: A0_SET_embed_model_provider
value: "openai" value: "openai"
- name: A0_SET_embed_model_name - name: A0_SET_embed_model_name
value: "fc:embedding" value: "openai/fc:embedding"
- name: A0_SET_embed_model_api_base - name: A0_SET_embed_model_api_base
value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1" value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1"
# Browser model — small Gemma candidate stays on the local proxy # Browser model — small Gemma candidate through the same proxy
# until fc:vision is configured on the bridge.
- name: A0_SET_browser_model_provider - name: A0_SET_browser_model_provider
value: "ollama" value: "ollama"
- name: A0_SET_browser_model_name - name: A0_SET_browser_model_name
value: "gemma3:4b" value: "gemma3:4b"
- name: A0_SET_browser_model_api_base - name: A0_SET_browser_model_api_base
value: "http://127.0.0.1:11434" value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080"
- name: A0_SET_browser_model_api_key
valueFrom:
secretKeyRef:
name: fc-llm-bridge-api-keys
key: agent-zero-k8s
- name: A0_SET_browser_model_vision - name: A0_SET_browser_model_vision
value: "true" value: "true"
- name: OLLAMA_HOST
value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080"
- name: FLOWERCORE_AGENTZERO_OLLAMA_URL
value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080"
# Agent profile — Blue Jay personality, tools, and system prompt # Agent profile — Blue Jay personality, tools, and system prompt
- name: A0_SET_agent_profile - name: A0_SET_agent_profile
value: "bluejay" value: "bluejay"
@@ -404,6 +380,19 @@ spec:
name: chat-mcp-api-key name: chat-mcp-api-key
key: api-key key: api-key
optional: true optional: true
# FlowerCore.Knowledge MCP Phase 1 — direct Agent Zero client path.
# Probe /healthz first, then try an authenticated initialize call.
# If either fails, Agent Zero boots without fc_knowledge and keeps
# the local corpus_search.py tool as the outage-safe path.
- name: KNOWLEDGE_MCP_URL
value: "http://knowledge-web.knowledge.svc/mcp"
- name: KNOWLEDGE_MCP_HEALTH_URL
value: "http://knowledge-web.knowledge.svc/healthz"
- name: KNOWLEDGE_MCP_BEARER_TOKEN
valueFrom:
secretKeyRef:
name: knowledge-mcp-tokens
key: password
# Print.Web — Thermal printer service on edge2. # Print.Web — Thermal printer service on edge2.
# PRINT_WEB_URL: internal HTTP (bypasses Traefik TLS — print_web.py # PRINT_WEB_URL: internal HTTP (bypasses Traefik TLS — print_web.py
# runs in-cluster and can reach edge2 directly on the PROD VLAN). # runs in-cluster and can reach edge2 directly on the PROD VLAN).
@@ -457,7 +446,7 @@ spec:
command: command:
- /bin/bash - /bin/bash
- -c - -c
- "curl -sf http://localhost:80/ > /dev/null && curl -sf --connect-timeout 3 http://127.0.0.1:11434/api/tags > /dev/null" - "curl -sf http://localhost:80/ > /dev/null && curl -sf --connect-timeout 3 http://fc-llm-bridge.fc-llm-bridge.svc:8080/healthz > /dev/null"
periodSeconds: 30 periodSeconds: 30
failureThreshold: 2 failureThreshold: 2
resources: resources:
@@ -595,13 +584,6 @@ spec:
protocol: UDP protocol: UDP
- port: 53 - port: 53
protocol: TCP protocol: TCP
# Ollama on edge1 Pi 5 + AI HAT+ (sole upstream — workstation
# is private dev hardware and intentionally not allowlisted)
- to:
- ipBlock:
cidr: 10.0.57.17/32
ports:
- port: 11434
# Print.Web on edge2 # Print.Web on edge2
- to: - to:
- ipBlock: - ipBlock:
@@ -635,6 +617,17 @@ spec:
protocol: TCP protocol: TCP
- port: 8080 - port: 8080
protocol: TCP protocol: TCP
# FlowerCore.Knowledge MCP (Phase 1) — in-cluster direct route with
# anonymous /healthz probe plus authenticated /mcp initialize/tool calls.
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: knowledge
ports:
- port: 80
protocol: TCP
- port: 8080
protocol: TCP
# Intranet search API — use in-cluster svc so traffic stays inside # Intranet search API — use in-cluster svc so traffic stays inside
# the cluster and is not blocked by the private-range egress denylist. # the cluster and is not blocked by the private-range egress denylist.
- to: - to:

View File

@@ -7209,6 +7209,9 @@ data:
"keep_alive": keep_alive, "keep_alive": keep_alive,
"stream": False, "stream": False,
}) })
curl_headers = ["-H", "Content-Type: application/json"]
if os.environ.get("FC_LLM_BRIDGE_API_KEY"):
curl_headers.extend(["-H", f"X-Api-Key: {os.environ['FC_LLM_BRIDGE_API_KEY']}"])
try: try:
result = subprocess.run( result = subprocess.run(
@@ -7216,7 +7219,7 @@ data:
"curl", "-s", "--max-time", "120", "curl", "-s", "--max-time", "120",
"-X", "POST", "-X", "POST",
f"{api_base}/api/generate", f"{api_base}/api/generate",
"-H", "Content-Type: application/json", *curl_headers,
"-d", payload, "-d", payload,
], ],
capture_output=True, capture_output=True,
@@ -13191,6 +13194,7 @@ data:
"FLOWERCORE_AGENTZERO_OLLAMA_URL", "FLOWERCORE_AGENTZERO_OLLAMA_URL",
"http://host.containers.internal:11434", "http://host.containers.internal:11434",
) )
BRIDGE_API_KEY = os.environ.get("FC_LLM_BRIDGE_API_KEY", "").strip()
EMBEDDING_MODEL = os.environ.get( EMBEDDING_MODEL = os.environ.get(
"FLOWERCORE_FLEET_EMBEDDING_MODEL", "FLOWERCORE_FLEET_EMBEDDING_MODEL",
"nomic-embed-text", "nomic-embed-text",
@@ -13327,10 +13331,13 @@ data:
def _embed(text: str) -> list: def _embed(text: str) -> list:
"""Embed a query via Ollama's /api/embeddings. Single-vector response.""" """Embed a query via Ollama's /api/embeddings. Single-vector response."""
body = json.dumps({"model": EMBEDDING_MODEL, "prompt": text}).encode("utf-8") body = json.dumps({"model": EMBEDDING_MODEL, "prompt": text}).encode("utf-8")
headers = {"Content-Type": "application/json"}
if BRIDGE_API_KEY:
headers["X-Api-Key"] = BRIDGE_API_KEY
req = urllib.request.Request( req = urllib.request.Request(
f"{OLLAMA_BASE_URL.rstrip('/')}/api/embeddings", f"{OLLAMA_BASE_URL.rstrip('/')}/api/embeddings",
data=body, data=body,
headers={"Content-Type": "application/json"}, headers=headers,
) )
with urllib.request.urlopen(req, timeout=60) as resp: with urllib.request.urlopen(req, timeout=60) as resp:
data = json.loads(resp.read().decode("utf-8")) data = json.loads(resp.read().decode("utf-8"))

View File

@@ -97,7 +97,7 @@ spec:
# dotnet.exe publish -c Release -o deploy/app \ # dotnet.exe publish -c Release -o deploy/app \
# src/FlowerCore.LlmBridge.Web/FlowerCore.LlmBridge.Web.csproj # src/FlowerCore.LlmBridge.Web/FlowerCore.LlmBridge.Web.csproj
# podman build -t localhost/fc-llm-bridge:v<tag> -f deploy/Dockerfile.deploy deploy # podman build -t localhost/fc-llm-bridge:v<tag> -f deploy/Dockerfile.deploy deploy
image: localhost/fc-llm-bridge:v202604231520 image: localhost/fc-llm-bridge:v202604300022
imagePullPolicy: Never imagePullPolicy: Never
ports: ports:
- containerPort: 8080 - containerPort: 8080
@@ -116,6 +116,10 @@ spec:
value: "default" value: "default"
- name: FlowerCore__LlmBridge__DefaultAppName - name: FlowerCore__LlmBridge__DefaultAppName
value: "agent-zero" value: "agent-zero"
- name: FlowerCore__LlmBridge__UtilModel
value: "qwen2.5:1.5b"
- name: FlowerCore__LlmBridge__EmbedModel
value: "nomic-embed-text"
# Per-consumer API keys — from OnePasswordItem fc-llm-bridge-api-keys. # Per-consumer API keys — from OnePasswordItem fc-llm-bridge-api-keys.
# Each field becomes a Secret key of the same name. The key-name # Each field becomes a Secret key of the same name. The key-name
# lands in the auth principal's `fc.app` claim for ledger scoping. # lands in the auth principal's `fc.app` claim for ledger scoping.

View File

@@ -5,7 +5,9 @@ Phase 2.4 closed. Pod running, certificate issued (step-ca-acme), PVC
bound (Longhorn 20Gi RWO), ArgoCD `infra-knowledge` synced. `/healthz` bound (Longhorn 20Gi RWO), ArgoCD `infra-knowledge` synced. `/healthz`
returns 200, `/api/v1/editions` returns `[]` (initial-deploy state — no returns 200, `/api/v1/editions` returns `[]` (initial-deploy state — no
*.db files in the PVC yet; Phase 2.5+ admin UI handles bulk *.db files in the PVC yet; Phase 2.5+ admin UI handles bulk
population). population). Phase 1 of the Agent Zero MCP rollout keeps `/healthz`
anonymous and gates `/mcp` behind `Authorization: Bearer <token>` built
from the 1Password item `FlowerCore Knowledge MCP Tokens`.
- Plan: [`../../../FlowerCore.Notes/docs/ai-agents/flowercore-knowledge-service-plan.md`](../../../FlowerCore.Notes/docs/ai-agents/flowercore-knowledge-service-plan.md) - Plan: [`../../../FlowerCore.Notes/docs/ai-agents/flowercore-knowledge-service-plan.md`](../../../FlowerCore.Notes/docs/ai-agents/flowercore-knowledge-service-plan.md)
- Sprint: [`../../../FlowerCore.Notes/docs/ai-station/sprint-e-xxl-plan.md`](../../../FlowerCore.Notes/docs/ai-station/sprint-e-xxl-plan.md) (Track B) - Sprint: [`../../../FlowerCore.Notes/docs/ai-station/sprint-e-xxl-plan.md`](../../../FlowerCore.Notes/docs/ai-station/sprint-e-xxl-plan.md) (Track B)
@@ -19,6 +21,12 @@ search to the rest of the FC ecosystem (Agent Zero, Chat.Web persona
memory, AiStation embeddings explorer, TtsReader chapter context, BMO memory, AiStation embeddings explorer, TtsReader chapter context, BMO
bot, Pi nodes via `fc-index sync`). bot, Pi nodes via `fc-index sync`).
Phase 1 MCP routing is explicit:
- in-cluster Agent Zero → `http://knowledge-web.knowledge.svc/mcp`
- workstation Agent Zero → `https://knowledge.iamworkin.lan/mcp`
- probe URL for both lanes → `/healthz`
## Deployment order (do NOT skip / reorder) ## Deployment order (do NOT skip / reorder)
### 1. FlowerCore.DNS public A record — knowledge.iamworkin.lan -> 10.0.56.200 ### 1. FlowerCore.DNS public A record — knowledge.iamworkin.lan -> 10.0.56.200

View File

@@ -40,16 +40,16 @@ metadata:
labels: labels:
app.kubernetes.io/part-of: bluejay-infra app.kubernetes.io/part-of: bluejay-infra
--- ---
# MCP API key — synced from 1Password so /mcp stays gated without baking # MCP bearer token for the read-only Agent Zero Phase 1 lane. The 1Password
# secrets into Git. The PASSWORD category maps the concealed field to Secret # item currently stores the raw token in its concealed PASSWORD field, which
# key `password`, which the Deployment reads into FlowerCore:Mcp:ApiKey:Key. # the operator syncs into the namespaced Secret key `password`.
apiVersion: onepassword.com/v1 apiVersion: onepassword.com/v1
kind: OnePasswordItem kind: OnePasswordItem
metadata: metadata:
name: knowledge-mcp-api-key name: knowledge-mcp-tokens
namespace: knowledge namespace: knowledge
spec: spec:
itemPath: "vaults/IAmWorkin/items/KnowledgeApiKey" itemPath: "vaults/IAmWorkin/items/FlowerCore Knowledge MCP Tokens"
--- ---
apiVersion: v1 apiVersion: v1
kind: PersistentVolumeClaim kind: PersistentVolumeClaim
@@ -102,8 +102,17 @@ spec:
- name: web - name: web
# Placeholder tag — bump to the image you built + imported to ALL # Placeholder tag — bump to the image you built + imported to ALL
# RKE2 nodes via scripts/deploy-knowledge.sh before applying. # RKE2 nodes via scripts/deploy-knowledge.sh before applying.
image: localhost/fc-knowledge-web:v202604272200 image: localhost/fc-knowledge-web:v20260429232635
imagePullPolicy: Never imagePullPolicy: Never
command:
- /bin/sh
- -c
args:
- |
if [ -n "${KNOWLEDGE_MCP_BEARER_TOKEN:-}" ]; then
export FlowerCore__Mcp__ApiKey__Key="Bearer ${KNOWLEDGE_MCP_BEARER_TOKEN}"
fi
exec dotnet FlowerCore.Knowledge.Web.dll
ports: ports:
- containerPort: 8080 - containerPort: 8080
name: http name: http
@@ -115,7 +124,7 @@ spec:
- name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT - name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT
value: "false" value: "false"
# Vector-store directory + embedding model + edition profile dir. # Vector-store directory + embedding model + edition profile dir.
# Profile JSON is baked into the image at /app/editions via the # Profile JSON is baked into the image at /home/app/editions via the
# csproj Content-link from FlowerCore.Common/editions/. # csproj Content-link from FlowerCore.Common/editions/.
- name: Knowledge__VectorStoresDirectory - name: Knowledge__VectorStoresDirectory
value: "/data/vector-stores" value: "/data/vector-stores"
@@ -126,7 +135,7 @@ spec:
- name: Knowledge__MaxLimit - name: Knowledge__MaxLimit
value: "50" value: "50"
- name: FlowerCore__Editions__ProfileDirectory - name: FlowerCore__Editions__ProfileDirectory
value: "/app/editions" value: "/home/app/editions"
# Embed via edge1 Pi 5 + AI HAT+ (10.0.57.17:11434). Cluster # Embed via edge1 Pi 5 + AI HAT+ (10.0.57.17:11434). Cluster
# services do not depend on BLUEJAY-WS (private dev hardware) per # services do not depend on BLUEJAY-WS (private dev hardware) per
# bluejay-infra@0f9d56e. Query-time embedding is fast enough on # bluejay-infra@0f9d56e. Query-time embedding is fast enough on
@@ -138,7 +147,14 @@ spec:
- name: FlowerCore__Mcp__ApiKey__Key - name: FlowerCore__Mcp__ApiKey__Key
valueFrom: valueFrom:
secretKeyRef: secretKeyRef:
name: knowledge-mcp-api-key name: knowledge-mcp-tokens
key: password
- name: FlowerCore__Mcp__ApiKey__HeaderName
value: "Authorization"
- name: KNOWLEDGE_MCP_BEARER_TOKEN
valueFrom:
secretKeyRef:
name: knowledge-mcp-tokens
key: password key: password
resources: resources:
requests: requests:
@@ -185,7 +201,7 @@ spec:
- name: tmp - name: tmp
mountPath: /tmp mountPath: /tmp
- name: logs - name: logs
mountPath: /app/logs mountPath: /home/app/logs
volumes: volumes:
- name: vector-store - name: vector-store
persistentVolumeClaim: persistentVolumeClaim: