fix(agent-zero): prefix bridge embedding alias for litellm

fix(agent-zero): keep internal util/embed on bridge v1
chore(bridge): bump fc-llm-bridge image tag v202604292028
2026-04-29 21:14:12 -05:00 · 2026-04-29 21:09:04 -05:00 · 2026-04-29 20:50:55 -05:00 · 2026-04-29 20:50:55 -05:00 · 2026-04-29 19:14:01 -05:00 · 2026-04-29 18:04:43 -05:00
9 changed files with 706 additions and 116 deletions
--- a/apps/agent-zero/agent-zero.yaml
+++ b/apps/agent-zero/agent-zero.yaml
@@ -92,14 +92,17 @@ subjects:
 # =============================================================================
 # Agent Zero — AI Agent Web UI (NUC Edition, Blue Jay Profile)
 # =============================================================================
-# Connects to a local nginx proxy that routes to edge1 Pi 5 + AI HAT+ Ollama only
-# Blue Jay profile with 21 tools, 3 prompts, 4 extensions
+# Connects directly to fc-llm-bridge for chat + internal util/embed + browser.
+# Agent Zero's internal util/embed slots stay on the bridge's OpenAI-compatible
+# /v1 surface, while browser + corpus-search use the Ollama-compatible /api/*
+# surface through OLLAMA_HOST.
+# Blue Jay profile with 21 tools, 3 prompts, 4 extensions.

 ---
-# FC LLM Bridge API key for Agent Zero (ADR-088 chat_model routing).
+# FC LLM Bridge API key for Agent Zero (ADR-088 chat/util/embed/browser routing).
 # Syncs from 1Password item "FC LLM Bridge API Keys" (field: agent-zero-k8s).
-# Consumed by the chat_model only; util / embedding / browser stay on local
-# Ollama via the 127.0.0.1 sidecar proxy.
+# Consumed by chat, internal util/embed, browser, and corpus-search requests
+# that traverse fc-llm-bridge.
 apiVersion: onepassword.com/v1
 kind: OnePasswordItem
 metadata:
@@ -108,6 +111,22 @@ metadata:
 spec:
  itemPath: "vaults/IAmWorkin/items/FC LLM Bridge API Keys"

+---
+# Print.Web API key for Agent Zero's print_web.py Python tool.
+# Syncs from 1Password item "Print.Web API Keys" (password field = API key).
+# The print_web.py tool reads PRINT_WEB_API_KEY env var for all HTTP requests
+# to the thermal print service (GET /api/mcp/tools, POST /api/print/*, etc.).
+# Note: Print.Web uses the legacy REST MCP shape (/api/mcp/tools/*), not the
+# streamable-http MCP protocol. The print_web Python tool bridges this gap
+# and is already present in bluejay-tools ConfigMaps.
+apiVersion: onepassword.com/v1
+kind: OnePasswordItem
+metadata:
+  name: print-web-api-keys
+  namespace: agent-zero
+spec:
+  itemPath: "vaults/IAmWorkin/items/Print.Web API Keys"
+
 ---
 apiVersion: apps/v1
 kind: Deployment
@@ -119,7 +138,7 @@ metadata:
  annotations:
    agent-zero/deployment: "nuc"
    agent-zero/profile: "bluejay"
-    agent-zero/ollama: "edge1 Pi 5 + AI HAT+ only (10.0.57.17:11434) — workstation Ollama is private dev hardware, not a cluster dependency"
+    agent-zero/ollama: "fc-llm-bridge fronts edge1 Pi 5 + AI HAT+ Ollama for cluster browser/corpus-search traffic; internal chat/util/embed route through the bridge's authenticated OpenAI surface"
 spec:
  replicas: 1
  selector:
@@ -134,19 +153,18 @@ spec:
    spec:
      serviceAccountName: agent-zero
      initContainers:
-        # Wait for edge1 Ollama to be reachable before starting Agent Zero.
-        # (Workstation Ollama is intentionally NOT in the cluster path.)
-        - name: wait-for-ollama
+        # Wait for fc-llm-bridge to be reachable before starting Agent Zero.
+        - name: wait-for-llm-bridge
          image: busybox:1.37
          command: ["sh", "-c"]
          args:
            - |
-              echo "Waiting for edge1 Ollama (10.0.57.17:11434)..."
-              until wget -qO- --timeout=2 http://10.0.57.17:11434/api/tags >/dev/null 2>&1; do
-                echo "edge1 Ollama not ready yet, retrying in 5s..."
+              echo "Waiting for fc-llm-bridge..."
+              until wget -qO- --timeout=2 http://fc-llm-bridge.fc-llm-bridge.svc:8080/healthz >/dev/null 2>&1; do
+                echo "fc-llm-bridge not ready yet, retrying in 5s..."
                sleep 5
              done
-              echo "edge1 Ollama is reachable."
+              echo "fc-llm-bridge is reachable."
        # Assemble the Blue Jay profile directory structure from ConfigMaps.
        # ConfigMaps can't create nested dirs, so we copy into the workspace PVC.
        - name: setup-bluejay
@@ -193,73 +211,6 @@ spec:
            - name: bluejay-theme
              mountPath: /tmp/bluejay-theme
      containers:
-        - name: ollama-proxy
-          image: nginx:1.27-alpine
-          command: ["/bin/sh", "-c"]
-          args:
-            - |
-              cat > /etc/nginx/nginx.conf <<'NGINX'
-              worker_processes  1;
-              events { worker_connections 1024; }
-              http {
-                upstream ollama_upstream {
-                  # edge1 Pi 5 + AI HAT+ is the SOLE upstream.
-                  # Workstation Ollama (BLUEJAY-WS) is private dev hardware and
-                  # MUST NOT be added back here without explicit operator decision —
-                  # adding it would expose the workstation to cluster traffic.
-                  server 10.0.57.17:11434 max_fails=2 fail_timeout=10s;
-                  keepalive 16;
-                }
-                server {
-                  listen 11434;
-                  # Local healthcheck — proves nginx itself is alive.
-                  # Must NOT depend on upstream so liveness doesn't restart
-                  # the container when edge1 is slow/offline.
-                  location = /healthz {
-                    access_log off;
-                    return 200 'ok\n';
-                    default_type text/plain;
-                  }
-                  location / {
-                    proxy_http_version 1.1;
-                    proxy_set_header Connection "";
-                    proxy_set_header Host $host;
-                    proxy_connect_timeout 5s;
-                    proxy_read_timeout 600s;
-                    proxy_send_timeout 600s;
-                    proxy_next_upstream error timeout invalid_header http_502 http_503 http_504;
-                    proxy_pass http://ollama_upstream;
-                  }
-                }
-              }
-              NGINX
-              exec nginx -g 'daemon off;'
-          ports:
-            - containerPort: 11434
-          # Readiness probe DOES check upstream so K8s only routes traffic
-          # when edge1 Ollama is reachable. timeoutSeconds=5 absorbs the Pi's
-          # slower TCP handshake under load (was timeoutSeconds=1 default →
-          # 172 historic restarts when the workstation primary path went down,
-          # before the cluster was repointed to edge1-only on 2026-04-27).
-          readinessProbe:
-            httpGet:
-              path: /api/tags
-              port: 11434
-            initialDelaySeconds: 5
-            periodSeconds: 15
-            timeoutSeconds: 5
-            failureThreshold: 3
-          # Liveness probe hits ONLY local healthz — restarts the container
-          # only when nginx itself is dead. Decoupling liveness from upstream
-          # eliminates restart-loops caused by transient upstream outages.
-          livenessProbe:
-            httpGet:
-              path: /healthz
-              port: 11434
-            initialDelaySeconds: 10
-            periodSeconds: 30
-            timeoutSeconds: 3
-            failureThreshold: 3
        - name: agent-zero
          image: agent0ai/agent-zero:latest
          command: ["/bin/bash", "-c"]
@@ -280,12 +231,12 @@ spec:
              # chat_model: FlowerCore LLM Bridge (ADR-088) — OpenAI-compat,
              # spend-tracked, tier-aliased (fc:balanced → Claude Sonnet).
              # api_key comes from A0_SET_chat_model_api_key env var (overrides
-              # config.json). util + embedding go to local 127.0.0.1 nginx
-              # proxy which routes to edge1 Pi 5 + AI HAT+ ONLY (workstation
-              # is private dev hardware, intentionally not in the cluster path).
+              # config.json). Utility + embedding stay on the authenticated
+              # OpenAI-compatible /v1 surface; browser and direct tool traffic
+              # use the bridge's Ollama-compatible root via OLLAMA_HOST.
              mkdir -p /a0/usr/plugins/_model_config
              cat > /a0/usr/plugins/_model_config/config.json << 'MODELCFG'
-              {"allow_chat_override":true,"chat_model":{"provider":"openai","name":"fc:balanced","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_history":0.7,"vision":false,"kwargs":{"temperature":0,"num_ctx":8192}},"utility_model":{"provider":"ollama","name":"qwen2.5:1.5b","api_base":"http://127.0.0.1:11434","ctx_length":8192,"ctx_input":0.7,"kwargs":{"num_ctx":8192}},"embedding_model":{"provider":"ollama","name":"nomic-embed-text","api_base":"http://127.0.0.1:11434","kwargs":{}}}
+              {"allow_chat_override":true,"chat_model":{"provider":"openai","name":"fc:balanced","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_history":0.7,"vision":false,"kwargs":{"temperature":0,"num_ctx":8192}},"utility_model":{"provider":"openai","name":"fc:cheap","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_input":0.7,"kwargs":{"num_ctx":8192}},"embedding_model":{"provider":"openai","name":"openai/fc:embedding","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","kwargs":{}}}
              MODELCFG
              # Strip heredoc indentation
              sed -i 's/^              //' /a0/usr/plugins/_model_config/config.json
@@ -309,8 +260,9 @@ spec:
            # Chat model — routed through FlowerCore LLM Bridge (ADR-088)
            # so spend is tracked and tier aliases (fc:cheap/fc:balanced/fc:deep)
            # dispatch to Ollama or Anthropic via a single OpenAI-compat endpoint.
-            # Util / embedding / browser stay on local Ollama via 127.0.0.1 proxy
-            # for zero-latency, zero-cost small-model traffic.
+            # Internal utility + embedding use the authenticated OpenAI surface,
+            # while browser/corpus-search use the bridge's Ollama-compatible
+            # endpoints so Agent Zero no longer needs a local proxy sidecar.
            - name: A0_SET_chat_model_provider
              value: "openai"
            - name: A0_SET_chat_model_name
@@ -332,35 +284,51 @@ spec:
                secretKeyRef:
                  name: fc-llm-bridge-api-keys
                  key: agent-zero-k8s
+            - name: FC_LLM_BRIDGE_API_KEY
+              valueFrom:
+                secretKeyRef:
+                  name: fc-llm-bridge-api-keys
+                  key: agent-zero-k8s
            - name: A0_SET_chat_model_ctx_length
              value: "8192"
            - name: A0_SET_chat_model_kwargs
              value: '{"temperature": 0, "num_ctx": 8192}'
-            # Utility model — fast small helper tier through the same proxy
+            # Utility model — fast small helper tier through the OpenAI surface
            - name: A0_SET_util_model_provider
-              value: "ollama"
+              value: "openai"
            - name: A0_SET_util_model_name
-              value: "qwen2.5:1.5b"
+              value: "fc:cheap"
            - name: A0_SET_util_model_api_base
-              value: "http://127.0.0.1:11434"
+              value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1"
            - name: A0_SET_util_model_kwargs
              value: '{"num_ctx": 2048}'
-            # Embedding model — nomic through the same proxy
+            # Embedding model — authenticated bridge alias to nomic-embed-text.
+            # LiteLLM's embedding() path needs an explicit provider prefix here
+            # even though the chat slot can use bare fc:* aliases.
            - name: A0_SET_embed_model_provider
-              value: "ollama"
+              value: "openai"
            - name: A0_SET_embed_model_name
-              value: "nomic-embed-text"
+              value: "openai/fc:embedding"
            - name: A0_SET_embed_model_api_base
-              value: "http://127.0.0.1:11434"
+              value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1"
            # Browser model — small Gemma candidate through the same proxy
            - name: A0_SET_browser_model_provider
              value: "ollama"
            - name: A0_SET_browser_model_name
              value: "gemma3:4b"
            - name: A0_SET_browser_model_api_base
-              value: "http://127.0.0.1:11434"
+              value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080"
+            - name: A0_SET_browser_model_api_key
+              valueFrom:
+                secretKeyRef:
+                  name: fc-llm-bridge-api-keys
+                  key: agent-zero-k8s
            - name: A0_SET_browser_model_vision
              value: "true"
+            - name: OLLAMA_HOST
+              value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080"
+            - name: FLOWERCORE_AGENTZERO_OLLAMA_URL
+              value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080"
            # Agent profile — Blue Jay personality, tools, and system prompt
            - name: A0_SET_agent_profile
              value: "bluejay"
@@ -383,9 +351,25 @@ spec:
                  name: chat-mcp-api-key
                  key: api-key
                  optional: true
-            # Print.Web — Thermal printer service on edge2
+            # Print.Web — Thermal printer service on edge2.
+            # PRINT_WEB_URL: internal HTTP (bypasses Traefik TLS — print_web.py
+            # runs in-cluster and can reach edge2 directly on the PROD VLAN).
+            # PRINT_WEB_API_KEY: from 1Password "Print.Web API Keys" password field,
+            # synced by the print-web-api-keys OnePasswordItem CRD above.
+            # The print_web.py Python tool reads both env vars for all HTTP calls.
            - name: PRINT_WEB_URL
              value: "http://10.0.57.16:5200"
+            - name: PRINT_WEB_API_KEY
+              valueFrom:
+                secretKeyRef:
+                  name: print-web-api-keys
+                  key: password
+            # Intranet search: use in-cluster HTTP (no step-ca TLS needed).
+            # Separately, corpus_search.py reads FLOWERCORE_FLEET_VECTOR_DIR, but that
+            # mount is not on the cluster yet (BLUEJAY-WS only). The tool returns a
+            # "no DB found" message with rebuild instructions rather than crashing.
+            - name: FLOWERCORE_INTRANET_URL
+              value: "http://intranet-web.intranet.svc:5300"
            # Kubernetes
            - name: KUBERNETES_SERVICE_HOST
              value: "kubernetes.default.svc"
@@ -420,7 +404,7 @@ spec:
              command:
                - /bin/bash
                - -c
-                - "curl -sf http://localhost:80/ > /dev/null && curl -sf --connect-timeout 3 http://127.0.0.1:11434/api/tags > /dev/null"
+                - "curl -sf http://localhost:80/ > /dev/null && curl -sf --connect-timeout 3 http://fc-llm-bridge.fc-llm-bridge.svc:8080/healthz > /dev/null"
            periodSeconds: 30
            failureThreshold: 2
          resources:
@@ -558,13 +542,6 @@ spec:
          protocol: UDP
        - port: 53
          protocol: TCP
-    # Ollama on edge1 Pi 5 + AI HAT+ (sole upstream — workstation
-    # is private dev hardware and intentionally not allowlisted)
-    - to:
-        - ipBlock:
-            cidr: 10.0.57.17/32
-      ports:
-        - port: 11434
    # Print.Web on edge2
    - to:
        - ipBlock:
@@ -598,6 +575,15 @@ spec:
          protocol: TCP
        - port: 8080
          protocol: TCP
+    # Intranet search API — use in-cluster svc so traffic stays inside
+    # the cluster and is not blocked by the private-range egress denylist.
+    - to:
+        - namespaceSelector:
+            matchLabels:
+              kubernetes.io/metadata.name: intranet
+      ports:
+        - port: 5300
+          protocol: TCP
    # Allow internet (for kubectl image pull, etc)
    - to:
        - ipBlock:
--- a/apps/agent-zero/configmaps-bluejay.yaml
+++ b/apps/agent-zero/configmaps-bluejay.yaml
@@ -7209,6 +7209,9 @@ data:
            "keep_alive": keep_alive,
            "stream": False,
        })
+        curl_headers = ["-H", "Content-Type: application/json"]
+        if os.environ.get("FC_LLM_BRIDGE_API_KEY"):
+            curl_headers.extend(["-H", f"X-Api-Key: {os.environ['FC_LLM_BRIDGE_API_KEY']}"])

        try:
            result = subprocess.run(
@@ -7216,7 +7219,7 @@ data:
                    "curl", "-s", "--max-time", "120",
                    "-X", "POST",
                    f"{api_base}/api/generate",
-                    "-H", "Content-Type: application/json",
+                    *curl_headers,
                    "-d", payload,
                ],
                capture_output=True,
@@ -13150,6 +13153,451 @@ data:
    - PowerShell 5.1 compatibility is assumed (no PowerShell 7+ features).
    - All commands run with `-NoProfile -NonInteractive` flags for clean execution.
    """
+  corpus_search.py: |
+    # FlowerCore Fleet Corpus Vector Search Tool
+    #
+    # Queries the AiStation-built SqliteVecVectorStore DB at /a0/usr/vectors/fleet.db
+    # (bind-mounted read-only from /var/lib/flowercore/vector-stores/ on the host).
+    # Embeds the query through Ollama's nomic-embed-text model, computes cosine
+    # similarity against every stored chunk in pure Python (no numpy — not present
+    # in the container), and returns the top-K nearest neighbors with source metadata.
+    #
+    # This is the offline-friendly counterpart to `intranet_search` (which hits the
+    # Intranet's live REST API). Use it for Bible/Greek/Hebrew/Strong's lookups and
+    # anywhere the workstation has a newer DB than the Intranet one. The store is
+    # refreshed by `aistation-indexer build <edition>` — see the FlowerCore.Knowledge
+    # ADR at docs/ai-agents/flowercore-knowledge-service-plan.md.
+
+    import json
+    import math
+    import os
+    import sqlite3
+    import urllib.request
+    from pathlib import Path
+
+    from python.helpers.tool import Tool, Response
+
+
+    DEFAULT_VECTORS_DIR = os.environ.get(
+        "FLOWERCORE_FLEET_VECTOR_DIR",
+        "/a0/usr/vectors",
+    )
+    # When the caller doesn't pick an explicit DB, prefer the biggest fleet tier
+    # present on disk. Workstation → pi-edge → bmo-bot.
+    PREFERRED_DB_ORDER = [
+        os.environ.get("FLOWERCORE_FLEET_VECTOR_DB", ""),
+        "fleet-workstation-full.db",
+        "fleet-pi-edge.db",
+        "fleet-bmo-bot.db",
+    ]
+    OLLAMA_BASE_URL = os.environ.get(
+        "FLOWERCORE_AGENTZERO_OLLAMA_URL",
+        "http://host.containers.internal:11434",
+    )
+    BRIDGE_API_KEY = os.environ.get("FC_LLM_BRIDGE_API_KEY", "").strip()
+    EMBEDDING_MODEL = os.environ.get(
+        "FLOWERCORE_FLEET_EMBEDDING_MODEL",
+        "nomic-embed-text",
+    )
+
+
+    class CorpusSearch(Tool):
+        async def execute(self, **kwargs) -> Response:
+            """
+            Semantic search over the FlowerCore fleet corpus (Bible texts, lexicons,
+            dictionaries, morphology) pre-indexed by aistation-indexer.
+
+            Args (via self.args):
+                query (str): Search query text. Required unless action=stats.
+                limit (int): Max results. Default 8.
+                index (str): Optional index name filter ("bible-texts", "lexicons",
+                             "dictionaries", "morphology"). Default: all indexes.
+                repo (str): Optional repo filter (e.g. "world-english-bible").
+                db (str): Override DB path OR file name inside FLOWERCORE_FLEET_VECTOR_DIR
+                          (defaults to /a0/usr/vectors). If omitted, the largest
+                          fleet tier present on disk is picked automatically.
+                action (str): Optional. "stats" returns an inventory of all fleet DBs
+                             visible to the tool (names, sizes, index counts, chunk
+                             counts, last-built timestamps). No embedding call.
+
+            Returns:
+                Response with ranked chunks (score, source, text preview) OR
+                (when action=stats) a markdown inventory of available fleet DBs.
+            """
+            query = (self.args.get("query") or "").strip()
+            limit = int(self.args.get("limit") or 8)
+            index_filter = (self.args.get("index") or "").strip()
+            repo_filter = (self.args.get("repo") or "").strip()
+            db_override = (self.args.get("db") or "").strip()
+            action = (self.args.get("action") or "").strip().lower()
+
+            if action == "stats":
+                return Response(message=_render_stats(), break_loop=False)
+
+            if not query:
+                return Response(
+                    message=(
+                        "Error: 'query' is required unless action=stats.\n"
+                        "Example: query=\"what does Genesis 1:1 say\" limit=5\n"
+                        "Inventory: action=stats"
+                    ),
+                    break_loop=False,
+                )
+
+            db = _resolve_db(db_override)
+            if db is None:
+                return Response(
+                    message=(
+                        f"Error: no fleet vector DB found under {DEFAULT_VECTORS_DIR}.\n"
+                        "Host side: run `aistation-indexer build fleet-workstation-full`\n"
+                        "(or `fleet-pi-edge`/`fleet-bmo-bot`) to produce\n"
+                        "`/var/lib/flowercore/vector-stores/<slug>.db`, then confirm the\n"
+                        "Podman unit mounts that directory into `/a0/usr/vectors:ro`."
+                    ),
+                    break_loop=False,
+                )
+
+            try:
+                query_vec = _embed(query)
+            except Exception as e:
+                return Response(
+                    message=f"Error: failed to embed query via Ollama at {OLLAMA_BASE_URL}: {e}",
+                    break_loop=False,
+                )
+
+            try:
+                hits = _search(db, query_vec, index_filter, repo_filter, limit)
+            except Exception as e:
+                return Response(
+                    message=f"Error: corpus search failed: {e}",
+                    break_loop=False,
+                )
+
+            if not hits:
+                return Response(
+                    message=(
+                        f"No matches for '{query}' in {db.name}.\n"
+                        f"Indexes available: " + _list_indexes_summary(db)
+                    ),
+                    break_loop=False,
+                )
+
+            lines = [f"**Corpus search: `{query}`**  (top {len(hits)} of {limit} requested, DB={db.name})", ""]
+            for rank, h in enumerate(hits, 1):
+                passage = h.get("passage") or ""
+                lang = h.get("language") or ""
+                meta_bits = [x for x in (h["index"], h["repo"], passage, lang) if x]
+                meta = "  ·  ".join(meta_bits)
+                preview = h["text"]
+                if len(preview) > 320:
+                    preview = preview[:320].rstrip() + "…"
+                lines.append(f"{rank}. **{h['score']:.3f}**  {meta}")
+                lines.append(f"   `{h['source']}`")
+                lines.append(f"   {preview}")
+                lines.append("")
+
+            return Response(message="\n".join(lines).rstrip() + "\n", break_loop=False)
+
+
+    def _resolve_db(override: str) -> "Path | None":
+        """Pick a fleet DB by explicit path, explicit filename, or preferred order."""
+        vectors_dir = Path(DEFAULT_VECTORS_DIR)
+        if override:
+            # An absolute path that points at a real file wins outright.
+            p = Path(override)
+            if p.is_absolute() and p.exists():
+                return p
+            # Otherwise treat it as a filename within the vectors dir.
+            candidate = vectors_dir / override
+            if candidate.exists():
+                return candidate
+            return None
+
+        for name in PREFERRED_DB_ORDER:
+            if not name:
+                continue
+            p = Path(name) if Path(name).is_absolute() else vectors_dir / name
+            if p.exists():
+                return p
+
+        # Fallback: any *.db in the dir, largest first.
+        if vectors_dir.is_dir():
+            candidates = sorted(vectors_dir.glob("*.db"), key=lambda p: p.stat().st_size, reverse=True)
+            if candidates:
+                return candidates[0]
+        return None
+
+
+    def _embed(text: str) -> list:
+        """Embed a query via Ollama's /api/embeddings. Single-vector response."""
+        body = json.dumps({"model": EMBEDDING_MODEL, "prompt": text}).encode("utf-8")
+        headers = {"Content-Type": "application/json"}
+        if BRIDGE_API_KEY:
+            headers["X-Api-Key"] = BRIDGE_API_KEY
+        req = urllib.request.Request(
+            f"{OLLAMA_BASE_URL.rstrip('/')}/api/embeddings",
+            data=body,
+            headers=headers,
+        )
+        with urllib.request.urlopen(req, timeout=60) as resp:
+            data = json.loads(resp.read().decode("utf-8"))
+        vec = data.get("embedding")
+        if not isinstance(vec, list) or not vec:
+            raise RuntimeError(f"Ollama returned no embedding: {data}")
+        return [float(x) for x in vec]
+
+
+    def _cosine(a: list, b: list) -> float:
+        """Cosine similarity in pure Python — no numpy in the A0 container."""
+        # zip() stops at the shorter — AiStation DB guarantees same dim per index.
+        dot = 0.0
+        na = 0.0
+        nb = 0.0
+        for x, y in zip(a, b):
+            dot += x * y
+            na += x * x
+            nb += y * y
+        if na == 0.0 or nb == 0.0:
+            return 0.0
+        return dot / (math.sqrt(na) * math.sqrt(nb))
+
+
+    def _search(db_path: Path, query_vec: list, index_filter: str, repo_filter: str, limit: int) -> list:
+        """Load entries, compute cosine, return top-K.
+
+        SqliteVecVectorStore schema:
+          VectorIndexes(IndexName, Dimensions, UpdatedAtUtc)
+          VectorEntries(IndexName, ChunkId, TextContent, SourceRepo, SourceFile,
+                        Book, Chapter, VerseRange, Language, ContentType, License,
+                        EstimatedTokens, EmbeddingJson)
+
+        Embeddings are stored as JSON arrays in EmbeddingJson; similarity is computed
+        in Python. For ~100k chunks × 768 dims this takes a couple of seconds on a
+        workstation, which is acceptable for interactive A0 use.
+        """
+        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
+        try:
+            sql = [
+                "SELECT IndexName, ChunkId, TextContent, SourceRepo, SourceFile, ",
+                "       Book, Chapter, VerseRange, Language, EmbeddingJson ",
+                "FROM VectorEntries",
+            ]
+            where = []
+            params = []
+            if index_filter:
+                where.append("IndexName = ?")
+                params.append(index_filter)
+            if repo_filter:
+                where.append("SourceRepo LIKE ?")
+                params.append(f"%{repo_filter}%")
+            if where:
+                sql.append(" WHERE " + " AND ".join(where))
+            sql.append(";")
+
+            cursor = conn.execute("".join(sql), params)
+
+            # Min-heap by (score, ...) would be faster but for interactive use we
+            # just sort at the end — simpler and readable.
+            scored = []
+            for row in cursor:
+                idx, chunk_id, text, repo, source_file, book, chapter, verses, lang, emb_json = row
+                try:
+                    vec = json.loads(emb_json)
+                except (json.JSONDecodeError, TypeError):
+                    continue
+                score = _cosine(query_vec, vec)
+                passage = None
+                if book and chapter:
+                    passage = f"{book} {chapter}"
+                    if verses:
+                        passage += f":{verses}"
+                scored.append((score, {
+                    "index": idx,
+                    "chunk_id": chunk_id,
+                    "text": text,
+                    "repo": repo or "",
+                    "source": source_file or "",
+                    "passage": passage or "",
+                    "language": lang or "",
+                }))
+            scored.sort(key=lambda t: t[0], reverse=True)
+            return [{"score": s, **meta} for s, meta in scored[:limit]]
+        finally:
+            conn.close()
+
+
+    def _render_stats() -> str:
+        """Markdown inventory of every *.db in FLOWERCORE_FLEET_VECTOR_DIR."""
+        vectors_dir = Path(DEFAULT_VECTORS_DIR)
+        if not vectors_dir.is_dir():
+            return f"No fleet vector dir mounted at {vectors_dir}. Ask the host operator to build an index with scripts/agent-zero/build-fleet-index.sh."
+
+        dbs = sorted(vectors_dir.glob("*.db"))
+        if not dbs:
+            return f"No fleet DBs present under {vectors_dir}. Run `scripts/agent-zero/build-fleet-index.sh fleet-workstation-full` on the host."
+
+        lines = [f"**Fleet vector DB inventory** ({vectors_dir})", ""]
+        for db in dbs:
+            size_mb = db.stat().st_size / (1024 * 1024)
+            lines.append(f"### `{db.name}` ({size_mb:.1f} MB)")
+            try:
+                conn = sqlite3.connect(f"file:{db}?mode=ro", uri=True)
+                try:
+                    idx_rows = conn.execute(
+                        "SELECT IndexName, Dimensions, UpdatedAtUtc FROM VectorIndexes ORDER BY IndexName;"
+                    ).fetchall()
+                    if not idx_rows:
+                        lines.append("- (no indexes registered)")
+                    else:
+                        counts = dict(conn.execute(
+                            "SELECT IndexName, COUNT(*) FROM VectorEntries GROUP BY IndexName;"
+                        ).fetchall())
+                        for name, dim, updated in idx_rows:
+                            count = counts.get(name, 0)
+                            lines.append(f"- **{name}** — {count:,} chunks × {dim}d  (built {updated})")
+                finally:
+                    conn.close()
+            except Exception as e:
+                lines.append(f"- (inspect failed: {e})")
+            lines.append("")
+
+        lines.append(f"**Tool defaults:** embedding model `{EMBEDDING_MODEL}`, Ollama at `{OLLAMA_BASE_URL}`. Pick a DB with `db=<filename>`; filter by `index=<name>`/`repo=<substring>`.")
+        return "\n".join(lines).rstrip() + "\n"
+
+
+    def _list_indexes_summary(db_path: Path) -> str:
+        try:
+            conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
+            try:
+                rows = conn.execute(
+                    "SELECT IndexName, Dimensions, "
+                    "  (SELECT COUNT(*) FROM VectorEntries WHERE VectorEntries.IndexName = VectorIndexes.IndexName) "
+                    "FROM VectorIndexes ORDER BY IndexName;"
+                ).fetchall()
+                if not rows:
+                    return "(no indexes)"
+                return ", ".join(f"{r[0]}({r[2]}×{r[1]}d)" for r in rows)
+            finally:
+                conn.close()
+        except Exception as e:
+            return f"(couldn't list: {e})"
+
+  intranet_search.py: |
+    # Intranet Vector Search Tool
+    # Queries the Blue Jay Lab Intranet's Shared.Indexing RAG corpus over its
+    # live REST API (https://intranet.iamworkin.lan/search). Returns ranked chunks
+    # with source file paths and scores.
+
+    import json
+    import os
+    import ssl
+    import urllib.parse
+    import urllib.request
+
+    from python.helpers.tool import Tool, Response
+
+
+    INTRANET_BASE_URL = os.environ.get(
+        "FLOWERCORE_INTRANET_URL",
+        "https://intranet.iamworkin.lan",
+    )
+    STEPCA_ROOT_CRT = "/a0/usr/ca/stepca-root.crt"
+
+
+    def _ssl_ctx() -> ssl.SSLContext:
+        ctx = ssl.create_default_context()
+        if os.path.exists(STEPCA_ROOT_CRT):
+            ctx.load_verify_locations(cafile=STEPCA_ROOT_CRT)
+        return ctx
+
+
+    class IntranetSearch(Tool):
+        async def execute(self, **kwargs) -> Response:
+            """
+            Search the Blue Jay Lab intranet corpus (docs, project notes, dashboards).
+
+            Args (via self.args):
+                query (str): Search query. Required.
+                limit (int): Max chunks to return, sent as `topK`. Default 8.
+                corpus (str): Optional index filter, sent as `indexName` (e.g. "notes", "docs").
+
+            Returns:
+                Response with ranked chunk text, source path, and score.
+            """
+            query = self.args.get("query", "").strip()
+            limit = max(1, int(self.args.get("limit") or 8))  # tolerate None/""/0 and negatives
+            corpus = self.args.get("corpus", "").strip()
+
+            if not query:
+                return Response(
+                    message="Error: 'query' is required.",
+                    break_loop=False,
+                )
+
+            params = {"q": query, "topK": str(limit)}
+            if corpus:
+                params["indexName"] = corpus
+            url = f"{INTRANET_BASE_URL}/api/search?{urllib.parse.urlencode(params)}"
+
+            try:
+                req = urllib.request.Request(url, headers={"Accept": "application/json"})
+                with urllib.request.urlopen(req, timeout=20, context=_ssl_ctx()) as resp:
+                    raw = resp.read().decode("utf-8", errors="replace")
+            except Exception as exc:
+                return Response(
+                    message=f"Intranet search failed: {exc}\nURL: {url}",
+                    break_loop=False,
+                )
+
+            try:
+                data = json.loads(raw)
+            except json.JSONDecodeError:
+                return Response(
+                    message=f"Intranet returned non-JSON response:\n{raw[:500]}",
+                    break_loop=False,
+                )
+
+            hits = data if isinstance(data, list) else (
+                data.get("results") or data.get("hits") or data.get("chunks") or []
+            )
+            if not hits:
+                return Response(
+                    message=f"No intranet results for query: {query!r}",
+                    break_loop=False,
+                )
+
+            lines = [f"# Intranet search: {query} ({len(hits)} hits)\n"]
+            for i, hit in enumerate(hits[:limit], 1):
+                src = (
+                    hit.get("sourceFile")
+                    or hit.get("source")
+                    or hit.get("path")
+                    or hit.get("file")
+                    or "?"
+                )
+                repo = hit.get("sourceRepo") or ""
+                idx = hit.get("indexName") or ""
+                score = next((hit[k] for k in ("score", "similarity") if hit.get(k) is not None), None)  # keep 0.0
+                text = (
+                    hit.get("snippet")
+                    or hit.get("text")
+                    or hit.get("content")
+                    or hit.get("chunk")
+                    or ""
+                ).strip()
+                if len(text) > 600:
+                    text = text[:600] + "..."
+                header = f"## [{i}] {repo}/{src}" if repo else f"## [{i}] {src}"
+                if idx:
+                    header += f"  ({idx})"
+                if score is not None:
+                    header += f"  score={score:.3f}" if isinstance(score, float) else f"  score={score}"
+                lines.append(header)
+                lines.append(text)
+                lines.append("")
+
+            return Response(message="\n".join(lines), break_loop=False)
+
 kind: ConfigMap
 metadata:
  name: bluejay-tools-c
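
Review note on `intranet_search.py`: the tool resolves each field by trying several candidate response keys, because the response shape is not pinned down. A minimal sketch of that lookup follows, assuming hypothetical helpers `first_present` and `format_score` that are not part of this commit. It shows one way to keep a legitimate score of `0.0`, which an `a or b or ""` chain would treat as missing.

```python
from typing import Any, Iterable, Mapping


def first_present(hit: Mapping[str, Any], keys: Iterable[str], default: Any = None) -> Any:
    """Return the value of the first key that is present with a non-None value.

    Unlike an `a or b or c` chain, falsy-but-valid values such as 0.0 survive.
    """
    for key in keys:
        value = hit.get(key)
        if value is not None:
            return value
    return default


def format_score(score: Any) -> str:
    """Render a score suffix for a result header; empty when no score exists."""
    if score is None:
        return ""
    if isinstance(score, float):
        return f"  score={score:.3f}"
    return f"  score={score}"


if __name__ == "__main__":
    # Hypothetical hit payload, for illustration only.
    hit = {"similarity": 0.0, "path": "docs/dns.md"}
    score = first_present(hit, ("score", "similarity"))
    print(repr(format_score(score)))
```

The same helper could serve the `sourceFile`/`source`/`path`/`file` and `snippet`/`text`/`content`/`chunk` fallbacks, with `default="?"` or `default=""` respectively.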
--- a/apps/fc-llm-bridge/fc-llm-bridge.yaml
+++ b/apps/fc-llm-bridge/fc-llm-bridge.yaml
@@ -97,7 +97,7 @@ spec:
          #   dotnet.exe publish -c Release -o deploy/app \
          #     src/FlowerCore.LlmBridge.Web/FlowerCore.LlmBridge.Web.csproj
          #   podman build -t localhost/fc-llm-bridge:v<tag> -f deploy/Dockerfile.deploy deploy
-          image: localhost/fc-llm-bridge:v202604231520
+          image: localhost/fc-llm-bridge:v202604292028
          imagePullPolicy: Never
          ports:
            - containerPort: 8080
@@ -116,6 +116,10 @@ spec:
              value: "default"
            - name: FlowerCore__LlmBridge__DefaultAppName
              value: "agent-zero"
+            - name: FlowerCore__LlmBridge__UtilModel
+              value: "qwen2.5:1.5b"
+            - name: FlowerCore__LlmBridge__EmbedModel
+              value: "nomic-embed-text"
            # Per-consumer API keys — from OnePasswordItem fc-llm-bridge-api-keys.
            # Each field becomes a Secret key of the same name. The key-name
            # lands in the auth principal's `fc.app` claim for ledger scoping.
--- a/apps/fc-ttsreader/fc-ttsreader.yaml
+++ b/apps/fc-ttsreader/fc-ttsreader.yaml
@@ -296,14 +296,23 @@ spec:
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 18
+          # Sprint E Phase 1a (kokoro stability): 4 restarts in 2d6h with
+          # exit 143, traced to liveness probe `context deadline exceeded` while
+          # kokoro was busy synthesizing. /v1/audio/voices shares the FastAPI
+          # worker pool with /v1/audio/speech, so a long synth can starve the
+          # probe past the prior 5s × 3 = 15s timeout budget. Bump timeoutSeconds
+          # 5 → 15 and failureThreshold 3 → 5, giving 15s × 5 = 75s of budget
+          # before kubelet kills the pod. The TtsCircuitBreaker on the
+          # synthesizer side (Phase 1b) backs this up so the FC backend stops
+          # sending requests to kokoro during recovery.
          livenessProbe:
            httpGet:
              path: /v1/audio/voices
              port: 8880
            initialDelaySeconds: 180
            periodSeconds: 30
-            timeoutSeconds: 5
-            failureThreshold: 3
+            timeoutSeconds: 15
+            failureThreshold: 5
 ---
 # fc-biblical-tts — eSpeak-NG-backed Ancient Greek + Hebrew TTS with
 # word-level timing for read-along playback. Companion to ttsreader-kokoro
@@ -510,7 +519,7 @@ spec:
        fsGroupChangePolicy: OnRootMismatch
      containers:
        - name: web
-          image: localhost/fc-ttsreader-web:v202604252002
+          image: localhost/fc-ttsreader-web:v202604291817
          imagePullPolicy: Never
          ports:
            - containerPort: 5217
@@ -573,6 +582,19 @@ spec:
              value: "/data/logs"
            - name: TtsReader__Runtime__SmokeStatePath
              value: "/data/ops/smoke-status.json"
+            # Sprint E Day 8 voice-preview disk cache — writes WAVs under
+            # this directory. Default "data/voice-previews" resolves to
+            # the read-only $HOME path under runAsNonRoot=true. Pin to
+            # the writable PVC mount.
+            - name: TtsReader__Preview__CacheDirectory
+              value: "/data/voice-previews"
+            # Sprint E XXL Phase 4γ — content-addressed CDN bundle dir for
+            # POST /api/v1/render. Default "wwwroot/cdn" resolves under the
+            # read-only app filesystem, so pin to the writable PVC mount
+            # alongside other TtsReader runtime data. Manifests + cue audio
+            # land at /data/cdn/sha256/<hash>/manifest.json + cues/.
+            - name: TtsReader__Render__CdnDirectory
+              value: "/data/cdn"
            - name: Auth__ApiKey
              valueFrom:
                secretKeyRef:
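
Review note on the kokoro `livenessProbe` change in this file: the comment quotes the probe budget as `timeoutSeconds × failureThreshold`. The sketch below reproduces that arithmetic and adds a rough wall-clock estimate that also accounts for `periodSeconds`. The `Probe` class and its method names are hypothetical, and the wall-clock formula is a simplified model (probes start every period and each failing probe uses its full timeout), not a statement of exact kubelet behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    period_seconds: int
    timeout_seconds: int
    failure_threshold: int

    def timeout_budget(self) -> int:
        """timeoutSeconds x failureThreshold, the figure the manifest comment quotes."""
        return self.timeout_seconds * self.failure_threshold

    def approx_seconds_to_restart(self) -> int:
        """Rough time from the start of the first failing probe to the restart decision.

        Simplified model: one probe per period, each failure consumes the
        full timeout, and the final failure triggers the restart.
        """
        return (self.failure_threshold - 1) * self.period_seconds + self.timeout_seconds


# Values taken from the removed and added lines of the hunk.
old = Probe(period_seconds=30, timeout_seconds=5, failure_threshold=3)
new = Probe(period_seconds=30, timeout_seconds=15, failure_threshold=5)

for label, probe in (("old", old), ("new", new)):
    print(label, probe.timeout_budget(), probe.approx_seconds_to_restart())
```

Under this model the budget grows from 15 s to 75 s, and the approximate time before a restart grows from about 65 s to about 135 s, because `periodSeconds: 30` spaces the probes out.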
--- a/apps/guacamole/guacamole.yaml
+++ b/apps/guacamole/guacamole.yaml
@@ -465,6 +465,22 @@ metadata:
 spec:
  itemPath: vaults/IAmWorkin/items/Guacamole JSON Auth
 ---
+# 1Password-backed credentials for Mac mini VNC access (Phase 1, 2026-04-28)
+# The operator mints Secret 'macmini-vnc-creds' with keys: username, password, VNC Password
+# Note: the 1Password field label 'VNC Password' -> K8s Secret key 'VNC Password' (space retained)
+# Guacamole VNC connection password is sourced from the 'VNC Password' field.
+# Actual IP is 10.0.56.115 (INFRA VLAN); the 1P item 'IP' field is kept as a backup reference.
+apiVersion: onepassword.com/v1
+kind: OnePasswordItem
+metadata:
+  name: macmini-vnc-creds
+  namespace: guacamole
+  labels:
+    app.kubernetes.io/component: credentials
+    app.kubernetes.io/part-of: flowercore
+spec:
+  itemPath: vaults/IAmWorkin/items/Mac Mini
+---
 # Blue Jay Branding Extension (CSS + translations)
 apiVersion: v1
 kind: ConfigMap
--- a/apps/intranet/intranet.yaml
+++ b/apps/intranet/intranet.yaml
@@ -16,6 +16,15 @@ spec:
    requests:
      storage: 1Gi
 ---
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: intranet-config
+  namespace: intranet
+data:
+  KnowledgeApiKey: ""
+  TrustedHeaderSharedSecret: ""
+---
 apiVersion: apps/v1
 kind: Deployment
 metadata:
@@ -37,7 +46,7 @@ spec:
    spec:
      containers:
        - name: intranet-web
-          image: localhost/fc-intranet-web:v202604242354overridefix
+          image: localhost/fc-intranet-web:v20260429-1646
          imagePullPolicy: Never
          ports:
            - containerPort: 5300
@@ -52,6 +61,27 @@ spec:
            # in minutes. Memory: feedback_pi5_nomic_embed_slow.
            - name: IntranetSearch__OllamaBaseUrl
              value: "http://10.0.56.20:11434"
+            # Sprint E Phase 2α — JSON-file-backed PageReadingOverride persistence
+            # on the writable PVC at /data. Without this env var the
+            # intranet falls back to the in-memory store (loses state on
+            # pod restart). Master's PageReadingOverrideOptions binds
+            # PageReadingOverrides:FilePath.
+            - name: PageReadingOverrides__FilePath
+              value: "/data/page-reading-overrides.json"
+            - name: KnowledgeFleetSearch__BaseUrl
+              value: "https://knowledge.iamworkin.lan"
+            - name: KnowledgeFleetSearch__ApiKey
+              valueFrom:
+                configMapKeyRef:
+                  name: intranet-config
+                  key: KnowledgeApiKey
+                  optional: true
+            - name: TrustedHeaderAuthentication__SharedSecret
+              valueFrom:
+                configMapKeyRef:
+                  name: intranet-config
+                  key: TrustedHeaderSharedSecret
+                  optional: true
          resources:
            requests:
              memory: "256Mi"
--- a/apps/knowledge/README.md
+++ b/apps/knowledge/README.md
@@ -1,7 +1,11 @@
 # knowledge — FlowerCore.Knowledge.Web (Phase 2.4 K8s deploy)

-**Status:** manifests staged, **NOT YET APPLIED**. Image must be built +
-imported AND DNS record provisioned before `git push`.
+**Status:** **LIVE 2026-04-27** at `https://knowledge.iamworkin.lan` —
+Phase 2.4 closed. Pod running, certificate issued (step-ca-acme), PVC
+bound (Longhorn 20Gi RWO), ArgoCD `infra-knowledge` synced. `/healthz`
+returns 200, `/api/v1/editions` returns `[]` (initial-deploy state — no
+*.db files in the PVC yet; Phase 2.5+ admin UI handles bulk
+population).

 - Plan: [`../../../FlowerCore.Notes/docs/ai-agents/flowercore-knowledge-service-plan.md`](../../../FlowerCore.Notes/docs/ai-agents/flowercore-knowledge-service-plan.md)
 - Sprint: [`../../../FlowerCore.Notes/docs/ai-station/sprint-e-xxl-plan.md`](../../../FlowerCore.Notes/docs/ai-station/sprint-e-xxl-plan.md) (Track B)
--- a/apps/knowledge/knowledge.yaml
+++ b/apps/knowledge/knowledge.yaml
@@ -40,6 +40,17 @@ metadata:
  labels:
    app.kubernetes.io/part-of: bluejay-infra
 ---
+# MCP API key — synced from 1Password so /mcp stays gated without baking
+# secrets into Git. The PASSWORD category maps the concealed field to Secret
+# key `password`, which the Deployment reads into FlowerCore:Mcp:ApiKey:Key.
+apiVersion: onepassword.com/v1
+kind: OnePasswordItem
+metadata:
+  name: knowledge-mcp-api-key
+  namespace: knowledge
+spec:
+  itemPath: "vaults/IAmWorkin/items/KnowledgeApiKey"
+---
 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
@@ -116,11 +127,19 @@ spec:
              value: "50"
            - name: FlowerCore__Editions__ProfileDirectory
              value: "/app/editions"
-            # Embed via BLUEJAY-WS GPU (R9700, 32GB VRAM). Pi5 Ollama is
-            # ~4-5x slower; use the workstation while we have it.
-            # Memory: feedback_pi5_nomic_embed_slow.
+            # Embed via edge1 Pi 5 + AI HAT+ (10.0.57.17:11434). Cluster
+            # services do not depend on BLUEJAY-WS (private dev hardware) per
+            # bluejay-infra@0f9d56e. Query-time embedding is fast enough on
+            # edge1 (~ms per query); bulk index rebuilds (Phase 2.5+) will
+            # need a separate ingestion lane that can opt into the
+            # workstation GPU when present.
            - name: FlowerCore__Ollama__BaseUrl
-              value: "http://10.0.56.20:11434"
+              value: "http://10.0.57.17:11434"
+            - name: FlowerCore__Mcp__ApiKey__Key
+              valueFrom:
+                secretKeyRef:
+                  name: knowledge-mcp-api-key
+                  key: password
          resources:
            requests:
              cpu: 100m
--- a/apps/noc-services/noc-services.yaml
+++ b/apps/noc-services/noc-services.yaml
@@ -219,6 +219,65 @@ spec:
  tls:
    secretName: cockpit-tls
 ---
+# ============================================================
+# PuppetDB Dashboard - noc1:8080 (HTTP, web UI only)
+# Agent-to-PuppetDB mTLS still uses port 8081 directly via Puppet CA
+# (NOT via this proxy). See docs/infrastructure/cert-recovery-2026-04-28.md
+# ============================================================
+apiVersion: v1
+kind: Service
+metadata:
+  name: puppetdb-external
+  namespace: noc-proxy
+spec:
+  ports:
+    - port: 8080
+      targetPort: 8080
+      name: http
+  clusterIP: None
+---
+apiVersion: v1
+kind: Endpoints
+metadata:
+  name: puppetdb-external
+  namespace: noc-proxy
+subsets:
+  - addresses:
+      - ip: 10.0.56.10
+    ports:
+      - port: 8080
+        name: http
+---
+apiVersion: cert-manager.io/v1
+kind: Certificate
+metadata:
+  name: puppetdb-tls
+  namespace: noc-proxy
+spec:
+  secretName: puppetdb-tls
+  issuerRef:
+    name: step-ca-acme
+    kind: ClusterIssuer
+  dnsNames:
+    - puppetdb.iamworkin.lan
+---
+apiVersion: traefik.io/v1alpha1
+kind: IngressRoute
+metadata:
+  name: puppetdb
+  namespace: noc-proxy
+spec:
+  entryPoints:
+    - websecure
+  routes:
+    - kind: Rule
+      match: Host(`puppetdb.iamworkin.lan`)
+      services:
+        - name: puppetdb-external
+          port: 8080
+  tls:
+    secretName: puppetdb-tls
+---
 # NetworkPolicy: allow Traefik ingress, allow egress to noc1
 apiVersion: networking.k8s.io/v1
 kind: NetworkPolicy
@@ -242,6 +301,8 @@ spec:
      ports:
        - port: 3000
          protocol: TCP
+        - port: 8080
+          protocol: TCP
        - port: 9090
          protocol: TCP
        - port: 9091
Author	SHA1	Message	Date
Andrew Stoltz	b1ad253dd6	fix(agent-zero): prefix bridge embedding alias for litellm	2026-04-29 21:14:12 -05:00
Andrew Stoltz	ee935f6e07	fix(agent-zero): keep internal util/embed on bridge v1	2026-04-29 21:09:04 -05:00
Andrew Stoltz	2853ee2024	chore(bridge): bump fc-llm-bridge image tag v202604292028	2026-04-29 20:50:55 -05:00
Andrew Stoltz	b4a34e16ca	refactor(agent-zero): drop ollama-proxy sidecar (Phase 3)	2026-04-29 20:50:55 -05:00
Andrew Stoltz	0d5a1fd530	fix(agent-zero): route util and embed through llm bridge	2026-04-29 19:14:01 -05:00
Andrew Stoltz	1b633f57b2	chore(infra): wire knowledge MCP api key secret	2026-04-29 18:04:43 -05:00
Andrew Stoltz	ee8afd0a08	deploy(intranet): promote auth-gated intranet image	2026-04-29 17:11:17 -05:00
Andrew Stoltz	cf35884eae	deploy(intranet): harden knowledge search rollout	2026-04-29 16:43:09 -05:00
Andrew Stoltz	9881767b11	deploy(intranet): bump intranet web for knowledge search lane	2026-04-29 16:21:27 -05:00
Andrew Stoltz	c9bf23834b	chore(ttsreader): bump image to v202604291817 Per-profile MoodAnnotationModelOverride picker — Profiles page now shows a model dropdown from IModelRegistry instead of a free-text field; model override null-falls-back to global TtsReader:Ollama:DefaultModel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 13:21:40 -05:00
Andrew Stoltz	174002023d	fix(agent-zero): move corpus_search + intranet_search into bluejay-tools-c The prior commit `b71f9e4` created a stray YAML document between the bluejay-tools-c and bluejay-profile sections. kubectl applied the stray block's data to bluejay-profile (wrong ConfigMap, wrong mount target). The setup-bluejay initContainer copies bluejay-tools-{a,b,c} to the tools directory; bluejay-profile is copied to the agent profile directory. Tools must live in one of the three tools ConfigMaps. Fix: insert corpus_search.py and intranet_search.py directly into the bluejay-tools-c YAML document (before kind/metadata, matching the data-first layout the rest of the file uses). Also fix two mojibake characters (→ and ·) that were corrupted in the prior commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 08:49:23 -05:00
Andrew Stoltz	b71f9e4ec9	feat(agent-zero): add corpus_search + intranet_search to cluster configmaps - Add corpus_search.py to bluejay-tools-c: semantic vector search over fleet SQLite-vec DBs (fleet-workstation-full, fleet-pi-edge, fleet-bmo-bot). Returns offline-friendly results for Bible/Greek/Hebrew/Strongs corpora. Cluster pod degrades gracefully (no DB mounted yet — BLUEJAY-WS only for now). - Add intranet_search.py to bluejay-tools-c: live RAG search over the intranet vector store via GET /api/search?q=...&topK=N. Uses in-cluster service URL (http://intranet-web.intranet.svc:5300) to bypass Traefik TLS and the private-range egress denylist. - Fix intranet_search.py param name: was 'limit', now 'topK' matching the SearchController's [FromQuery] parameter name. - NetworkPolicy: add egress rule for intranet namespace port 5300 (without this the pod's TCP connection to the search endpoint was dropped). - agent-zero.yaml: set FLOWERCORE_INTRANET_URL env var to in-cluster service URL so intranet_search uses internal routing, not the public Traefik VIP. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 08:34:31 -05:00
Andrew Stoltz	f1431f7324	feat(agent-zero): wire Print.Web API key to pod via 1Password OnePasswordItem Add `print-web-api-keys` OnePasswordItem CRD that syncs from 1Password "Print.Web API Keys" vault item (password field). Mount as PRINT_WEB_API_KEY env var in the agent-zero container. The print_web.py Python tool (already in bluejay-tools ConfigMaps) reads PRINT_WEB_URL and PRINT_WEB_API_KEY env vars for all HTTP calls to the thermal print service on edge2. Previously the key was unset so every API call was rejected with 401. Note: Print.Web uses the legacy REST MCP shape (/api/mcp/tools/*) not the streamable-http protocol. The Python tool bridges this gap — no /mcp endpoint exists on Print.Web today. Network policy already allows 10.0.57.16:5200. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 20:36:36 -05:00
Andrew Stoltz	35bd055cb4	feat(guacamole): add macmini-vnc-creds OnePasswordItem + fix Mac mini connection IPs Phase 1 of Mac mini onboarding (2026-04-28): - Add OnePasswordItem CRD 'macmini-vnc-creds' in guacamole namespace bound to vault item 'Mac Mini'; operator mints Secret with username/password/VNC Password fields - Mac mini discovered at 10.0.56.115 (INFRA VLAN), not the 10.0.57.50 stored in the 1P IP field - Guacamole connections updated via API (not stored here): VNC conn #10, SSH conns #9/#33 corrected from old IP 10.0.57.50 → 10.0.56.115 - macOS: 26.4.1 (Tahoe), Apple M1, 16 GB, user: bluejay (admin group) - VNC port 5900 confirmed open; SSH works via noc1 jumpbox with password auth Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 20:09:45 -05:00
Andrew Stoltz	f604ab419e	feat(ttsreader): bump image to v202604281923 (SignalR ProgressHub) Adds ProgressHub endpoint at /hubs/progress with project-scoped group broadcasting for JobStarted, CueProgress, JobCompleted, and JobFailed events. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 19:30:41 -05:00
Andrew Stoltz	b2786252b0	chore(ttsreader): bump web image to v202604281831 (ops failed-manifest cleanup) Deploys fix for stale Failed manifest accumulation in TTS Reader Ops view and atomic-write guard against empty/corrupt job manifests. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 18:31:53 -05:00
Andrew Stoltz	45ee40920d	fix(ttsreader): bump image to v202604281638 (Range support + Ollama timeout 240s)	2026-04-28 16:44:57 -05:00
Andrew Stoltz	8ad7eb714b	fix(ttsreader): bump image to v202604281542 (annotation few-shot prompt + UI hint)	2026-04-28 15:46:28 -05:00
Andrew Stoltz	3cb44c3104	feat(noc-services): wire puppetdb.iamworkin.lan through Traefik step-ca cert	2026-04-28 15:13:20 -05:00
Andrew Stoltz	2400329acd	fix(intranet): bump image to v20260428-1500 (Monitoring crash patch + Lane 11 anatomy refresh)	2026-04-28 14:59:27 -05:00
Andrew Stoltz	c17af882cc	fix(ttsreader): bump image to v202604281444 for UX polish (cross-chapter Bible passage, /profiles dedup, /ops table)	2026-04-28 14:48:13 -05:00
Andrew Stoltz	76b1938afa	fix(ttsreader): bump image to v202604281434 for live playback regression patch (study-player + speech override synth)	2026-04-28 14:43:06 -05:00
Andrew Stoltz	ced04a6148	intranet: bump web image to v20260428-0953 Sprint E XXL Intranet docs depth + read-aloud-root sweep deploy. Image tag v20260427-2353 → v20260428-0953: - Track A (Intranet.Web@c4f3d78): 7 service pages deepened toward PrintService.razor's 8-tab depth standard. Workflows / Verified Surfaces / Recent Verified Changes added. - Read-aloud-root sweep (Intranet.Web@787982c): data-read-aloud-root wrappers added to 6 older /services/* pages so the read-aloud overlay scopes content extraction precisely instead of falling back to <main> with layout chrome included.	2026-04-28 09:54:27 -05:00
Andrew Stoltz	f2258b92a2	fc-ttsreader: bump web image to v202604280946 + add Render__CdnDirectory env Sprint E XXL Phase 4γ MVP deploy — POST /api/v1/render endpoint. Two changes: 1. Image tag v202604272339 → v202604280946 (TtsReader@d9e0a58 master tip includes the new RenderController + RenderService + 9 tests). 2. New TtsReader__Render__CdnDirectory=/data/cdn env var. Default wwwroot/cdn resolves under the read-only app filesystem when runAsNonRoot=true; pin to the existing writable PVC mount alongside other TtsReader runtime data. Manifests + cue audio land at /data/cdn/sha256/<hash>/manifest.json + cues/. Pre-existing PVC mount at /data/ already covers this — no PVC change needed, just the env var override. Pairs with TtsReader@d9e0a58 master tip (ready for image build + import).	2026-04-28 09:47:46 -05:00
Andrew Stoltz	979a7c7b25	feat(intranet): bump fc-intranet-web to v20260427-2353 + persist PageReadingOverrides Bump intranet image to v20260427-2353 (master @ 38b0148): - Sprint E search lane: /search Blazor page + IntranetSearchService + DocsCorpusIndexer + Shared.Indexing wiring - 7 new service pages: LocalAiAgents, AiTopology, Distribution, Dns, Knowledge, LlmBridge, Provisioning - PiManager drift docs New env var: PageReadingOverrides__FilePath=/data/page-reading-overrides.json so the persisted Lane 2α store lives on the writable PVC instead of the default in-memory fallback (which loses state on pod restart). Operator-edited overrides via the existing /api/v1/pages/{encoded}/overrides controller will now survive across restarts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 23:54:17 -05:00
Andrew Stoltz	0df8f7b936	chore(ttsreader): bump fc-ttsreader-web to v202604272339 (Sprint E Phase C — partial-render UX) TtsReader@9333480: distinguishes partial-render (yellow Warning, audio plays, 'Re-render N failed sentences' button) from full-fail (red Danger, 'Try render again'). New TtsFallbackChainFailedException carries both voices when Kokoro + Piper both fail; chapter breadcrumb names the entire chain instead of just the requested voice. +8 tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 23:40:19 -05:00
Andrew Stoltz	38558641c1	fix(ttsreader-kokoro): bump liveness probe timeouts (Sprint E Phase 1a) Kokoro pod has 4 restarts in 2d6h with exit 143 (SIGTERM from kubelet). kubectl describe events all show: Liveness probe failed: Get "http://10.42.229.109:8880/v1/audio/voices": context deadline exceeded The probe path /v1/audio/voices shares the FastAPI worker pool with /v1/audio/speech. A long synth (Bible chapter, 30+ sentences) holds the pool past the prior 5s × 3 = 15s probe window, kubelet kills the pod, in-flight renders fail. Operator hits "fallback chain failed" toasts + partial-render breadcrumbs during these windows. Bump probe timeoutSeconds 5 → 15 and failureThreshold 3 → 5 → 75 s of grace before kubelet gives up. Combined with the kokoro-side circuit breaker landing in TtsReader (Sprint E Phase 1b), the FC backend will also stop slamming kokoro during recovery so it can serve the probe even faster. The companion Prometheus alerts (KokoroPodFlapping, PiperPodFlapping) land in FlowerCore.Notes/scripts/monitoring/alerts.yml. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 23:28:07 -05:00
Andrew Stoltz	63d905b4df	chore(ttsreader): bump fc-ttsreader-web to v202604272236 (Thinking + Feedback ALTERs)	2026-04-27 22:37:08 -05:00
Andrew Stoltz	d95f4e0caf	chore(ttsreader): bump fc-ttsreader-web to v202604272228 (ChatSessions IsFavorite ALTER hotfix)	2026-04-27 22:28:56 -05:00
Andrew Stoltz	7bc565d17e	fix(ttsreader): pin VoicePreview CacheDirectory to /data PVC Day 8 disk-cache warmer crashes on production with 'Read-only file system : /home/app/data' because the relative default 'data/voice-previews' resolves under runAsNonRoot HOME (read-only with readOnlyRootFilesystem=true). Pin to /data/voice-previews so the cache lands on the writable PVC mount alongside ttsreader.db, audio output, and jobs root. Image v202604272216 (already on nodes) is unaffected by this — only the env routing changes. ArgoCD reconciles + rollout restart picks up the new env without rebuild. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 22:24:04 -05:00
Andrew Stoltz	dfe9c3b67e	chore(ttsreader): bump fc-ttsreader-web to v202604272216 (brace-escape fix)	2026-04-27 22:16:19 -05:00
Andrew Stoltz	37f8db89e4	chore(ttsreader): bump fc-ttsreader-web to v202604272208 (Day 10 + VoiceProfiles hotfix) v202604272157 crash-looped on the production PVC because Database.EnsureCreated() is a no-op on existing DBs and the VoiceProfiles table was missing. TtsReader@a9f0b73 adds an idempotent CREATE TABLE IF NOT EXISTS to the infra reconciler before TtsReaderDataSeeder runs. Bumping the manifest to pick up that fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 22:09:08 -05:00
Andrew Stoltz	00c7d8df24	chore(ttsreader): bump fc-ttsreader-web to v202604272157 (Sprint E Day 10 UX polish) Compact project page (Setup chip strip + chapter inspect-toggle drawer) + render feedback (rolling ETA strip + active-chapter pulse) + Bible Dashboard navigates to /projects/{id} on queue. Source TtsReader@79de78b. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 21:58:12 -05:00
Andrew Stoltz	c6811eadd8	intranet: bump image to v20260427-newpages-and-topology Adds 7 new pages (5 service pages, AI topology, opencode operator guide) to https://intranet.iamworkin.lan: /services/dns /services/distribution /services/llm-bridge /services/knowledge /services/provisioning /services/ai-topology /development/local-ai-agents Plus topology corrections in /services/ai (AiStack.razor) and 6 new nav entries. Source commit: FlowerCore.Intranet.Web@1598542 on codex-wip-pre-readaloud-collision-2026-04-24. Image built from artifacts/publish via Dockerfile.deploy on BLUEJAY-WS, imported to all 3 RKE2 nodes (rke2-server + rke2-agent1 + rke2-agent2). Build: 0 warnings, 0 errors, 197/197 tests passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 17:52:34 -05:00
Andrew Stoltz	4d9d537d83	fix(knowledge): repoint Ollama at edge1 + flip README to LIVE (Sprint E B7) Two changes after the Phase 2.4 deploy went live at https://knowledge.iamworkin.lan: 1. Ollama URL flip: from BLUEJAY-WS (10.0.56.20:11434) to edge1 Pi 5 (10.0.57.17:11434). Honors the cluster-clean architecture from bluejay-infra@0f9d56e ("Workstation is private dev hardware and should not be in the cluster path"). Query-time embeddings (~ms per query) are fast enough on edge1; bulk index rebuilds (Phase 2.5+) will need a separate ingestion lane that can opt into the workstation GPU when present. ArgoCD picks up the env-var change and rolls the pod automatically — no image rebuild needed. 2. README LIVE status: flip the staged-not-yet-applied banner to LIVE 2026-04-27. Pod running, certificate issued, PVC bound, /healthz 200, /api/v1/editions [] (initial-deploy state). Phase 2.5+ admin UI handles bulk population. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 16:56:35 -05:00
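
Review note on commit 174002023d: the message describes a stray block whose `data` ended up applied to the wrong ConfigMap. A pre-commit check along the following lines might catch that class of mistake. It is a line-based heuristic under stated assumptions (documents are separated by bare `---` lines, top-level keys start at column 0), not a YAML parser, and `duplicate_top_level_keys` is a hypothetical name.

```python
import re
from collections import Counter
from typing import Dict, List

# A top-level mapping key: starts at column 0 and is followed by a colon.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w.-]*):")


def duplicate_top_level_keys(manifest: str) -> Dict[int, List[str]]:
    """Map 0-based document index to top-level keys that appear more than once."""
    docs: List[Counter] = [Counter()]
    for line in manifest.splitlines():
        if line.rstrip() == "---":
            docs.append(Counter())  # a bare separator starts a new document
            continue
        match = TOP_LEVEL_KEY.match(line)
        if match:
            docs[-1][match.group(1)] += 1
    return {
        i: sorted(key for key, n in keys.items() if n > 1)
        for i, keys in enumerate(docs)
        if any(n > 1 for n in keys.values())
    }


if __name__ == "__main__":
    # Hypothetical manifest: the first two objects are missing a separator.
    manifest = """\
data:
  a.py: |
    print("a")
kind: ConfigMap
metadata:
  name: tools
data:
  b.py: |
    print("b")
kind: ConfigMap
metadata:
  name: profile
---
kind: ConfigMap
metadata:
  name: ok
"""
    print(duplicate_top_level_keys(manifest))
```

A document that repeats `kind`, `metadata`, or `data` at the top level almost always means two objects were merged by a missing `---`. Separators that carry a trailing comment are not recognized by this sketch.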