fix(knowledge): repoint Ollama at edge1 + flip README to LIVE (Sprint E B7)

Two changes after the Phase 2.4 deploy went live at
https://knowledge.iamworkin.lan:

1. **Ollama URL flip**: from BLUEJAY-WS (10.0.56.20:11434) to edge1 Pi 5
   (10.0.57.17:11434). Honors the cluster-clean architecture from
   bluejay-infra@0f9d56e ("Workstation is private dev hardware and should
   not be in the cluster path"). Query-time embeddings (~ms per query)
   are fast enough on edge1; bulk index rebuilds (Phase 2.5+) will need a
   separate ingestion lane that can opt into the workstation GPU when
   present. ArgoCD picks up the env-var change and rolls the pod
   automatically — no image rebuild needed.

2. **README LIVE status**: flip the staged-not-yet-applied banner to
   LIVE 2026-04-27. Verified: pod running, certificate issued, PVC bound,
   /healthz returns 200, /api/v1/editions returns [] (initial-deploy
   state). The Phase 2.5+ admin UI will handle bulk population.
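
The checks behind both changes can be sketched as a small smoke test. This is
a minimal sketch, not part of the commit: the helper names are hypothetical,
the model name `nomic-embed-text` is an assumption (the commit does not state
which embedding model is used), and it assumes edge1 serves Ollama's
`/api/embeddings` endpoint, which takes a `model` and a `prompt` and returns
an `embedding` array.

```python
import json
import time
import urllib.request

OLLAMA_URL = "http://10.0.57.17:11434"  # edge1, value from the manifest
EMBED_MODEL = "nomic-embed-text"        # assumption: not stated in the commit


def build_embed_request(base_url: str, model: str, text: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def time_embedding(base_url: str, model: str, text: str, timeout: float = 10.0):
    """Send one embedding request; return (vector length, elapsed seconds)."""
    req = build_embed_request(base_url, model, text)
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    elapsed = time.perf_counter() - start
    return len(body["embedding"]), elapsed


def editions_empty(body: bytes) -> bool:
    """True if an /api/v1/editions response body is the empty JSON list."""
    return json.loads(body) == []
```

Calling `time_embedding(OLLAMA_URL, EMBED_MODEL, "hello")` from a host that can
reach edge1 gives a rough per-query latency to compare against the "fast
enough" claim above. `editions_empty` expects the raw body fetched from
https://knowledge.iamworkin.lan/api/v1/editions; fetching it over HTTPS may
require trusting the cluster's certificate authority first.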

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Andrew Stoltz
2026-04-27 16:56:35 -05:00
parent 0f9d56ee16
commit 4d9d537d83
2 changed files with 13 additions and 6 deletions

@@ -116,11 +116,14 @@ spec:
 value: "50"
 - name: FlowerCore__Editions__ProfileDirectory
 value: "/app/editions"
-# Embed via BLUEJAY-WS GPU (R9700, 32GB VRAM). Pi5 Ollama is
-# ~4-5x slower; use the workstation while we have it.
-# Memory: feedback_pi5_nomic_embed_slow.
+# Embed via edge1 Pi 5 + AI HAT+ (10.0.57.17:11434). Cluster
+# services do not depend on BLUEJAY-WS (private dev hardware) per
+# bluejay-infra@0f9d56e. Query-time embedding is fast enough on
+# edge1 (~ms per query); bulk index rebuilds (Phase 2.5+) will
+# need a separate ingestion lane that can opt into the
+# workstation GPU when present.
 - name: FlowerCore__Ollama__BaseUrl
-value: "http://10.0.56.20:11434"
+value: "http://10.0.57.17:11434"
 resources:
 requests:
 cpu: 100m