bluejay-infra/apps/fc-llm-bridge/fc-llm-bridge.yaml
Codex 0b52093b36 K8s manifest hardening + new bluejay-infra-lint test project
Manifest hardening (per documented memories):
- apps/asterisk/deployment.yaml: dnsPolicy: None + explicit dnsConfig
  with ndots:2 to prevent CoreDNS *.iamworkin.lan template from
  hijacking external egress (downloads.asterisk.org).
- apps/fc-llm-bridge/fc-llm-bridge.yaml: same dnsConfig pattern for
  api.anthropic.com egress.
- apps/fc-ttsreader/fc-ttsreader.yaml: same dnsConfig pattern for
  huggingface.co model seeding.
- apps/fc-messageboard/fc-messageboard.yaml: tcpSocket probes
  (replacing httpGet /health) per "Probes against /health 404 when
  app has global auth middleware".
- apps/fc-signalcontrol/fc-signalcontrol.yaml: same tcpSocket probe
  fix.
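
Why the dnsConfig pattern helps can be sketched with a small Python model of
stub-resolver search expansion. This is an illustration only, not code from
the repository: candidate_queries is a hypothetical helper, and the "before"
search list assumes the pod inherited iamworkin.lan from the node alongside
the Kubernetes default of ndots:5.

```python
def candidate_queries(name, search, ndots):
    """Return the names a stub resolver would try, in order.

    Models the resolv.conf rule: a name with at least `ndots` dots is tried
    as-is first; otherwise the search suffixes are tried first.
    """
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no expansion.
        return [name]
    expanded = [f"{name}.{suffix}." for suffix in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


# Assumed "before" state: cluster suffixes plus the node's LAN search domain.
default_search = [
    "fc-llm-bridge.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "iamworkin.lan",
]
# Search list from the hardened dnsConfig (no iamworkin.lan entry).
hardened_search = default_search[:3]

before = candidate_queries("api.anthropic.com", default_search, ndots=5)
after = candidate_queries("api.anthropic.com", hardened_search, ndots=2)

print(before)
print(after)
```

In the assumed "before" case, api.anthropic.com.iamworkin.lan. is queried ahead
of the real name, which is where a wildcard template could answer. With the
hardened settings the absolute name is tried first, and no candidate carries
the iamworkin.lan suffix even for single-dot names such as huggingface.co.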

New lint project:
- tests/bluejay-infra-lint/BluejayInfraLint.Tests.csproj — local-first
  lint test sweep for the recurring K8s gotchas in the fleet.
- tests/bluejay-infra-lint/FleetManifestLintTests.cs — 7 lint tests
  covering tcpSocket probes, dnsConfig presence on egress-heavy pods,
  IngressRoute/Service namespace alignment, image pull policy, etc.
- tests/bluejay-infra-lint/conftest.dev/ — matching conftest policies
  for environments with conftest/opa.
- .gitignore — adds bin/ + obj/ + DS_Store/swp.
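
As a rough illustration of two of the checks named above (auth-safe probes and
dnsConfig presence), here is a minimal Python sketch. It is not the C# code in
FleetManifestLintTests.cs, which is not shown here: lint_deployment and the
egress_heavy flag are hypothetical, and the manifest is assumed to be already
parsed into a dict.

```python
def lint_deployment(doc, egress_heavy=False):
    """Return a list of findings for one parsed Deployment manifest."""
    findings = []
    pod = doc["spec"]["template"]["spec"]

    # Probe check: httpGet can hit 404/401 behind global auth middleware.
    for container in pod.get("containers", []):
        for probe_name in ("readinessProbe", "livenessProbe", "startupProbe"):
            probe = container.get(probe_name)
            if probe and "httpGet" in probe:
                findings.append(
                    f"{container['name']}: {probe_name} uses httpGet; "
                    "prefer tcpSocket"
                )

    # DNS check: egress-heavy pods should pin their resolver settings.
    if egress_heavy:
        if pod.get("dnsPolicy") != "None" or "dnsConfig" not in pod:
            findings.append("pod: missing dnsPolicy: None with explicit dnsConfig")
    return findings


# Hypothetical pre-hardening manifest, reduced to the fields the checks read.
unhardened = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "web",
         "readinessProbe": {"httpGet": {"path": "/health", "port": 8080}}},
    ]}}}
}
# The same manifest after applying both fixes.
hardened = {
    "spec": {"template": {"spec": {
        "dnsPolicy": "None",
        "dnsConfig": {"nameservers": ["10.43.0.10"]},
        "containers": [
            {"name": "web", "readinessProbe": {"tcpSocket": {"port": 8080}}},
        ],
    }}}
}

for finding in lint_deployment(unhardened, egress_heavy=True):
    print(finding)
print(lint_deployment(hardened, egress_heavy=True))
```

Returning findings as a list rather than raising on the first problem lets a
single run report every issue in a manifest.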

README.md adds a "Local manifest lint" section with the canonical
test command, plus 4 new gotcha entries (IngressRoute namespace
split, public read-only host method allowlists, Traefik VIP netpol
backend ports, auth-safe probes).

Tests: 7 / 7 lint tests passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 03:18:04 -05:00


# FlowerCore.LlmBridge — OpenAI-compatible bridge for Agent Zero.
# Routes through FlowerCore.Shared.Chat (ILlmProviderClient) with budget
# enforcement, response caching, and tier-based model routing. Lets Agent
# Zero (Python) reach Anthropic and Ollama providers without re-implementing
# the C# budget/cache/router primitives.
#
# Design: FlowerCore.Notes/docs/ai-agents/agent-zero-anthropic-bridge.md
# ADR: FlowerCore.Notes/ARCHITECTURE.md (ADR-088)
#
# Deployment order (see bluejay-infra/README.md):
#   1. pfSense DNS override for fc-llm-bridge.iamworkin.lan -> 10.0.56.200
#      (REQUIRED before this is applied; otherwise cert-manager HTTP-01 fails
#      silently and sits in backoff for ~2h). Run scripts/pfsense-add-dns-overrides.py.
#   2. 1Password items `Claude API Key` (already exists) and
#      `FC LLM Bridge API Keys` (create when first non-dev environment comes up).
#   3. Build + import image: localhost/fc-llm-bridge:v<YYYYMMDD><HHMM>
#      Import to rke2-server, rke2-agent1, rke2-agent2 via ctr images import.
#   4. Bump the image tag below and git push; ArgoCD ApplicationSet picks it up.
#   5. Flip Agent Zero chat.openai.base_url to https://fc-llm-bridge.iamworkin.lan/v1
#      and api_key to the op://IAmWorkin/FC LLM Bridge API Keys/agent-zero-k8s value.
---
apiVersion: v1
kind: Namespace
metadata:
  name: fc-llm-bridge
  labels:
    app.kubernetes.io/part-of: flowercore
---
# Claude (Anthropic) API key — shared across FC services.
# Existing 1Password item. `credential` field -> Secret `anthropic-api-key`.
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
  name: anthropic-api-key
  namespace: fc-llm-bridge
spec:
  itemPath: "vaults/IAmWorkin/items/Claude API Key"
---
# Per-consumer API keys for the bridge itself.
# NEW 1Password item — see apps/fc-llm-bridge/README.md for the field layout
# to create before first apply. Fields become Secret keys of the same name:
# agent-zero-ws, agent-zero-k8s, spare-1, spare-2
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
  name: fc-llm-bridge-api-keys
  namespace: fc-llm-bridge
spec:
  itemPath: "vaults/IAmWorkin/items/FC LLM Bridge API Keys"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fc-llm-bridge-data
  namespace: fc-llm-bridge
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 2Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fc-llm-bridge
  namespace: fc-llm-bridge
  labels:
    app.kubernetes.io/name: fc-llm-bridge
    app.kubernetes.io/part-of: flowercore
spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app.kubernetes.io/name: fc-llm-bridge
  template:
    metadata:
      labels:
        app.kubernetes.io/name: fc-llm-bridge
        app.kubernetes.io/part-of: flowercore
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      # Use an explicit DNS policy so external FQDNs like api.anthropic.com are
      # resolved directly instead of being expanded through the cluster search
      # path that includes iamworkin.lan.
      dnsPolicy: None
      dnsConfig:
        nameservers:
          - 10.43.0.10
        searches:
          - fc-llm-bridge.svc.cluster.local
          - svc.cluster.local
          - cluster.local
        options:
          - name: ndots
            value: "2"
      securityContext:
        fsGroup: 1654
        fsGroupChangePolicy: OnRootMismatch
      containers:
        - name: web
          # Placeholder tag: bump to the image you built + imported to every
          # RKE2 node before applying. Build with:
          #   dotnet.exe publish -c Release -o deploy/app \
          #     src/FlowerCore.LlmBridge.Web/FlowerCore.LlmBridge.Web.csproj
          #   podman build -t localhost/fc-llm-bridge:v<tag> -f deploy/Dockerfile.deploy deploy
          image: localhost/fc-llm-bridge:v202604300022
          imagePullPolicy: Never
          ports:
            - containerPort: 8080
              name: http
          env:
            - name: ASPNETCORE_URLS
              value: "http://+:8080"
            - name: ASPNETCORE_ENVIRONMENT
              value: "Production"
            - name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT
              value: "false"
            # SQLite (budget ledger + response cache + data-protection keys)
            - name: FlowerCore__LlmBridge__SqliteConnectionString
              value: "Data Source=/data/llm-bridge.db"
            - name: FlowerCore__LlmBridge__DefaultTenantId
              value: "default"
            - name: FlowerCore__LlmBridge__DefaultAppName
              value: "agent-zero"
            - name: FlowerCore__LlmBridge__UtilModel
              value: "qwen2.5:1.5b"
            - name: FlowerCore__LlmBridge__EmbedModel
              value: "nomic-embed-text"
            # Per-consumer API keys, from OnePasswordItem fc-llm-bridge-api-keys.
            # Each field becomes a Secret key of the same name. The key name
            # lands in the auth principal's `fc.app` claim for ledger scoping.
            - name: FlowerCore__LlmBridge__ApiKeys__agent-zero-ws
              valueFrom:
                secretKeyRef:
                  name: fc-llm-bridge-api-keys
                  key: agent-zero-ws
                  optional: true
            - name: FlowerCore__LlmBridge__ApiKeys__agent-zero-k8s
              valueFrom:
                secretKeyRef:
                  name: fc-llm-bridge-api-keys
                  key: agent-zero-k8s
                  optional: true
            - name: FlowerCore__LlmBridge__ApiKeys__spare-1
              valueFrom:
                secretKeyRef:
                  name: fc-llm-bridge-api-keys
                  key: spare-1
                  optional: true
            - name: FlowerCore__LlmBridge__ApiKeys__spare-2
              valueFrom:
                secretKeyRef:
                  name: fc-llm-bridge-api-keys
                  key: spare-2
                  optional: true
            # Shared.Chat: Ollama (edge1 Pi 5 + AI HAT+, matches bridge default)
            - name: FlowerCore__Chat__OllamaBaseUrl
              value: "http://10.0.57.17:11434"
            - name: FlowerCore__Chat__HttpTimeout
              value: "00:05:00"
            # Shared.Chat: Anthropic
            - name: FlowerCore__Chat__Anthropic__Enabled
              value: "true"
            - name: FlowerCore__Chat__Anthropic__ApiKey
              valueFrom:
                secretKeyRef:
                  name: anthropic-api-key
                  key: password
            - name: FlowerCore__Chat__Anthropic__OrganizationId
              valueFrom:
                secretKeyRef:
                  name: anthropic-api-key
                  key: organization_id
                  optional: true
            - name: FlowerCore__Chat__Anthropic__BaseUrl
              value: "https://api.anthropic.com"
            - name: FlowerCore__Chat__Anthropic__DefaultModel
              value: "claude-sonnet-4-6"
            - name: FlowerCore__Chat__Anthropic__AnthropicVersion
              value: "2023-06-01"
            - name: FlowerCore__Chat__Anthropic__Timeout
              value: "00:05:00"
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 1000m
              memory: 768Mi
          volumeMounts:
            - name: data
              mountPath: /data
            - name: tmp
              mountPath: /tmp
            - name: app-data
              mountPath: /app/data
          securityContext:
            runAsNonRoot: true
            runAsUser: 1654
            runAsGroup: 1654
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          # tcpSocket probes: the app runs ApiKeyAuthMiddleware. /healthz is
          # registered as anonymous via AuthExemptPaths, but tcpSocket avoids any
          # future accidental middleware-ordering regression
          # (memory: feedback_k8s_probes_behind_auth_middleware).
          readinessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 30
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: fc-llm-bridge-data
        - name: tmp
          emptyDir: {}
        # The Dockerfile `WORKDIR /app` pairs with the default
        # SqliteConnectionString "Data Source=data/llm-bridge.db" (relative).
        # The env var above overrides to /data, so /app/data can be emptyDir.
        - name: app-data
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: fc-llm-bridge
  namespace: fc-llm-bridge
spec:
  selector:
    app.kubernetes.io/name: fc-llm-bridge
  ports:
    - port: 8080
      targetPort: 8080
      name: http
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: fc-llm-bridge-cert
  namespace: fc-llm-bridge
spec:
  secretName: fc-llm-bridge-tls
  issuerRef:
    name: step-ca-acme
    kind: ClusterIssuer
  dnsNames:
    - fc-llm-bridge.iamworkin.lan
  duration: 720h
  renewBefore: 240h
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: fc-llm-bridge
  namespace: fc-llm-bridge
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`fc-llm-bridge.iamworkin.lan`)
      kind: Rule
      services:
        - name: fc-llm-bridge
          port: 8080
  tls:
    secretName: fc-llm-bridge-tls