Two changes after the Phase 2.4 deploy went live at https://knowledge.iamworkin.lan:

1. **Ollama URL flip**: from BLUEJAY-WS (10.0.56.20:11434) to edge1 Pi 5 (10.0.57.17:11434). Honors the cluster-clean architecture from bluejay-infra@0f9d56e ("Workstation is private dev hardware and should not be in the cluster path"). Query-time embeddings (~ms per query) are fast enough on edge1; bulk index rebuilds (Phase 2.5+) will need a separate ingestion lane that can opt into the workstation GPU when present. ArgoCD picks up the env-var change and rolls the pod automatically; no image rebuild needed.

2. **README LIVE status**: flip the staged-not-yet-applied banner to LIVE 2026-04-27. Pod running, certificate issued, PVC bound, /healthz 200, /api/v1/editions [] (initial-deploy state). Phase 2.5+ admin UI handles bulk population.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
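
The two checks implied by the commit message above (edge1 answers embedding requests, and the service is in its initial-deploy state) could be sketched in Python roughly as follows. This is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes the service talks to Ollama's `/api/embeddings` endpoint with a `model` + `prompt` body (newer Ollama versions also offer `/api/embed`; which one Knowledge.Web actually calls is not stated here). The helpers only build the request and evaluate responses, so they can be exercised without network access.

```python
import json
import urllib.request

# Values copied from the commit message and manifest.
KNOWLEDGE_URL = "https://knowledge.iamworkin.lan"
OLLAMA_URL = "http://10.0.57.17:11434"
EMBEDDING_MODEL = "nomic-embed-text"


def build_embedding_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to Ollama's /api/embeddings endpoint.

    Assumption: the legacy embeddings endpoint with a {"model", "prompt"} body.
    """
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_live_status(healthz_status: int, editions_body: str) -> list[str]:
    """Return a list of problems; an empty list means the initial-deploy state holds.

    Expected state per the commit message: /healthz 200 and /api/v1/editions [].
    """
    problems = []
    if healthz_status != 200:
        problems.append(f"/healthz returned {healthz_status}, expected 200")
    try:
        editions = json.loads(editions_body)
    except json.JSONDecodeError:
        problems.append("/api/v1/editions did not return JSON")
    else:
        if editions != []:
            problems.append(f"/api/v1/editions returned {editions!r}, expected []")
    return problems
```

To run it against the live endpoints, pass the built request to `urllib.request.urlopen(..., timeout=5)` from a host that can reach 10.0.57.17 and that trusts the step-ca root used for `knowledge.iamworkin.lan`; both conditions are assumptions about the caller's environment.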
231 lines
7.3 KiB
YAML
# FlowerCore.Knowledge.Web: fleet vector indexing & RAG hub.
#
# Phase 2.4 of the Knowledge service plan. REST + MCP service that scans
# *.db files under /data/vector-stores and exposes:
#   - REST: /api/v1/editions, /api/v1/corpus/search, /healthz
#   - MCP:  list_editions, describe_edition, corpus_search
#   - Static OpenAPI/Scalar via UseFlowerCoreApi
#
# Architecture:
#   Plan:   FlowerCore.Notes/docs/ai-agents/flowercore-knowledge-service-plan.md
#   Sprint: FlowerCore.Notes/docs/ai-station/sprint-e-xxl-plan.md (Track B)
#   Repo:   D:\git\FlowerCore\FlowerCore.Knowledge\
#   Shared: FlowerCore.Common -> FlowerCore.Shared.Indexing (chunkers, vector
#           stores, edition profiles, ICorpusSearchService facade)
#
# Deployment order (see apps/knowledge/README.md and the bluejay-infra/README.md
# top-level checklist):
#   1. FlowerCore.DNS public A record knowledge.iamworkin.lan -> 10.0.56.200
#      MUST exist BEFORE the Certificate is created, or cert-manager HTTP-01
#      backs off ~2h. Memory: feedback_pfsense_dns_required_for_acme.
#   2. Build + import the image to ALL RKE2 nodes (server + both agents) since
#      the Pod uses a Longhorn PVC and may schedule anywhere.
#      Memory: feedback_rke2_localhost_imagepullpolicy.
#   3. Bump the image tag in this file, git push.
#   4. ArgoCD ApplicationSet picks up within ~3 minutes and creates
#      infra-knowledge.
#
# Initial-deploy state:
#   The Longhorn PVC is empty on first deploy. Knowledge.Web's filesystem
#   catalog will report zero editions until vector-store *.db files are
#   pushed into /data/vector-stores. Initial population is a follow-up step
#   (Phase 2.5+, Blazor admin UI's "Rebuild" button); for the first deploy
#   the goal is just to prove the pod boots, /healthz returns 200, and the
#   Traefik IngressRoute serves the Scalar UI.
---
apiVersion: v1
kind: Namespace
metadata:
  name: knowledge
  labels:
    app.kubernetes.io/part-of: bluejay-infra
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: knowledge-vector-store
  namespace: knowledge
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: knowledge-web
  namespace: knowledge
  labels:
    app: knowledge-web
    app.kubernetes.io/name: knowledge-web
    app.kubernetes.io/part-of: bluejay-infra
spec:
  replicas: 1
  revisionHistoryLimit: 3
  # RWO Longhorn PVC blocks rolling updates (multi-attach error). Recreate
  # is the canonical pattern (memory: feedback_rwo_pvc_blocks_rolling).
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: knowledge-web
  template:
    metadata:
      labels:
        app: knowledge-web
        app.kubernetes.io/name: knowledge-web
        app.kubernetes.io/part-of: bluejay-infra
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      securityContext:
        runAsNonRoot: true
        fsGroup: 1654
        fsGroupChangePolicy: OnRootMismatch
      containers:
        - name: web
          # Placeholder tag: bump to the image you built + imported to ALL
          # RKE2 nodes via scripts/deploy-knowledge.sh before applying.
          image: localhost/fc-knowledge-web:v202604272200
          imagePullPolicy: Never
          ports:
            - containerPort: 8080
              name: http
          env:
            - name: ASPNETCORE_URLS
              value: "http://+:8080"
            - name: ASPNETCORE_ENVIRONMENT
              value: "Production"
            - name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT
              value: "false"
            # Vector-store directory + embedding model + edition profile dir.
            # Profile JSON is baked into the image at /app/editions via the
            # csproj Content-link from FlowerCore.Common/editions/.
            - name: Knowledge__VectorStoresDirectory
              value: "/data/vector-stores"
            - name: Knowledge__EmbeddingModel
              value: "nomic-embed-text"
            - name: Knowledge__DefaultLimit
              value: "5"
            - name: Knowledge__MaxLimit
              value: "50"
            - name: FlowerCore__Editions__ProfileDirectory
              value: "/app/editions"
            # Embed via edge1 Pi 5 + AI HAT+ (10.0.57.17:11434). Cluster
            # services do not depend on BLUEJAY-WS (private dev hardware) per
            # bluejay-infra@0f9d56e. Query-time embedding is fast enough on
            # edge1 (~ms per query); bulk index rebuilds (Phase 2.5+) will
            # need a separate ingestion lane that can opt into the
            # workstation GPU when present.
            - name: FlowerCore__Ollama__BaseUrl
              value: "http://10.0.57.17:11434"
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          # /healthz is mapped by HealthController (controller-based route).
          # tcpSocket liveness is the defensive fallback in case middleware
          # later gates /healthz behind auth (memory:
          # feedback_k8s_probes_behind_auth_middleware).
          startupProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 30
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 10
            failureThreshold: 3
          livenessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 30
            failureThreshold: 3
          securityContext:
            runAsNonRoot: true
            runAsUser: 1654
            runAsGroup: 1654
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: vector-store
              mountPath: /data/vector-stores
            - name: tmp
              mountPath: /tmp
            - name: logs
              mountPath: /app/logs
      volumes:
        - name: vector-store
          persistentVolumeClaim:
            claimName: knowledge-vector-store
        - name: tmp
          emptyDir: {}
        - name: logs
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: knowledge-web
  namespace: knowledge
  labels:
    app: knowledge-web
    app.kubernetes.io/name: knowledge-web
    app.kubernetes.io/part-of: bluejay-infra
spec:
  type: ClusterIP
  selector:
    app: knowledge-web
  ports:
    - name: http
      port: 80
      targetPort: 8080
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: knowledge-tls
  namespace: knowledge
spec:
  secretName: knowledge-tls
  issuerRef:
    name: step-ca-acme
    kind: ClusterIssuer
  dnsNames:
    - knowledge.iamworkin.lan
  duration: 2160h # 90d
  renewBefore: 720h # 30d
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: knowledge
  namespace: knowledge
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`knowledge.iamworkin.lan`)
      kind: Rule
      services:
        - name: knowledge-web
          port: 80
  tls:
    secretName: knowledge-tls