Compare commits

..

128 Commits

Author SHA1 Message Date
Andrew Stoltz
40fd35ba44 deploy(chat): pin CH-6 presence image 2026-06-14 19:26:31 -05:00
Andrew Stoltz
17654835e7 gx10/platform: step-ca-acme issuer + Traefik HelmChart (migration platform layer)
Bootstrap manifests for the GX10 cluster platform layer (NUC->GX10 migration).
Direct-applied to GX10 + LIVE: step-ca-acme ClusterIssuer Ready (ACME->noc1 step-ca),
Traefik v3.6.10 via RKE2 HelmChart CRD at MetalLB VIP 10.0.57.202 (prod-pool, temp
parallel-run; no clash with live old .200). Under gx10/ NOT apps/* to avoid the old
ApplicationSet auto-deploying GX10 manifests to the OLD cluster.
2026-06-14 18:06:25 -05:00
Andrew Stoltz
63b8d4b667 Deploy Chat regroup CH-3 image 2026-06-14 18:01:43 -05:00
Andrew Stoltz
2c12f35f75 agent-zero: fix fc_dms netpol egress port (8080 = pod targetPort, not svc 80)
NetworkPolicy matches the destination POD port. dms-web svc:80 -> containerPort
8080, so the egress must allow 8080 (the fc-chat rule already allows 80+8080,
which is why chat worked and dms timed out). Add 8080 to the fc-dms egress.
2026-06-14 16:25:25 -05:00
Andrew Stoltz
e33fe81823 agent-zero: connect fc_dms MCP (product-manager fan-out, first server)
AZ only had fc_chat (chat-session) + fc_knowledge (RAG) — so it had no product
capabilities (the 'mysql manager' gap). Wire fc_dms (dynamic message signs, ~13
tools): OnePasswordItem dms-mcp-keys (1P 'FlowerCore DMS MCP Keys' field credential)
-> DMS_MCP_API_KEY -> X-Api-Key; builder adds fc_dms; netpol egress fc-dms:80.
Proven: dms-web/mcp returns 200 with this key. presentations/messageboard/
segmentdisplay/telephony 1P MCP-key items exist for the same pattern; mysql+signage
need 1P items provisioned first (mysql/mcp 401s with no key). Watch context budget.
2026-06-14 16:19:34 -05:00
Andrew Stoltz
ef6afdd577 fc-llm-bridge: repoint Ollama to GX10 NodePort (fix AZ MTU black-hole)
The PROD-VLAN VIP 10.0.57.201 MTU-black-holes Agent Zero's ~150KB requests
(full prompt + 108 MCP tools) -> connection reset mid-stream -> AZ 'same message
again' loop. Switch FlowerCore__Chat__OllamaBaseUrl to the INFRA-VLAN NodePort
10.0.56.14:30976 (same VLAN as the old cluster, carries 150KB fine). Verified:
150KB POST = 200 via NodePort, times out via VIP. NodePort pinned to 30976 on GX10.
2026-06-14 15:12:05 -05:00
Andrew Stoltz
62ca7dacf6 telephony: deploy ARI abort-fix image v20260614-arifix; drop 3600s band-aids
Image -> v20260614-arifix (Telephony 86ff0d0: ReceiveAsync no longer cancelled).
Remove the WebSocketKeepAliveTimeoutSeconds/WebSocketReceiveTimeoutSeconds=3600
band-aids; the code now disables the pong deadline by default and ignores the
receive timeout (liveness = keepalive ping + WebSocketException/Close).
2026-06-14 14:36:11 -05:00
Andrew Stoltz
d03a92407d gx10/tts: persist Piper /tts source + manifest (telephony TTS port baseline)
Dockerfile (linux/arm64, en_US-amy-medium baked), tts_service.py (16kHz/16-bit/mono
WAV, numpy resample 22050->16000), gx10-tts.yaml (CPU NodePort 30850, no GPU request),
README (build/import/cutover/verify on the GX10 cluster).
2026-06-14 14:14:59 -05:00
Andrew Stoltz
e4d1735d35 telephony: make TTS cutover EFFECTIVE via Tts__PiperUrl env (overrides configmap)
Root cause: the live deploy carried env Tts__PiperUrl=edge1 (drifted, not in git)
which shadows appsettings Tts.PiperUrl. Codify Tts__PiperUrl=GX10 + Ari__ env to
match live so git is source-of-truth; the configmap edit alone was inert.
2026-06-14 14:12:02 -05:00
Andrew Stoltz
15edcb7c71 telephony: cut TTS over to GX10 (10.0.56.14:30850, amy-medium); keep edge1 warm
- Tts.PiperUrl edge1 10.0.57.17:8500 -> GX10 NodePort 10.0.56.14:30850
- add netpol egress to GX10 TTS; keep edge1 egress as rollback target
- DefaultEngine piper / SampleRate 8000 unchanged (sln16 16kHz path)
2026-06-14 14:01:50 -05:00
Andrew Stoltz
284ca84166 agent-zero: GX10 system prompt rewrite (tool-calling + RAG rules, strip dead lanes)
Sync the bluejay-profile ConfigMap's embedded system_prompt.md with the
rewritten scripts/agent-zero/agents/bluejay/system_prompt.md: Ollama section
-> GX10 hub (VIP 10.0.57.201, GB10/121GiB); model table with tool-calling
flags (qwen2.5 = tools, gemma3 = 400-on-tools/vision-only, nomic = embed);
new 'Models & Tool-Calling' + 'Knowledge & RAG' rule blocks; stripped dead
WSL/R9700/.132/host.docker.internal/port-30050 lanes; de-pinned test counts;
'Blu' team is persona vocabulary not a fixed team. Personality preserved.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 13:40:25 -05:00
Andrew Stoltz
7a86c40cf1 fix(telephony): ARI receive timeout 45s->3600s — the real false-abort root cause
Cancelling ClientWebSocket.ReceiveAsync via CancellationToken ABORTS the
socket (a half-read WS frame can't resume). The per-iteration
iterationCts.CancelAfter(WebSocketReceiveTimeoutSeconds) therefore aborted a
healthy idle ARI WebSocket every 45s (state=Aborted), not the keepalive pong
(proven: loop persisted after pong-timeout 15s->3600s). A large receive
timeout lets ReceiveAsync block harmlessly while the PBX is idle; real drops
still surface immediately as WebSocketException -> reconnect. Proper code fix
(stop using CancelAfter on the receive) tracked separately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 13:04:13 -05:00
Andrew Stoltz
de5c9f39fd deploy(devicemgmt): pin regroup web image 2026-06-14 12:52:30 -05:00
Andrew Stoltz
d5311de676 fix(telephony): stop ARI WebSocket false-abort loop (pong-timeout 15s->3600s)
Asterisk res_http_websocket does not reliably answer client PING frames
with PONG, so .NET KeepAliveTimeout (default 15s) aborted a healthy idle
ARI WebSocket every ~45s (ping@30s + pong-wait@15s), dropping StasisStart
events so the *100 IVR intermittently answered with no audio. Generous pong
timeout stops the false aborts; genuine drops still caught by the 45s
receive-timeout state re-check and TCP-level WebSocketException.

Surfaced by FlowerCore.Telephony.SipTests Call_Star100_ReceivesAudibleAudioStream
(0 RTP packets while ExtToExt RTP-hook passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 12:50:12 -05:00
Andrew Stoltz
7b4f57bb97 deploy(updater): pin regroup web image 2026-06-14 12:45:39 -05:00
Andrew Stoltz
c569c05ad7 deploy(retail-library): roll regroup web images 2026-06-14 12:38:57 -05:00
Andrew Stoltz
fc8297041a deploy(fc-chat): roll effective-prompt debug reveal v20260614-debugreveal-d389e4b
Influence Audit panel now surfaces the per-turn effective prompt
(RagContextSnapshot) as an operator/debug row. FlowerCore.Chat d389e4b.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 12:33:37 -05:00
Andrew Stoltz
e1554757e8 deploy(fc-chat): roll user-bubble prompt-leak fix v20260614-bubblefix-37f57b0
Stored/displayed user message is now the raw prompt; injected scaffolding
(mood contract + guidance + memory) goes to the model via ragContext as a
system message and is captured in RagContextSnapshot for debug.
FlowerCore.Chat 37f57b0 + FlowerCore.Common 4d741b3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 03:15:26 -05:00
Andrew Stoltz
0c8e6ee8ab agent-zero(models): tool-capable qwen2.5 on GX10 via fc-llm-bridge (Wiring A)
Agent Zero's agentic tool-loop ran on cloud Anthropic Sonnet (the bridge's
Anthropic key is currently 401) + gemma3:4b util (gemma3 returns 400 "does not
support tools" — fatal for the loop). Repoint the bridge ModelRouter tiers:
Balanced -> Ollama qwen2.5:14b (AZ chat) and Cheap -> qwen2.5:7b (AZ util), both
on the GX10 VIP 10.0.57.201 (already the bridge OllamaBaseUrl). Env-only, no
rebuild; Wiring A keeps the budget ledger + cache. Also: AZ chat ctx -> 32768,
browser -> qwen2.5:7b (text/tool-capable, vision off), AGENT_NAME -> "Blue Jay"
(the NUC role is retired). qwen2.5:7b + :14b pulled + warm-pinned on the GX10.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 02:38:17 -05:00
Andrew Stoltz
9d5a1cce97 deploy(fc-chat): roll mood-signal build v20260614-moodsignal-a606892
Workstream A: set_mood structured signal replaces leaky [mood:X] text
(FlowerCore.Chat a606892). Image built + imported to rke2-server and
rke2-agent1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 02:21:47 -05:00
Andrew Stoltz
e0460bd881 infra(ai): consolidate fleet Ollama consumers onto GX10 VIP 10.0.57.201
Repoints fc-chat, fc-ttsreader, knowledge, fc-llm-bridge (off the slow edge1
Pi5 10.0.57.17) and intranet (off the reimaged BLUEJAY-AI test laptop
10.0.56.132) to the GX10 (DGX Spark / GB10) Ollama over the PROD MetalLB VIP
10.0.57.201. GX10 serves gemma3:12b/gemma3:4b/qwen2.5:1.5b/nomic-embed-text/
llama3.2:1b on local NVMe, warm-pinned (keep_alive=-1).

fc-chat default model qwen2.5-coder:7b -> gemma3:12b (the coder model won't
pull reliably on the GX10; gemma3:12b is the warm fleet default + a better
general-chat model). Other consumers keep their exact models. Inline comments
referencing edge1/BLUEJAY-AI are now historical; the values are the GX10 VIP.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 00:54:36 -05:00
Robot
303c450bc9 Cl-5: Admin console infra finding — rides DM.Web (zero new infra)
Audit of apps/fc-devicemgmt/ confirms the admin/helpdesk console needs NO new
infra: the existing host-matched IngressRoute (devices.iamworkin.lan, no path
constraint) + step-ca-acme Certificate already cover admin routes served under
FlowerCore:PathBase (ADR-204 routes-inside-DM.Web). ADMIN-CONSOLE-INFRA.md
records the finding + the open Q-MP question (distinct admin hostname vs PathBase
path) with the exact 3-step add if a separate host is later chosen.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 23:22:16 -05:00
Andrew Stoltz
9dd170a9ac deploy(chat): route wave5 chat ollama to edge1 2026-06-13 22:59:18 -05:00
Andrew Stoltz
50a3ee5e8e deploy(chat): enable helpdesk sentiment escalation 2026-06-13 22:51:21 -05:00
Andrew Stoltz
87de007a7f deploy(wave5): roll deep-regroup product images 2026-06-13 22:48:31 -05:00
Andrew Stoltz
77df227425 deploy(intranet): roll product docs image 2026-06-13 20:23:08 -05:00
Andrew Stoltz
a65f422147 infra(gated): stage authentik-tenant-mapping-sync CronJob (Au-3, suspended)
Gated substrate (Cl2-4 / Cl-infra-3) — outside apps/ so the ApplicationSet
will not deploy it, and spec.suspend: true. Reconciles the 1Password
tenant-mapping doc into Authentik groups via Connect REST. Activate at Au-3
public-go (un-suspend + materialize the script ConfigMap). Pairs Codex Cx2-7.
Canonical script: FlowerCore.Notes/scripts/authentik/authentik-tenant-mapping-sync.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 17:34:29 -05:00
Andrew Stoltz
6cb54abfa7 perf(intranet): repoint embed backend to BLUEJAY-AI GPU (10.0.56.132) for faster bulk embed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 17:20:28 -05:00
Andrew Stoltz
d06637b747 deploy(cx2-1): roll chat and intranet wave images 2026-06-13 17:18:11 -05:00
Andrew Stoltz
387097485e infra(public-tls): add gated Let's Encrypt issuers + tenant NetworkPolicy substrate
Cl-infra-2 (deep-regroup 2026-06-13). LE staging+prod ClusterIssuers (HTTP-01
via Traefik, DNS-01 stub) + a per-tenant default-deny NetworkPolicy template,
under gated/public-tls/ OUTSIDE apps/ so the ApplicationSet does NOT auto-apply
them (an applied ACME ClusterIssuer registers an account immediately). Internal
*.iamworkin.lan TLS stays on step-ca. Inert until the operator opens the
web-hosting public-exposure gate (R-1; 14/14 blockers red). Pairs with Codex
Wh-C1 (hybrid public TLS) + Wh-C2 (isolation).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 12:06:31 -05:00
Andrew Stoltz
b098604a6f fix(intranet): point IntranetSearch embed backend at edge1 by IPv4 (10.0.57.17)
The hostname edge1.iamworkin.lan resolves to an unroutable IPv6 from cluster
pods and the CoreDNS *.iamworkin.lan template maps it to the Traefik VIP, so
the corpus indexer failed every embed with "No route to host". edge1's IPv4
(10.0.57.17, PROD VLAN) is pod-routable and has nomic-embed-text; an in-pod
embed test returned real vectors. This makes the now-enabled notes-md/notes-html
indexes actually populate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 11:55:26 -05:00
Andrew Stoltz
110d6fd1e0 infra(intranet): mount Notes docs corpus + enable IntranetSearch indexer
Cl-infra-1 (deep-regroup 2026-06-13). Adds a notes-corpus-clone initContainer
(shallow git clone of bluejay/FlowerCore.Notes into an emptyDir at
/srv/flowercore-notes) + a notes-corpus-sync sidecar (30-min pull) and flips
IntranetSearch__Enabled false->true so the previously doubly-disabled indexer
has a corpus to index (768 md + 108 html under docs/).

- Trailing-dot FQDN gitea-clusterip.gitea.svc.cluster.local. bypasses a CoreDNS
  *.iamworkin.lan template that mis-resolves the in-cluster service name to the
  Traefik VIP for musl / ndots:5 pods (search-domain appending).
- Cred via gitea-corpus-cred secret (canonical 1P bluejay read cred, created
  imperatively in-ns; mirrors the gitea-flowercore-notes argocd repo-cred pattern).
- First-boot bulk embed runs in background via edge1 Ollama; /health stays Ready.

Pairs with Codex In-1 (intranet app-side reindex endpoint + SemaphoreSlim).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 11:48:24 -05:00
Andrew Stoltz
6b2e6a61d0 deploy(dns): roll hosting quota image 2026-06-13 02:06:40 -05:00
Andrew Stoltz
503685d0f5 deploy(devicemgmt): roll windows update policy image 2026-06-13 00:46:30 -05:00
Andrew Stoltz
05f37df5d2 deploy(devicemgmt): roll sqlite-safe trust bundle image 2026-06-13 00:12:13 -05:00
Andrew Stoltz
f3afa64c5d deploy(devicemgmt): roll edge network enrollment image 2026-06-13 00:04:44 -05:00
Andrew Stoltz
b4a1cb63f0 deploy: roll dns tenant repeat fix image 2026-06-12 22:54:30 -05:00
Andrew Stoltz
d95aa453ea deploy: roll dns web repeatable tenant image 2026-06-12 22:45:13 -05:00
Andrew Stoltz
0bbba2739c deploy: roll devicemgmt ollama gateway image 2026-06-12 22:16:15 -05:00
Andrew Stoltz
99f49c1b75 deploy: roll devicemgmt patch ledger image 2026-06-12 21:55:07 -05:00
Andrew Stoltz
14a0e87513 deploy: roll devicemgmt sqlite enrollment fix 2026-06-12 21:32:49 -05:00
Andrew Stoltz
d2e8b5f4a8 deploy: roll devicemgmt enrollment image 2026-06-12 21:26:22 -05:00
Andrew Stoltz
861ed42e2c deploy: roll e4 conformance web images 2026-06-12 19:48:07 -05:00
Andrew Stoltz
605073c299 deploy(devicemgmt): roll e3 ollama policy pack image 2026-06-12 19:27:08 -05:00
Andrew Stoltz
346b287a3d chore(fc-devicemgmt): bump web to v20260612-hubfix-afa9f4d (DeviceAgentHub ct-param enrollment outage fix)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 18:10:37 -05:00
Andrew Stoltz
6bd02f5781 chore(worldbuilder): deploy C7 next arc image 2026-06-12 18:08:54 -05:00
Andrew Stoltz
2a2b416d12 chore(dns): deploy C4 tenant onboarding image 2026-06-12 17:47:37 -05:00
Andrew Stoltz
d3ae09865a chore(chat): deploy C8 action execution image 2026-06-12 17:24:55 -05:00
Andrew Stoltz
637a8ffd69 chore(devicemgmt): deploy C13 policy web image 2026-06-12 17:01:37 -05:00
Andrew Stoltz
6ab232761d chore(ttsreader): bump fc-ttsreader-web to v20260612-ui-conformance (FC UI conformance D5)
Gold PWA primary CTA (mobile-button--primary blue->gold cascade fix) + About
operator jump-links / honest update-status / license (FcAboutPanel contract).
Image built + imported to rke2-server + rke2-agent1; pin so ArgoCD adopts the
new tag instead of reverting the kubectl set image.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 15:51:57 -05:00
Andrew Stoltz
bfe42cf44e feat(fc-network): add FlowerCore.Network app (read-only pfSense plane, ADR-189)
Stand up the pfSense automation plane (Phase 0, read-only) on RKE2 as an
ArgoCD-managed workload at network.iamworkin.lan.

- namespace fc-network
- Deployment fc-network-web: localhost/fc-network-web:v20260612-0b5b049,
  imagePullPolicy Never, port 5340, /healthz probes, runAsNonRoot 1654 +
  readOnlyRootFilesystem, RWO-safe RollingUpdate (maxSurge 0/maxUnavailable 1),
  auth gate-OFF, SQLite + snapshot-store + intended-model paths under /data.
- PVC fc-network-web-data (longhorn, 2Gi): SQLite index + on-box snapshot store
  (full-fidelity raw config.xml stays on-box; service surfaces redacted only).
- Service (ClusterIP 80 -> 5340), Certificate (ClusterIssuer step-ca-acme),
  IngressRoute (network.iamworkin.lan, all methods — POST ingest is local-only).
- kustomization.yaml for local previews / single-app validation.

The ApplicationSet git generator picks this up as infra-fc-network; if it lags,
the Application is applied manually (documented pattern).
2026-06-12 14:21:45 -05:00
Andrew Stoltz
bf96f7b9a2 deploy(devicemgmt): use rwo-safe rolling strategy 2026-06-12 12:42:20 -05:00
Andrew Stoltz
8be054f99a deploy(devicemgmt): use recreate for sqlite pvc rollout 2026-06-12 12:38:05 -05:00
Andrew Stoltz
6abb2d6408 deploy(devicemgmt): roll L8 web image 2026-06-12 12:33:15 -05:00
Andrew Stoltz
8e2c960be3 deploy(dns): align l4 image and auth gate 2026-06-12 12:10:23 -05:00
Andrew Stoltz
c482b66187 deploy(worldbuilder): bump image to v202606121657-35aaa2c-gpu (L2 UI sweep)
Ships the L2 pilot UI sweep to worldbuilder.iamworkin.lan: the dashboard
fc-component fix (missing-styles), ComfyUI local detection, and the rebuilt
About page. Image imported to rke2-server (10.0.56.11) + rke2-agent1
(10.0.56.12). rke2-agent2/10.0.56.13 is retired and was not used.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 12:01:16 -05:00
Andrew Stoltz
bacb756173 feat(fc-desktop): OnePasswordItem CRD for remotedesktop-oidc-client (L9 flip-readiness, gate stays OFF)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 11:31:07 -05:00
Andrew Stoltz
8a576c95ed deploy(fc-ttsreader): v20260612-readalong-corrections
TtsReader master@355a9c6: global pronunciation correction memory
(/corrections + REST/MCP), public read-along embed manifests with
fc-reader single-file cue windows (Common@639e233), mood gathering
timelines, listening-note capture, approved-only render contract fix,
and Codex Phase 14.2 rehearsal cue sheets (#42). Tests 1609/1609.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 10:07:37 -05:00
Andrew Stoltz
41c2243f09 deploy(intranet): roll screenshot metadata image 2026-06-12 01:15:23 -05:00
Andrew Stoltz
c21e602e4d deploy(intranet): roll page reading profile image 2026-06-12 00:34:21 -05:00
Andrew Stoltz
9f6b71c400 deploy(intranet): roll remotedesktop api ref image 2026-06-11 19:23:07 -05:00
Andrew Stoltz
26f90acf1f deploy(intranet): roll platform badge image 2026-06-11 18:59:25 -05:00
Andrew Stoltz
ab00d22657 deploy(worldbuilder): roll route fix image 2026-06-11 16:17:17 -05:00
Andrew Stoltz
c1a43c64b3 deploy(worldbuilder): enable live gpu backend 2026-06-11 16:05:40 -05:00
Andrew Stoltz
7103658342 deploy(intranet): roll regroup follow-through image 2026-06-11 15:58:12 -05:00
Andrew Stoltz
6b12b2bb49 deploy(intranet): roll operator depth image 2026-06-11 15:06:08 -05:00
Andrew Stoltz
a4c9e44a36 fix(runners): disable self-update in k8s pods 2026-06-11 14:57:00 -05:00
Andrew Stoltz
9674a9555e deploy(intranet): roll article depth image 2026-06-11 14:27:24 -05:00
Andrew Stoltz
318252da76 deploy(devicemgmt): roll healthz web image 2026-06-11 14:27:14 -05:00
Andrew Stoltz
3798b7c00e deploy(devicemgmt): enable web runtime 2026-06-11 14:21:51 -05:00
Andrew Stoltz
2707f1ae1e deploy(intranet): roll regroup catalog image 2026-06-11 12:32:40 -05:00
Andrew Stoltz
a7e7c1ae72 deploy(intranet): roll content quality image 2026-06-10 20:13:56 -05:00
Andrew Stoltz
c8df788d72 deploy(intranet): roll webmail health image 2026-06-10 19:15:44 -05:00
Andrew Stoltz
b1a4d7120e deploy(intranet): roll registry health image 2026-06-10 19:10:31 -05:00
Andrew Stoltz
4b57b8e939 fix(intranet): align search deploy config 2026-06-10 19:01:08 -05:00
Andrew Stoltz
70f36c546b deploy(intranet): roll hardening image 2026-06-10 18:58:09 -05:00
Robot
cdbddd71af fc-devicemgmt: stage fresh web image v20260610-bluejay (master 1614fce)
Image built from current DM master (network/BT command plane + Blue Jay
UI.Components restyle) and imported on rke2-server + rke2-agent1.
Deployment stays parked at replicas: 0 — gap 1 is wider than previously
noted (the fc-mysql Operator deployment itself is absent, so instance
CRDs would not reconcile) and gap 2 (1P runtime item) is still open.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 16:57:43 -05:00
Andrew Stoltz
81ac1f3e4f authentik: align volumeClaimTemplates TypeMeta with SSA-created live object
StatefulSet/authentik-postgres has been eternally OutOfSync since ~Sprint 65
even though 'kubectl diff --server-side --field-manager=argocd-controller'
shows zero real change. The STS was created via ServerSideApply, so the live
object carries apiVersion/kind inside volumeClaimTemplates[]; git omitting
them makes ArgoCD's normalized diff disagree forever. Declare them in git.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:18:29 -05:00
b842738a0e Merge pull request 'Sprint 63 Cx-10: align hardening probe paths with live routes' (#44) from codex/s63-cx10 into main
Sprint 63 Cx-10 live-proof fix after Traefik curls found three stale probe-path annotations. Local lint 100/100; git diff --check clean; no Gitea statuses attached.
2026-06-05 03:02:14 +00:00
Andrew Stoltz
f0cb7a5e81 fix(hardening): align probe-path annotations with live health routes 2026-06-04 22:01:04 -05:00
ac0f665323 Merge pull request 'Draft: Sprint 62 Cx-10 broader exposure hardening' (#43) from codex/s62-cx10 into main
Sprint 63 Cx-10 reconcile-first merge after local lint proof: 100/100 passed, no Gitea statuses attached, CRLF diff check clean.
2026-06-05 02:51:37 +00:00
Andrew Stoltz
c4b08f41ab feat(infra): prestage broader app exposure hardening 2026-06-04 18:14:22 -05:00
Andrew Stoltz
417d3830ae test(lint): reconcile baseline infra assertions 2026-06-04 18:02:32 -05:00
cb4ea13e7a monitoring: mirror Sprint 60 probe coverage
Merged on local lint plus live noc1 Prometheus /api/v1/rules proof.
2026-06-04 18:19:47 +00:00
Andrew Stoltz
a3cd67d6bb monitoring: mirror Sprint 60 probe coverage 2026-06-04 13:15:18 -05:00
Andrew Stoltz
81a3ddac4c fix(auth): mark OIDC healthz probes anonymous 2026-06-04 11:03:20 -05:00
300f8ad546 fix(monitoring): probe OIDC-safe health routes
Sprint 58 Cx-12. Rebased over OIDC GitOps main; YAML parse and focused bluejay-infra lint tests passed.
2026-06-04 06:45:34 +00:00
fe38c2641f Merge pull request 'fix(auth): deploy distribution root anonymous image' (#38) from codex/s58-distribution-root-anon-gitops into main 2026-06-04 06:20:09 +00:00
Andrew Stoltz
3b40dfb185 fix(auth): deploy distribution root anonymous image 2026-06-04 01:19:16 -05:00
103878671c Merge pull request 'fix(auth): deploy Distribution OIDC image tag' (#37) from codex/s58-oidc-proper into main 2026-06-04 06:05:15 +00:00
Andrew Stoltz
36039c1335 fix(auth): deploy distribution oidc image tag 2026-06-04 01:04:44 -05:00
2a66109f13 Merge pull request 'feat(auth): adopt OIDC GitOps for DNS Distribution Media' (#36) from codex/s58-oidc-proper into main 2026-06-04 05:52:56 +00:00
Andrew Stoltz
933fea89d1 feat(auth): adopt oidc apps in gitops 2026-06-04 00:49:36 -05:00
Andrew Stoltz
13f9bb7710 fix(distribution): revert OIDC enforcement — enabling it gated /healthz probe (service down)
Flipping Auth__Enabled=true gated the /healthz readiness probe (302->NotReady->
no endpoints->distribution.iamworkin.lan down, healthz=000). Classic
feedback_k8s_probes_behind_auth_middleware. Revert to false (OIDC env block kept,
gate off) to restore service. Proper fix (AllowAnonymous /healthz + CA-trust +
idempotent Editions seed + OIDC-challenge wiring + browser-proof) -> falcon OIDC lane.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 23:47:29 -05:00
Andrew Stoltz
9a58fd2af6 oidc: flip enforcement ON for knowledge + distribution (no-live-proof, fix-forward)
Operator 2026-06-04: nothing is production yet, flip OIDC + fix-forward (no
browser-proof gate). knowledge: Auth__Enabled false->true (OIDC env already
wired). distribution: add OIDC env block (Authority/Audience/ClientId=distribution,
ClientSecret from distribution-oidc-client) + Enabled=true; public read/entitlement
+ Method() allowlist stay open (OIDC gates admin only). Clients already provisioned
(secrets present). ArgoCD deploys both.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 23:38:48 -05:00
Andrew Stoltz
404d884863 Adopt live Library Retail AiStation web apps 2026-06-03 20:24:32 -05:00
f4bd90f805 Merge pull request #33 from codex/s56-monitoring-coverage
fix(monitoring): repoint pirelay scrape to signalcontrol
2026-06-04 01:22:49 +00:00
Andrew Stoltz
67d67ab73d fix(monitoring): repoint pirelay scrape to signalcontrol 2026-06-03 20:20:36 -05:00
Andrew Stoltz
f7d41cdc60 revert: drop fc-library manifest — Library.Web already deployed live (41h)
Library.Web is already running + serving at library.iamworkin.lan (root=200,
healthz=200), deployed manually 41h ago (image fc-library-web:v20260602-...,
PVC library-web-data holding the live SQLite DB). My from-scratch manifest used
a different PVC name (library-data) which ArgoCD would attach as a fresh empty
volume, orphaning the live DB. Adopting the live deploy into GitOps is a
separate careful task. Not disturbing a working deployment.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 19:30:23 -05:00
Andrew Stoltz
2c0afc28e4 deploy(fc-library): add Library.Web internal-host deployment
From-scratch .Web deploy at library.iamworkin.lan (operator-authorized 2026-06-03).
Cloned from the worldbuilder pattern: Deployment + Service + Longhorn RWO PVC +
step-ca cert + Traefik IngressRoute. SQLite at /data/library.db, no OIDC, both
/health + /healthz probes. Image localhost/fc-library:v202606031925 imported to
both RKE2 nodes. DNS library.iamworkin.lan -> 10.0.56.200 already in pfSense.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 19:28:22 -05:00
Robot
ba5f5dd0fb deploy(knowledge): roll audit backfill fix 2026-06-03 18:24:22 -05:00
Robot
dc699da7b3 fix(knowledge): persist federation database on PVC 2026-06-03 18:17:31 -05:00
Robot
1e8bf54c6e deploy: roll Chat and Knowledge OIDC images 2026-06-03 18:13:09 -05:00
Andrew Stoltz
e2e93d482c Deploy TtsReader schema repair image
Co-Authored-By: Codex <codex@openai.com>
2026-06-02 22:00:15 -05:00
4319cc2b51 Merge PR #32: divoom pi deploy artifact manifests
Lands Divoom-as-DM-device and Divoom-TV Pi HDMI deploy artifacts for Cx-6.
2026-06-03 02:47:36 +00:00
Andrew Stoltz
2bf339ce51 Deploy TtsReader PR29 live proof image
Co-Authored-By: Codex <codex@openai.com>
2026-06-02 21:47:04 -05:00
Andrew Stoltz
5bdedfc5ae divoom: add pi deploy artifact manifests
Add source-controlled Puppet/Hiera contracts for edge2 Divoom-as-DM-device without replacing the live flowercore-divoom systemd deployment.

Add Divoom TV Pi HDMI systemd/Puppet deployment artifacts, LF shell-script guardrails, and focused lint coverage for the additive non-K8s deploy shape.

Co-Authored-By: Codex <codex@openai.com>
2026-06-02 21:45:27 -05:00
Andrew Stoltz
0307ae16ae monitoring(probe): signage/mysql/php blackbox probe / -> /healthz (K8s-target mirror)
Mirrors the live noc1 podman fix + Notes scripts/monitoring/prometheus.yml.
These services enforce OIDC bearer auth (FlowerCore__Auth__Enabled=true), so an
anonymous probe of / returns 401 -> false TraefikServiceDown. All three expose
anonymous /healthz=200. This noc-monitoring.yaml is the forward K8s-migration
target (not live); brings it in sync with the live config.
2026-06-02 01:09:57 -05:00
Andrew Stoltz
6c18f69cf2 mail: remove cert-manager Certificate (manage mail-tls via step-ca JWK + noc1 renew timer)
step-ca-acme only has an HTTP-01 (Traefik) solver, but mail.iamworkin.lan must resolve
to the dedicated MetalLB IP 10.0.56.202 (SMTP/IMAP), so HTTP-01 cannot validate (order
stuck pending since 2026-05-06; cert expired 2026-05-24). mail-tls is now issued from
step-ca's JWK 'admin' provisioner and auto-renewed by a systemd timer on noc1 that writes
the mail-tls secret directly. The secret + Deployment mount + webmail IngressRoute are
unchanged. Re-add a Certificate only if a DNS-01 solver is deployed for step-ca-acme.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 15:55:38 -05:00
Andrew Stoltz
47e2256556 Deploy TtsReader correction bridge images 2026-05-31 12:35:45 -05:00
Andrew Stoltz
9d77f8ba0e fc-updater: disable loki audit sink 2026-05-31 11:34:12 -05:00
Andrew Stoltz
2f4be19c85 fc-updater: bump signing diagnostics image 2026-05-31 00:32:48 -05:00
Andrew Stoltz
2a62c40990 fc-updater: bump image to MSI installer surface 2026-05-30 23:31:48 -05:00
Andrew Stoltz
7be98e5efc Bump UpdateCenter image to hosted-service fix 2026-05-30 20:22:13 -05:00
Andrew Stoltz
a65b356c9d deploy(fc-updater): roll UC to v202605301823-a6c3354 (Phase 3 SQLite fixes)
Durable image bump for FlowerCore.Updater main a6c3354 (PRs #63-#66): hosted-service
+ request-path SQLite DateTimeOffset fixes, StopHost restored + per-tick resilience,
Shared.Settings 1.0.1. Image built + imported to rke2-server. Un-degrades the Phase-9
provenance verifier + settings poll (were stopped under the removed global Ignore mask).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 18:27:45 -05:00
Andrew Stoltz
08c17ef1b4 fc-updater: bump to v202605301703-296f350-fix2 (BackgroundServiceExceptionBehavior=Ignore so a hosted-service SQLite query crash can't stop the host)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 17:04:54 -05:00
Andrew Stoltz
06f2f002b7 fc-updater: bump image to v202605301657-296f350-fix1 (Shared.Settings SQLite poll fix)
The v202605301642-296f350-rework image crash-looped: FlowerCore.Shared.Settings SettingsDbPollHostedService
ran a DateTimeOffset Where/OrderBy on SettingsRecordChanges that SQLite can't
translate, and as a BackgroundService it stopped the host. Shared.Settings 1.0.1
materializes the change-log then filters/orders in memory; Updater Web bumped to 1.0.1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 16:59:37 -05:00
Andrew Stoltz
7ac4a8b4b7 fc-updater: bump image to v202605301642-296f350-rework (ADR-179 rework live)
Deploy the current FlowerCore.Updater main (PRs #52-#61) to prod: MSI-first
packaging, beta gating + per-install tokens, interactive+bearer Authentik OIDC,
native installer apply, and the .fcsetup.exe retirement (DropReleaseInstallers
migration runs on the now-empty DB). Image pre-imported to rke2-server + agent1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 16:47:28 -05:00
Andrew Stoltz
90f2a86819 ops: trim load for degraded 2-node cluster (agent2 PSU dead)
Scale all github-runner deployments to 1 replica and halt the ci1
KubeVirt VM. With agent2 down (failed PSU) the cluster runs on two
passively-cooled NUCs; the ci1 8-vCPU VM drove agent1 to ~100C. Keep
total load trimmed until replacement hardware is in place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 13:47:13 -05:00
Andrew Stoltz
cbdefb2b23 Revert "ci1: expose WinRM/RDP/SSH ports on masquerade interface for Phase 2 bootstrap"
The port additions caused the new VMI to stick at phase=Scheduled with
reason=GuestNotRunning. The guest-console-log sidecar exited 1 and
qemu never started. Reverting to the working 9-day-stable shape until
the port-add path is verified in a non-production VM.

Phase 2 (Windows runner install + registration) needs an operator-
interactive virtctl-vnc session against the rebuilt VM, OR a separate
investigation of why this port-add tipped over the VM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 11:35:10 -05:00
Andrew Stoltz
1c36fe3a0a ci1: expose WinRM/RDP/SSH ports on masquerade interface for Phase 2 bootstrap
The Phase 1 VM has been Running for 9 days but Phase 2 (Puppet bootstrap +
runner registration) was deferred because the operator-interactive
virtctl-vnc path was the only way in. The masquerade interface listed
no exposed ports, so virtctl ssh and kubectl port-forward both hit
'no route to host' — qemu user-mode NAT does not forward inbound by
default.

Adding 5985 (WinRM HTTP) lets a kubectl port-forward + PowerShell
remoting path drive runner registration entirely from outside the VM.
3389 + 22 are reserved for desktop access via Guacamole or virtctl ssh
once OpenSSH Server is installed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 11:24:34 -05:00
Andrew Stoltz
2b420ce8a4 runners: fleet-wide right-size CPU requests from 500m to 100m
All 33 runner Deployments now request 100m CPU instead of 500m,
freeing roughly 50 idle pods × 400m = ~20 cores back to the cluster.
Observed CPU usage on idle runners is ~1m via kubectl top; the 500m
request was a 500× over-provision that was eating allocatable CPU
and blocking new workload scheduling — WorldBuilder runner could not
be scheduled even at the new 100m request because the pre-existing
fleet held the cluster at 99% requested.

Burst headroom preserved by limits.cpu: 2000m unchanged. TtsReader
keeps its 8Gi memory limit from the 2026-05-25 OOMKill fix; only
the CPU request line moves.

Recreate strategy on each deployment means a brief offline window
per runner during rollout; in-flight CI jobs complete on the
existing container before the new spec takes effect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 10:09:24 -05:00
Andrew Stoltz
5cbc1a06b1 runners: scale DM/AiStation.Linux/WorldBuilder down to 1 replica until cluster relieved
After cutting requests to 100m, 4 of 6 new pods scheduled and 2 stayed
Pending — cluster CPU REQUEST utilization is 49.6 of 48 allocatable cores
because the existing fleet of ~50 idle runners reserves 25.6 cores
(500m × ~50) for ~50m actual use. Single-replica per new repo gets the
service online without competing with in-flight CI from the rest of the
fleet.

When the broader fleet-wide request right-sizing pass lands
(500m → 100m on all idle runners would free ~20 cores), these can be
bumped back to 2 replicas if PR-CI backlog warrants it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 10:03:30 -05:00
Andrew Stoltz
9e7ee39b3a runners: drop CPU request 500m→100m on DM/AiStation.Linux/WorldBuilder
All 3 fleet nodes were at 99% CPU REQUEST allocation; the 6 new pods
from the previous commit (3 deployments × 2 replicas × 500m) couldn't
schedule. Idle runners actually use ~1m CPU per `kubectl top pods`;
the 500m request was significantly over-provisioned. Burst headroom
preserved by limits.cpu: 2000m unchanged.

Follow-up: similar request right-sizing pass across the rest of the
runner fleet is queued for a future morning-routine sweep — 25 cores
reserved for ~50m actual use is a large slack we can reclaim cluster-
wide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 10:00:23 -05:00
Andrew Stoltz
ae030a5f33 runners: add github-runner Deployments for DeviceManagement + AiStation.Linux + WorldBuilder
Morning-routine 2026-05-26 — these three repos had ZERO online Linux PR-CI
capacity, blocking the Sprint 37 Cx-1 Linux-CI-migration PRs (DM #20/#21/
#22, AiStation.Linux #13, WorldBuilder #3/#4). Chicken-and-egg: the
migration PRs need Linux runners that the migration creates.

Each Deployment uses the same canonical emptyDir-only pattern as the
fresh-2026-05-26 updater deployment that lives just above:
  - replicas: 2 (room for parallel PR-CI without head-of-line blocking)
  - per-pod emptyDir caches (no RWO PVC contention)
  - shared github-runner-token secret (existing ACCESS_TOKEN PAT has
    org-wide read access)
  - LABELS: self-hosted,linux,fc-build-linux
  - DOTNET_INSTALL_DIR pinned per ADR-170 family

For AiStation.Linux specifically: Linux job will now pick up; the
Windows job in #13 remains queued indefinitely until the Windows runner
host substrate lands per Sprint 36 v2 Cl-2 / ADR-174 — that's a separate
arc, not this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 09:55:31 -05:00
bc8c35896f tests: add bluejay-ws runner-exclusion lint + fix 3 stale runner-fleet assertions (#30)
BLUEJAY-WS must never be a fleet GHA runner (operator directive 2026-05-26). Build-side analog of Sprint 9 safe-account exclusion. Also fixes 3 stale runner-fleet assertions broken by initContainer addition + replica tuning.
2026-05-26 03:42:01 +00:00
Andrew Stoltz
2cc91b6df0 runners: bump tts-reader memory limit 4Gi -> 8Gi
The github-runner-tts-reader pod was being OOMKilled (exit 137)
mid-`dotnet test` on the TtsReader 1000+ test suite. PR #21 CI
(the Windows -> Linux runner migration) flapped twice with the
"self-hosted runner lost communication" annotation before the
K8s-side symptoms surfaced via kubectl describe pod.

Requests bumped 1Gi -> 2Gi, limits 4Gi -> 8Gi. Comment added
inline so future fleet runs don't trip the same wall.

Unblocks PR #21 + the 9 other open TtsReader PRs that all rebase
through it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 22:31:48 -05:00
0d2090fe81 runners: add github-runner-updater Deployment (#29)
Close runner-fleet gap for FlowerCore.Updater. Matches Sprint 32 long-tail pattern; registers entry in fleet-lint required-set.
2026-05-26 03:24:13 +00:00
76 changed files with 22419 additions and 16837 deletions

4
.gitattributes vendored Normal file
View File

@@ -0,0 +1,4 @@
/.gitattributes text eol=lf
*.yaml text eol=lf
*.yml text eol=lf
*.sh text eol=lf

View File

@@ -116,6 +116,16 @@ dotnet test tests/bluejay-infra-lint/BluejayInfraLint.Tests.csproj -c Release
That test project sweeps `bluejay-infra/apps/**` plus the canonical sibling `FlowerCore.*\\k8s` manifests that share the same workspace. Matching `conftest.dev` policy files live under `tests/bluejay-infra-lint/conftest.dev/` for environments that also have `conftest` or `opa`. That test project sweeps `bluejay-infra/apps/**` plus the canonical sibling `FlowerCore.*\\k8s` manifests that share the same workspace. Matching `conftest.dev` policy files live under `tests/bluejay-infra-lint/conftest.dev/` for environments that also have `conftest` or `opa`.
## Non-K8s Pi Artifacts
Some `apps/*` directories are deployment artifact bundles consumed by Puppet
instead of Kubernetes workloads. `apps/fc-signage-pi-player/` carries the
Chromium signage Pi player, `apps/fc-divoom-dm-pi-device/` carries the additive
edge2 Divoom-as-DeviceManagement-device profile/Hiera contract, and
`apps/fc-divoom-tv-pi/` carries the Divoom TV Pi HDMI systemd/Puppet shape.
These bundles intentionally avoid Deployment, IngressRoute, Certificate, and
OnePasswordItem resources.
## References ## References
- OpenVox noc1 durability runbook: `docs/runbooks/openvoxserver-quadlet-durability.md` - OpenVox noc1 durability runbook: `docs/runbooks/openvoxserver-quadlet-durability.md`

View File

@@ -139,6 +139,20 @@ metadata:
spec: spec:
itemPath: "vaults/IAmWorkin/items/FlowerCore Knowledge MCP Tokens" itemPath: "vaults/IAmWorkin/items/FlowerCore Knowledge MCP Tokens"
---
# FlowerCore DMS Manager MCP key (product-manager fan-out). Synced from the
# 1Password "FlowerCore DMS MCP Keys" item (field `credential`) into Secret
# `dms-mcp-keys`; the deployment reads it as DMS_MCP_API_KEY for the fc_dms
# MCP server. presentations/messageboard/segmentdisplay/telephony 1P MCP-key
# items also exist and follow this same pattern when added.
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
name: dms-mcp-keys
namespace: agent-zero
spec:
itemPath: "vaults/IAmWorkin/items/FlowerCore DMS MCP Keys"
--- ---
apiVersion: apps/v1 apiVersion: apps/v1
kind: Deployment kind: Deployment
@@ -248,7 +262,7 @@ spec:
# use the bridge's Ollama-compatible root via OLLAMA_HOST. # use the bridge's Ollama-compatible root via OLLAMA_HOST.
mkdir -p /a0/usr/plugins/_model_config mkdir -p /a0/usr/plugins/_model_config
cat > /a0/usr/plugins/_model_config/config.json << 'MODELCFG' cat > /a0/usr/plugins/_model_config/config.json << 'MODELCFG'
{"allow_chat_override":true,"chat_model":{"provider":"openai","name":"fc:balanced","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_history":0.7,"vision":false,"kwargs":{"temperature":0,"num_ctx":8192}},"utility_model":{"provider":"openai","name":"fc:cheap","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_input":0.7,"kwargs":{"num_ctx":8192}},"embedding_model":{"provider":"openai","name":"openai/fc:embedding","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","kwargs":{}}} {"allow_chat_override":true,"chat_model":{"provider":"openai","name":"fc:balanced","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":32768,"ctx_history":0.7,"vision":false,"kwargs":{"temperature":0,"num_ctx":32768}},"utility_model":{"provider":"openai","name":"fc:cheap","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","ctx_length":8192,"ctx_input":0.7,"kwargs":{"num_ctx":8192}},"embedding_model":{"provider":"openai","name":"openai/fc:embedding","api_base":"http://fc-llm-bridge.fc-llm-bridge.svc:8080/v1","kwargs":{}}}
MODELCFG MODELCFG
# Strip heredoc indentation # Strip heredoc indentation
sed -i 's/^ //' /a0/usr/plugins/_model_config/config.json sed -i 's/^ //' /a0/usr/plugins/_model_config/config.json
@@ -276,7 +290,7 @@ spec:
fi fi
export A0_SET_mcp_servers="$( export A0_SET_mcp_servers="$(
python3 -c 'import json, os; servers = {}; chat_key = os.getenv("CHAT_MCP_API_KEY"); knowledge_enabled = os.getenv("KNOWLEDGE_MCP_ENABLED", "false").lower() == "true"; token = os.getenv("KNOWLEDGE_MCP_BEARER_TOKEN", "") if knowledge_enabled else ""; chat_key and servers.setdefault("fc_chat", {"type": "streamable-http", "url": "http://chat-web.fc-chat.svc/mcp", "headers": {"X-Api-Key": chat_key}}); token and servers.setdefault("fc_knowledge", {"type": "streamable-http", "url": os.getenv("KNOWLEDGE_MCP_URL", "http://knowledge-web.knowledge.svc/mcp"), "headers": {"Authorization": f"Bearer {token}"}}); print(json.dumps({"mcpServers": servers}, separators=(",", ":")))' python3 -c 'import json, os; servers = {}; chat_key = os.getenv("CHAT_MCP_API_KEY"); knowledge_enabled = os.getenv("KNOWLEDGE_MCP_ENABLED", "false").lower() == "true"; token = os.getenv("KNOWLEDGE_MCP_BEARER_TOKEN", "") if knowledge_enabled else ""; chat_key and servers.setdefault("fc_chat", {"type": "streamable-http", "url": "http://chat-web.fc-chat.svc/mcp", "headers": {"X-Api-Key": chat_key}}); token and servers.setdefault("fc_knowledge", {"type": "streamable-http", "url": os.getenv("KNOWLEDGE_MCP_URL", "http://knowledge-web.knowledge.svc/mcp"), "headers": {"Authorization": f"Bearer {token}"}}); dms_key = os.getenv("DMS_MCP_API_KEY"); dms_key and servers.setdefault("fc_dms", {"type": "streamable-http", "url": os.getenv("DMS_MCP_URL", "http://dms-web.fc-dms.svc/mcp"), "headers": {"X-Api-Key": dms_key}}); print(json.dumps({"mcpServers": servers}, separators=(",", ":")))'
)" )"
# Run the original entrypoint # Run the original entrypoint
exec /exe/initialize.sh $BRANCH exec /exe/initialize.sh $BRANCH
@@ -285,7 +299,7 @@ spec:
env: env:
# Agent identity # Agent identity
- name: AGENT_NAME - name: AGENT_NAME
value: "Blue Jay (NUC)" value: "Blue Jay"
# Chat model — routed through FlowerCore LLM Bridge (ADR-088) # Chat model — routed through FlowerCore LLM Bridge (ADR-088)
# so spend is tracked and tier aliases (fc:cheap/fc:balanced/fc:deep) # so spend is tracked and tier aliases (fc:cheap/fc:balanced/fc:deep)
# dispatch to Ollama or Anthropic via a single OpenAI-compat endpoint. # dispatch to Ollama or Anthropic via a single OpenAI-compat endpoint.
@@ -344,7 +358,7 @@ spec:
- name: A0_SET_browser_model_provider - name: A0_SET_browser_model_provider
value: "ollama" value: "ollama"
- name: A0_SET_browser_model_name - name: A0_SET_browser_model_name
value: "gemma3:4b" value: "qwen2.5:7b"
- name: A0_SET_browser_model_api_base - name: A0_SET_browser_model_api_base
value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080" value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080"
- name: A0_SET_browser_model_api_key - name: A0_SET_browser_model_api_key
@@ -353,7 +367,7 @@ spec:
name: fc-llm-bridge-api-keys name: fc-llm-bridge-api-keys
key: agent-zero-k8s key: agent-zero-k8s
- name: A0_SET_browser_model_vision - name: A0_SET_browser_model_vision
value: "true" value: "false"
- name: OLLAMA_HOST - name: OLLAMA_HOST
value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080" value: "http://fc-llm-bridge.fc-llm-bridge.svc:8080"
- name: FLOWERCORE_AGENTZERO_OLLAMA_URL - name: FLOWERCORE_AGENTZERO_OLLAMA_URL
@@ -393,6 +407,20 @@ spec:
secretKeyRef: secretKeyRef:
name: knowledge-mcp-tokens name: knowledge-mcp-tokens
key: password key: password
# FlowerCore DMS Manager MCP (dynamic message signs) — first of the
# product-manager MCP fan-out. dms-web /mcp requires X-Api-Key; the key
# is synced from 1Password "FlowerCore DMS MCP Keys" (field credential)
# by the dms-mcp-keys OnePasswordItem CRD above. Same builder+env+netpol
# pattern extends to presentations/messageboard/segmentdisplay/telephony
# (all have 1P MCP-key items). MySQL + Signage still need 1P MCP items
# provisioned before they can join (mysql-web /mcp 401s with no key today).
- name: DMS_MCP_URL
value: "http://dms-web.fc-dms.svc/mcp"
- name: DMS_MCP_API_KEY
valueFrom:
secretKeyRef:
name: dms-mcp-keys
key: credential
# Print.Web — Thermal printer service on edge2. # Print.Web — Thermal printer service on edge2.
# PRINT_WEB_URL: internal HTTP (bypasses Traefik TLS — print_web.py # PRINT_WEB_URL: internal HTTP (bypasses Traefik TLS — print_web.py
# runs in-cluster and can reach edge2 directly on the PROD VLAN). # runs in-cluster and can reach edge2 directly on the PROD VLAN).
@@ -637,6 +665,19 @@ spec:
ports: ports:
- port: 5300 - port: 5300
protocol: TCP protocol: TCP
# FlowerCore DMS Manager MCP (product-manager fan-out) — in-cluster
# dms-web. NetworkPolicy matches the destination POD port: dms-web svc:80
# targets containerPort 8080, so the egress MUST allow 8080 (not the svc
# port 80) — same as the fc-chat rule. Allow both for parity.
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: fc-dms
ports:
- port: 80
protocol: TCP
- port: 8080
protocol: TCP
# Allow internet (for kubectl image pull, etc) # Allow internet (for kubectl image pull, etc)
- to: - to:
- ipBlock: - ipBlock:

File diff suppressed because it is too large Load Diff

View File

@@ -1,448 +1,453 @@
# Authentik OIDC backend # Authentik OIDC backend
# ArgoCD-managed. BlueJay Lab. # ArgoCD-managed. BlueJay Lab.
# #
# Stack: # Stack:
# - PostgreSQL 16 StatefulSet (single replica, Longhorn RWO 5Gi) # - PostgreSQL 16 StatefulSet (single replica, Longhorn RWO 5Gi)
# - Redis 7 Deployment (no persistence — session/cache only) # - Redis 7 Deployment (no persistence — session/cache only)
# - Authentik server + worker Deployments (image ghcr.io/goauthentik/server:2024.12.3) # - Authentik server + worker Deployments (image ghcr.io/goauthentik/server:2024.12.3)
# - Media PVC shared between server + worker (Longhorn RWO 2Gi) # - Media PVC shared between server + worker (Longhorn RWO 2Gi)
# - Certificate via step-ca-acme ClusterIssuer # - Certificate via step-ca-acme ClusterIssuer
# - Traefik IngressRoute at id.iamworkin.lan # - Traefik IngressRoute at id.iamworkin.lan
# #
# Secrets come from 1Password item "authentik-credentials" (IAmWorkin vault, id y6i74ch22q5wvm7znquq4nhhcu) # Secrets come from 1Password item "authentik-credentials" (IAmWorkin vault, id y6i74ch22q5wvm7znquq4nhhcu)
# via the OnePasswordItem CRD, materialized into k8s Secret authentik/authentik-credentials. # via the OnePasswordItem CRD, materialized into k8s Secret authentik/authentik-credentials.
# #
# Why the discovery URL is /application/o/pimanager/ : Authentik issues per-application OIDC providers. # Why the discovery URL is /application/o/pimanager/ : Authentik issues per-application OIDC providers.
# The pimanager OIDC application/provider is created after the cluster pods are healthy (manual or # The pimanager OIDC application/provider is created after the cluster pods are healthy (manual or
# via API once the bootstrap token is available — see Notes substrate). # via API once the bootstrap token is available — see Notes substrate).
--- ---
apiVersion: v1 apiVersion: v1
kind: Namespace kind: Namespace
metadata: metadata:
name: authentik name: authentik
labels: labels:
app.kubernetes.io/part-of: bluejay-infra app.kubernetes.io/part-of: bluejay-infra
--- ---
# 1Password operator pulls the authentik-credentials item into a k8s Secret of the same name. # 1Password operator pulls the authentik-credentials item into a k8s Secret of the same name.
# Field labels in 1P become Secret keys: AUTHENTIK_SECRET_KEY, POSTGRES_PASSWORD, REDIS_PASSWORD, # Field labels in 1P become Secret keys: AUTHENTIK_SECRET_KEY, POSTGRES_PASSWORD, REDIS_PASSWORD,
# BOOTSTRAP_ADMIN_PASSWORD, BOOTSTRAP_ADMIN_TOKEN, BOOTSTRAP_ADMIN_EMAIL. # BOOTSTRAP_ADMIN_PASSWORD, BOOTSTRAP_ADMIN_TOKEN, BOOTSTRAP_ADMIN_EMAIL.
apiVersion: onepassword.com/v1 apiVersion: onepassword.com/v1
kind: OnePasswordItem kind: OnePasswordItem
metadata: metadata:
name: authentik-credentials name: authentik-credentials
namespace: authentik namespace: authentik
spec: spec:
itemPath: "vaults/IAmWorkin/items/authentik-credentials" itemPath: "vaults/IAmWorkin/items/authentik-credentials"
--- ---
# Shared media volume for server + worker pods. # Shared media volume for server + worker pods.
apiVersion: v1 apiVersion: v1
kind: PersistentVolumeClaim kind: PersistentVolumeClaim
metadata: metadata:
name: authentik-media name: authentik-media
namespace: authentik namespace: authentik
spec: spec:
storageClassName: longhorn storageClassName: longhorn
accessModes: [ReadWriteOnce] accessModes: [ReadWriteOnce]
resources: resources:
requests: requests:
storage: 2Gi storage: 2Gi
--- ---
# PostgreSQL 16 StatefulSet — Authentik's primary store. # PostgreSQL 16 StatefulSet — Authentik's primary store.
apiVersion: apps/v1 apiVersion: apps/v1
kind: StatefulSet kind: StatefulSet
metadata: metadata:
name: authentik-postgres name: authentik-postgres
namespace: authentik namespace: authentik
labels: labels:
app: authentik-postgres app: authentik-postgres
argocd.argoproj.io/instance: infra-authentik argocd.argoproj.io/instance: infra-authentik
spec: spec:
persistentVolumeClaimRetentionPolicy: persistentVolumeClaimRetentionPolicy:
whenDeleted: Retain whenDeleted: Retain
whenScaled: Retain whenScaled: Retain
podManagementPolicy: OrderedReady podManagementPolicy: OrderedReady
serviceName: authentik-postgres serviceName: authentik-postgres
replicas: 1 replicas: 1
revisionHistoryLimit: 10 revisionHistoryLimit: 10
selector: selector:
matchLabels: matchLabels:
app: authentik-postgres app: authentik-postgres
template: template:
metadata: metadata:
labels: labels:
app: authentik-postgres app: authentik-postgres
spec: spec:
containers: containers:
- name: postgres - name: postgres
image: postgres:16-alpine image: postgres:16-alpine
ports: ports:
- containerPort: 5432 - containerPort: 5432
name: postgres name: postgres
env: env:
- name: POSTGRES_USER - name: POSTGRES_USER
value: authentik value: authentik
- name: POSTGRES_PASSWORD - name: POSTGRES_PASSWORD
valueFrom: valueFrom:
secretKeyRef: secretKeyRef:
name: authentik-credentials name: authentik-credentials
key: POSTGRES_PASSWORD key: POSTGRES_PASSWORD
- name: POSTGRES_DB - name: POSTGRES_DB
value: authentik value: authentik
- name: POSTGRES_INITDB_ARGS - name: POSTGRES_INITDB_ARGS
value: "--encoding=UTF-8 --lc-collate=C --lc-ctype=C" value: "--encoding=UTF-8 --lc-collate=C --lc-ctype=C"
- name: PGDATA - name: PGDATA
value: /var/lib/postgresql/data/pgdata value: /var/lib/postgresql/data/pgdata
readinessProbe: readinessProbe:
exec: exec:
command: ["pg_isready", "-U", "authentik"] command: ["pg_isready", "-U", "authentik"]
initialDelaySeconds: 5 initialDelaySeconds: 5
periodSeconds: 5 periodSeconds: 5
livenessProbe: livenessProbe:
exec: exec:
command: ["pg_isready", "-U", "authentik"] command: ["pg_isready", "-U", "authentik"]
initialDelaySeconds: 30 initialDelaySeconds: 30
periodSeconds: 30 periodSeconds: 30
resources: resources:
requests: { cpu: 100m, memory: 256Mi } requests: { cpu: 100m, memory: 256Mi }
limits: { cpu: 1000m, memory: 1Gi } limits: { cpu: 1000m, memory: 1Gi }
volumeMounts: volumeMounts:
- name: pgdata - name: pgdata
mountPath: /var/lib/postgresql/data mountPath: /var/lib/postgresql/data
volumeClaimTemplates: volumeClaimTemplates:
- metadata: # apiVersion/kind included deliberately: this STS was created via ArgoCD ServerSideApply,
name: pgdata # so the live object carries PVC TypeMeta inside volumeClaimTemplates; omitting it here
spec: # leaves the app eternally OutOfSync even though kubectl SSA dry-run shows no change.
storageClassName: longhorn - apiVersion: v1
accessModes: [ReadWriteOnce] kind: PersistentVolumeClaim
volumeMode: Filesystem metadata:
resources: name: pgdata
requests: spec:
storage: 5Gi storageClassName: longhorn
accessModes: [ReadWriteOnce]
--- volumeMode: Filesystem
apiVersion: v1 resources:
kind: Service requests:
metadata: storage: 5Gi
name: authentik-postgres
namespace: authentik ---
spec: apiVersion: v1
clusterIP: None kind: Service
selector: metadata:
app: authentik-postgres name: authentik-postgres
ports: namespace: authentik
- name: postgres spec:
port: 5432 clusterIP: None
targetPort: 5432 selector:
app: authentik-postgres
--- ports:
# Redis 7 — session storage + Celery broker. No persistence needed (cache). - name: postgres
apiVersion: apps/v1 port: 5432
kind: Deployment targetPort: 5432
metadata:
name: authentik-redis ---
namespace: authentik # Redis 7 — session storage + Celery broker. No persistence needed (cache).
labels: apiVersion: apps/v1
app: authentik-redis kind: Deployment
argocd.argoproj.io/instance: infra-authentik metadata:
spec: name: authentik-redis
replicas: 1 namespace: authentik
strategy: labels:
type: Recreate app: authentik-redis
selector: argocd.argoproj.io/instance: infra-authentik
matchLabels: spec:
app: authentik-redis replicas: 1
template: strategy:
metadata: type: Recreate
labels: selector:
app: authentik-redis matchLabels:
spec: app: authentik-redis
containers: template:
- name: redis metadata:
image: redis:7-alpine labels:
args: app: authentik-redis
- "--save" spec:
- "" containers:
- "--appendonly" - name: redis
- "no" image: redis:7-alpine
- "--requirepass" args:
- "$(REDIS_PASSWORD)" - "--save"
env: - ""
- name: REDIS_PASSWORD - "--appendonly"
valueFrom: - "no"
secretKeyRef: - "--requirepass"
name: authentik-credentials - "$(REDIS_PASSWORD)"
key: REDIS_PASSWORD env:
ports: - name: REDIS_PASSWORD
- containerPort: 6379 valueFrom:
name: redis secretKeyRef:
readinessProbe: name: authentik-credentials
tcpSocket: { port: 6379 } key: REDIS_PASSWORD
initialDelaySeconds: 5 ports:
periodSeconds: 5 - containerPort: 6379
livenessProbe: name: redis
tcpSocket: { port: 6379 } readinessProbe:
initialDelaySeconds: 30 tcpSocket: { port: 6379 }
periodSeconds: 30 initialDelaySeconds: 5
resources: periodSeconds: 5
requests: { cpu: 50m, memory: 64Mi } livenessProbe:
limits: { cpu: 500m, memory: 256Mi } tcpSocket: { port: 6379 }
initialDelaySeconds: 30
--- periodSeconds: 30
apiVersion: v1 resources:
kind: Service requests: { cpu: 50m, memory: 64Mi }
metadata: limits: { cpu: 500m, memory: 256Mi }
name: authentik-redis
namespace: authentik ---
spec: apiVersion: v1
selector: kind: Service
app: authentik-redis metadata:
ports: name: authentik-redis
- name: redis namespace: authentik
port: 6379 spec:
targetPort: 6379 selector:
app: authentik-redis
--- ports:
# Authentik server Deployment — HTTP frontend on :9000. - name: redis
apiVersion: apps/v1 port: 6379
kind: Deployment targetPort: 6379
metadata:
name: authentik-server ---
namespace: authentik # Authentik server Deployment — HTTP frontend on :9000.
labels: apiVersion: apps/v1
app: authentik-server kind: Deployment
argocd.argoproj.io/instance: infra-authentik metadata:
spec: name: authentik-server
replicas: 1 namespace: authentik
strategy: labels:
type: Recreate # shares /media RWO PVC with worker app: authentik-server
selector: argocd.argoproj.io/instance: infra-authentik
matchLabels: spec:
app: authentik-server replicas: 1
template: strategy:
metadata: type: Recreate # shares /media RWO PVC with worker
labels: selector:
app: authentik-server matchLabels:
spec: app: authentik-server
securityContext: template:
# Authentik image runs as uid 1000 "authentik" but the Longhorn PVC mounts metadata:
# root:root by default. fsGroup recursively chgrp + chmod g+rwx so the labels:
# non-root container can mkdir /media/public during the tenant_files migration. app: authentik-server
fsGroup: 1000 spec:
containers: securityContext:
- name: server # Authentik image runs as uid 1000 "authentik" but the Longhorn PVC mounts
image: ghcr.io/goauthentik/server:2024.12.3 # root:root by default. fsGroup recursively chgrp + chmod g+rwx so the
args: ["server"] # non-root container can mkdir /media/public during the tenant_files migration.
ports: fsGroup: 1000
- containerPort: 9000 containers:
name: http - name: server
- containerPort: 9443 image: ghcr.io/goauthentik/server:2024.12.3
name: https args: ["server"]
env: ports:
- name: AUTHENTIK_SECRET_KEY - containerPort: 9000
valueFrom: name: http
secretKeyRef: - containerPort: 9443
name: authentik-credentials name: https
key: AUTHENTIK_SECRET_KEY env:
- name: AUTHENTIK_REDIS__HOST - name: AUTHENTIK_SECRET_KEY
value: authentik-redis valueFrom:
- name: AUTHENTIK_REDIS__PASSWORD secretKeyRef:
valueFrom: name: authentik-credentials
secretKeyRef: key: AUTHENTIK_SECRET_KEY
name: authentik-credentials - name: AUTHENTIK_REDIS__HOST
key: REDIS_PASSWORD value: authentik-redis
- name: AUTHENTIK_POSTGRESQL__HOST - name: AUTHENTIK_REDIS__PASSWORD
value: authentik-postgres valueFrom:
- name: AUTHENTIK_POSTGRESQL__NAME secretKeyRef:
value: authentik name: authentik-credentials
- name: AUTHENTIK_POSTGRESQL__USER key: REDIS_PASSWORD
value: authentik - name: AUTHENTIK_POSTGRESQL__HOST
- name: AUTHENTIK_POSTGRESQL__PASSWORD value: authentik-postgres
valueFrom: - name: AUTHENTIK_POSTGRESQL__NAME
secretKeyRef: value: authentik
name: authentik-credentials - name: AUTHENTIK_POSTGRESQL__USER
key: POSTGRES_PASSWORD value: authentik
- name: AUTHENTIK_BOOTSTRAP_PASSWORD - name: AUTHENTIK_POSTGRESQL__PASSWORD
valueFrom: valueFrom:
secretKeyRef: secretKeyRef:
name: authentik-credentials name: authentik-credentials
key: BOOTSTRAP_ADMIN_PASSWORD key: POSTGRES_PASSWORD
- name: AUTHENTIK_BOOTSTRAP_TOKEN - name: AUTHENTIK_BOOTSTRAP_PASSWORD
valueFrom: valueFrom:
secretKeyRef: secretKeyRef:
name: authentik-credentials name: authentik-credentials
key: BOOTSTRAP_ADMIN_TOKEN key: BOOTSTRAP_ADMIN_PASSWORD
- name: AUTHENTIK_BOOTSTRAP_EMAIL - name: AUTHENTIK_BOOTSTRAP_TOKEN
valueFrom: valueFrom:
secretKeyRef: secretKeyRef:
name: authentik-credentials name: authentik-credentials
key: BOOTSTRAP_ADMIN_EMAIL key: BOOTSTRAP_ADMIN_TOKEN
- name: AUTHENTIK_DISABLE_UPDATE_CHECK - name: AUTHENTIK_BOOTSTRAP_EMAIL
value: "true" valueFrom:
- name: AUTHENTIK_ERROR_REPORTING__ENABLED secretKeyRef:
value: "false" name: authentik-credentials
- name: AUTHENTIK_LOG_LEVEL key: BOOTSTRAP_ADMIN_EMAIL
value: info - name: AUTHENTIK_DISABLE_UPDATE_CHECK
# First-boot Authentik can take 3+ min on the migration phase value: "true"
# (waiting on DB lock while worker also runs migrations). Initial - name: AUTHENTIK_ERROR_REPORTING__ENABLED
# delays are generous so kubelet doesn't kill the pod mid-migration; value: "false"
# periodSeconds keeps post-startup probing responsive. - name: AUTHENTIK_LOG_LEVEL
readinessProbe: value: info
httpGet: # First-boot Authentik can take 3+ min on the migration phase
path: /-/health/ready/ # (waiting on DB lock while worker also runs migrations). Initial
port: 9000 # delays are generous so kubelet doesn't kill the pod mid-migration;
initialDelaySeconds: 60 # periodSeconds keeps post-startup probing responsive.
periodSeconds: 10 readinessProbe:
timeoutSeconds: 5 httpGet:
failureThreshold: 12 path: /-/health/ready/
livenessProbe: port: 9000
httpGet: initialDelaySeconds: 60
path: /-/health/live/ periodSeconds: 10
port: 9000 timeoutSeconds: 5
initialDelaySeconds: 300 failureThreshold: 12
periodSeconds: 30 livenessProbe:
timeoutSeconds: 10 httpGet:
failureThreshold: 3 path: /-/health/live/
startupProbe: port: 9000
httpGet: initialDelaySeconds: 300
path: /-/health/live/ periodSeconds: 30
port: 9000 timeoutSeconds: 10
initialDelaySeconds: 30 failureThreshold: 3
periodSeconds: 15 startupProbe:
timeoutSeconds: 10 httpGet:
failureThreshold: 40 # 30s + 40*15s = 10.5 min budget path: /-/health/live/
resources: port: 9000
requests: { cpu: 150m, memory: 512Mi } initialDelaySeconds: 30
limits: { cpu: 1500m, memory: 1Gi } periodSeconds: 15
volumeMounts: timeoutSeconds: 10
- name: media failureThreshold: 40 # 30s + 40*15s = 10.5 min budget
mountPath: /media resources:
volumes: requests: { cpu: 150m, memory: 512Mi }
- name: media limits: { cpu: 1500m, memory: 1Gi }
persistentVolumeClaim: volumeMounts:
claimName: authentik-media - name: media
mountPath: /media
--- volumes:
# Authentik worker Deployment — runs Celery background tasks. - name: media
apiVersion: apps/v1 persistentVolumeClaim:
kind: Deployment claimName: authentik-media
metadata:
name: authentik-worker ---
namespace: authentik # Authentik worker Deployment — runs Celery background tasks.
labels: apiVersion: apps/v1
app: authentik-worker kind: Deployment
argocd.argoproj.io/instance: infra-authentik metadata:
spec: name: authentik-worker
replicas: 1 namespace: authentik
strategy: labels:
type: Recreate # shares /media RWO PVC with server app: authentik-worker
selector: argocd.argoproj.io/instance: infra-authentik
matchLabels: spec:
app: authentik-worker replicas: 1
template: strategy:
metadata: type: Recreate # shares /media RWO PVC with server
labels: selector:
app: authentik-worker matchLabels:
spec: app: authentik-worker
securityContext: template:
# Same as server pod — non-root uid 1000 needs PVC group write. metadata:
fsGroup: 1000 labels:
containers: app: authentik-worker
- name: worker spec:
image: ghcr.io/goauthentik/server:2024.12.3 securityContext:
args: ["worker"] # Same as server pod — non-root uid 1000 needs PVC group write.
env: fsGroup: 1000
- name: AUTHENTIK_SECRET_KEY containers:
valueFrom: - name: worker
secretKeyRef: image: ghcr.io/goauthentik/server:2024.12.3
name: authentik-credentials args: ["worker"]
key: AUTHENTIK_SECRET_KEY env:
- name: AUTHENTIK_REDIS__HOST - name: AUTHENTIK_SECRET_KEY
value: authentik-redis valueFrom:
- name: AUTHENTIK_REDIS__PASSWORD secretKeyRef:
valueFrom: name: authentik-credentials
secretKeyRef: key: AUTHENTIK_SECRET_KEY
name: authentik-credentials - name: AUTHENTIK_REDIS__HOST
key: REDIS_PASSWORD value: authentik-redis
- name: AUTHENTIK_POSTGRESQL__HOST - name: AUTHENTIK_REDIS__PASSWORD
value: authentik-postgres valueFrom:
- name: AUTHENTIK_POSTGRESQL__NAME secretKeyRef:
value: authentik name: authentik-credentials
- name: AUTHENTIK_POSTGRESQL__USER key: REDIS_PASSWORD
value: authentik - name: AUTHENTIK_POSTGRESQL__HOST
- name: AUTHENTIK_POSTGRESQL__PASSWORD value: authentik-postgres
valueFrom: - name: AUTHENTIK_POSTGRESQL__NAME
secretKeyRef: value: authentik
name: authentik-credentials - name: AUTHENTIK_POSTGRESQL__USER
key: POSTGRES_PASSWORD value: authentik
- name: AUTHENTIK_DISABLE_UPDATE_CHECK - name: AUTHENTIK_POSTGRESQL__PASSWORD
value: "true" valueFrom:
- name: AUTHENTIK_ERROR_REPORTING__ENABLED secretKeyRef:
value: "false" name: authentik-credentials
- name: AUTHENTIK_LOG_LEVEL key: POSTGRES_PASSWORD
value: info - name: AUTHENTIK_DISABLE_UPDATE_CHECK
resources: value: "true"
requests: { cpu: 100m, memory: 256Mi } - name: AUTHENTIK_ERROR_REPORTING__ENABLED
limits: { cpu: 1000m, memory: 768Mi } value: "false"
volumeMounts: - name: AUTHENTIK_LOG_LEVEL
- name: media value: info
mountPath: /media resources:
volumes: requests: { cpu: 100m, memory: 256Mi }
- name: media limits: { cpu: 1000m, memory: 768Mi }
persistentVolumeClaim: volumeMounts:
claimName: authentik-media - name: media
mountPath: /media
--- volumes:
apiVersion: v1 - name: media
kind: Service persistentVolumeClaim:
metadata: claimName: authentik-media
name: authentik-server
namespace: authentik ---
spec: apiVersion: v1
selector: kind: Service
app: authentik-server metadata:
ports: name: authentik-server
- name: http namespace: authentik
port: 9000 spec:
targetPort: 9000 selector:
- name: https app: authentik-server
port: 9443 ports:
targetPort: 9443 - name: http
port: 9000
--- targetPort: 9000
# step-ca leaf certificate for id.iamworkin.lan. - name: https
# step-ca container resolver uses pfSense Unbound, so the public A record for id.iamworkin.lan port: 9443
# MUST exist before this Certificate is applied (cert-manager HTTP-01 will silently 2h-backoff targetPort: 9443
# otherwise). Added 2026-05-25 via scripts/pfsense-add-id-host.py.
apiVersion: cert-manager.io/v1 ---
kind: Certificate # step-ca leaf certificate for id.iamworkin.lan.
metadata: # step-ca container resolver uses pfSense Unbound, so the public A record for id.iamworkin.lan
name: authentik-tls # MUST exist before this Certificate is applied (cert-manager HTTP-01 will silently 2h-backoff
namespace: authentik # otherwise). Added 2026-05-25 via scripts/pfsense-add-id-host.py.
spec: apiVersion: cert-manager.io/v1
secretName: authentik-tls kind: Certificate
dnsNames: metadata:
- id.iamworkin.lan name: authentik-tls
issuerRef: namespace: authentik
name: step-ca-acme spec:
kind: ClusterIssuer secretName: authentik-tls
dnsNames:
--- - id.iamworkin.lan
apiVersion: traefik.io/v1alpha1 issuerRef:
kind: IngressRoute name: step-ca-acme
metadata: kind: ClusterIssuer
name: authentik
namespace: authentik ---
spec: apiVersion: traefik.io/v1alpha1
entryPoints: [websecure] kind: IngressRoute
routes: metadata:
- match: Host(`id.iamworkin.lan`) name: authentik
kind: Rule namespace: authentik
services: spec:
- name: authentik-server entryPoints: [websecure]
port: 9000 routes:
tls: - match: Host(`id.iamworkin.lan`)
secretName: authentik-tls kind: Rule
services:
- name: authentik-server
port: 9000
tls:
secretName: authentik-tls

View File

@@ -0,0 +1,195 @@
# FlowerCore.AiStation.Web GitOps adoption manifest.
#
# Authored from the already-live fc-aistation resources on 2026-06-04.
# Keep the live image tag, Service ClusterIP, and PVC volumeName unchanged so
# ArgoCD adopts in place instead of replacing the workload or data volume.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: aistation-web-data
namespace: fc-aistation
labels:
app.kubernetes.io/name: aistation-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-aistation
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: longhorn
volumeMode: Filesystem
volumeName: pvc-27448d6f-6e66-42a7-a293-73dd8bbd6b3e
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: aistation-web
namespace: fc-aistation
labels:
app.kubernetes.io/name: aistation-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-aistation
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 3
selector:
matchLabels:
app.kubernetes.io/name: aistation-web
strategy:
type: Recreate
template:
metadata:
annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/healthz"
prometheus.io/path: /metrics/prometheus
prometheus.io/port: "5000"
prometheus.io/scrape: "true"
labels:
app.kubernetes.io/name: aistation-web
app.kubernetes.io/part-of: flowercore
spec:
containers:
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
- envFrom:
- configMapRef:
name: aistation-web-config
image: localhost/fc-aistation-web:v20260602-aistation-owned-deploy-fix2
imagePullPolicy: Never
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 5000
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 5
name: aistation-web
ports:
- containerPort: 5000
name: http
protocol: TCP
readinessProbe:
failureThreshold: 6
httpGet:
path: /healthz
port: 5000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data
name: data
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
- name: data
persistentVolumeClaim:
claimName: aistation-web-data
---
apiVersion: v1
kind: Service
metadata:
name: aistation-web
namespace: fc-aistation
labels:
app.kubernetes.io/name: aistation-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-aistation
spec:
clusterIP: 10.43.211.127
clusterIPs:
- 10.43.211.127
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: http
port: 80
protocol: TCP
targetPort: 5000
selector:
app.kubernetes.io/name: aistation-web
sessionAffinity: None
type: ClusterIP
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: aistation-web-tls
namespace: fc-aistation
labels:
app.kubernetes.io/name: aistation-web-tls
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-aistation
spec:
dnsNames:
- aistation.iamworkin.lan
issuerRef:
kind: ClusterIssuer
name: step-ca-acme
secretName: aistation-web-tls
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: aistation-web
namespace: fc-aistation
labels:
app.kubernetes.io/name: aistation-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-aistation
spec:
entryPoints:
- websecure
routes:
- kind: Rule
match: Host(`aistation.iamworkin.lan`)
services:
- name: aistation-web
port: 80
tls:
secretName: aistation-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose aistation-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: aistation-web-public
# namespace: fc-aistation
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`aistation.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: aistation-web-public-profile-header # injects entitlement profile
# services:
# - name: aistation-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -1,5 +1,207 @@
# FlowerCore Chat — TLS + Ingress # FlowerCore Chat
# Deployment and Service managed by deploy script (not ArgoCD) #
# ArgoCD-managed workload plus TLS/Ingress. The chat-web-secret remains an
# out-of-band Secret until the values are moved into a 1Password-backed item;
# the Deployment references it as optional so GitOps can own the workload
# without storing secret material in this repo.
---
apiVersion: v1
kind: Namespace
metadata:
name: fc-chat
labels:
app.kubernetes.io/part-of: flowercore
---
apiVersion: v1
kind: ConfigMap
metadata:
name: chat-web-config
namespace: fc-chat
labels:
app.kubernetes.io/name: chat-web
app.kubernetes.io/part-of: flowercore
data:
ASPNETCORE_ENVIRONMENT: Production
ASPNETCORE_URLS: "http://+:8080"
ASPNETCORE_FORWARDEDHEADERS_ENABLED: "true"
FlowerCore__Auth__Enabled: "false"
FlowerCore__Auth__Oidc__Enabled: "true"
FlowerCore__Auth__Oidc__Authority: "https://id.iamworkin.lan/application/o/chat/"
FlowerCore__Auth__Oidc__Audience: "chat"
FlowerCore__Auth__Oidc__ClientId: "chat"
FlowerCore__Database__ConnectionStrings__Sqlite: "Data Source=/data/chat.db"
# Ollama target. BLUEJAY-WS remains faster from the workstation, but this lane
# proved Chat pods time out reaching 10.0.56.20:11434. Keep generation and
# behavior-rule checks on the cluster-routable edge1 endpoint until that route
# is fixed; choose models that edge1 actually hosts.
FlowerCore__AI__OllamaBaseUrl: "http://10.0.57.201:11434"
FlowerCore__AI__DefaultModelName: "gemma3:12b"
ChatOptions__BehaviorRuleEngine__OllamaBaseUrl: "http://10.0.57.201:11434"
ChatOptions__BehaviorRuleEngine__FallbackOllamaBaseUrl: "http://10.0.57.201:11434"
ChatOptions__BehaviorRuleEngine__ModelName: "gemma3:4b"
FlowerCore__AI__Memory__UseSharedIndexingAdapter: "true"
FlowerCore__AI__Memory__UseOllamaEmbeddings: "true"
FlowerCore__AI__Memory__EmbeddingModel: "nomic-embed-text"
FlowerCore__AI__Memory__EnableSharedIndexingBackfill: "true"
FlowerCore__AI__Memory__SharedIndexingDatabasePath: "/data/chat-memory-index.db"
FlowerCore__AI__Skills__Library__LibraryApiUrl: "http://library-web.fc-library.svc.cluster.local"
FlowerCore__AI__Skills__Retail__RetailApiUrl: "http://retail-web.fc-retail.svc.cluster.local"
FlowerCore__AI__Skills__Intranet__IntranetBaseUrl: "http://intranet-web.intranet.svc.cluster.local"
FlowerCore__AI__Skills__Print__PrintMcpBaseUrl: "http://10.0.57.16:5200"
FlowerCore__AI__Helpdesk__SentimentEscalation__Enabled: "true"
FlowerCore__AI__IrcBridge__Enabled: "true"
FlowerCore__AI__IrcBridge__DefaultProfileSlug: "it-helpdesk"
FlowerCore__AI__IrcBridge__MentionProfileSlug: "it-helpdesk"
FlowerCore__AI__IrcBridge__MentionReactiveMode: "mentions-only"
FlowerCore__AI__IrcBridge__AllowActionExecution: "false"
FlowerCore__AI__Voice__Piper__Host: "10.0.57.17"
FlowerCore__AI__Voice__Piper__Port: "10400"
FlowerCore__AI__Voice__OutputRoot: "/data/audio"
FlowerCore__AI__Voice__RetentionDays: "30"
# LLM provider abstraction (ADR-088). Anthropic stays disabled here -- when
# an operator wants to enable Claude, they flip Enabled=true and mount
# FlowerCore__Anthropic__ApiKey from the onepassword-synced Secret (see
# docs/ai-agents/anthropic-integration.md).
FlowerCore__Anthropic__Enabled: "false"
FlowerCore__Anthropic__BaseUrl: "https://api.anthropic.com"
FlowerCore__Anthropic__DefaultModel: "claude-sonnet-4-6"
FlowerCore__Anthropic__CheapModel: "claude-haiku-4-5-20251001"
FlowerCore__Anthropic__DeepModel: "claude-opus-4-7"
FlowerCore__Budget__ResponseCacheEnabled: "true"
OTEL_SERVICE_NAME: FlowerCore.Chat
OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector.monitoring.svc.cluster.local:4317"
OTEL_EXPORTER_OTLP_PROTOCOL: grpc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: chat-web-data
namespace: fc-chat
labels:
app.kubernetes.io/name: chat-web
app.kubernetes.io/part-of: flowercore
spec:
accessModes:
- ReadWriteOnce
storageClassName: longhorn
volumeMode: Filesystem
resources:
requests:
storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: chat-web
namespace: fc-chat
labels:
app.kubernetes.io/name: chat-web
app.kubernetes.io/part-of: flowercore
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app.kubernetes.io/name: chat-web
template:
metadata:
labels:
app.kubernetes.io/name: chat-web
app.kubernetes.io/part-of: flowercore
annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/healthz"
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics/prometheus"
spec:
nodeSelector:
kubernetes.io/hostname: rke2-server
securityContext:
fsGroup: 1654
fsGroupChangePolicy: OnRootMismatch
containers:
- name: chat-web
image: localhost/fc-chat-web:v20260614-regroup-ch6-37285d8
imagePullPolicy: Never
ports:
- name: http
containerPort: 8080
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
envFrom:
- configMapRef:
name: chat-web-config
- secretRef:
name: chat-web-secret
optional: true
env:
- name: FlowerCore__Auth__Oidc__Authority
valueFrom:
secretKeyRef:
name: chat-oidc-client
key: issuer_url
optional: true
- name: FlowerCore__Auth__Oidc__ClientId
valueFrom:
secretKeyRef:
name: chat-oidc-client
key: client_id
optional: true
- name: FlowerCore__Auth__Oidc__ClientSecret
valueFrom:
secretKeyRef:
name: chat-oidc-client
key: client_secret
optional: true
volumeMounts:
- name: data
mountPath: /data
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
readinessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 6
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 5
failureThreshold: 3
volumes:
- name: data
persistentVolumeClaim:
claimName: chat-web-data
---
apiVersion: v1
kind: Service
metadata:
name: chat-web
namespace: fc-chat
labels:
app.kubernetes.io/name: chat-web
app.kubernetes.io/part-of: flowercore
spec:
type: ClusterIP
selector:
app.kubernetes.io/name: chat-web
ports:
- name: http
port: 80
targetPort: 8080
protocol: TCP
--- ---
apiVersion: cert-manager.io/v1 apiVersion: cert-manager.io/v1
kind: Certificate kind: Certificate

View File

@@ -14,6 +14,20 @@
# cluster-rebuild repeatability. See # cluster-rebuild repeatability. See
# feedback_networkpolicies_belong_in_bluejay_infra.md. # feedback_networkpolicies_belong_in_bluejay_infra.md.
--- ---
# OIDC client secret for the RemoteDesktop end-user sign-in (fleet regroup L9,
# 2026-06-12). The Authentik provider `remotedesktop` already exists; the 1P item
# `remotedesktop-oidc-client` (vault IAmWorkin) carries issuer_url / client_id /
# client_secret, and the 1Password operator mints the same-named K8s Secret that
# k8s/web-deployment.yaml (FlowerCore.RemoteDesktop repo) consumes with
# optional:true. Gate stays OFF (Q-RD-16) — this is flip-READINESS only.
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
name: remotedesktop-oidc-client
namespace: fc-desktop
spec:
itemPath: "vaults/IAmWorkin/items/remotedesktop-oidc-client"
---
apiVersion: cert-manager.io/v1 apiVersion: cert-manager.io/v1
kind: Certificate kind: Certificate
metadata: metadata:
@@ -51,3 +65,26 @@ spec:
port: 8080 port: 8080
tls: tls:
secretName: remotedesktop-web-tls secretName: remotedesktop-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose remotedesktop-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: remotedesktop-web-public
# namespace: fc-desktop
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`desktop.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: remotedesktop-web-public-profile-header # injects entitlement profile
# services:
# - name: remotedesktop-web
# port: 8080
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -0,0 +1,70 @@
# Admin / Helpdesk Console — Infra Finding (Cl-5, ADR-204)
**Outcome: ZERO new cluster infra required.** The Admin/helpdesk console rides the
existing `FlowerCore.DeviceManagement.Web` deploy as routes inside DM.Web (ADR-204).
The ingress already in this directory covers every path the admin console serves.
## What already exists for DM.Web (this directory)
| Manifest | Resource | Notes |
|----------|----------|-------|
| `certificate-web.yaml` | cert-manager `Certificate` `fc-devicemgmt-web-tls` | `issuerRef``step-ca-acme` `ClusterIssuer`; `dnsNames: [devices.iamworkin.lan]`; `secretName: fc-devicemgmt-web-tls`. DNS preflight gate documented (pfSense A record `devices.iamworkin.lan → 10.0.56.200` required before ACME sync). |
| `ingressroute-web.yaml` | Traefik `IngressRoute` `fc-devicemgmt-web` | `entryPoints: [websecure]`, `match: Host(\`devices.iamworkin.lan\`)`, service `fc-devicemgmt-web:80`, `tls.secretName: fc-devicemgmt-web-tls`. |
| `service-web.yaml` | `Service` `fc-devicemgmt-web` (ClusterIP, 80→8080) | Owned by the DM.Web deploy. |
| `deployment-web.yaml` | `Deployment` `fc-devicemgmt-web` | Currently `replicas: 0` (gated on fc-mysql operator + `flowercore_devicemgmt` DB + 1Password runtime item — see header comment). Not a Cl-5 concern. |
| also present | operator RBAC, namespace, network-policy, 1password-item | Full app dir, ArgoCD-managed. |
## Why the admin console needs nothing new
The existing IngressRoute matches **`Host(\`devices.iamworkin.lan\`)` with no `PathPrefix`
constraint**. Traefik therefore forwards *all* paths on that host to the
`fc-devicemgmt-web` service — including any admin/helpdesk routes the DM.Web app exposes
under its `FlowerCore:PathBase` (e.g. `/admin`, `/helpdesk`). The same TLS secret
(`fc-devicemgmt-web-tls`) and the same step-ca ACME `Certificate` already protect them.
This matches the established TLS-only-app pattern (e.g. `apps/fc-library/fc-library.yaml`,
`apps/fc-retail/fc-retail.yaml`): `Certificate` (issuerRef `step-ca-acme` ClusterIssuer) +
host-matched `IngressRoute` sharing the `secretName`. Per ADR-204 the admin console's
Deployment/Service stay with the DM.Web deploy — no separate workload is created.
ArgoCD repo URL convention (for reference, not changed here):
`http://gitea-clusterip.gitea.svc.cluster.local:3000/bluejay/bluejay-infra.git`
(internal HTTP — step-ca cert isn't trusted by ArgoCD). Apps in `apps/*` are picked up by
the `bluejay-infra` ApplicationSet directory generator; this dir has no `kustomization.yaml`,
consistent with that pattern.
## Recommendation
**Ride DM.Web at a PathBase path → no new Certificate, no new IngressRoute, no new
Deployment/Service.** Close the lane. The admin console reaches users at
`https://devices.iamworkin.lan/<PathBase>` through the manifests already in this directory.
## Open question (operator decision — NOT actioned)
**Q-MP-ADMIN-HOST — Distinct admin hostname vs PathBase path under DM.Web?**
If the operator ever wants the admin/helpdesk console on its *own* hostname
(e.g. `admin.iamworkin.lan`) rather than a path under `devices.iamworkin.lan`, that is a
deliberate routing/auth-surface choice, not a mechanical infra add. It would require:
1. a pfSense / FlowerCore.DNS A record `admin.iamworkin.lan → 10.0.56.200` (ACME preflight
gate — step-ca HTTP-01 can't see the CoreDNS wildcard);
2. a second cert-manager `Certificate` (`step-ca-acme` ClusterIssuer, `dnsNames:
[admin.iamworkin.lan]`, own `secretName`);
3. a second host-matched `IngressRoute` → the same `fc-devicemgmt-web:80` service
(still no new Deployment/Service — same app behind a second host).
**Default taken (do not block): PathBase path under DM.Web = zero new infra.** A separate
admin hostname is left UNBUILT pending an explicit operator answer to Q-MP-ADMIN-HOST,
because it changes the public/auth surface and conflicts with the ADR-204 "routes inside
DM.Web" intent. If the answer is "separate host," author only the `Certificate` +
`IngressRoute` above (no Deployment/Service), mirroring `apps/fc-library/fc-library.yaml`.
## Verification
- `kubectl apply --dry-run=client` (kubectl v1.34.2, no live cluster): `ingressroute-web.yaml`,
`service-web.yaml`, `deployment-web.yaml` validated clean. `certificate-web.yaml` returned
"no matches for kind Certificate in cert-manager.io/v1" — expected with no cluster
connection (CRD discovery unavailable client-side); the YAML shape is identical to the
proven `fc-library` Certificate. Server-side dry-run + live host resolution =
**fix-forward** (cluster may be unreachable from this lane).
- No manifest authored or changed by this lane — finding note only.

View File

@@ -11,7 +11,7 @@ metadata:
flowercore.io/created-by: bluejay-infra flowercore.io/created-by: bluejay-infra
rules: rules:
- apiGroups: - apiGroups:
- devices.flowercore.io - flowercore.io
resources: resources:
- '*' - '*'
verbs: verbs:
@@ -23,7 +23,7 @@ rules:
- patch - patch
- delete - delete
- apiGroups: - apiGroups:
- devices.flowercore.io - flowercore.io
resources: resources:
- devices/status - devices/status
- devices/finalizers - devices/finalizers
@@ -33,6 +33,8 @@ rules:
- devicepolicies/finalizers - devicepolicies/finalizers
- remotecommands/status - remotecommands/status
- remotecommands/finalizers - remotecommands/finalizers
- desiredstatedocuments/status
- desiredstatedocuments/finalizers
verbs: verbs:
- get - get
- update - update

View File

@@ -0,0 +1,186 @@
# FlowerCore.DeviceManagement CRDs.
#
# These CRDs match the current operator annotations:
# [KubernetesEntity(Group = "flowercore.io", ApiVersion = "v1alpha1", ...)]
# Keep the schemas intentionally permissive until the DeviceManagement operator
# grows enforced CRD validation.
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: devices.flowercore.io
labels:
app.kubernetes.io/name: fc-devicemgmt-operator
app.kubernetes.io/component: operator
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
group: flowercore.io
scope: Namespaced
names:
plural: devices
singular: device
kind: Device
listKind: DeviceList
versions:
- name: v1alpha1
served: true
storage: true
subresources:
status: {}
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
x-kubernetes-preserve-unknown-fields: true
status:
type: object
x-kubernetes-preserve-unknown-fields: true
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: devicegroups.flowercore.io
labels:
app.kubernetes.io/name: fc-devicemgmt-operator
app.kubernetes.io/component: operator
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
group: flowercore.io
scope: Namespaced
names:
plural: devicegroups
singular: devicegroup
kind: DeviceGroup
listKind: DeviceGroupList
versions:
- name: v1alpha1
served: true
storage: true
subresources:
status: {}
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
x-kubernetes-preserve-unknown-fields: true
status:
type: object
x-kubernetes-preserve-unknown-fields: true
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: devicepolicies.flowercore.io
labels:
app.kubernetes.io/name: fc-devicemgmt-operator
app.kubernetes.io/component: operator
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
group: flowercore.io
scope: Namespaced
names:
plural: devicepolicies
singular: devicepolicy
kind: DevicePolicy
listKind: DevicePolicyList
versions:
- name: v1alpha1
served: true
storage: true
subresources:
status: {}
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
x-kubernetes-preserve-unknown-fields: true
status:
type: object
x-kubernetes-preserve-unknown-fields: true
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: remotecommands.flowercore.io
labels:
app.kubernetes.io/name: fc-devicemgmt-operator
app.kubernetes.io/component: operator
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
group: flowercore.io
scope: Namespaced
names:
plural: remotecommands
singular: remotecommand
kind: RemoteCommand
listKind: RemoteCommandList
versions:
- name: v1alpha1
served: true
storage: true
subresources:
status: {}
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
x-kubernetes-preserve-unknown-fields: true
status:
type: object
x-kubernetes-preserve-unknown-fields: true
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: desiredstatedocuments.flowercore.io
labels:
app.kubernetes.io/name: fc-devicemgmt-operator
app.kubernetes.io/component: operator
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
group: flowercore.io
scope: Namespaced
names:
plural: desiredstatedocuments
singular: desiredstatedocument
kind: DesiredStateDocument
listKind: DesiredStateDocumentList
versions:
- name: v1alpha1
served: true
storage: true
subresources:
status: {}
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
x-kubernetes-preserve-unknown-fields: true
status:
type: object
x-kubernetes-preserve-unknown-fields: true

View File

@@ -5,21 +5,35 @@
# exist yet; import localhost/fc-devicemgmt-web:<tag> to all schedulable RKE2 # exist yet; import localhost/fc-devicemgmt-web:<tag> to all schedulable RKE2
# nodes before letting ArgoCD sync a live rollout. # nodes before letting ArgoCD sync a live rollout.
# #
# SCALED TO 0 — 2026-05-19 morning-routine cleanup. # LIVE — 2026-06-11 DeviceManagement product-host enablement.
# The Web pod cannot start until TWO upstream gaps close: # The current DeviceManagement Web source is SQLite-backed in Program.cs, so
# 1. MySQL DB instance `flowercore_devicemgmt` (user `fc_devicemgmt`) is # Phase 1 production uses a Longhorn RWO PVC at /data/devicemgmt.db. The
# provisioned via fc-mysql Manager. The cluster currently has ZERO # 1Password runtime item stays mounted through env for future MySQL/API-key
# MySqlInstanceCrds and no `mysql.fc-mysql.svc:3306` Service, so the # cutover, but MySQL is not required for this first product-host rollout.
# deployment-web container env `FlowerCore__Database__Host=mysql.fc-mysql.svc` # Image v20260613-g2-66a43c1 is built from FlowerCore.DeviceManagement master
# points at nothing. Provision via the fc-mysql Manager UI/REST/MCP. # 66a43c1, carrying edge enrollment network completion and SQLite-safe trust-bundle smoke coverage.
# 2. 1Password vault item `IAmWorkin/FlowerCore DeviceManagement Runtime` ---
# with 5 fields (DB-Password, mtls-ca.pem, mtls-client.crt, mtls-client.key, apiVersion: v1
# mtls-chain.pem) — see apps/fc-devicemgmt/1password-item.yaml. Mint mTLS kind: PersistentVolumeClaim
# from step-ca-agent ClusterIssuer per ADR-126; DB-Password must match the metadata:
# password configured for the MySQL user. name: fc-devicemgmt-web-data
# Re-enable: change replicas back to 2 after both gaps close. The image tag namespace: fc-devicemgmt
# in this file (v20260512-cx5) MAY also need a refresh — it predates the labels:
# Sprint 34 Cl-3 operator fix; Web may have an analogous bug. app: fc-devicemgmt-web
app.kubernetes.io/name: fc-devicemgmt-web
app.kubernetes.io/component: web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
accessModes:
- ReadWriteOnce
storageClassName: longhorn
resources:
requests:
storage: 1Gi
---
apiVersion: apps/v1 apiVersion: apps/v1
kind: Deployment kind: Deployment
metadata: metadata:
@@ -36,8 +50,13 @@ metadata:
annotations: annotations:
flowercore.io/traceability-standard: k8s-pod-ownership-and-traceability-standard flowercore.io/traceability-standard: k8s-pod-ownership-and-traceability-standard
spec: spec:
replicas: 0 replicas: 1
revisionHistoryLimit: 3 revisionHistoryLimit: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 0
maxUnavailable: 1
selector: selector:
matchLabels: matchLabels:
app: fc-devicemgmt-web app: fc-devicemgmt-web
@@ -52,6 +71,8 @@ spec:
flowercore.io/tenant-id: system flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra flowercore.io/created-by: bluejay-infra
annotations: annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/healthz"
prometheus.io/scrape: "true" prometheus.io/scrape: "true"
prometheus.io/port: "8080" prometheus.io/port: "8080"
prometheus.io/path: "/metrics" prometheus.io/path: "/metrics"
@@ -62,11 +83,12 @@ spec:
fsGroupChangePolicy: OnRootMismatch fsGroupChangePolicy: OnRootMismatch
containers: containers:
- name: web - name: web
image: localhost/fc-devicemgmt-web:v20260512-cx5 image: localhost/fc-devicemgmt-web:v20260614-regroup-c5b8f82
imagePullPolicy: Never imagePullPolicy: Never
ports: ports:
- name: http - name: http
containerPort: 8080 containerPort: 8080
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
env: env:
- name: ASPNETCORE_URLS - name: ASPNETCORE_URLS
value: "http://+:8080" value: "http://+:8080"
@@ -74,29 +96,21 @@ spec:
value: "Production" value: "Production"
- name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT - name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT
value: "false" value: "false"
- name: HOME
value: "/data"
- name: FlowerCore__Service__Name - name: FlowerCore__Service__Name
value: "FlowerCore.DeviceManagement.Web" value: "FlowerCore.DeviceManagement.Web"
- name: FlowerCore__DeviceManagement__DefaultTenantId - name: FlowerCore__DeviceManagement__DefaultTenantId
value: "system" value: "system"
- name: FlowerCore__Database__Provider - name: FlowerCore__Database__Provider
value: "MySql" value: "Sqlite"
- name: FlowerCore__Database__Host - name: FlowerCore__Database__ConnectionStrings__Sqlite
value: "mysql.fc-mysql.svc" value: "Data Source=/data/devicemgmt.db"
- name: FlowerCore__Database__Database
value: "flowercore_devicemgmt"
- name: FlowerCore__Database__User
value: "fc_devicemgmt"
- name: FlowerCore__Database__Password - name: FlowerCore__Database__Password
valueFrom: valueFrom:
secretKeyRef: secretKeyRef:
name: fc-devicemgmt-runtime name: fc-devicemgmt-runtime
key: DB-Password key: DB-Password
- name: FlowerCore__DeviceManagement__AgentMtls__CaPath
value: "/secrets/devicemgmt-mtls/mtls-ca.pem"
- name: FlowerCore__DeviceManagement__AgentMtls__ClientCertificatePath
value: "/secrets/devicemgmt-mtls/mtls-client.crt"
- name: FlowerCore__DeviceManagement__AgentMtls__ClientKeyPath
value: "/secrets/devicemgmt-mtls/mtls-client.key"
- name: FlowerCore__EventBus__Redis__Configuration - name: FlowerCore__EventBus__Redis__Configuration
value: "redis.fc-redis.svc:6379" value: "redis.fc-redis.svc:6379"
resources: resources:
@@ -133,19 +147,17 @@ spec:
drop: drop:
- ALL - ALL
volumeMounts: volumeMounts:
- name: data
mountPath: /data
- name: tmp - name: tmp
mountPath: /tmp mountPath: /tmp
- name: logs - name: logs
mountPath: /app/logs mountPath: /app/logs
- name: devicemgmt-mtls
mountPath: /secrets/devicemgmt-mtls
readOnly: true
volumes: volumes:
- name: data
persistentVolumeClaim:
claimName: fc-devicemgmt-web-data
- name: tmp - name: tmp
emptyDir: {} emptyDir: {}
- name: logs - name: logs
emptyDir: {} emptyDir: {}
- name: devicemgmt-mtls
secret:
secretName: fc-devicemgmt-runtime
defaultMode: 0400

View File

@@ -74,6 +74,14 @@ metadata:
spec: spec:
itemPath: "vaults/IAmWorkin/items/FlowerCore Edition Signing Key - edition:aistation-field" itemPath: "vaults/IAmWorkin/items/FlowerCore Edition Signing Key - edition:aistation-field"
--- ---
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
name: distribution-oidc-client
namespace: fc-distribution
spec:
itemPath: "vaults/IAmWorkin/items/distribution-oidc-client"
---
apiVersion: apps/v1 apiVersion: apps/v1
kind: Deployment kind: Deployment
metadata: metadata:
@@ -101,6 +109,7 @@ spec:
prometheus.io/scrape: "true" prometheus.io/scrape: "true"
prometheus.io/port: "8080" prometheus.io/port: "8080"
prometheus.io/path: "/metrics" prometheus.io/path: "/metrics"
flowercore.io/healthz-auth-policy: "allow-anonymous"
spec: spec:
# Synology NFS export `/volume1/kubernetes` ACL only allows rke2-server # Synology NFS export `/volume1/kubernetes` ACL only allows rke2-server
# (10.0.56.11) right now. Until the ACL is widened in DSM (admin only), # (10.0.56.11) right now. Until the ACL is widened in DSM (admin only),
@@ -118,7 +127,7 @@ spec:
# dotnet.exe publish -c Release -o deploy/app \ # dotnet.exe publish -c Release -o deploy/app \
# src/FlowerCore.Distribution.Web/FlowerCore.Distribution.Web.csproj # src/FlowerCore.Distribution.Web/FlowerCore.Distribution.Web.csproj
# podman build -t localhost/fc-distribution:v<tag> -f deploy/Dockerfile.deploy deploy # podman build -t localhost/fc-distribution:v<tag> -f deploy/Dockerfile.deploy deploy
image: localhost/fc-distribution:v202605061948 image: localhost/fc-distribution:v20260604-oidc-root-anon
imagePullPolicy: Never imagePullPolicy: Never
ports: ports:
- containerPort: 8080 - containerPort: 8080
@@ -130,6 +139,25 @@ spec:
value: "Production" value: "Production"
- name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT - name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT
value: "false" value: "false"
# Authentik/OIDC enforcement. Public read/entitlement + the
# dist.flowercore.io Method() allowlist stay open; OIDC gates the
# operator/admin surface while /healthz remains anonymous.
- name: FlowerCore__Auth__Enabled
value: "true"
- name: FlowerCore__Auth__Oidc__Enabled
value: "true"
- name: FlowerCore__Auth__Oidc__Authority
value: "https://id.iamworkin.lan/application/o/distribution/"
- name: FlowerCore__Auth__Oidc__Audience
value: "distribution"
- name: FlowerCore__Auth__Oidc__ClientId
value: "distribution"
- name: FlowerCore__Auth__Oidc__ClientSecret
valueFrom:
secretKeyRef:
name: distribution-oidc-client
key: client_secret
optional: true
# SQLite connection (catalog + data-protection keys via FlowerCoreDbContext). # SQLite connection (catalog + data-protection keys via FlowerCoreDbContext).
# Read by Data/DatabaseProviderExtensions.cs in precedence order; Sqlite key wins. # Read by Data/DatabaseProviderExtensions.cs in precedence order; Sqlite key wins.
- name: FlowerCore__Database__Provider - name: FlowerCore__Database__Provider

View File

@@ -0,0 +1,45 @@
# FlowerCore Divoom DM Pi Device
Source-controlled Puppet/Hiera deployment contract for registering the edge2
Divoom MiniToo panel as a FlowerCore DeviceManagement-managed Pi device.
This is not a Kubernetes application. The live panel remains the existing
edge2 `flowercore-divoom.service` managed by `FlowerCore.Puppet`
`profile::pi::service::divoom`, with the .NET payload deployed out of band
and `/opt/flowercore/divoom/data` plus the Bluetooth shell wrappers preserved.
Because edge2 is already Hiera-driven through `profile::pi::service::apps`,
the deploy home is additive `profile::pi::service` data/profile source, not
`profile::edge::service::apps` and not an ArgoCD/K8s app.
## Scope
- Stage DeviceManagement registration metadata for the edge2 Divoom MiniToo.
- Stage a separate, disabled-by-default DM Agent executor unit for privileged
Bluetooth operations once the DM-RPC lane lands.
- Keep `flowercore-divoom.service` and `flowercore-divoom-bt.service`
untouched: no service replacement, no restart subscription, no K8s surface.
- Preserve the current wrapper contract:
`/opt/flowercore/divoom/bt-link.sh`,
`/opt/flowercore/divoom/bt-reset.sh`, and
`/opt/flowercore/divoom/audio-link.sh`.
- Keep FM radio disabled and require visible render proof; device-info echo is
not render proof.
## Artifact Map
| Path | Use |
| --- | --- |
| `hiera/edge2-divoom-dm-device.overlay.yaml` | Additive Hiera overlay for edge2. Merge into the existing node YAML without removing `fc-pimanager` or `fc-divoom`. |
| `puppet/profile/pi/service/divoom_dm_device.pp` | Puppet profile shape to vendor into `FlowerCore.Puppet` after the DM-RPC executor binary exists. |
| `puppet/templates/divoom-device-registration.json.epp` | DM device registration metadata rendered on edge2. |
| `puppet/templates/flowercore-divoom-dm-agent.service.epp` | Separate DM Agent systemd unit. Defaults are stopped and disabled until a later cutover. |
## Rollout Notes
1. Land these artifacts in bluejay-infra as the deploy contract.
2. Vendor the Puppet profile and EPP templates into `FlowerCore.Puppet`.
3. Merge the Hiera overlay into `data/nodes/edge2.iamworkin.lan.yaml`.
4. Run Puppet in noop first, preferably with a node-local validation directory
under `~/.fcv` rather than `/tmp`.
5. Only enable the DM Agent service after the DeviceManagement BT executor has
landed and passed operator-eyeball render proof.

View File

@@ -0,0 +1,32 @@
---
# Merge into FlowerCore.Puppet data/nodes/edge2.iamworkin.lan.yaml.
# Additive overlay only: keep the existing fc-pimanager version/tarball entry,
# keep fc-divoom enabled, and do not move Divoom into Kubernetes.
profile::pi::service::apps:
fc-pimanager:
binary: 'FlowerCore.PiManager.Web'
install_dir: '/opt/fc-pimanager'
port: 5000
environment: 'edge2'
version: '2026.05.28.1646'
tarball_source: 'puppet:///modules/profile/pi/builds/fc-pimanager.tar.gz'
fc-divoom:
enabled: true
profile::pi::service::divoom_dm_device::ensure: 'present'
profile::pi::service::divoom_dm_device::service_enabled: false
profile::pi::service::divoom_dm_device::service_ensure: 'stopped'
profile::pi::service::divoom_dm_device::device_id: 'edge2-divoom-minitoo'
profile::pi::service::divoom_dm_device::display_name: 'edge2 Divoom MiniToo'
profile::pi::service::divoom_dm_device::host_fqdn: 'edge2.iamworkin.lan'
profile::pi::service::divoom_dm_device::dm_web_url: 'https://devicemgmt.iamworkin.lan'
profile::pi::service::divoom_dm_device::divoom_install_dir: '/opt/flowercore/divoom'
profile::pi::service::divoom_dm_device::agent_install_dir: '/opt/flowercore/devicemanagement-agent'
profile::pi::service::divoom_dm_device::bt_candidate_channels:
- '1'
- '10'
profile::pi::service::divoom_dm_device::default_bt_channel: '1'
profile::pi::service::divoom_dm_device::a2dp_default_state: 'off'
profile::pi::service::divoom_dm_device::fm_radio_enabled: false
profile::pi::service::divoom_dm_device::visible_render_proof_required: true

View File

@@ -0,0 +1,140 @@
# Drop into FlowerCore.Puppet site-modules/profile/manifests/pi/service/divoom_dm_device.pp.
# This profile is additive to profile::pi::service::divoom. It must not manage,
# restart, replace, or subscribe the existing flowercore-divoom.service.
class profile::pi::service::divoom_dm_device (
Enum['present', 'absent'] $ensure = 'present',
Boolean $service_enabled = false,
Enum['running', 'stopped'] $service_ensure = 'stopped',
String $service_name = 'flowercore-divoom-dm-agent',
String $device_id = 'edge2-divoom-minitoo',
String $display_name = 'edge2 Divoom MiniToo',
String $host_fqdn = 'edge2.iamworkin.lan',
String $dm_web_url = 'https://devicemgmt.iamworkin.lan',
String $divoom_install_dir = '/opt/flowercore/divoom',
String $agent_install_dir = '/opt/flowercore/devicemanagement-agent',
String $agent_binary = 'FlowerCore.DeviceManagement.Agent',
Array[String] $bt_candidate_channels = ['1', '10'],
String $default_bt_channel = '1',
Enum['on', 'off'] $a2dp_default_state = 'off',
Boolean $fm_radio_enabled = false,
Boolean $visible_render_proof_required = true,
) {
include profile::workstation::safe_account_exclusion
$safe_account = $profile::workstation::safe_account_exclusion::safe_account
$config_dir = '/etc/flowercore/device-management/devices'
$state_dir = '/var/lib/flowercore/divoom-dm-agent'
$log_dir = '/var/log/flowercore/divoom-dm-agent'
$registration_path = "${config_dir}/${device_id}.json"
$agent_binary_path = "${agent_install_dir}/${agent_binary}"
$bt_channels_json = inline_template('[<%= @bt_candidate_channels.map { |c| "\"#{c}\"" }.join(", ") %>]')
if $safe_account {
notify { 'fc-divoom-dm-device safe-account exclusion':
message => 'SAFE-ACCOUNT-EXCLUSION: Divoom DM Pi device profile refused to apply on operator workstation',
}
if $facts['os']['family'] != 'windows' {
ensure_resource('file', '/var/log/flowercore-audit', {
'ensure' => 'directory',
'owner' => 'root',
'group' => 'root',
'mode' => '0755',
})
file { '/var/log/flowercore-audit/safe-account-noop-fc-divoom-dm-device.log':
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
content => "noop: divoom dm pi device profile refused to apply on safe-account host\n",
require => File['/var/log/flowercore-audit'],
}
}
} elsif $ensure == 'absent' {
service { $service_name:
ensure => stopped,
enable => false,
}
file { [
"/etc/systemd/system/${service_name}.service",
$registration_path,
]:
ensure => absent,
}
exec { 'fc-divoom-dm-agent-systemd-reload':
command => '/usr/bin/systemctl daemon-reload',
refreshonly => true,
path => ['/usr/bin', '/bin'],
}
} else {
case $facts['os']['family'] {
'Debian': {}
default: { fail("profile::pi::service::divoom_dm_device only supports Debian-family OS, got ${facts['os']['family']}") }
}
file { [$config_dir, $state_dir, $log_dir]:
ensure => directory,
owner => 'root',
group => 'root',
mode => '0755',
}
file { $registration_path:
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
content => epp('profile/pi/fc_divoom_dm/divoom-device-registration.json.epp', {
'device_id' => $device_id,
'display_name' => $display_name,
'host_fqdn' => $host_fqdn,
'divoom_install_dir' => $divoom_install_dir,
'bt_channels_json' => $bt_channels_json,
'default_bt_channel' => $default_bt_channel,
'a2dp_default_state' => $a2dp_default_state,
'fm_radio_enabled' => $fm_radio_enabled,
'visible_render_proof_required' => $visible_render_proof_required,
}),
require => File[$config_dir],
}
file { "/etc/systemd/system/${service_name}.service":
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
content => epp('profile/pi/fc_divoom_dm/flowercore-divoom-dm-agent.service.epp', {
'service_name' => $service_name,
'device_id' => $device_id,
'dm_web_url' => $dm_web_url,
'registration_path' => $registration_path,
'divoom_install_dir' => $divoom_install_dir,
'agent_install_dir' => $agent_install_dir,
'agent_binary_path' => $agent_binary_path,
'state_dir' => $state_dir,
'log_dir' => $log_dir,
}),
notify => Exec['fc-divoom-dm-agent-systemd-reload'],
require => File[$registration_path],
}
exec { 'fc-divoom-dm-agent-systemd-reload':
command => '/usr/bin/systemctl daemon-reload',
refreshonly => true,
path => ['/usr/bin', '/bin'],
}
service { $service_name:
ensure => $service_ensure,
enable => $service_enabled,
require => [
File["/etc/systemd/system/${service_name}.service"],
File[$registration_path],
Exec['fc-divoom-dm-agent-systemd-reload'],
],
}
}
}

View File

@@ -0,0 +1,34 @@
{
"deviceId": "<%= $device_id %>",
"displayName": "<%= $display_name %>",
"hostFqdn": "<%= $host_fqdn %>",
"kind": "DivoomMiniToo",
"managedBy": "FlowerCore.DeviceManagement",
"executionMode": "Pi",
"transport": {
"kind": "BluetoothSerial",
"candidateChannels": <%= $bt_channels_json %>,
"defaultChannel": "<%= $default_bt_channel %>",
"deviceInfoIsRenderProof": false,
"visibleRenderProofRequired": <%= $visible_render_proof_required %>
},
"paths": {
"divoomInstallDir": "<%= $divoom_install_dir %>",
"btLink": "<%= $divoom_install_dir %>/bt-link.sh",
"btReset": "<%= $divoom_install_dir %>/bt-reset.sh",
"audioLink": "<%= $divoom_install_dir %>/audio-link.sh"
},
"capabilities": {
"supportsBluetoothSerial": true,
"supportsBtChannelRedetect": true,
"supportsBtHardReset": true,
"supportsBtAudioProfileSwitch": true,
"a2dpDefaultState": "<%= $a2dp_default_state %>",
"fmRadioEnabled": <%= $fm_radio_enabled %>
},
"safety": {
"preserveExistingService": "flowercore-divoom.service",
"preserveDataDirectory": "<%= $divoom_install_dir %>/data",
"doNotEnableFmRadio": true
}
}

View File

@@ -0,0 +1,36 @@
[Unit]
Description=FlowerCore Divoom DM Agent Bluetooth executor
Documentation=https://github.com/astoltz/FlowerCore.Notes/blob/master/docs/standards/divoom-tv-hdmi-multitarget-render-substrate.md
Wants=network-online.target
After=network-online.target bluetooth.service
Requires=bluetooth.service
ConditionPathExists=<%= $agent_binary_path %>
ConditionPathExists=<%= $registration_path %>
ConditionPathExists=<%= $divoom_install_dir %>/bt-link.sh
ConditionPathExists=<%= $divoom_install_dir %>/bt-reset.sh
ConditionPathExists=<%= $divoom_install_dir %>/audio-link.sh
[Service]
Type=simple
User=stoltz
Group=stoltz
WorkingDirectory=<%= $agent_install_dir %>
Environment=DOTNET_CLI_TELEMETRY_OPTOUT=1
Environment=FLOWERCORE_DM_DEVICE_REGISTRATION=<%= $registration_path %>
Environment=Divoom__Bluetooth__DeviceInfoIsRenderProof=false
Environment=Divoom__Bluetooth__VisibleRenderProofRequired=true
Environment=Divoom__Bluetooth__A2dpDefaultState=off
ExecStart=<%= $agent_binary_path %> --mode=Pi --device-id=<%= $device_id %> --dm-web-url=<%= $dm_web_url %> --registration=<%= $registration_path %>
Restart=on-failure
RestartSec=10s
StartLimitBurst=3
StartLimitIntervalSec=300s
SupplementaryGroups=bluetooth audio dialout
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=<%= $state_dir %> <%= $log_dir %>
[Install]
WantedBy=multi-user.target

View File

@@ -0,0 +1,44 @@
# FlowerCore Divoom TV Pi HDMI
Source-controlled deploy shape for the native `FlowerCore.Divoom.Tv`
Avalonia HDMI renderer on a Raspberry Pi connected to a TV.
This is a Puppet/systemd appliance bundle, not a Kubernetes application. It
mirrors the existing `fc-signage-pi-player` pattern: bluejay-infra carries the
systemd units, scripts, Hiera shape, and Puppet profile source that
`FlowerCore.Puppet` vendors and installs.
## Scope
- Launch the future `FlowerCore.Divoom.Tv` linux-arm64 self-contained payload
from `/opt/flowercore/divoom-tv/FlowerCore.Divoom.Tv`.
- Prefer `cage` as the Wayland fullscreen compositor, with direct app launch as
a fallback for development images.
- Restart the app after HDMI hotplug with a 2 second DRM settle delay.
- Keep all runtime state local: `/var/lib/fc-divoom-tv` and
`/var/log/fc-divoom-tv`.
- Avoid CDN/runtime fetches; the app renders the in-house Divoom scene catalog
locally.
## Artifact Map
| Path | Use |
| --- | --- |
| `systemd/flowercore-divoom-tv.service` | Fullscreen Avalonia HDMI app service. |
| `systemd/flowercore-divoom-tv-hdmi.service` | HDMI hotplug responder service. |
| `systemd/99-flowercore-divoom-tv-hdmi.rules` | DRM udev hotplug rule. |
| `scripts/flowercore-divoom-tv-prelaunch.sh` | Preflight checks and local directory creation. |
| `scripts/flowercore-divoom-tv-launch.sh` | Cage-first fullscreen launcher. |
| `scripts/flowercore-divoom-tv-hdmi-respond.sh` | Hotplug settle and restart script. |
| `puppet/profile/pi/service/divoom_tv.pp` | Puppet profile shape to vendor into `FlowerCore.Puppet`. |
| `hiera/example-divoom-tv-pi.iamworkin.lan.yaml` | Example node Hiera for a Divoom TV Pi. |
## Rollout Notes
1. Build `FlowerCore.Divoom.Tv` with `dotnet.exe publish -c Release -r linux-arm64 --self-contained`.
2. Stage the payload to `/opt/flowercore/divoom-tv/` through the standard noc1
jump path and avoid `/tmp` for unprivileged Pi scratch.
3. Vendor the profile and static files into `FlowerCore.Puppet`.
4. Run Puppet noop, then apply on the target Pi.
5. Prove deployment with `systemctl is-active flowercore-divoom-tv.service`,
journal lines showing frames presented, and a visible HDMI display check.

View File

@@ -0,0 +1,19 @@
---
# Example node data for a dedicated Pi -> HDMI -> TV Divoom renderer.
# Copy into FlowerCore.Puppet data/nodes/<hostname>.iamworkin.lan.yaml only
# after the Pi has a static DHCP/DNS entry and the linux-arm64 payload exists.
facts:
role: pi_prototype
profile::motd::role: 'Divoom TV HDMI Renderer'
profile::pi::service::divoom_tv::ensure: 'present'
profile::pi::service::divoom_tv::service_enabled: true
profile::pi::service::divoom_tv::service_ensure: 'running'
profile::pi::service::divoom_tv::install_dir: '/opt/flowercore/divoom-tv'
profile::pi::service::divoom_tv::state_dir: '/var/lib/fc-divoom-tv'
profile::pi::service::divoom_tv::log_dir: '/var/log/fc-divoom-tv'
profile::pi::service::divoom_tv::presentation_mode: 'PillarboxSquare'
profile::pi::service::divoom_tv::startup_scene: 'bluejay-clock'
profile::pi::service::divoom_tv::reduced_motion: false

View File

@@ -0,0 +1,149 @@
# Drop into FlowerCore.Puppet site-modules/profile/manifests/pi/service/divoom_tv.pp.
# Static files come from profile/pi/fc_divoom_tv/ after this bluejay-infra
# bundle is vendored into the Puppet control repo.
class profile::pi::service::divoom_tv (
Enum['present', 'absent'] $ensure = 'present',
Boolean $service_enabled = false,
Enum['running', 'stopped'] $service_ensure = 'stopped',
String $service_name = 'flowercore-divoom-tv',
String $user = 'fc-divoom-tv',
String $group = 'fc-divoom-tv',
String $install_dir = '/opt/flowercore/divoom-tv',
String $state_dir = '/var/lib/fc-divoom-tv',
String $log_dir = '/var/log/fc-divoom-tv',
String $presentation_mode = 'PillarboxSquare',
String $startup_scene = 'bluejay-clock',
Boolean $reduced_motion = false,
) {
include profile::workstation::safe_account_exclusion
$safe_account = $profile::workstation::safe_account_exclusion::safe_account
if $safe_account {
notify { 'fc-divoom-tv safe-account exclusion':
message => 'SAFE-ACCOUNT-EXCLUSION: Divoom TV Pi profile refused to apply on operator workstation',
}
} elsif $ensure == 'absent' {
service { $service_name:
ensure => stopped,
enable => false,
}
file { [
"/etc/systemd/system/${service_name}.service",
"/etc/systemd/system/${service_name}-hdmi.service",
'/etc/udev/rules.d/99-flowercore-divoom-tv-hdmi.rules',
'/usr/local/bin/flowercore-divoom-tv-prelaunch.sh',
'/usr/local/bin/flowercore-divoom-tv-launch.sh',
'/usr/local/bin/flowercore-divoom-tv-hdmi-respond.sh',
'/etc/flowercore/divoom-tv.env',
]:
ensure => absent,
}
} else {
case $facts['os']['family'] {
'Debian': {}
default: { fail("profile::pi::service::divoom_tv only supports Debian-family OS, got ${facts['os']['family']}") }
}
package { ['cage', 'libgbm1', 'libdrm2', 'libxkbcommon0', 'fonts-dejavu-core']:
ensure => installed,
}
group { $group:
ensure => present,
system => true,
}
user { $user:
ensure => present,
system => true,
gid => $group,
home => $state_dir,
managehome => false,
shell => '/usr/sbin/nologin',
require => Group[$group],
}
file { [$install_dir, $state_dir, $log_dir, '/etc/flowercore']:
ensure => directory,
owner => $user,
group => $group,
mode => '0755',
}
file { '/etc/flowercore/divoom-tv.env':
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
content => "FC_DIVOOM_TV_PRESENTATION_MODE=${presentation_mode}\nFC_DIVOOM_TV_START_SCENE=${startup_scene}\nFC_DIVOOM_TV_REDUCED_MOTION=${reduced_motion}\n",
require => File['/etc/flowercore'],
}
$script_map = {
'/usr/local/bin/flowercore-divoom-tv-prelaunch.sh' => 'profile/pi/fc_divoom_tv/flowercore-divoom-tv-prelaunch.sh',
'/usr/local/bin/flowercore-divoom-tv-launch.sh' => 'profile/pi/fc_divoom_tv/flowercore-divoom-tv-launch.sh',
'/usr/local/bin/flowercore-divoom-tv-hdmi-respond.sh' => 'profile/pi/fc_divoom_tv/flowercore-divoom-tv-hdmi-respond.sh',
}
$script_map.each |$dest, $src| {
file { $dest:
ensure => file,
owner => 'root',
group => 'root',
mode => '0755',
source => "puppet:///modules/${src}",
}
}
$unit_map = {
"/etc/systemd/system/${service_name}.service" => 'profile/pi/fc_divoom_tv/flowercore-divoom-tv.service',
"/etc/systemd/system/${service_name}-hdmi.service" => 'profile/pi/fc_divoom_tv/flowercore-divoom-tv-hdmi.service',
}
$unit_map.each |$dest, $src| {
file { $dest:
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
source => "puppet:///modules/${src}",
notify => Exec['fc-divoom-tv-systemd-reload'],
}
}
file { '/etc/udev/rules.d/99-flowercore-divoom-tv-hdmi.rules':
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
source => 'puppet:///modules/profile/pi/fc_divoom_tv/99-flowercore-divoom-tv-hdmi.rules',
notify => Exec['fc-divoom-tv-udev-reload'],
}
exec { 'fc-divoom-tv-systemd-reload':
command => '/usr/bin/systemctl daemon-reload',
refreshonly => true,
path => ['/usr/bin', '/bin'],
}
exec { 'fc-divoom-tv-udev-reload':
command => '/usr/bin/udevadm control --reload-rules',
refreshonly => true,
path => ['/usr/bin', '/bin'],
}
service { $service_name:
ensure => $service_ensure,
enable => $service_enabled,
require => [
File["/etc/systemd/system/${service_name}.service"],
File['/etc/flowercore/divoom-tv.env'],
File['/usr/local/bin/flowercore-divoom-tv-prelaunch.sh'],
File['/usr/local/bin/flowercore-divoom-tv-launch.sh'],
Exec['fc-divoom-tv-systemd-reload'],
],
}
}
}

View File

@@ -0,0 +1,5 @@
#!/usr/bin/env bash
set -euo pipefail
sleep 2
systemctl restart flowercore-divoom-tv.service

View File

@@ -0,0 +1,25 @@
#!/usr/bin/env bash
set -euo pipefail
APP_BIN="${FC_DIVOOM_TV_BIN:-/opt/flowercore/divoom-tv/FlowerCore.Divoom.Tv}"
STATE_DIR="${FC_DIVOOM_TV_STATE_DIR:-/var/lib/fc-divoom-tv}"
LOG_DIR="${FC_DIVOOM_TV_LOG_DIR:-/var/log/fc-divoom-tv}"
PRESENTATION_MODE="${FC_DIVOOM_TV_PRESENTATION_MODE:-PillarboxSquare}"
START_SCENE="${FC_DIVOOM_TV_START_SCENE:-bluejay-clock}"
REDUCED_MOTION="${FC_DIVOOM_TV_REDUCED_MOTION:-false}"
COMMON_ARGS=(
"--target=hdmi"
"--presentation-mode=${PRESENTATION_MODE}"
"--startup-scene=${START_SCENE}"
"--reduced-motion=${REDUCED_MOTION}"
"--state-dir=${STATE_DIR}"
"--log-dir=${LOG_DIR}"
)
if command -v cage >/dev/null 2>&1; then
exec cage -- "${APP_BIN}" "${COMMON_ARGS[@]}" "$@"
fi
echo "[$(date -Is)] cage not found; launching FlowerCore.Divoom.Tv directly" >&2
exec "${APP_BIN}" "${COMMON_ARGS[@]}" "$@"

View File

@@ -0,0 +1,23 @@
#!/usr/bin/env bash
set -euo pipefail
APP_BIN="${FC_DIVOOM_TV_BIN:-/opt/flowercore/divoom-tv/FlowerCore.Divoom.Tv}"
STATE_DIR="${FC_DIVOOM_TV_STATE_DIR:-/var/lib/fc-divoom-tv}"
LOG_DIR="${FC_DIVOOM_TV_LOG_DIR:-/var/log/fc-divoom-tv}"
mkdir -p "${STATE_DIR}" "${LOG_DIR}"
if [[ ! -x "${APP_BIN}" ]]; then
echo "[$(date -Is)] missing executable ${APP_BIN}" >&2
exit 1
fi
if [[ -d /sys/class/drm ]] && ! find /sys/class/drm -maxdepth 1 -name 'card*-HDMI-A-*' -print -quit | grep -q .; then
echo "[$(date -Is)] no HDMI connector visible yet; continuing so the app can wait for display" >&2
fi
if command -v cage >/dev/null 2>&1; then
echo "[$(date -Is)] cage available for fullscreen Wayland launch"
else
echo "[$(date -Is)] cage not installed; direct launch fallback will be used" >&2
fi

View File

@@ -0,0 +1,2 @@
# Settle DRM for 2s before restarting the fullscreen Avalonia renderer.
SUBSYSTEM=="drm", KERNEL=="card?-HDMI-A-?", ACTION=="change", RUN+="/usr/bin/systemctl start flowercore-divoom-tv-hdmi.service"

View File

@@ -0,0 +1,7 @@
[Unit]
Description=FlowerCore Divoom TV HDMI hotplug responder
DefaultDependencies=no
[Service]
Type=oneshot
ExecStart=/usr/local/bin/flowercore-divoom-tv-hdmi-respond.sh

View File

@@ -0,0 +1,40 @@
[Unit]
Description=FlowerCore Divoom TV HDMI Renderer (Avalonia fullscreen)
Documentation=https://github.com/astoltz/FlowerCore.Notes/blob/master/docs/standards/divoom-tv-hdmi-multitarget-render-substrate.md
Wants=network-online.target
After=network-online.target systemd-user-sessions.service
ConditionPathExists=/opt/flowercore/divoom-tv/FlowerCore.Divoom.Tv
[Service]
Type=simple
User=fc-divoom-tv
Group=fc-divoom-tv
WorkingDirectory=/opt/flowercore/divoom-tv
EnvironmentFile=-/etc/flowercore/divoom-tv.env
Environment=DOTNET_CLI_TELEMETRY_OPTOUT=1
Environment=XDG_RUNTIME_DIR=/run/fc-divoom-tv
RuntimeDirectory=fc-divoom-tv
RuntimeDirectoryMode=0700
ExecStartPre=/usr/local/bin/flowercore-divoom-tv-prelaunch.sh
ExecStart=/usr/local/bin/flowercore-divoom-tv-launch.sh
Restart=always
RestartSec=10s
StartLimitBurst=5
StartLimitIntervalSec=300s
MemoryMax=2G
MemoryHigh=1500M
PrivateTmp=true
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/fc-divoom-tv /var/log/fc-divoom-tv /run/fc-divoom-tv
TTYPath=/dev/tty1
StandardInput=tty
StandardOutput=journal
StandardError=journal
TTYReset=yes
TTYVHangup=yes
TTYVTDisallocate=yes
[Install]
WantedBy=graphical.target

View File

@@ -30,3 +30,26 @@ spec:
port: 80 port: 80
tls: tls:
secretName: dms-web-tls secretName: dms-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose dms-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: dms-web-public
# namespace: fc-dms
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`dms.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: dms-web-public-profile-header # injects entitlement profile
# services:
# - name: dms-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

481
apps/fc-dns/fc-dns.yaml Normal file
View File

@@ -0,0 +1,481 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: fc-dns
labels:
app.kubernetes.io/part-of: flowercore
---
# 1Password-backed Secret for the pfSense admin password.
# The operator watches this CRD, resolves the vault item, and produces a
# K8s Secret of the same name with each 1P field as a key. The `password`
# field of the "pfSense Admin" item becomes Secret key `password`.
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
name: pfsense-admin
namespace: fc-dns
spec:
itemPath: "vaults/IAmWorkin/items/pfSense Admin"
---
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
name: dns-oidc-client
namespace: fc-dns
spec:
itemPath: "vaults/IAmWorkin/items/dns-oidc-client"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: dns-web-data
namespace: fc-dns
spec:
accessModes: [ReadWriteOnce]
storageClassName: longhorn
resources:
requests:
storage: 1Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
name: dns-web-config
namespace: fc-dns
data:
appsettings.Production.json: |
{
"FlowerCore": {
"Auth": {
"Enabled": false,
"Oidc": {
"Enabled": true,
"Audience": "dns",
"RequireHttpsMetadata": true
}
},
"Database": {
"Provider": "Sqlite",
"ConnectionStrings": {
"Sqlite": "Data Source=/data/dns.db"
}
},
"Tenant": {
"DefaultTenantId": "default",
"JwtClaimsEnabled": false,
"DefaultTenantHosts": [
"dns.iamworkin.lan"
]
},
"Audit": {
"HashChain": {
"BridgeSensitivity": {
"Distribution": "Warn"
}
}
}
}
}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: dns-web
namespace: fc-dns
labels:
app.kubernetes.io/name: dns-web
app.kubernetes.io/managed-by: flowercore
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app.kubernetes.io/name: dns-web
template:
metadata:
labels:
app.kubernetes.io/name: dns-web
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "5320"
prometheus.io/path: "/metrics/prometheus"
flowercore.io/healthz-auth-policy: "allow-anonymous"
spec:
serviceAccountName: dns-web
securityContext:
runAsNonRoot: true
runAsUser: 1654
runAsGroup: 1654
fsGroup: 1654
containers:
- name: dns-web
image: localhost/fc-dns-web:v20260614-wave5-isolation-6124856
imagePullPolicy: Never
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop: [ALL]
ports:
- containerPort: 5320
env:
# pfSense admin password resolved by the 1Password operator.
# `FallbackPassword` is the Slice A seam exposed by
# OptionsFallbackPasswordResolver; Slice B will replace it with
# a pull-at-runtime 1P Connect resolver once Shared.Vault ships.
- name: FlowerCore__Dns__Providers__PfSenseUnbound__FallbackPassword
valueFrom:
secretKeyRef:
name: pfsense-admin
key: password
- name: FlowerCore__Auth__Oidc__Authority
valueFrom:
secretKeyRef:
name: dns-oidc-client
key: issuer_url
optional: true
- name: FlowerCore__Auth__Oidc__ClientId
valueFrom:
secretKeyRef:
name: dns-oidc-client
key: client_id
optional: true
- name: FlowerCore__Auth__Oidc__ClientSecret
valueFrom:
secretKeyRef:
name: dns-oidc-client
key: client_secret
optional: true
- name: FlowerCore__Auth__Enabled
value: "false"
- name: FlowerCore__Auth__Oidc__Enabled
value: "true"
- name: FlowerCore__Auth__Oidc__Audience
value: "dns"
volumeMounts:
- name: data
mountPath: /data
- name: tmp
mountPath: /tmp
- name: logs
mountPath: /app/logs
- name: config
mountPath: /app/appsettings.Production.json
subPath: appsettings.Production.json
readOnly: true
resources:
requests:
cpu: 50m
memory: 96Mi
limits:
cpu: 300m
memory: 384Mi
readinessProbe:
httpGet:
path: /healthz
port: 5320
initialDelaySeconds: 10
periodSeconds: 10
livenessProbe:
httpGet:
path: /healthz
port: 5320
initialDelaySeconds: 20
periodSeconds: 30
volumes:
- name: data
persistentVolumeClaim:
claimName: dns-web-data
- name: tmp
emptyDir: {}
- name: logs
emptyDir: {}
- name: config
configMap:
name: dns-web-config
---
apiVersion: v1
kind: Service
metadata:
name: dns-web
namespace: fc-dns
spec:
selector:
app.kubernetes.io/name: dns-web
ports:
- port: 5320
targetPort: 5320
type: ClusterIP
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: dns-web
namespace: fc-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: dns-web
rules:
- apiGroups: [""]
resources: ["namespaces", "pods", "services", "secrets", "configmaps"]
verbs: ["get", "list", "watch"]
- apiGroups: ["cert-manager.io"]
resources: ["certificates"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: dns-web
subjects:
- kind: ServiceAccount
name: dns-web
namespace: fc-dns
roleRef:
kind: ClusterRole
name: dns-web
apiGroup: rbac.authorization.k8s.io
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: dns-web-cert
namespace: fc-dns
spec:
secretName: dns-web-tls
issuerRef:
name: step-ca-dns01
kind: ClusterIssuer
dnsNames:
- dns.iamworkin.lan
duration: 720h
renewBefore: 240h
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: dns-web
namespace: fc-dns
spec:
entryPoints: [websecure]
routes:
- match: Host(`dns.iamworkin.lan`)
kind: Rule
services:
- name: dns-web
port: 5320
tls:
secretName: dns-web-tls
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: dns-acme-webhook
namespace: fc-dns
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: dns-acme-webhook
namespace: fc-dns
labels:
app.kubernetes.io/name: dns-acme-webhook
app.kubernetes.io/managed-by: flowercore
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: dns-acme-webhook
template:
metadata:
labels:
app.kubernetes.io/name: dns-acme-webhook
spec:
serviceAccountName: dns-acme-webhook
securityContext:
runAsNonRoot: true
runAsUser: 1654
runAsGroup: 1654
fsGroup: 1654
containers:
- name: dns-acme-webhook
image: localhost/fc-dns-acme-webhook:v20260614-wave5-isolation-6124856
imagePullPolicy: Never
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop: [ALL]
ports:
- containerPort: 9443
name: https
env:
- name: ASPNETCORE_URLS
value: https://+:9443
- name: Kestrel__Certificates__Default__Path
value: /tls/tls.crt
- name: Kestrel__Certificates__Default__KeyPath
value: /tls/tls.key
- name: FlowerCore__Dns__AcmeWebhook__ServiceBaseUrl
value: http://dns-web:5320
- name: FlowerCore__Dns__AcmeWebhook__GroupName
value: acme.flowercore.io
- name: FlowerCore__Dns__AcmeWebhook__SolverName
value: flowercore-dns
- name: FlowerCore__Dns__AcmeWebhook__Version
value: v1alpha1
volumeMounts:
- name: tls
mountPath: /tls
readOnly: true
- name: tmp
mountPath: /tmp
- name: logs
mountPath: /app/logs
resources:
requests:
cpu: 25m
memory: 64Mi
limits:
cpu: 200m
memory: 256Mi
readinessProbe:
httpGet:
scheme: HTTPS
path: /readyz
port: https
initialDelaySeconds: 5
periodSeconds: 10
timeoutSeconds: 5
livenessProbe:
httpGet:
scheme: HTTPS
path: /healthz
port: https
initialDelaySeconds: 10
periodSeconds: 20
timeoutSeconds: 5
volumes:
- name: tls
secret:
secretName: dns-acme-webhook-tls
- name: tmp
emptyDir: {}
- name: logs
emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
name: dns-acme-webhook
namespace: fc-dns
spec:
selector:
app.kubernetes.io/name: dns-acme-webhook
ports:
- port: 443
targetPort: https
name: https
type: ClusterIP
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
name: dns-acme-webhook-selfsigned
namespace: fc-dns
spec:
selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: dns-acme-webhook-ca
namespace: fc-dns
spec:
secretName: dns-acme-webhook-ca
duration: 43800h
issuerRef:
name: dns-acme-webhook-selfsigned
commonName: ca.dns-acme-webhook.fc-dns
isCA: true
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
name: dns-acme-webhook-ca-issuer
namespace: fc-dns
spec:
ca:
secretName: dns-acme-webhook-ca
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: dns-acme-webhook-serving-cert
namespace: fc-dns
spec:
secretName: dns-acme-webhook-tls
duration: 8760h
issuerRef:
name: dns-acme-webhook-ca-issuer
dnsNames:
- dns-acme-webhook
- dns-acme-webhook.fc-dns
- dns-acme-webhook.fc-dns.svc
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
name: v1alpha1.acme.flowercore.io
annotations:
cert-manager.io/inject-ca-from: fc-dns/dns-acme-webhook-serving-cert
spec:
group: acme.flowercore.io
groupPriorityMinimum: 1000
service:
name: dns-acme-webhook
namespace: fc-dns
version: v1alpha1
versionPriority: 15
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: dns-acme-webhook-solver
rules:
- apiGroups: ["acme.flowercore.io"]
resources: ["flowercore-dns"]
verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: dns-acme-webhook-solver
subjects:
- kind: ServiceAccount
name: cert-manager
namespace: cert-manager
roleRef:
kind: ClusterRole
name: dns-acme-webhook-solver
apiGroup: rbac.authorization.k8s.io
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: step-ca-dns01
spec:
acme:
caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJ4RENDQVdxZ0F3SUJBZ0lSQVBZMzU3RzZvdzZ6TUFMNSs0YlMya2t3Q2dZSUtvWkl6ajBFQXdJd1FERWEKTUJnR0ExVUVDaE1SU1VGdFYyOXlhMmx1SUVGRFRVVWdRMEV4SWpBZ0JnTlZCQU1UR1VsQmJWZHZjbXRwYmlCQgpRMDFGSUVOQklGSnZiM1FnUTBFd0hoY05Nall3TXpBNE1UZ3dOekV4V2hjTk16WXdNekExTVRnd056RXhXakJBCk1Sb3dHQVlEVlFRS0V4RkpRVzFYYjNKcmFXNGdRVU5OUlNCRFFURWlNQ0FHQTFVRUF4TVpTVUZ0VjI5eWEybHUKSUVGRFRVVWdRMEVnVW05dmRDQkRRVEJaTUJNR0J5cUdTTTQ5QWdFR0NDcUdTTTQ5QXdFSEEwSUFCSjJuMDRYMQpKWm81WmRxL2kxSWR2OCtmcXdaeUF6Qmg3d2hicWowU1dzSkw4VVdSYWJDTXFZQ3M3K2RYTzB4UlN6cWt3RkRMCngrdm9vT2FpOFJnUk5oYWpSVEJETUE0R0ExVWREd0VCL3dRRUF3SUJCakFTQmdOVkhSTUJBZjhFQ0RBR0FRSC8KQWdFQk1CMEdBMVVkRGdRV0JCUm51UFBRUjZpTS9INnZPbHVpVTNTeWdheXo4akFLQmdncWhrak9QUVFEQWdOSQpBREJGQWlFQXJRSzlkWVBHbUFac2RZbmp6aXVGVlZFNU5LWlVjY2VZdkdmR0MrdExYVXNDSUF1ZEYyekpyQ1JxCjNtSzUwWlpFVC9md1RrSndpRUY0ODI0bWpQOHAxQ0tNCi0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
privateKeySecretRef:
name: step-ca-dns01-account-key
server: https://10.0.56.10:9443/acme/acme/directory
solvers:
- dns01:
webhook:
groupName: acme.flowercore.io
solverName: flowercore-dns

View File

@@ -0,0 +1,6 @@
# ArgoCD's bluejay-infra ApplicationSet discovers apps/* directories on main.
# The kustomization is included for local previews and single-app validation.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- fc-dns.yaml

View File

@@ -0,0 +1,195 @@
# FlowerCore.Library.Web GitOps adoption manifest.
#
# Authored from the already-live fc-library resources on 2026-06-04.
# Keep the live image tag, Service ClusterIP, and PVC volumeName unchanged so
# ArgoCD adopts in place instead of replacing the workload or data volume.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: library-web-data
namespace: fc-library
labels:
app.kubernetes.io/name: library-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-library
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: longhorn
volumeMode: Filesystem
volumeName: pvc-2690bae2-4ee0-417a-b95f-50ec5c632b63
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: library-web
namespace: fc-library
labels:
app.kubernetes.io/name: library-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-library
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 3
selector:
matchLabels:
app.kubernetes.io/name: library-web
strategy:
type: Recreate
template:
metadata:
annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/health"
prometheus.io/path: /metrics/prometheus
prometheus.io/port: "5000"
prometheus.io/scrape: "true"
labels:
app.kubernetes.io/name: library-web
app.kubernetes.io/part-of: flowercore
spec:
containers:
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
- envFrom:
- configMapRef:
name: library-web-config
image: localhost/fc-library-web:v20260614-regroup-f20adc1
imagePullPolicy: Never
livenessProbe:
failureThreshold: 3
httpGet:
path: /health
port: 5000
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 5
name: library-web
ports:
- containerPort: 5000
name: http
protocol: TCP
readinessProbe:
failureThreshold: 6
httpGet:
path: /health
port: 5000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data
name: data
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
- name: data
persistentVolumeClaim:
claimName: library-web-data
---
apiVersion: v1
kind: Service
metadata:
name: library-web
namespace: fc-library
labels:
app.kubernetes.io/name: library-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-library
spec:
clusterIP: 10.43.179.63
clusterIPs:
- 10.43.179.63
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: http
port: 80
protocol: TCP
targetPort: 5000
selector:
app.kubernetes.io/name: library-web
sessionAffinity: None
type: ClusterIP
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: library-web-tls
namespace: fc-library
labels:
app.kubernetes.io/name: library-web-tls
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-library
spec:
dnsNames:
- library.iamworkin.lan
issuerRef:
kind: ClusterIssuer
name: step-ca-acme
secretName: library-web-tls
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: library-web
namespace: fc-library
labels:
app.kubernetes.io/name: library-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-library
spec:
entryPoints:
- websecure
routes:
- kind: Rule
match: Host(`library.iamworkin.lan`)
services:
- name: library-web
port: 80
tls:
secretName: library-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose library-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: library-web-public
# namespace: fc-library
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`library.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: library-web-public-profile-header # injects entitlement profile
# services:
# - name: library-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -83,6 +83,8 @@ spec:
app.kubernetes.io/name: fc-llm-bridge app.kubernetes.io/name: fc-llm-bridge
app.kubernetes.io/part-of: flowercore app.kubernetes.io/part-of: flowercore
annotations: annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/healthz"
prometheus.io/scrape: "true" prometheus.io/scrape: "true"
prometheus.io/port: "8080" prometheus.io/port: "8080"
prometheus.io/path: "/metrics" prometheus.io/path: "/metrics"
@@ -116,6 +118,7 @@ spec:
ports: ports:
- containerPort: 8080 - containerPort: 8080
name: http name: http
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
env: env:
- name: ASPNETCORE_URLS - name: ASPNETCORE_URLS
value: "http://+:8080" value: "http://+:8080"
@@ -161,11 +164,33 @@ spec:
name: fc-llm-bridge-api-keys name: fc-llm-bridge-api-keys
key: spare-2 key: spare-2
optional: true optional: true
# Shared.Chat — Ollama (edge1 Pi 5 + AI HAT+, matches bridge default) # Shared.Chat — GX10 Ollama via the INFRA-VLAN NodePort (10.0.56.14:30976),
# NOT the PROD-VLAN MetalLB VIP (10.0.57.201:11434). The cross-VLAN path to
# the VIP MTU-black-holes LARGE requests: Agent Zero's full prompt (458-line
# system prompt + 108 MCP tool descriptions ~150KB) times out / resets mid-
# stream there ("Connection reset by peer" in OllamaClient.ChatStreamAsync),
# which made AZ loop on "you have sent the same message again". The NodePort is
# same-VLAN as the old cluster (no inter-VLAN hop) and carries 150KB fine.
# (Small chat/embed requests still work on the VIP; only big agentic prompts broke.)
- name: FlowerCore__Chat__OllamaBaseUrl - name: FlowerCore__Chat__OllamaBaseUrl
value: "http://10.0.57.17:11434" value: "http://10.0.56.14:30976"
- name: FlowerCore__Chat__HttpTimeout - name: FlowerCore__Chat__HttpTimeout
value: "00:05:00" value: "00:05:00"
# Tier routing override (Wiring A, 2026-06-14): repoint Agent Zero's
# chat (Balanced) + util (Cheap) tiers to the GX10's tool-capable
# local qwen2.5. Balanced was Anthropic Sonnet (cloud/cost, and the
# Anthropic key is currently 401); Cheap was gemma3:4b which CANNOT
# call tools (400 does not support tools) — fatal for an agentic loop.
# qwen2.5 instruct supports the tool-calling loop; GX10 has the memory.
# OllamaBaseUrl above points at the GX10 NodePort (10.0.56.14:30976).
- name: FlowerCore__Chat__ModelRouter__DefaultRoutes__Balanced__Provider
value: "Ollama"
- name: FlowerCore__Chat__ModelRouter__DefaultRoutes__Balanced__Model
value: "qwen2.5:14b"
- name: FlowerCore__Chat__ModelRouter__DefaultRoutes__Cheap__Provider
value: "Ollama"
- name: FlowerCore__Chat__ModelRouter__DefaultRoutes__Cheap__Model
value: "qwen2.5:7b"
# Shared.Chat — Anthropic # Shared.Chat — Anthropic
- name: FlowerCore__Chat__Anthropic__Enabled - name: FlowerCore__Chat__Anthropic__Enabled
value: "true" value: "true"
@@ -281,3 +306,26 @@ spec:
port: 8080 port: 8080
tls: tls:
secretName: fc-llm-bridge-tls secretName: fc-llm-bridge-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose fc-llm-bridge publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: fc-llm-bridge-public
# namespace: fc-llm-bridge
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`llm-bridge.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: fc-llm-bridge-public-profile-header # injects entitlement profile
# services:
# - name: fc-llm-bridge
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

296
apps/fc-media/fc-media.yaml Normal file
View File

@@ -0,0 +1,296 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: fc-media
labels:
app.kubernetes.io/name: fc-media
app.kubernetes.io/part-of: flowercore
---
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
name: media-oidc-client
namespace: fc-media
labels:
app.kubernetes.io/name: fc-media-web
app.kubernetes.io/part-of: flowercore
spec:
itemPath: "vaults/IAmWorkin/items/media-oidc-client"
---
apiVersion: v1
kind: ConfigMap
metadata:
name: fc-media-config
namespace: fc-media
labels:
app.kubernetes.io/name: fc-media-web
app.kubernetes.io/part-of: flowercore
data:
appsettings.Production.json: |
{
"DatabaseProvider": "Sqlite",
"ConnectionStrings": {
"Sqlite": "Data Source=/data/media.db"
},
"FlowerCore": {
"Auth": {
"Enabled": true,
"Oidc": {
"Authority": "https://id.iamworkin.lan/application/o/media/",
"ClientId": "media",
"ClientSecret": "",
"Audience": "media",
"RequireHttpsMetadata": true
}
},
"Tenant": {
"JwtClaimsEnabled": false,
"DefaultTenantHosts": [ "media.iamworkin.lan" ]
}
},
"Media": {
"LibraryRoot": "/media/library",
"Sources": [
{
"Name": "BlueJayNAS Video",
"Driver": "Nfs",
"MountedPath": "/media/library",
"RemotePath": "nfs://10.0.58.3/volume1/video",
"IsEnabled": true,
"IsDefault": true,
"Notes": "Synology NFS media share mounted read-only inside the cluster."
}
],
"GeneratedRoot": "/data/generated",
"TranscodeRoot": "/data/transcodes",
"InboxPath": "/media/inbox",
"InboxScanIntervalMinutes": 5,
"ScanOnStartup": false,
"ComputeChecksums": false,
"FfmpegCommand": "ffmpeg",
"FfprobeCommand": "ffprobe",
"Hls": {
"MaxConcurrentJobs": 1
},
"DefaultViewerName": "BlueJay",
"Dlna": {
"IsEnabled": true,
"MulticastAddress": "239.255.255.250",
"Port": 1900,
"DiscoveryTimeoutSeconds": 2,
"DescriptionFetchTimeoutSeconds": 2,
"MaxResponsesPerSearchTarget": 32,
"SearchTargets": [
"urn:schemas-upnp-org:device:MediaRenderer:1",
"urn:schemas-upnp-org:device:MediaServer:1"
]
}
}
}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: fc-media-data
namespace: fc-media
labels:
app.kubernetes.io/name: fc-media-web
app.kubernetes.io/part-of: flowercore
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
storageClassName: longhorn
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: fc-media-web
namespace: fc-media
labels:
app: fc-media-web
app.kubernetes.io/name: fc-media-web
app.kubernetes.io/part-of: flowercore
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: fc-media-web
template:
metadata:
labels:
app: fc-media-web
app.kubernetes.io/name: fc-media-web
app.kubernetes.io/part-of: flowercore
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "5200"
prometheus.io/path: "/metrics"
flowercore.io/healthz-auth-policy: "allow-anonymous"
spec:
nodeSelector:
kubernetes.io/hostname: rke2-server
containers:
- name: fc-media-web
image: localhost/fc-media-web:v20260604-oidc-proper
imagePullPolicy: Never
ports:
- containerPort: 5200
name: http
env:
- name: ASPNETCORE_ENVIRONMENT
value: Production
- name: ASPNETCORE_URLS
value: http://+:5200
- name: FlowerCore__Auth__Enabled
value: "true"
- name: FlowerCore__Auth__Oidc__Enabled
value: "true"
- name: FlowerCore__Auth__Oidc__Audience
value: "media"
- name: FlowerCore__Auth__Oidc__ClientId
valueFrom:
secretKeyRef:
name: media-oidc-client
key: client_id
optional: true
- name: FlowerCore__Auth__Oidc__ClientSecret
valueFrom:
secretKeyRef:
name: media-oidc-client
key: client_secret
optional: true
- name: FlowerCore__Auth__Oidc__Authority
valueFrom:
secretKeyRef:
name: media-oidc-client
key: issuer_url
optional: true
resources:
requests:
cpu: 500m
memory: 1Gi
limits:
cpu: "4"
memory: 4Gi
volumeMounts:
- name: config
mountPath: /app/appsettings.Production.json
subPath: appsettings.Production.json
readOnly: true
- name: data
mountPath: /data
- name: transcodes
mountPath: /data/transcodes
- name: media-library
mountPath: /media/library
readOnly: true
- name: media-inbox
mountPath: /media/inbox
startupProbe:
httpGet:
path: /healthz
port: 5200
httpHeaders:
- name: X-Forwarded-Proto
value: https
failureThreshold: 18
periodSeconds: 10
readinessProbe:
httpGet:
path: /healthz
port: 5200
httpHeaders:
- name: X-Forwarded-Proto
value: https
initialDelaySeconds: 5
periodSeconds: 10
livenessProbe:
httpGet:
path: /healthz
port: 5200
httpHeaders:
- name: X-Forwarded-Proto
value: https
initialDelaySeconds: 30
periodSeconds: 30
volumes:
- name: config
configMap:
name: fc-media-config
- name: data
persistentVolumeClaim:
claimName: fc-media-data
- name: transcodes
nfs:
server: 10.0.58.3
path: /volume1/kubernetes/fc-media-transcodes
- name: media-inbox
nfs:
server: 10.0.58.3
path: /volume1/kubernetes/fc-media-inbox
- name: media-library
nfs:
server: 10.0.58.3
path: /volume1/video
readOnly: true
---
apiVersion: v1
kind: Service
metadata:
name: fc-media-web
namespace: fc-media
labels:
app: fc-media-web
app.kubernetes.io/name: fc-media-web
app.kubernetes.io/part-of: flowercore
spec:
type: ClusterIP
selector:
app: fc-media-web
ports:
- port: 5200
targetPort: 5200
protocol: TCP
name: http
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: fc-media-tls
namespace: fc-media
labels:
app.kubernetes.io/name: fc-media-web
app.kubernetes.io/part-of: flowercore
spec:
secretName: fc-media-tls
issuerRef:
name: step-ca-acme
kind: ClusterIssuer
dnsNames:
- media.iamworkin.lan
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: fc-media-web
namespace: fc-media
labels:
app.kubernetes.io/name: fc-media-web
app.kubernetes.io/part-of: flowercore
spec:
entryPoints:
- websecure
routes:
- match: Host(`media.iamworkin.lan`)
kind: Rule
services:
- name: fc-media-web
port: 5200
tls:
secretName: fc-media-tls

View File

@@ -0,0 +1,6 @@
# ArgoCD's bluejay-infra ApplicationSet discovers apps/* directories on main.
# The kustomization is included for local previews and single-app validation.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- fc-media.yaml

View File

@@ -30,3 +30,26 @@ spec:
port: 80 port: 80
tls: tls:
secretName: menuboard-web-tls secretName: menuboard-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose menuboard-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: menuboard-web-public
# namespace: fc-menuboard
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`menuboard.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: menuboard-web-public-profile-header # injects entitlement profile
# services:
# - name: menuboard-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -41,6 +41,8 @@ spec:
labels: labels:
app: messageboard-web app: messageboard-web
annotations: annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/health"
prometheus.io/scrape: "true" prometheus.io/scrape: "true"
prometheus.io/port: "8080" prometheus.io/port: "8080"
prometheus.io/path: "/metrics/prometheus" prometheus.io/path: "/metrics/prometheus"
@@ -52,6 +54,7 @@ spec:
ports: ports:
- containerPort: 8080 - containerPort: 8080
name: http name: http
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
envFrom: envFrom:
- configMapRef: - configMapRef:
name: messageboard-web-config name: messageboard-web-config
@@ -141,3 +144,26 @@ spec:
port: 80 port: 80
tls: tls:
secretName: messageboard-web-tls secretName: messageboard-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose messageboard-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: messageboard-web-public
# namespace: fc-messageboard
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`messageboard.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: messageboard-web-public-profile-header # injects entitlement profile
# services:
# - name: messageboard-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -30,3 +30,26 @@ spec:
port: 5300 port: 5300
tls: tls:
secretName: mysql-web-tls secretName: mysql-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose mysql-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: mysql-web-public
# namespace: fc-mysql
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`mysql.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: mysql-web-public-profile-header # injects entitlement profile
# services:
# - name: mysql-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -0,0 +1,33 @@
# Certificate for network.iamworkin.lan.
#
# Preflight gate: network.iamworkin.lan must resolve to 10.0.56.200 before this
# Certificate is synced. step-ca ACME cannot see the CoreDNS wildcard
# (*.iamworkin.lan -> 10.0.56.200) — it does an HTTP-01 challenge against the
# resolved host. The CoreDNS wildcard template covers network.iamworkin.lan, so
# resolution exists fleet-wide; do NOT add a pfSense DNS override (this plane is
# read-only and holds no pfSense creds). If ACME backs off, confirm the wildcard
# resolves first (feedback_pfsense_dns_required_for_acme).
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: fc-network-web-tls
namespace: fc-network
labels:
app: fc-network-web
app.kubernetes.io/name: fc-network-web
app.kubernetes.io/component: web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
annotations:
flowercore.io/dns-preflight: "network.iamworkin.lan must resolve to 10.0.56.200 (CoreDNS wildcard) before ACME sync"
spec:
secretName: fc-network-web-tls
issuerRef:
name: step-ca-acme
kind: ClusterIssuer
dnsNames:
- network.iamworkin.lan
duration: 720h
renewBefore: 240h

View File

@@ -0,0 +1,145 @@
# FlowerCore.Network.Web — the pfSense automation plane (read-only Phase 0, ADR-189).
#
# Phase 0 is READ-ONLY: the service holds NO pfSense credentials and has no write
# path to pfSense anywhere. The only mutating endpoint is POST /api/v1/snapshots,
# which ingests a config.xml the noc1 exporter collected READ-ONLY and stores it
# (redacted projection) on the PVC. Auth ships gate-OFF.
#
# Image localhost/fc-network-web:<tag> is built by FlowerCore.Network
# scripts/deploy-k8s.sh and imported to all schedulable RKE2 nodes (rke2-server +
# rke2-agent1; agent2 retired). imagePullPolicy: Never — bump the tag here, sync
# ArgoCD, then scale 0->1 for the RWO PVC and verify the running pod imageID.
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: fc-network-web
namespace: fc-network
labels:
app: fc-network-web
app.kubernetes.io/name: fc-network-web
app.kubernetes.io/component: web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
annotations:
flowercore.io/traceability-standard: k8s-pod-ownership-and-traceability-standard
spec:
replicas: 1
revisionHistoryLimit: 3
# RWO PVC: a single replica can't be surged (the new pod can't mount the volume
# while the old one holds it). maxSurge 0 / maxUnavailable 1 is the rwo-safe shape;
# for image bumps scale 0->1 rather than rollout restart.
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 0
maxUnavailable: 1
selector:
matchLabels:
app: fc-network-web
template:
metadata:
labels:
app: fc-network-web
app.kubernetes.io/name: fc-network-web
app.kubernetes.io/component: web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/healthz"
prometheus.io/scrape: "true"
prometheus.io/port: "5340"
prometheus.io/path: "/metrics/prometheus"
flowercore.io/audit-trace-id: "runtime-activity-trace"
spec:
securityContext:
fsGroup: 1654
fsGroupChangePolicy: OnRootMismatch
containers:
- name: web
image: localhost/fc-network-web:v20260612-0b5b049
imagePullPolicy: Never
ports:
- name: http
containerPort: 5340
# fc-safe-to-expose: read-only plane, auth gate-OFF; X-Forwarded-Proto handled
# by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
env:
- name: ASPNETCORE_URLS
value: "http://+:5340"
- name: ASPNETCORE_ENVIRONMENT
value: "Production"
- name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT
value: "false"
- name: HOME
value: "/data"
- name: FlowerCore__Auth__Enabled
value: "false"
- name: FlowerCore__Database__Provider
value: "Sqlite"
- name: FlowerCore__Database__ConnectionStrings__Sqlite
value: "Data Source=/data/network.db"
# Snapshot store + intended-model paths MUST be absolute on the PVC —
# the default is relative to the read-only content root.
- name: FlowerCore__Network__SnapshotStore__RootDirectory
value: "/data/snapshots"
- name: FlowerCore__Network__SnapshotStore__UseGitHistory
value: "true"
- name: FlowerCore__Network__IntendedModel__FilePath
value: "/data/intended.json"
resources:
requests:
cpu: 50m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
startupProbe:
httpGet:
path: /healthz
port: 5340
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 30
readinessProbe:
httpGet:
path: /healthz
port: 5340
periodSeconds: 10
failureThreshold: 3
livenessProbe:
httpGet:
path: /healthz
port: 5340
initialDelaySeconds: 30
periodSeconds: 30
failureThreshold: 3
securityContext:
runAsNonRoot: true
runAsUser: 1654
runAsGroup: 1654
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
volumeMounts:
- name: data
mountPath: /data
- name: tmp
mountPath: /tmp
- name: logs
mountPath: /app/logs
volumes:
- name: data
persistentVolumeClaim:
claimName: fc-network-web-data
- name: tmp
emptyDir: {}
- name: logs
emptyDir: {}

View File

@@ -0,0 +1,32 @@
# LAN ingress for FlowerCore.Network Web (network.iamworkin.lan).
#
# RKE2 Traefik has no built-in ACME resolver; TLS certificate ownership stays in
# cert-manager Certificate/fc-network-web-tls. Phase 0 is read-only but the POST
# ingest endpoint is genuinely needed by the noc1 exporter, so this route allows
# all methods (no GET/HEAD-only restriction like fc-dns) — the service itself has
# NO pfSense write path, so allowing POST here only reaches the local snapshot
# ingest.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: fc-network-web
namespace: fc-network
labels:
app: fc-network-web
app.kubernetes.io/name: fc-network-web
app.kubernetes.io/component: web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
entryPoints:
- websecure
routes:
- match: Host(`network.iamworkin.lan`)
kind: Rule
services:
- name: fc-network-web
port: 80
tls:
secretName: fc-network-web-tls

View File

@@ -0,0 +1,11 @@
# ArgoCD's bluejay-infra ApplicationSet discovers apps/* directories on main.
# The kustomization is included for local previews and single-app validation.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- pvc.yaml
- deployment-web.yaml
- service-web.yaml
- certificate-web.yaml
- ingressroute-web.yaml

View File

@@ -0,0 +1,8 @@
apiVersion: v1
kind: Namespace
metadata:
name: fc-network
labels:
app.kubernetes.io/part-of: flowercore
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra

27
apps/fc-network/pvc.yaml Normal file
View File

@@ -0,0 +1,27 @@
# Persistent store for FlowerCore.Network (read-only pfSense automation plane).
#
# Holds the SQLite snapshot INDEX db (network.db) AND the on-box snapshot store
# (data/snapshots): full-fidelity raw config.xml + redacted inventory sidecars +
# an on-box git history. Full-fidelity config is on-box ONLY (this PVC); the
# service DB / REST / MCP / UI only ever surface the REDACTED projection.
# RWO — single replica, scale 0->1 for updates (never rollout restart).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: fc-network-web-data
namespace: fc-network
labels:
app: fc-network-web
app.kubernetes.io/name: fc-network-web
app.kubernetes.io/component: web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
accessModes:
- ReadWriteOnce
storageClassName: longhorn
resources:
requests:
storage: 2Gi

View File

@@ -0,0 +1,21 @@
apiVersion: v1
kind: Service
metadata:
name: fc-network-web
namespace: fc-network
labels:
app: fc-network-web
app.kubernetes.io/name: fc-network-web
app.kubernetes.io/component: web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra
spec:
selector:
app: fc-network-web
ports:
- name: http
port: 80
targetPort: 5340
type: ClusterIP

View File

@@ -30,3 +30,26 @@ spec:
port: 5400 port: 5400
tls: tls:
secretName: php-web-tls secretName: php-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose php-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: php-web-public
# namespace: fc-php
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`php.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: php-web-public-profile-header # injects entitlement profile
# services:
# - name: php-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -30,3 +30,26 @@ spec:
port: 80 port: 80
tls: tls:
secretName: presentations-web-tls secretName: presentations-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose presentations-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: presentations-web-public
# namespace: fc-presentations
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`presentations.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: presentations-web-public-profile-header # injects entitlement profile
# services:
# - name: presentations-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -0,0 +1,196 @@
# FlowerCore.Retail.Web GitOps adoption manifest.
#
# Authored from the already-live fc-retail resources on 2026-06-04.
# Keep the live image tag, Service ClusterIP, and PVC volumeName unchanged so
# ArgoCD adopts in place instead of replacing the workload or data volume.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: retail-web-data
namespace: fc-retail
labels:
app.kubernetes.io/name: retail-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-retail
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: longhorn
volumeMode: Filesystem
volumeName: pvc-3d40b336-eab4-41b3-812c-d5e9413ce0ab
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: retail-web
namespace: fc-retail
labels:
app.kubernetes.io/name: retail-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-retail
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 3
selector:
matchLabels:
app.kubernetes.io/name: retail-web
strategy:
type: Recreate
template:
metadata:
annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/healthz"
kubectl.kubernetes.io/restartedAt: "2026-06-02T01:34:08-05:00"
prometheus.io/path: /metrics/prometheus
prometheus.io/port: "5000"
prometheus.io/scrape: "true"
labels:
app.kubernetes.io/name: retail-web
app.kubernetes.io/part-of: flowercore
spec:
containers:
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
- envFrom:
- configMapRef:
name: retail-web-config
image: localhost/fc-retail-web:v20260614-regroup-6d81424
imagePullPolicy: Never
livenessProbe:
failureThreshold: 3
httpGet:
path: /health
port: 5000
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 5
name: retail-web
ports:
- containerPort: 5000
name: http
protocol: TCP
readinessProbe:
failureThreshold: 6
httpGet:
path: /health
port: 5000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data
name: data
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
- name: data
persistentVolumeClaim:
claimName: retail-web-data
---
apiVersion: v1
kind: Service
metadata:
name: retail-web
namespace: fc-retail
labels:
app.kubernetes.io/name: retail-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-retail
spec:
clusterIP: 10.43.239.8
clusterIPs:
- 10.43.239.8
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: http
port: 80
protocol: TCP
targetPort: 5000
selector:
app.kubernetes.io/name: retail-web
sessionAffinity: None
type: ClusterIP
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: retail-web-tls
namespace: fc-retail
labels:
app.kubernetes.io/name: retail-web-tls
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-retail
spec:
dnsNames:
- retail.iamworkin.lan
issuerRef:
kind: ClusterIssuer
name: step-ca-acme
secretName: retail-web-tls
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: retail-web
namespace: fc-retail
labels:
app.kubernetes.io/name: retail-web
app.kubernetes.io/part-of: flowercore
app.kubernetes.io/managed-by: argocd
argocd.argoproj.io/instance: infra-fc-retail
spec:
entryPoints:
- websecure
routes:
- kind: Rule
match: Host(`retail.iamworkin.lan`)
services:
- name: retail-web
port: 80
tls:
secretName: retail-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose retail-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: retail-web-public
# namespace: fc-retail
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`retail.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: retail-web-public-profile-header # injects entitlement profile
# services:
# - name: retail-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -30,3 +30,26 @@ spec:
port: 80 port: 80
tls: tls:
secretName: scoreboard-web-tls secretName: scoreboard-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose scoreboard-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: scoreboard-web-public
# namespace: fc-scoreboard
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`scoreboard.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: scoreboard-web-public-profile-header # injects entitlement profile
# services:
# - name: scoreboard-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -37,3 +37,26 @@ spec:
port: 80 port: 80
tls: tls:
secretName: segmentdisplay-web-tls secretName: segmentdisplay-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose segmentdisplay-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: segmentdisplay-web-public
# namespace: fc-segmentdisplay
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`segmentdisplay.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: segmentdisplay-web-public-profile-header # injects entitlement profile
# services:
# - name: segmentdisplay-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -46,3 +46,26 @@ spec:
services: services:
- name: signage-web - name: signage-web
port: 5190 port: 5190
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose signage-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: signage-web-public
# namespace: fc-signage
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`signage.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: signage-web-public-profile-header # injects entitlement profile
# services:
# - name: signage-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -97,6 +97,7 @@ spec:
containers: containers:
- name: piper - name: piper
image: rhasspy/wyoming-piper:latest image: rhasspy/wyoming-piper:latest
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
env: env:
- name: PYTHONHTTPSVERIFY - name: PYTHONHTTPSVERIFY
value: "0" value: "0"
@@ -523,6 +524,8 @@ spec:
app.kubernetes.io/name: ttsreader-web app.kubernetes.io/name: ttsreader-web
app.kubernetes.io/part-of: flowercore app.kubernetes.io/part-of: flowercore
annotations: annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/health"
prometheus.io/scrape: "true" prometheus.io/scrape: "true"
prometheus.io/port: "5217" prometheus.io/port: "5217"
prometheus.io/path: "/metrics" prometheus.io/path: "/metrics"
@@ -532,7 +535,7 @@ spec:
fsGroupChangePolicy: OnRootMismatch fsGroupChangePolicy: OnRootMismatch
containers: containers:
- name: web - name: web
image: localhost/fc-ttsreader-web:v20260518-sprint36-demo-finish-b132cbf image: localhost/fc-ttsreader-web:v20260614-wave5-help-2f096e3
imagePullPolicy: Never imagePullPolicy: Never
ports: ports:
- containerPort: 5217 - containerPort: 5217
@@ -554,6 +557,8 @@ spec:
value: "/data/chapter-context.db" value: "/data/chapter-context.db"
- name: TtsReader__Jobs__Root - name: TtsReader__Jobs__Root
value: "/data/jobs" value: "/data/jobs"
- name: TtsReader__Export__LocalCasRoot
value: "/data/bundles/cas"
- name: TtsReader__Piper__Host - name: TtsReader__Piper__Host
value: "10.0.57.17" value: "10.0.57.17"
- name: TtsReader__Piper__Port - name: TtsReader__Piper__Port
@@ -600,7 +605,7 @@ spec:
- name: TtsReader__Transcription__TimeoutSeconds - name: TtsReader__Transcription__TimeoutSeconds
value: "300" value: "300"
- name: TtsReader__Ollama__BaseUrl - name: TtsReader__Ollama__BaseUrl
value: "http://10.0.57.17:11434" value: "http://10.0.57.201:11434"
- name: TtsReader__Ollama__DefaultModel - name: TtsReader__Ollama__DefaultModel
value: "gemma3:4b" value: "gemma3:4b"
- name: TtsReader__Ollama__TimeoutSeconds - name: TtsReader__Ollama__TimeoutSeconds
@@ -760,3 +765,26 @@ spec:
port: 5217 port: 5217
tls: tls:
secretName: ttsreader-tls secretName: ttsreader-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose ttsreader-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: ttsreader-web-public
# namespace: fc-ttsreader
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`ttsreader.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: ttsreader-web-public-profile-header # injects entitlement profile
# services:
# - name: ttsreader-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -52,17 +52,21 @@ spec:
app: updatecenter-web app: updatecenter-web
template: template:
metadata: metadata:
annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/"
labels: labels:
app: updatecenter-web app: updatecenter-web
spec: spec:
nodeName: rke2-server nodeName: rke2-server
containers: containers:
- name: web - name: web
image: localhost/fc-updater-web:v20260509-4162dca-authgate image: localhost/fc-updater-web:v20260614-regroup-bdf4a4a
imagePullPolicy: Never imagePullPolicy: Never
ports: ports:
- containerPort: 8080 - containerPort: 8080
name: http name: http
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
env: env:
- name: ASPNETCORE_URLS - name: ASPNETCORE_URLS
value: http://+:8080 value: http://+:8080
@@ -88,6 +92,8 @@ spec:
value: Faith AI Mike Edition value: Faith AI Mike Edition
- name: FlowerCore__Updater__PublicShares__Links__0__Description - name: FlowerCore__Updater__PublicShares__Links__0__Description
value: Private release link for Mike's Faith AI bundle. value: Private release link for Mike's Faith AI bundle.
- name: FlowerCore__Audit__Sinks__Loki__Enabled
value: "false"
- name: FlowerCore__Updater__Auth__Bootstrap__Enabled - name: FlowerCore__Updater__Auth__Bootstrap__Enabled
value: "true" value: "true"
- name: FlowerCore__Updater__Auth__Bootstrap__Username - name: FlowerCore__Updater__Auth__Bootstrap__Username

View File

@@ -12,6 +12,8 @@ All repo-scoped Linux runners use:
- `ACCESS_TOKEN` from the `github-runner-token` Secret - `ACCESS_TOKEN` from the `github-runner-token` Secret
- `RUN_AS_ROOT=false` - `RUN_AS_ROOT=false`
- `EPHEMERAL=true` - `EPHEMERAL=true`
- `DISABLE_AUTO_UPDATE=true` so the runner does not self-update and exit inside
the immutable Kubernetes pod
- `LABELS=self-hosted,linux,fc-build-linux` - `LABELS=self-hosted,linux,fc-build-linux`
- writable non-root paths under `/home/runner` for .NET, NuGet, XDG cache, and - writable non-root paths under `/home/runner` for .NET, NuGet, XDG cache, and
Actions tool cache Actions tool cache
@@ -131,3 +133,7 @@ from GitHub Actions and verify it lands on an `rke2-linux-*` runner.
value does not change. value does not change.
- `Multi-Attach` volume error: only the Common runner uses a RWO PVC and it must - `Multi-Attach` volume error: only the Common runner uses a RWO PVC and it must
stay single-replica. New multi-replica runners use `emptyDir`. stay single-replica. New multi-replica runners use `emptyDir`.
- Runner pods repeatedly registering, downloading a newer Actions runner, then
exiting with code 4: verify `DISABLE_AUTO_UPDATE=true` is present. The image
translates that into `config.sh --disableupdate`; without it, the Deployment
controller sees the expected self-update exit as CrashLoopBackOff.

File diff suppressed because it is too large Load Diff

View File

@@ -44,9 +44,32 @@ spec:
labels: labels:
app: intranet-web app: intranet-web
spec: spec:
# notes-corpus-clone: shallow-clones the Notes docs corpus into an emptyDir so
# the IntranetSearch indexer has /srv/flowercore-notes/docs to index. Uses the
# trailing-dot FQDN (gitea-clusterip.gitea.svc.cluster.local.) to bypass the
# CoreDNS *.iamworkin.lan template that otherwise resolves the in-cluster service
# name to the Traefik VIP for musl / ndots:5 pods (search-domain appending).
# Cred: gitea-corpus-cred (in-ns secret with the canonical 1P bluejay read cred;
# mirrors the imperative gitea-flowercore-notes argocd repo-cred pattern).
initContainers:
- name: notes-corpus-clone
image: alpine/git:2.45.2
imagePullPolicy: IfNotPresent
envFrom:
- secretRef:
name: gitea-corpus-cred
env:
- name: GIT_LFS_SKIP_SMUDGE
value: "1"
command: ["/bin/sh", "-c"]
args:
- 'git clone --depth 1 http://$username:$password@gitea-clusterip.gitea.svc.cluster.local.:3000/bluejay/FlowerCore.Notes.git /srv/flowercore-notes && echo "notes corpus cloned; docs entries:" && ls /srv/flowercore-notes/docs | wc -l'
volumeMounts:
- name: notes-corpus
mountPath: /srv/flowercore-notes
containers: containers:
- name: intranet-web - name: intranet-web
image: localhost/fc-intranet-web:v20260508-brochure-w1 image: localhost/fc-intranet-web:v20260614-wave5-knowledgefleet-1458b4d
imagePullPolicy: Never imagePullPolicy: Never
ports: ports:
- containerPort: 5300 - containerPort: 5300
@@ -56,18 +79,32 @@ spec:
value: Production value: Production
- name: ASPNETCORE_URLS - name: ASPNETCORE_URLS
value: "http://+:5300" value: "http://+:5300"
# Bulk corpus indexing on edge1 Pi 5 takes ~6s/chunk × 5665 chunks # Embed backend = edge1 Ollama BY IPv4 (10.0.57.17:11434; has
# ≈ 9 hours. BLUEJAY-WS GPU (R9700, 32GB VRAM) does the same work # nomic-embed-text). The hostname edge1.iamworkin.lan is UNUSABLE from
# in minutes. Memory: feedback_pi5_nomic_embed_slow. # cluster pods: it resolves to an unroutable IPv6 (fdbc:56:*) and the
# CoreDNS *.iamworkin.lan template maps the name to the Traefik VIP, so
# embeds failed with "No route to host". Use a bare pod-routable IPv4.
# Backend is BLUEJAY-AI's GPU node (Ollama / Vulkan Iris Xe, INFRA VLAN
# 10.0.56.132) which embeds nomic-embed-text in ~160ms vs the edge1 Pi 5's
# ~3.2s for the same ~512-token chunk (~20x faster bulk embed), proven
# pod-routable from the intranet namespace 2026-06-13. The prior edge1 Pi 5
# backend (10.0.57.17:11434) remains a working fallback if BLUEJAY-AI is
# down. Bulk embed runs in the background; /health does not depend on it.
# Memory: feedback_pi5_nomic_embed_slow.
- name: IntranetSearch__OllamaBaseUrl - name: IntranetSearch__OllamaBaseUrl
value: "http://10.0.56.20:11434" value: "http://10.0.57.201:11434"
# Sprint E Phase 2α — JSON-file-backed PageReadingOverride persistence # Notes docs corpus IS now mounted at /srv/flowercore-notes (see the
# on the writable PVC at /data. Without this env var the # notes-corpus-clone initContainer + notes-corpus-sync sidecar), so the
# intranet falls back to the in-memory store (loses state on # IntranetSearch indexer is ENABLED. First-boot bulk embed of the corpus
# pod restart). Master's PageReadingOverrideOptions binds # runs in the background via the edge1 Ollama backend above (~6s/chunk on
# PageReadingOverrides:FilePath. # the Pi 5); /health readiness does not depend on it, so the pod stays Ready.
- name: PageReadingOverrides__FilePath - name: IntranetSearch__Enabled
value: "/data/page-reading-overrides.json" value: "true"
# Page-reading override SQLite persistence on the writable PVC at
# /data. This backs pronunciation, notes, corrections, and
# page-profile metadata across pod restarts.
- name: PageReadingOverrides__DatabasePath
value: "/data/page-reading-overrides.db"
- name: KnowledgeFleetSearch__BaseUrl - name: KnowledgeFleetSearch__BaseUrl
value: "https://knowledge.iamworkin.lan" value: "https://knowledge.iamworkin.lan"
- name: KnowledgeFleetSearch__ApiKey - name: KnowledgeFleetSearch__ApiKey
@@ -104,10 +141,40 @@ spec:
volumeMounts: volumeMounts:
- name: vector-store - name: vector-store
mountPath: /data mountPath: /data
- name: notes-corpus
mountPath: /srv/flowercore-notes
readOnly: true
# notes-corpus-sync: keeps the mounted corpus fresh between pod restarts by
# pulling the Notes repo every 30 min (best-effort; the initContainer guarantees
# a fresh clone at pod start). Reuses the clone's origin (trailing-dot host + creds).
- name: notes-corpus-sync
image: alpine/git:2.45.2
imagePullPolicy: IfNotPresent
envFrom:
- secretRef:
name: gitea-corpus-cred
env:
- name: GIT_LFS_SKIP_SMUDGE
value: "1"
command: ["/bin/sh", "-c"]
args:
- 'while true; do sleep 1800; git -C /srv/flowercore-notes pull --depth 1 2>&1 | sed "s/^/[notes-corpus-sync] /" || true; done'
resources:
requests:
memory: "32Mi"
cpu: "10m"
limits:
memory: "128Mi"
cpu: "200m"
volumeMounts:
- name: notes-corpus
mountPath: /srv/flowercore-notes
volumes: volumes:
- name: vector-store - name: vector-store
persistentVolumeClaim: persistentVolumeClaim:
claimName: intranet-vector-store claimName: intranet-vector-store
- name: notes-corpus
emptyDir: {}
--- ---
apiVersion: v1 apiVersion: v1
kind: Service kind: Service

View File

@@ -90,9 +90,12 @@ spec:
app.kubernetes.io/name: knowledge-web app.kubernetes.io/name: knowledge-web
app.kubernetes.io/part-of: bluejay-infra app.kubernetes.io/part-of: bluejay-infra
annotations: annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/healthz"
prometheus.io/scrape: "true" prometheus.io/scrape: "true"
prometheus.io/port: "8080" prometheus.io/port: "8080"
prometheus.io/path: "/metrics" prometheus.io/path: "/metrics"
flowercore.io/healthz-auth-policy: "allow-anonymous"
spec: spec:
securityContext: securityContext:
runAsNonRoot: true runAsNonRoot: true
@@ -102,7 +105,7 @@ spec:
- name: web - name: web
# Placeholder tag — bump to the image you built + imported to ALL # Placeholder tag — bump to the image you built + imported to ALL
# RKE2 nodes via scripts/deploy-knowledge.sh before applying. # RKE2 nodes via scripts/deploy-knowledge.sh before applying.
image: localhost/fc-knowledge-web:v20260429232635 image: localhost/fc-knowledge-web:v20260603-oidc-authentik-auditfix
imagePullPolicy: Never imagePullPolicy: Never
command: command:
- /bin/sh - /bin/sh
@@ -116,6 +119,7 @@ spec:
ports: ports:
- containerPort: 8080 - containerPort: 8080
name: http name: http
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
env: env:
- name: ASPNETCORE_URLS - name: ASPNETCORE_URLS
value: "http://+:8080" value: "http://+:8080"
@@ -123,6 +127,25 @@ spec:
value: "Production" value: "Production"
- name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT - name: DOTNET_SYSTEM_GLOBALIZATION_INVARIANT
value: "false" value: "false"
# AuthentiK/OIDC is enforced. /healthz stays anonymous by contract;
# see flowercore.io/healthz-auth-policy above and the Sprint 58
# OIDC readiness probe audit.
- name: FlowerCore__Auth__Enabled
value: "true"
- name: FlowerCore__Auth__Oidc__Enabled
value: "true"
- name: FlowerCore__Auth__Oidc__Authority
value: "https://id.iamworkin.lan/application/o/knowledge/"
- name: FlowerCore__Auth__Oidc__Audience
value: "knowledge"
- name: FlowerCore__Auth__Oidc__ClientId
value: "knowledge"
- name: FlowerCore__Auth__Oidc__ClientSecret
valueFrom:
secretKeyRef:
name: knowledge-oidc-client
key: client_secret
optional: true
# Vector-store directory + embedding model + edition profile dir. # Vector-store directory + embedding model + edition profile dir.
# Profile JSON is baked into the image at /home/app/editions via the # Profile JSON is baked into the image at /home/app/editions via the
# csproj Content-link from FlowerCore.Common/editions/. # csproj Content-link from FlowerCore.Common/editions/.
@@ -134,6 +157,8 @@ spec:
value: "5" value: "5"
- name: Knowledge__MaxLimit - name: Knowledge__MaxLimit
value: "50" value: "50"
- name: Knowledge__Federation__DatabasePath
value: "/data/vector-stores/knowledge-federation.db"
- name: FlowerCore__Editions__ProfileDirectory - name: FlowerCore__Editions__ProfileDirectory
value: "/home/app/editions" value: "/home/app/editions"
# Embed via edge1 Pi 5 + AI HAT+ (10.0.57.17:11434). Cluster # Embed via edge1 Pi 5 + AI HAT+ (10.0.57.17:11434). Cluster
@@ -143,7 +168,7 @@ spec:
# need a separate ingestion lane that can opt into the # need a separate ingestion lane that can opt into the
# workstation GPU when present. # workstation GPU when present.
- name: FlowerCore__Ollama__BaseUrl - name: FlowerCore__Ollama__BaseUrl
value: "http://10.0.57.17:11434" value: "http://10.0.57.201:11434"
- name: FlowerCore__Mcp__ApiKey__Key - name: FlowerCore__Mcp__ApiKey__Key
valueFrom: valueFrom:
secretKeyRef: secretKeyRef:
@@ -264,3 +289,26 @@ spec:
port: 80 port: 80
tls: tls:
secretName: knowledge-tls secretName: knowledge-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose knowledge-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: knowledge-web-public
# namespace: knowledge
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`knowledge.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: knowledge-web-public-profile-header # injects entitlement profile
# services:
# - name: knowledge-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -25,7 +25,7 @@ metadata:
role: github-actions-runner role: github-actions-runner
flowercore.io/managed-by: bluejay-infra flowercore.io/managed-by: bluejay-infra
spec: spec:
runStrategy: Always runStrategy: Halted
template: template:
metadata: metadata:
labels: labels:

View File

@@ -207,20 +207,13 @@ spec:
- port: 993 - port: 993
targetPort: 993 targetPort: 993
name: imaps name: imaps
--- # --- mail-tls Certificate REMOVED 2026-06-01 ---
# TLS Certificate via cert-manager # mail-tls is now managed OUTSIDE cert-manager: issued from step-ca's JWK 'admin'
apiVersion: cert-manager.io/v1 # provisioner and auto-renewed by a systemd timer on noc1 (step ca renew), which
kind: Certificate # writes the mail-tls secret directly. step-ca-acme only has an HTTP-01 (Traefik)
metadata: # solver, but mail.iamworkin.lan must resolve to the dedicated MetalLB IP 10.0.56.202
name: mail-tls # (SMTP/IMAP), so HTTP-01 cannot validate. Do NOT re-add a cert-manager Certificate
namespace: mail # here unless a DNS-01 solver is deployed for step-ca-acme.
spec:
secretName: mail-tls
issuerRef:
name: step-ca-acme
kind: ClusterIssuer
dnsNames:
- mail.iamworkin.lan
--- ---
# Traefik IngressRoute - Webmail placeholder # Traefik IngressRoute - Webmail placeholder
apiVersion: traefik.io/v1alpha1 apiVersion: traefik.io/v1alpha1

View File

@@ -216,19 +216,24 @@ data:
- job_name: "pimanager-app" - job_name: "pimanager-app"
scrape_interval: 15s scrape_interval: 15s
metrics_path: /metrics metrics_path: /metrics
scheme: https
tls_config:
insecure_skip_verify: true
static_configs: static_configs:
- targets: ["10.0.58.25:5000"] - targets: ["piez.iamworkin.lan"]
labels: labels:
instance: "piez" instance: "piez"
service: "pimanager" service: "signalcontrol"
vlan: "home" vlan: "home"
device: "pi4-ezconnect" device: "pi4-ezconnect"
- targets: ["10.0.58.113:5100"] rig: "signal-b"
- targets: ["pirelay.iamworkin.lan"]
labels: labels:
instance: "pirelay" instance: "pirelay"
service: "pimanager" service: "signalcontrol"
vlan: "home" vlan: "home"
device: "pi3-ks0212" device: "pi3-ks0212"
rig: "signal-a"
# Epson ET-3750 EcoTank Printer SNMP # Epson ET-3750 EcoTank Printer SNMP
- job_name: "snmp-printer" - job_name: "snmp-printer"
@@ -479,24 +484,33 @@ data:
- "https://gitea.iamworkin.lan/" - "https://gitea.iamworkin.lan/"
- "https://argocd.iamworkin.lan/" - "https://argocd.iamworkin.lan/"
- "https://intranet.iamworkin.lan/" - "https://intranet.iamworkin.lan/"
- "https://signage.iamworkin.lan/" - "https://signage.iamworkin.lan/healthz" # root 401 auth-gated 2026-06-01; /healthz anon 200
- "https://kiosk.iamworkin.lan/" - "https://kiosk.iamworkin.lan/"
- "https://media.iamworkin.lan/" - "https://media.iamworkin.lan/healthz" # root auth-gated by OIDC; /healthz anonymous 200
- "https://mysql.iamworkin.lan/" - "https://mysql.iamworkin.lan/healthz" # root 401 auth-gated 2026-06-01; /healthz anon 200
- "https://php.iamworkin.lan/" - "https://php.iamworkin.lan/healthz" # root 401 auth-gated 2026-06-01; /healthz anon 200
- "https://zabbix.iamworkin.lan/" - "https://zabbix.iamworkin.lan/"
- "https://desktop.iamworkin.lan/" - "https://desktop.iamworkin.lan/"
- "https://print.iamworkin.lan/" - "https://print.iamworkin.lan/healthz" # root 401 behind API key auth; /healthz anonymous 200
- "https://dns.iamworkin.lan/" - "https://dns.iamworkin.lan/healthz" # root auth-gated by OIDC; /healthz anonymous 200
- "https://chat.iamworkin.lan/" - "https://signalcontrol.iamworkin.lan/health" # FlowerCore.SignalControl Pi control plane
- "https://dist.iamworkin.lan/" - "https://flowercore.iamworkin.lan/healthz" # FlowerCore landing
- "https://dms.iamworkin.lan/" - "https://replay.iamworkin.lan/healthz" # FlowerCore.Signage replay surface
- "https://worldbuilder.iamworkin.lan/healthz" # FlowerCore.WorldBuilder
- "https://updates.iamworkin.lan/api/v1/manifests/_schema" # UpdateCenter plural LAN alias
- "https://updatecenter-internal.iamworkin.lan/api/v1/manifests/_schema" # internal UC schema route
- "https://chat.iamworkin.lan/healthz" # OIDC staged; keep blackbox off root before enforcement flips
- "https://dist.iamworkin.lan/healthz" # root/admin auth-gated by OIDC; /healthz anonymous 200
- "https://dms.iamworkin.lan/healthz" # future OIDC posture; health route is already anonymous/live
- "https://menuboard.iamworkin.lan/" - "https://menuboard.iamworkin.lan/"
- "https://messageboard.iamworkin.lan/" - "https://messageboard.iamworkin.lan/"
- "https://presentations.iamworkin.lan/" - "https://presentations.iamworkin.lan/"
- "https://retail.iamworkin.lan/" - "https://retail.iamworkin.lan/"
- "https://ttsreader.iamworkin.lan/" - "https://ttsreader.iamworkin.lan/"
# Explicit healthcheck paths # Explicit healthcheck paths
- "https://library.iamworkin.lan/health"
- "https://aistation.iamworkin.lan/healthz"
- "https://knowledge.iamworkin.lan/healthz"
- "https://fc-llm-bridge.iamworkin.lan/healthz" - "https://fc-llm-bridge.iamworkin.lan/healthz"
- "https://acme.iamworkin.lan/health" - "https://acme.iamworkin.lan/health"
# NOTE: services intentionally NOT in this probe surface # NOTE: services intentionally NOT in this probe surface
@@ -908,12 +922,13 @@ data:
# of idle and SNMP times out, so 5m for: would page nightly. A # of idle and SNMP times out, so 5m for: would page nightly. A
# genuine printer outage (jam, disconnected) lasts well over 30m. # genuine printer outage (jam, disconnected) lasts well over 30m.
- alert: EpsonPrinterDown - alert: EpsonPrinterDown
expr: up{job="snmp-printer"} == 0 expr: (max_over_time(up{job="snmp-printer"}[35m]) == bool 0) == 1 and (hour() >= 13 or hour() < 1)
for: 30m for: 30m
labels: labels:
severity: warning severity: info
alert_channel: irc
annotations: annotations:
summary: "Epson ET-3750 SNMP unreachable for >30m (likely actual fault, not sleep)" summary: "Epson ET-3750 SNMP unreachable during waking hours (30m)"
- alert: SynologyDiskLow - alert: SynologyDiskLow
expr: hrStorageUsed{job="snmp-nas"} / hrStorageSize{job="snmp-nas"} * 100 > 85 expr: hrStorageUsed{job="snmp-nas"} / hrStorageSize{job="snmp-nas"} * 100 > 85
@@ -1020,7 +1035,12 @@ data:
- name: kubernetes-state - name: kubernetes-state
rules: rules:
- alert: KubeContainerRestartingFrequently - alert: KubeContainerRestartingFrequently
expr: increase(kube_pod_container_status_restarts_total[1h]) > 5 # Exclude github-runner: ephemeral runners register, run one job,
# exit cleanly, and restart by design. Also require kube_pod_info so
# deleted rollout pods do not keep firing from retained restart series.
expr: |
increase(kube_pod_container_status_restarts_total{namespace!="github-runner"}[1h]) > 5
and on(namespace, pod) kube_pod_info
for: 15m for: 15m
labels: labels:
severity: warning severity: warning
@@ -1029,7 +1049,12 @@ data:
description: "Container {{ $labels.container }} in pod {{ $labels.namespace }}/{{ $labels.pod }} has restarted {{ $value | printf \"%.0f\" }} times in the last hour. Check 'kubectl describe pod' + last-state termination reason." description: "Container {{ $labels.container }} in pod {{ $labels.namespace }}/{{ $labels.pod }} has restarted {{ $value | printf \"%.0f\" }} times in the last hour. Check 'kubectl describe pod' + last-state termination reason."
- alert: KubeContainerCrashLooping - alert: KubeContainerCrashLooping
expr: increase(kube_pod_container_status_restarts_total[15m]) > 3 # Same github-runner/delete-retention exclusions as the hourly
# restart rule above; real runner failures are covered by the
# dedicated LinuxRunnerOffline/MacMiniRunnerOffline alerts.
expr: |
increase(kube_pod_container_status_restarts_total{namespace!="github-runner"}[15m]) > 3
and on(namespace, pod) kube_pod_info
for: 5m for: 5m
labels: labels:
severity: critical severity: critical
@@ -1057,7 +1082,10 @@ data:
description: "Pod can't pull image. Check the image ref (often a stale tag or unreachable registry) and clean up if it's an orphan." description: "Pod can't pull image. Check the image ref (often a stale tag or unreachable registry) and clean up if it's an orphan."
- alert: KubeDeploymentReplicasMismatch - alert: KubeDeploymentReplicasMismatch
expr: kube_deployment_spec_replicas != kube_deployment_status_replicas_available # github-runner has explicit runner-offline alerts; the generic
# replica-mismatch rule should not page on intentionally ephemeral
# 0/1 runner churn between CI jobs.
expr: kube_deployment_spec_replicas{namespace!="github-runner"} != kube_deployment_status_replicas_available{namespace!="github-runner"}
for: 15m for: 15m
labels: labels:
severity: warning severity: warning

View File

@@ -1,7 +1,8 @@
# FlowerCore.Telephony - Blazor Server + REST API + Twilio IVR # FlowerCore.Telephony - Blazor Server + REST API + Twilio IVR
# ArgoCD managed - BlueJay Lab # ArgoCD managed - BlueJay Lab
# Credentials: 1Password → OnePasswordItem CRD → K8s Secret (twilio-credentials) # Credentials: 1Password → OnePasswordItem CRD → K8s Secret (twilio-credentials)
# TTS: Piper on edge1 (10.0.57.17:8500) — endpoint /tts with {"text":"..."} # TTS: Piper on GX10 (10.0.56.14:30850, en_US-amy-medium) — endpoint /tts with {"text":"..."}
# edge1 (10.0.57.17:8500, amy-low) kept as warm fallback (revert PiperUrl to roll back)
# Public: telephony.flowercore.io via Cloudflare origin cert # Public: telephony.flowercore.io via Cloudflare origin cert
--- ---
apiVersion: v1 apiVersion: v1
@@ -62,7 +63,8 @@ data:
"Password": "bluejay-asterisk-ari", "Password": "bluejay-asterisk-ari",
"Application": "flowercore-pbx", "Application": "flowercore-pbx",
"ReconnectDelaySeconds": 5, "ReconnectDelaySeconds": 5,
"MaxReconnectDelaySeconds": 60 "MaxReconnectDelaySeconds": 60,
"WebSocketKeepAliveIntervalSeconds": 30
}, },
"Sip": { "Sip": {
"Domain": "10.0.56.207", "Domain": "10.0.56.207",
@@ -70,7 +72,7 @@ data:
"Transport": "udp" "Transport": "udp"
}, },
"Tts": { "Tts": {
"PiperUrl": "http://10.0.57.17:8500", "PiperUrl": "http://10.0.56.14:30850",
"DefaultEngine": "piper", "DefaultEngine": "piper",
"SampleRate": 8000 "SampleRate": 8000
}, },
@@ -114,6 +116,9 @@ spec:
app: telephony-web app: telephony-web
template: template:
metadata: metadata:
annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/health"
labels: labels:
app: telephony-web app: telephony-web
spec: spec:
@@ -151,7 +156,7 @@ spec:
topologyKey: kubernetes.io/hostname topologyKey: kubernetes.io/hostname
containers: containers:
- name: telephony-web - name: telephony-web
image: localhost/fc-telephony-web:v202604252156 image: localhost/fc-telephony-web:v20260614-arifix
imagePullPolicy: Never imagePullPolicy: Never
securityContext: securityContext:
readOnlyRootFilesystem: true readOnlyRootFilesystem: true
@@ -161,6 +166,7 @@ spec:
ports: ports:
- containerPort: 5100 - containerPort: 5100
name: http name: http
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
env: env:
- name: Telephony__Twilio__AccountSid - name: Telephony__Twilio__AccountSid
valueFrom: valueFrom:
@@ -180,6 +186,16 @@ spec:
name: twilio-credentials name: twilio-credentials
key: DefaultFromNumber key: DefaultFromNumber
optional: true optional: true
# Env vars OVERRIDE appsettings.Production.json in ASP.NET Core config.
# These were previously applied live-only (kubectl) and drifted from git;
# codified here so git is the source of truth. Tts__PiperUrl is the real
# TTS cutover lever (the configmap "Tts" block is shadowed by this env).
- name: Tts__PiperUrl
value: "http://10.0.56.14:30850" # GX10 amy-medium; edge1 10.0.57.17:8500 = rollback
- name: Ari__Username
value: "flowercore"
- name: Ari__Password
value: "bluejay-asterisk-ari"
volumeMounts: volumeMounts:
- name: telephony-config - name: telephony-config
mountPath: /app/appsettings.Production.json mountPath: /app/appsettings.Production.json
@@ -316,7 +332,14 @@ spec:
protocol: UDP protocol: UDP
- port: 53 - port: 53
protocol: TCP protocol: TCP
# Allow Piper TTS on edge1 (10.0.57.17:8500) # Allow Piper TTS on GX10 (10.0.56.14:30850) — primary
- to:
- ipBlock:
cidr: 10.0.56.14/32
ports:
- port: 30850
protocol: TCP
# Allow Piper TTS on edge1 (10.0.57.17:8500) — warm fallback / rollback target
- to: - to:
- ipBlock: - ipBlock:
cidr: 10.0.57.17/32 cidr: 10.0.57.17/32
@@ -387,4 +410,3 @@ spec:

View File

@@ -12,28 +12,27 @@ Source: `D:\git\FlowerCore\FlowerCore.WorldBuilder` (master)
in pfSense Unbound before this manifest is applied, or cert-manager in pfSense Unbound before this manifest is applied, or cert-manager
HTTP-01 silently exponential-backs-off ~2h. HTTP-01 silently exponential-backs-off ~2h.
Memory: `feedback_pfsense_dns_required_for_acme`. Memory: `feedback_pfsense_dns_required_for_acme`.
2. **Image import to ALL RKE2 nodes** — pod can schedule to any of 2. **Image import to ALL Ready RKE2 nodes** — pod can currently schedule to
`rke2-server` (10.0.56.11), `rke2-agent1` (10.0.56.12), `rke2-server` (10.0.56.11) and `rke2-agent1` (10.0.56.12). Build with:
`rke2-agent2` (10.0.56.13). Build with:
```bash ```bash
bash deploy/build.sh # in FlowerCore.WorldBuilder repo bash deploy/build.sh # in FlowerCore.WorldBuilder repo
podman save localhost/fc-worldbuilder:v<TAG> -o /tmp/fc-worldbuilder-v<TAG>.tar mkdir -p artifacts/deploy
for h in 10.0.56.11 10.0.56.12 10.0.56.13; do podman save localhost/fc-worldbuilder:v<TAG> -o artifacts/deploy/fc-worldbuilder-v<TAG>.tar
scp /tmp/fc-worldbuilder-v<TAG>.tar fcadmin@$h:/tmp/ for h in 10.0.56.11 10.0.56.12; do
ssh fcadmin@$h "mkdir -p /home/fcadmin/.fcv"
scp artifacts/deploy/fc-worldbuilder-v<TAG>.tar fcadmin@$h:/home/fcadmin/.fcv/
ssh fcadmin@$h \ ssh fcadmin@$h \
"sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock \ "sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock \
-n k8s.io images import /tmp/fc-worldbuilder-v<TAG>.tar" -n k8s.io images import /home/fcadmin/.fcv/fc-worldbuilder-v<TAG>.tar"
done done
``` ```
Memory: `feedback_rke2_image_import_per_node_scp`. Memory: `feedback_rke2_image_import_per_node_scp`.
3. **Bump image tag** in `worldbuilder.yaml` and git push. 3. **Bump image tag** in `worldbuilder.yaml` and git push.
ArgoCD ApplicationSet picks up within ~3 minutes. ArgoCD ApplicationSet picks up within ~3 minutes.
4. **First production render** — open 4. **First production render** — verify
`https://worldbuilder.iamworkin.lan/studio/c32e0000-0000-4000-8000-000000000004` `https://worldbuilder.iamworkin.lan/healthz`, open
and confirm the Cyberpunk Blue Jay demo prompt loads with five seeded fake `https://worldbuilder.iamworkin.lan/settings`, and confirm the image backend
generated images. This Sprint 32 visitor-safe profile uses reports ComfyUI before running an operator-owned render lane.
`ClientMode=fake`; switch the image-generation env vars back to ComfyUI only
for an operator-owned GPU render lane.
## Health probes ## Health probes
@@ -56,13 +55,8 @@ Source: `D:\git\FlowerCore\FlowerCore.WorldBuilder` (master)
## Image generation backend ## Image generation backend
Sprint 32 pins the Kubernetes profile to The live internal profile now uses
`FlowerCore:WorldBuilder:ImageGeneration:ClientMode=fake` with `FlowerCore:WorldBuilder:ImageGeneration:ClientMode=comfyui` with
`BaseUrl=http://127.0.0.1:1`. That keeps the public/internal visitor demo `BaseUrl=http://10.0.56.20:8188` on BLUEJAY-WS (R9700 / gfx1201 / ROCm 7.2).
deterministic, avoids GPU exposure, and still exercises the studio/gallery Keep the public host pre-staging disabled unless the five safe-to-expose gates
surface with persisted generated-image metadata. are rechecked; the live GPU lane is operator-owned and internal-only.
The previous ComfyUI backend target was `http://10.0.56.20:8188` on
BLUEJAY-WS (R9700 / gfx1201 / ROCm 7.2.1). Re-enable it only in an
operator-owned follow-up that also verifies workstation reachability and image
import freshness.

View File

@@ -5,10 +5,10 @@
# #
# Image build (BLUEJAY-WS): # Image build (BLUEJAY-WS):
# bash deploy/build.sh # in FlowerCore.WorldBuilder repo # bash deploy/build.sh # in FlowerCore.WorldBuilder repo
# podman save localhost/fc-worldbuilder:v<TAG> -o /tmp/fc-worldbuilder-v<TAG>.tar # podman save localhost/fc-worldbuilder:v<TAG> -o artifacts/deploy/fc-worldbuilder-v<TAG>.tar
# for h in 10.0.56.11 10.0.56.12 10.0.56.13; do # for h in 10.0.56.11 10.0.56.12; do
# scp /tmp/fc-worldbuilder-v<TAG>.tar fcadmin@$h:/tmp/ # scp artifacts/deploy/fc-worldbuilder-v<TAG>.tar fcadmin@$h:/home/fcadmin/.fcv/
# ssh fcadmin@$h "sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock -n k8s.io images import /tmp/fc-worldbuilder-v<TAG>.tar" # ssh fcadmin@$h "sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock -n k8s.io images import /home/fcadmin/.fcv/fc-worldbuilder-v<TAG>.tar"
# done # done
--- ---
apiVersion: v1 apiVersion: v1
@@ -77,6 +77,8 @@ spec:
flowercore.io/tenant-id: system flowercore.io/tenant-id: system
flowercore.io/created-by: bluejay-infra flowercore.io/created-by: bluejay-infra
annotations: annotations:
fc.flowercore.io/healthz-anon: "true"
fc.flowercore.io/probe-path: "/healthz"
prometheus.io/scrape: "true" prometheus.io/scrape: "true"
prometheus.io/port: "8080" prometheus.io/port: "8080"
prometheus.io/path: "/metrics/prometheus" prometheus.io/path: "/metrics/prometheus"
@@ -88,11 +90,12 @@ spec:
containers: containers:
- name: web - name: web
# Bump tag for each rebuild. Initial deploy: v202605062048 # Bump tag for each rebuild. Initial deploy: v202605062048
image: localhost/fc-worldbuilder:v202605062048 image: localhost/fc-worldbuilder:v20260613-e4-about-edd6efc
imagePullPolicy: Never imagePullPolicy: Never
ports: ports:
- containerPort: 8080 - containerPort: 8080
name: http name: http
# fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178) before any future public/OIDC flip.
env: env:
- name: ASPNETCORE_URLS - name: ASPNETCORE_URLS
value: "http://+:8080" value: "http://+:8080"
@@ -114,14 +117,16 @@ spec:
value: "/data/gallery" value: "/data/gallery"
- name: FlowerCore__WorldBuilder__Export__RootPath - name: FlowerCore__WorldBuilder__Export__RootPath
value: "/data/exports" value: "/data/exports"
# Visitor-safe Sprint 32 profile: fake backend keeps public demo # Operator-approved live GPU lane. Internal-only host targets
# rendering deterministic and avoids exposing BLUEJAY-WS GPU. # BLUEJAY-WS ComfyUI; keep public host pre-staging disabled below.
- name: FlowerCore__WorldBuilder__ImageGeneration__BaseUrl - name: FlowerCore__WorldBuilder__ImageGeneration__BaseUrl
value: "http://127.0.0.1:1" value: "http://10.0.56.20:8188"
- name: FlowerCore__WorldBuilder__ImageGeneration__ClientMode - name: FlowerCore__WorldBuilder__ImageGeneration__ClientMode
value: "fake" value: "comfyui"
- name: FlowerCore__WorldBuilder__ImageGeneration__BackendId - name: FlowerCore__WorldBuilder__ImageGeneration__BackendId
value: "fake" value: "comfyui"
- name: FlowerCore__WorldBuilder__ImageGeneration__VisitorSafe
value: "false"
resources: resources:
# Cluster CPU-request budget runs hot (99% on all 3 nodes at deploy # Cluster CPU-request budget runs hot (99% on all 3 nodes at deploy
# time) while actual CPU usage is well below capacity. Idle Blazor # time) while actual CPU usage is well below capacity. Idle Blazor
@@ -254,3 +259,26 @@ spec:
port: 80 port: 80
tls: tls:
secretName: worldbuilder-web-tls secretName: worldbuilder-web-tls
# ---- PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only) ----
# When the operator decides to expose worldbuilder-web publicly, uncomment + update the host,
# then verify the five safe-to-expose gates (authentik-safe-to-expose-readiness-2026-06-07.md section 2).
#
# --- IngressRoute ---
# apiVersion: traefik.io/v1alpha1
# kind: IngressRoute
# metadata:
# name: worldbuilder-web-public
# namespace: worldbuilder
# spec:
# entryPoints: [websecure]
# routes:
# - match: Host(`worldbuilder.flowercore.io`) && (Method(`GET`) || Method(`HEAD`))
# kind: Rule
# middlewares:
# - name: worldbuilder-web-public-profile-header # injects entitlement profile
# services:
# - name: worldbuilder-web
# port: 80
# tls: {}
# # POST/PUT/PATCH/DELETE miss every route -> Traefik 404 -> no admin writes on the public surface.
# # Reference pattern: dist.flowercore.io (already live + method-gated; do not edit that one).

View File

@@ -0,0 +1,129 @@
# authentik-tenant-mapping-sync — GATED manifest staging
**Status:** GATED (suspended). **ADR:** ADR-198 §2.A P1 (Au-1 / Au-3 substrate). **Pairs:** Codex **Cx2-7**.
This directory is a **Notes staging area**, NOT a deploy target. The orchestrator relocates
`cronjob.yaml` into a `gated/` path **outside** `bluejay-infra/apps/` so ArgoCD's `apps/*`
directory generator never picks it up. Nothing here runs until the activation steps below.
## What this is
A nightly Kubernetes `CronJob` that runs
[`scripts/authentik/authentik-tenant-mapping-sync.py`](../../../scripts/authentik/authentik-tenant-mapping-sync.py)
(Notes repo). The script:
- reads the 1Password Document **`flowercore-tenant-mapping`** (vault `IAmWorkin`, field
`mapping`) via **1Password Connect REST** — never the 1Password CLI/desktop (operator hard rule);
- parses + light-validates the mapping JSON (schema: [`authentik-oidc-tenant-mapping-schema.md`](../../standards/authentik-oidc-tenant-mapping-schema.md) — `version==1`, `mappings[]` with `authentikGroup` / `fcTenantId` / `fcRole`);
- reconciles each distinct `authentikGroup` into Authentik `/api/v3/core/groups/`:
create-if-missing, PATCH-managed-markers-on-drift, **never delete or disable unmanaged groups**;
- emits structured (Serilog-shaped JSON) logs and exits 0 on success.
It is the **slow nightly fix-up path**. The **<1s hot path** stays the MCP tool
`authentik_sync_tenant_mapping` (schema doc §6.2 force-broadcast). This CronJob does NOT
broadcast SignalR — group reconcile is its only side effect; services pick up mapping changes
on their own 5-minute 1P refresh.
## Why it is GATED (two locks)
1. **`spec.suspend: true`** in `cronjob.yaml` — belt-and-suspenders so even if applied it never fires.
2. **Lives outside `apps/`** — staged here in Notes; ArgoCD does not manage it.
Both must be cleared to go live. This pairs Codex **Cx2-7**: do not activate ahead of the Au-3
public-go for tenant self-registration.
## Files
| File | Purpose |
|------|---------|
| `cronjob.yaml` | The suspended `CronJob` + the script-delivery `ConfigMap` (placeholder body). |
| `README.md` | This file. |
| `scripts/authentik/authentik-tenant-mapping-sync.py` | The reconcile script (canonical source; NOT in this dir). |
## Secrets (referenced, not invented)
No secret **values** appear in `cronjob.yaml` — only `secretKeyRef`s:
- **`AUTHENTIK_TOKEN`** ← `Secret authentik/authentik-credentials` key `BOOTSTRAP_ADMIN_TOKEN`
(already exists; the same token `provision-oidc-client.py` reads). **Au-9 caveat:** this is the
never-rotated bootstrap token — when `/rotate-password rotate authentik` (Au-9) lands, this
CronJob is one of its fan-out consumers.
- **`OP_TOKEN`** ← `Secret authentik/tenant-mapping-sync-op-token` key `token`.
### OP_TOKEN cross-namespace
The canonical 1P Connect token Secret is `onepassword-system/onepassword-token`, but this
CronJob runs in the `authentik` namespace and K8s Secrets are namespace-scoped. Pick one at
activation:
- **Option A (copy, simplest).** Mint a same-namespace copy right before un-suspending:
```sh
kubectl get secret onepassword-token -n onepassword-system -o jsonpath='{.data.token}' \
| base64 -d \
| kubectl create secret generic tenant-mapping-sync-op-token -n authentik \
--from-file=token=/dev/stdin --dry-run=client -o yaml | kubectl apply -f -
```
(Re-run whenever the Connect token rotates — add this CronJob to the **Au-10** Connect-token
fan-out checklist so the copy can't go stale.)
- **Option B (CRD, preferred long-term).** Use an `OnePasswordItem` CRD
(`feedback_1password_operator_pattern`) so the 1P operator mints/refreshes
`authentik/tenant-mapping-sync-op-token` automatically — no manual copy, rotation-safe.
> If neither secret exists yet, that's fine **while suspended** — the job never schedules.
## How to ACTIVATE (at Au-3 public-go)
1. **Pre-flight (workstation dry-run, writes nothing):**
```sh
export AUTHENTIK_TOKEN=... # or let it read authentik/authentik-credentials via kubectl
export OP_TOKEN=... # or rely on credential-helper.sh get_op_token (fcadmin@noc1)
python scripts/authentik/authentik-tenant-mapping-sync.py --dry-run --verbose
```
Confirm the planned create/update set matches the 1P mapping document.
2. **Provide `OP_TOKEN` in-cluster** — Option A or B above.
3. **Materialize the script ConfigMap from the canonical file** (do NOT hand-edit a copy into
`cronjob.yaml` — the embedded body is a deliberate placeholder):
```sh
kubectl create configmap authentik-tenant-mapping-sync-script -n authentik \
--from-file=authentik-tenant-mapping-sync.py=scripts/authentik/authentik-tenant-mapping-sync.py \
--dry-run=client -o yaml | kubectl apply -f -
```
(Or, in the imaged future per ADR-198 §2.B P3, bake the script into `fc-runtime-base` and
drop the ConfigMap volume.)
4. **Relocate into bluejay-infra** — move `cronjob.yaml` into a `gated/` (or `apps/`) path in
`bluejay-infra` per the orchestrator's placement decision. If under `apps/`, ArgoCD will sync it.
5. **Un-suspend** — set `spec.suspend: false` (commit in `bluejay-infra` so ArgoCD selfHeal
doesn't revert), or one-off:
```sh
kubectl patch cronjob authentik-tenant-mapping-sync -n authentik \
-p '{"spec":{"suspend":false}}'
```
6. **Smoke (VG-A1):** trigger an immediate run and check the structured logs:
```sh
kubectl create job --from=cronjob/authentik-tenant-mapping-sync tms-smoke -n authentik
kubectl logs -n authentik job/tms-smoke
```
Then edit a mapping entry in 1P and confirm the next run reconciles the group; the <1s
propagation still comes from the MCP `authentik_sync_tenant_mapping` force-broadcast.
## Rollback
Re-suspend (`spec.suspend: true`) or delete the CronJob. The script never deletes Authentik
groups, so a bad run can only over-create groups present in the mapping — remove any unwanted
group by hand in the Authentik admin UI. No data loss path.
## Idempotency / safety summary
- Re-running is a no-op when groups already match (mirrors `provision-oidc-client.py`).
- Only the managed attribute block (`fc:managed-by` / `fc:tenant` / `fc:role` / optional
`fc:label` / `fc:regulated` / `fc:strict-mode`) is asserted; group parent/users/roles are
never touched.
- Wildcard SuperAdmin entries (`fcTenantId: "*"`) do not create a per-tenant group.
- `--dry-run` prints the plan and writes nothing — always run it first.
## Cross-links
- [`docs/standards/auth-acl-unattended-lifecycle-plan.md`](../../standards/auth-acl-unattended-lifecycle-plan.md) — ADR-198; Au-1/Au-3 lanes, VG-A1/A2.
- [`docs/standards/authentik-oidc-tenant-mapping-schema.md`](../../standards/authentik-oidc-tenant-mapping-schema.md) — the mapping JSON shape + 1P item layout (§2/§3).
- [`scripts/authentik/provision-oidc-client.py`](../../../scripts/authentik/provision-oidc-client.py) — sibling idempotent provisioner (same API + posture).
- [`scripts/credential-helper.sh`](../../../scripts/credential-helper.sh) — `get_op_token` 1P Connect bootstrap (fcadmin@noc1).

View File

@@ -0,0 +1,151 @@
# =====================================================================================
# authentik-tenant-mapping-sync — GATED nightly CronJob (Au-3 / ADR-198 §2.A P1)
#
# STATUS: GATED. spec.suspend: true (belt-and-suspenders). This manifest lives in a Notes
# STAGING path (docs/gated-manifests/) and is NOT under bluejay-infra apps/, so ArgoCD
# does not deploy it. It does NOTHING until Au-3 public-go (see README.md in this dir).
#
# WHAT IT RUNS: scripts/authentik/authentik-tenant-mapping-sync.py (Notes repo) — reads the
# 1Password Document `flowercore-tenant-mapping` via Connect REST and reconciles its
# mappings[].authentikGroup entries into Authentik groups (idempotent; never deletes
# unmanaged groups). Pairs Codex Cx2-7.
#
# SECRETS (referenced, NOT invented — no secret VALUES in this file):
# AUTHENTIK_TOKEN <- Secret authentik/authentik-credentials key BOOTSTRAP_ADMIN_TOKEN (exists)
# OP_TOKEN <- Secret authentik/tenant-mapping-sync-op-token key token
# (a copy of onepassword-system/onepassword-token — see README "OP_TOKEN
# cross-namespace" for the one-liner that mints it; OR mint via the
# OnePasswordItem CRD per feedback_1password_operator_pattern).
#
# The script is delivered via the ConfigMap below (same pattern as guacamole guac-k8s-sync).
# When this lane is libraryized/imaged later (ADR-198 §2.B P3) this ConfigMap can be replaced
# by a baked image; for now ConfigMap-delivery keeps the script the single source of truth.
# =====================================================================================
apiVersion: batch/v1
kind: CronJob
metadata:
name: authentik-tenant-mapping-sync
namespace: authentik
labels:
app.kubernetes.io/name: authentik-tenant-mapping-sync
app.kubernetes.io/component: sync
app.kubernetes.io/part-of: flowercore-identity
flowercore.io/adr: "198"
flowercore.io/gated: "true"
annotations:
flowercore.io/gate: "Au-3 public-go — suspended until tenant self-registration goes live"
flowercore.io/pairs-with: "Codex Cx2-7"
spec:
# GATE: suspended so it never fires until an operator un-suspends at Au-3 public-go.
suspend: true
# Nightly at 03:17 (off-peak; jittered minute to avoid colliding with other 03:00 jobs).
schedule: "17 3 * * *"
concurrencyPolicy: Forbid
startingDeadlineSeconds: 600
successfulJobsHistoryLimit: 3
failedJobsHistoryLimit: 3
jobTemplate:
spec:
backoffLimit: 2
activeDeadlineSeconds: 600
template:
metadata:
labels:
app.kubernetes.io/name: authentik-tenant-mapping-sync
app.kubernetes.io/component: sync
spec:
restartPolicy: OnFailure
securityContext:
runAsNonRoot: true
runAsUser: 65532
runAsGroup: 65532
fsGroup: 65532
seccompProfile:
type: RuntimeDefault
containers:
- name: sync
# python:3.12-slim is sufficient: the script uses only the stdlib (urllib/json/ssl).
# No pip install needed. Pin a digest at activation time for air-gap reproducibility.
image: python:3.12-slim
imagePullPolicy: IfNotPresent
command:
- python3
- /scripts/authentik-tenant-mapping-sync.py
# NOTE: no --dry-run here -> this is the real reconcile. Operators wanting a
# dry-run first should `kubectl create job --from=cronjob/... ` with the arg
# appended, or run the script from a workstation. See README.
env:
- name: AUTHENTIK_URL
value: "https://id.iamworkin.lan"
- name: OP_CONNECT_URL
value: "http://10.0.56.10:8180/v1" # port 8180, NOT 8443
- name: OP_VAULT_ID
value: "qaphopopkryhbg353ukzhhuqoq" # IAmWorkin
- name: TENANT_MAPPING_ITEM
value: "flowercore-tenant-mapping"
- name: TENANT_MAPPING_FIELD
value: "mapping"
- name: AUTHENTIK_TOKEN
valueFrom:
secretKeyRef:
name: authentik-credentials
key: BOOTSTRAP_ADMIN_TOKEN
- name: OP_TOKEN
valueFrom:
secretKeyRef:
# A same-namespace copy of onepassword-system/onepassword-token.
# See README "OP_TOKEN cross-namespace". Until Au-3 this Secret need
# not exist (the job is suspended).
name: tenant-mapping-sync-op-token
key: token
resources:
requests:
cpu: 25m
memory: 64Mi
limits:
cpu: 250m
memory: 128Mi
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: ["ALL"]
volumeMounts:
- name: script
mountPath: /scripts
readOnly: true
volumes:
- name: script
configMap:
name: authentik-tenant-mapping-sync-script
defaultMode: 0555
---
# The reconcile script, delivered as a ConfigMap (single source of truth = the Notes repo
# scripts/authentik/authentik-tenant-mapping-sync.py). At activation, regenerate this
# ConfigMap from the live script so the two never drift, e.g.:
# kubectl create configmap authentik-tenant-mapping-sync-script -n authentik \
# --from-file=authentik-tenant-mapping-sync.py=scripts/authentik/authentik-tenant-mapping-sync.py \
# --dry-run=client -o yaml > docs/gated-manifests/authentik-tenant-sync/configmap.script.yaml
# (kept as a placeholder body here so the manifest set is self-describing; the real body is
# the script file — DO NOT hand-edit a divergent copy into this ConfigMap.)
apiVersion: v1
kind: ConfigMap
metadata:
name: authentik-tenant-mapping-sync-script
namespace: authentik
labels:
app.kubernetes.io/name: authentik-tenant-mapping-sync
app.kubernetes.io/component: sync
flowercore.io/gated: "true"
annotations:
flowercore.io/source: "scripts/authentik/authentik-tenant-mapping-sync.py (Notes repo) — regenerate at activation, do not hand-edit"
data:
authentik-tenant-mapping-sync.py: |
# PLACEHOLDER — regenerate from the canonical script at activation (see annotation above).
# The Notes repo file scripts/authentik/authentik-tenant-mapping-sync.py is the source of
# truth; embedding a hand-copy here would drift. The orchestrator (or the activation
# runbook) materializes this ConfigMap from the live script via `kubectl create configmap
# ... --from-file=...` before un-suspending the CronJob.
import sys
sys.exit("authentik-tenant-mapping-sync ConfigMap not materialized from the canonical "
"script — regenerate with kubectl create configmap --from-file before activation.")

View File

@@ -0,0 +1,39 @@
# Public-TLS substrate (gated)
**Lane:** Cl-infra-2 (deep-regroup 2026-06-13). **Status:** authored, **NOT applied** — operator-gated.
This directory holds the Let's Encrypt + isolation substrate for **public** multi-tenant
web hosting. It lives **outside `apps/`** on purpose: the bluejay-infra ApplicationSet only
reconciles `apps/*`, so nothing here is auto-applied. Applying a cert-manager ACME
`ClusterIssuer` registers an ACME account immediately, so these stay inert until the
operator opens the web-hosting public-exposure gate (**R-1**).
## What's here
| File | What | Activate when |
|---|---|---|
| `letsencrypt-issuers.yaml` | `letsencrypt-staging` + `letsencrypt-prod` ClusterIssuers (HTTP-01 via Traefik; DNS-01 stub for wildcards) | Public-go. Move to `apps/cluster-issuers/`, **staging first**. |
| `tenant-networkpolicy-template.yaml` | Per-tenant default-deny + allowlist NetworkPolicy (Traefik ingress, CoreDNS, own-DB egress only) | Rendered per tenant at provision time (Wh-C2 isolation). |
## The gate
Public exposure is **NO-GO** until the §6 go/no-go checklist in
[`docs/standards/web-hosting-production-readiness-plan.md`](../../../FlowerCore.Notes/docs/standards/web-hosting-production-readiness-plan.md)
is green (currently 14/14 red) **and** the operator explicitly opens R-1. Internal
`*.iamworkin.lan` TLS stays on **step-ca** (`apps/fc-dns/fc-dns.yaml``step-ca-dns01`);
these LE issuers are **only** for public tenant domains.
## Pairing
- **Codex Wh-C1** consumes `letsencrypt-staging`/`-prod` for hybrid public TLS on
FlowerCore.PHP/MySQL/DNS.
- **Codex Wh-C2** consumes the NetworkPolicy template for cross-tenant isolation suites.
## Activation checklist (public-go)
1. Wire a public DNS-01 solver (Cloudflare/Namecheap webhook) **or** confirm public tenant
domains route HTTP-01 to the cluster ingress.
2. `git mv gated/public-tls/letsencrypt-issuers.yaml apps/cluster-issuers/` — staging only.
3. Issue one **staging** cert for a throwaway public domain; verify the chain in a browser.
4. Flip that tenant's Certificate `issuerRef` to `letsencrypt-prod`; mind LE rate limits.
5. Render `tenant-networkpolicy-template.yaml` per tenant; run the Wh-C2 negative suites.

View File

@@ -0,0 +1,78 @@
# ============================================================================
# Let's Encrypt ClusterIssuers — PUBLIC TLS substrate (Cl-infra-2, deep-regroup 2026-06-13)
# ============================================================================
# GATED. This file lives OUTSIDE apps/ on purpose, so the bluejay-infra
# ApplicationSet does NOT auto-apply it. Applying a cert-manager ACME
# ClusterIssuer registers an ACME account immediately, so we keep these inert
# until the operator opens the web-hosting public-exposure gate (R-1; the §6
# go/no-go checklist in docs/standards/web-hosting-production-readiness-plan.md
# is currently 14/14 red).
#
# Pairs with Codex Wh-C1 (FlowerCore.PHP/MySQL/DNS hybrid public TLS) and
# Wh-C2 (isolation). Internal *.iamworkin.lan certs STAY on step-ca
# (apps/fc-dns/fc-dns.yaml: ClusterIssuer step-ca-dns01) — these LE issuers are
# ONLY for public tenant domains.
#
# TO ACTIVATE (operator public-go):
# 1. Confirm a public DNS-01 solver is wired (Cloudflare/Namecheap webhook) OR
# that public tenant domains route HTTP-01 to the cluster's public ingress.
# 2. Move this file to apps/cluster-issuers/ (the ApplicationSet will create
# infra-cluster-issuers and apply it), staging FIRST.
# 3. Issue ONE staging cert for a throwaway public domain, verify the chain,
# THEN switch that tenant's Certificate issuerRef to letsencrypt-prod.
# 4. Mind LE prod rate limits (50 certs/registered-domain/week, 5 dupes/week).
#
# Registration email is for expiry notices only — adjust to a role address if
# desired (astoltz@iamwork.in is the current operator contact).
# ----------------------------------------------------------------------------
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-staging
labels:
app.kubernetes.io/part-of: flowercore
flowercore.io/created-by: bluejay-infra
flowercore.io/gate: public-tls
spec:
acme:
# LE STAGING — untrusted certs, generous limits. Use this first, always.
server: https://acme-staging-v02.api.letsencrypt.org/directory
email: astoltz@iamwork.in
privateKeySecretRef:
name: letsencrypt-staging-account-key
solvers:
# HTTP-01 via Traefik. Requires the public tenant domain's :80 traffic to
# reach the cluster ingress. For wildcard / apex without inbound :80, swap
# to the dns01 solver block below (needs a public DNS provider webhook).
- http01:
ingress:
class: traefik
# --- DNS-01 alternative for wildcards (uncomment + wire a public DNS webhook) ---
# - dns01:
# webhook:
# groupName: acme.flowercore.io # or the cloudflare/namecheap solver
# solverName: <public-dns-solver>
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
labels:
app.kubernetes.io/part-of: flowercore
flowercore.io/created-by: bluejay-infra
flowercore.io/gate: public-tls
spec:
acme:
# LE PRODUCTION — trusted certs, strict rate limits. Only after staging proves out.
server: https://acme-v02.api.letsencrypt.org/directory
email: astoltz@iamwork.in
privateKeySecretRef:
name: letsencrypt-prod-account-key
solvers:
- http01:
ingress:
class: traefik
# - dns01:
# webhook:
# groupName: acme.flowercore.io
# solverName: <public-dns-solver>

View File

@@ -0,0 +1,59 @@
# ============================================================================
# Per-tenant NetworkPolicy TEMPLATE — web-hosting isolation (Cl-infra-2 / Wh-C2)
# ============================================================================
# GATED substrate (outside apps/, not auto-applied). Modeled on the canonical
# default-deny + allowlist shape in apps/fc-devicemgmt/network-policy.yaml.
#
# Purpose: when a public multi-tenant site is provisioned, each tenant's pods
# get a NetworkPolicy that (a) default-denies all ingress/egress, then allows
# only Traefik ingress + CoreDNS + that tenant's own DB. This enforces the
# cross-tenant isolation Wh-C2 verifies with negative suites.
#
# Replace the {{TENANT}} placeholders and apply alongside the tenant's workload
# (the MySQL/PHP managers should emit this when they create a tenant, or a
# templating step in apps/ should render it). Kept here as the reference shape.
# ----------------------------------------------------------------------------
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: tenant-{{TENANT}}-isolation
namespace: fc-tenant-{{TENANT}}
labels:
app.kubernetes.io/part-of: flowercore
flowercore.io/tenant-id: "{{TENANT}}"
flowercore.io/created-by: bluejay-infra
flowercore.io/gate: public-tls
spec:
podSelector: {} # all pods in the tenant namespace
policyTypes: [Ingress, Egress]
ingress:
# Only Traefik may reach tenant pods (public traffic terminates at Traefik).
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: traefik-system
ports:
- { protocol: TCP, port: 80 }
- { protocol: TCP, port: 443 }
- { protocol: TCP, port: 8080 }
egress:
# CoreDNS resolution.
- to:
- namespaceSelector: {}
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- { protocol: UDP, port: 53 }
- { protocol: TCP, port: 53 }
# This tenant's OWN MySQL only (NOT other tenants' DBs — that's the isolation).
- to:
- podSelector:
matchLabels:
flowercore.io/tenant-id: "{{TENANT}}"
app.kubernetes.io/name: mysql
ports:
- { protocol: TCP, port: 3306 }
# NOTE: deliberately NO blanket egress. Add per-tenant allowances explicitly
# (object storage, mail relay, etc.) so a compromised tenant pod cannot reach
# the rest of the fleet or other tenants.

15
gx10/platform/README.md Normal file
View File

@@ -0,0 +1,15 @@
# GX10 cluster platform layer (NOT old-cluster ArgoCD)
These manifests bootstrap the GX10 RKE2 cluster's platform layer for the NUC→GX10
migration. They are **direct-applied** to the GX10 (its own kubectl) during
bootstrap, and live under `gx10/` (NOT `apps/`) so the OLD cluster's bluejay-infra
ApplicationSet (whose `apps/*` generator targets the OLD cluster) does NOT
auto-deploy them there. Once ArgoCD is stood up on the GX10, a GX10-only
ApplicationSet (`apps-gx10/*`) will own these.
- `step-ca-acme.yaml` — cert-manager ClusterIssuer (ACME → noc1 step-ca, in-spec caBundle). APPLIED + Ready.
- `traefik-helmchart.yaml` — Traefik v3.6.10 (chart 39.0.5) via the RKE2 HelmChart CRD, LoadBalancer VIP 10.0.57.202 (prod-pool; temp parallel-run VIP — canonical .200 reclaimed at cutover). APPLIED.
cert-manager v1.17.2 was installed separately (upstream static manifest). See
`docs/ai-agents/gx10-migration-continuation-2026-06-14.md` + memory
`project_gx10_ai_node_2026_06_13`.

View File

@@ -0,0 +1,14 @@
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: step-ca-acme
spec:
acme:
server: https://10.0.56.10:9443/acme/acme/directory
caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJ4RENDQVdxZ0F3SUJBZ0lSQVBZMzU3RzZvdzZ6TUFMNSs0YlMya2t3Q2dZSUtvWkl6ajBFQXdJd1FERWEKTUJnR0ExVUVDaE1SU1VGdFYyOXlhMmx1SUVGRFRVVWdRMEV4SWpBZ0JnTlZCQU1UR1VsQmJWZHZjbXRwYmlCQgpRMDFGSUVOQklGSnZiM1FnUTBFd0hoY05Nall3TXpBNE1UZ3dOekV4V2hjTk16WXdNekExTVRnd056RXhXakJBCk1Sb3dHQVlEVlFRS0V4RkpRVzFYYjNKcmFXNGdRVU5OUlNCRFFURWlNQ0FHQTFVRUF4TVpTVUZ0VjI5eWEybHUKSUVGRFRVVWdRMEVnVW05dmRDQkRRVEJaTUJNR0J5cUdTTTQ5QWdFR0NDcUdTTTQ5QXdFSEEwSUFCSjJuMDRYMQpKWm81WmRxL2kxSWR2OCtmcXdaeUF6Qmg3d2hicWowU1dzSkw4VVdSYWJDTXFZQ3M3K2RYTzB4UlN6cWt3RkRMCngrdm9vT2FpOFJnUk5oYWpSVEJETUE0R0ExVWREd0VCL3dRRUF3SUJCakFTQmdOVkhSTUJBZjhFQ0RBR0FRSC8KQWdFQk1CMEdBMVVkRGdRV0JCUm51UFBRUjZpTS9INnZPbHVpVTNTeWdheXo4akFLQmdncWhrak9QUVFEQWdOSQpBREJGQWlFQXJRSzlkWVBHbUFac2RZbmp6aXVGVlZFNU5LWlVjY2VZdkdmR0MrdExYVXNDSUF1ZEYyekpyQ1JxCjNtSzUwWlpFVC9md1RrSndpRUY0ODI0bWpQOHAxQ0tNCi0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
privateKeySecretRef:
name: step-ca-acme-account-key
solvers:
- http01:
ingress:
ingressClassName: traefik

View File

@@ -0,0 +1,81 @@
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
name: traefik
namespace: kube-system
spec:
chart: traefik
repo: https://traefik.github.io/charts
version: "39.0.5"
targetNamespace: traefik-system
createNamespace: true
valuesContent: |
deployment:
replicas: 1
additionalArguments:
- "--api.dashboard=true"
- "--log.level=INFO"
- "--providers.kubernetescrd"
- "--providers.kubernetesingress"
- "--providers.kubernetescrd.allowEmptyServices=true"
- "--providers.kubernetesingress.allowEmptyServices=true"
- "--providers.kubernetesingress.ingressendpoint.publishedservice=traefik-system/traefik"
ingressRoute:
dashboard:
enabled: false
rbac:
enabled: true
service:
type: LoadBalancer
annotations:
metallb.io/loadBalancerIPs: "10.0.57.202"
metallb.io/address-pool: "prod-pool"
ports:
web:
port: 8000
exposedPort: 80
protocol: TCP
websecure:
port: 8443
exposedPort: 443
protocol: TCP
tls:
enabled: true
irc:
port: 6667
exposedPort: 6667
protocol: TCP
expose:
default: true
irctls:
port: 6697
exposedPort: 6697
protocol: TCP
expose:
default: true
traefik:
port: 8080
exposedPort: 8080
protocol: TCP
expose:
default: false
metrics:
port: 9100
exposedPort: 9100
protocol: TCP
expose:
default: false
metrics:
prometheus:
entryPoint: metrics
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"

31
gx10/tts/Dockerfile Normal file
View File

@@ -0,0 +1,31 @@
# GX10 Piper TTS — linux/arm64 (built natively on the GX10 / DGX Spark, aarch64).
# Serves the telephony /tts contract: POST {"text"} -> 16 kHz/16-bit/mono WAV.
# Voice baked into the image so there is no runtime HuggingFace dependency.
FROM python:3.12-slim
# espeak-ng is the phonemizer backend piper-tts uses at synthesis time.
RUN apt-get update \
&& apt-get install -y --no-install-recommends espeak-ng ca-certificates curl \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir piper-tts flask numpy
# Bake the voice model (en_US-amy-medium, 22.05 kHz native) into the image.
ARG PIPER_VOICE=en_US-amy-medium
ARG VOICE_BASE=https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/amy/medium
RUN mkdir -p /voices \
&& curl -sSL -o "/voices/${PIPER_VOICE}.onnx" "${VOICE_BASE}/${PIPER_VOICE}.onnx" \
&& curl -sSL -o "/voices/${PIPER_VOICE}.onnx.json" "${VOICE_BASE}/${PIPER_VOICE}.onnx.json" \
&& test -s "/voices/${PIPER_VOICE}.onnx" \
&& test -s "/voices/${PIPER_VOICE}.onnx.json"
COPY tts_service.py /app/tts_service.py
WORKDIR /app
ENV TTS_PORT=8500 \
PIPER_VOICE=en_US-amy-medium \
VOICES_DIR=/voices \
TARGET_RATE=16000
EXPOSE 8500
CMD ["python", "tts_service.py"]

59
gx10/tts/README.md Normal file
View File

@@ -0,0 +1,59 @@
# GX10 Piper TTS — telephony `/tts` endpoint
CPU Piper TTS serving the telephony `/tts` contract on the **GX10 RKE2 cluster**
(ASUS Ascent GX10 / NVIDIA DGX Spark, ARM64, `10.0.56.14`). This is the
telephony-TTS-port-to-GX10 (P1) baseline: edge1 parity at higher quality, zero
GPU/aarch64 risk, frees telephony off the slow edge1 Pi 5.
## What it is
- `tts_service.py` — Flask app: `POST /tts {"text"}`**16 kHz / 16-bit / mono WAV**
(canonical 44-byte header) + `GET /health`. Voice `en_US-amy-medium` (22.05 kHz
native) is numpy-resampled to 16 kHz so it drops straight onto Asterisk's
`.sln16` path (telephony strips the 44-byte header). Same wire contract as the
edge1 `speech-pipeline` `/tts`, just the TTS half (no STT/Wyoming).
- `Dockerfile``linux/arm64`, voice baked in (no runtime HuggingFace dep).
- `gx10-tts.yaml` — Namespace `tts` + Deployment (CPU-only, **no GPU request** so it
co-resides with the GPU-holding Ollama pod) + NodePort Service.
## This cluster is NOT under the old-cluster ArgoCD (yet)
Apply manually with the GX10's own kubectl:
```bash
ssh -J noc1 -i ~/.ssh/fcadmin_ed25519 bluejay@10.0.56.14
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
K=/var/lib/rancher/rke2/bin/kubectl
$K apply -f gx10-tts.yaml
```
## Build + import (native arm64 on the GX10)
```bash
docker build -t localhost/fc-gx10-tts:v20260614 .
docker save localhost/fc-gx10-tts:v20260614 -o /tmp/t.tar
sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock -n k8s.io images import /tmp/t.tar
# manifest uses imagePullPolicy: Never (image lives in containerd, no registry)
```
## Telephony cutover (reversible)
Endpoint telephony hits: **`http://10.0.56.14:30850`** (NodePort, MGMT VLAN 56).
In `apps/telephony/telephony.yaml`:
1. Deployment env `Tts__PiperUrl=http://10.0.56.14:30850`**this is the real lever**;
env vars override `appsettings.Production.json`, so the configmap `Tts` block alone
is inert (it was shadowed by a drifted live env `Tts__PiperUrl=edge1`).
2. NetworkPolicy egress to `10.0.56.14/32:30850` (telephony-web is `hostNetwork`, so this
only matters for non-hostNetwork pods; harmless either way).
3. edge1 (`10.0.57.17:8500`) stays warm — **rollback = set `Tts__PiperUrl` back to it**.
The TTS circuit breaker + `MapTextToSound` canned-prompt fallback mean a bad endpoint
degrades gracefully, never to silence.
## Verify (not a manual call)
```bash
FLOWERCORE_SIP_TEST_MODE=required dotnet.exe test \
FlowerCore.Telephony/tests/FlowerCore.Telephony.SipTests/FlowerCore.Telephony.SipTests.csproj \
--filter FullyQualifiedName~Call_Star100_ReceivesAudibleAudioStream
```
A passing audible test alone is NOT sufficient (edge1 also produces audible audio) —
confirm the **GX10 TTS pod's own access log** (`kubectl -n tts logs deploy/gx10-tts`)
shows `POST /tts 200` during the call, and telephony-web logs target `10.0.56.14:30850`.
## Voice upgrade (follow-on)
Operator's pick is **Kokoro**; needs GPU time-slicing (Ollama holds the GB10 GPU; MPS is
refuted on GB10) OR Kokoro-CPU behind a `/tts` shim. This Piper baseline stays as the floor.

81
gx10/tts/gx10-tts.yaml Normal file
View File

@@ -0,0 +1,81 @@
# GX10 Piper TTS — telephony /tts endpoint on the GX10 RKE2 cluster.
# Applied DIRECTLY via the GX10's own kubectl (KUBECONFIG=/etc/rancher/rke2/rke2.yaml);
# the GX10 cluster is NOT yet under the old-cluster ArgoCD. CPU-only (no GPU request)
# so it co-resides with the GPU-holding Ollama pod without contending for the GB10.
# Image is imported into RKE2 containerd (imagePullPolicy: Never).
# Telephony reaches it at http://10.0.56.14:30850 (NodePort, MGMT VLAN 56).
apiVersion: v1
kind: Namespace
metadata:
name: tts
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: gx10-tts
namespace: tts
labels:
app: gx10-tts
spec:
replicas: 1
selector:
matchLabels:
app: gx10-tts
template:
metadata:
labels:
app: gx10-tts
spec:
containers:
- name: tts
image: localhost/fc-gx10-tts:v20260614
imagePullPolicy: Never
ports:
- containerPort: 8500
name: http
env:
- name: TTS_PORT
value: "8500"
- name: PIPER_VOICE
value: "en_US-amy-medium"
- name: TARGET_RATE
value: "16000"
readinessProbe:
httpGet:
path: /health
port: 8500
initialDelaySeconds: 3
periodSeconds: 5
timeoutSeconds: 3
livenessProbe:
httpGet:
path: /health
port: 8500
initialDelaySeconds: 10
periodSeconds: 20
timeoutSeconds: 5
resources:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "4"
memory: "2Gi"
---
apiVersion: v1
kind: Service
metadata:
name: gx10-tts
namespace: tts
labels:
app: gx10-tts
spec:
type: NodePort
selector:
app: gx10-tts
ports:
- name: http
port: 8500
targetPort: 8500
nodePort: 30850
protocol: TCP

153
gx10/tts/tts_service.py Normal file
View File

@@ -0,0 +1,153 @@
#!/usr/bin/env python3
"""GX10 Piper TTS microservice — telephony /tts contract.
POST /tts {"text": "..."} -> 16 kHz / 16-bit / mono WAV (canonical 44-byte header)
GET /health -> JSON status
The telephony AsteriskProvider strips the 44-byte WAV header and writes the
remainder as a `.sln16` (signed-linear 16 kHz) file that Asterisk transcodes to
any codec. So the response MUST be 16 kHz / 16-bit / mono. The en_US-amy-medium
voice is 22.05 kHz native, so we resample to 16 kHz (a 22.05 kHz stream treated
as 16 kHz plays ~1.38x too fast). This is a drop-in upgrade over edge1's
en_US-amy-low (16 kHz native, lower quality), keeping the exact wire contract.
"""
import io
import logging
import os
import sys
import threading
import wave
import numpy as np
from flask import Flask, Response, jsonify, request
API_PORT = int(os.environ.get("TTS_PORT", "8500"))
PIPER_VOICE = os.environ.get("PIPER_VOICE", "en_US-amy-medium")
VOICES_DIR = os.environ.get("VOICES_DIR", "/voices")
TARGET_RATE = int(os.environ.get("TARGET_RATE", "16000"))
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
stream=sys.stdout,
)
log = logging.getLogger("gx10-tts")
piper_voice_obj = None
piper_loaded = False
piper_lock = threading.Lock()
native_rate = None
app = Flask(__name__)
def load_piper():
"""Load the Piper voice model once at startup (shared, lock-guarded)."""
global piper_voice_obj, piper_loaded
try:
from piper import PiperVoice
model_path = os.path.join(VOICES_DIR, f"{PIPER_VOICE}.onnx")
if not os.path.isfile(model_path):
log.error("Piper voice model not found at %s — TTS disabled", model_path)
piper_loaded = False
return
log.info("Loading Piper voice %s from %s", PIPER_VOICE, model_path)
piper_voice_obj = PiperVoice.load(model_path)
piper_loaded = True
log.info("Piper voice loaded")
except Exception as exc: # noqa: BLE001 — fail-soft, /health reports it
log.error("Failed to load Piper: %s", exc)
piper_loaded = False
def synthesize_chunks(text):
"""Run Piper synthesis under a lock because the loaded voice is shared."""
with piper_lock:
return list(piper_voice_obj.synthesize(text))
def resample_i16(pcm_i16, src_rate, dst_rate):
"""Linear-interpolation resample of int16 PCM (matches edge1's STT resample)."""
if src_rate == dst_rate or len(pcm_i16) == 0:
return pcm_i16
audio = pcm_i16.astype(np.float32)
target_len = int(round(len(audio) * dst_rate / src_rate))
if target_len <= 0:
return np.zeros(0, dtype=np.int16)
idx = np.linspace(0, len(audio) - 1, target_len)
res = np.interp(idx, np.arange(len(audio)), audio)
return np.clip(np.round(res), -32768, 32767).astype(np.int16)
@app.route("/health", methods=["GET"])
def health():
return jsonify({
"status": "ok",
"voice": PIPER_VOICE,
"loaded": piper_loaded,
"target_rate": TARGET_RATE,
"native_rate": native_rate,
})
@app.route("/tts", methods=["POST"])
def tts():
"""Text -> 16 kHz/16-bit/mono WAV. Mirrors the edge1 speech-pipeline contract."""
if not piper_loaded:
return jsonify({"error": "Piper TTS model not loaded"}), 503
data = request.get_json(silent=True)
if not data or "text" not in data:
return jsonify({"error": "Missing required field: text"}), 400
text = data["text"].strip()
if not text:
return jsonify({"error": "Text field is empty"}), 400
if len(text) > 10000:
return jsonify({"error": "Text too long (max 10000 characters)"}), 400
try:
chunks = synthesize_chunks(text)
if not chunks:
return jsonify({"error": "No audio produced"}), 500
global native_rate
first = chunks[0]
native_rate = first.sample_rate
if first.sample_width != 2 or first.sample_channels != 1:
return jsonify({
"error": f"Unexpected PCM format: width={first.sample_width} "
f"channels={first.sample_channels} (need 16-bit mono)"
}), 500
pcm = np.frombuffer(
b"".join(c.audio_int16_bytes for c in chunks), dtype=np.int16
)
out = resample_i16(pcm, native_rate, TARGET_RATE)
wav_buffer = io.BytesIO()
with wave.open(wav_buffer, "wb") as wav_file:
wav_file.setnchannels(1)
wav_file.setsampwidth(2)
wav_file.setframerate(TARGET_RATE)
wav_file.writeframes(out.tobytes())
wav_buffer.seek(0)
return Response(
wav_buffer.read(),
mimetype="audio/wav",
headers={"Content-Disposition": 'inline; filename="speech.wav"'},
)
except Exception as exc: # noqa: BLE001
log.error("TTS synthesis failed: %s", exc)
return jsonify({"error": f"Synthesis failed: {exc}"}), 500
if __name__ == "__main__":
log.info(
"GX10 TTS starting on port %d (voice=%s -> %d Hz)",
API_PORT, PIPER_VOICE, TARGET_RATE,
)
load_piper()
app.run(host="0.0.0.0", port=API_PORT, threaded=True)

View File

@@ -0,0 +1,206 @@
using FluentAssertions;
using Xunit;
namespace BluejayInfraLint.Tests;
[Trait("Category", "Unit")]
public sealed class DivoomPiDeployArtifactTests
{
private static readonly string Root = FindRepoRoot();
private static readonly string DmRoot = Path.Combine(Root, "apps", "fc-divoom-dm-pi-device");
private static readonly string TvRoot = Path.Combine(Root, "apps", "fc-divoom-tv-pi");
public static TheoryData<string> DmRequiredArtifacts => new()
{
"README.md",
"hiera/edge2-divoom-dm-device.overlay.yaml",
"puppet/profile/pi/service/divoom_dm_device.pp",
"puppet/templates/divoom-device-registration.json.epp",
"puppet/templates/flowercore-divoom-dm-agent.service.epp",
};
public static TheoryData<string> TvRequiredArtifacts => new()
{
"README.md",
"hiera/example-divoom-tv-pi.iamworkin.lan.yaml",
"puppet/profile/pi/service/divoom_tv.pp",
"systemd/flowercore-divoom-tv.service",
"systemd/flowercore-divoom-tv-hdmi.service",
"systemd/99-flowercore-divoom-tv-hdmi.rules",
"scripts/flowercore-divoom-tv-prelaunch.sh",
"scripts/flowercore-divoom-tv-launch.sh",
"scripts/flowercore-divoom-tv-hdmi-respond.sh",
};
[Theory]
[MemberData(nameof(DmRequiredArtifacts))]
public void DmDeviceArtifacts_ArePresent(string relativePath)
{
File.Exists(Path.Combine(DmRoot, relativePath.Replace('/', Path.DirectorySeparatorChar))).Should().BeTrue(relativePath);
}
[Theory]
[MemberData(nameof(TvRequiredArtifacts))]
public void TvPiArtifacts_ArePresent(string relativePath)
{
File.Exists(Path.Combine(TvRoot, relativePath.Replace('/', Path.DirectorySeparatorChar))).Should().BeTrue(relativePath);
}
[Fact]
public void DmDeviceReadme_DeclaresPuppetSystemdNotKubernetes()
{
var readme = ReadDm("README.md");
readme.Should().Contain("not a Kubernetes application");
readme.Should().Contain("profile::pi::service::divoom");
readme.Should().Contain("no K8s surface");
}
[Fact]
public void DmHieraOverlay_PreservesExistingEdge2DivoomService()
{
var hiera = ReadDm("hiera/edge2-divoom-dm-device.overlay.yaml");
hiera.Should().Contain("fc-pimanager:");
hiera.Should().Contain("fc-divoom:");
hiera.Should().Contain("enabled: true");
hiera.Should().Contain("profile::pi::service::divoom_dm_device::service_enabled: false");
hiera.Should().Contain("profile::pi::service::divoom_dm_device::service_ensure: 'stopped'");
}
[Fact]
public void DmPuppetProfile_DefaultsToStoppedDisabledService()
{
var profile = ReadDm("puppet/profile/pi/service/divoom_dm_device.pp");
profile.Should().Contain("Boolean $service_enabled = false");
profile.Should().Contain("Enum['running', 'stopped'] $service_ensure = 'stopped'");
profile.Should().Contain("service { $service_name:");
profile.Should().Contain("ensure => $service_ensure");
profile.Should().Contain("enable => $service_enabled");
}
[Fact]
public void DmPuppetProfile_DoesNotManageLiveDivoomWebUnit()
{
var profile = ReadDm("puppet/profile/pi/service/divoom_dm_device.pp");
profile.Should().NotContain("Service['flowercore-divoom.service']");
profile.Should().NotContain("service { 'flowercore-divoom.service'");
profile.Should().NotContain("notify => Service");
}
[Fact]
public void DmAgentUnit_IsSeparateAndGatedByExistingWrappers()
{
var unit = ReadDm("puppet/templates/flowercore-divoom-dm-agent.service.epp");
unit.Should().Contain("ConditionPathExists=<%= $divoom_install_dir %>/bt-link.sh");
unit.Should().Contain("ConditionPathExists=<%= $divoom_install_dir %>/bt-reset.sh");
unit.Should().Contain("ConditionPathExists=<%= $divoom_install_dir %>/audio-link.sh");
unit.Should().Contain("ExecStart=<%= $agent_binary_path %> --mode=Pi");
unit.Should().NotContain("flowercore-divoom.service");
}
[Fact]
public void DmRegistration_CarriesRenderProofAndSafetyPolicy()
{
var registration = ReadDm("puppet/templates/divoom-device-registration.json.epp");
registration.Should().Contain("\"candidateChannels\": <%= $bt_channels_json %>");
registration.Should().Contain("\"deviceInfoIsRenderProof\": false");
registration.Should().Contain("\"visibleRenderProofRequired\": <%= $visible_render_proof_required %>");
registration.Should().Contain("\"preserveExistingService\": \"flowercore-divoom.service\"");
registration.Should().Contain("\"doNotEnableFmRadio\": true");
}
[Fact]
public void TvService_UsesAvaloniaHdmiSafetyGates()
{
var unit = ReadTv("systemd/flowercore-divoom-tv.service");
unit.Should().Contain("ConditionPathExists=/opt/flowercore/divoom-tv/FlowerCore.Divoom.Tv");
unit.Should().Contain("Environment=XDG_RUNTIME_DIR=/run/fc-divoom-tv");
unit.Should().Contain("RuntimeDirectoryMode=0700");
unit.Should().Contain("ExecStartPre=/usr/local/bin/flowercore-divoom-tv-prelaunch.sh");
unit.Should().Contain("ExecStart=/usr/local/bin/flowercore-divoom-tv-launch.sh");
unit.Should().Contain("MemoryMax=2G");
unit.Should().Contain("PrivateTmp=true");
unit.Should().NotContain("/tmp");
}
[Fact]
public void TvLauncher_PrefersCageAndFallsBackToDirectLaunch()
{
var script = ReadTv("scripts/flowercore-divoom-tv-launch.sh");
script.Should().Contain("command -v cage");
script.Should().Contain("exec cage --");
script.Should().Contain("launching FlowerCore.Divoom.Tv directly");
script.Should().Contain("--target=hdmi");
script.Should().Contain("--presentation-mode=${PRESENTATION_MODE}");
}
[Fact]
public void TvHotplugRule_SettlesAndRestartsRenderer()
{
var rule = ReadTv("systemd/99-flowercore-divoom-tv-hdmi.rules");
var responder = ReadTv("scripts/flowercore-divoom-tv-hdmi-respond.sh");
rule.Should().Contain("KERNEL==\"card?-HDMI-A-?\"");
rule.Should().Contain("start flowercore-divoom-tv-hdmi.service");
responder.Should().Contain("sleep 2");
responder.Should().Contain("systemctl restart flowercore-divoom-tv.service");
}
[Fact]
public void TvPuppetProfile_InstallsCageAndStaticArtifacts()
{
var profile = ReadTv("puppet/profile/pi/service/divoom_tv.pp");
profile.Should().Contain("package { ['cage', 'libgbm1', 'libdrm2', 'libxkbcommon0', 'fonts-dejavu-core']");
profile.Should().Contain("'profile/pi/fc_divoom_tv/flowercore-divoom-tv.service'");
profile.Should().Contain("'profile/pi/fc_divoom_tv/flowercore-divoom-tv-launch.sh'");
profile.Should().Contain("profile/pi/fc_divoom_tv/99-flowercore-divoom-tv-hdmi.rules");
profile.Should().Contain("Boolean $service_enabled = false");
}
[Fact]
public void DivoomArtifacts_DoNotAddKubernetesWorkloads()
{
var allText = Directory.GetFiles(DmRoot, "*", SearchOption.AllDirectories)
.Concat(Directory.GetFiles(TvRoot, "*", SearchOption.AllDirectories))
.Select(File.ReadAllText);
foreach (var text in allText)
{
text.Should().NotContain("kind: Deployment");
text.Should().NotContain("kind: IngressRoute");
text.Should().NotContain("kind: Certificate");
text.Should().NotContain("kind: OnePasswordItem");
}
}
private static string ReadDm(string relativePath)
=> File.ReadAllText(Path.Combine(DmRoot, relativePath.Replace('/', Path.DirectorySeparatorChar)));
private static string ReadTv(string relativePath)
=> File.ReadAllText(Path.Combine(TvRoot, relativePath.Replace('/', Path.DirectorySeparatorChar)));
private static string FindRepoRoot()
{
var current = new DirectoryInfo(AppContext.BaseDirectory);
while (current is not null)
{
if (Directory.Exists(Path.Combine(current.FullName, "apps"))
&& File.Exists(Path.Combine(current.FullName, "README.md")))
{
return current.FullName;
}
current = current.Parent;
}
throw new DirectoryNotFoundException("Could not find bluejay-infra root.");
}
}

View File

@@ -15,24 +15,19 @@ public sealed class FleetManifestLintTests
{ {
"brochure.flowercore.io", "brochure.flowercore.io",
"dist.flowercore.io", "dist.flowercore.io",
"dns.iamworkin.lan",
}; };
// Public hosts that allow a tightly bounded write surface in addition to // Hosts that allow a tightly bounded write surface in addition to GET/HEAD.
// GET/HEAD. updatecenter.iamworkin.lan accepts POST /api/v1/checkin/{id} // updatecenter.iamworkin.lan accepts POST /api/v1/checkin/{id}
// (bootstrap-JWT) so its allowlist is GET||HEAD||POST||OPTIONS — but // (bootstrap-JWT) so its allowlist is GET||HEAD||POST||OPTIONS — but
// PUT/PATCH/DELETE must still 404 at the route. Anything wider than this // PUT/PATCH/DELETE must still 404 at the route. Public
// set should fail this lint. // update.flowercore.io remains a GET/HEAD download surface in the
// // FlowerCore.Updater sibling manifest and is covered by the general
// PUB-1 (2026-05-06): update.flowercore.io / updates.flowercore.io were // public-method allowlist lint instead of this write-surface rule.
// added for the Cloudflare-proxied public Update Center edge. They use the
// same bounded read-write allowlist as the LAN pair.
private static readonly HashSet<string> PublicReadWriteAllowlistHosts = new(StringComparer.Ordinal) private static readonly HashSet<string> PublicReadWriteAllowlistHosts = new(StringComparer.Ordinal)
{ {
"updatecenter.iamworkin.lan", "updatecenter.iamworkin.lan",
"updates.iamworkin.lan", "updates.iamworkin.lan",
"update.flowercore.io",
"updates.flowercore.io",
}; };
private static readonly HashSet<string> ApiKeyProtectedDeployments = new(StringComparer.Ordinal) private static readonly HashSet<string> ApiKeyProtectedDeployments = new(StringComparer.Ordinal)
@@ -70,7 +65,7 @@ public sealed class FleetManifestLintTests
["github-runner-updater"] = "https://github.com/astoltz/FlowerCore.Updater", ["github-runner-updater"] = "https://github.com/astoltz/FlowerCore.Updater",
}; };
private static readonly HashSet<string> ScaledLinuxRunnerDeployments = new(StringComparer.Ordinal) private static readonly HashSet<string> RepoScopedLinuxRunnerDeployments = new(StringComparer.Ordinal)
{ {
"github-runner-sharedpos", "github-runner-sharedpos",
"github-runner-puppet", "github-runner-puppet",
@@ -84,6 +79,44 @@ public sealed class FleetManifestLintTests
"github-runner-updater", "github-runner-updater",
}; };
private static readonly IReadOnlyDictionary<string, (string Deployment, string ProbePath)> BroaderHardeningDeployments =
new Dictionary<string, (string Deployment, string ProbePath)>(StringComparer.Ordinal)
{
["fc-aistation"] = ("aistation-web", "/healthz"),
["fc-chat"] = ("chat-web", "/healthz"),
["fc-devicemgmt"] = ("fc-devicemgmt-web", "/healthz"),
["fc-library"] = ("library-web", "/health"),
["fc-llm-bridge"] = ("fc-llm-bridge", "/healthz"),
["fc-messageboard"] = ("messageboard-web", "/health"),
["fc-retail"] = ("retail-web", "/healthz"),
["fc-ttsreader"] = ("ttsreader-web", "/health"),
["fc-updater"] = ("updatecenter-web", "/"),
["knowledge"] = ("knowledge-web", "/healthz"),
["telephony"] = ("telephony-web", "/health"),
["worldbuilder"] = ("worldbuilder-web", "/healthz"),
};
private static readonly HashSet<string> BroaderHardeningInternalPrestageApps = new(StringComparer.Ordinal)
{
"fc-aistation",
"fc-desktop",
"fc-dms",
"fc-library",
"fc-llm-bridge",
"fc-menuboard",
"fc-messageboard",
"fc-mysql",
"fc-php",
"fc-presentations",
"fc-retail",
"fc-scoreboard",
"fc-segmentdisplay",
"fc-signage",
"fc-ttsreader",
"knowledge",
"worldbuilder",
};
private static readonly IReadOnlyDictionary<string, string> WritableRunnerEnv = new Dictionary<string, string>(StringComparer.Ordinal) private static readonly IReadOnlyDictionary<string, string> WritableRunnerEnv = new Dictionary<string, string>(StringComparer.Ordinal)
{ {
["HOME"] = "/home/runner", ["HOME"] = "/home/runner",
@@ -236,9 +269,10 @@ public sealed class FleetManifestLintTests
{ {
deployments.Should().ContainKey(expectedRunner.Key); deployments.Should().ContainKey(expectedRunner.Key);
var container = deployments[expectedRunner.Key].ContainerMappings().Should().ContainSingle().Subject; var container = deployments[expectedRunner.Key].MainContainerMappings().Should().ContainSingle().Subject;
EnvValue(container, "REPO_URL").Should().Be(expectedRunner.Value); EnvValue(container, "REPO_URL").Should().Be(expectedRunner.Value);
EnvValue(container, "EPHEMERAL").Should().Be("true"); EnvValue(container, "EPHEMERAL").Should().Be("true");
EnvValue(container, "DISABLE_AUTO_UPDATE").Should().Be("true", $"{expectedRunner.Key} must not self-update inside immutable Kubernetes runner pods");
EnvValue(container, "LABELS").Should().Be("self-hosted,linux,fc-build-linux"); EnvValue(container, "LABELS").Should().Be("self-hosted,linux,fc-build-linux");
EnvValue(container, "RUN_AS_ROOT").Should().Be("false"); EnvValue(container, "RUN_AS_ROOT").Should().Be("false");
EnvValue(container, "ACCESS_TOKEN").Should().BeNull("ACCESS_TOKEN must come from github-runner-token Secret, not a literal"); EnvValue(container, "ACCESS_TOKEN").Should().BeNull("ACCESS_TOKEN must come from github-runner-token Secret, not a literal");
@@ -252,7 +286,7 @@ public sealed class FleetManifestLintTests
{ {
foreach (var deployment in GitHubRunnerDeployments().Values) foreach (var deployment in GitHubRunnerDeployments().Values)
{ {
var container = deployment.ContainerMappings().Should().ContainSingle().Subject; var container = deployment.MainContainerMappings().Should().ContainSingle().Subject;
foreach (var expectedEnv in WritableRunnerEnv) foreach (var expectedEnv in WritableRunnerEnv)
{ {
@@ -272,14 +306,17 @@ public sealed class FleetManifestLintTests
} }
[Fact] [Fact]
public void GitHubRunnerFleet_MustAvoidRwoMultiAttachForScaledDeployments() public void GitHubRunnerFleet_MustAvoidRwoMultiAttachForRepoScopedDeployments()
{ {
var deployments = GitHubRunnerDeployments(); var deployments = GitHubRunnerDeployments();
foreach (var deploymentName in ScaledLinuxRunnerDeployments) foreach (var deploymentName in RepoScopedLinuxRunnerDeployments)
{ {
var deployment = deployments[deploymentName]; var deployment = deployments[deploymentName];
ReplicaCount(deployment).Should().Be(2); // Sprint 34 ops trimmed runner load while the cluster was degraded
// to two healthy nodes. Repo-scoped runners can be tuned back above
// one replica, but they must stay RWO-safe before that happens.
ReplicaCount(deployment).Should().BeGreaterOrEqualTo(1, $"{deploymentName} must keep at least one repo-scoped runner online");
var volumes = deployment.MappingSequence("spec", "template", "spec", "volumes"); var volumes = deployment.MappingSequence("spec", "template", "spec", "volumes");
var claimNames = volumes var claimNames = volumes
@@ -287,7 +324,7 @@ public sealed class FleetManifestLintTests
.Where(value => !string.IsNullOrWhiteSpace(value)) .Where(value => !string.IsNullOrWhiteSpace(value))
.ToList(); .ToList();
claimNames.Should().BeEmpty($"{deploymentName} is scaled and must not share a RWO PVC"); claimNames.Should().BeEmpty($"{deploymentName} must remain ready for safe multi-replica scaling without sharing a RWO PVC");
volumes.Should().Contain(volume => volumes.Should().Contain(volume =>
string.Equals(ManifestNodeExtensions.Scalar(volume, "name"), "nuget-cache", StringComparison.Ordinal) string.Equals(ManifestNodeExtensions.Scalar(volume, "name"), "nuget-cache", StringComparison.Ordinal)
&& ManifestNodeExtensions.Mapping(volume, "emptyDir") != null); && ManifestNodeExtensions.Mapping(volume, "emptyDir") != null);
@@ -305,6 +342,108 @@ public sealed class FleetManifestLintTests
.Be("github-runner-nuget-cache"); .Be("github-runner-nuget-cache");
} }
[Fact]
public void Runners_MustNotPinToOperatorWorkstationHosts()
{
// CRITICAL SAFETY (operator directive 2026-05-26): BLUEJAY-WS is the
// operator's primary workstation — host of the 1Password Connect
// bearer token, fcadmin SSH keys to noc1, signing CA private keys,
// and source for every FC repo. A self-hosted GitHub Actions runner
// there would execute arbitrary PR code with that local access.
// Build-side analog of the Sprint 9 NEW safe-account exclusion gate
// (Puppet GPO/AppLocker/WDAC/audit-forwarder modules refuse to apply
// on BLUEJAY-WS). This lint asserts no GitHub-runner Deployment in
// apps/github-runner/ pins to a forbidden operator-workstation host
// via nodeName, nodeSelector, nodeAffinity, or tolerations.
// Existing legacy `bluejay-ws-sandbox-1` GitHub-registered runner is
// out of scope here (it's a runtime registration, not a K8s
// Deployment) — see CLAUDE.md "Common Mistakes" entry and
// feedback_bluejay_ws_never_public_runner.md.
var forbiddenHostPatterns = new[]
{
"bluejay-ws",
"BLUEJAY-WS",
"bluejay-ws.iamworkin.lan",
"iamworkin-ws",
};
bool ContainsForbidden(string? value) =>
!string.IsNullOrWhiteSpace(value)
&& forbiddenHostPatterns.Any(pattern => value!.Contains(pattern, StringComparison.OrdinalIgnoreCase));
var violations = GitHubRunnerDeployments().Values.SelectMany(deployment =>
{
var local = new List<string>();
var podSpec = ManifestNodeExtensions.Mapping(deployment.Root, "spec", "template", "spec");
if (podSpec is null)
{
return local;
}
// nodeName: pins the pod to a specific node by name.
var nodeName = ManifestNodeExtensions.Scalar(podSpec, "nodeName");
if (ContainsForbidden(nodeName))
{
local.Add($"{deployment.Name} sets nodeName='{nodeName}' which targets a forbidden operator-workstation host.");
}
// nodeSelector: dict of label → value pinning the pod to nodes
// carrying matching labels. Examples that would trip this:
// kubernetes.io/hostname: bluejay-ws
// flowercore.io/host: bluejay-ws.iamworkin.lan
var nodeSelector = ManifestNodeExtensions.Mapping(podSpec, "nodeSelector");
if (nodeSelector is not null)
{
foreach (var entry in nodeSelector.Children)
{
var key = entry.Key is YamlScalarNode keyScalar ? keyScalar.Value : null;
var value = entry.Value is YamlScalarNode valueScalar ? valueScalar.Value : null;
if (ContainsForbidden(value))
{
local.Add($"{deployment.Name} has nodeSelector entry '{key}: {value}' which targets a forbidden operator-workstation host.");
}
}
}
// nodeAffinity: matchExpressions over node labels.
foreach (var term in ManifestNodeExtensions.MappingSequence(podSpec, "affinity", "nodeAffinity", "requiredDuringSchedulingIgnoredDuringExecution", "nodeSelectorTerms"))
{
foreach (var expr in ManifestNodeExtensions.MappingSequence(term, "matchExpressions"))
{
var key = ManifestNodeExtensions.Scalar(expr, "key");
foreach (var valueNode in ManifestNodeExtensions.ScalarSequence(expr, "values"))
{
if (ContainsForbidden(valueNode))
{
local.Add($"{deployment.Name} has nodeAffinity matchExpression '{key}' value '{valueNode}' which targets a forbidden operator-workstation host.");
}
}
}
}
// tolerations: scheduling onto a tainted operator-workstation
// node would let the runner run there. Forbid any toleration
// value that names the workstation.
foreach (var toleration in ManifestNodeExtensions.MappingSequence(podSpec, "tolerations"))
{
var key = ManifestNodeExtensions.Scalar(toleration, "key");
var value = ManifestNodeExtensions.Scalar(toleration, "value");
if (ContainsForbidden(key))
{
local.Add($"{deployment.Name} has toleration key '{key}' which targets a forbidden operator-workstation host.");
}
if (ContainsForbidden(value))
{
local.Add($"{deployment.Name} has toleration value '{value}' which targets a forbidden operator-workstation host.");
}
}
return local;
}).ToList();
violations.Should().BeEmpty("BLUEJAY-WS / iamworkin-ws must never host a fleet GitHub Actions runner; see CLAUDE.md 'Registering BLUEJAY-WS as a fleet GitHub Actions runner' and feedback_bluejay_ws_never_public_runner.md");
}
[Fact] [Fact]
public void Monitoring_MustAlertWhenLinuxRunnerDeploymentIsUnavailable() public void Monitoring_MustAlertWhenLinuxRunnerDeploymentIsUnavailable()
{ {
@@ -319,6 +458,82 @@ public sealed class FleetManifestLintTests
monitoring.Should().Contain("alert_channel: irc"); monitoring.Should().Contain("alert_channel: irc");
} }
[Fact]
public void Monitoring_GenericKubernetesAlerts_MustExcludeEphemeralGithubRunnerNamespace()
{
var monitoring = File.ReadAllText(Path.Combine(Inventory.BluejayRoot, "apps", "monitoring", "noc-monitoring.yaml"));
monitoring.Should().Contain("kube_pod_container_status_restarts_total{namespace!=\"github-runner\"}");
monitoring.Should().Contain("and on(namespace, pod) kube_pod_info");
monitoring.Should().Contain("kube_deployment_spec_replicas{namespace!=\"github-runner\"} != kube_deployment_status_replicas_available{namespace!=\"github-runner\"}");
monitoring.Should().Contain("dedicated LinuxRunnerOffline/MacMiniRunnerOffline alerts");
}
[Fact]
public void Monitoring_BlackboxTargetsForOidcSensitiveServices_MustUseAnonymousHealthRoutesWhenAvailable()
{
var monitoring = File.ReadAllText(Path.Combine(Inventory.BluejayRoot, "apps", "monitoring", "noc-monitoring.yaml"));
monitoring.Should().Contain("https://chat.iamworkin.lan/healthz");
monitoring.Should().Contain("https://dist.iamworkin.lan/healthz");
monitoring.Should().Contain("https://dms.iamworkin.lan/healthz");
monitoring.Should().Contain("https://print.iamworkin.lan/healthz");
monitoring.Should().Contain("https://knowledge.iamworkin.lan/healthz");
monitoring.Should().Contain("https://library.iamworkin.lan/health");
monitoring.Should().Contain("https://aistation.iamworkin.lan/healthz");
monitoring.Should().NotContain("https://print.iamworkin.lan/\"");
}
[Fact]
public void OidcEnforcedDeployments_WithHttpHealthzProbes_MustDeclareAnonymousHealthzContract()
{
var violations = Inventory.Documents
.Where(document => document.Kind == "Deployment")
.SelectMany(document => document.MainContainerMappings()
.Where(container => string.Equals(EnvValue(container, "FlowerCore__Auth__Enabled"), "true", StringComparison.OrdinalIgnoreCase))
.Where(container => string.Equals(EnvValue(container, "FlowerCore__Auth__Oidc__Enabled"), "true", StringComparison.OrdinalIgnoreCase))
.Where(container => ProbeHttpGetPath(container, "readinessProbe") == "/healthz"
|| ProbeHttpGetPath(container, "startupProbe") == "/healthz")
.Where(_ => !string.Equals(
PodAnnotation(document, "flowercore.io/healthz-auth-policy"),
"allow-anonymous",
StringComparison.Ordinal))
.Select(container =>
{
var containerName = ManifestNodeExtensions.Scalar(container, "name") ?? "<unnamed>";
return $"{document.Descriptor} container '{containerName}' enforces OIDC while probing /healthz but lacks flowercore.io/healthz-auth-policy: allow-anonymous.";
}))
.ToList();
violations.Should().BeEmpty();
}
[Fact]
public void Knowledge_OidcEnforcement_MustKeepHealthzAnonymousContractVisibleInManifest()
{
var knowledge = Inventory.Documents
.Single(document => document.Kind == "Deployment" && document.Namespace == "knowledge" && document.Name == "knowledge-web");
var container = knowledge.MainContainerMappings().Should().ContainSingle().Subject;
EnvValue(container, "FlowerCore__Auth__Enabled").Should().Be("true");
EnvValue(container, "FlowerCore__Auth__Oidc__Enabled").Should().Be("true");
ProbeHttpGetPath(container, "readinessProbe").Should().Be("/healthz");
PodAnnotation(knowledge, "flowercore.io/healthz-auth-policy").Should().Be("allow-anonymous");
}
[Fact]
public void Distribution_OidcEnforcement_MustKeepHealthzAnonymousContractVisibleInManifest()
{
var distribution = Inventory.Documents
.Single(document => document.Kind == "Deployment" && document.Namespace == "fc-distribution" && document.Name == "fc-distribution");
var container = distribution.MainContainerMappings().Should().ContainSingle().Subject;
EnvValue(container, "FlowerCore__Auth__Oidc__Enabled").Should().Be("true");
EnvValue(container, "FlowerCore__Auth__Enabled").Should().Be("true");
ProbeHttpGetPath(container, "readinessProbe").Should().Be("/healthz");
PodAnnotation(distribution, "flowercore.io/healthz-auth-policy").Should().Be("allow-anonymous");
}
[Fact] [Fact]
public void StatefulSets_WithVolumeClaimTemplates_MustDeclareFilesystemDefaults() public void StatefulSets_WithVolumeClaimTemplates_MustDeclareFilesystemDefaults()
{ {
@@ -432,10 +647,10 @@ public sealed class FleetManifestLintTests
var expectedFiles = new[] var expectedFiles = new[]
{ {
"1password-item.yaml", "1password-item.yaml",
"argocd-application.yaml",
"certificate-web.yaml", "certificate-web.yaml",
"clusterrole-operator.yaml", "clusterrole-operator.yaml",
"clusterrolebinding-operator.yaml", "clusterrolebinding-operator.yaml",
"crds.yaml",
"deployment-operator.yaml", "deployment-operator.yaml",
"deployment-web.yaml", "deployment-web.yaml",
"ingressroute-web.yaml", "ingressroute-web.yaml",
@@ -525,7 +740,8 @@ public sealed class FleetManifestLintTests
.Single(document => document.Kind == "ClusterRole" && document.Name == "fc-devicemgmt-operator"); .Single(document => document.Kind == "ClusterRole" && document.Name == "fc-devicemgmt-operator");
var allScalars = clusterRole.AllScalars().ToList(); var allScalars = clusterRole.AllScalars().ToList();
allScalars.Should().Contain("devices.flowercore.io"); allScalars.Should().Contain("flowercore.io");
allScalars.Should().NotContain("devices.flowercore.io");
allScalars.Should().Contain("*"); allScalars.Should().Contain("*");
allScalars.Should().Contain("deployments"); allScalars.Should().Contain("deployments");
allScalars.Should().Contain("get"); allScalars.Should().Contain("get");
@@ -554,7 +770,7 @@ public sealed class FleetManifestLintTests
FcDeviceManagementDocuments().Should().NotContain(document => document.Kind == "Secret"); FcDeviceManagementDocuments().Should().NotContain(document => document.Kind == "Secret");
appText.Should().Contain("secretKeyRef:"); appText.Should().Contain("secretKeyRef:");
appText.Should().Contain("secretName: fc-devicemgmt-runtime"); appText.Should().Contain("name: fc-devicemgmt-runtime");
appText.Should().NotContain("stringData:"); appText.Should().NotContain("stringData:");
appText.Should().NotContain("from-literal"); appText.Should().NotContain("from-literal");
appText.Should().NotContain("tls.key:"); appText.Should().NotContain("tls.key:");
@@ -588,17 +804,202 @@ public sealed class FleetManifestLintTests
} }
[Fact] [Fact]
public void FcDeviceManagement_ArgocdApplicationMustMatchApplicationSetDiscoveryConventions() public void FcDeviceManagement_MustRelyOnApplicationSetDiscovery()
{ {
var application = FcDeviceManagementDocuments() var documents = FcDeviceManagementDocuments();
.Single(document => document.Kind == "Application" && document.Name == "infra-fc-devicemgmt");
application.Namespace.Should().Be("argocd"); documents.Should().NotContain(document => document.Kind == "Application");
application.Scalar("spec", "source", "repoURL")
var ns = documents.Single(document => document.Kind == "Namespace" && document.Name == "fc-devicemgmt");
ns.FileText.Should().Contain("ArgoCD discovers this directory as Application `infra-fc-devicemgmt`.");
}
[Fact]
public void BroaderHardeningDeployments_MustAnnotateAnonymousHealthProbeIntent()
{
foreach (var expected in BroaderHardeningDeployments)
{
var deployment = AppDocuments(expected.Key)
.Single(document => document.Kind == "Deployment" && document.Name == expected.Value.Deployment);
PodAnnotation(deployment, "fc.flowercore.io/healthz-anon").Should().Be("true");
PodAnnotation(deployment, "fc.flowercore.io/probe-path").Should().Be(expected.Value.ProbePath);
}
}
[Fact]
public void BroaderHardeningDeployments_MustDocumentForwardedProtoAuthPosture()
{
foreach (var expected in BroaderHardeningDeployments)
{
var deployment = AppDocuments(expected.Key)
.Single(document => document.Kind == "Deployment" && document.Name == expected.Value.Deployment);
deployment.FileText.Should().Contain(
"fc-safe-to-expose: X-Forwarded-Proto handled by AddFlowerCoreWebAuth (ADR-178)");
}
}
[Fact]
public void BroaderHardeningInternalApps_MustOnlyPrestageCommentedPublicMethodAllowlist()
{
foreach (var app in BroaderHardeningInternalPrestageApps)
{
var documents = AppDocuments(app);
var text = string.Join(Environment.NewLine, documents.Select(document => document.FileText));
text.Should().Contain("PUBLIC HOST PRE-STAGING (DISABLED - Sprint 61+ exposure go-decision only)");
text.Should().Contain("# - match: Host(`");
text.Should().Contain("Method(`GET`) || Method(`HEAD`)");
documents
.Where(document => document.Kind == "IngressRoute")
.SelectMany(document => document.MappingSequence("spec", "routes"))
.Select(route => ManifestNodeExtensions.Scalar(route, "match") ?? string.Empty)
.Should()
.NotContain(match => match.Contains(".flowercore.io", StringComparison.Ordinal),
"Sprint 61 broader hardening only pre-stages commented public hosts for internal-only apps");
}
}
[Fact]
public void OidcFlipServices_AreGitOpsManagedWithHealthzProbes()
{
var deployments = new[]
{
(App: "fc-dns", Name: "dns-web", Slug: "dns", Secret: "dns-oidc-client", AuthEnabled: "false"),
(App: "fc-media", Name: "fc-media-web", Slug: "media", Secret: "media-oidc-client", AuthEnabled: "true"),
(App: "fc-distribution", Name: "fc-distribution", Slug: "distribution", Secret: "distribution-oidc-client", AuthEnabled: "true"),
};
foreach (var expected in deployments)
{
var deployment = AppDocuments(expected.App)
.Single(document => document.Kind == "Deployment" && document.Name == expected.Name);
var container = deployment.MainContainerMappings().Should().ContainSingle().Subject;
EnvValue(container, "FlowerCore__Auth__Enabled").Should().Be(expected.AuthEnabled);
EnvValue(container, "FlowerCore__Auth__Oidc__Enabled").Should().Be("true");
(EnvValue(container, "FlowerCore__Auth__Oidc__Audience") ?? EnvValue(container, "FlowerCore__Auth__Oidc__ClientId"))
.Should()
.Be(expected.Slug);
EnvSecretName(container, "FlowerCore__Auth__Oidc__ClientSecret").Should().Be(expected.Secret);
EnvSecretOptional(container, "FlowerCore__Auth__Oidc__ClientSecret").Should().Be("true");
ProbePath(container, "readinessProbe").Should().Be("/healthz");
if (ProbePath(container, "startupProbe") is { } startupProbePath)
{
startupProbePath.Should().Be("/healthz");
}
if (ProbePath(container, "livenessProbe") is { } livenessProbePath)
{
livenessProbePath.Should().Be("/healthz");
}
}
}
[Fact]
public void OidcFlipServices_UseOnePasswordItemClientSecrets()
{
var expectedItems = new Dictionary<string, (string Name, string ItemPath)>(StringComparer.Ordinal)
{
["fc-dns"] = ("dns-oidc-client", "vaults/IAmWorkin/items/dns-oidc-client"),
["fc-media"] = ("media-oidc-client", "vaults/IAmWorkin/items/media-oidc-client"),
["fc-distribution"] = ("distribution-oidc-client", "vaults/IAmWorkin/items/distribution-oidc-client"),
};
foreach (var expected in expectedItems)
{
var item = AppDocuments(expected.Key)
.Single(document => document.Kind == "OnePasswordItem" && document.Name == expected.Value.Name);
item.Scalar("spec", "itemPath").Should().Be(expected.Value.ItemPath);
}
}
[Fact]
public void DnsAndMediaGitOpsAdoption_PreservesLiveStorageAndImageShape()
{
var dnsDeployment = AppDocuments("fc-dns")
.Single(document => document.Kind == "Deployment" && document.Name == "dns-web");
var dnsContainer = dnsDeployment.MainContainerMappings().Should().ContainSingle().Subject;
var dnsPvc = AppDocuments("fc-dns")
.Single(document => document.Kind == "PersistentVolumeClaim" && document.Name == "dns-web-data");
ManifestNodeExtensions.Scalar(dnsContainer, "image").Should().Be("localhost/fc-dns-web:v20260613-g5-quota-aa99bd1");
dnsPvc.Scalar("spec", "storageClassName").Should().Be("longhorn");
dnsPvc.Scalar("spec", "resources", "requests", "storage").Should().Be("1Gi");
var mediaDeployment = AppDocuments("fc-media")
.Single(document => document.Kind == "Deployment" && document.Name == "fc-media-web");
var mediaContainer = mediaDeployment.MainContainerMappings().Should().ContainSingle().Subject;
var mediaPvc = AppDocuments("fc-media")
.Single(document => document.Kind == "PersistentVolumeClaim" && document.Name == "fc-media-data");
ManifestNodeExtensions.Scalar(mediaContainer, "image").Should().Be("localhost/fc-media-web:v20260604-oidc-proper");
mediaPvc.Scalar("spec", "storageClassName").Should().Be("longhorn");
mediaPvc.Scalar("spec", "resources", "requests", "storage").Should().Be("20Gi");
mediaDeployment.AllScalars().Should().Contain(new[]
{
"/volume1/kubernetes/fc-media-transcodes",
"/volume1/kubernetes/fc-media-inbox",
"/volume1/video",
});
var distributionDeployment = AppDocuments("fc-distribution")
.Single(document => document.Kind == "Deployment" && document.Name == "fc-distribution");
var distributionContainer = distributionDeployment.MainContainerMappings().Should().ContainSingle().Subject;
ManifestNodeExtensions.Scalar(distributionContainer, "image").Should().Be("localhost/fc-distribution:v20260604-oidc-root-anon");
}
[Fact]
public void MonitoringProbes_UseHealthzForOidcGatedHosts()
{
var monitoring = File.ReadAllText(Path.Combine(Inventory.BluejayRoot, "apps", "monitoring", "noc-monitoring.yaml"));
monitoring.Should().Contain("\"https://dns.iamworkin.lan/healthz\"");
monitoring.Should().Contain("\"https://dist.iamworkin.lan/healthz\"");
monitoring.Should().Contain("\"https://media.iamworkin.lan/healthz\"");
monitoring.Should().NotContain("\"https://dns.iamworkin.lan/\"");
monitoring.Should().NotContain("\"https://dist.iamworkin.lan/\"");
monitoring.Should().NotContain("\"https://media.iamworkin.lan/\"");
}
[Fact]
public void DistributionPublicIngress_KeepsGetHeadMethodAllowlist()
{
var publicIngress = AppDocuments("fc-distribution")
.Single(document => document.Kind == "IngressRoute" && document.Name == "fc-distribution-public");
var route = publicIngress.MappingSequence("spec", "routes").Should().ContainSingle().Subject;
var match = ManifestNodeExtensions.Scalar(route, "match");
match.Should().Contain("Host(`dist.flowercore.io`)");
match.Should().Contain("Method(`GET`)");
match.Should().Contain("Method(`HEAD`)");
match.Should().NotContain("Method(`POST`)");
}
[Fact]
public void DnsAndMediaIngressRoutes_MatchLiveInternalHosts()
{
var dnsRoute = AppDocuments("fc-dns")
.Single(document => document.Kind == "IngressRoute" && document.Name == "dns-web")
.MappingSequence("spec", "routes")
.Should() .Should()
.Be("http://gitea-clusterip.gitea.svc.cluster.local:3000/bluejay/bluejay-infra.git"); .ContainSingle()
application.Scalar("spec", "source", "path").Should().Be("apps/fc-devicemgmt"); .Subject;
application.Scalar("spec", "destination", "namespace").Should().Be("fc-devicemgmt"); var mediaRoute = AppDocuments("fc-media")
.Single(document => document.Kind == "IngressRoute" && document.Name == "fc-media-web")
.MappingSequence("spec", "routes")
.Should()
.ContainSingle()
.Subject;
ManifestNodeExtensions.Scalar(dnsRoute, "match").Should().Be("Host(`dns.iamworkin.lan`)");
ManifestNodeExtensions.Scalar(mediaRoute, "match").Should().Be("Host(`media.iamworkin.lan`)");
} }
private static IEnumerable<string> ProbeViolations( private static IEnumerable<string> ProbeViolations(
@@ -657,12 +1058,44 @@ public sealed class FleetManifestLintTests
: null; : null;
} }
private static string? EnvSecretOptional(YamlMappingNode container, string name)
{
return EnvMapping(container, name) is { } env
? ManifestNodeExtensions.Scalar(env, "valueFrom", "secretKeyRef", "optional")
: null;
}
private static string? ProbePath(YamlMappingNode container, string probeKey)
{
return ManifestNodeExtensions.Scalar(container, probeKey, "httpGet", "path");
}
private static IReadOnlyList<ManifestDocument> AppDocuments(string app)
{
return Inventory.Documents
.Where(document => document.RelativePath.StartsWith($"{app}/", StringComparison.Ordinal))
.ToList();
}
private static YamlMappingNode? EnvMapping(YamlMappingNode container, string name) private static YamlMappingNode? EnvMapping(YamlMappingNode container, string name)
{ {
return ManifestNodeExtensions.MappingSequence(container, "env") return ManifestNodeExtensions.MappingSequence(container, "env")
.SingleOrDefault(env => string.Equals(ManifestNodeExtensions.Scalar(env, "name"), name, StringComparison.Ordinal)); .SingleOrDefault(env => string.Equals(ManifestNodeExtensions.Scalar(env, "name"), name, StringComparison.Ordinal));
} }
private static string? PodAnnotation(ManifestDocument document, string name)
{
return document.Scalar("spec", "template", "metadata", "annotations", name);
}
private static string? ProbeHttpGetPath(YamlMappingNode container, string probeKey)
{
return ManifestNodeExtensions.TryGetMapping(container, probeKey, out var probe)
&& ManifestNodeExtensions.TryGetMapping(probe, "httpGet", out var httpGet)
? ManifestNodeExtensions.Scalar(httpGet, "path")
: null;
}
private static IReadOnlyList<ManifestDocument> FcDeviceManagementDocuments() private static IReadOnlyList<ManifestDocument> FcDeviceManagementDocuments()
{ {
return Inventory.Documents return Inventory.Documents
@@ -892,6 +1325,22 @@ internal sealed record ManifestDocument(
.ToList(); .ToList();
} }
// MainContainerMappings excludes initContainers. Use this when asserting
// properties of the primary container (env, image, volumeMounts) where an
// initContainer would be a false-positive match — e.g. the GitHub runner
// image's `setup-runner-home` initContainer should not count toward the
// single-container assertions on the runner deployments.
public IReadOnlyList<YamlMappingNode> MainContainerMappings()
{
var podSpec = PodSpec();
if (podSpec is null)
{
return Array.Empty<YamlMappingNode>();
}
return ManifestNodeExtensions.MappingSequence(podSpec, "containers").ToList();
}
public IReadOnlyList<ContainerSpec> ContainerSpecs() public IReadOnlyList<ContainerSpec> ContainerSpecs()
{ {
return ContainerMappings() return ContainerMappings()