Three additions to the monitoring ConfigMap, each targeting
FlowerCore.RemoteDesktop:
- **Scrape jobs** (2 new):
- probe-remotedesktop: blackbox http_2xx against
https://desktop.iamworkin.lan/health every 30s. Feeds the
RemoteDesktopWebDown alert.
- fc-remotedesktop: direct /metrics scrape against
desktop.iamworkin.lan for the fc_desktop_session_events_total
and fc_desktop_pool_* series.
- **Alert group `remote-desktop`** (7 rules in alerts.yml):
- RemoteDesktopWebDown (3m) — /health probe failing
- RemoteDesktopMetricsStale (10m) — absent metrics series
- RemoteDesktopPoolDepleted (5m) — pool deficit + depleted flag
- RemoteDesktopPoolDeficitSustained (10m, info) — persistent
below-desired pool size
- RemoteDesktopSessionChurnSpike (5m, info) — launch rate
>20/min
- RemoteDesktopRecordingEventsDropped (15m, info) — 30m without
recording events while launches active
- RemoteDesktopTlsExpiry (6h, critical) — <2d cert renewal
window; aligns with feedback_acme_expiry_alert_threshold
- **Grafana dashboard mount**: new volumeMounts + volumes entry for
`dashboards-remotedesktop` backed by the grafana-dashboard-remotedesktop
ConfigMap (previously added as a standalone file in d4210c8).
Folder path /var/lib/grafana/dashboards/remotedesktop — picked up
by the file-provider with foldersFromFilesStructure:true so the
dashboard shows up in a "Remotedesktop" folder in Grafana.
No CRLF churn; pure 100-line insertion into LF-normalized file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New surfaces: POST /api/v1/bible/projects (one-click whole-book render),
GET /api/v1/bible/books, GET /api/v1/bible/books/{book}/preview, MCP
tools render_tts_reader_bible_book + list_tts_reader_bible_books,
Dashboard "Render a Bible book" card. 107/107 tests, +7 from previous.
Wraps apps/monitoring/flowercore-remotedesktop-grafana-dashboard.json
as a ConfigMap manifest so ArgoCD syncs it into the cluster alongside
the existing grafana-dashboard-* ConfigMaps. Standalone file — does
NOT modify noc-monitoring.yaml. That keeps the CRLF churn on
noc-monitoring.yaml (sibling files apps/intranet/intranet.yaml and
apps/agent-zero/configmaps-bluejay.yaml also carry CRLF churn) out
of this commit.
Dashboard will be synced into the cluster but NOT loaded by Grafana
until a matching `volumes:` entry lands in the Grafana Deployment
in noc-monitoring.yaml:
- name: dashboard-remotedesktop
configMap:
name: grafana-dashboard-remotedesktop
Plus a `volumeMounts:` entry in the grafana container:
- name: dashboard-remotedesktop
mountPath: /etc/grafana/provisioning/dashboards/remotedesktop
readOnly: true
Those edits are deferred to the CRLF-normalization pass on
bluejay-infra so the review diff stays reviewable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls in FlowerCore.TtsReader@9e2497f: P2.3 iTunes-namespace podcast
feed (author, summary, category, cover art, episode numbering,
duration, atom:self link, serial channel type for Bible projects) and
P2.4 ID3v2 tags on MP3 export + Vorbis comments on OGG (title, artist
with Piper voice humanized, album, track N/M, genre defaulting to
Religion & Spirituality for Bible or Audiobook for text sources,
date). Phones and podcast apps now show proper track info instead of
"Unknown - Unknown".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New Zabbix 7.2 template under `Templates/FlowerCore` that scrapes
the `/metrics` exposition from FlowerCore.RemoteDesktop and extracts:
- `fc_desktop_session_events_total` split by event (launch/connect/
disconnect/recording), with a dedicated datapoint for the
`browser_datasource="json"` slice to track delegated-auth launches.
- `fc_desktop_pool_ready` gauge sum for warm pools.
Trigger: `nodata(flowercore.remotedesktop.metrics,10m)=1` warns when
the public desktop host stops exposing metrics.
Follows the existing `flowercore-print-ollama.yaml` pattern — import
manually into Zabbix and link to the Print/Desktop host. Not a K8s
manifest; ArgoCD ignores.
Grafana dashboard JSON is drafted at
`apps/monitoring/flowercore-remotedesktop-grafana-dashboard.json`
but still needs a ConfigMap wrap + Grafana Deployment volume mount
in noc-monitoring.yaml before it ships (follow-up).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls in FlowerCore.TtsReader@63e6b62: P1.1 Media Session API wiring in
fc-media-session.js + quick-player.js + rendered-chapter-player.js, and
P1.2 biblical-name pronunciation lexicon auto-seed on Bible-source
project creation plus apply-bible-defaults endpoint + MCP tool for
existing projects. Tests 81 -> 97 all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live verification 2026-04-24 caught POST /blobs on dist.flowercore.io
returning 201 Created with the blob persisted — admin write operations
reachable on the public surface. Controller-level strict entitlement
was on, but that gates reads; writes weren't blocked at all.
Fix: add Method(GET) || Method(HEAD) to the Host match on the public
IngressRoute. POST/PUT/PATCH/DELETE now miss every route for
dist.flowercore.io and Traefik returns 404 before the pod sees the
request. Edge-level defense-in-depth on top of the controller's
strict-mode entitlement check.
The internal IngressRoute for dist.iamworkin.lan stays unrestricted —
admin POST /blobs + POST /manifests flows keep working from the lab.
Lights up dist.flowercore.io end-to-end:
- cf-origin-flowercore-io Secret (literal *.flowercore.io Origin Cert,
copied from the telephony/gitea-public/matrix/mail/flowercore/fc-landing
pattern — not via OnePasswordItem yet).
- Traefik Middleware dist-public-profile-header: strips any caller-supplied
X-FC-Distribution-Profile, injects 'public' so the controller's
NamedEntitlementResolverRouter routes to the strict resolver.
- IngressRoute fc-distribution-public: Host(`dist.flowercore.io`) ->
same backing Service as the internal dist.iamworkin.lan route.
Middleware attached; cert secret cf-origin-flowercore-io.
Cloudflare DNS A record dist.flowercore.io -> 74.40.140.24 (proxied)
already created 2026-04-24 via Cloudflare API (record id
e9b957511556f37ff6763f4441acbc45).
Controller entitlement config is still DefaultAllow=false + empty
PublicEditions on the 'public' profile, so every public request
returns 403 by default. Populate FlowerCore__Distribution__EntitlementPublic__PublicEditions__0
via env var when ready to expose specific editions.
- Serves GET /manifests/{edition}/{version}.cert (leaf+intermediate PEM)
- Adds CertChainPem migration on startup (nullable column)
- ManifestSignService now embeds version-specific certChainUrl
Provisioning Agent's verify step will flip from ChainNotServed (Phase 2A
soft-pass) to Valid once a fresh edition is published with this image.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The full FQDN fc-llm-bridge.fc-llm-bridge.svc.cluster.local has 4 dots,
which is less than the pod's ndots:5 threshold. The resolver then
applies every entry in the search list BEFORE falling through to the
bare FQDN, and the CoreDNS 'template iamworkin.lan' catch-all matches
"...svc.cluster.local.iamworkin.lan" and returns Traefik VIP
10.0.56.200. The egress NetworkPolicy blocks that VIP (0.0.0.0/0
EXCEPT 10.0.0.0/8), so curl hangs for 30-134s and returns HTTP 000.
Reference: feedback_coredns_ndots_template_collision memory.
Fix: use "fc-llm-bridge.fc-llm-bridge.svc" (2 dots, still <5 so search
expansion still fires, but the first suffix "...svc.cluster.local"
hits the Kubernetes plugin in CoreDNS and returns the real ClusterIP
10.43.67.125 before the iamworkin.lan template is ever consulted).
Verified: pod-exec curl fc:cheap → HTTP 200 with a real chat.completion
envelope (Ollama/gemma3:4b via bridge).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chat_model flip (62db15c) pointed Agent Zero at
fc-llm-bridge.fc-llm-bridge.svc.cluster.local:8080 but the existing
agent-zero-netpol only allowed egress to specific node IPs
(10.0.56.20:11434, 10.0.57.17:11434, 10.0.57.16:5200, 10.0.56.11:6443)
plus public-internet (with RFC1918 exclusion). ClusterIP traffic to
10.43.0.0/16 was implicitly denied, so pod-exec curl to the bridge
timed out after 134s.
Adds an egress rule allowing TCP 8080 to the fc-llm-bridge namespace
(matched by kubernetes.io/metadata.name which K8s 1.22+ sets
automatically). No ingress changes needed — fc-llm-bridge has no
NetworkPolicy, so the ingress side is already open.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Flips Agent Zero's chat_model from direct local Ollama (gemma3:12b via
the 127.0.0.1:11434 sidecar proxy) to the FlowerCore LLM Bridge
(fc:balanced tier, OpenAI-compatible, Anthropic Claude Sonnet under the
hood) so chat turns are spend-tracked and can dispatch to any provider
via a single tier alias.
Scope is intentionally minimal and reversible:
- chat_model: ollama/gemma3:12b/127.0.0.1:11434
→ openai/fc:balanced/fc-llm-bridge internal service URL
- utility_model, embedding_model, browser_model: UNCHANGED
(stay on local 127.0.0.1 Ollama sidecar — no spend, low latency,
not worth routing through the bridge for small-model traffic).
Auth: new A0_SET_chat_model_api_key env var wired to the
fc-llm-bridge-api-keys Secret (field: agent-zero-k8s). The Secret is
synced by a new OnePasswordItem pointing at "FC LLM Bridge API Keys"
in the IAmWorkin vault. Bearer-token auth is now accepted by the
bridge (FlowerCore.LlmBridge@3225f1f).
Rollback: revert this commit; old image v202604231449 is still present
on all RKE2 nodes, and Agent Zero's strategy: Recreate makes the flip
atomic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ships the Bearer-token auth fix (FlowerCore.LlmBridge@3225f1f) so Agent
Zero's OpenAI provider can authenticate with Authorization: Bearer in
addition to the original X-Api-Key header.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pods in this cluster inherit ndots=5. External FQDNs with <5 dots (like
api.anthropic.com) are expanded through the search path first, and the 4th
suffix `api.anthropic.com.iamworkin.lan` matches CoreDNS' `template IN A
iamworkin.lan` wildcard — resolves to Traefik VIP 10.0.56.200. TLS connect
lands on Traefik's default cert and the AnthropicClient rejects with
RemoteCertificateNameMismatch/RemoteCertificateChainErrors.
Setting ndots=2 makes the resolver try the bare FQDN first (3 dots in
api.anthropic.com), so the search path never fires.
Reference: memory feedback_coredns_ndots_template_collision. Wider follow-up:
the CoreDNS template plugin should add fallthrough for external public suffixes,
so every FC service calling external HTTPS APIs stops hitting this trap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 1Password item "Claude API Key" stores the key in a standard Password
field (labeled `password`), so the OnePasswordItem operator creates the K8s
Secret with key `password`. Deployment was referencing `credential`, which
made the pod fail with CreateContainerConfigError.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Built from FlowerCore.LlmBridge@6d285b5 (initial scaffold). Imported on all
three RKE2 nodes via podman save + ctr import. Replaces v00000000000000
placeholder — ArgoCD sync will roll the pod.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Staged but NOT applied. Do not git push until the two pre-requisites below
are done. See apps/fc-llm-bridge/README.md for the full order-of-ops.
Manifests (apps/fc-llm-bridge/fc-llm-bridge.yaml, 8 docs):
- Namespace fc-llm-bridge
- OnePasswordItem anthropic-api-key (existing Claude API Key item)
- OnePasswordItem fc-llm-bridge-api-keys (NEW item, pending creation)
- PersistentVolumeClaim fc-llm-bridge-data (2Gi longhorn)
- Deployment fc-llm-bridge (port 8080, uid 1654, readOnlyRootFilesystem,
tcpSocket probes to survive future ApiKeyAuthMiddleware reordering)
- Service fc-llm-bridge ClusterIP
- Certificate fc-llm-bridge-cert (step-ca-acme)
- IngressRoute fc-llm-bridge (fc-llm-bridge.iamworkin.lan, websecure)
Pre-requisites BEFORE git push:
1. pfSense Unbound override fc-llm-bridge.iamworkin.lan -> 10.0.56.200
(currently NXDOMAIN -- verified via nslookup and check-pfsense-dns.py).
Skipping this step puts cert-manager HTTP-01 into ~2h backoff.
2. Create 1Password item `FC LLM Bridge API Keys` in vault IAmWorkin with
password fields: agent-zero-ws, agent-zero-k8s, spare-1, spare-2.
3. Build + import localhost/fc-llm-bridge:v<tag> to rke2-server +
rke2-agent1 + rke2-agent2. Bump image tag from placeholder
v00000000000000 before committing the apply.
Related: ADR-088 (FlowerCore.Notes/ARCHITECTURE.md), design doc at
FlowerCore.Notes/docs/ai-agents/agent-zero-anthropic-bridge.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Synology NAS is configured with community bluejay_monitor
(→ snmp.yml auth 'bluejay_v2'), not public. public_v2 was returning
HTTP 500 from snmp-exporter for this target. Verified bluejay_v2
returns metrics.
Keeps printer (10.0.58.107) on public_v2 — Epson ET-3750 uses
community "public" as documented in its SNMP settings.
Same CoreDNS iamworkin.lan template + ndots:5 hijack as the irc-notify fix.
Anope services (nickserv/chanserv/memo) have been disconnected from unrealircd
for weeks ("Host is unreachable" every 3s). Thelounge server defaults pointed
at the same broken FQDN.
Short name unrealircd.irc.svc resolves to the ClusterIP directly.