Hotfix for two live render errors:
- Kokoro chapter render failed with "count ('-1') must be non-negative":
the streaming-WAV chunk-size sentinel (0xFFFFFFFF) was read as a signed -1.
- Piper render timed out on book-chapter paragraphs with no sentence
punctuation: the whole paragraph became one giant segment that exceeded
the 2-min timeout.
Source fix: FlowerCore.TtsReader@826589b. 153/153 tests green.
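For reference, the Kokoro failure mode can be reproduced in a few lines.
This is an illustrative Python sketch, not the TtsReader code (which is
not shown in this repo); the helper name read_data_chunk_size is
hypothetical. It shows the same four bytes decoding to -1 when read as
signed and to 4294967295 when read as unsigned, plus one possible way to
treat the sentinel as "size unknown, read to end of stream".

```python
import struct
from typing import Optional

STREAMING_SENTINEL = 0xFFFFFFFF  # size field used by streaming WAV output


def read_data_chunk_size(size_field: bytes) -> Optional[int]:
    """Decode a 4-byte little-endian RIFF chunk size.

    Returns None for the streaming sentinel, meaning the caller should
    read until end of stream instead of trusting the size field.
    Hypothetical helper, for illustration only.
    """
    (size,) = struct.unpack("<I", size_field)  # unsigned 32-bit read
    return None if size == STREAMING_SENTINEL else size


raw = b"\xff\xff\xff\xff"
signed = struct.unpack("<i", raw)[0]    # signed read: the bug
unsigned = struct.unpack("<I", raw)[0]  # unsigned read: the sentinel
```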
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Guacamole 1.6 renders .button.home/.button.logout/.button.reconnect icons
via an absolutely-positioned ::before pseudo-element with width:1.8em
and background-position:.5em .45em. The Blue Jay branding CSS was
clamping every .notification button::before to display:inline-flex
and width:1rem, so only the top-left sliver of the sprite rendered. It
showed up as a green/purple garbage rectangle on the Home button of the
connection-not-found page. Reconnect escaped because it doesn't carry a
background-image on its ::before. The rule was redundant anyway: the
.notification button flex row + padding already spaces the native icon
cleanly. Only the custom .fc-embed-logout-disabled::before override
remains (intentionally dims the replacement disabled-logout pseudo-element).
v202604240140longchunk still hit 400 Bad Request from nomic-embed-text
on several batches — the chars/4 token estimate was optimistic for
code-heavy/Unicode content. Rebuilt from FlowerCore.Common@e1c28b4
which tightens MarkdownChunker hard cap (ChunkSizeTokens × 2, clamped
at 16000 chars) AND adds a character-length check in IndexBuilder's
safety filter alongside the estimated-tokens check.
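A rough sketch of the two checks described above, in Python rather than
the FlowerCore.Common source. The function names are hypothetical, and
the 8000-token and 16000-char thresholds passed to the filter in the
example are assumptions: this message only states the chunker clamp, not
the filter's character limit.

```python
def estimate_tokens(text: str) -> int:
    """The chars/4 heuristic that proved optimistic for dense content."""
    return len(text) // 4


def hard_cap_chars(chunk_size_tokens: int, clamp: int = 16000) -> int:
    """ChunkSizeTokens * 2, clamped at 16000 chars."""
    return min(chunk_size_tokens * 2, clamp)


def passes_safety_filter(text: str, max_tokens: int, max_chars: int) -> bool:
    """Keep a chunk only if BOTH the estimated-token check and the
    character-length check pass."""
    return estimate_tokens(text) <= max_tokens and len(text) <= max_chars


# A 20000-char chunk estimates to 5000 tokens, so the token check alone
# would let it through; the character check is what rejects it.
long_chunk = "x" * 20000
token_check_only = estimate_tokens(long_chunk) <= 8000
both_checks = passes_safety_filter(long_chunk, max_tokens=8000, max_chars=16000)
```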
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v202604240135longchunk image shipped with only 1 file in the baked
corpus (NEXT-SPRINT.md) because the corpus tar was accidentally built
from the Intranet.Web working directory instead of the Notes repo
root. Rebuilt from the right cwd; new image has the expected 370
*.md + *.html files at /srv/flowercore-notes/docs/.
Same long-chunk handling code as v202604240135longchunk; just a clean
rebuild.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Image bump v202604240108gpu -> v202604240135longchunk, rebuilt from
FlowerCore.Intranet.Web@feat/shared-indexing-search HEAD which transitively
picks up FlowerCore.Common@feat/shared-indexing@105af75:
- MarkdownChunker hard-caps oversized heading-bounded sections at
ChunkSizeTokens × 4 chars and splits with overlap (same pattern as
JsonArticleChunker). Stops the indexer from producing chunks above
nomic-embed-text's 8192-token input limit at the source.
- IndexBuilder gains IndexingOptions.MaxEmbeddingTokens (default 8000)
safety filter — chunks above the cap are warn-logged and dropped
before any batch is sent. New IndexBuildResult.ChunksDropped tracks
how many got skipped.
Goal: notes-md should index 2541/2541 chunks (vs. 2080/2541 last pass)
with zero "Failed to embed batch" 400s.
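The two mechanisms above can be sketched as follows. This is Python for
illustration, not the MarkdownChunker or IndexBuilder code; the function
names are hypothetical, and the chars/4 estimator used in the example is
an assumption about how tokens are counted.

```python
from typing import Callable, List, Tuple


def split_with_overlap(text: str, max_chars: int, overlap: int) -> List[str]:
    """Cut text into pieces of at most max_chars, each sharing `overlap`
    characters with the previous piece."""
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be >= 0 and smaller than max_chars")
    pieces: List[str] = []
    start = 0
    while start < len(text):
        pieces.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last piece reached the end of the text
        start += max_chars - overlap
    return pieces


def drop_oversized(
    chunks: List[str],
    estimate_tokens: Callable[[str], int],
    max_embedding_tokens: int = 8000,
) -> Tuple[List[str], int]:
    """Return (kept chunks, number dropped), in the spirit of ChunksDropped."""
    kept = [c for c in chunks if estimate_tokens(c) <= max_embedding_tokens]
    return kept, len(chunks) - len(kept)


pieces = split_with_overlap("abcdefghij", max_chars=4, overlap=1)
kept, dropped = drop_oversized(["a" * 100, "b" * 40000], lambda c: len(c) // 4)
```

With the toy input, each piece repeats the last character of the one
before it, and the 40000-char chunk is the only one dropped.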
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single-host routing via desktop.iamworkin.lan/guacamole has been
live-proven (curl → 200) and the Codex single-host-guacamole-wip
merge flipped RemoteDesktop.Web's GuacamolePublicUrl + defaults to
the new path. Nothing else in FlowerCore actively requires the
legacy guac.iamworkin.lan URL.
Removed from the guacamole app:
- IngressRoute `guacamole` matching Host(guac.iamworkin.lan)
- Middleware `guac-add-prefix` (only the legacy route referenced it)
- Certificate `guacamole-tls` (only covered guac.iamworkin.lan)
ArgoCD prune will delete the live resources on next sync. The
pfSense DNS override for guac.iamworkin.lan should be removed
via FlowerCore.DNS as a follow-up operator step — not managed by
this repo.
The new `guacamole-desktop-path` IngressRoute + `desktop-guacamole-path-tls`
Certificate (added in e65de29) handle all Guacamole traffic going
forward.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two-part fix on top of the live Shared.Indexing rollout:
1. Image bump v202604240050corpus -> v202604240108gpu, rebuilt from
FlowerCore.Intranet.Web@feat/shared-indexing-search (HEAD includes
the FilePatterns array-merge fix in IntranetSearchOptions). At
runtime each DocCorpusRoot now sees ONLY the patterns explicitly
set in appsettings.json — notes-md gets ["*.md"], notes-html gets
["*.html"], no accidental cross-bleed.
2. New IntranetSearch__OllamaBaseUrl env var pointing at
http://10.0.56.20:11434 (BLUEJAY-WS GPU, R9700 32GB VRAM). Verified
reachable from the cluster and nomic-embed-text:latest is pulled.
This is the workaround for memory feedback_pi5_nomic_embed_slow:
edge1 Pi 5 takes ~189s per 32-chunk batch, projecting full notes-md
indexing (5665 chunks) at ~9 hours; the GPU should land it in minutes.
Edge1 stays the chat default; this env var only redirects the
indexer's bulk embedding calls.
Image distributed to all three RKE2 nodes.
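The ~9 hour projection follows directly from the measured batch time. A
quick check of the arithmetic, using only the figures quoted above:

```python
import math

chunks = 5665
batch_size = 32
seconds_per_batch = 189  # approximate per-batch time measured on edge1

batches = math.ceil(chunks / batch_size)
total_hours = batches * seconds_per_batch / 3600
print(f"{batches} batches, about {total_hours:.1f} hours")
```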
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cluster Traefik disallows cross-namespace service refs from
IngressRoutes, so the PathPrefix(/guacamole) rule I added to
fc-desktop IngressRoute in 292528e failed with:
"service guacamole/guacamole not in the parent resource namespace
fc-desktop"
Move the /guacamole path match into the guacamole namespace where
the Service actually lives:
- apps/guacamole/guacamole.yaml adds a new `guacamole-desktop-path`
IngressRoute matching `Host(desktop.iamworkin.lan) &&
PathPrefix(/guacamole)` → guacamole:8080 (no add-prefix middleware;
the browser already sends the /guacamole/* path that Guacamole's
servlet serves at).
- New Certificate `desktop-guacamole-path-tls` for desktop.iamworkin.lan
in the guacamole namespace, issued by step-ca-acme. Separate cert
from fc-desktop's remotedesktop-web-tls because Secret refs are
also scoped per-namespace; duplicating the cert is cheaper than
enabling cross-namespace secret refs cluster-wide.
- Revert the cross-namespace attempt in apps/fc-desktop/fc-desktop.yaml
back to a Host-only route. Traefik's router matching precedence
(longer/more-specific rule wins) handles the /guacamole vs
catch-all priority without explicit priority: fields.
Closes the single-host Guacamole URL regression Codex's branch
introduced — GuacamolePublicUrl=https://desktop.iamworkin.lan/guacamole
now resolves to the Guacamole webapp end-to-end.
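Why no explicit priority is needed: as far as I understand Traefik's
documented default, routers without a priority are sorted by descending
rule length. The Python sketch below only models that ordering; the rule
strings are approximations of the two manifests, and default_priority is
a hypothetical stand-in, not a Traefik API.

```python
# Approximate rule strings for the two IngressRoutes.
routes = {
    "fc-desktop": "Host(`desktop.iamworkin.lan`)",
    "guacamole-desktop-path": (
        "Host(`desktop.iamworkin.lan`) && PathPrefix(`/guacamole`)"
    ),
}


def default_priority(rule: str) -> int:
    """Model of the default: priority equals the length of the rule."""
    return len(rule)


# Highest priority is evaluated first.
ordered = sorted(routes, key=lambda name: default_priority(routes[name]), reverse=True)
```

The path-scoped rule is the longer string, so it is evaluated before the
Host-only catch-all.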
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single-host Guacamole routing — Traefik matches Host=desktop.iamworkin.lan
+ PathPrefix=/guacamole first (priority 20) and forwards to the
guacamole Service in the guacamole namespace on 8080. The existing
Host-only catch-all rule drops to priority 10 so Guacamole traffic
resolves to the more-specific match.
Mirrors the IngressRoute in FlowerCore.RemoteDesktop@master (merged
as part of codex/single-host-guacamole-wip). The RemoteDesktop repo
copy is deploy-ref only — ArgoCD owns the live IngressRoute via
this manifest. Without this change, GuacamolePublicUrl=
https://desktop.iamworkin.lan/guacamole returns 404 because Traefik
routes the whole Host to remotedesktop-web.
Unblocks the per-template AAT smoke test against the new public URL
path and closes the final live piece of Codex's single-host routing
work.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds TtsReader__Kokoro__Enabled=true + BaseUrl=http://10.0.56.20:10401
+ TimeoutSeconds=120 so the pod routes kokoro-tagged voices to the
Kokoro-FastAPI backend running on BLUEJAY-WS. Multi-engine router
falls through to Piper for piper-tagged and untagged voices.
Requires nftables on BLUEJAY-WS to permit tcp/10401 from 10.0.56.0/23
and 10.42.0.0/16. Applied to the live ruleset — Puppet Hiera path is
the durable fix (kokoro_server_enabled under profile::security::firewall).
Tests 107 → 114 (+7 MultiEngineSpeechSynthesizerTests).
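The routing rule above amounts to the following decision, sketched in
Python. pick_engine is a hypothetical name, not the
MultiEngineSpeechSynthesizer API, and falling back to Piper when Kokoro
is disabled is an assumption on my part.

```python
from typing import Optional


def pick_engine(voice_tag: Optional[str], kokoro_enabled: bool = True) -> str:
    """Route kokoro-tagged voices to Kokoro; everything else, including
    untagged voices, falls through to Piper."""
    if kokoro_enabled and voice_tag == "kokoro":
        return "kokoro"
    return "piper"


choices = [pick_engine("kokoro"), pick_engine("piper"), pick_engine(None)]
```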
Companion to the Prometheus alert rules landed in e44e9a0. Those
rules loaded, but their alerts were never delivered: the monitoring
stack has no Alertmanager configured, and **Grafana** owns alert
routing via its built-in engine + webhook contact point to
irc-notify.monitoring.svc:9119. Without a matching Grafana alert,
the Prometheus rules just show up in the Prometheus UI and page
no one.
Adds 6 Grafana alert rules in a new `RemoteDesktop` group under
the AI Stack Alerts folder:
- remotedesktop-web-down (3m) — probe_success{job="probe-remotedesktop"} < 1
- remotedesktop-metrics-stale (10m) — fc_desktop_session_events_total series absent
- remotedesktop-pool-depleted (5m) — fc_desktop_pool_depleted > 0
- remotedesktop-pool-deficit-sustained (10m info) — fc_desktop_pool_deficit > 0
- remotedesktop-session-churn-spike (5m info) — launch rate > 20/min
- remotedesktop-tls-expiry (6h critical) — cert < 2 days to expiry
Each uses the standard Grafana 3-stage pipeline (query → reduce →
threshold) matching the existing AI Stack + Infrastructure alert
patterns. Labels: service=remotedesktop + severity (warning/info/critical).
Default route is `IRC #alerts` via the existing webhook contact point.
Parity with the Prometheus rules (which already fire internally
for the Prometheus UI + any future Alertmanager integration).
Grafana picks up the new provisioning on its next restart.
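For readers unfamiliar with the pipeline, the three stages reduce to
"take a series, collapse it to one number, compare against a limit". A
toy Python model follows; the sample values are made up, the "last"
reducer is an assumption about the rule config, and the pending
durations (3m, 5m, ...) are not modelled.

```python
from typing import Callable, Sequence


def evaluate(
    samples: Sequence[float],
    reduce: Callable[[Sequence[float]], float],
    condition: Callable[[float], bool],
) -> bool:
    """query result -> reduce to a single value -> threshold check."""
    return condition(reduce(samples))


def last(samples: Sequence[float]) -> float:
    """Reducer that keeps only the most recent sample."""
    return samples[-1]


# fc_desktop_pool_depleted > 0, with a made-up series ending in 1.
pool_depleted_firing = evaluate([0, 0, 1], last, lambda v: v > 0)
# probe_success < 1, with a made-up healthy series.
web_down_firing = evaluate([1, 1, 1], last, lambda v: v < 1)
```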
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cluster egress to github.com is fronted by a step-ca TLS proxy that
returns 404 page not found for unmatched routes — git clone of the
public FlowerCore.Notes repo failed inside the pod even with
GIT_SSL_NO_VERIFY=true. Rather than chase the egress NetworkPolicy /
proxy config, bake the docs corpus directly into the image at
/srv/flowercore-notes/docs.
The corpus is just *.md + *.html (369 files, 2.7 MB uncompressed) —
small enough that re-baking on every deploy is fine and avoids any
runtime network dependency.
Manifest changes:
- Image bump: v202604240040search -> v202604240050corpus
- Removed initContainers (clone-notes-corpus is now redundant)
- Removed notes-corpus emptyDir + its volumeMounts
- Vector-store PVC mount stays.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two egress allows to monitoring-netpol so Prometheus can scrape
FlowerCore.RemoteDesktop:
1. fc-desktop namespace on port 8080 — direct ClusterIP service
target (remotedesktop-web.fc-desktop:8080).
2. traefik-system namespace pods on ports 8080 + 8443 — covers the
Traefik VIP hairpin path for the `https://desktop.iamworkin.lan`
scrape target (CoreDNS wildcard resolves iamworkin.lan hostnames
to the LB VIP; after kube-proxy DNAT, egress needs the backend
pod port allowed per feedback_netpol_dnat_backend_port).
Without these, the fc-remotedesktop scrape times out with "context
deadline exceeded" even though the monitoring-netpol already allows
the 10.0.56.0/24 CIDR — post-DNAT the destination is a 10.42.x.x
pod IP, not the VIP.
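The CIDR mismatch is easy to see with Python's ipaddress module. Both
addresses below are hypothetical examples (the real VIP and pod IPs are
not listed in this message); only the 10.0.56.0/24 allow comes from the
existing policy.

```python
import ipaddress

allowed_cidr = ipaddress.ip_network("10.0.56.0/24")  # existing egress allow

vip = ipaddress.ip_address("10.0.56.100")         # hypothetical LB VIP
backend_pod = ipaddress.ip_address("10.42.1.17")  # hypothetical Traefik pod IP

# Before DNAT the packet targets the VIP, which the CIDR covers.
vip_allowed = vip in allowed_cidr
# After DNAT it targets the pod IP, which the CIDR does not cover.
pod_allowed = backend_pod in allowed_cidr
```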
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cluster egress is fronted by a step-ca TLS proxy whose cert doesn't
match github.com. The init container's git clone failed with
"SSL: no alternative certificate subject name matches target hostname
'github.com'". The Notes repo is public — there is no secret to
protect on the wire — so GIT_SSL_NO_VERIFY=true is the right tradeoff
here. Tag at v202604240040search.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3 lane 1 of FlowerCore.Shared.Indexing rollout — wires the new
search consumer in FlowerCore.Intranet.Web to live infrastructure.
Manifest changes:
- Image bump: localhost/fc-intranet-web:latest -> :v202604240040search.
Built from FlowerCore.Intranet.Web@feat/shared-indexing-search and
imported into all three RKE2 nodes (rke2-server, rke2-agent1, rke2-agent2)
via ctr import. Both :latest and :v202604240040search tags are present.
- New PersistentVolumeClaim intranet-vector-store (1Gi, ReadWriteOnce,
Longhorn) mounted at /data for the SQLite vector store
(intranet-vectors.db).
- New emptyDir volume notes-corpus (1Gi sizeLimit) shared between the
init container and main container, mounted at /srv/flowercore-notes
(read-only in the main container).
- New init container clone-notes-corpus (alpine/git) that shallow-clones
https://github.com/astoltz/FlowerCore.Notes.git
(codex/notes-pimanager-live-drift) into /srv/flowercore-notes on every
pod start. Re-clone is cheap (depth=1) and re-runs of git fetch +
reset --hard are idempotent.
- Strategy switched to Recreate for the deployment, since the new RWO
PVC blocks rolling updates — see CLAUDE.md memory "RWO PVC blocks K8s
rolling updates".
- Resource bumps: memory 128Mi -> 256Mi req, 512Mi -> 1Gi limit; CPU
500m -> 1000m limit. The DocsCorpusIndexer + Ollama HTTP calls add
measurable load during the initial index build.
- initialDelaySeconds bumps on both probes (10s -> 30s liveness, 5s ->
10s readiness) to account for startup-time Ollama probing and the
slightly larger image.
The DocsCorpusIndexer waits 15s after host startup before its first
indexing pass, then loops every RescanInterval (default 1h). Its first
run will:
1. Embed all *.md under /srv/flowercore-notes/docs against
nomic-embed-text on edge1 (10.0.57.17:11434).
2. Embed all *.html under /srv/flowercore-notes/docs/dashboards.
3. Persist chunks + embeddings to /data/intranet-vectors.db.
Verify after rollout:
- kubectl -n intranet logs deploy/intranet-web -c clone-notes-corpus
(init container should show the docs/ listing).
- kubectl -n intranet logs deploy/intranet-web -f
(DocsCorpusIndexer should log "Indexing docs root 'notes-md'..." then
"Docs root 'notes-md' indexed: N files, M chunks, M stored").
- curl -sk https://intranet.iamworkin.lan/api/search/indexes
-> ["notes-html","notes-md"]
- curl -sk 'https://intranet.iamworkin.lan/api/search?q=guacamole+single+host&topK=3'
-> hits from docs/infrastructure/guacamole-customization-plan.md
Companion source on FlowerCore.Intranet.Web@feat/shared-indexing-search.
Depends on FlowerCore.Common@feat/shared-indexing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three additions to the monitoring ConfigMap, each targeting
FlowerCore.RemoteDesktop:
- **Scrape jobs** (2 new):
- probe-remotedesktop: blackbox http_2xx against
https://desktop.iamworkin.lan/health every 30s. Feeds the
RemoteDesktopWebDown alert.
- fc-remotedesktop: direct /metrics scrape against
desktop.iamworkin.lan for the fc_desktop_session_events_total
and fc_desktop_pool_* series.
- **Alert group `remote-desktop`** (7 rules in alerts.yml):
- RemoteDesktopWebDown (3m) — /health probe failing
- RemoteDesktopMetricsStale (10m) — absent metrics series
- RemoteDesktopPoolDepleted (5m) — pool deficit + depleted flag
- RemoteDesktopPoolDeficitSustained (10m, info) — persistent
below-desired pool size
- RemoteDesktopSessionChurnSpike (5m, info) — launch rate
>20/min
- RemoteDesktopRecordingEventsDropped (15m, info) — 30m without
recording events while launches active
- RemoteDesktopTlsExpiry (6h, critical) — <2d cert renewal
window; aligns with feedback_acme_expiry_alert_threshold
- **Grafana dashboard mount**: new volumeMounts + volumes entry for
`dashboards-remotedesktop` backed by the grafana-dashboard-remotedesktop
ConfigMap (previously added as a standalone file in d4210c8).
Folder path /var/lib/grafana/dashboards/remotedesktop — picked up
by the file-provider with foldersFromFilesStructure:true so the
dashboard shows up in a "Remotedesktop" folder in Grafana.
No CRLF churn; pure 100-line insertion into LF-normalized file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New surfaces: POST /api/v1/bible/projects (one-click whole-book render),
GET /api/v1/bible/books, GET /api/v1/bible/books/{book}/preview, MCP
tools render_tts_reader_bible_book + list_tts_reader_bible_books,
Dashboard "Render a Bible book" card. 107/107 tests, +7 from previous.
Wraps apps/monitoring/flowercore-remotedesktop-grafana-dashboard.json
as a ConfigMap manifest so ArgoCD syncs it into the cluster alongside
the existing grafana-dashboard-* ConfigMaps. Standalone file — does
NOT modify noc-monitoring.yaml. That keeps the CRLF churn on
noc-monitoring.yaml (sibling files apps/intranet/intranet.yaml and
apps/agent-zero/configmaps-bluejay.yaml also carry CRLF churn) out
of this commit.
Dashboard will be synced into the cluster but NOT loaded by Grafana
until a matching `volumes:` entry lands in the Grafana Deployment
in noc-monitoring.yaml:
    - name: dashboard-remotedesktop
      configMap:
        name: grafana-dashboard-remotedesktop
Plus a `volumeMounts:` entry in the grafana container:
    - name: dashboard-remotedesktop
      mountPath: /etc/grafana/provisioning/dashboards/remotedesktop
      readOnly: true
Those edits are deferred to the CRLF-normalization pass on
bluejay-infra so the review diff stays reviewable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls in FlowerCore.TtsReader@9e2497f: P2.3 iTunes-namespace podcast
feed (author, summary, category, cover art, episode numbering,
duration, atom:self link, serial channel type for Bible projects) and
P2.4 ID3v2 tags on MP3 export + Vorbis comments on OGG (title, artist
with Piper voice humanized, album, track N/M, genre defaulting to
Religion & Spirituality for Bible or Audiobook for text sources,
date). Phones and podcast apps now show proper track info instead of
"Unknown - Unknown".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New Zabbix 7.2 template under `Templates/FlowerCore` that scrapes
the `/metrics` exposition from FlowerCore.RemoteDesktop and extracts:
- `fc_desktop_session_events_total` split by event (launch/connect/
disconnect/recording), with a dedicated datapoint for the
`browser_datasource="json"` slice to track delegated-auth launches.
- `fc_desktop_pool_ready` gauge sum for warm pools.
Trigger: `nodata(flowercore.remotedesktop.metrics,10m)=1` warns when
the public desktop host stops exposing metrics.
Follows the existing `flowercore-print-ollama.yaml` pattern — import
manually into Zabbix and link to the Print/Desktop host. Not a K8s
manifest, so ArgoCD ignores it.
Grafana dashboard JSON is drafted at
`apps/monitoring/flowercore-remotedesktop-grafana-dashboard.json`
but still needs a ConfigMap wrap + Grafana Deployment volume mount
in noc-monitoring.yaml before it ships (follow-up).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls in FlowerCore.TtsReader@63e6b62: P1.1 Media Session API wiring in
fc-media-session.js + quick-player.js + rendered-chapter-player.js, and
P1.2 biblical-name pronunciation lexicon auto-seed on Bible-source
project creation plus apply-bible-defaults endpoint + MCP tool for
existing projects. Tests 81 -> 97 all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live verification 2026-04-24 caught POST /blobs on dist.flowercore.io
returning 201 Created with the blob persisted: admin write operations
were reachable on the public surface. Controller-level strict entitlement
was on, but it only gates reads; writes weren't blocked at all.
Fix: add Method(GET) || Method(HEAD) to the Host match on the public
IngressRoute. POST/PUT/PATCH/DELETE now miss every route for
dist.flowercore.io and Traefik returns 404 before the pod sees the
request. Edge-level defense-in-depth on top of the controller's
strict-mode entitlement check.
The internal IngressRoute for dist.iamworkin.lan stays unrestricted —
admin POST /blobs + POST /manifests flows keep working from the lab.
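The effect of the new matcher, modelled in Python. This only mimics the
rule logic (Host plus Method on the public route, Host alone on the
internal one); the function names are hypothetical and it is not Traefik
code.

```python
def public_route_matches(host: str, method: str) -> bool:
    """Host(dist.flowercore.io) && (Method(GET) || Method(HEAD))."""
    return host == "dist.flowercore.io" and method.upper() in {"GET", "HEAD"}


def internal_route_matches(host: str, method: str) -> bool:
    """Host(dist.iamworkin.lan), with no method restriction."""
    return host == "dist.iamworkin.lan"


# A request that matches no route gets Traefik's 404 and never reaches the pod.
public_get = public_route_matches("dist.flowercore.io", "GET")
public_post = public_route_matches("dist.flowercore.io", "POST")
internal_post = internal_route_matches("dist.iamworkin.lan", "POST")
```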
Lights up dist.flowercore.io end-to-end:
- cf-origin-flowercore-io Secret (literal *.flowercore.io Origin Cert,
copied from the telephony/gitea-public/matrix/mail/flowercore/fc-landing
pattern — not via OnePasswordItem yet).
- Traefik Middleware dist-public-profile-header: strips any caller-supplied
X-FC-Distribution-Profile, injects 'public' so the controller's
NamedEntitlementResolverRouter routes to the strict resolver.
- IngressRoute fc-distribution-public: Host(`dist.flowercore.io`) ->
same backing Service as the internal dist.iamworkin.lan route.
Middleware attached; cert secret cf-origin-flowercore-io.
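What the header middleware is meant to do, as a Python sketch. The
function name is hypothetical and the case-insensitive match is an
assumption about how the header is compared; the point is that a caller
cannot pick their own profile on the public host.

```python
from typing import Dict

PROFILE_HEADER = "X-FC-Distribution-Profile"


def apply_public_profile_header(headers: Dict[str, str]) -> Dict[str, str]:
    """Drop any caller-supplied profile header, then inject 'public'."""
    cleaned = {
        name: value
        for name, value in headers.items()
        if name.lower() != PROFILE_HEADER.lower()
    }
    cleaned[PROFILE_HEADER] = "public"
    return cleaned


# A caller trying to claim a different profile still ends up as 'public'.
spoofed = {"x-fc-distribution-profile": "internal", "Accept": "*/*"}
result = apply_public_profile_header(spoofed)
```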
Cloudflare DNS A record dist.flowercore.io -> 74.40.140.24 (proxied)
already created 2026-04-24 via Cloudflare API (record id
e9b957511556f37ff6763f4441acbc45).
Controller entitlement config is still DefaultAllow=false + empty
PublicEditions on the 'public' profile, so every public request
returns 403 by default. Populate FlowerCore__Distribution__EntitlementPublic__PublicEditions__0
via env var when ready to expose specific editions.
- Serves GET /manifests/{edition}/{version}.cert (leaf+intermediate PEM)
- Adds CertChainPem migration on startup (nullable column)
- ManifestSignService now embeds version-specific certChainUrl
Provisioning Agent's verify step will flip from ChainNotServed (Phase 2A
soft-pass) to Valid once a fresh edition is published with this image.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>