Synology NAS is configured with community bluejay_monitor
(→ snmp.yml auth 'bluejay_v2'), not public. public_v2 was returning
HTTP 500 from snmp-exporter for this target. Verified bluejay_v2
returns metrics.
Keeps printer (10.0.58.107) on public_v2 — Epson ET-3750 uses
community "public" as documented in its SNMP settings.
Same CoreDNS iamworkin.lan template + ndots:5 hijack as the irc-notify fix.
Anope services (nickserv/chanserv/memo) have been disconnected from unrealircd
for weeks ("Host is unreachable" every 3s). Thelounge server defaults pointed
at the same broken FQDN.
Short name unrealircd.irc.svc resolves to the ClusterIP directly.
Adds a real README describing the 4-step deploy flow, with pfSense Unbound
host overrides as step 1 (the prerequisite that, if skipped, silently breaks
cert-manager HTTP-01 for ~2h per cert until manually diagnosed — root cause
of the 2026-04-22 cluster-wide cert outage).
Adds scripts/check-pfsense-dns.py: parses every apps/*/*.yaml, extracts
hostnames from Certificate.spec.dnsNames and Traefik IngressRoute
`Host(...)` match rules, and fails the check if any don't resolve via the
system DNS (pfSense Unbound on this LAN). Ignores IRC server-link labels,
image tags, comments — only checks hostnames cert-manager and Traefik will
actually use.
Run before `git push` or wire into pre-commit / Gitea Actions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Quick Read's active-sentence highlight was changing font-weight from
regular to semibold, which shifted glyph widths and reflowed the whole
paragraph mid-playback. New image drops the weight change and uses a
1px box-shadow ring instead for a stable layout.
Built from FlowerCore.TtsReader@e77d69d.
The app's ApiKeyAuthenticationMiddleware runs BEFORE /health is mapped, so
unauthenticated probe requests get 404. tcpSocket probes verify the listener
is up without auth, which is correct for an internal K8s probe (kubelet
talks pod IP directly, not externally).
Real fix is in the app: move /health before the middleware or mark it
[AllowAnonymous]. Tracked separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The app exposes /health (Program.cs:91 maps a Healthy text response) but does
NOT expose /metrics/prometheus. K8s liveness/readiness probes against a 404
endpoint kept the pod in CrashLoopBackOff after PVC mount was added.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live cluster had a Longhorn PVC `signalcontrol-data` mounted at /app/data
since 2026-04-14, but the bluejay-infra git manifest never declared it. As a
result, when ArgoCD recreated the Deployment from git (after deletion to fix
an unrelated selector-label mismatch caught during cert-manager recovery),
the new pod started without /app/data and crashed with `SQLite Error 14:
unable to open database file 'data/signalcontrol.db'`.
Bring git in line with reality: declare the PVC, mount it, and switch the
Deployment to `strategy.type: Recreate` (RWO PVC blocks rolling updates per
existing memory feedback_k8s_rwo_rollout.md).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CoreDNS iamworkin.lan template + ndots:5 was hijacking
unrealircd.irc.svc.cluster.local lookups → Traefik VIP → timeout.
Every alert since ~2026-04-09 silently failed with "IRC send failed: timed out",
which also killed the thermal-printer path (routed through irc-notify).
Same fix pattern as guacamole@28b7600.
The guac-k8s-sync CronJob has been crash-looping (exit 7) since the
2026-04-11 run. Root cause: CoreDNS has an `*.iamworkin.lan`
template wildcard, and the Kubernetes pod resolv.conf ships with
`ndots:5` plus a search list that includes `iamworkin.lan`.
Resolving `guacamole.guacamole.svc.cluster.local` (4 dots < 5) goes
through search-suffix expansion BEFORE the bare FQDN. The iamworkin.lan
suffix makes it `guacamole.guacamole.svc.cluster.local.iamworkin.lan`,
which matches the template and answers with Traefik LB VIP
10.0.56.200. That VIP has no pod-network hairpin route, so curl exits
with 'No route to host'.
Using the short name `http://guacamole:8080` keeps the query at 0
dots, search expansion runs on the bare name, and the in-namespace
`guacamole.svc.cluster.local` suffix hits the Kubernetes CoreDNS
plugin directly (ClusterIP 10.43.229.31).
Alt fixes considered but not taken: trim the CoreDNS template regex
to exclude `.svc.cluster.local.` prefixes (cross-cutting, higher
blast radius); trailing-dot FQDN in the URL (curl/Java HTTP clients
handle inconsistently).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lets live SIP AATs (ext 901–904, from-internal context) dial *832 to
exercise the Victory Day workflow + Fun Menu + AsteriskGameHandler path
without routing through Twilio. Mnemonic: *832 = V-D-A (8-3-2) from the
V-D-A-Y keypad pattern.
Maps to Stasis(flowercore-pbx,inbound-pstn,+15074618329) — same call-
type classification as a real Twilio-inbound call to the VDAY DID, so
InboundPstnHandler routes to the seeded VDAY workflow identically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous commit 90deacd raced with the user's f0733ff (which had
already pinned the guacamole web Deployment to rke2-server for the
NFS ACL). That left two nodeSelector blocks on the web pod and an
inconsistent agent2 pin on guacd. Align both pods to rke2-server.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Synology NFS export at /volume1/kubernetes currently grants mount
permission only to 10.0.56.13 (rke2-agent2). rke2-agent1 gets
"access denied by server". guacd + guacamole web both need the
recordings volume, so co-locating is also efficient. Remove the
nodeSelector once the Synology NFS ACL opens to all cluster nodes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the 1Password vault JAR to the Guacamole pod so connection params
like ${OP:ItemTitle/fieldLabel} are resolved from 1Password Connect at
tunnel-open time. Credentials never land in MySQL — only token literals.
Deployment changes:
- env: OP_CONNECT_URL=http://10.0.56.10:8180, OP_VAULT_ID=..., plus
OP_CONNECT_TOKEN from secret/guacamole-1password-token/credential.
- env: ENABLE_ENVIRONMENT_PROPERTIES=true so OP_* env vars render as
op-connect-url / op-connect-token / op-vault-id properties the
extension reads.
- volumeMount for guacamole-vault-jar at
/etc/guacamole/extensions/guacamole-vault-1password-1.0.0.jar
- volumeMount for guacamole-logback so we see DEBUG token-inject lines.
- nodeSelector kubernetes.io/hostname=rke2-server — the Synology NFS
export for /volume1/kubernetes currently only allows rke2-server.
Followup: add rke2-agent1/2 to the export and remove this selector.
New ConfigMaps:
- guacamole-vault-jar (binaryData, ~312KB JAR, Gson shaded, built from
FlowerCore.Notes/k8s/guacamole/extensions/1password-vault via mvn).
- guacamole-logback with DEBUG on io.flowercore.guacamole.vault — drop
to INFO once resolution is proven stable.
Existing guacamole-properties: added onepassword-vault to extension-priority.
The guacamole-1password-token Secret is NOT in git — it holds a verbatim
copy of the onepassword-connect-operator bearer token. Followup task:
provision a scoped Connect token for Guacamole and rotate the copy out.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First pass used nfs.path=/volume1/kubernetes/guacamole/recordings,
which triggered "mount.nfs: access denied by server" on rke2-agent1.
Synology NFS export is scoped to /volume1/kubernetes; match the
working fc-desktop pattern: mount the export root and select the
subdirectory via volumeMount.subPath.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 5 of docs/infrastructure/guacamole-customization-plan.md:
- Mount /volume1/kubernetes/guacamole/recordings (Synology 10.0.58.3)
into both guacd (writer) and guacamole web (reader) at
/var/lib/guacamole/recordings
- Set RECORDING_SEARCH_PATH env on guacamole web -- the Guacamole
Docker entrypoint treats any RECORDING_* var as an enable signal
for the history-recording-storage extension (symlinks the JAR
from /opt/guacamole/environment/RECORDING_/extensions/ into
GUACAMOLE_HOME/extensions/)
Per-connection recording still requires setting recording-path on
each connection in MySQL -- follow-up task. This commit enables
the plumbing; no sessions record yet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same ArgoCD + SSA self-heal loop pattern as guacamole (20e4130):
K8s defaults volumeMode=Filesystem on volumeClaimTemplates at
creation, git omits it, argocd-controller owns the atomic list so
every reconcile sees drift, and volumeClaimTemplates is immutable
so it can never reconcile. Adding the field closes both loops.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the infra-guacamole OutOfSync sync loop. K8s API sets
volumeMode=Filesystem as a default on volumeClaimTemplates at creation,
but the git manifest omitted it. ArgoCD uses ServerSideApply with
atomic ownership of volumeClaimTemplates, so every sync saw a
desired/live mismatch on that one field. volumeClaimTemplates is
immutable after creation so ArgoCD could never reconcile it --
autoHealAttemptsCount climbed to 6091. Adding the field to git
matches live and breaks the loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- fc-chat.yaml: TLS/IngressRoute only (Deployment managed by deploy script, matches fc-signage/fc-mysql/fc-kiosk pattern)
- fc-menuboard: new app bundle
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups to the Piper TTS wire-up landed in d3ffad9:
1. Telephony-web runs as uid 1654 (non-root), but the hostPath at
/tmp/tts-audio is owned by root:root 0755. Pod couldn't write .sln16
files — every Piper call would succeed at the HTTP layer and then
fall back to the sound map when File.WriteAllBytesAsync threw
"Permission denied." Extend the existing fix-data-perms initContainer
to chown the shared-tts mount too (0755 world-readable, so the
Asterisk pod — running as a different uid — can still read).
2. Pod security context now explicitly sets runAsNonRoot: true + runAsUser
1654 + runAsGroup 1654 (cluster policy), matching the pattern used
by every other FlowerCore service.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Piper was never reachable on 10.0.57.15 — edge1's actual address is
10.0.57.17 (SSH config, project_edge1_sdcard memory). Every telephony
prompt hit the 8s HttpClient timeout and fell back to the built-in sound
map (vm-advopts, vm-goodbye, beep) instead of speaking the real workflow
text. Verified from noc1: `curl http://10.0.57.17:8500/health` returns
HTTP 200 in 6ms, `POST /tts` returns a 16kHz mono WAV in 606ms.
Changes:
- apps/telephony/telephony.yaml
- `Tts.PiperUrl` → `http://10.0.57.17:8500`
- NetworkPolicy egress allow → `10.0.57.17/32:8500`
- Header comment now documents the POST /tts {"text":"..."} contract
- telephony-web pod mounts `/shared-tts` from hostPath `/tmp/tts-audio`
(rke2-agent1). This is where `AsteriskProvider.SpeakTextAsync` writes
the synthesized .sln16 before calling ARI `Play sound:tts/<name>`.
- apps/asterisk/deployment.yaml
- Asterisk pod mounts the same hostPath at
`/var/lib/asterisk/sounds/tts` so it can read and play what
telephony-web wrote. Both deployments have
`nodeSelector: kubernetes.io/hostname: rke2-agent1` so the hostPath
is guaranteed to be the same directory.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CoreDNS wildcard for iamworkin.lan catches unresolved names and returns
the Traefik VIP (10.0.56.200), so downloads.asterisk.org from inside a
pod returns 404 from Traefik rather than the real Sangoma mirror. Pin
the real IP (165.22.184.19 = oss-downloads.sangoma.com) via hostAliases
so curl reaches the actual server.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cluster egress goes through a step-ca-fronted TLS proxy that install-sounds
doesn't trust ("SSL certificate problem: self-signed certificate"). The
Asterisk core sounds tarball is a public artifact; integrity is enforced
downstream when Asterisk plays the file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The install-sounds init container was a stub that left /var/lib/asterisk/sounds/en
empty. Result: every SpeakText fallback path (vm-advopts, vm-goodbye, characters:*,
digits/*, beep, pbx-invalid) resolved to a missing file, Asterisk silently failed
each Playback, zero RTP was produced, and callers heard dead air. This is why
dialing *0 (Settings Menu) or *100 (Debug IVR) "picks up quietly" — there is
literally nothing to stream.
Replaced the stub with alpine:3.20 + curl + tar that downloads the pinned
asterisk-core-sounds-en-ulaw-1.6.1.tar.gz (~10 MB) from downloads.asterisk.org
and unpacks it into the sounds emptyDir. Idempotent — skips download if
vm-goodbye is already present.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>