The zabbix-web nginx+PHP-FPM container serves / in ~3-5s at baseline,
with occasional 6-7s spikes (the probe path renders the full dashboard
via PHP). kube-probe was killing the container after 3 consecutive
probes hit the 5s timeout (logged by nginx as 499s), producing
CrashLoopBackOff alert noise even though the app was serving real
traffic fine.

A 15s timeout absorbs the natural variance; the explicit
failureThreshold=3 documents the policy (previously the implicit
default).
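
The probe change can be sketched roughly as the fragment below. It
assumes the probe is a livenessProbe using httpGet on /; the port is a
placeholder. Only timeoutSeconds and failureThreshold come from this
change.

```yaml
# Sketch only: probe type and port are assumptions, not copied from the manifest.
livenessProbe:
  httpGet:
    path: /              # probe path renders the full dashboard via PHP
    port: 8080           # assumed port
  timeoutSeconds: 15     # was 5; absorbs the 6-7s spikes
  failureThreshold: 3    # equals the Kubernetes default, now stated explicitly
```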

Closes the firing PodCrashLoopBackOff (zabbix-web) alert and the
pending HTTPServiceSlow/HTTPServiceDegraded alerts.
zabbix.iamworkin.lan remains slow at the application layer (separate
work: PHP-FPM warm-up and the Zabbix server "host not found" agent
lookup spam need their own fixes), but the pod restart loop stops.

Same ArgoCD + SSA self-heal loop pattern as guacamole (20e4130):
Kubernetes defaults volumeMode to Filesystem on volumeClaimTemplates
at creation, git omits it, and argocd-controller owns the atomic list,
so every reconcile sees drift. Because volumeClaimTemplates is
immutable, the diff can never be applied and the loop never settles.
Adding the field closes both loops.
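
A minimal sketch of the field being added, assuming a StatefulSet with
a single claim template. The claim name, access mode, and storage size
are hypothetical; only volumeMode reflects this change.

```yaml
# Sketch only: claim name, access mode, and size are hypothetical.
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      volumeMode: Filesystem   # matches the API server default, so git and live state agree
      resources:
        requests:
          storage: 1Gi
```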

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Zabbix: Remove hardcoded zabbix-db-secret and zabbix-admin-secret;
  reference zabbix-credentials (1Password) for DB-User, DB-Password,
  and the admin password
- Matrix: Remove hardcoded matrix-db-secret; reference
  matrix-credentials for the Postgres user/password. Convert the
  homeserver.yaml ConfigMap to a template with
  __DB_PASSWORD__/__DB_USER__ placeholders, injected via a busybox
  init container
- Guacamole: Add a OnePasswordItem resource for future use. MySQL DB
  creds remain in guac-db-secret (the 1Password item lacks DB-specific
  fields; gap documented)
- All three services now include OnePasswordItem manifests so ArgoCD
  manages them
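
The Matrix placeholder injection could look roughly like the init
container below. This is a sketch under assumptions: the container
name, volume names, mount paths, and secret key names are hypothetical,
and sed stands in for whatever substitution the real manifest uses.
Values containing |, &, or \ would need escaping first.

```yaml
# Sketch only: names, paths, and secret keys are hypothetical.
initContainers:
  - name: render-homeserver
    image: busybox
    command: ["sh", "-c"]
    args:
      - >-
        sed -e "s|__DB_USER__|${DB_USER}|g"
        -e "s|__DB_PASSWORD__|${DB_PASSWORD}|g"
        /template/homeserver.yaml > /config/homeserver.yaml
    env:
      - name: DB_USER
        valueFrom:
          secretKeyRef:
            name: matrix-credentials   # Secret synced from 1Password
            key: DB-User               # assumed key name
      - name: DB_PASSWORD
        valueFrom:
          secretKeyRef:
            name: matrix-credentials
            key: DB-Password           # assumed key name
    volumeMounts:
      - name: homeserver-template      # ConfigMap holding the template
        mountPath: /template
      - name: homeserver-config        # emptyDir shared with the main container
        mountPath: /config
```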