From e50e103ba0a1656e20187bd0fc3dc09680827695 Mon Sep 17 00:00:00 2001 From: Andrew Stoltz Date: Fri, 15 May 2026 15:59:04 -0500 Subject: [PATCH] =?UTF-8?q?fix(zabbix):=20bump=20web=20probe=20timeouts=20?= =?UTF-8?q?5s=E2=86=9215s=20+=20add=20failureThreshold?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit zabbix-web nginx+PHP-FPM container serves / at ~3-5s baseline with occasional 6-7s spikes (probe path renders full dashboard via PHP). kube-probe was killing the container after 3 consecutive 5s-timeout 499s, producing CrashLoopBackOff alert noise even though the app was serving real traffic fine. 15s timeout absorbs the natural variance; explicit failureThreshold=3 documents the policy (was implicit default). Closes the firing PodCrashLoopBackOff (zabbix-web) + pending HTTPServiceSlow/HTTPServiceDegraded alerts. zabbix.iamworkin.lan remains slow at the application layer (separate work — PHP-FPM warm-up + Zabbix server "host not found" agent lookup spam need their own fixes) but the pod restart loop stops. --- apps/zabbix/zabbix.yaml | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/apps/zabbix/zabbix.yaml b/apps/zabbix/zabbix.yaml index 2a68c03..454d024 100644 --- a/apps/zabbix/zabbix.yaml +++ b/apps/zabbix/zabbix.yaml @@ -305,15 +305,17 @@ spec: path: / port: 8080 initialDelaySeconds: 60 - timeoutSeconds: 5 + timeoutSeconds: 15 periodSeconds: 10 + failureThreshold: 3 readinessProbe: httpGet: path: / port: 8080 initialDelaySeconds: 30 periodSeconds: 5 - timeoutSeconds: 5 + timeoutSeconds: 15 + failureThreshold: 3 --- apiVersion: v1 kind: Service