From 38558641c14c6342feae801d566c15247739d5bc Mon Sep 17 00:00:00 2001
From: Andrew Stoltz
Date: Mon, 27 Apr 2026 23:28:07 -0500
Subject: [PATCH] fix(ttsreader-kokoro): bump liveness probe timeouts (Sprint E Phase 1a)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Kokoro pod has 4 restarts in 2d6h with exit 143 (SIGTERM from kubelet).
kubectl describe events all show:

  Liveness probe failed: Get "http://10.42.229.109:8880/v1/audio/voices":
  context deadline exceeded

The probe path /v1/audio/voices shares the FastAPI worker pool with
/v1/audio/speech. A long synth (Bible chapter, 30+ sentences) holds the
pool past the prior 5s × 3 = 15s probe window, so kubelet kills the pod
and in-flight renders fail. Operator hits "fallback chain failed" toasts
+ partial-render breadcrumbs during these windows.

Bump probe timeoutSeconds 5 → 15 and failureThreshold 3 → 5, giving
15s × 5 = 75s of grace before kubelet gives up. Combined with the
kokoro-side circuit breaker landing in TtsReader (Sprint E Phase 1b),
the FC backend will also stop slamming kokoro during recovery so it can
serve the probe even faster.

The companion Prometheus alerts (KokoroPodFlapping, PiperPodFlapping)
land in FlowerCore.Notes/scripts/monitoring/alerts.yml.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 apps/fc-ttsreader/fc-ttsreader.yaml | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/apps/fc-ttsreader/fc-ttsreader.yaml b/apps/fc-ttsreader/fc-ttsreader.yaml
index 8326efb..44d7c80 100644
--- a/apps/fc-ttsreader/fc-ttsreader.yaml
+++ b/apps/fc-ttsreader/fc-ttsreader.yaml
@@ -296,14 +296,23 @@ spec:
             periodSeconds: 10
             timeoutSeconds: 5
             failureThreshold: 18
+          # Sprint E Phase 1a (kokoro stability) — 4 restarts in 2d6h with
+          # exit 143 traced to liveness probe `context deadline exceeded` while
+          # kokoro was busy synthesizing. /v1/audio/voices shares the FastAPI
+          # worker pool with /v1/audio/speech, so a long synth can starve the
+          # probe out within the prior 5s × 3 = 15s window. Bump timeoutSeconds
+          # 5 → 15 and failureThreshold 3 → 5 → 75s grace before kubelet kills
+          # the pod. The TtsCircuitBreaker on the synthesizer side (Phase 1b)
+          # backs this up so the FC backend stops slamming kokoro during
+          # recovery.
           livenessProbe:
             httpGet:
               path: /v1/audio/voices
               port: 8880
             initialDelaySeconds: 180
             periodSeconds: 30
-            timeoutSeconds: 5
-            failureThreshold: 3
+            timeoutSeconds: 15
+            failureThreshold: 5
 ---
 # fc-biblical-tts — eSpeak-NG-backed Ancient Greek + Hebrew TTS with
 # word-level timing for read-along playback. Companion to ttsreader-kokoro