Kokoro pod has 4 restarts in 2d6h with exit 143 (SIGTERM from kubelet).
kubectl describe events all show:
Liveness probe failed: Get "http://10.42.229.109:8880/v1/audio/voices":
context deadline exceeded
The probe path /v1/audio/voices shares the FastAPI worker pool with
/v1/audio/speech. A long synth (Bible chapter, 30+ sentences) holds the
pool past the prior 5s × 3 = 15s probe window, kubelet kills the pod,
in-flight renders fail. Operator hits "fallback chain failed" toasts +
partial-render breadcrumbs during these windows.
Bump probe timeoutSeconds 5 → 15 and failureThreshold 3 → 5 → 75 s of
grace before kubelet gives up. Combined with the kokoro-side circuit
breaker landing in TtsReader (Sprint E Phase 1b), the FC backend will
also stop slamming kokoro during recovery so it can serve the probe
even faster.
The companion Prometheus alerts (KokoroPodFlapping, PiperPodFlapping)
land in FlowerCore.Notes/scripts/monitoring/alerts.yml.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>