Deploys fix for stale Failed manifest accumulation in TTS Reader Ops view
and atomic-write guard against empty/corrupt job manifests.
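For readers unfamiliar with the pattern, an atomic-write guard can be sketched roughly as below. This is an illustrative Python sketch only, not the TtsReader implementation: the function name `write_manifest_atomically` and the empty-manifest check are assumptions made for the example.

```python
import json
import os
import tempfile


def write_manifest_atomically(path: str, manifest: dict) -> None:
    """Write JSON to a temp file beside the target, then rename over it."""
    if not manifest:
        # Hypothetical guard: never persist an empty manifest.
        raise ValueError("refusing to write an empty manifest")
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(manifest, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and target share a filesystem,
        # so readers see either the old file or the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Keeping the temp file in the target directory is what makes the rename atomic; a temp file on another mount would degrade to copy-then-delete.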
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Sprint E XXL Phase 4γ MVP deploy — POST /api/v1/render endpoint.
Two changes:
1. Image tag v202604272339 → v202604280946 (TtsReader@d9e0a58 master tip
includes the new RenderController + RenderService + 9 tests).
2. New TtsReader__Render__CdnDirectory=/data/cdn env var. The default
wwwroot/cdn resolves under the read-only app filesystem when
runAsNonRoot=true, so pin it to the existing writable PVC mount alongside
other TtsReader runtime data. Manifests + cue audio land at
/data/cdn/sha256/<hash>/manifest.json + cues/.
Pre-existing PVC mount at /data/ already covers this — no PVC change
needed, just the env var override.
Pairs with TtsReader@d9e0a58 master tip (ready for image build + import).
TtsReader@9333480: distinguishes partial-render (yellow Warning, audio
plays, 'Re-render N failed sentences' button) from full-fail (red
Danger, 'Try render again'). New TtsFallbackChainFailedException carries
both voices when Kokoro + Piper both fail; chapter breadcrumb names
the entire chain instead of just the requested voice. +8 tests.
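The idea of an exception that carries every voice in the chain can be sketched as follows. This is a Python illustration under assumptions, not the C# TtsFallbackChainFailedException itself: the class name `FallbackChainFailed`, the helper `synthesize_with_fallback`, and the voice labels in the usage note are all hypothetical.

```python
class FallbackChainFailed(Exception):
    """Raised when every engine in the chain fails; keeps each attempt."""

    def __init__(self, attempts):
        # attempts: list of (voice_name, exception) in the order tried
        self.attempts = attempts
        self.voices = [voice for voice, _ in attempts]
        super().__init__("fallback chain failed: " + " -> ".join(self.voices))


def synthesize_with_fallback(text, chain):
    """Try each (voice_name, synth_callable) in order; return the first success."""
    attempts = []
    for voice, synth in chain:
        try:
            return voice, synth(text)
        except Exception as exc:  # any engine failure moves to the next voice
            attempts.append((voice, exc))
    raise FallbackChainFailed(attempts)
```

With a chain such as `[("kokoro-voice", kokoro_fn), ("piper-voice", piper_fn)]`, a double failure yields a message naming both voices, which is the information a breadcrumb needs.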
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Kokoro pod has 4 restarts in 2d6h with exit 143 (SIGTERM from kubelet).
kubectl describe events all show:
Liveness probe failed: Get "http://10.42.229.109:8880/v1/audio/voices":
context deadline exceeded
The probe path /v1/audio/voices shares the FastAPI worker pool with
/v1/audio/speech. A long synth (Bible chapter, 30+ sentences) holds the
pool past the prior 5s × 3 = 15s probe window, so kubelet kills the pod
and in-flight renders fail. Operator hits "fallback chain failed" toasts +
partial-render breadcrumbs during these windows.
Bump probe timeoutSeconds 5 → 15 and failureThreshold 3 → 5, giving
15 s × 5 = 75 s of grace before kubelet gives up. Combined with the kokoro-side circuit
breaker landing in TtsReader (Sprint E Phase 1b), the FC backend will
also stop slamming kokoro during recovery so it can serve the probe
even faster.
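The grace-window arithmetic above can be checked with a short sketch. It mirrors this commit's own calculation (timeoutSeconds × failureThreshold); the probe's periodSeconds is not stated here, so the real kubelet window may differ. The function name `probe_grace_seconds` is an assumption for illustration.

```python
def probe_grace_seconds(timeout_seconds: int, failure_threshold: int) -> int:
    """Grace window as computed in this commit: timeout times threshold."""
    return timeout_seconds * failure_threshold


before = probe_grace_seconds(timeout_seconds=5, failure_threshold=3)
after = probe_grace_seconds(timeout_seconds=15, failure_threshold=5)
print(f"grace window: {before}s -> {after}s")
```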
The companion Prometheus alerts (KokoroPodFlapping, PiperPodFlapping)
land in FlowerCore.Notes/scripts/monitoring/alerts.yml.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Day 8 disk-cache warmer crashes on production with
'Read-only file system : /home/app/data' because the relative default
'data/voice-previews' resolves under runAsNonRoot HOME (read-only with
readOnlyRootFilesystem=true). Pin to /data/voice-previews so the cache
lands on the writable PVC mount alongside ttsreader.db, audio output,
and jobs root.
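Why the relative default lands on the read-only filesystem can be shown with a small path sketch. It assumes the process working directory is /home/app, as the error message suggests; `resolve_cache_dir` is a hypothetical helper, not TtsReader code.

```python
import posixpath


def resolve_cache_dir(configured: str, working_dir: str) -> str:
    """Resolve a configured directory the way a relative path would be."""
    if posixpath.isabs(configured):
        # Absolute paths ignore the working directory entirely.
        return posixpath.normpath(configured)
    return posixpath.normpath(posixpath.join(working_dir, configured))


relative_default = resolve_cache_dir("data/voice-previews", "/home/app")
pinned = resolve_cache_dir("/data/voice-previews", "/home/app")
```

Here `relative_default` ends up under /home/app, while `pinned` stays on the /data mount regardless of where the process starts.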
Image v202604272216 (already on nodes) is unaffected by this — only
the env routing changes. ArgoCD reconciles + rollout restart picks up
the new env without rebuild.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v202604272157 crash-looped on the production PVC because Database.EnsureCreated()
is a no-op on existing DBs and the VoiceProfiles table was missing. TtsReader@a9f0b73
adds an idempotent CREATE TABLE IF NOT EXISTS to the infra reconciler before
TtsReaderDataSeeder runs. Bumping the manifest to pick up that fix.
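The idempotent-create step can be illustrated as below. This is a Python sketch assuming a SQLite database; the column list is hypothetical, since the real VoiceProfiles schema is not given here, and `ensure_voice_profiles_table` is an invented name.

```python
import sqlite3


def ensure_voice_profiles_table(conn: sqlite3.Connection) -> None:
    """Create the table only if it is missing; safe to run on every start."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS VoiceProfiles ("
        "Id INTEGER PRIMARY KEY, "  # hypothetical columns for illustration
        "Name TEXT NOT NULL)"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
ensure_voice_profiles_table(conn)  # first run creates the table
conn.execute("INSERT INTO VoiceProfiles (Name) VALUES ('example')")
ensure_voice_profiles_table(conn)  # second run is a no-op, rows survive
row_count = conn.execute("SELECT COUNT(*) FROM VoiceProfiles").fetchone()[0]
```

Running the statement before the seeder means the seeder can assume the table exists on both fresh and pre-existing databases.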
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the Deployment + Service for the fc-modern-tts container that
landed in the previous commit. Same shape as ttsreader-biblical:
runAsNonRoot uid 1654, dnsPolicy: None to bypass the iamworkin.lan
hijack on Microsoft endpoint lookups, /health probes, modest CPU/mem
since edge-tts is network-bound.
Service surfaces ttsreader-modern.fc-ttsreader.svc:10403 for the web
pod to call when the operator picks a he-IL-* or el-GR-* voice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a third TTS engine alongside Piper (modern English/multi-lang) and
Kokoro (high-quality English): a small FastAPI wrapper around eSpeak-NG
with built-in support for Ancient Greek (grc), Hebrew (he), and Modern
Greek (el). Same shape as fc-speech-align so AiStation talks to all the
TTS/alignment services with one HTTP client pattern.
biblical-tts/Dockerfile + app.py:
- Python 3.12 base + apt-get espeak-ng + libsndfile1 + ffmpeg-free deps.
- POST /tts -> WAV audio bytes (audio/wav).
- POST /timings -> word-level timings derived from espeak's --pho phoneme
duration stream, distributed across whitespace-split words proportional
to character count. Accuracy is good enough for chip-level read-along
highlighting (~30-80ms per-word jitter).
- GET /voices for catalog discovery, GET /health for probes.
- Body shape mirrors AlignmentRequest from FlowerCore.Shared.Speech so
the .NET BiblicalTtsClient round-trips it cleanly.
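The proportional distribution described for POST /timings can be sketched roughly as follows. This is an illustration rather than the actual app.py: the total duration is passed in directly instead of being summed from espeak's --pho output, and the name `distribute_word_timings` and the tuple return shape are assumptions.

```python
def distribute_word_timings(text: str, total_ms: int) -> list:
    """Split total_ms across whitespace-separated words by character count."""
    words = text.split()
    total_chars = sum(len(word) for word in words)
    if total_chars == 0:
        return []
    timings = []
    consumed = 0
    for word in words:
        start_ms = round(total_ms * consumed / total_chars)
        consumed += len(word)
        end_ms = round(total_ms * consumed / total_chars)
        timings.append((word, start_ms, end_ms))
    return timings


sample = distribute_word_timings("Ἐν ἀρχῇ ἦν ὁ λόγος", 1714)
```

Computing each boundary from the running character total keeps adjacent words contiguous and makes the last word end exactly at the total duration.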
K8s deployment in fc-ttsreader namespace:
- ttsreader-biblical Deployment + Service on port 10402.
- localhost/fc-biblical-tts:v1, imagePullPolicy: Never (built on noc1,
imported to all 3 RKE2 nodes via ctr).
- runAsNonRoot uid 1654 to match the namespace's standard security ctx.
- Modest resources (100m/128Mi req, 1000m/512Mi limit) — eSpeak is
CPU-cheap.
- Probes hit /health which returns the supported language list.
Verified live: container started, /health returns ok with grc/el/he,
POST /timings on Ἐν ἀρχῇ ἦν ὁ λόγος returned 5 words / 1714ms.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cluster ttsreader-web was reaching across to BLUEJAY-WS:10401 for
Kokoro synthesis, which meant a workstation-down event broke render-
pipeline TTS. Add a cluster-native ttsreader-kokoro Deployment and
Service inside fc-ttsreader so the cluster owns the engine.
- Image: ghcr.io/remsky/kokoro-fastapi-cpu:latest. Model + 67 voices
ship inside the image, so no PVC is required.
- Port 8880 (the kokoro-fastapi default; the entrypoint hardcodes it).
- Resources: 250m/1Gi request, 2000m/3Gi limit. CPU-only inference
matches what AiStation runs locally on BLUEJAY-WS.
- dnsPolicy: None to bypass CoreDNS's *.iamworkin.lan template hijack
on huggingface.co lookups, same shape as ttsreader-align.
- Probes hit /v1/audio/voices since the kokoro server doesn't expose
/health; that endpoint is cheap (lists configured voice files).
ttsreader-web env var TtsReader__Kokoro__BaseUrl flips from the
workstation pointer to the cluster service:
http://ttsreader-kokoro.fc-ttsreader.svc.cluster.local.:8880.
AiStation keeps its local http://localhost:8880 since the workstation
operator still wants the audio to render on the local sound device
without a network hop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /align endpoint was returning Whisper-native word fields
(word/startSeconds/endSeconds/confidence), but FlowerCore.Shared.Speech's
FasterWhisperAlignmentClient on master deserializes
FasterWhisperWord against [JsonPropertyName("text")/("startMs")/("endMs")].
Result: ttsreader-web reported alignment.source="whisper" with words[]
present but every entry had Text="" and StartMs=EndMs=0 — visible in the
2026-04-25 hello-world smoke against ttsreader.iamworkin.lan.
Match the published Common contract instead of the Python model's native
shape: emit text/startMs/endMs (millisecond ints, not float seconds).
Confidence stays on the wire as informational; the deployed C# client
ignores it but a future fc-align operator UI can surface low-confidence
words. Bump tag to v3 and bump the Deployment image accordingly.
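The field mapping described above can be sketched as a small conversion. The field names come from this commit message; the function name `to_contract_word` and the choice to round rather than truncate are assumptions.

```python
def to_contract_word(native: dict) -> dict:
    """Map a Whisper-native word entry onto the text/startMs/endMs contract."""
    return {
        "text": native["word"],
        # Contract uses millisecond ints, not float seconds.
        "startMs": int(round(native["startSeconds"] * 1000)),
        "endMs": int(round(native["endSeconds"] * 1000)),
        # Informational only; the deployed C# client ignores it.
        "confidence": native.get("confidence"),
    }


converted = to_contract_word(
    {"word": "hello", "startSeconds": 0.12, "endSeconds": 0.48, "confidence": 0.93}
)
```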
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- New ttsreader-align Deployment + Service + 5Gi PVC under
apps/fc-ttsreader/. Wraps SYSTRAN/faster-whisper in a small FastAPI app
exposing POST /align (fc-align contract used by Shared.Speech) AND
POST /transcribe (audio-in feature consumed by ttsreader-web Lane G).
Source: apps/fc-ttsreader/speech-align/ (Dockerfile + app.py +
requirements.txt). Built locally (apt-get RUN steps need BLUEJAY-WS,
not noc1) and ctr-imported to all 3 RKE2 nodes.
- ttsreader-web env: flip Speech__Alignment__Enabled=true and point
BaseUrl at http://ttsreader-align.fc-ttsreader.svc.cluster.local.:9200.
Add new TtsReader__Transcription__* env triplet pointing at the same
service (same /transcribe endpoint).
- Bump ttsreader-web image to v202604251046 (carries the
TranscriptionController + MCP tool + Quick.razor InputFile UI).
Cluster pods cannot reach BLUEJAY-WS speaches on 10.0.56.20:9200
because the rootless+host-net podman setup binds 127.0.0.1 only on the
WSL machine; nothing listens on the LAN-facing interface. The openai-compatible
Backend value also relied on a Common change still on feat/shared-indexing
rather than master, so the deployed image's Shared.Speech only knows
the FC-native /align shape.
Disable Speech:Alignment for now. EstimatedAlignmentClient kicks in and
keeps /api/v1/voices/preview-with-timings returning word-aligned JSON,
just with uniform-distribution timings instead of real Whisper output.
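As a rough picture of what uniform-distribution timings look like, consider the sketch below. It is not the EstimatedAlignmentClient source, whose details are not shown here; `estimate_uniform_timings` is a hypothetical name, and the output keys simply reuse the text/startMs/endMs contract.

```python
def estimate_uniform_timings(text: str, total_ms: int) -> list:
    """Give every whitespace-separated word an equal slice of total_ms."""
    words = text.split()
    if not words:
        return []
    slot = total_ms / len(words)
    return [
        {"text": word, "startMs": round(i * slot), "endMs": round((i + 1) * slot)}
        for i, word in enumerate(words)
    ]


estimated = estimate_uniform_timings("in the beginning was", 1000)
```

Every word gets the same duration regardless of length, which is why these timings are only a placeholder for real Whisper output.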
Re-enable once: (a) Common's openai-compatible Backend lands on master
and a new TtsReader image ships, or (b) we point at a LAN-routable
backend (e.g. an aiohttp /align shim, or speaches running on a node
that's actually reachable from cluster pods).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The fc-speech-align container on BLUEJAY-WS (port 9200) is the speaches
build of faster-whisper-server, which exposes the OpenAI-compatible
/v1/audio/transcriptions contract — not the FlowerCore /align contract.
FasterWhisperAlignmentClient (FlowerCore.Common a1b3bfc) supports both
shapes; tell it explicitly to talk OpenAI-compatible here so requests land
on the right endpoint and verbose_json gets adapted into the FC alignment
response. Also pin the Model id to one speaches recognizes.
Switch back to fc-align once a native /align backend is deployed (or wire
a tiny FastAPI shim in front of speaches if we want a stable contract).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Flips Speech__Alignment__Enabled=true and points BaseUrl at the
new BLUEJAY-WS podman quadlet running fc-speech-align (faster-
whisper, /align contract). When Lane 1δ's
/api/v1/voices/preview-with-timings runs after this lands, the
alignment.source field flips from 'estimated' to 'whisper' and
the per-word timings come from real audio analysis instead of
uniform-spacing estimates.
No image rebuild — the Lane 1α DI registration already routes
IWhisperAlignmentClient to FasterWhisperAlignmentClient when
Speech:Alignment:Enabled is true.
Companion firewall rule from FlowerCore.Puppet@bbc02ea +
@05504ed (whisper_align_enabled flag on bluejay-ws-linux Hiera)
opens port 9200 to RKE2 pod CIDR durably.
Picks up Lane 1δ + 3β:
- Lane 3β: GET /reader-embed iframe-friendly host route + public-
read CORS for /api/v1/voices and /_content/.../embed/
- Lane 1δ: GET /api/v1/voices/preview-with-timings — pairs
synthesized audio with per-word alignment timings (faster-
whisper or estimated fallback) so embed bundle / FcReaderOverlay
/ in-app /voices preview can word-highlight in one round-trip
- Latest FlowerCore.UI.Components from Common master:
FcReaderOverlay annotation popover (Lane 2γ) + <fc-reader>
standalone embed bundle (Phase 3)
Built from FlowerCore.TtsReader@06ef815 (master) against
FlowerCore.Common@d23d4c3 (master).
Image imported on rke2-server / rke2-agent1 / rke2-agent2.
Previous v202604241543backlog image accidentally used a stale
publish/ directory at the TtsReader repo root (Dockerfile.deploy
says COPY publish/ but my ad-hoc publish wrote to artifacts/publish/).
Rebuilt with a clean copy from artifacts/publish/ to publish/ first.
Confirmed new image has appsettings.json Preview section + the
quick-swipe-gestures.js asset baked in.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hotfix for two live render errors:
- Kokoro chapter render failed with "count ('-1') must be non-negative"
— streaming-WAV chunk-size sentinel (0xFFFFFFFF) read as -1.
- Piper render timed out on book-chapter paragraphs with no sentence
punctuation — one giant segment exceeded the 2-min timeout.
Source fix: FlowerCore.TtsReader@826589b. 153/153 tests green.
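The sentinel failure mode is easy to reproduce in isolation. The sketch below shows how the same four bytes read as -1 when treated as a signed 32-bit integer; the helper `read_chunk_size` and its "None means unknown length" convention are assumptions, not the actual fix in TtsReader.

```python
import struct

STREAMING_SENTINEL = 0xFFFFFFFF  # chunk size written when length is unknown

raw_size = b"\xff\xff\xff\xff"
as_signed = struct.unpack("<i", raw_size)[0]    # signed read: the bug
as_unsigned = struct.unpack("<I", raw_size)[0]  # unsigned read: the sentinel


def read_chunk_size(raw: bytes):
    """Return the chunk size, or None when the streaming sentinel is present."""
    (size,) = struct.unpack("<I", raw)
    return None if size == STREAMING_SENTINEL else size
```

A caller that gets None can read until end of stream instead of passing a negative count to a read call.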
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds TtsReader__Kokoro__Enabled=true + BaseUrl=http://10.0.56.20:10401
+ TimeoutSeconds=120 so the pod routes kokoro-tagged voices to the
Kokoro-FastAPI backend running on BLUEJAY-WS. Multi-engine router
falls through to Piper for piper-tagged and untagged voices.
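The routing rule can be summarized in a few lines. This is a Python sketch of the behavior described above, not MultiEngineSpeechSynthesizer itself; `pick_engine` is a hypothetical name, and falling back to Piper when Kokoro is disabled is an assumption.

```python
from typing import Optional


def pick_engine(voice_tag: Optional[str], kokoro_enabled: bool = True) -> str:
    """Route kokoro-tagged voices to Kokoro; everything else goes to Piper."""
    if voice_tag == "kokoro" and kokoro_enabled:
        return "kokoro"
    # Piper handles piper-tagged and untagged voices.
    return "piper"
```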
Requires nftables on BLUEJAY-WS to permit tcp/10401 from 10.0.56.0/23
and 10.42.0.0/16. Applied to the live ruleset — Puppet Hiera path is
the durable fix (kokoro_server_enabled under profile::security::firewall).
Tests 107 → 114 (+7 MultiEngineSpeechSynthesizerTests).
New surfaces: POST /api/v1/bible/projects (one-click whole-book render),
GET /api/v1/bible/books, GET /api/v1/bible/books/{book}/preview, MCP
tools render_tts_reader_bible_book + list_tts_reader_bible_books,
Dashboard "Render a Bible book" card. 107/107 tests, +7 from previous.
Pulls in FlowerCore.TtsReader@9e2497f: P2.3 iTunes-namespace podcast
feed (author, summary, category, cover art, episode numbering,
duration, atom:self link, serial channel type for Bible projects) and
P2.4 ID3v2 tags on MP3 export + Vorbis comments on OGG (title, artist
with Piper voice humanized, album, track N/M, genre defaulting to
Religion & Spirituality for Bible or Audiobook for text sources,
date). Phones and podcast apps now show proper track info instead of
"Unknown - Unknown".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls in FlowerCore.TtsReader@63e6b62: P1.1 Media Session API wiring in
fc-media-session.js + quick-player.js + rendered-chapter-player.js, and
P1.2 biblical-name pronunciation lexicon auto-seed on Bible-source
project creation plus apply-bible-defaults endpoint + MCP tool for
existing projects. Tests 81 -> 97 all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>