Files
Andrew Stoltz bc32b5ef04 fc-ttsreader: deploy fc-modern-tts (Edge Read Aloud Hebrew/Greek)
Adds a fourth TTS engine alongside Piper / Kokoro / biblical-tts: a
small FastAPI bridge to Microsoft Edge's Read Aloud TTS via the
edge-tts Python package. Provides studio-quality Modern Hebrew (he-IL)
and Modern Greek (el-GR) narrators for the cluster.

modern-tts/Dockerfile + app.py:
- Python 3.12 base + edge-tts==7.2.8 (older versions hit 403 from MS).
- POST /tts -> MP3 audio (audio/mpeg).
- POST /timings -> word-level timings. Edge sometimes omits WordBoundary
  events for non-English voices; fall back to MP3-frame-walking duration
  estimate + proportional distribution across whitespace-split words
  (same approach biblical-tts uses for eSpeak).
- GET /voices?language=all|default — filtered to he-/el- by default so
  the AiStation voice picker isn't overwhelmed by 400+ voices.
- GET /health for probes.
- Body shape mirrors BiblicalTtsRequest so the .NET client lives in the
  same FlowerCore.Shared.Speech package.

K8s deployment in fc-ttsreader namespace:
- ttsreader-modern Deployment + Service on port 10403.
- localhost/fc-modern-tts:v1, imagePullPolicy: Never (built on noc1,
  imported to all 3 RKE2 nodes via ctr).
- runAsNonRoot uid 1654 + fsGroup 1654.
- dnsPolicy: None to bypass the *.iamworkin.lan template hijack on
  Microsoft endpoint lookups.
- Modest resources (100m/128Mi req, 1000m/512Mi limit) — edge-tts is
  network-bound, not compute-bound.
- Probes against /health.

Verified live locally: container handles 'Καλημέρα Ελλάδα Πώς είστε'
in 2496ms, returns el-GR-NestorasNeural voice + 4 word timings.
Hebrew: 'בְּרֵאשִׁית בָּרָא אֱלֹהִים' returns he-IL-AvriNeural,
2472ms, 3 words.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:39:21 -05:00

37 lines
1.3 KiB
Docker

# FlowerCore modern-tts — wraps Microsoft Edge's Read Aloud TTS service
# (via the edge-tts Python package) to give the cluster studio-quality
# Modern Hebrew (he-IL-*) and Modern Greek (el-GR-*) voices alongside the
# eSpeak biblical engine. Same shape as fc-biblical-tts so the .NET client
# lives in the same Shared.Speech package.
#
# Note: edge-tts depends on Microsoft's public Edge endpoint; the cluster
# pod needs egress to *.tts.speech.microsoft.com. dnsPolicy: None on the
# Deployment makes sure the iamworkin.lan template hijack doesn't rewrite
# the lookup back to Traefik VIP.
FROM python:3.12-slim
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1 \
PIP_NO_CACHE_DIR=1
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py /app/
RUN useradd --create-home --shell /usr/sbin/nologin --uid 1654 tts
USER 1654
EXPOSE 10403
HEALTHCHECK --interval=30s --timeout=5s --start-period=20s --retries=3 \
CMD python -c "import urllib.request,sys; urllib.request.urlopen('http://127.0.0.1:10403/health',timeout=3); sys.exit(0)" || exit 1
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "10403", "--workers", "1"]