gx10/tts: persist Piper /tts source + manifest (telephony TTS port baseline)

Dockerfile (linux/arm64, en_US-amy-medium baked), tts_service.py (16kHz/16-bit/mono WAV, numpy resample 22050->16000), gx10-tts.yaml (CPU NodePort 30850, no GPU request), README (build/import/cutover/verify on the GX10 cluster).
2026-06-14 14:14:59 -05:00
parent e4d1735d35
commit d03a92407d
4 changed files with 324 additions and 0 deletions
--- a/gx10/tts/README.md
+++ b/gx10/tts/README.md
@@ -0,0 +1,59 @@
+# GX10 Piper TTS — telephony `/tts` endpoint
+
+CPU Piper TTS serving the telephony `/tts` contract on the **GX10 RKE2 cluster**
+(ASUS Ascent GX10 / NVIDIA DGX Spark, ARM64, `10.0.56.14`). This is the
+telephony-TTS-port-to-GX10 (P1) baseline: parity with edge1 at higher quality, zero
+GPU/aarch64 risk, and telephony no longer depends on the slow edge1 Pi 5.
+
+## What it is
+- `tts_service.py` — Flask app: `POST /tts {"text"}` → **16 kHz / 16-bit / mono WAV**
+  (canonical 44-byte header) + `GET /health`. Voice `en_US-amy-medium` (22.05 kHz
+  native) is numpy-resampled to 16 kHz so it drops straight onto Asterisk's
+  `.sln16` path (telephony strips the 44-byte header). Same wire contract as the
+  edge1 `speech-pipeline` `/tts`, just the TTS half (no STT/Wyoming).
+- `Dockerfile` — `linux/arm64`, voice baked in (no runtime HuggingFace dep).
+- `gx10-tts.yaml` — Namespace `tts` + Deployment (CPU-only, **no GPU request** so it
+  co-resides with the GPU-holding Ollama pod) + NodePort Service.
+
+## This cluster is NOT under the old-cluster ArgoCD (yet)
+Apply manually with the GX10's own kubectl:
+```bash
+ssh -J noc1 -i ~/.ssh/fcadmin_ed25519 bluejay@10.0.56.14
+export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
+K=/var/lib/rancher/rke2/bin/kubectl
+$K apply -f gx10-tts.yaml
+```
+
+## Build + import (native arm64 on the GX10)
+```bash
+docker build -t localhost/fc-gx10-tts:v20260614 .
+docker save localhost/fc-gx10-tts:v20260614 -o /tmp/t.tar
+sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock -n k8s.io images import /tmp/t.tar
+# manifest uses imagePullPolicy: Never (image lives in containerd, no registry)
+```
+
+## Telephony cutover (reversible)
+Endpoint telephony hits: **`http://10.0.56.14:30850`** (NodePort, MGMT VLAN 56).
+In `apps/telephony/telephony.yaml`:
+1. Deployment env `Tts__PiperUrl=http://10.0.56.14:30850`. **This is the real lever**:
+   env vars override `appsettings.Production.json`, so changing the configmap `Tts` block
+   alone does nothing (it was shadowed by a drifted live env `Tts__PiperUrl=edge1`).
+2. NetworkPolicy egress to `10.0.56.14/32`, TCP port `30850` (telephony-web is `hostNetwork`,
+   so the policy only affects non-hostNetwork pods; harmless either way).
+3. edge1 (`10.0.57.17:8500`) stays warm — **rollback = set `Tts__PiperUrl` back to it**.
+   The TTS circuit breaker + `MapTextToSound` canned-prompt fallback mean a bad endpoint
+   degrades gracefully, never to silence.
+
+## Verify (automated SIP test, not a manual call)
+```bash
+FLOWERCORE_SIP_TEST_MODE=required dotnet.exe test \
+  FlowerCore.Telephony/tests/FlowerCore.Telephony.SipTests/FlowerCore.Telephony.SipTests.csproj \
+  --filter FullyQualifiedName~Call_Star100_ReceivesAudibleAudioStream
+```
+A passing audible test alone is NOT sufficient, because edge1 also produces audible audio.
+Also confirm that the **GX10 TTS pod's own access log** (`kubectl -n tts logs deploy/gx10-tts`)
+shows `POST /tts 200` during the call and that telephony-web logs show requests to `10.0.56.14:30850`.
+
+## Voice upgrade (follow-on)
+Operator's pick is **Kokoro**. It needs either GPU time-slicing (Ollama holds the GB10 GPU, and
+MPS has been ruled out on GB10) or Kokoro on CPU behind a `/tts` shim. This Piper baseline stays as the floor.
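
Reviewer sketch (not part of the patch): the README's "What it is" section says the 22.05 kHz Piper output is numpy-resampled to 16 kHz and returned as a 16-bit mono WAV with a canonical 44-byte header. The snippet below is one minimal way that step could look. It is not taken from `tts_service.py`; the function names `resample_linear` and `to_wav_bytes` are hypothetical, and linear interpolation via `np.interp` is an assumed method (it applies no anti-alias filter, so the real service may do something different).

```python
import io
import wave

import numpy as np

SRC_RATE = 22050  # native rate of en_US-amy-medium, per the README
DST_RATE = 16000  # rate expected on the Asterisk .sln16 path, per the README


def resample_linear(pcm, src_rate=SRC_RATE, dst_rate=DST_RATE):
    """Resample int16 mono PCM by linear interpolation (assumed method)."""
    pcm = np.asarray(pcm, dtype=np.int16)
    if pcm.size == 0:
        return pcm
    n_out = int(round(pcm.size * dst_rate / src_rate))
    src_t = np.arange(pcm.size) / src_rate  # timestamps of input samples
    dst_t = np.arange(n_out) / dst_rate     # timestamps of output samples
    out = np.interp(dst_t, src_t, pcm.astype(np.float64))
    return np.clip(np.round(out), -32768, 32767).astype(np.int16)


def to_wav_bytes(pcm16, rate=DST_RATE):
    """Wrap int16 mono PCM in a RIFF/WAVE container using the stdlib."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(rate)
        w.writeframes(np.asarray(pcm16).astype("<i2").tobytes())
    return buf.getvalue()


# One second of a 440 Hz tone at the native rate stands in for Piper output.
t = np.arange(SRC_RATE) / SRC_RATE
tone = (0.3 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
wav = to_wav_bytes(resample_linear(tone))
payload = wav[44:]  # raw samples left after the 44-byte header is stripped
```

For plain PCM, the stdlib `wave` writer emits only the `RIFF`, `fmt ` and `data` chunks, which is what makes the header exactly 44 bytes. A consumer that strips a fixed 44 bytes, as the README says telephony does, relies on that layout.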