bluejay-infra/gx10/tts/README.md

# GX10 Piper TTS — telephony `/tts` endpoint

CPU Piper TTS serving the telephony `/tts` contract on the **GX10 RKE2 cluster**
(ASUS Ascent GX10 / NVIDIA DGX Spark, ARM64, `10.0.56.14`). This is the
telephony-TTS-port-to-GX10 (P1) baseline: edge1 parity at higher quality, zero
GPU/aarch64 risk, frees telephony off the slow edge1 Pi 5.

## What it is
- `tts_service.py` — Flask app: `POST /tts {"text"}` → **16 kHz / 16-bit / mono WAV**
  (canonical 44-byte header) + `GET /health`. Voice `en_US-amy-medium` (22.05 kHz
  native) is numpy-resampled to 16 kHz so it drops straight onto Asterisk's
  `.sln16` path (telephony strips the 44-byte header). Same wire contract as the
  edge1 `speech-pipeline` `/tts`, just the TTS half (no STT/Wyoming).
- `Dockerfile` — `linux/arm64`, voice baked in (no runtime HuggingFace dep).
- `gx10-tts.yaml` — Namespace `tts` + Deployment (CPU-only, **no GPU request** so it
  co-resides with the GPU-holding Ollama pod) + NodePort Service.

## This cluster is NOT under the old-cluster ArgoCD (yet)
Apply manually with the GX10's own kubectl:
```bash
ssh -J noc1 -i ~/.ssh/fcadmin_ed25519 bluejay@10.0.56.14
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
K=/var/lib/rancher/rke2/bin/kubectl
$K apply -f gx10-tts.yaml
```

## Build + import (native arm64 on the GX10)
```bash
docker build -t localhost/fc-gx10-tts:v20260614 .
docker save localhost/fc-gx10-tts:v20260614 -o /tmp/t.tar
sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock -n k8s.io images import /tmp/t.tar
# manifest uses imagePullPolicy: Never (image lives in containerd, no registry)
```

## Telephony cutover (reversible)
Endpoint telephony hits: **`http://10.0.56.14:30850`** (NodePort, MGMT VLAN 56).
In `apps/telephony/telephony.yaml`:
1. Deployment env `Tts__PiperUrl=http://10.0.56.14:30850` — **this is the real lever**;
   env vars override `appsettings.Production.json`, so the configmap `Tts` block alone
   is inert (it was shadowed by a drifted live env `Tts__PiperUrl=edge1`).
2. NetworkPolicy egress to `10.0.56.14/32:30850` (telephony-web is `hostNetwork`, so this
   only matters for non-hostNetwork pods; harmless either way).
3. edge1 (`10.0.57.17:8500`) stays warm — **rollback = set `Tts__PiperUrl` back to it**.
   The TTS circuit breaker + `MapTextToSound` canned-prompt fallback mean a bad endpoint
   degrades gracefully, never to silence.

## Verify (not a manual call)
```bash
FLOWERCORE_SIP_TEST_MODE=required dotnet.exe test \
  FlowerCore.Telephony/tests/FlowerCore.Telephony.SipTests/FlowerCore.Telephony.SipTests.csproj \
  --filter FullyQualifiedName~Call_Star100_ReceivesAudibleAudioStream
```
A passing audible test alone is NOT sufficient (edge1 also produces audible audio) —
confirm the **GX10 TTS pod's own access log** (`kubectl -n tts logs deploy/gx10-tts`)
shows `POST /tts 200` during the call, and telephony-web logs target `10.0.56.14:30850`.

## Voice upgrade (follow-on)
Operator's pick is **Kokoro**; needs GPU time-slicing (Ollama holds the GB10 GPU; MPS is
refuted on GB10) OR Kokoro-CPU behind a `/tts` shim. This Piper baseline stays as the floor.