Dockerfile (linux/arm64, en_US-amy-medium baked), tts_service.py (16kHz/16-bit/mono WAV, numpy resample 22050->16000), gx10-tts.yaml (CPU NodePort 30850, no GPU request), README (build/import/cutover/verify on the GX10 cluster).
# GX10 Piper TTS — telephony `/tts` endpoint

CPU Piper TTS serving the telephony `/tts` contract on the **GX10 RKE2 cluster**
(ASUS Ascent GX10 / NVIDIA DGX Spark, ARM64, `10.0.56.14`). This is the
telephony-TTS-port-to-GX10 (P1) baseline: edge1 parity at higher quality, zero
GPU/aarch64 risk, and it moves telephony off the slow edge1 Pi 5.

## What it is
- `tts_service.py` — Flask app: `POST /tts {"text"}` → **16 kHz / 16-bit / mono WAV**
  (canonical 44-byte header) + `GET /health`. Voice `en_US-amy-medium` (22.05 kHz
  native) is numpy-resampled to 16 kHz so it drops straight onto Asterisk's
  `.sln16` path (telephony strips the 44-byte header). Same wire contract as the
  edge1 `speech-pipeline` `/tts`, just the TTS half (no STT/Wyoming).
- `Dockerfile` — `linux/arm64`, voice baked in (no runtime HuggingFace dep).
- `gx10-tts.yaml` — Namespace `tts` + Deployment (CPU-only, **no GPU request** so it
  co-resides with the GPU-holding Ollama pod) + NodePort Service.

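
The resample-and-wrap step described for `tts_service.py` can be sketched roughly as follows. This is an illustration, not the service's actual code: the README only says the voice is "numpy-resampled", so linear interpolation via `np.interp` is an assumption, and the names `resample_linear`, `to_wav_bytes` and `strip_header` are hypothetical.

```python
import io
import wave

import numpy as np

SRC_RATE = 22050  # native rate of en_US-amy-medium, per this README
DST_RATE = 16000  # rate promised by the /tts contract


def resample_linear(pcm, src_rate=SRC_RATE, dst_rate=DST_RATE):
    """Resample int16 mono PCM by linear interpolation (assumed method, no anti-alias filter)."""
    if len(pcm) == 0:
        return np.asarray(pcm, dtype=np.int16)
    n_out = int(round(len(pcm) * dst_rate / src_rate))
    src_t = np.arange(len(pcm)) / src_rate   # timestamps of the input samples
    dst_t = np.arange(n_out) / dst_rate      # timestamps of the output samples
    out = np.interp(dst_t, src_t, np.asarray(pcm, dtype=np.float64))
    return np.clip(np.round(out), -32768, 32767).astype(np.int16)


def to_wav_bytes(pcm, rate=DST_RATE):
    """Wrap int16 mono PCM in a RIFF/WAVE container (44-byte header for plain PCM)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit
        w.setframerate(rate)
        w.writeframes(np.asarray(pcm).astype("<i2").tobytes())
    return buf.getvalue()


def strip_header(wav_bytes):
    """Drop the canonical 44-byte header, leaving raw signed-linear samples (the .sln16 payload)."""
    return wav_bytes[44:]
```

One second of 22.05 kHz input should come out as 16000 samples, that is 32000 payload bytes after the header is stripped. Linear interpolation is the simplest option; a filtered resampler would reduce aliasing, at the cost of another dependency.
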
## This cluster is NOT under the old-cluster ArgoCD (yet)
Apply manually with the GX10's own kubectl:
```bash
ssh -J noc1 -i ~/.ssh/fcadmin_ed25519 bluejay@10.0.56.14
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
K=/var/lib/rancher/rke2/bin/kubectl
$K apply -f gx10-tts.yaml
```

## Build + import (native arm64 on the GX10)
```bash
docker build -t localhost/fc-gx10-tts:v20260614 .
docker save localhost/fc-gx10-tts:v20260614 -o /tmp/t.tar
sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock -n k8s.io images import /tmp/t.tar
# manifest uses imagePullPolicy: Never (image lives in containerd, no registry)
```

## Telephony cutover (reversible)
Endpoint telephony hits: **`http://10.0.56.14:30850`** (NodePort, MGMT VLAN 56).
In `apps/telephony/telephony.yaml`:
1. Deployment env `Tts__PiperUrl=http://10.0.56.14:30850` — **this is the real lever**;
   env vars override `appsettings.Production.json`, so the configmap `Tts` block alone
   is inert (it was shadowed by a drifted live env `Tts__PiperUrl=edge1`).
2. NetworkPolicy egress to `10.0.56.14/32:30850` (telephony-web is `hostNetwork`, so this
   only matters for non-hostNetwork pods; harmless either way).
3. edge1 (`10.0.57.17:8500`) stays warm — **rollback = set `Tts__PiperUrl` back to it**.
   The TTS circuit breaker + `MapTextToSound` canned-prompt fallback mean a bad endpoint
   degrades gracefully, never to silence.

## Verify (not a manual call)
```bash
FLOWERCORE_SIP_TEST_MODE=required dotnet.exe test \
  FlowerCore.Telephony/tests/FlowerCore.Telephony.SipTests/FlowerCore.Telephony.SipTests.csproj \
  --filter FullyQualifiedName~Call_Star100_ReceivesAudibleAudioStream
```
A passing audible test alone is NOT sufficient (edge1 also produces audible audio) —
confirm the **GX10 TTS pod's own access log** (`kubectl -n tts logs deploy/gx10-tts`)
shows `POST /tts 200` during the call, and telephony-web logs target `10.0.56.14:30850`.

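
As a supplement to the SIP test and log check (not a replacement for them), the wire contract itself can be checked directly against the endpoint. The sketch below is a minimal example under assumptions: the helper names are hypothetical, and the request shape (a JSON body with a `text` key and a `Content-Type: application/json` header) is inferred from the `POST /tts {"text"}` summary in this README rather than from the service source.

```python
import io
import json
import urllib.request
import wave


def describe_wav(data):
    """Read the format fields out of a WAV byte string."""
    with wave.open(io.BytesIO(data), "rb") as w:
        return {
            "channels": w.getnchannels(),
            "sample_width_bytes": w.getsampwidth(),
            "rate_hz": w.getframerate(),
            "frames": w.getnframes(),
        }


def meets_contract(data):
    """True if data is 16 kHz / 16-bit / mono with the data chunk at the canonical offset."""
    info = describe_wav(data)
    # In a canonical 44-byte header the "data" tag sits at bytes 36..40.
    canonical_header = data[:4] == b"RIFF" and data[36:40] == b"data"
    return (
        canonical_header
        and info["channels"] == 1
        and info["sample_width_bytes"] == 2
        and info["rate_hz"] == 16000
    )


def fetch_tts(base_url, text, timeout=10.0):
    """POST text to <base_url>/tts and return the response body (assumed request shape)."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/tts",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

From a host that can reach the NodePort, `meets_contract(fetch_tts("http://10.0.56.14:30850", "hello"))` should return `True` if the service honors the contract described above. Because the request goes straight to the GX10 address, it cannot be answered by edge1, although it says nothing about which endpoint telephony itself is using.
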
## Voice upgrade (follow-on)
Operator's pick is **Kokoro**; needs GPU time-slicing (Ollama holds the GB10 GPU; MPS is
refuted on GB10) OR Kokoro-CPU behind a `/tts` shim. This Piper baseline stays as the floor.