bluejay-infra/docs/gx10-tenant-landing/CUTOVER-RUNBOOK.md
Andrew Stoltz eae7b4ed7a infra(cx2-5): DNS auth/NetPol substrate, air-gap landing, arm64 ARC runner + tenant landing manifests
- fc-dns: add OnePasswordItem CRD for DNS API keys + NetworkPolicy for Phase 0 auth hardening; bump dns-web image tag
- fc-landing: rewrite landing HTML to remove CDN dependencies (air-gap safe); add preview.html standalone preview
- github-runner: add TOOLCACHE_ARCH to install-ruby-toolcache.sh for arm64 support; add Dockerfile.arm64 for arm64 ARC runner image
- docs/gx10-tenant-landing: per-user Deployment+IngressRoute manifests (andrew/dustin/erik/fit/matt) + CUTOVER-RUNBOOK.md

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 11:53:26 -05:00


GX10 Tenant Landing-Site Migration — Cutover Runbook

Date: 2026-06-16. This runbook covers migrating the 5 per-tenant public landing sites from the OLD RKE2 cluster (Traefik at 10.0.56.200) to the GX10 ARM64 cluster (VIP 10.0.57.202 / NodePort 10.0.56.14:32491).

Deployed on GX10 (DONE — staged-verified, NOT yet receiving public traffic)

| Domain(s) | GX10 ns | Workload | TLS secret (in ns + traefik-system) | Live content replicated |
|---|---|---|---|---|
| bluejay.dev, www.bluejay.dev | fc-tenant-andrew | nginx:alpine | cf-origin-bluejay-dev | "Blue Jay" (custom) |
| timeforta.co, www.timeforta.co | fc-tenant-dustin | nginx:alpine | cf-origin-timeforta-co | "Coming Soon" (generic) |
| erckak.dev, www.erckak.dev | fc-tenant-erik | nginx:alpine | cf-origin-erckak-dev | "Erckak" (custom) |
| flowerinsider.xyz, www.* | fc-tenant-fit | nginx:alpine | cf-origin-flowerinsider-xyz | "Flower Insider" (custom) |
| matt.flowercore.io | fc-tenant-matt | nginx:alpine | cf-origin-flowercore-io | "Coming Soon" (generic) |

All nginx pods are 1/1 Running, and every IngressRoute has priority 100 so that it overrides the GX10 public-catchall route. Each site replicates exactly what was live on OLD at migration time, so the cutover changes no visible content.

Staged verification (all HTTP 200, correct content, SNI-correct cert):

curl -sk --resolve <host>:32491:10.0.56.14 https://<host>:32491/
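The same check can be looped over every tenant host. The following is a minimal sketch, not part of the deployed tooling: the function names, the `--run` flag, and the choice to test only the primary hostnames are assumptions made for illustration. It checks the HTTP status only; because `-k` disables certificate validation, confirming the content and the SNI certificate still needs a manual look.

```shell
#!/usr/bin/env bash
# Staged verification sketch: hit each tenant host on the GX10 NodePort,
# bypassing Cloudflare and pfSense via curl --resolve.
GX10_IP="10.0.56.14"     # GX10 NodePort address (from this runbook)
GX10_PORT="32491"        # HTTPS NodePort (from this runbook)

# Primary hostnames from the table above; add the www. variants as needed.
HOSTS="bluejay.dev timeforta.co erckak.dev flowerinsider.xyz matt.flowercore.io"

# resolve_arg HOST -> value for curl --resolve (host:port:address)
resolve_arg() {
  printf '%s:%s:%s\n' "$1" "$GX10_PORT" "$GX10_IP"
}

# staged_url HOST -> URL that matches the --resolve entry
staged_url() {
  printf 'https://%s:%s/\n' "$1" "$GX10_PORT"
}

# verify_staged HOST -> prints "HOST <http status>"; fails unless status is 200
verify_staged() {
  local host="$1" code
  code=$(curl -sk -o /dev/null -w '%{http_code}' \
    --resolve "$(resolve_arg "$host")" "$(staged_url "$host")")
  printf '%s %s\n' "$host" "$code"
  [ "$code" = "200" ]
}

# Only touch the network when explicitly asked (hypothetical --run flag).
if [ "${1:-}" = "--run" ]; then
  for h in $HOSTS; do
    verify_staged "$h" || echo "FAILED: $h" >&2
  done
fi
```

Saved under a name of your choosing and run with `--run`, it prints one `host status` line per tenant and reports any host that did not return 200 on stderr.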

Public routing reality (why NO automatic cutover happened)

Every tenant domain enters the network through Cloudflare (proxied) → a dedicated pfSense WAN IP in 74.40.140.16/28 → pfSense port-forward. ALL FIVE currently forward to OLD Traefik 10.0.56.200:443:

| Domain | CF origin WAN IP | pfSense rdr today |
|---|---|---|
| bluejay.dev | 74.40.140.17 | → 10.0.56.200:443 |
| matt.flowercore.io | 74.40.140.19 | → 10.0.56.200:443 |
| timeforta.co | 74.40.140.21 | → 10.0.56.200:443 |
| erckak.dev | 74.40.140.23 | → 10.0.56.200:443 |
| flowerinsider.xyz | 74.40.140.25 | → 10.0.56.200:443 |

(Contrast: the main flowercore.io site uses WAN .24, which already forwards to GX10 at 10.0.56.14:32491.) NOTE: matt.flowercore.io is bound to WAN .19 (the MATT VPN IP), not .24, so the assumption that "*.flowercore.io already NATs to GX10" does NOT cover matt.

Because none of these NAT to GX10 yet, no cutover was performed (live sites untouched).

OPERATOR ACTION — cutover = repoint the pfSense port-forward target

For each domain, change the HTTPS port-forward target from 10.0.56.200:443 to 10.0.56.14:32491 and, where an HTTP rule exists, change its target from 10.0.56.200 to 10.0.56.14:30776. In pfSense (Firewall → NAT → Port Forward), edit the rules that carry these descriptions:

  • ANDREW: HTTPS to Traefik 74.40.140.17:443 → change target 10.0.56.200:443 to 10.0.56.14:32491
  • MATT: HTTPS to Traefik 74.40.140.19:443 → change target 10.0.56.200:443 to 10.0.56.14:32491
  • DUSTIN: HTTPS to Traefik 74.40.140.21:443 → change target 10.0.56.200:443 to 10.0.56.14:32491
  • ERIK: HTTPS to Traefik 74.40.140.23:443 → change target 10.0.56.200:443 to 10.0.56.14:32491
  • FIT: HTTPS to Traefik 74.40.140.25:443 → change target 10.0.56.200:443 to 10.0.56.14:32491
  • (corresponding :80 → 10.0.56.14:30776 HTTP rules likewise, optional — sites are HTTPS-only)

No Cloudflare DNS change is required: the WAN IPs stay the same, only the internal NAT target moves. Each can be flipped independently (per-tenant blast radius).
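Since each tenant can be flipped on its own, a printed checklist helps track which rules have been edited. The sketch below only prints text and changes nothing in pfSense; the function names and the checklist layout are illustrative assumptions, while the tenant labels, WAN IPs, and targets are copied from the rule list above.

```shell
#!/usr/bin/env bash
# Prints a per-tenant checklist of pfSense port-forward edits (HTTPS rules only).
OLD_TARGET="10.0.56.200:443"
NEW_TARGET="10.0.56.14:32491"

# TENANT:WAN_IP pairs, copied from the rule list above
RULES="ANDREW:74.40.140.17 MATT:74.40.140.19 DUSTIN:74.40.140.21 ERIK:74.40.140.23 FIT:74.40.140.25"

# plan_line TENANT:WAN_IP -> one checklist line
plan_line() {
  local tenant="${1%%:*}" wan="${1##*:}"
  printf '[ ] %s: %s:443 target %s -> %s\n' \
    "$tenant" "$wan" "$OLD_TARGET" "$NEW_TARGET"
}

# print_plan -> one line per tenant, in the order listed in RULES
print_plan() {
  for r in $RULES; do
    plan_line "$r"
  done
}

print_plan
```

Swapping `OLD_TARGET` and `NEW_TARGET` yields the matching rollback checklist.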

Post-flip verify (external):

curl -sI https://<host>/    # expect HTTP 200, Server: cloudflare, unchanged content
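The header expectations in that comment can be checked mechanically. This is a hedged sketch: `check_headers` is a hypothetical helper, and it inspects only the status line and the `Server` header. Because `curl -I` sends a HEAD request and returns no body, confirming that the content is unchanged still requires fetching the page itself.

```shell
#!/usr/bin/env bash
# check_headers: reads `curl -sI` output on stdin and succeeds only if the
# status line reports 200 and a "Server: cloudflare" header is present.
check_headers() {
  local headers status
  headers=$(tr -d '\r')                      # strip CRs from HTTP line endings
  status=$(printf '%s\n' "$headers" | awk 'NR==1 {print $2}')
  [ "$status" = "200" ] || return 1
  printf '%s\n' "$headers" | grep -qi '^server:[[:space:]]*cloudflare' || return 1
}

# Only touch the network when explicitly asked (hypothetical --run HOST form).
if [ "${1:-}" = "--run" ] && [ -n "${2:-}" ]; then
  if curl -sI "https://$2/" | check_headers; then
    echo "OK: $2"
  else
    echo "FAILED: $2" >&2
  fi
fi
```

Run it once per tenant after flipping that tenant's rule, for example with `--run bluejay.dev`.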

Rollback

The OLD cluster is left fully intact (ArgoCD apps infra-andrew/dustin/erik/fit Synced+Healthy, pods Running). To roll back any domain, revert that domain's pfSense port-forward target to 10.0.56.200:443.

Notes

  • The OLD cluster has DUPLICATE namespaces per tenant (tenant-X custom page + fc-tenant-X generic landing), both with IngressRoutes claiming the same host. Traefik non-deterministically picked a winner; live content was: andrew/erik/fit = custom (tenant-X), dustin/matt = generic (fc-tenant-X). GX10 consolidates to ONE namespace per tenant (fc-tenant-X) serving the content that was actually live.
  • infra-worldbuilder (worldbuilder.iamworkin.lan, internal .NET app) was ALREADY migrated to GX10 (fc-worldbuilder, 1/1 Running) — no action.
  • infra-flowercore (tenant-flowercore/flowercore-web demo) has NO public route and is superseded by the production fc-system/fc-landing-public (flowercore.io root) already live on GX10 — intentionally NOT migrated.