infra(public-tls): add gated Let's Encrypt issuers + tenant NetworkPolicy substrate

Cl-infra-2 (deep-regroup 2026-06-13). LE staging+prod ClusterIssuers (HTTP-01
via Traefik, DNS-01 stub) + a per-tenant default-deny NetworkPolicy template,
under gated/public-tls/ OUTSIDE apps/ so the ApplicationSet does NOT auto-apply
them (an applied ACME ClusterIssuer registers an account immediately). Internal
*.iamworkin.lan TLS stays on step-ca. Inert until the operator opens the
web-hosting public-exposure gate (R-1; 14/14 blockers red). Pairs with Codex
Wh-C1 (hybrid public TLS) + Wh-C2 (isolation).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Andrew Stoltz
2026-06-13 12:06:31 -05:00
parent b098604a6f
commit 387097485e
3 changed files with 176 additions and 0 deletions


@@ -0,0 +1,39 @@
# Public-TLS substrate (gated)

**Lane:** Cl-infra-2 (deep-regroup 2026-06-13). **Status:** authored, **NOT applied** (operator-gated).

This directory holds the Let's Encrypt + isolation substrate for **public** multi-tenant
web hosting. It lives **outside `apps/`** on purpose: the bluejay-infra ApplicationSet only
reconciles `apps/*`, so nothing here is auto-applied. Applying a cert-manager ACME
`ClusterIssuer` registers an ACME account immediately, so these stay inert until the
operator opens the web-hosting public-exposure gate (**R-1**).
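
To illustrate why nothing under `gated/` is reconciled, here is a minimal sketch of a directory-scoped ApplicationSet. It assumes bluejay-infra is an Argo CD ApplicationSet using the git directory generator. The repo URL, project, and destination are placeholders, and the `infra-` name prefix is inferred from the `infra-cluster-issuers` Application name mentioned in the issuer file's comments. The real definition may differ.

```yaml
# Sketch only: assumed shape of the bluejay-infra ApplicationSet.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: bluejay-infra
  namespace: argocd                      # assumption: default Argo CD namespace
spec:
  generators:
    - git:
        repoURL: https://git.example.invalid/bluejay-infra.git   # placeholder URL
        revision: HEAD
        directories:
          - path: apps/*                 # only apps/<dir> matches; gated/ never does
  template:
    metadata:
      name: 'infra-{{path.basename}}'    # e.g. apps/cluster-issuers -> infra-cluster-issuers
    spec:
      project: default                   # placeholder project
      source:
        repoURL: https://git.example.invalid/bluejay-infra.git   # placeholder URL
        targetRevision: HEAD
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'   # placeholder; ignored for cluster-scoped kinds
      syncPolicy:
        automated: {}
```

With a generator like this, `gated/public-tls/` produces no Application, so the issuers are applied only once the file is moved under `apps/`.
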
## What's here
| File | What | Activate when |
|---|---|---|
| `letsencrypt-issuers.yaml` | `letsencrypt-staging` + `letsencrypt-prod` ClusterIssuers (HTTP-01 via Traefik; DNS-01 stub for wildcards) | Public-go. Move to `apps/cluster-issuers/`, **staging first**. |
| `tenant-networkpolicy-template.yaml` | Per-tenant default-deny + allowlist NetworkPolicy (Traefik ingress, CoreDNS, own-DB egress only) | Rendered per tenant at provision time (Wh-C2 isolation). |
## The gate
Public exposure is **NO-GO** until the §6 go/no-go checklist in
[`docs/standards/web-hosting-production-readiness-plan.md`](../../../FlowerCore.Notes/docs/standards/web-hosting-production-readiness-plan.md)
is green (currently 14/14 red) **and** the operator explicitly opens R-1. Internal
`*.iamworkin.lan` TLS stays on **step-ca** (`apps/fc-dns/fc-dns.yaml`, ClusterIssuer `step-ca-dns01`);
these LE issuers are **only** for public tenant domains.
## Pairing
- **Codex Wh-C1** consumes `letsencrypt-staging`/`-prod` for hybrid public TLS on
FlowerCore.PHP/MySQL/DNS.
- **Codex Wh-C2** consumes the NetworkPolicy template for cross-tenant isolation suites.
## Activation checklist (public-go)
1. Wire a public DNS-01 solver (Cloudflare/Namecheap webhook) **or** confirm public tenant
domains route HTTP-01 to the cluster ingress.
2. `git mv gated/public-tls/letsencrypt-issuers.yaml apps/cluster-issuers/`. The file defines both issuers, so use staging only at this stage.
3. Issue one **staging** cert for a throwaway public domain; verify the chain in a browser.
4. Flip that tenant's Certificate `issuerRef` to `letsencrypt-prod`; mind LE rate limits.
5. Render `tenant-networkpolicy-template.yaml` per tenant; run the Wh-C2 negative suites.
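
Steps 3 and 4 above could be exercised with a Certificate along these lines. This is a sketch, not part of the committed substrate: the resource name, namespace, secret name, and domain are hypothetical, and only the two issuer names come from `letsencrypt-issuers.yaml`.

```yaml
# Sketch only: a throwaway Certificate for the staging check.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: throwaway-staging-check          # hypothetical name
  namespace: fc-tenant-demo              # hypothetical; follows the fc-tenant-{{TENANT}} pattern
spec:
  secretName: throwaway-staging-tls      # hypothetical Secret that receives the key pair
  dnsNames:
    - throwaway.example.com              # placeholder; use a real public throwaway domain
  issuerRef:
    group: cert-manager.io
    kind: ClusterIssuer
    name: letsencrypt-staging            # step 4: change to letsencrypt-prod once the chain checks out
```

The staging chain is untrusted by design, so a browser warning at step 3 is expected; the check is that the issuer shown is the LE staging CA. cert-manager creates the HTTP-01 solver pod in the Certificate's namespace, and by default it listens on port 8089 in current cert-manager releases. This is worth confirming, since the tenant isolation template only admits Traefik on 80, 443, and 8080.
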


@@ -0,0 +1,78 @@
# ============================================================================
# Let's Encrypt ClusterIssuers — PUBLIC TLS substrate (Cl-infra-2, deep-regroup 2026-06-13)
# ============================================================================
# GATED. This file lives OUTSIDE apps/ on purpose, so the bluejay-infra
# ApplicationSet does NOT auto-apply it. Applying a cert-manager ACME
# ClusterIssuer registers an ACME account immediately, so we keep these inert
# until the operator opens the web-hosting public-exposure gate (R-1; the §6
# go/no-go checklist in docs/standards/web-hosting-production-readiness-plan.md
# is currently 14/14 red).
#
# Pairs with Codex Wh-C1 (FlowerCore.PHP/MySQL/DNS hybrid public TLS) and
# Wh-C2 (isolation). Internal *.iamworkin.lan certs STAY on step-ca
# (apps/fc-dns/fc-dns.yaml: ClusterIssuer step-ca-dns01) — these LE issuers are
# ONLY for public tenant domains.
#
# TO ACTIVATE (operator public-go):
# 1. Confirm a public DNS-01 solver is wired (Cloudflare/Namecheap webhook) OR
# that public tenant domains route HTTP-01 to the cluster's public ingress.
# 2. Move this file to apps/cluster-issuers/ (the ApplicationSet will create
# infra-cluster-issuers and apply it), staging FIRST.
# 3. Issue ONE staging cert for a throwaway public domain, verify the chain,
# THEN switch that tenant's Certificate issuerRef to letsencrypt-prod.
# 4. Mind LE prod rate limits (50 certs/registered-domain/week, 5 dupes/week).
#
# Registration email is for expiry notices only — adjust to a role address if
# desired (astoltz@iamwork.in is the current operator contact).
# ----------------------------------------------------------------------------
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
  labels:
    app.kubernetes.io/part-of: flowercore
    flowercore.io/created-by: bluejay-infra
    flowercore.io/gate: public-tls
spec:
  acme:
    # LE STAGING: untrusted certs, generous limits. Use this first, always.
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: astoltz@iamwork.in
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      # HTTP-01 via Traefik. Requires the public tenant domain's :80 traffic to
      # reach the cluster ingress. For wildcard / apex without inbound :80, swap
      # to the dns01 solver block below (needs a public DNS provider webhook).
      - http01:
          ingress:
            class: traefik
      # --- DNS-01 alternative for wildcards (uncomment + wire a public DNS webhook) ---
      # - dns01:
      #     webhook:
      #       groupName: acme.flowercore.io # or the cloudflare/namecheap solver
      #       solverName: <public-dns-solver>
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
  labels:
    app.kubernetes.io/part-of: flowercore
    flowercore.io/created-by: bluejay-infra
    flowercore.io/gate: public-tls
spec:
  acme:
    # LE PRODUCTION: trusted certs, strict rate limits. Only after staging proves out.
    server: https://acme-v02.api.letsencrypt.org/directory
    email: astoltz@iamwork.in
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: traefik
      # - dns01:
      #     webhook:
      #       groupName: acme.flowercore.io
      #       solverName: <public-dns-solver>


@@ -0,0 +1,59 @@
# ============================================================================
# Per-tenant NetworkPolicy TEMPLATE — web-hosting isolation (Cl-infra-2 / Wh-C2)
# ============================================================================
# GATED substrate (outside apps/, not auto-applied). Modeled on the canonical
# default-deny + allowlist shape in apps/fc-devicemgmt/network-policy.yaml.
#
# Purpose: when a public multi-tenant site is provisioned, each tenant's pods
# get a NetworkPolicy that default-denies all ingress/egress, then allows only
# Traefik ingress, CoreDNS, and that tenant's own DB. This enforces the
# cross-tenant isolation Wh-C2 verifies with negative suites.
#
# Replace the {{TENANT}} placeholders and apply alongside the tenant's workload
# (the MySQL/PHP managers should emit this when they create a tenant, or a
# templating step in apps/ should render it). Kept here as the reference shape.
# ----------------------------------------------------------------------------
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-{{TENANT}}-isolation
  namespace: fc-tenant-{{TENANT}}
  labels:
    app.kubernetes.io/part-of: flowercore
    flowercore.io/tenant-id: "{{TENANT}}"
    flowercore.io/created-by: bluejay-infra
    flowercore.io/gate: public-tls
spec:
  podSelector: {} # all pods in the tenant namespace
  policyTypes: [Ingress, Egress]
  ingress:
    # Only Traefik may reach tenant pods (public traffic terminates at Traefik).
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: traefik-system
      ports:
        - { protocol: TCP, port: 80 }
        - { protocol: TCP, port: 443 }
        - { protocol: TCP, port: 8080 }
  egress:
    # CoreDNS resolution.
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
    # This tenant's OWN MySQL only (NOT other tenants' DBs; that's the isolation).
    - to:
        - podSelector:
            matchLabels:
              flowercore.io/tenant-id: "{{TENANT}}"
              app.kubernetes.io/name: mysql
      ports:
        - { protocol: TCP, port: 3306 }
    # NOTE: deliberately NO blanket egress. Add per-tenant allowances explicitly
    # (object storage, mail relay, etc.) so a compromised tenant pod cannot reach
    # the rest of the fleet or other tenants.
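    #
    # --- Example per-tenant allowance (illustrative sketch; uncomment + adjust) ---
    # Assumes a mail relay at a fixed address. 203.0.113.25 is a documentation
    # placeholder (RFC 5737), not a real relay; 587 is the usual SMTP submission
    # port. Neither value comes from the FlowerCore fleet.
    # - to:
    #     - ipBlock:
    #         cidr: 203.0.113.25/32
    #   ports:
    #     - { protocol: TCP, port: 587 }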