Three Certificates requested duration: 2160h (90d) with renewBefore: 720h (30d). step-ca's ACME provisioner caps cert lifetime at 30d, so it silently issued 720h certs, making renewBefore EQUAL to the actual cert lifetime. cert-manager treats the cert as needing immediate renewal the moment it's issued, creates a CertificateRequest, gets a new (still 30d) cert, marks it for immediate renewal, and loops.

Damage on 2026-05-07 ~20:30 (caught during regroup after 5h gap):
- fc-worldbuilder/worldbuilder-web-tls: 2365 CRs in 18h
- fc-distribution/fc-distribution-tls: 10880 CRs in 18h
- knowledge/knowledge-tls: 10888 CRs in 18h
Total: 24,133 stale CertificateRequest objects in etcd.

Bulk-deleted all CRs + Orders in those 3 namespaces, then this commit fixes the source so ArgoCD sync stops re-creating the loop.

Fix: match the working 720h/240h pattern used by every other FC service cert (agent-zero, fc-dns, fc-llm-bridge, fc-php, traefik-system, etc.). 30d cert lifetime + 10d renewal headroom = renewal at day 20, which is the cert-manager standard 2/3-of-lifetime practice.

Side effect during loop: ALSO contributed to step-ca load and may have caused intermittent timeouts cluster-wide (the latest stuck challenge was timing out dialing step-ca:9443 even though step-ca itself was up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
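The renewal arithmetic in the commit message above can be sketched in a few lines of Python. This is a simplified model, assuming renewal is scheduled at notAfter minus renewBefore; the function names are hypothetical and are not part of cert-manager.

```python
from datetime import timedelta


def renewal_offset(issued_lifetime: timedelta, renew_before: timedelta) -> timedelta:
    """Time after issuance at which renewal becomes due (simplified model).

    Assumption: renewal is scheduled at notAfter - renewBefore, so the offset
    from issuance is the ISSUED lifetime minus renewBefore. The requested
    duration does not matter once the CA has capped it.
    """
    return issued_lifetime - renew_before


def renews_immediately(issued_lifetime: timedelta, renew_before: timedelta) -> bool:
    """True when the cert is already due for renewal at the moment of issuance."""
    return renewal_offset(issued_lifetime, renew_before) <= timedelta(0)


# step-ca issued 720h certs even though 2160h was requested.
issued = timedelta(hours=720)

broken = renewal_offset(issued, timedelta(hours=720))  # old renewBefore
fixed = renewal_offset(issued, timedelta(hours=240))   # new renewBefore

print(broken.days, fixed.days)  # prints "0 20"
print(renews_immediately(issued, timedelta(hours=720)))  # prints "True"
```

A pre-merge check along these lines would have to compare renewBefore against the lifetime the CA actually issues (30d here), not against the requested duration in the manifest.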
fc-distribution — staged deployment (Phase 1, USB provisioning)
Status: manifests staged, NOT YET APPLIED. Image must be built +
imported and signing 1Password items confirmed before git push.
- Architecture: ../../../FlowerCore.Notes/docs/infrastructure/usb-provisioning-architecture.md
- Repo: D:\git\FlowerCore\FlowerCore.Distribution\ (README.md, CLAUDE.md)
- Shared lib: FlowerCore.Common -> FlowerCore.Shared.Distribution
FlowerCore.Distribution publishes signed edition manifests (ECDSA P-256
over canonical JSON) and serves the SHA-256 content-addressed blob store
that USB builders pull from. The verifier embeds the IAmWorkin ACME CA Root CA as the trust anchor; per-edition leaf signing material lives in
1Password and is mounted into the pod read-only.
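To illustrate the two ideas in the paragraph above, here is a minimal sketch of canonical JSON serialization and SHA-256 content addressing using only the Python standard library. The canonicalization rules (sorted keys, compact separators, UTF-8) and the manifest field names are assumptions for illustration; the source does not state which canonical form FlowerCore.Distribution uses. The ECDSA P-256 signing step is omitted because it needs a third-party crypto library.

```python
import hashlib
import json


def canonical_json(manifest: dict) -> bytes:
    """Serialize a manifest deterministically so signatures are reproducible.

    Assumption: canonical form = sorted keys, no insignificant whitespace,
    UTF-8 encoding. The real service may use a different scheme.
    """
    text = json.dumps(manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def blob_address(content: bytes) -> str:
    """Content address of a blob: the hex SHA-256 digest of its bytes."""
    return hashlib.sha256(content).hexdigest()


blob = b"example payload"

# Hypothetical manifest shape; only the key ordering differs between the two.
manifest = {"edition": "kiosk-standard", "blobs": [blob_address(blob)]}
reordered = {"blobs": [blob_address(blob)], "edition": "kiosk-standard"}

# Both orderings yield identical bytes, so they would share one signature.
assert canonical_json(manifest) == canonical_json(reordered)
```

The signature would be computed over the bytes returned by `canonical_json`, which is why the serialization must not depend on key order or whitespace.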
Deployment order (do NOT skip / reorder)
1. FlowerCore.DNS preflight — VERIFIED 2026-04-23
dist.iamworkin.lan already resolves to 10.0.56.200, but keep the
FlowerCore.DNS preflight green before push:
curl -sk "https://dns.iamworkin.lan/api/v1/zones/iamworkin.lan/resolve-preflight?hostname=dist.iamworkin.lan"
# Expect: "resolvable": true
python bluejay-infra/scripts/check-pfsense-dns.py
# Historical filename retained; implementation now calls FlowerCore.DNS
# resolve-preflight instead of raw resolver lookups.
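A script can gate on the preflight response shown above with a small check like the following. This is a sketch only, assuming the endpoint returns a JSON object with a top-level "resolvable" field (the only field the source shows); `preflight_ok` is a hypothetical helper and is not the implementation of check-pfsense-dns.py.

```python
import json


def preflight_ok(body: str) -> bool:
    """Return True only if the response body is JSON with "resolvable": true.

    Anything else (invalid JSON, a non-object, a missing field, or a
    non-boolean value) is treated as a failed preflight.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("resolvable") is True


print(preflight_ok('{"resolvable": true}'))   # prints "True"
print(preflight_ok('{"resolvable": false}'))  # prints "False"
```

Failing closed on malformed responses keeps a broken DNS API from being mistaken for a green preflight.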
If the record ever disappears, recreate it through FlowerCore.DNS before push/apply:
curl -sk https://dns.iamworkin.lan/api/v1/servers
curl -sk -X POST https://dns.iamworkin.lan/api/v1/servers/<serverId>/zones/iamworkin.lan/records \
-H "Content-Type: application/json" \
-d '{"name":"dist","type":"A","data":"10.0.56.200","ttl":300}'
If the record is missing, cert-manager HTTP-01 will silently back off ~2h. See
memory feedback_pfsense_dns_required_for_acme.md.
2. 1Password items required in vault IAmWorkin
| Item title | Item id | Used as |
|---|---|---|
| FlowerCore Code Signing CA | (existing) | Informational handle only; root CA is baked into the image at build time, not mounted |
| FlowerCore Edition Signing Key - edition:kiosk-standard | 3hf33egdvnni6jyuws3r737mqe | Mounted at /signing/kiosk-standard/ |
| FlowerCore Edition Signing Key - edition:aistation-field | ccxrtsan5samfq4pfuczymacrq | Mounted at /signing/aistation-field/ |
Each edition item must publish three field labels (the operator turns field labels into Secret keys verbatim):
- certificate.pem: leaf certificate
- private-key.pem: ECDSA P-256 private key
- chain.pem: leaf + intermediate (referenced by the env var as the cert-path; the verifier uses this for signature path validation)
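Because the field labels become Secret keys verbatim, a typo in 1Password surfaces as a missing key in the cluster. A minimal sketch of a check for the three required labels follows; `missing_keys` is a hypothetical helper, and the input is assumed to be the `data` mapping of a Secret (for example, parsed from `kubectl get secret -o json`).

```python
REQUIRED_KEYS = ("certificate.pem", "private-key.pem", "chain.pem")


def missing_keys(secret_data: dict) -> list:
    """Return the required field labels that are absent from a Secret's data map."""
    return [key for key in REQUIRED_KEYS if key not in secret_data]


# A Secret whose private-key field was mislabelled in 1Password.
incomplete = {"certificate.pem": "<base64>", "chain.pem": "<base64>", "private_key.pem": "<base64>"}

print(missing_keys(incomplete))  # prints "['private-key.pem']"
```

An empty list means all three labels are present; the check does not validate the PEM contents themselves.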
3. Build + import the image to rke2-server
The Pod is pinned to rke2-server because the Synology NFS export
/volume1/kubernetes only allows that node. Importing to the agents is
optional until the ACL is widened.
# From BLUEJAY-WS, in D:\git\FlowerCore\FlowerCore.Distribution
TAG="v$(date +%Y%m%d%H%M)"
dotnet.exe publish -c Release -o deploy/app \
src/FlowerCore.Distribution.Web/FlowerCore.Distribution.Web.csproj
podman build -t localhost/fc-distribution:$TAG -f deploy/Dockerfile.deploy deploy
podman save localhost/fc-distribution:$TAG -o /tmp/fc-distribution.tar
scp /tmp/fc-distribution.tar rke2-server:/tmp/
ssh rke2-server "sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock -n k8s.io images import /tmp/fc-distribution.tar"
4. Bump the image tag + push
Edit fc-distribution.yaml, replace localhost/fc-distribution:v202604231530
with the tag from step 3, then:
cd D:/git/FlowerCore/bluejay-infra
python scripts/check-pfsense-dns.py
git add apps/fc-distribution/
git commit -m "feat(fc-distribution): deploy Phase 1 manifest publisher"
git push
ArgoCD picks up the change within ~3 minutes and creates the infra-fc-distribution Application.
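The manual tag replacement at the start of this step could be scripted. The sketch below assumes tags always follow the `v` + 12-digit timestamp format produced in step 3 and that the manifest references the image as `localhost/fc-distribution:<tag>`; `bump_image_tag` is a hypothetical helper, not an existing script in the repo.

```python
import re

# Matches the image reference with a v<YYYYmmddHHMM> tag (12 digits).
IMAGE_RE = re.compile(r"(localhost/fc-distribution:)v\d{12}")
TAG_RE = re.compile(r"v\d{12}")


def bump_image_tag(manifest_text: str, new_tag: str) -> str:
    """Replace every fc-distribution image tag in the manifest text with new_tag.

    Raises ValueError if the tag is malformed or no image reference is found,
    so a silent no-op cannot be committed by mistake.
    """
    if not TAG_RE.fullmatch(new_tag):
        raise ValueError(f"unexpected tag format: {new_tag!r}")
    updated, count = IMAGE_RE.subn(lambda m: m.group(1) + new_tag, manifest_text)
    if count == 0:
        raise ValueError("no fc-distribution image reference found")
    return updated


before = "image: localhost/fc-distribution:v202604231530\n"
print(bump_image_tag(before, "v202605011200"), end="")
# prints "image: localhost/fc-distribution:v202605011200"
```

Raising on zero matches is a deliberate choice: if the manifest layout changes, the script fails loudly instead of pushing the old tag.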
5. Verify
fcadmin_ssh noc1 '
kubectl -n argocd get application infra-fc-distribution
kubectl -n fc-distribution get certificate,pod,secret
curl -sk -m 8 -o /dev/null -w "HTTP %{http_code}\n" https://dist.iamworkin.lan/healthz
'
Expect: Certificate Ready: True within ~60s, /healthz returning HTTP 200, and both
edition-kiosk-standard and edition-aistation-field Secrets present,
each with certificate.pem, private-key.pem, and chain.pem keys.