Manifest hardening (per documented memories):

- apps/asterisk/deployment.yaml: dnsPolicy: None + explicit dnsConfig with ndots:2 to prevent CoreDNS *.iamworkin.lan template from hijacking external egress (downloads.asterisk.org).
- apps/fc-llm-bridge/fc-llm-bridge.yaml: same dnsConfig pattern for api.anthropic.com egress.
- apps/fc-ttsreader/fc-ttsreader.yaml: same dnsConfig pattern for huggingface.co model seeding.
- apps/fc-messageboard/fc-messageboard.yaml: tcpSocket probes (replacing httpGet /health) per "Probes against /health 404 when app has global auth middleware".
- apps/fc-signalcontrol/fc-signalcontrol.yaml: same tcpSocket probe fix.

New lint project:

- tests/bluejay-infra-lint/BluejayInfraLint.Tests.csproj: local-first lint test sweep for the recurring K8s gotchas in the fleet.
- tests/bluejay-infra-lint/FleetManifestLintTests.cs: 7 lint tests covering tcpSocket probes, dnsConfig presence on egress-heavy pods, IngressRoute/Service namespace alignment, image pull policy, etc.
- tests/bluejay-infra-lint/conftest.dev/: matching conftest policies for environments with conftest/opa.
- .gitignore: adds bin/ + obj/ + DS_Store/swp.

README.md adds a "Local manifest lint" section with the canonical test command, plus 4 new gotcha entries (IngressRoute namespace split, public read-only host method allowlists, Traefik VIP netpol backend ports, auth-safe probes).

Tests: 7 / 7 lint tests passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# bluejay-infra

Infrastructure manifests for ArgoCD. An `ApplicationSet` in the `argocd` namespace watches the `apps/*` directories in this repo and creates one `Application` per subdirectory (prefixed `infra-<name>`).

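
The naming rule above can be sketched as a small helper that lists the `Application` names a checkout should produce. This is an illustration only: `expected_applications` is a hypothetical name, and it assumes every immediate subdirectory of `apps/` maps to exactly one `Application`.

```python
from pathlib import Path
import tempfile


def expected_applications(repo_root: Path, prefix: str = "infra-") -> list[str]:
    """Return the Application names implied by the apps/* subdirectories."""
    apps_dir = repo_root / "apps"
    return sorted(
        f"{prefix}{child.name}"
        for child in apps_dir.iterdir()
        if child.is_dir()  # plain files under apps/ are ignored
    )


# Build a throwaway repo layout so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("fc-messageboard", "asterisk"):
        (root / "apps" / name).mkdir(parents=True)
    print(expected_applications(root))
```

Comparing this list with `kubectl -n argocd get application` is one way to spot a directory that ArgoCD has not picked up yet.
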
## Adding a new service to the cluster

Follow these steps in order. **Step 1 must run before step 3.** If you skip it, cert-manager HTTP-01 will silently fail for ~2h per cert (exponential backoff) until someone diagnoses the DNS.

### 1. Create or verify the FlowerCore.DNS A record (REQUIRED for current HTTP-01 manifests)

step-ca (the ACME CA on noc1) runs in a Podman container with host networking. Its container resolver uses pfSense Unbound (10.0.56.1), **not** cluster CoreDNS. So even though CoreDNS has a wildcard `*.iamworkin.lan → 10.0.56.200` for in-cluster lookups, step-ca cannot see it. Every new public hostname needs an explicit pfSense host override.

The management path is now `FlowerCore.DNS`, not `FlowerCore.Notes/scripts/pfsense-add-dns-overrides.py`. Add or verify the public A record there before you apply the manifest:

```bash
curl -sk https://dns.iamworkin.lan/api/v1/servers
# Find the pfSense serverId, then create the record using the host label only.
# Example: for foo.iamworkin.lan, use "name":"foo".

curl -sk -X POST https://dns.iamworkin.lan/api/v1/servers/<serverId>/zones/iamworkin.lan/records \
  -H "Content-Type: application/json" \
  -d '{"name":"<yourservice>","type":"A","data":"10.0.56.200","ttl":300}'
```

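
The "host label only" rule is easy to get wrong when scripting record creation. The sketch below derives the POST body from a full hostname; `a_record_payload` is a hypothetical helper, and the field names simply mirror the `curl` example above rather than any documented schema.

```python
import json

ZONE = "iamworkin.lan"
TRAEFIK_VIP = "10.0.56.200"


def a_record_payload(fqdn: str, zone: str = ZONE,
                     address: str = TRAEFIK_VIP, ttl: int = 300) -> dict:
    """Build the A-record body for a host under `zone`, using the host label only."""
    fqdn = fqdn.rstrip(".").lower()
    suffix = "." + zone
    label = fqdn[: -len(suffix)] if fqdn.endswith(suffix) else ""
    if not label:
        # Covers the bare zone, an empty label, and names outside the zone.
        raise ValueError(f"{fqdn!r} is not a host under {zone!r}")
    return {"name": label, "type": "A", "data": address, "ttl": ttl}


# foo.iamworkin.lan becomes "name": "foo", matching the comment in the curl example.
print(json.dumps(a_record_payload("foo.iamworkin.lan")))
```
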
Verify all referenced iamworkin.lan hosts resolve (run from anywhere on LAN):

```bash
python scripts/check-pfsense-dns.py
# Historical filename retained. The script now calls
# https://dns.iamworkin.lan/api/v1/zones/iamworkin.lan/resolve-preflight
# for every Certificate dnsName and Traefik Host(...) rule it finds.

python scripts/check-pfsense-dns.py --live
# Optional stronger pass when kubectl access is available; also checks
# live-cluster Certificates and IngressRoutes for drift outside manifests.
```

**Symptom if you skip this:** the Certificate resource stays `Ready: False` with `status.reason: unexpected non-ACME API error: context deadline exceeded`. To recover, add the DNS record first, then run `kubectl -n <ns> delete order <order-name>` to bypass cert-manager's backoff.

### 2. Create the app manifest

Create `apps/<name>/<name>.yaml` containing the Namespace, Deployment, Service, Certificate, and IngressRoute. Reference an existing directory (e.g. `apps/fc-messageboard/`) for the canonical shape.

Conventions:

- `Namespace` has label `app.kubernetes.io/part-of: bluejay-infra`
- `Deployment.spec.selector.matchLabels` and `Service.spec.selector` MUST use the same label key. The historical convention here is `app: <name>` (not `app.kubernetes.io/name`); don't mix the two.
- Image: `localhost/<name>:v<YYYYMMDD><HHMM>`, `imagePullPolicy: Never`. Import the image to every RKE2 node (server + both agents) via `ctr images import` before applying, because pods can schedule on any node.
- If the app persists local state (SQLite, uploads), declare the `PersistentVolumeClaim` here with `storageClassName: longhorn` and `accessModes: [ReadWriteOnce]`. Add `strategy.type: Recreate` to the Deployment, because an RWO PVC blocks rolling updates.
- Probes: use `tcpSocket` if the app has middleware that intercepts unauthenticated requests (returns 404/401 for `/health`). Otherwise prefer `httpGet` against whatever the app exposes (verify the path isn't gated by auth).
- Certificate: `issuerRef.name: step-ca-acme`, `issuerRef.kind: ClusterIssuer`. `dnsNames` must match the hostname you created in FlowerCore.DNS in step 1.

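
Several of the conventions above are mechanical enough to check before committing. The following is a minimal sketch, not the repo's lint suite: `check_conventions` is a hypothetical function operating on manifests already parsed into dictionaries, and the image pattern assumes `v<YYYYMMDD><HHMM>` means exactly twelve digits.

```python
import re

# v<YYYYMMDD><HHMM> is read here as twelve digits after the "v" (an assumption).
IMAGE_RE = re.compile(r"^localhost/[^:/]+:v\d{12}$")


def check_conventions(deployment: dict, service: dict, has_pvc: bool = False) -> list[str]:
    """Return human-readable convention violations (an empty list means clean)."""
    problems = []
    spec = deployment.get("spec", {})

    match_labels = spec.get("selector", {}).get("matchLabels", {})
    svc_selector = service.get("spec", {}).get("selector", {})
    if set(match_labels) != set(svc_selector):
        problems.append("Deployment and Service selectors use different label keys")
    if "app" not in match_labels:
        problems.append("selector should use the historical `app` key")

    containers = spec.get("template", {}).get("spec", {}).get("containers", [])
    for container in containers:
        name = container.get("name", "?")
        if not IMAGE_RE.match(container.get("image", "")):
            problems.append(f"{name}: image is not localhost/<name>:v<YYYYMMDD><HHMM>")
        if container.get("imagePullPolicy") != "Never":
            problems.append(f"{name}: imagePullPolicy must be Never")

    if has_pvc and spec.get("strategy", {}).get("type") != "Recreate":
        problems.append("RWO PVC present but strategy.type is not Recreate")
    return problems


deployment = {"spec": {
    "selector": {"matchLabels": {"app": "demo"}},
    "strategy": {"type": "Recreate"},
    "template": {"spec": {"containers": [{
        "name": "demo",
        "image": "localhost/demo:v202604221530",
        "imagePullPolicy": "Never",
    }]}},
}}
service = {"spec": {"selector": {"app": "demo"}}}
print(check_conventions(deployment, service, has_pvc=True))
```
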
### 3. Commit & push

```bash
git add apps/<name>/
git commit -m "<name>: initial deployment"
git push
```

ArgoCD's `ApplicationSet` picks up the new directory within ~3 minutes and creates `infra-<name>` with auto-sync + self-heal enabled.

### 4. Verify

```bash
# From noc1
fcadmin_ssh noc1 '
  kubectl -n argocd get application infra-<name>
  kubectl -n <ns> get certificate,pod
  curl -sk -m 8 -o /dev/null -w "HTTP %{http_code}\n" https://<name>.iamworkin.lan/
'
```

Certificate should be `Ready: True` within ~60s. If it stays `False` for >2m, the pfSense DNS step was most likely skipped: go back to step 1, then run `kubectl -n <ns> delete order <order-name>` to bust the backoff.

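
The timing rule above can be applied to the output of `kubectl -n <ns> get certificate <name> -o json`. The sketch below is a hedged illustration: `triage_certificate` is a hypothetical helper, it reads the `Ready` entry in `status.conditions`, and it treats that condition's `lastTransitionTime` as the start of the stall window, which is an assumption.

```python
from datetime import datetime, timedelta, timezone

STALL_AFTER = timedelta(minutes=2)  # the ">2m" threshold from the text above


def ready_condition(cert: dict):
    """Return the Ready condition of a parsed Certificate, or None."""
    for condition in cert.get("status", {}).get("conditions", []):
        if condition.get("type") == "Ready":
            return condition
    return None


def triage_certificate(cert: dict, now: datetime) -> str:
    """Classify a Certificate as 'ready', 'pending', or 'stalled'."""
    condition = ready_condition(cert)
    if condition is None:
        return "pending"  # no status reported yet
    if condition.get("status") == "True":
        return "ready"
    since = datetime.strptime(
        condition["lastTransitionTime"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return "stalled" if now - since > STALL_AFTER else "pending"


cert = {"status": {"conditions": [{
    "type": "Ready",
    "status": "False",
    "lastTransitionTime": "2026-04-22T12:00:00Z",
}]}}
now = datetime(2026, 4, 22, 12, 5, tzinfo=timezone.utc)
print(triage_certificate(cert, now))
```

A `stalled` result is the cue to revisit step 1 and then delete the Order.
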
### Pre-merge gate

Before `git push`, always run:

```bash
python scripts/check-pfsense-dns.py
```

It's a quick service-backed check that would have caught the entire 2026-04-22 cert-manager outage. Consider wiring it into a pre-commit hook or a Gitea Actions workflow.

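
One way to wire the check into Git is a `pre-push` hook, since Git aborts the push when that hook exits non-zero. The sketch below assumes `scripts/check-pfsense-dns.py` signals failure through its exit code, which the text above does not confirm; `run_gate` and `main` are hypothetical names.

```python
import subprocess
import sys

# Assumption: the check script exits non-zero when a host fails to resolve.
GATE_COMMAND = [sys.executable, "scripts/check-pfsense-dns.py"]


def run_gate(command=None) -> int:
    """Run the gate command and return its exit code (0 lets the push proceed)."""
    completed = subprocess.run(command or GATE_COMMAND)
    return completed.returncode


def main() -> int:
    code = run_gate()
    if code != 0:
        print("DNS preflight failed; push aborted.", file=sys.stderr)
    return code

# When saved as .git/hooks/pre-push (executable, run from the repo root),
# finish the file with:  raise SystemExit(main())
```
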
## Retiring a service

1. `kubectl -n argocd delete application infra-<name>` (cascade deletes the K8s resources via ArgoCD finalizers)
2. `git rm -r apps/<name>/` and push
3. Remove the FlowerCore.DNS record through the UI or API, for example:

```bash
curl -sk https://dns.iamworkin.lan/api/v1/servers
curl -sk -X DELETE https://dns.iamworkin.lan/api/v1/servers/<serverId>/zones/iamworkin.lan/records/<yourservice>
```

## Known gotchas

- **CoreDNS template + ndots:5 collision**: inside pods, `<svc>.<ns>.svc.cluster.local` has fewer than 5 dots, so it gets search-expanded through `iamworkin.lan` first and hits the wildcard template, which resolves to the Traefik VIP instead of the real ClusterIP. Use short service names (`<svc>`) in K8s manifests. See memory `feedback_coredns_ndots_template_collision.md`.
- **Image not on node**: pods stuck in `ErrImageNeverPull` mean the image wasn't imported to the node the pod was scheduled onto. Run `ctr images import` on all of rke2-server, rke2-agent1, rke2-agent2.
- **StatefulSet PVC drift**: `volumeClaimTemplates` needs explicit `volumeMode: Filesystem` or ArgoCD SSA self-heals forever. See memory `feedback_argocd_statefulset_pvc_drift.md`.
- **IngressRoute namespace split**: this RKE2 Traefik install does not allow cross-namespace service refs. Keep the `IngressRoute`, backend `Service`, and TLS secret in the same namespace; if one host is shared across namespaces, duplicate the `Certificate` and move the route next to the destination service.
- **Public read-only hosts**: if a public host fronts a service that also exposes admin writes internally, add a Traefik route match like `Host(...) && (Method(GET) || Method(HEAD))` on the public edge instead of trusting the app to reject unsafe methods.
- **Traefik VIP netpols**: when a `NetworkPolicy` allows `10.0.56.200`, also allow the post-DNAT backend ports (`8443` for TLS plus `8080` or `8000` for HTTP) or Calico will drop the rewritten flow.
- **Auth-safe probes**: services behind API-key or global auth middleware should prefer `tcpSocket` probes unless `/health` is explicitly exempted before the middleware runs.
- **ArgoCD must use internal Gitea URL**: `http://gitea-clusterip.gitea.svc.cluster.local:3000/bluejay/bluejay-infra.git`, not the external HTTPS URL (step-ca cert isn't trusted by ArgoCD). The `ApplicationSet` and any hand-created `Application` must both use the internal URL.

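
The ndots collision in the first gotcha follows from how a glibc-style stub resolver orders its queries: a name with fewer dots than `ndots` is tried against each search domain before it is tried as-is. The sketch below models only that ordering. `query_order` is a hypothetical helper, and the search list is an assumed example of a pod's `resolv.conf` with the node's `iamworkin.lan` domain appended.

```python
def query_order(name: str, search: list[str], ndots: int = 5) -> list[str]:
    """Return the fully qualified names a glibc-style resolver tries, in order."""
    if name.endswith("."):
        return [name]  # already absolute: the search list is skipped
    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        return [absolute] + expanded  # enough dots: try the name as-is first
    return expanded + [absolute]      # too few dots: search list goes first


# Assumed search list for a pod in the gitea namespace.
search = ["gitea.svc.cluster.local", "svc.cluster.local", "cluster.local", "iamworkin.lan"]
name = "gitea-clusterip.gitea.svc.cluster.local"  # 4 dots, below ndots:5

for candidate in query_order(name, search, ndots=5):
    print(candidate)
```

In this model the `...cluster.local.iamworkin.lan.` candidate is queried before the absolute name, so a wildcard for `*.iamworkin.lan` answers first. With `ndots=2` the absolute name is tried first, and a short name such as `gitea-clusterip` expands to the in-namespace service name first.
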
## Local manifest lint

The repo now carries a local-first lint pass for the recurring K8s gotchas that have burned the fleet:

```bash
dotnet test tests/bluejay-infra-lint/BluejayInfraLint.Tests.csproj -c Release
```

That test project sweeps `bluejay-infra/apps/**` plus the canonical sibling `FlowerCore.*/k8s` manifests that share the same workspace. Matching `conftest.dev` policy files live under `tests/bluejay-infra-lint/conftest.dev/` for environments that also have `conftest` or `opa`.

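
To give a sense of what such lint rules look like, here is a sketch of two of them written in Python. It is not the code in `FleetManifestLintTests.cs`; `non_tcp_probes` and `cross_namespace_refs` are hypothetical helpers that take manifests already parsed into dictionaries.

```python
PROBE_KEYS = ("livenessProbe", "readinessProbe", "startupProbe")


def non_tcp_probes(container: dict) -> list[str]:
    """Names of the configured probes that do not use tcpSocket."""
    return [
        key for key in PROBE_KEYS
        if key in container and "tcpSocket" not in container[key]
    ]


def cross_namespace_refs(route: dict) -> list[str]:
    """Backend services an IngressRoute references outside its own namespace."""
    route_ns = route.get("metadata", {}).get("namespace", "default")
    found = []
    for rule in route.get("spec", {}).get("routes", []):
        for backend in rule.get("services", []):
            backend_ns = backend.get("namespace", route_ns)
            if backend_ns != route_ns:
                found.append(f"{backend_ns}/{backend.get('name')}")
    return found


container = {
    "name": "demo",
    "livenessProbe": {"httpGet": {"path": "/health", "port": 8080}},
    "readinessProbe": {"tcpSocket": {"port": 8080}},
}
route = {
    "metadata": {"name": "demo", "namespace": "demo"},
    "spec": {"routes": [{"services": [
        {"name": "demo", "port": 8080},
        {"name": "admin", "namespace": "other", "port": 8080},
    ]}]},
}
print(non_tcp_probes(container))
print(cross_namespace_refs(route))
```
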
## References

- Cert-manager recovery playbook: `FlowerCore.Notes/memory/project_cert_manager_recovery_2026_04_22.md`
- Why pfSense DNS is required: `FlowerCore.Notes/memory/feedback_pfsense_dns_required_for_acme.md`
- Public DNS operator host: `https://dns.iamworkin.lan`
- Canonical credential helper: `FlowerCore.Notes/scripts/credential-helper.sh`
- pfSense admin automation: `FlowerCore.Notes/memory/feedback_pfsense_automation.md`