diff --git a/gated/authentik-tenant-sync/README.md b/gated/authentik-tenant-sync/README.md
new file mode 100644
index 0000000..30ec55e
--- /dev/null
+++ b/gated/authentik-tenant-sync/README.md
@@ -0,0 +1,129 @@
+# authentik-tenant-mapping-sync — GATED manifest staging
+
+**Status:** GATED (suspended). **ADR:** ADR-198 §2.A P1 (Au-1 / Au-3 substrate). **Pairs:** Codex **Cx2-7**.
+
+This directory is a **Notes staging area**, NOT a deploy target. The orchestrator relocates
+`cronjob.yaml` into a `gated/` path **outside** `bluejay-infra/apps/` so ArgoCD's `apps/*`
+directory generator never picks it up. Nothing here runs until the activation steps below.
+
+## What this is
+
+A nightly Kubernetes `CronJob` that runs
+[`scripts/authentik/authentik-tenant-mapping-sync.py`](../../../scripts/authentik/authentik-tenant-mapping-sync.py)
+(Notes repo). The script:
+
+- reads the 1Password Document **`flowercore-tenant-mapping`** (vault `IAmWorkin`, field
+  `mapping`) via **1Password Connect REST** — never the 1Password CLI/desktop (operator hard rule);
+- parses + light-validates the mapping JSON (schema: [`authentik-oidc-tenant-mapping-schema.md`](../../standards/authentik-oidc-tenant-mapping-schema.md) — `version==1`, `mappings[]` with `authentikGroup` / `fcTenantId` / `fcRole`);
+- reconciles each distinct `authentikGroup` into Authentik `/api/v3/core/groups/`:
+  create-if-missing, PATCH-managed-markers-on-drift, **never delete or disable unmanaged groups**;
+- emits structured (Serilog-shaped JSON) logs and exits 0 on success.
+
+It is the **slow nightly fix-up path**. The **<1s hot path** stays the MCP tool
+`authentik_sync_tenant_mapping` (schema doc §6.2 force-broadcast). This CronJob does NOT
+broadcast SignalR — group reconcile is its only side effect; services pick up mapping changes
+on their own 5-minute 1P refresh.
+
+## Why it is GATED (two locks)
+
+1. **`spec.suspend: true`** in `cronjob.yaml` — belt-and-suspenders so even if applied it never fires.
+2. **Lives outside `apps/`** — staged here in Notes; ArgoCD does not manage it.
+
+Both must be cleared to go live. This pairs Codex **Cx2-7**: do not activate ahead of the Au-3
+public-go for tenant self-registration.
+
+## Files
+
+| File | Purpose |
+|------|---------|
+| `cronjob.yaml` | The suspended `CronJob` + the script-delivery `ConfigMap` (placeholder body). |
+| `README.md` | This file. |
+| `scripts/authentik/authentik-tenant-mapping-sync.py` | The reconcile script (canonical source; NOT in this dir). |
+
+## Secrets (referenced, not invented)
+
+No secret **values** appear in `cronjob.yaml` — only `secretKeyRef`s:
+
+- **`AUTHENTIK_TOKEN`** ← `Secret authentik/authentik-credentials` key `BOOTSTRAP_ADMIN_TOKEN`
+  (already exists; the same token `provision-oidc-client.py` reads). **Au-9 caveat:** this is the
+  never-rotated bootstrap token — when `/rotate-password rotate authentik` (Au-9) lands, this
+  CronJob is one of its fan-out consumers.
+- **`OP_TOKEN`** ← `Secret authentik/tenant-mapping-sync-op-token` key `token`.
+
+### OP_TOKEN cross-namespace
+
+The canonical 1P Connect token Secret is `onepassword-system/onepassword-token`, but this
+CronJob runs in the `authentik` namespace and K8s Secrets are namespace-scoped. Pick one at
+activation:
+
+- **Option A (copy, simplest).** Mint a same-namespace copy right before un-suspending:
+  ```sh
+  kubectl get secret onepassword-token -n onepassword-system -o jsonpath='{.data.token}' \
+    | base64 -d \
+    | kubectl create secret generic tenant-mapping-sync-op-token -n authentik \
+        --from-file=token=/dev/stdin --dry-run=client -o yaml | kubectl apply -f -
+  ```
+  (Re-run whenever the Connect token rotates — add this CronJob to the **Au-10** Connect-token
+  fan-out checklist so the copy can't go stale.)
+- **Option B (CRD, preferred long-term).** Use an `OnePasswordItem` CRD
+  (`feedback_1password_operator_pattern`) so the 1P operator mints/refreshes
+  `authentik/tenant-mapping-sync-op-token` automatically — no manual copy, rotation-safe.
+
+> If neither secret exists yet, that's fine **while suspended** — the job never schedules.
+
+## How to ACTIVATE (at Au-3 public-go)
+
+1. **Pre-flight (workstation dry-run, writes nothing):**
+   ```sh
+   export AUTHENTIK_TOKEN=...   # or let it read authentik/authentik-credentials via kubectl
+   export OP_TOKEN=...          # or rely on credential-helper.sh get_op_token (fcadmin@noc1)
+   python scripts/authentik/authentik-tenant-mapping-sync.py --dry-run --verbose
+   ```
+   Confirm the planned create/update set matches the 1P mapping document.
+2. **Provide `OP_TOKEN` in-cluster** — Option A or B above.
+3. **Materialize the script ConfigMap from the canonical file** (do NOT hand-edit a copy into
+   `cronjob.yaml` — the embedded body is a deliberate placeholder):
+   ```sh
+   kubectl create configmap authentik-tenant-mapping-sync-script -n authentik \
+     --from-file=authentik-tenant-mapping-sync.py=scripts/authentik/authentik-tenant-mapping-sync.py \
+     --dry-run=client -o yaml | kubectl apply -f -
+   ```
+   (Or, in the imaged future per ADR-198 §2.B P3, bake the script into `fc-runtime-base` and
+   drop the ConfigMap volume.)
+4. **Relocate into bluejay-infra** — move `cronjob.yaml` into a `gated/` (or `apps/`) path in
+   `bluejay-infra` per the orchestrator's placement decision. If under `apps/`, ArgoCD will sync it.
+5. **Un-suspend** — set `spec.suspend: false` (commit in `bluejay-infra` so ArgoCD selfHeal
+   doesn't revert), or one-off:
+   ```sh
+   kubectl patch cronjob authentik-tenant-mapping-sync -n authentik \
+     -p '{"spec":{"suspend":false}}'
+   ```
+6. **Smoke (VG-A1):** trigger an immediate run and check the structured logs:
+   ```sh
+   kubectl create job --from=cronjob/authentik-tenant-mapping-sync tms-smoke -n authentik
+   kubectl logs -n authentik job/tms-smoke
+   ```
+   Then edit a mapping entry in 1P and confirm the next run reconciles the group; the <1s
+   propagation still comes from the MCP `authentik_sync_tenant_mapping` force-broadcast.
+
+## Rollback
+
+Re-suspend (`spec.suspend: true`) or delete the CronJob. The script never deletes Authentik
+groups, so a bad run can only over-create groups present in the mapping — remove any unwanted
+group by hand in the Authentik admin UI. No data loss path.
+
+## Idempotency / safety summary
+
+- Re-running is a no-op when groups already match (mirrors `provision-oidc-client.py`).
+- Only the managed attribute block (`fc:managed-by` / `fc:tenant` / `fc:role` / optional
+  `fc:label` / `fc:regulated` / `fc:strict-mode`) is asserted; group parent/users/roles are
+  never touched.
+- Wildcard SuperAdmin entries (`fcTenantId: "*"`) do not create a per-tenant group.
+- `--dry-run` prints the plan and writes nothing — always run it first.
+
+## Cross-links
+
+- [`docs/standards/auth-acl-unattended-lifecycle-plan.md`](../../standards/auth-acl-unattended-lifecycle-plan.md) — ADR-198; Au-1/Au-3 lanes, VG-A1/A2.
+- [`docs/standards/authentik-oidc-tenant-mapping-schema.md`](../../standards/authentik-oidc-tenant-mapping-schema.md) — the mapping JSON shape + 1P item layout (§2/§3).
+- [`scripts/authentik/provision-oidc-client.py`](../../../scripts/authentik/provision-oidc-client.py) — sibling idempotent provisioner (same API + posture).
+- [`scripts/credential-helper.sh`](../../../scripts/credential-helper.sh) — `get_op_token` 1P Connect bootstrap (fcadmin@noc1).
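The light validation that the README's "What this is" section describes (`version==1`, `mappings[]` entries carrying `authentikGroup` / `fcTenantId` / `fcRole`) could look roughly like the sketch below. The reconcile script itself is not part of this diff, so the function names `validate_mapping` and `distinct_groups` are hypothetical, and skipping wildcard (`fcTenantId: "*"`) entries when collecting groups is one reading of the safety summary, not confirmed behavior.

```python
import json

# Keys the README says every mappings[] entry carries.
REQUIRED_KEYS = ("authentikGroup", "fcTenantId", "fcRole")


def validate_mapping(raw: str) -> list[dict]:
    """Parse the mapping JSON and apply the light checks named in the README."""
    doc = json.loads(raw)
    if not isinstance(doc, dict) or doc.get("version") != 1:
        raise ValueError("expected a JSON object with version == 1")
    mappings = doc.get("mappings")
    if not isinstance(mappings, list):
        raise ValueError("mappings must be a list")
    for i, entry in enumerate(mappings):
        if not isinstance(entry, dict):
            raise ValueError(f"mappings[{i}] must be an object")
        for key in REQUIRED_KEYS:
            value = entry.get(key)
            if not isinstance(value, str) or not value:
                raise ValueError(f"mappings[{i}].{key} must be a non-empty string")
    return mappings


def distinct_groups(mappings: list[dict]) -> list[str]:
    """Distinct authentikGroup names in first-seen order.

    Assumption: wildcard-tenant entries are skipped entirely.
    """
    seen: dict[str, None] = {}
    for entry in mappings:
        if entry["fcTenantId"] == "*":
            continue
        seen.setdefault(entry["authentikGroup"], None)
    return list(seen)
```

Keeping validation and group collection as pure functions makes the `--dry-run` plan easy to check on a workstation without touching 1Password Connect or Authentik.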
diff --git a/gated/authentik-tenant-sync/cronjob.yaml b/gated/authentik-tenant-sync/cronjob.yaml
new file mode 100644
index 0000000..8933dc0
--- /dev/null
+++ b/gated/authentik-tenant-sync/cronjob.yaml
@@ -0,0 +1,151 @@
+# =====================================================================================
+# authentik-tenant-mapping-sync — GATED nightly CronJob (Au-3 / ADR-198 §2.A P1)
+#
+# STATUS: GATED. spec.suspend: true (belt-and-suspenders). This manifest lives in a Notes
+# STAGING path (docs/gated-manifests/) and is NOT under bluejay-infra apps/, so ArgoCD
+# does not deploy it. It does NOTHING until Au-3 public-go (see README.md in this dir).
+#
+# WHAT IT RUNS: scripts/authentik/authentik-tenant-mapping-sync.py (Notes repo) — reads the
+# 1Password Document `flowercore-tenant-mapping` via Connect REST and reconciles its
+# mappings[].authentikGroup entries into Authentik groups (idempotent; never deletes
+# unmanaged groups). Pairs Codex Cx2-7.
+#
+# SECRETS (referenced, NOT invented — no secret VALUES in this file):
+#   AUTHENTIK_TOKEN  <- Secret authentik/authentik-credentials key BOOTSTRAP_ADMIN_TOKEN (exists)
+#   OP_TOKEN         <- Secret authentik/tenant-mapping-sync-op-token key token
+#                       (a copy of onepassword-system/onepassword-token — see README "OP_TOKEN
+#                       cross-namespace" for the one-liner that mints it; OR mint via the
+#                       OnePasswordItem CRD per feedback_1password_operator_pattern).
+#
+# The script is delivered via the ConfigMap below (same pattern as guacamole guac-k8s-sync).
+# When this lane is libraryized/imaged later (ADR-198 §2.B P3) this ConfigMap can be replaced
+# by a baked image; for now ConfigMap-delivery keeps the script the single source of truth.
+# =====================================================================================
+apiVersion: batch/v1
+kind: CronJob
+metadata:
+  name: authentik-tenant-mapping-sync
+  namespace: authentik
+  labels:
+    app.kubernetes.io/name: authentik-tenant-mapping-sync
+    app.kubernetes.io/component: sync
+    app.kubernetes.io/part-of: flowercore-identity
+    flowercore.io/adr: "198"
+    flowercore.io/gated: "true"
+  annotations:
+    flowercore.io/gate: "Au-3 public-go — suspended until tenant self-registration goes live"
+    flowercore.io/pairs-with: "Codex Cx2-7"
+spec:
+  # GATE: suspended so it never fires until an operator un-suspends at Au-3 public-go.
+  suspend: true
+  # Nightly at 03:17 (off-peak; jittered minute to avoid colliding with other 03:00 jobs).
+  schedule: "17 3 * * *"
+  concurrencyPolicy: Forbid
+  startingDeadlineSeconds: 600
+  successfulJobsHistoryLimit: 3
+  failedJobsHistoryLimit: 3
+  jobTemplate:
+    spec:
+      backoffLimit: 2
+      activeDeadlineSeconds: 600
+      template:
+        metadata:
+          labels:
+            app.kubernetes.io/name: authentik-tenant-mapping-sync
+            app.kubernetes.io/component: sync
+        spec:
+          restartPolicy: OnFailure
+          securityContext:
+            runAsNonRoot: true
+            runAsUser: 65532
+            runAsGroup: 65532
+            fsGroup: 65532
+            seccompProfile:
+              type: RuntimeDefault
+          containers:
+            - name: sync
+              # python:3.12-slim is sufficient: the script uses only the stdlib (urllib/json/ssl).
+              # No pip install needed. Pin a digest at activation time for air-gap reproducibility.
+              image: python:3.12-slim
+              imagePullPolicy: IfNotPresent
+              command:
+                - python3
+                - /scripts/authentik-tenant-mapping-sync.py
+              # NOTE: no --dry-run here -> this is the real reconcile. `kubectl create job --from=...`
+              # rejects extra command args, so to dry-run in-cluster render the Job to YAML, add the
+              # script's --dry-run arg, then apply it; or run the script from a workstation. See README.
+              env:
+                - name: AUTHENTIK_URL
+                  value: "https://id.iamworkin.lan"
+                - name: OP_CONNECT_URL
+                  value: "http://10.0.56.10:8180/v1"  # port 8180, NOT 8443
+                - name: OP_VAULT_ID
+                  value: "qaphopopkryhbg353ukzhhuqoq"  # IAmWorkin
+                - name: TENANT_MAPPING_ITEM
+                  value: "flowercore-tenant-mapping"
+                - name: TENANT_MAPPING_FIELD
+                  value: "mapping"
+                - name: AUTHENTIK_TOKEN
+                  valueFrom:
+                    secretKeyRef:
+                      name: authentik-credentials
+                      key: BOOTSTRAP_ADMIN_TOKEN
+                - name: OP_TOKEN
+                  valueFrom:
+                    secretKeyRef:
+                      # A same-namespace copy of onepassword-system/onepassword-token.
+                      # See README "OP_TOKEN cross-namespace". Until Au-3 this Secret need
+                      # not exist (the job is suspended).
+                      name: tenant-mapping-sync-op-token
+                      key: token
+              resources:
+                requests:
+                  cpu: 25m
+                  memory: 64Mi
+                limits:
+                  cpu: 250m
+                  memory: 128Mi
+              securityContext:
+                allowPrivilegeEscalation: false
+                readOnlyRootFilesystem: true
+                capabilities:
+                  drop: ["ALL"]
+              volumeMounts:
+                - name: script
+                  mountPath: /scripts
+                  readOnly: true
+          volumes:
+            - name: script
+              configMap:
+                name: authentik-tenant-mapping-sync-script
+                defaultMode: 0555
+---
+# The reconcile script, delivered as a ConfigMap (single source of truth = the Notes repo
+# scripts/authentik/authentik-tenant-mapping-sync.py). At activation, regenerate this
+# ConfigMap from the live script so the two never drift, e.g.:
+#   kubectl create configmap authentik-tenant-mapping-sync-script -n authentik \
+#     --from-file=authentik-tenant-mapping-sync.py=scripts/authentik/authentik-tenant-mapping-sync.py \
+#     --dry-run=client -o yaml > docs/gated-manifests/authentik-tenant-sync/configmap.script.yaml
+# (kept as a placeholder body here so the manifest set is self-describing; the real body is
+# the script file — DO NOT hand-edit a divergent copy into this ConfigMap.)
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: authentik-tenant-mapping-sync-script
+  namespace: authentik
+  labels:
+    app.kubernetes.io/name: authentik-tenant-mapping-sync
+    app.kubernetes.io/component: sync
+    flowercore.io/gated: "true"
+  annotations:
+    flowercore.io/source: "scripts/authentik/authentik-tenant-mapping-sync.py (Notes repo) — regenerate at activation, do not hand-edit"
+data:
+  authentik-tenant-mapping-sync.py: |
+    # PLACEHOLDER — regenerate from the canonical script at activation (see annotation above).
+    # The Notes repo file scripts/authentik/authentik-tenant-mapping-sync.py is the source of
+    # truth; embedding a hand-copy here would drift. The orchestrator (or the activation
+    # runbook) materializes this ConfigMap from the live script via `kubectl create configmap
+    # ... --from-file=...` before un-suspending the CronJob.
+    import sys
+    sys.exit("authentik-tenant-mapping-sync ConfigMap not materialized from the canonical "
+             "script — regenerate with kubectl create configmap --from-file before activation.")
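The reconcile rule stated in the manifest header and the README (create if missing, PATCH managed markers on drift, never delete) could be expressed as a pure planning step along these lines. This is a minimal sketch, not the canonical script: `plan_group` is a hypothetical name, the `fc:managed-by` value in the usage note is invented for illustration, and it assumes the managed markers live in the group's `attributes` object and that unmanaged attribute keys should be preserved on PATCH.

```python
# Managed marker keys named in the README's safety summary (required subset only).
MANAGED_KEYS = ("fc:managed-by", "fc:tenant", "fc:role")


def plan_group(name, existing, desired):
    """Decide the action for one group; returns (action, payload).

    `existing` is the group object fetched from Authentik, or None if absent.
    The possible actions are "create", "patch" and "noop"; there is no delete.
    """
    if existing is None:
        return ("create", {"name": name, "attributes": dict(desired)})
    current = existing.get("attributes") or {}
    # Drift = managed keys whose current value differs from the desired one.
    drift = {
        key: desired[key]
        for key in MANAGED_KEYS
        if key in desired and current.get(key) != desired[key]
    }
    if not drift:
        return ("noop", {})
    # Merge so attribute keys outside the managed block survive the PATCH.
    return ("patch", {"attributes": {**current, **drift}})
```

For example, with `desired = {"fc:managed-by": "tenant-mapping-sync", "fc:tenant": "acme", "fc:role": "Admin"}`, a missing group plans a create, a matching group plans a no-op, and a group whose `fc:tenant` differs plans a patch that leaves any unrelated attribute keys in place.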