K8s gotcha sweep C7 — extend lint + cover Track A allowlist + scope Notes/k8s
Follow-up to 0b52093 (K8s manifest hardening) closing two real gaps the
prior sweep didn't catch:
1. Public read-write allowlist regression guard (Track A)
- New PublicReadWriteAllowlistHosts set tracks updatecenter.iamworkin.lan
+ updates.iamworkin.lan. The allowlist on those hosts is
GET||HEAD||POST||OPTIONS — POST is required for the bootstrap-JWT
check-in endpoint. PUT/PATCH/DELETE must still 404 at the route.
- New PublicReadWriteIngressRoutes_MustPinGetHeadPostOptionsAllowlist
test enforces the allowlist invariant (4 required methods present,
3 forbidden methods absent).
- Companion conftest.dev policy 08_public_readwrite_allowlist.rego.
2. Selenium NetworkPolicy DNAT backend port audit
- FlowerCore.Notes/k8s/selenium/06-networkpolicy.yaml allowed Traefik
VIP 10.0.56.200:443 + :80 but its 10.42.0.0/16 + 10.43.0.0/16 egress
rules didn't include the post-DNAT backend ports (8443 for Traefik
TLS, 8080 for HTTP). Per feedback_netpol_dnat_backend_port: kube-proxy
DNATs the destination to a backend pod IP+port BEFORE Calico
evaluates the FORWARD chain, so without those backend ports in the
pod CIDR rule, Selenium-driven browser AAT calls to
https://*.iamworkin.lan time out at connect.
- Lint inventory now includes FlowerCore.Notes/k8s/selenium/ so
regressions in this manifest fail fast.
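
The DNAT point in item 2 can be sketched as an egress fragment. This is a minimal illustration, not the actual 06-networkpolicy.yaml: the policy name, namespace, and pod selector label are hypothetical, while the VIP, CIDRs, and ports are the ones named above.

```yaml
# Hypothetical sketch: metadata and podSelector are placeholders, not the real manifest.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: selenium-egress-example      # hypothetical name
  namespace: selenium                # assumed namespace
spec:
  podSelector:
    matchLabels:
      app: selenium-node             # hypothetical label
  policyTypes:
    - Egress
  egress:
    # What the client dials: the Traefik VIP on its service ports.
    - to:
        - ipBlock:
            cidr: 10.0.56.200/32
      ports:
        - protocol: TCP
          port: 443
        - protocol: TCP
          port: 80
    # What gets evaluated after kube-proxy DNAT: a backend address inside
    # the cluster CIDRs, on Traefik's backend ports rather than 443/80.
    - to:
        - ipBlock:
            cidr: 10.42.0.0/16
        - ipBlock:
            cidr: 10.43.0.0/16
      ports:
        - protocol: TCP
          port: 8443                 # Traefik TLS backend
        - protocol: TCP
          port: 8080                 # Traefik HTTP backend
```

Without the second rule, the first rule alone looks sufficient on paper, but the rewritten flow no longer matches it.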
Lint scope notes:
- FlowerCore.Notes/k8s/guacamole/ + monitoring/ are historical
scaffolds that have diverged from the live state (bluejay-infra/apps/
is canonical). Operator review is required before bringing them in
line OR decommissioning them — kept out of lint scope until that
decision lands (see xxl-regroup-2026-05-03-followup.md "Codex 7 §0").
README hardening:
- New "Public read-write allowlist hosts" entry under "Known gotchas"
documenting the GET||HEAD||POST||OPTIONS pattern + linking the lint.
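
For orientation, a route that would satisfy the allowlist lint might look like the sketch below. Only the host and the method allowlist come from this commit; the resource name, namespace, entry point, backend service, port, and TLS secret are hypothetical placeholders, and the apiVersion depends on the installed Traefik CRDs.

```yaml
# Hypothetical sketch: everything except the match expression is a placeholder.
apiVersion: traefik.io/v1alpha1      # older installs use traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: updatecenter-web-example     # hypothetical name
  namespace: updatecenter            # assumed namespace
spec:
  entryPoints:
    - websecure                      # assumed entry point name
  routes:
    - kind: Rule
      # All four required methods are present; PUT/PATCH/DELETE are absent,
      # so those requests match no route and Traefik answers 404.
      match: Host(`updatecenter.iamworkin.lan`) && (Method(`GET`) || Method(`HEAD`) || Method(`POST`) || Method(`OPTIONS`))
      services:
        - name: updatecenter-web     # hypothetical backend Service
          port: 8080                 # hypothetical port
  tls:
    secretName: updatecenter-tls     # hypothetical secret
```

Both the xUnit lint and the conftest policy work by substring match on the `match` string, so the backtick-quoted `Method(...)` form shown here is the one they recognize.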
Tests: 8/8 lint tests pass.
Companion fix in the FlowerCore.Updater repo on branch
codex/k8s-gotcha-fleet-sweep-c7 (k8s/web-deployment.yaml: the localhost/
image needs imagePullPolicy: Never). That deployment is currently live,
but the bug only bites when a pod is first scheduled onto a fresh node,
so it is not a live production-impact regression.
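
As a rough illustration of the companion fix, the container fragment below shows where the setting lives. The container name, image name, and tag are hypothetical; the source only says the image uses the localhost/ prefix.

```yaml
# Hypothetical fragment of a Deployment pod template; names are placeholders.
spec:
  template:
    spec:
      containers:
        - name: web                          # hypothetical container name
          image: localhost/example-web:dev   # hypothetical image loaded directly onto the node
          imagePullPolicy: Never             # kubelet never attempts a registry pull
```

With `Never`, the image must already exist in the node's container runtime; the kubelet will not try to fetch it.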
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -101,6 +101,7 @@ curl -sk -X DELETE https://dns.iamworkin.lan/api/v1/servers/<serverId>/zones/iam
 - **StatefulSet PVC drift**: `volumeClaimTemplates` needs explicit `volumeMode: Filesystem` or ArgoCD SSA self-heals forever. See memory `feedback_argocd_statefulset_pvc_drift.md`.
 - **IngressRoute namespace split**: this RKE2 Traefik install does not allow cross-namespace service refs. Keep the `IngressRoute`, backend `Service`, and TLS secret in the same namespace; if one host is shared across namespaces, duplicate the `Certificate` and move the route next to the destination service.
 - **Public read-only hosts**: if a public host fronts a service that also exposes admin writes internally, add a Traefik route match like `Host(...) && (Method(GET) || Method(HEAD))` on the public edge instead of trusting the app to reject unsafe methods.
+- **Public read-write allowlist hosts**: if a public host accepts a tightly bounded write surface (e.g. bootstrap-JWT POST), pin the allowlist as `(Method(GET) || Method(HEAD) || Method(POST) || Method(OPTIONS))`. PUT/PATCH/DELETE must still 404 at the route. Track A's `updatecenter.iamworkin.lan` / `updates.iamworkin.lan` are the canonical example. The lint test enforces this invariant.
 - **Traefik VIP netpols**: when a `NetworkPolicy` allows `10.0.56.200`, also allow the post-DNAT backend ports (`8443` for TLS plus `8080` or `8000` for HTTP) or Calico will drop the rewritten flow.
 - **Auth-safe probes**: services behind API-key or global auth middleware should prefer `tcpSocket` probes unless `/health` is explicitly exempted before the middleware runs.
 - **ArgoCD must use internal Gitea URL**: `http://gitea-clusterip.gitea.svc.cluster.local:3000/bluejay/bluejay-infra.git`, not the external HTTPS URL (step-ca cert isn't trusted by ArgoCD). The `ApplicationSet` and any hand-created `Application` must both use the internal URL.
@@ -17,6 +17,17 @@ public sealed class FleetManifestLintTests
         "dns.iamworkin.lan",
     };
 
+    // Public hosts that allow a tightly bounded write surface in addition to
+    // GET/HEAD. updatecenter.iamworkin.lan accepts POST /api/v1/checkin/{id}
+    // (bootstrap-JWT) so its allowlist is GET||HEAD||POST||OPTIONS — but
+    // PUT/PATCH/DELETE must still 404 at the route. Anything wider than this
+    // set should fail this lint.
+    private static readonly HashSet<string> PublicReadWriteAllowlistHosts = new(StringComparer.Ordinal)
+    {
+        "updatecenter.iamworkin.lan",
+        "updates.iamworkin.lan",
+    };
+
     private static readonly HashSet<string> ApiKeyProtectedDeployments = new(StringComparer.Ordinal)
     {
         "messageboard-web",
@@ -82,6 +93,52 @@ public sealed class FleetManifestLintTests
         violations.Should().BeEmpty();
     }
 
+    [Fact]
+    public void PublicReadWriteIngressRoutes_MustPinGetHeadPostOptionsAllowlist()
+    {
+        // For hosts in PublicReadWriteAllowlistHosts, the route match MUST
+        // contain Method(`GET`), Method(`HEAD`), Method(`POST`), and
+        // Method(`OPTIONS`) AND MUST NOT contain Method(`PUT`),
+        // Method(`PATCH`), or Method(`DELETE`). This keeps the public
+        // allowlist invariant against regression — see Track A's
+        // updatecenter-web ingressroute hardening.
+        var violations = Inventory.Documents
+            .Where(document => document.Kind == "IngressRoute")
+            .SelectMany(document =>
+                document.MappingSequence("spec", "routes")
+                    .Select(route => new
+                    {
+                        Document = document,
+                        Match = ManifestNodeExtensions.Scalar(route, "match") ?? string.Empty,
+                    }))
+            .Where(entry => PublicReadWriteAllowlistHosts.Any(host => entry.Match.Contains($"Host(`{host}`)", StringComparison.Ordinal)))
+            .SelectMany(entry =>
+            {
+                var localViolations = new List<string>();
+
+                foreach (var required in new[] { "GET", "HEAD", "POST", "OPTIONS" })
+                {
+                    if (!entry.Match.Contains($"Method(`{required}`)", StringComparison.Ordinal))
+                    {
+                        localViolations.Add($"{entry.Document.Descriptor} is missing required Method(`{required}`).");
+                    }
+                }
+
+                foreach (var forbidden in new[] { "PUT", "PATCH", "DELETE" })
+                {
+                    if (entry.Match.Contains($"Method(`{forbidden}`)", StringComparison.Ordinal))
+                    {
+                        localViolations.Add($"{entry.Document.Descriptor} must not include Method(`{forbidden}`) on a public host.");
+                    }
+                }
+
+                return localViolations;
+            })
+            .ToList();
+
+        violations.Should().BeEmpty();
+    }
+
     [Fact]
     public void TraefikVipNetworkPolicies_MustAllowPostDnatBackendPorts()
     {
@@ -311,6 +368,16 @@ internal sealed class ManifestInventory
             Path.Combine(workspaceRoot, "FlowerCore.Media", "k8s"),
             Path.Combine(workspaceRoot, "FlowerCore.MenuBoard", "k8s"),
             Path.Combine(workspaceRoot, "FlowerCore.MessageBoard", "k8s"),
+            // FlowerCore.Notes/k8s/selenium/ is the live Selenium Grid
+            // manifest tree (consumed by deploy-selenium scripts).
+            // FlowerCore.Notes/k8s/guacamole/ + FlowerCore.Notes/k8s/monitoring/
+            // are historical scaffolds that have diverged from the live state
+            // (bluejay-infra/apps/guacamole + bluejay-infra/apps/monitoring are
+            // canonical). Operator review is required before bringing them in
+            // line OR decommissioning them — keep them out of the lint scope
+            // until that decision lands. See xxl-regroup-2026-05-03-followup.md
+            // "Codex 7 §0 stop conditions" + the C7 close-session output.
+            Path.Combine(workspaceRoot, "FlowerCore.Notes", "k8s", "selenium"),
             Path.Combine(workspaceRoot, "FlowerCore.MySQL", "k8s"),
             Path.Combine(workspaceRoot, "FlowerCore.PHP", "k8s"),
             Path.Combine(workspaceRoot, "FlowerCore.Presentations", "k8s"),
@@ -0,0 +1,35 @@
+package bluejayinfra.public_readwrite_allowlist
+
+# Public hosts that allow a tightly bounded write surface in addition to
+# GET/HEAD. updatecenter.iamworkin.lan accepts POST /api/v1/checkin/{id}
+# (bootstrap-JWT) so its allowlist is GET||HEAD||POST||OPTIONS — but
+# PUT/PATCH/DELETE must still 404 at the route. Any host in this set MUST
+# include all four required methods AND MUST NOT include any forbidden
+# method.
+public_readwrite_hosts := {"updatecenter.iamworkin.lan", "updates.iamworkin.lan"}
+
+required_methods := {"GET", "HEAD", "POST", "OPTIONS"}
+
+forbidden_methods := {"PUT", "PATCH", "DELETE"}
+
+deny[msg] {
+    input.kind == "IngressRoute"
+    route := input.spec.routes[_]
+    match := object.get(route, "match", "")
+    host := public_readwrite_hosts[_]
+    contains(match, sprintf("Host(`%s`)", [host]))
+    required := required_methods[_]
+    not contains(match, sprintf("Method(`%s`)", [required]))
+    msg := sprintf("IngressRoute %s/%s is missing required Method(%s) for public read-write host %s", [input.metadata.namespace, input.metadata.name, required, host])
+}
+
+deny[msg] {
+    input.kind == "IngressRoute"
+    route := input.spec.routes[_]
+    match := object.get(route, "match", "")
+    host := public_readwrite_hosts[_]
+    contains(match, sprintf("Host(`%s`)", [host]))
+    forbidden := forbidden_methods[_]
+    contains(match, sprintf("Method(`%s`)", [forbidden]))
+    msg := sprintf("IngressRoute %s/%s must not include Method(%s) on public read-write host %s", [input.metadata.namespace, input.metadata.name, forbidden, host])
+}