feat(github-runner): harden Linux runner fleet #5

Merged
bluejay merged 1 commit from codex/sprint30-linux-runner-fleet into main 2026-05-18 02:51:07 +00:00

Summary

  • sets writable non-root .NET, NuGet, XDG, HOME, and Actions tool-cache paths on every github-runner pod so actions/setup-dotnet installs under /home/runner instead of /usr/share/dotnet
  • keeps Common and Shared.Pos runners, scales Shared.Pos plus the top Linux-cost repo runners to two emptyDir-backed replicas, and avoids any multi-replica RWO PVC sharing
  • adds repo-scoped runners for Puppet, Signage, DMS, Telephony, Print.Web, Chat, MySQL, and Kiosk.Linux with ACCESS_TOKEN, RUN_AS_ROOT=false, EPHEMERAL=true, and self-hosted Linux labels
  • adds LinuxRunnerOffline to the bluejay-infra Prometheus and Grafana provisioning manifests
  • adds focused lint coverage for runner registration, writable non-root paths, scaled emptyDir safety, and alert wiring
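
The lint checks described above could look roughly like the following. This is a minimal sketch, not the tests added in this PR: the function name `lint_runner_deployment`, the `PATH_ENV_VARS` list, and the `/home/runner` default are assumptions for illustration, and the manifest is taken as an already-parsed Deployment dict.

```python
from typing import Any

# Hypothetical set of path-like variables; the PR only says ".NET, NuGet, XDG,
# HOME, and Actions tool-cache paths" and does not list exact variable names.
PATH_ENV_VARS = (
    "HOME",
    "DOTNET_INSTALL_DIR",
    "NUGET_PACKAGES",
    "XDG_CACHE_HOME",
    "RUNNER_TOOL_CACHE",
)


def lint_runner_deployment(
    deploy: dict[str, Any], writable_root: str = "/home/runner"
) -> list[str]:
    """Return a list of human-readable problems for one runner Deployment."""
    problems: list[str] = []
    name = deploy.get("metadata", {}).get("name", "<unnamed>")
    spec = deploy.get("spec", {})
    pod = spec.get("template", {}).get("spec", {})

    # Scaled emptyDir safety: a Deployment with more than one replica must not
    # mount a PersistentVolumeClaim, since an RWO claim cannot be shared safely.
    if spec.get("replicas", 1) > 1:
        for vol in pod.get("volumes", []):
            if "persistentVolumeClaim" in vol:
                problems.append(
                    f"{name}: volume {vol.get('name')} uses a PVC with replicas > 1"
                )

    for container in pod.get("containers", []):
        env = {e["name"]: e.get("value", "") for e in container.get("env", [])}

        # Writable non-root paths: every path variable must live under writable_root.
        for var in PATH_ENV_VARS:
            value = env.get(var)
            if value is None:
                problems.append(f"{name}: {var} is not set")
            elif value != writable_root and not value.startswith(writable_root + "/"):
                problems.append(f"{name}: {var}={value} is outside {writable_root}")

        # Runner registration: the runner must not run as root.
        if env.get("RUN_AS_ROOT") != "false":
            problems.append(f"{name}: RUN_AS_ROOT must be 'false'")

    return problems
```

A test suite would call this once per Deployment parsed from apps/github-runner/github-runner.yaml and fail on any non-empty result.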

Validation

  • YAML parse: apps/github-runner/github-runner.yaml and apps/monitoring/noc-monitoring.yaml parsed successfully
  • Kubernetes client-side dry-run for built-in resources passed after excluding CRDs that are not installed locally (OnePasswordItem, Certificate, IngressRoute)
  • Focused new runner/alert tests: 4/4 passed
  • Full bluejay-infra lint: 47/48 passed; one pre-existing unrelated failure remains in FlowerCore.Updater ingressroute Method(POST) allowlist
  • Scanned changed files for ubuntu-latest, windows-latest, macos-latest, macos-13, macos-14, and obvious PAT/private-key patterns: no hits
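
The scan for hosted-runner labels and obvious secrets can be approximated as follows. This is a hedged sketch rather than the command actually used: the function name `scan_text` is hypothetical, and the two secret regexes (a `ghp_` token prefix and a PEM private-key header) are assumptions about what counts as an "obvious" pattern.

```python
import re

# GitHub-hosted runner labels named in the validation notes.
HOSTED_LABELS = ("ubuntu-latest", "windows-latest", "macos-latest", "macos-13", "macos-14")
HOSTED_LABEL_RE = re.compile("|".join(re.escape(label) for label in HOSTED_LABELS))

# Assumed "obvious" secret shapes; a real scan would likely use a dedicated tool.
SECRET_RES = (
    re.compile(r"ghp_[A-Za-z0-9]{20,}"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
)


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, finding) pairs for one file's contents."""
    hits: list[tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in HOSTED_LABEL_RE.finditer(line):
            hits.append((lineno, match.group(0)))
        if any(rx.search(line) for rx in SECRET_RES):
            # Report only the category so the secret itself is never echoed.
            hits.append((lineno, "possible secret"))
    return hits
```

Running `scan_text` over each changed file and expecting an empty list corresponds to the "no hits" result reported above.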

Follow-ups

  • Live Notes scripts/monitoring/alerts.yml still needs the matching LinuxRunnerOffline live Podman Prometheus rule in a separate Notes change
  • Shared.Pos publish proof is blocked until this PR is reviewed/merged and ArgoCD syncs infra-github-runner
bluejay added 1 commit 2026-05-17 21:31:06 +00:00
bluejay force-pushed codex/sprint30-linux-runner-fleet from 7256cfe38e to 67064c4129 2026-05-18 02:50:46 +00:00
bluejay merged commit 634b9c4169 into main 2026-05-18 02:51:07 +00:00
bluejay deleted branch codex/sprint30-linux-runner-fleet 2026-05-18 02:51:09 +00:00
Reference: bluejay/bluejay-infra#5