feat: add qt sdk remotedesktop warm pool

brochure: delete apps/brochure/ — full prune per operator decision 2026-05-19
Removes the apps/brochure/ directory entirely from the bluejay-infra ApplicationSet glob. ArgoCD will:

1. See infra-brochure has no git source -> mark for delete
2. Prune the brochure namespace + Deployment + Service + Certificate + Secret + IngressRoute (all generated from the now-gone apps/brochure/brochure.yaml)
3. Remove the infra-brochure Application from argocd ns

Operator decision 2026-05-19 (follow-up to 09387f9 ARCHIVED banner commit): "Yes, prune argo for brochure. Probably fully deleted there."

The brochure subdomain project was a planning-chain misinterpretation of "make TtsReader + AI Station production-ready". See memory/project_brochure_split_misinterpretation_archived_2026_05_19.md in FlowerCore.Notes for the full decision record.

Reusable artifacts that were the operator's archive concern stay alive in their actual homes:

- FlowerCore.Intranet.Web PR #8 content-NuGet carve-out: still in Intranet's master, may transfer to TtsReader / AI Station prod work
- Sprint 32 Cl-5 substrate (public-twin design ideas): SUPERSEDED banner in-place in FlowerCore.Notes docs/standards/, history preserved
- magpie-doc-writer + wren-walkthrough skill output: unchanged in Intranet's flowercore-whats-new/walkthroughs/galleries directories

Companion Notes-side commit updates the "scaled to 0 + ARCHIVED banner" language in mvp-readiness.html + fleet-roadmap-2026-05-19-sprint36-v2.md + memory record to reflect full deletion instead.

Wrong-codebase image localhost/fc-brochure-web:v20260524-sprint32 is being removed from rke2-server / rke2-agent1 / rke2-agent2 in a follow-up step (reclaims ~800MB per node).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:30:32 -05:00 · 2026-05-19 10:42:30 -05:00 · 2026-05-19 10:34:28 -05:00 · 2026-05-19 10:22:25 -05:00 · 2026-05-19 10:11:09 -05:00 · 2026-05-19 09:56:14 -05:00
17 changed files with 4363 additions and 122 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,2 @@
 *.yaml text eol=lf
 *.yml text eol=lf
--- a/README.md
+++ b/README.md
@@ -118,6 +118,7 @@ That test project sweeps `bluejay-infra/apps/**` plus the canonical sibling `Flo
 ## References
 - OpenVox noc1 durability runbook: `docs/runbooks/openvoxserver-quadlet-durability.md`
 - Cert-manager recovery playbook: `FlowerCore.Notes/memory/project_cert_manager_recovery_2026_04_22.md`
 - Why pfSense DNS is required: `FlowerCore.Notes/memory/feedback_pfsense_dns_required_for_acme.md`
 - Public DNS operator host: `https://dns.iamworkin.lan`
--- a/apps/fc-chat/fc-chat.yaml
+++ b/apps/fc-chat/fc-chat.yaml
@@ -30,3 +30,41 @@ spec:
          port: 80
  tls:
    secretName: chat-web-tls
 ---
 # Public host profile marker. The app treats this header as authoritative for
 # the public twin, while the internal chat.iamworkin.lan route does not attach
 # it and keeps the operator-oriented UI.
 apiVersion: traefik.io/v1alpha1
 kind: Middleware
 metadata:
  name: chat-public-profile-header
  namespace: fc-chat
 spec:
  headers:
    customRequestHeaders:
      X-FC-Chat-Host-Profile: "public"
 ---
 # Public Cloudflare-fronted twin for the anonymous chat surface. Operator
 # paths are intentionally absent from the allowlist below, so /admin,
 # /operator, /console, /ops, /api/operator, and /operatorhub miss this route
 # and return Traefik 404 before reaching the pod. Operator action still needed:
 # create/verify Cloudflare DNS chat.flowercore.io -> public Traefik endpoint
 # and mirror the cf-origin-flowercore-io TLS secret into namespace fc-chat.
 apiVersion: traefik.io/v1alpha1
 kind: IngressRoute
 metadata:
  name: chat-web-public
  namespace: fc-chat
 spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`chat.flowercore.io`) && (Path(`/`) || Path(`/chat`) || PathPrefix(`/_blazor`) || PathPrefix(`/_framework`) || PathPrefix(`/_content`) || PathPrefix(`/avatars`) || PathPrefix(`/css`) || PathPrefix(`/js`) || PathPrefix(`/favicon`) || PathPrefix(`/chathub`)) && (Method(`GET`) || Method(`HEAD`) || Method(`POST`) || Method(`OPTIONS`))
      kind: Rule
      middlewares:
        - name: chat-public-profile-header
      services:
        - name: chat-web
          port: 80
  tls:
    secretName: cf-origin-flowercore-io
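
The allowlist in the `chat-web-public` match rule can be sanity-checked offline. What follows is a minimal Python sketch, not Traefik's matcher: it assumes `Path` means exact equality and `PathPrefix` means a plain string prefix, and it ignores host case-folding and query strings. The function name `route_matches` and the constant names are hypothetical.

```python
# Illustrative model of the chat-web-public rule. The host, paths and methods
# are copied from the manifest; the matching semantics are simplified
# assumptions, not Traefik's implementation.
PUBLIC_HOST = "chat.flowercore.io"
EXACT_PATHS = {"/", "/chat"}
PATH_PREFIXES = ("/_blazor", "/_framework", "/_content", "/avatars",
                 "/css", "/js", "/favicon", "/chathub")
ALLOWED_METHODS = {"GET", "HEAD", "POST", "OPTIONS"}


def route_matches(host: str, method: str, path: str) -> bool:
    """Return True when a request would be admitted by the modeled rule."""
    if host != PUBLIC_HOST or method not in ALLOWED_METHODS:
        return False
    return path in EXACT_PATHS or path.startswith(PATH_PREFIXES)


if __name__ == "__main__":
    operator_paths = ["/admin", "/operator", "/console", "/ops",
                      "/api/operator", "/operatorhub"]
    for p in operator_paths:
        print(p, route_matches(PUBLIC_HOST, "GET", p))
```

Under this model every operator path listed in the manifest comment misses the route, while `/chathub` and its sub-paths match. A plain prefix such as `/js` would also admit a path like `/jsfoo`, which may be worth keeping in mind when new prefixes are added.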
--- a/apps/fc-desktop/remotedesktop-pools.yaml
+++ b/apps/fc-desktop/remotedesktop-pools.yaml
@@ -0,0 +1,30 @@
 # FlowerCore RemoteDesktop warm-pool posture.
 #
 # The RemoteDesktop Web and Operator Deployments remain owned by
 # FlowerCore.RemoteDesktop. bluejay-infra owns these GitOps pool intents so
 # rebuilds preserve the operational posture without baking it into service code.
 ---
 apiVersion: flowercore.io/v1
 kind: RemoteDesktopPoolCrd
 metadata:
  name: qt-sdk-pool
  namespace: fc-desktop
  labels:
    app.kubernetes.io/name: remotedesktop-pool
    app.kubernetes.io/component: warm-pool
    app.kubernetes.io/part-of: flowercore-remotedesktop
    flowercore.io/template: dev-workstation
    flowercore.io/image: localhost-fc-desktop-qt-sdk
  annotations:
    flowercore.io/deficit-tolerance: "0"
    flowercore.io/scale-mode: ManualScaleOnDemand
    flowercore.io/image-ref: localhost/fc-desktop:qt-sdk
    flowercore.io/image-pull-policy: Never
 spec:
  templateSlug: dev-workstation
  desiredSize: 0
  enabled: false
  userVolumeMode: LateAttach
  deficitTolerance: 0
  scaleMode: ManualScaleOnDemand
  reconcileNow: false
--- a/apps/fc-devicemgmt/deployment-operator.yaml
+++ b/apps/fc-devicemgmt/deployment-operator.yaml
@@ -47,7 +47,7 @@ spec:
        fsGroupChangePolicy: OnRootMismatch
      containers:
        - name: operator
-          image: localhost/fc-devicemgmt-operator:v20260512-cx5
+          image: localhost/fc-devicemgmt-operator:v20260519-sp34cl3-fix
          imagePullPolicy: Never
          ports:
            - name: metrics
--- a/apps/fc-devicemgmt/deployment-web.yaml
+++ b/apps/fc-devicemgmt/deployment-web.yaml
@@ -4,6 +4,22 @@
 # Sprint 9+ lane. This manifest is static-valid without requiring the image to
 # exist yet; import localhost/fc-devicemgmt-web:<tag> to all schedulable RKE2
 # nodes before letting ArgoCD sync a live rollout.
 #
 # SCALED TO 0 — 2026-05-19 morning-routine cleanup.
 # The Web pod cannot start until TWO upstream gaps close:
 #   1. MySQL DB instance `flowercore_devicemgmt` (user `fc_devicemgmt`) is
 #      provisioned via fc-mysql Manager. The cluster currently has ZERO
 #      MySqlInstanceCrds and no `mysql.fc-mysql.svc:3306` Service, so the
 #      deployment-web container env `FlowerCore__Database__Host=mysql.fc-mysql.svc`
 #      points at nothing. Provision via the fc-mysql Manager UI/REST/MCP.
 #   2. 1Password vault item `IAmWorkin/FlowerCore DeviceManagement Runtime`
 #      with 5 fields (DB-Password, mtls-ca.pem, mtls-client.crt, mtls-client.key,
 #      mtls-chain.pem) — see apps/fc-devicemgmt/1password-item.yaml. Mint mTLS
 #      from step-ca-agent ClusterIssuer per ADR-126; DB-Password must match the
 #      password configured for the MySQL user.
 # Re-enable: change replicas back to 2 after both gaps close. The image tag
 # in this file (v20260512-cx5) MAY also need a refresh — it predates the
 # Sprint 34 Cl-3 operator fix; Web may have an analogous bug.
 apiVersion: apps/v1
 kind: Deployment
 metadata:
@@ -20,7 +36,7 @@ metadata:
  annotations:
    flowercore.io/traceability-standard: k8s-pod-ownership-and-traceability-standard
 spec:
-  replicas: 2
+  replicas: 0
  revisionHistoryLimit: 3
  selector:
    matchLabels:
--- a/apps/fc-ttsreader/fc-ttsreader.yaml
+++ b/apps/fc-ttsreader/fc-ttsreader.yaml
@@ -532,7 +532,7 @@ spec:
        fsGroupChangePolicy: OnRootMismatch
      containers:
        - name: web
-          image: localhost/fc-ttsreader-web:v20260506-phase6
+          image: localhost/fc-ttsreader-web:v20260518-sprint36-demo-finish-b132cbf
          imagePullPolicy: Never
          ports:
            - containerPort: 5217
@@ -555,9 +555,13 @@ spec:
            - name: TtsReader__Jobs__Root
              value: "/data/jobs"
            - name: TtsReader__Piper__Host
-              value: "ttsreader-piper.fc-ttsreader.svc.cluster.local."
+              value: "10.0.57.17"
            - name: TtsReader__Piper__Port
-              value: "10200"
+              value: "8500"
            - name: TtsReader__Piper__Transport
              value: "http"
            - name: TtsReader__Piper__HttpPath
              value: "/tts"
            - name: TtsReader__Kokoro__Enabled
              value: "true"
            - name: TtsReader__Kokoro__BaseUrl
--- a/apps/github-runner/README.md
+++ b/apps/github-runner/README.md
@@ -0,0 +1,76 @@
 # GitHub Runner Fleet
 ArgoCD owns `apps/github-runner/github-runner.yaml`. Do not patch live runner
 Deployments with `kubectl`; update this manifest and let ArgoCD reconcile.
 ## Runner Shape
 All repo-scoped Linux runners use:
 - `ACCESS_TOKEN` from the `github-runner-token` Secret
 - `RUN_AS_ROOT=false`
 - `EPHEMERAL=true`
 - `LABELS=self-hosted,linux,fc-build-linux`
 - writable non-root paths under `/home/runner` for .NET, NuGet, XDG cache, and
  Actions tool cache
 `github-runner` for `FlowerCore.Common` is single-replica because it retains the
 original Longhorn ReadWriteOnce NuGet PVC. Every other repo-scoped runner uses
 two replicas with per-pod `emptyDir` caches. That is the safe backlog-drain
 strategy: no two pods share one RWO PVC.
 Sprint 32 final long-tail wave adds 16 two-replica Deployments:
 `FlowerCore.Knowledge`, `FlowerCore.LlmBridge`, `FlowerCore.Media`,
 `FlowerCore.Presentations`, `FlowerCore.RemoteDesktop`, `FlowerCore.DNS`,
 `FlowerCore.Distribution`, `FlowerCore.Scoreboard`,
 `FlowerCore.SegmentDisplay`, `FlowerCore.Signage.Contracts`,
 `FlowerCore.SignalControl`, `FlowerCore.Intranet.Web`,
 `FlowerCore.Provisioning`, `FlowerCore.Redis`, `FlowerCore.MessageBoard`, and
 `FlowerCore.MenuBoard`.
 ## Post-Merge Proof
 After the PR is merged and ArgoCD syncs, verify the runner fleet:
 ```bash
 kubectl -n github-runner get deploy,pods,pvc
 ```
 Verify GitHub registration for the repo-scoped runners:
 ```bash
 for repo in FlowerCore.Common FlowerCore.Shared.Pos FlowerCore.Puppet FlowerCore.Signage \
            FlowerCore.DMS FlowerCore.Telephony FlowerCore.Print.Web FlowerCore.Chat \
            FlowerCore.MySQL FlowerCore.Kiosk.Linux FlowerCore.Marquee FlowerCore.TtsReader \
            FlowerCore.Knowledge FlowerCore.LlmBridge FlowerCore.Media \
            FlowerCore.Presentations FlowerCore.RemoteDesktop FlowerCore.DNS \
            FlowerCore.Distribution FlowerCore.Scoreboard FlowerCore.SegmentDisplay \
            FlowerCore.Signage.Contracts FlowerCore.SignalControl FlowerCore.Intranet.Web \
            FlowerCore.Provisioning FlowerCore.Redis FlowerCore.MessageBoard \
            FlowerCore.MenuBoard; do
  echo "=== $repo ==="
  gh api "/repos/astoltz/$repo/actions/runners" \
    --jq '.runners[] | select(.labels[].name == "fc-build-linux") | {name,status,busy,labels:[.labels[].name]}'
 done
 ```
 Shared.Pos publish proof after the runner pod is online:
 ```bash
 gh run list --repo astoltz/FlowerCore.Shared.Pos \
  --workflow "Build, Test & Publish" --branch main --limit 5
 ```
 If the latest run is still queued after runner registration, rerun the workflow
 from GitHub Actions and verify it lands on an `rke2-linux-*` runner.
 ## Failure Notes
 - `actions/setup-dotnet` permission error at `/usr/share/dotnet`: check that
  `DOTNET_INSTALL_DIR=/home/runner/.dotnet` and related cache env vars are
  present on the runner pod.
 - `404` during runner registration: the fine-grained PAT is valid but missing
  repository access for that repo. Add the repo to the PAT access list; the PAT
  value does not change.
 - `Multi-Attach` volume error: only the Common runner uses a RWO PVC and it must
  stay single-replica. New multi-replica runners use `emptyDir`.
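
The registration check under Post-Merge Proof filters each repo's runner list by the `fc-build-linux` label. A minimal Python sketch of that filter is shown here for cases where the JSON is already on disk or in a test fixture. It assumes the response shape of the GitHub "list self-hosted runners for a repository" endpoint (a top-level `runners` array whose items carry `name`, `status`, `busy`, and `labels[].name`). The function name `runners_with_label` and the sample payload are hypothetical.

```python
def runners_with_label(payload, label="fc-build-linux"):
    """Return name/status/busy/labels for every runner carrying `label`."""
    selected = []
    for runner in payload.get("runners", []):
        names = [entry["name"] for entry in runner.get("labels", [])]
        if label in names:
            selected.append({
                "name": runner["name"],
                "status": runner["status"],
                "busy": runner["busy"],
                "labels": names,
            })
    return selected


if __name__ == "__main__":
    # Hypothetical payload for illustration only; not captured from a real repo.
    sample = {
        "total_count": 2,
        "runners": [
            {"name": "rke2-linux-example-1", "status": "online", "busy": False,
             "labels": [{"name": "self-hosted"}, {"name": "linux"},
                        {"name": "fc-build-linux"}]},
            {"name": "macmini-example", "status": "online", "busy": True,
             "labels": [{"name": "self-hosted"}, {"name": "macOS"}]},
        ],
    }
    for runner in runners_with_label(sample):
        print(runner)
```

Unlike the `jq` expression, which emits one object per matching label entry, this sketch emits each runner at most once.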
--- a/apps/github-runner/github-runner.yaml
+++ b/apps/github-runner/github-runner.yaml
--- a/apps/monitoring/noc-monitoring.yaml
+++ b/apps/monitoring/noc-monitoring.yaml
@@ -723,6 +723,24 @@ data:
              summary: "Mac mini GitHub runner offline ({{ $labels.runner }})"
              description: "A macmini-* GitHub Actions runner has not reported online for more than 10 minutes. Puppet manages its LaunchDaemon under /Library/LaunchDaemons/io.flowercore.github-runner-<slug>.plist; runners survive reboot and do not require a GUI session."
      - name: linux-runners
        rules:
          - alert: LinuxRunnerOffline
            expr: |
              kube_deployment_status_replicas_ready{
                namespace="github-runner",
                deployment=~"github-runner(|-(sharedpos|puppet|signage|dms|telephony|print-web|chat|mysql|kiosk-linux))"
              } == 0
            for: 5m
            labels:
              severity: warning
              alert_channel: irc
              service: github-runner
              team: ci
            annotations:
              summary: "Linux CI runner offline: {{ $labels.deployment }}"
              description: "Deployment {{ $labels.deployment }} in namespace github-runner has 0 ready replicas for more than 5 minutes. CI jobs targeting this repo will queue until the runner pod restarts and re-registers with GitHub. Check pods with: kubectl -n github-runner get pods -l app.kubernetes.io/name={{ $labels.deployment }}. Check logs with: kubectl -n github-runner logs -l app.kubernetes.io/name={{ $labels.deployment }} --tail=50. Common causes: PAT missing repo access, runner CrashLoopBackOff, or node/resource pressure."
      - name: remote-desktop
        rules:
          - alert: RemoteDesktopWebDown
@@ -948,6 +966,52 @@ data:
            annotations:
              summary: "Disk usage high on {{ $labels.instance }} ({{ $value | printf \"%.1f\" }}%)"
      # Puppet agent + service alerts.
      # Mirror of FlowerCore.Notes/scripts/monitoring/alerts.yml `puppet` group
      # so a future migration to in-cluster Prometheus inherits the ruleset.
      # Source-of-truth for the live Podman Prometheus on noc1 is the Notes file.
      # See feedback_monitoring_k8s_target_vs_live_podman.
      - name: puppet
        rules:
          - alert: PuppetAgentReportStale
            expr: puppet_last_run_age_seconds > 7200
            for: 30m
            labels:
              severity: warning
              alert_channel: irc
            annotations:
              summary: "Puppet agent {{ $labels.instance }} hasn't reported in over 2h"
              description: "Last run age: {{ $value | humanizeDuration }}. The puppet agent on {{ $labels.instance }} may be stopped, the node may be powered off, or noc1 may be unreachable from this node."
              runbook: "1. SSH to node (via noc1 jumpbox if needed) 2. sudo systemctl status puppet 3. sudo puppet agent -t --noop to force a run 4. Check r10k: ssh fcadmin@10.0.56.10 'sudo podman logs openvoxserver --tail 50' 5. Verify noc1 reachability: ping puppet.iamworkin.lan"
          - alert: PuppetAgentReportCritical
            expr: puppet_last_run_age_seconds > 86400
            for: 1h
            labels:
              severity: critical
              alert_channel: irc
            annotations:
              summary: "Puppet agent {{ $labels.instance }} silent for over 24h — node is unmanaged"
              description: "Last run age: {{ $value | humanizeDuration }}. Node {{ $labels.instance }} has not submitted a Puppet report in over 24 hours. Config drift is accumulating — investigate immediately. If intentional (maintenance), add to the exclusion filter or silence in Grafana."
              runbook: "URGENT: 1. Check node power state 2. SSH via noc1 jumpbox: ssh fcadmin@10.0.56.10 then ssh <node> 3. sudo systemctl status puppet 4. sudo systemctl start puppet + sudo puppet agent -t 5. Check for network partitions (VLAN connectivity to 10.0.56.10) 6. If node was recently reimaged: sudo puppet agent -t to re-register with new SSL cert"
          # Sprint 33 Cx-7 Phase B (2026-05-25 postmortem follow-up):
          # Detects puppet.service in failed state — distinct from PuppetAgentReportStale
          # which catches "agent hasn't run." This catches "systemd gave up restarting it"
          # (CA-verify loop or other fatal exit). Requires node-exporter systemd collector
          # enabled with --collector.systemd. If `node_systemd_unit_state` has no series
          # for a node, the collector is disabled there — flag in postmortem follow-up.
          - alert: PuppetServiceFailed
            expr: node_systemd_unit_state{name="puppet.service",state="failed"} == 1
            for: 5m
            labels:
              severity: warning
              alert_channel: irc
            annotations:
              summary: "Puppet service failed on {{ $labels.instance }}"
              description: "puppet.service on {{ $labels.instance }} has been in a failed state for 5+ minutes. systemd has stopped auto-restarting it (CA-verify loop or other fatal exit). Confirm with `systemctl status puppet`, run `sudo systemctl start puppet` to recover, then check the journal for the root cause."
              runbook_url: "https://github.com/astoltz/FlowerCore.Notes/blob/master/memory/feedback_puppet_service_dead_after_ca_loop_alert_misreads.md"
      # K8s pod-state alerts. Require kube-state-metrics scrape (added
      # 2026-04-26 — see scrape_configs above). Would have surfaced the
      # agent-zero ollama-proxy 172x crash-loop instead of letting it
@@ -1209,24 +1273,55 @@ metadata:
 data:
  notify.py: |
    #!/usr/bin/env python3
-    """HTTP->IRC alert relay with thermal printer forwarding for Grafana webhooks.
+    """HTTP->IRC alert relay with thermal-printer DIGEST forwarding.
-    Listens on :9119, posts to #alerts on UnrealIRCd via raw IRC protocol.
+
-    Alerts tagged alert_channel=thermal_print also POST to Print.Web /api/print/alert.
+    Listens on :9119, posts to #alerts on UnrealIRCd, forwards to Print.Web
    /api/print/alert. Thermal printing is BATCHED into hourly digests by
    default so the printer no longer spam-fires per Grafana webhook.
    Routing (per Grafana webhook alert):
      - IRC: always per-event (operator likes the stream)
      - Thermal printer:
          * label alert_channel=thermal_print_immediate -> flush the digest NOW
            (first arrival of a new fingerprint only)
          * severity in {critical,disaster,page} OR
            label alert_channel=thermal_print -> enqueue into hourly digest
          * everything else -> IRC only
      - RESOLVED webhooks remove the alert from the digest buffer
    Env vars (defaults preserve old behavior on first deploy):
      THERMAL_PRINT_ENABLED  default "true"   - master kill switch
      BATCH_INTERVAL_MIN     default "60"     - minutes between digest prints
      BATCH_MAX_PENDING      default "50"     - force-flush threshold
    HTTP surface:
      POST /         - Grafana webhook entry
      POST /flush    - manual digest flush (idempotent)
      GET  /         - status + config + buffer depth + stats
    """
-    import json, socket, sys, time
+    import json, os, socket, sys, threading, time
    from collections import defaultdict
    from datetime import datetime, timezone
    from http.server import HTTPServer, BaseHTTPRequestHandler
    from urllib.request import Request, urlopen
    from urllib.error import URLError
-    IRC_HOST = "unrealircd.irc.svc"  # short name: CoreDNS ndots:5 + iamworkin.lan template hijacks full .cluster.local (see memory)
+    THERMAL_PRINT_ENABLED = os.environ.get("THERMAL_PRINT_ENABLED", "true").lower() == "true"
-    IRC_PORT = 6667
+    BATCH_INTERVAL_MIN    = int(os.environ.get("BATCH_INTERVAL_MIN", "60"))
-    IRC_NICK = "grafana-bot"
+    BATCH_MAX_PENDING     = int(os.environ.get("BATCH_MAX_PENDING", "50"))
-    IRC_CHANNEL = "#alerts"
+
-    PRINT_WEB_URL = "http://10.0.57.16:5200/api/print/alert"
+    IRC_HOST      = os.environ.get("IRC_HOST", "unrealircd.irc.svc")
-    PRINT_ENABLED = True
+    IRC_PORT      = int(os.environ.get("IRC_PORT", "6667"))
    IRC_NICK      = os.environ.get("IRC_NICK", "grafana-bot")
    IRC_CHANNEL   = os.environ.get("IRC_CHANNEL", "#alerts")
    PRINT_WEB_URL = os.environ.get("PRINT_WEB_URL", "http://10.0.57.16:5200/api/print/alert")
    _buffer_lock = threading.Lock()
    _buffer = {}   # fingerprint -> {"alert": dict, "first_seen": float, "last_seen": float}
    _last_flush_time = time.time()
    _stats = {"webhooks_received": 0, "irc_sent": 0, "print_immediate": 0,
              "digest_flushed": 0, "buffer_dedup": 0, "buffer_added": 0,
              "buffer_resolved": 0, "started_at": time.time()}
    def send_irc(message):
        """Connect, handle PING, join, send, quit."""
        try:
            sock = socket.create_connection((IRC_HOST, IRC_PORT), timeout=15)
            sock.sendall(f"NICK {IRC_NICK}\r\n".encode())
@@ -1259,52 +1354,137 @@ data:
            time.sleep(0.5)
            sock.sendall(b"QUIT :alert delivered\r\n")
            sock.close()
            _stats["irc_sent"] += 1
            return True
        except Exception as e:
            print(f"[irc-notify] IRC send failed: {e}", file=sys.stderr)
            return False
-    def send_thermal_print(alert):
+    def post_thermal(payload, kind):
-        if not PRINT_ENABLED: return
+        if not THERMAL_PRINT_ENABLED:
-        labels = alert.get("labels", {})
+            print(f"[irc-notify] thermal disabled; skip {kind} ({payload.get('title','?')[:40]})", file=sys.stderr)
-        annotations = alert.get("annotations", {})
+            return False
        status = alert.get("status", "firing").upper()
        summary = annotations.get("summary", "")
        description = annotations.get("description", "")
        runbook = annotations.get("runbook", "")
        # Build a useful message: summary + description + runbook steps
        parts = []
        if summary: parts.append(summary)
        if description and description != summary: parts.append(description)
        if runbook: parts.append("STEPS: " + runbook)
        message = " | ".join(parts) if parts else labels.get("alertname", "Unknown alert")
        payload = {
            "title": labels.get("alertname", "Unknown"),
            "severity": labels.get("severity", "warning").capitalize(),
            "host": labels.get("instance", labels.get("host", "unknown")),
            "message": message,
            "eventId": alert.get("fingerprint", ""),
            "source": "Grafana",
            "status": "RESOLVED" if status == "RESOLVED" else "PROBLEM",
            "acknowledged": False
        }
        try:
            req = Request(PRINT_WEB_URL, data=json.dumps(payload).encode("utf-8"),
                          headers={"Content-Type": "application/json"}, method="POST")
            resp = urlopen(req, timeout=10)
-            print(f"[irc-notify] Thermal print sent: {resp.read().decode()}", file=sys.stderr)
+            if kind == "immediate": _stats["print_immediate"] += 1
            print(f"[irc-notify] thermal {kind} sent: {payload.get('title','?')[:50]}", file=sys.stderr)
            return True
        except Exception as e:
-            print(f"[irc-notify] Thermal print failed: {e}", file=sys.stderr)
+            print(f"[irc-notify] thermal {kind} failed: {e}", file=sys.stderr)
            return False
-    def should_print(alert):
+    def fingerprint_of(alert):
        fp = alert.get("fingerprint", "")
        if fp: return fp
        labels = alert.get("labels", {})
-        if labels.get("alert_channel") == "thermal_print": return True
+        target = labels.get("pod") or labels.get("instance") or labels.get("deployment") or labels.get("statefulset") or labels.get("namespace") or ""
-        if labels.get("severity", "").lower() in ("critical", "disaster"): return True
+        return f"{labels.get('alertname','?')}/{labels.get('namespace','')}/{target}"
-        if alert.get("status", "").upper() == "RESOLVED": return False
+
-        return False
+    def is_critical(alert):
        return alert.get("labels", {}).get("severity", "").lower() in ("critical", "disaster", "page")
    def is_immediate_label(alert):
        return alert.get("labels", {}).get("alert_channel") == "thermal_print_immediate"
    def is_batched_label(alert):
        return alert.get("labels", {}).get("alert_channel") == "thermal_print"
    def add_to_digest(alert):
        """Add an alert to the digest buffer. Returns True if the buffer GREW
        (new fingerprint), False if it was a dedup, resolution, or no-op.
        """
        if not THERMAL_PRINT_ENABLED: return False
        fp = fingerprint_of(alert)
        status = alert.get("status", "firing").lower()
        with _buffer_lock:
            if status == "resolved":
                if fp in _buffer:
                    del _buffer[fp]
                    _stats["buffer_resolved"] += 1
                return False
            if fp in _buffer:
                _buffer[fp]["last_seen"] = time.time()
                _buffer[fp]["alert"] = alert
                _stats["buffer_dedup"] += 1
                return False
            _buffer[fp] = {"alert": alert, "first_seen": time.time(), "last_seen": time.time()}
            _stats["buffer_added"] += 1
            return True
    def build_digest_payload():
        with _buffer_lock:
            items = list(_buffer.values())
        if not items: return None
        by_name = defaultdict(list)
        for item in items:
            labels = item["alert"].get("labels", {})
            by_name[labels.get("alertname", "Unknown")].append(item)
        lines = []
        for name, group in sorted(by_name.items()):
            targets = []
            for it in group[:5]:
                labels = it["alert"].get("labels", {})
                t = (labels.get("pod") or labels.get("instance") or labels.get("deployment")
                     or labels.get("statefulset") or labels.get("namespace") or "?")
                targets.append(t)
            more = f" (+{len(group)-5})" if len(group) > 5 else ""
            sevs = sorted({it["alert"].get("labels", {}).get("severity", "warning") for it in group})
            lines.append(f"[{'/'.join(sevs)}] {name} x{len(group)}: {', '.join(targets)}{more}")
        now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
        title = f"Alert digest: {len(items)} firing"
        body = "\n".join([
            f"=== {title} ===",
            f"as of {now}",
            "",
            *lines,
            "",
            "Stream: #alerts (IRC)  |  Triage: grafana-noc1.iamworkin.lan",
            "Force-flush: POST irc-notify.monitoring.svc:9119/flush",
        ])
        return {"title": title, "severity": "Warning", "host": "monitoring",
                "message": body, "eventId": f"digest-{int(time.time())}",
                "source": "Grafana digest", "status": "PROBLEM", "acknowledged": False}
    def flush_digest():
        payload = build_digest_payload()
        if payload is None:
            print("[irc-notify] flush: buffer empty, no digest sent", file=sys.stderr)
            return False
        sent = post_thermal(payload, "digest")
        with _buffer_lock:
            _buffer.clear()
        if sent: _stats["digest_flushed"] += 1
        return sent
    def digest_loop():
        global _last_flush_time
        while True:
            try:
                now = time.time()
                elapsed = now - _last_flush_time
                if elapsed >= BATCH_INTERVAL_MIN * 60:
                    print(f"[irc-notify] digest tick: interval reached ({BATCH_INTERVAL_MIN}m); buffer={len(_buffer)}", file=sys.stderr)
                    flush_digest()
                    _last_flush_time = now
                elif len(_buffer) >= BATCH_MAX_PENDING:
                    print(f"[irc-notify] digest tick: buffer full ({len(_buffer)}); force flush", file=sys.stderr)
                    flush_digest()
                    _last_flush_time = now
                time.sleep(15)
            except Exception as e:
                print(f"[irc-notify] digest loop error: {e}", file=sys.stderr)
                time.sleep(60)
    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path == "/flush":
                ok = flush_digest()
                self.send_response(200); self.send_header("Content-Type", "application/json"); self.end_headers()
                self.wfile.write(json.dumps({"flushed": ok, "buffer_after": len(_buffer)}).encode())
                return
            _stats["webhooks_received"] += 1
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length)) if length else {}
            for alert in body.get("alerts", []):
@@ -1319,22 +1499,56 @@ data:
                msg = f"{icon}{sev_tag} {name}: {summary}"
                if desc: msg += f"\n  {desc}"
                send_irc(msg)
-                if should_print(alert): send_thermal_print(alert)
+                # Thermal routing — EVERYTHING (including criticals) goes into
-            self.send_response(200)
+                # the hourly digest. Only the explicit `alert_channel=thermal_print_immediate`
-            self.send_header("Content-Type", "application/json")
+                # label bypasses, and even that flushes-the-current-digest rather
-            self.end_headers()
+                # than printing a standalone job, so the same fingerprint can't
                # spam the printer per webhook cycle.
                if status == "RESOLVED":
                    add_to_digest(alert)  # removes from buffer
                    continue
                if is_immediate_label(alert):
                    # Explicit opt-in for "paper this NOW" — first arrival of a
                    # new fingerprint triggers an immediate digest flush; repeat
                    # webhooks for the same fingerprint dedupe in the buffer
                    # until the next interval or until the alert resolves.
                    new_in_buffer = add_to_digest(alert)
                    if new_in_buffer:
                        global _last_flush_time
                        flush_digest()
                        _last_flush_time = time.time()
                elif is_critical(alert) or is_batched_label(alert):
                    add_to_digest(alert)
                # else: IRC-only (warnings without thermal_print label)
            self.send_response(200); self.send_header("Content-Type", "application/json"); self.end_headers()
            self.wfile.write(b'{"status":"ok"}')
        def do_GET(self):
-            self.send_response(200)
+            self.send_response(200); self.send_header("Content-Type", "application/json"); self.end_headers()
-            self.send_header("Content-Type", "application/json")
+            with _buffer_lock:
-            self.end_headers()
+                alertnames = sorted({it["alert"].get("labels", {}).get("alertname", "?") for it in _buffer.values()})
-            self.wfile.write(json.dumps({"service":"irc-notify","thermal_print":PRINT_ENABLED}).encode())
+                depth = len(_buffer)
            info = {
                "service": "irc-notify",
                "config": {"thermal_print_enabled": THERMAL_PRINT_ENABLED,
                           "batch_interval_min": BATCH_INTERVAL_MIN,
                           "batch_max_pending": BATCH_MAX_PENDING,
                           "irc_target": f"{IRC_HOST}:{IRC_PORT} {IRC_CHANNEL}",
                           "print_web_url": PRINT_WEB_URL},
                "buffer": {"depth": depth, "alertnames": alertnames,
                           "seconds_since_last_flush": int(time.time() - _last_flush_time),
                           "seconds_until_next_flush": max(0, int(BATCH_INTERVAL_MIN*60 - (time.time() - _last_flush_time)))},
                "stats": _stats,
            }
            self.wfile.write(json.dumps(info, indent=2).encode())
        def log_message(self, format, *args):
            print(f"[irc-notify] {args[0]}", file=sys.stderr)
    if __name__ == "__main__":
        threading.Thread(target=digest_loop, daemon=True).start()
        server = HTTPServer(("0.0.0.0", 9119), Handler)
-        print(f"IRC alert relay :9119 -> {IRC_HOST}:{IRC_PORT} {IRC_CHANNEL} (thermal: {PRINT_ENABLED})")
+        print(f"[irc-notify] :9119 -> IRC {IRC_HOST}:{IRC_PORT} {IRC_CHANNEL} | thermal={'ON' if THERMAL_PRINT_ENABLED else 'OFF'} | digest={BATCH_INTERVAL_MIN}m max={BATCH_MAX_PENDING}", file=sys.stderr)
        server.serve_forever()
 # =============================================================================
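The `buffer` block of the `GET /` payload in the hunk above comes down to two clock subtractions. A minimal sketch of that arithmetic, assuming the interval is given in minutes and the last flush is a `time.time()` timestamp; `flush_timing` and its parameters are hypothetical names for illustration, not part of notify.py:

```python
import time


def flush_timing(last_flush_time, batch_interval_min, now=None):
    """Return (seconds_since_last_flush, seconds_until_next_flush).

    Hypothetical helper mirroring the arithmetic in do_GET; `now` is
    injectable so the calculation can be checked without sleeping.
    """
    if now is None:
        now = time.time()
    elapsed = now - last_flush_time
    since = int(elapsed)
    # Clamp at zero: an overdue flush reports 0, never a negative countdown.
    until = max(0, int(batch_interval_min * 60 - elapsed))
    return since, until
```

Passing `now` explicitly keeps the helper deterministic; the handler itself reads the clock twice, so its two values can differ by a fraction of a second.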
@@ -3421,6 +3635,39 @@ data:
                relativeTimeRange: {from: 120, to: 0}
                datasourceUid: __expr__
                model: {type: threshold, expression: B, conditions: [{evaluator: {params: [600], type: gt}}], refId: C}
      - orgId: 1
        name: CI Runners
        folder: CI Alerts
        interval: 1m
        rules:
          - uid: linux-runner-offline
            title: LinuxRunnerOffline
            condition: C
            for: 5m
            noDataState: OK
            execErrState: Error
            annotations:
              summary: "Linux CI runner offline: {{ $labels.deployment }}"
              description: "A github-runner namespace Deployment has 0 ready replicas for more than 5 minutes. CI jobs targeting that repo will queue until the runner pod restarts and re-registers."
              runbook: "1. kubectl -n github-runner get pods -l app.kubernetes.io/name={{ $labels.deployment }} 2. kubectl -n github-runner logs -l app.kubernetes.io/name={{ $labels.deployment }} --tail=50 3. Verify PAT repo access if registration returns 404 4. Verify no RWO PVC is shared by scaled runners"
            labels:
              severity: warning
              service: github-runner
              alert_channel: irc
              team: ci
            data:
              - refId: A
                relativeTimeRange: {from: 300, to: 0}
                datasourceUid: prometheus
                model: {expr: 'kube_deployment_status_replicas_ready{namespace="github-runner",deployment=~"github-runner(|-(sharedpos|puppet|signage|dms|telephony|print-web|chat|mysql|kiosk-linux))"} == 0', instant: true, refId: A}
              - refId: B
                relativeTimeRange: {from: 300, to: 0}
                datasourceUid: __expr__
                model: {type: reduce, expression: A, reducer: last, refId: B}
              - refId: C
                relativeTimeRange: {from: 300, to: 0}
                datasourceUid: __expr__
                model: {type: threshold, expression: B, conditions: [{evaluator: {params: [0], type: gt}}], refId: C}
      - orgId: 1
        name: Infrastructure
        folder: AI Stack Alerts
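The `deployment=~"github-runner(|-(...))"` matcher in the LinuxRunnerOffline rule relies on an empty first alternative so that the bare `github-runner` Deployment matches alongside the suffixed ones. A small sketch of which names it selects, using Python's `re.fullmatch` as a stand-in for Prometheus' fully anchored regex matching; `is_monitored_runner` is a hypothetical helper, and the non-matching names below are made up for illustration:

```python
import re

# Pattern copied from the LinuxRunnerOffline rule's deployment matcher.
RUNNER_PATTERN = re.compile(
    r"github-runner(|-(sharedpos|puppet|signage|dms|telephony|print-web|chat|mysql|kiosk-linux))"
)


def is_monitored_runner(deployment):
    # Prometheus anchors =~ matchers at both ends, so fullmatch is the
    # closest Python equivalent.
    return RUNNER_PATTERN.fullmatch(deployment) is not None


if __name__ == "__main__":
    for name in ("github-runner", "github-runner-print-web", "github-runner-windows"):
        print(name, is_monitored_runner(name))
```

A runner Deployment added later with a new suffix would not be covered until the alternation is extended, which is why the lint test pins the exact pattern string.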
--- a/apps/worldbuilder/README.md
+++ b/apps/worldbuilder/README.md
@@ -28,9 +28,12 @@ Source: `D:\git\FlowerCore\FlowerCore.WorldBuilder` (master)
   Memory: `feedback_rke2_image_import_per_node_scp`.
 3. **Bump image tag** in `worldbuilder.yaml` and git push.
   ArgoCD ApplicationSet picks up within ~3 minutes.
-4. **First production render** — open `https://worldbuilder.iamworkin.lan`,
-   create World → Character → Storyboard → ExportJob, confirm artifact
-   downloads. ComfyUI lives on BLUEJAY-WS at `http://10.0.56.20:8188`.
+4. **First production render** — open
+   `https://worldbuilder.iamworkin.lan/studio/c32e0000-0000-4000-8000-000000000004`
+   and confirm the Cyberpunk Blue Jay demo prompt loads with five seeded fake
   generated images. This Sprint 32 visitor-safe profile uses
   `ClientMode=fake`; switch the image-generation env vars back to ComfyUI only
   for an operator-owned GPU render lane.
 ## Health probes
@@ -53,8 +56,13 @@ Source: `D:\git\FlowerCore\FlowerCore.WorldBuilder` (master)
 ## Image generation backend
-`FlowerCore:WorldBuilder:ImageGeneration:BaseUrl=http://10.0.56.20:8188` —
-ComfyUI runs on BLUEJAY-WS Windows (R9700 / gfx1201 / ROCm 7.2.1). Pod reaches
-the workstation directly across the 10.0.56.0/24 VLAN (no Podman-style host-
-filter issues — K8s pods route via Calico, which is L3-routed across the
-VLAN).
+Sprint 32 pins the Kubernetes profile to
+`FlowerCore:WorldBuilder:ImageGeneration:ClientMode=fake` with
+`BaseUrl=http://127.0.0.1:1`. That keeps the public/internal visitor demo
+deterministic, avoids GPU exposure, and still exercises the studio/gallery
+surface with persisted generated-image metadata.
 The previous ComfyUI backend target was `http://10.0.56.20:8188` on
 BLUEJAY-WS (R9700 / gfx1201 / ROCm 7.2.1). Re-enable it only in an
 operator-owned follow-up that also verifies workstation reachability and image
 import freshness.
--- a/apps/worldbuilder/worldbuilder.yaml
+++ b/apps/worldbuilder/worldbuilder.yaml
@@ -16,7 +16,11 @@ kind: Namespace
 metadata:
  name: fc-worldbuilder
  labels:
    app.kubernetes.io/name: fc-worldbuilder
    app.kubernetes.io/part-of: flowercore
    app.kubernetes.io/managed-by: argocd
    flowercore.io/tenant-id: system
    flowercore.io/created-by: bluejay-infra
 ---
 # SQLite DB + generated image gallery + PDF/PNG exports.
 # Longhorn RWO — single replica with `Recreate` rollout strategy keeps it safe.
@@ -25,6 +29,13 @@ kind: PersistentVolumeClaim
 metadata:
  name: worldbuilder-data
  namespace: fc-worldbuilder
  labels:
    app.kubernetes.io/name: worldbuilder-data
    app.kubernetes.io/component: storage
    app.kubernetes.io/part-of: flowercore
    app.kubernetes.io/managed-by: argocd
    flowercore.io/tenant-id: system
    flowercore.io/created-by: bluejay-infra
 spec:
  accessModes:
    - ReadWriteOnce
@@ -40,7 +51,13 @@ metadata:
  namespace: fc-worldbuilder
  labels:
    app.kubernetes.io/name: worldbuilder-web
    app.kubernetes.io/component: web
    app.kubernetes.io/part-of: flowercore
    app.kubernetes.io/managed-by: argocd
    flowercore.io/tenant-id: system
    flowercore.io/created-by: bluejay-infra
  annotations:
    flowercore.io/traceability-standard: k8s-pod-ownership-and-traceability-standard
 spec:
  replicas: 1
  revisionHistoryLimit: 3
@@ -54,11 +71,16 @@ spec:
    metadata:
      labels:
        app.kubernetes.io/name: worldbuilder-web
        app.kubernetes.io/component: web
        app.kubernetes.io/part-of: flowercore
        app.kubernetes.io/managed-by: argocd
        flowercore.io/tenant-id: system
        flowercore.io/created-by: bluejay-infra
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics/prometheus"
        flowercore.io/audit-trace-id: "worldbuilder-runtime-demo"
    spec:
      securityContext:
        fsGroup: 1654
@@ -92,11 +114,14 @@ spec:
              value: "/data/gallery"
            - name: FlowerCore__WorldBuilder__Export__RootPath
              value: "/data/exports"
-            # ComfyUI on BLUEJAY-WS (R9700 / gfx1201 / ROCm 7.2.1).
+            # Visitor-safe Sprint 32 profile: fake backend keeps public demo
            # rendering deterministic and avoids exposing BLUEJAY-WS GPU.
            - name: FlowerCore__WorldBuilder__ImageGeneration__BaseUrl
-              value: "http://10.0.56.20:8188"
+              value: "http://127.0.0.1:1"
            - name: FlowerCore__WorldBuilder__ImageGeneration__ClientMode
-              value: "comfyui"
+              value: "fake"
            - name: FlowerCore__WorldBuilder__ImageGeneration__BackendId
              value: "fake"
          resources:
            # Cluster CPU-request budget runs hot (99% on all 3 nodes at deploy
            # time) while actual CPU usage is well below capacity. Idle Blazor
@@ -165,7 +190,11 @@ metadata:
  namespace: fc-worldbuilder
  labels:
    app.kubernetes.io/name: worldbuilder-web
    app.kubernetes.io/component: web
    app.kubernetes.io/part-of: flowercore
    app.kubernetes.io/managed-by: argocd
    flowercore.io/tenant-id: system
    flowercore.io/created-by: bluejay-infra
 spec:
  type: ClusterIP
  selector:
@@ -180,6 +209,13 @@ kind: Certificate
 metadata:
  name: worldbuilder-web-tls
  namespace: fc-worldbuilder
  labels:
    app.kubernetes.io/name: worldbuilder-web-tls
    app.kubernetes.io/component: ingress
    app.kubernetes.io/part-of: flowercore
    app.kubernetes.io/managed-by: argocd
    flowercore.io/tenant-id: system
    flowercore.io/created-by: bluejay-infra
 spec:
  secretName: worldbuilder-web-tls
  issuerRef:
@@ -200,6 +236,13 @@ kind: IngressRoute
 metadata:
  name: worldbuilder-web
  namespace: fc-worldbuilder
  labels:
    app.kubernetes.io/name: worldbuilder-web
    app.kubernetes.io/component: ingress
    app.kubernetes.io/part-of: flowercore
    app.kubernetes.io/managed-by: argocd
    flowercore.io/tenant-id: system
    flowercore.io/created-by: bluejay-infra
 spec:
  entryPoints:
    - websecure
--- a/docs/runbooks/openvoxserver-quadlet-durability.md
+++ b/docs/runbooks/openvoxserver-quadlet-durability.md
@@ -0,0 +1,84 @@
 # openvoxserver Quadlet Durability
 This runbook documents the noc1 `openvoxserver` durability fix for the Puppet control-repo deploy path. The service is a noc1 host artifact, not an ArgoCD application, so discovery always starts on noc1 rather than in `apps/*`.
 ## Current State
 As of the Sprint 32 Cx-12 apply on 2026-05-17:
 - `/etc/containers/systemd/openvoxserver.container` has a `GIT_SSH_COMMAND` environment entry that points at the persisted serverdata deploy key.
 - `/etc/systemd/system/openvoxserver-safeconfig.service` is enabled and active, and reapplies `git config --global --add safe.directory *` inside the running container.
 - `/opt/puppet/r10k-deploy.sh` self-heals before each fetch by setting `safe.directory`, the repo-local `core.sshCommand`, and the persisted `known_hosts` file when needed.
 - `puppet-deploy.service` exits `0/SUCCESS` after the apply and the control repo reports `HEAD == origin/master`.
 - `systemctl cat openvoxserver` does not currently resolve to a generated unit on noc1. The container is running through Podman with `restart=always`, so destructive recreate smoke must not run until the generated unit is present.
 ## Discovery
 Run every command through noc1 as `fcadmin`; do not assume BLUEJAY-WS can reach container-local surfaces directly.
 ```bash
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "hostname && sudo -n true"
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "sudo find /etc/containers/systemd /usr/share/containers/systemd /etc/systemd/system -name 'openvoxserver*' 2>/dev/null"
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "sudo sed -n '1,220p' /etc/containers/systemd/openvoxserver.container"
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "sudo systemctl cat puppet-deploy.service"
 ```
 If a future noc1 profile manages these files, update the Puppet control repo and let `puppet-deploy.service` apply the change. On 2026-05-17, host `puppet` was not installed, so Cx-12 used a direct noc1 host edit.
 ## Durable Fix Shape
 The Quadlet keeps the deploy key as a path reference only:
 ```ini
 Environment=GIT_SSH_COMMAND=ssh -i /opt/puppetlabs/server/data/puppetserver/.puppet-deploy-key -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes -o UserKnownHostsFile=/opt/puppetlabs/server/data/puppetserver/.known_hosts
 ```
 The safeconfig service is intentionally independent of `openvoxserver.service` until the generated unit exists. It waits for the `openvoxserver` container name and then runs:
 ```bash
 /usr/bin/podman exec openvoxserver git config --global --add safe.directory "*"
 ```
 The deploy script self-heals inside the container before it fetches the control repo:
 ```bash
 git config --global --add safe.directory "*" 2>/dev/null || true
 DEPLOY_KEY="/opt/puppetlabs/server/data/puppetserver/.puppet-deploy-key"
 KNOWN_HOSTS="/opt/puppetlabs/server/data/puppetserver/.known_hosts"
 REPO="/etc/puppetlabs/code/environments/production"
 export GIT_SSH_COMMAND="ssh -i $DEPLOY_KEY -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes -o UserKnownHostsFile=$KNOWN_HOSTS"
 git -C "$REPO" config core.sshCommand "$GIT_SSH_COMMAND" 2>/dev/null || true
 ```
 ## Validation
 Non-destructive validation:
 ```bash
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "sudo grep -n 'GIT_SSH_COMMAND' /etc/containers/systemd/openvoxserver.container"
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "sudo systemctl status openvoxserver-safeconfig.service --no-pager -l"
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "sudo systemctl start puppet-deploy.service && sudo systemctl status puppet-deploy.service --no-pager -l"
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "sudo podman exec openvoxserver git -C /etc/puppetlabs/code/environments/production config --get core.sshCommand"
 ```
 Destructive recreate smoke is opt-in only:
 ```bash
 scp -i ~/.ssh/fcadmin_ed25519 scripts/monitoring/openvox-recreate-smoke.sh fcadmin@10.0.56.10:/tmp/openvox-recreate-smoke.sh
 ssh -i ~/.ssh/fcadmin_ed25519 fcadmin@10.0.56.10 "chmod +x /tmp/openvox-recreate-smoke.sh && sudo OPENVOX_RECREATE_SMOKE=1 /tmp/openvox-recreate-smoke.sh"
 ```
 Do not run the smoke during normal sprint work. It stops and removes the production container before starting it again through systemd, and it now refuses to continue unless `systemctl cat openvoxserver` succeeds.
 ## Credential Rotation Note
 When rotating the Puppet deploy key, update the persisted serverdata copy on noc1:
 ```bash
 sudo install -m 0600 -o root -g root <new-deploy-key> /opt/puppet/serverdata/.puppet-deploy-key
 sudo podman exec openvoxserver sh -c "ssh-keyscan github.com > /opt/puppetlabs/server/data/puppetserver/.known_hosts"
 sudo systemctl start openvoxserver-safeconfig.service
 sudo systemctl start puppet-deploy.service
 ```
 Never commit the deploy key or print it in logs.
--- a/scripts/monitoring/openvox-recreate-smoke.sh
+++ b/scripts/monitoring/openvox-recreate-smoke.sh
@@ -0,0 +1,48 @@
 #!/usr/bin/env bash
 set -euo pipefail
 if [ "${OPENVOX_RECREATE_SMOKE:-}" != "1" ]; then
  echo "SKIP: set OPENVOX_RECREATE_SMOKE=1 to run the destructive openvoxserver recreate smoke." >&2
  exit 64
 fi
 SUDO="${SUDO:-sudo}"
 REPO="/etc/puppetlabs/code/environments/production"
 CORE_SSH_COMMAND_FRAGMENT=".puppet-deploy-key"
 if ! $SUDO systemctl cat openvoxserver >/dev/null 2>&1; then
  echo "SKIP: systemctl cat openvoxserver failed; refusing to remove a container without a verified systemd recreate path." >&2
  exit 65
 fi
 before="$($SUDO podman exec openvoxserver git -C "$REPO" rev-parse --short HEAD)"
 echo "Before recreate: $before"
 $SUDO systemctl stop openvoxserver
 $SUDO podman rm openvoxserver 2>/dev/null || true
 $SUDO systemctl start openvoxserver
 sleep 50
 $SUDO systemctl start puppet-deploy.service
 sleep 5
 $SUDO systemctl status puppet-deploy.service --no-pager -l
 after="$($SUDO podman exec openvoxserver git -C "$REPO" rev-parse --short origin/master)"
 echo "After recreate origin/master: $after"
 $SUDO test -d /opt/puppet/code/environments/production/site-modules/profile/manifests
 core_ssh="$($SUDO podman exec openvoxserver git -C "$REPO" config --get core.sshCommand)"
 case "$core_ssh" in
  *"$CORE_SSH_COMMAND_FRAGMENT"*) ;;
  *)
    echo "FAIL: core.sshCommand does not reference the persisted deploy key." >&2
    exit 1
    ;;
 esac
 $SUDO podman exec openvoxserver git -C "$REPO" status --short --branch
 echo "PASS: openvoxserver recreate smoke completed without git safety or deploy-key failure."
--- a/tests/bluejay-infra-lint/FleetManifestLintTests.cs
+++ b/tests/bluejay-infra-lint/FleetManifestLintTests.cs
@@ -13,6 +13,7 @@ public sealed class FleetManifestLintTests
    private static readonly HashSet<string> PublicReadOnlyHosts = new(StringComparer.Ordinal)
    {
        "brochure.flowercore.io",
        "dist.flowercore.io",
        "dns.iamworkin.lan",
    };
@@ -54,6 +55,43 @@ public sealed class FleetManifestLintTests
        "ttsreader-piper",
    };
    private static readonly IReadOnlyDictionary<string, string> LinuxRunnerRepos = new Dictionary<string, string>(StringComparer.Ordinal)
    {
        ["github-runner"] = "https://github.com/astoltz/FlowerCore.Common",
        ["github-runner-sharedpos"] = "https://github.com/astoltz/FlowerCore.Shared.Pos",
        ["github-runner-puppet"] = "https://github.com/astoltz/FlowerCore.Puppet",
        ["github-runner-signage"] = "https://github.com/astoltz/FlowerCore.Signage",
        ["github-runner-dms"] = "https://github.com/astoltz/FlowerCore.DMS",
        ["github-runner-telephony"] = "https://github.com/astoltz/FlowerCore.Telephony",
        ["github-runner-print-web"] = "https://github.com/astoltz/FlowerCore.Print.Web",
        ["github-runner-chat"] = "https://github.com/astoltz/FlowerCore.Chat",
        ["github-runner-mysql"] = "https://github.com/astoltz/FlowerCore.MySQL",
        ["github-runner-kiosk-linux"] = "https://github.com/astoltz/FlowerCore.Kiosk.Linux",
    };
    private static readonly HashSet<string> ScaledLinuxRunnerDeployments = new(StringComparer.Ordinal)
    {
        "github-runner-sharedpos",
        "github-runner-puppet",
        "github-runner-signage",
        "github-runner-dms",
        "github-runner-telephony",
        "github-runner-print-web",
        "github-runner-chat",
        "github-runner-mysql",
        "github-runner-kiosk-linux",
    };
    private static readonly IReadOnlyDictionary<string, string> WritableRunnerEnv = new Dictionary<string, string>(StringComparer.Ordinal)
    {
        ["HOME"] = "/home/runner",
        ["DOTNET_INSTALL_DIR"] = "/home/runner/.dotnet",
        ["DOTNET_CLI_HOME"] = "/home/runner",
        ["NUGET_PACKAGES"] = "/home/runner/.nuget/packages",
        ["XDG_CACHE_HOME"] = "/home/runner/.cache",
        ["RUNNER_TOOL_CACHE"] = "/home/runner/_tool",
    };
    [Fact]
    public void IngressRoutes_MustKeepServiceReferencesInTheSameNamespace()
    {
@@ -187,6 +225,98 @@ public sealed class FleetManifestLintTests
        violations.Should().BeEmpty();
    }
    [Fact]
    public void GitHubRunnerFleet_MustRegisterRequiredReposAsRepoScopedDeployments()
    {
        var deployments = GitHubRunnerDeployments();
        foreach (var expectedRunner in LinuxRunnerRepos)
        {
            deployments.Should().ContainKey(expectedRunner.Key);
            var container = deployments[expectedRunner.Key].ContainerMappings().Should().ContainSingle().Subject;
            EnvValue(container, "REPO_URL").Should().Be(expectedRunner.Value);
            EnvValue(container, "EPHEMERAL").Should().Be("true");
            EnvValue(container, "LABELS").Should().Be("self-hosted,linux,fc-build-linux");
            EnvValue(container, "RUN_AS_ROOT").Should().Be("false");
            EnvValue(container, "ACCESS_TOKEN").Should().BeNull("ACCESS_TOKEN must come from github-runner-token Secret, not a literal");
            EnvSecretName(container, "ACCESS_TOKEN").Should().Be("github-runner-token");
            EnvSecretKey(container, "ACCESS_TOKEN").Should().Be("credential");
        }
    }
    [Fact]
    public void GitHubRunnerFleet_MustSetWritableNonRootDotnetAndCachePaths()
    {
        foreach (var deployment in GitHubRunnerDeployments().Values)
        {
            var container = deployment.ContainerMappings().Should().ContainSingle().Subject;
            foreach (var expectedEnv in WritableRunnerEnv)
            {
                EnvValue(container, expectedEnv.Key).Should().Be(expectedEnv.Value, $"{deployment.Name} must keep .NET paths writable for uid 1001");
            }
            var mounts = ManifestNodeExtensions.MappingSequence(container, "volumeMounts")
                .ToDictionary(
                    mount => ManifestNodeExtensions.Scalar(mount, "name") ?? string.Empty,
                    mount => ManifestNodeExtensions.Scalar(mount, "mountPath") ?? string.Empty,
                    StringComparer.Ordinal);
            mounts.Should().Contain("runner-home", "/home/runner");
            mounts.Should().Contain("nuget-cache", "/home/runner/.nuget/packages");
            mounts.Should().Contain("tmp", "/tmp");
        }
    }
    [Fact]
    public void GitHubRunnerFleet_MustAvoidRwoMultiAttachForScaledDeployments()
    {
        var deployments = GitHubRunnerDeployments();
        foreach (var deploymentName in ScaledLinuxRunnerDeployments)
        {
            var deployment = deployments[deploymentName];
            ReplicaCount(deployment).Should().Be(2);
            var volumes = deployment.MappingSequence("spec", "template", "spec", "volumes");
            var claimNames = volumes
                .Select(volume => ManifestNodeExtensions.Scalar(volume, "persistentVolumeClaim", "claimName"))
                .Where(value => !string.IsNullOrWhiteSpace(value))
                .ToList();
            claimNames.Should().BeEmpty($"{deploymentName} is scaled and must not share a RWO PVC");
            volumes.Should().Contain(volume =>
                string.Equals(ManifestNodeExtensions.Scalar(volume, "name"), "nuget-cache", StringComparison.Ordinal)
                && ManifestNodeExtensions.Mapping(volume, "emptyDir") != null);
        }
        var common = deployments["github-runner"];
        ReplicaCount(common).Should().Be(1);
        common.MappingSequence("spec", "template", "spec", "volumes")
            .Select(volume => ManifestNodeExtensions.Scalar(volume, "persistentVolumeClaim", "claimName"))
            .Where(value => !string.IsNullOrWhiteSpace(value))
            .Should()
            .ContainSingle()
            .Which
            .Should()
            .Be("github-runner-nuget-cache");
    }
    [Fact]
    public void Monitoring_MustAlertWhenLinuxRunnerDeploymentIsUnavailable()
    {
        var monitoring = File.ReadAllText(Path.Combine(Inventory.BluejayRoot, "apps", "monitoring", "noc-monitoring.yaml"));
        monitoring.Should().Contain("MacMiniRunnerOffline");
        monitoring.Should().Contain("LinuxRunnerOffline");
        monitoring.Should().Contain("kube_deployment_status_replicas_ready");
        monitoring.Should().Contain("github-runner(|-(sharedpos|puppet|signage|dms|telephony|print-web|chat|mysql|kiosk-linux))");
        monitoring.Should().Contain("folder: CI Alerts");
        monitoring.Should().Contain("uid: linux-runner-offline");
        monitoring.Should().Contain("alert_channel: irc");
    }
    [Fact]
    public void StatefulSets_WithVolumeClaimTemplates_MustDeclareFilesystemDefaults()
    {
@@ -493,6 +623,44 @@ public sealed class FleetManifestLintTests
        };
    }
    private static IReadOnlyDictionary<string, ManifestDocument> GitHubRunnerDeployments()
    {
        return Inventory.Documents
            .Where(document => document.Kind == "Deployment")
            .Where(document => document.Namespace == "github-runner")
            .ToDictionary(document => document.Name, StringComparer.Ordinal);
    }
    private static int ReplicaCount(ManifestDocument document)
    {
        return int.TryParse(document.Scalar("spec", "replicas"), out var replicas) ? replicas : 1;
    }
    private static string? EnvValue(YamlMappingNode container, string name)
    {
        return EnvMapping(container, name) is { } env ? ManifestNodeExtensions.Scalar(env, "value") : null;
    }
    private static string? EnvSecretName(YamlMappingNode container, string name)
    {
        return EnvMapping(container, name) is { } env
            ? ManifestNodeExtensions.Scalar(env, "valueFrom", "secretKeyRef", "name")
            : null;
    }
    private static string? EnvSecretKey(YamlMappingNode container, string name)
    {
        return EnvMapping(container, name) is { } env
            ? ManifestNodeExtensions.Scalar(env, "valueFrom", "secretKeyRef", "key")
            : null;
    }
    private static YamlMappingNode? EnvMapping(YamlMappingNode container, string name)
    {
        return ManifestNodeExtensions.MappingSequence(container, "env")
            .SingleOrDefault(env => string.Equals(ManifestNodeExtensions.Scalar(env, "name"), name, StringComparison.Ordinal));
    }
    private static IReadOnlyList<ManifestDocument> FcDeviceManagementDocuments()
    {
        return Inventory.Documents
--- a/tests/bluejay-infra-lint/OpenVoxServerDurabilityTests.cs
+++ b/tests/bluejay-infra-lint/OpenVoxServerDurabilityTests.cs
@@ -0,0 +1,99 @@
 using FluentAssertions;
 using Xunit;
 namespace BluejayInfraLint.Tests;
 [Trait("Category", "Unit")]
 public sealed class OpenVoxServerDurabilityTests
 {
    private static readonly string Root = FindRepoRoot();
    private static readonly string RunbookPath = Path.Combine(Root, "docs", "runbooks", "openvoxserver-quadlet-durability.md");
    private static readonly string SmokePath = Path.Combine(Root, "scripts", "monitoring", "openvox-recreate-smoke.sh");
    [Fact]
    public void Runbook_DocumentsHostArtifactAndNonArgoPath()
    {
        var runbook = File.ReadAllText(RunbookPath);
        runbook.Should().Contain("noc1 host artifact");
        runbook.Should().Contain("not an ArgoCD application");
        runbook.Should().Contain("systemctl cat openvoxserver");
        runbook.Should().Contain("/etc/containers/systemd/openvoxserver.container");
    }
    [Fact]
    public void Runbook_DocumentsCx12LiveApplyState()
    {
        var runbook = File.ReadAllText(RunbookPath);
        runbook.Should().Contain("Sprint 32 Cx-12");
        runbook.Should().Contain("openvoxserver-safeconfig.service");
        runbook.Should().Contain("/opt/puppet/r10k-deploy.sh");
        runbook.Should().Contain("HEAD == origin/master");
    }
    [Fact]
    public void SmokeScript_IsExplicitlyOptIn()
    {
        var smoke = File.ReadAllText(SmokePath);
        smoke.Should().Contain("OPENVOX_RECREATE_SMOKE");
        smoke.Should().Contain("exit 64");
        smoke.IndexOf("OPENVOX_RECREATE_SMOKE", StringComparison.Ordinal)
            .Should().BeLessThan(smoke.IndexOf("systemctl stop openvoxserver", StringComparison.Ordinal));
    }
    [Fact]
    public void SmokeScript_RequiresGeneratedSystemdUnitBeforeRemovingContainer()
    {
        var smoke = File.ReadAllText(SmokePath);
        smoke.Should().Contain("systemctl cat openvoxserver");
        smoke.Should().Contain("refusing to remove a container without a verified systemd recreate path");
        smoke.IndexOf("systemctl cat openvoxserver", StringComparison.Ordinal)
            .Should().BeLessThan(smoke.IndexOf("podman rm openvoxserver", StringComparison.Ordinal));
    }
    [Fact]
    public void Artifacts_DoNotStoreSecretsOrPaidRunnerLabels()
    {
        var forbidden = new[]
        {
            "BEGIN OPENSSH PRIVATE KEY",
            "BEGIN RSA PRIVATE KEY",
            "ubuntu-latest",
            "windows-latest",
            "macos-latest",
        };
        var violations = new[] { RunbookPath, SmokePath }
            .SelectMany(path =>
            {
                var text = File.ReadAllText(path);
                return forbidden
                    .Where(token => text.Contains(token, StringComparison.OrdinalIgnoreCase))
                    .Select(token => $"{Path.GetRelativePath(Root, path)} contains forbidden token {token}");
            })
            .ToList();
        violations.Should().BeEmpty();
    }
    private static string FindRepoRoot()
    {
        var current = new DirectoryInfo(AppContext.BaseDirectory);
        while (current is not null)
        {
            if (Directory.Exists(Path.Combine(current.FullName, "apps"))
                && Directory.Exists(Path.Combine(current.FullName, "scripts"))
                && File.Exists(Path.Combine(current.FullName, "README.md")))
            {
                return current.FullName;
            }
            current = current.Parent;
        }
        throw new DirectoryNotFoundException("Could not find bluejay-infra root.");
    }
 }
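The two `IndexOf(...).Should().BeLessThan(...)` assertions above check ordering in the script text: the guard has to appear before the destructive command. The same idea as a language-neutral sketch in Python; `guard_precedes` is a hypothetical helper, and unlike a bare index comparison it also rejects the case where either token is missing:

```python
def guard_precedes(script_text, guard, destructive):
    """True when `guard` first appears before `destructive` first appears."""
    guard_at = script_text.find(guard)
    destructive_at = script_text.find(destructive)
    # Both tokens must be present; str.find returns -1 when one is missing,
    # and the chained comparison fails in either of those cases.
    return 0 <= guard_at < destructive_at
```

This is a textual check only. It confirms the guard is written earlier in the file, not that the shell control flow actually exits before reaching the destructive step.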
--- a/tests/bluejay-infra-lint/conftest.dev/02_public_method_allowlist.rego
+++ b/tests/bluejay-infra-lint/conftest.dev/02_public_method_allowlist.rego
@@ -1,6 +1,6 @@
 package bluejayinfra.public_method_allowlist
-public_hosts := {"dist.flowercore.io", "dns.iamworkin.lan"}
+public_hosts := {"brochure.flowercore.io", "dist.flowercore.io", "dns.iamworkin.lan"}
 deny[msg] {
  input.kind == "IngressRoute"
Author	SHA1	Message	Date
Andrew Stoltz	30e16bfcfb	feat: add qt sdk remotedesktop warm pool	2026-05-19 12:30:32 -05:00
Andrew Stoltz	ca574c2280	brochure: delete apps/brochure/ — full prune per operator decision 2026-05-19 Removes the apps/brochure/ directory entirely from the bluejay-infra ApplicationSet glob. ArgoCD will: 1. See infra-brochure has no git source -> mark for delete 2. Prune the brochure namespace + Deployment + Service + Certificate + Secret + IngressRoute (all generated from the now-gone apps/brochure/brochure.yaml) 3. Remove the infra-brochure Application from argocd ns Operator decision 2026-05-19 (follow-up to `09387f9` ARCHIVED banner commit): "Yes, prune argo for brochure. Probably fully deleted there." The brochure subdomain project was a planning-chain misinterpretation of "make TtsReader + AI Station production-ready" — see memory/project_brochure_split_misinterpretation_archived_2026_05_19.md in FlowerCore.Notes for the full decision record. Reusable artifacts that were the operator's archive concern stay alive in their actual homes: - FlowerCore.Intranet.Web PR #8 content-NuGet carve-out: still in Intranet's master, may transfer to TtsReader / AI Station prod work - Sprint 32 Cl-5 substrate (public-twin design ideas): SUPERSEDED banner in-place in FlowerCore.Notes docs/standards/, history preserved - magpie-doc-writer + wren-walkthrough skill output: unchanged in Intranet's flowercore-whats-new/walkthroughs/galleries directories Companion Notes-side commit updates the "scaled to 0 + ARCHIVED banner" language in mvp-readiness.html + fleet-roadmap-2026-05-19-sprint36-v2.md + memory record to reflect full deletion instead. Wrong-codebase image localhost/fc-brochure-web:v20260524-sprint32 is being removed from rke2-server / rke2-agent1 / rke2-agent2 in a follow-up step (reclaims ~800MB per node). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:42:30 -05:00
Andrew Stoltz	09387f90e1	brochure: ARCHIVED 2026-05-19 — was a misinterpretation, do not re-enable The brochure split project was a misinterpretation of an operator request to make TtsReader + AI Station production-ready. Somewhere in the planning chain it spun up into a separate "showcase brochure product" with its own host, repo, NuGet, and Codex pack — none of which the operator actually wanted. The project itself is pointless and a waste of credits. Archive (not delete) per operator decision 2026-05-19, because some work shipped under the misinterpretation may still have reusable value: - FlowerCore.Intranet.Web PR #8 (merged) introduced FlowerCore.Brochure.Content content-NuGet carve-out — pattern may apply to TtsReader/AiStation production polish. - Sprint 32 Cl-5 substrate has design ideas for public-twin vs operator-host separation that may transfer. - magpie-doc-writer / wren-walkthrough skills still author useful Intranet content — those skills stay active. These manifests stay at replicas: 0 for ArgoCD continuity. Cleanup options (move out of apps/* glob, or delete entirely) are documented in README.md for an operator-explicit future call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:34:28 -05:00
Andrew Stoltz	e641ceab48	monitoring(irc-notify): criticals also batch hourly — fix per-fire spam The first batching pass (`bacac06`) left critical-severity alerts on the immediate-print path. That's still per-event spam for any persistent critical (e.g. PrintPaperRollCritical fires every 30s Grafana evaluation cycle when paper is <5%). Caught immediately after deploy: CUPS queue grew 0 → 8 jobs in 8 minutes from a single firing PrintPaperRollCritical. This commit aligns with the operator's verbatim ask ("one alert an hour"): - Critical-severity alerts now go into the digest buffer, NOT the immediate-print path. The digest payload already shows severity tags per alertname, so the operator still sees "[critical] X" in the printout. - The explicit `alert_channel=thermal_print_immediate` label still bypasses batching, but only on NEW fingerprint arrival — it triggers a flush of the CURRENT digest (with the new alert included), then clears. Repeat webhooks for the same fingerprint dedupe in the buffer until the next hourly tick OR until the alert resolves. No fingerprint can spam. - `add_to_digest` now returns bool (True = buffer grew, False = dedup / resolution / disabled) so the immediate-label path can flush only on state transitions. Net effect: max 1 thermal print per BATCH_INTERVAL_MIN per alert fingerprint, regardless of severity. Rules that genuinely need same-second paper opt in via `alert_channel=thermal_print_immediate` (currently zero rules use this). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:22:25 -05:00
Andrew Stoltz	c263426ea5	fc-devicemgmt: operator image fix + Web scaled to 0 OPERATOR (PodCrashLoopBackOff cleared): - Bumped image to v20260519-sp34cl3-fix (built from astoltz/FlowerCore.DeviceManagement@d9a3685 after Sprint 34 Cl-3 stranded branch was merged via PR #19 squash). - The v20260512-cx5 image was the broken Sprint 8 scaffold: generic Host builder, no kubeops, no Kestrel on :8080, no AddController chain. Readiness probe dial-tcp 8080 failed every restart. - The new image ships the AddController chain for all 4 reconcilers (DeviceCrd / DeviceGroupCrd / DevicePolicyCrd / RemoteCommandCrd) plus Kestrel on :8080 and /healthz. - Image saved + scp'd + ctr-imported on rke2-server / rke2-agent1 / rke2-agent2 before this commit. SHA256: 2cc79ee0a2313c550268d1244f805ae41b396362148dd5603061cc15b6f7fa7e WEB (DeploymentReplicasMismatch cleared via scale-to-0): - Web pod cannot start. Two upstream gaps must close first: 1) MySQL DB instance + user `fc_devicemgmt` / database `flowercore_devicemgmt` are not provisioned in fc-mysql. Cluster has zero MySqlInstanceCrds and no `mysql.fc-mysql.svc:3306` Service. 2) 1Password vault item `IAmWorkin/FlowerCore DeviceManagement Runtime` is missing (5 fields: DB-Password + 4 mTLS PEMs). OnePasswordItem CRD has been stuck Ready=False since 2026-05-18T02:58. - Same pattern as the brochure-web scale-to-0 in `914fed0` — make the cluster clean and quiet, let operator restart deploy on a real schedule. Re-enable path is fully documented in the deployment-web.yaml header comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:11:09 -05:00
Andrew Stoltz	bacac067cf	monitoring(irc-notify): hourly digest batching for thermal printer The thermal printer drained overnight (2026-05-18/19) because the old notify.py POSTed one print job per Grafana webhook fire. With 9 concurrently-firing alerts (zabbix-postgres + fc-devicemgmt + brochure + PrintPaperRollLow), every evaluation cycle stamped fresh CUPS jobs onto the queue until the operator physically powered the printer off. This refactor: - Adds env-var config: THERMAL_PRINT_ENABLED (master kill switch), BATCH_INTERVAL_MIN (default 60), BATCH_MAX_PENDING (default 50). - IRC delivery stays per-event (operator wants the live stream). - Thermal routing now: * critical/disaster/page severity OR alert_channel=thermal_print_immediate -> print immediately * alert_channel=thermal_print -> enqueue into hourly digest * RESOLVED -> remove from digest buffer (no resolution-spam prints) * else -> IRC only, no thermal - Background digest_loop thread flushes the buffer hourly (or sooner if buffer hits BATCH_MAX_PENDING). Digest payload is a single Print.Web /api/print/alert POST listing distinct alertnames + per-rule target counts. - New POST /flush endpoint (manual operator force-flush; useful for testing without waiting an hour). - GET / returns config + buffer depth + per-stat counters for observability. Net effect: max 1 thermal print per BATCH_INTERVAL_MIN for batched warnings, plus immediate prints for criticals. Closes the 2026-05-18/19 alert-storm incident. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 09:56:14 -05:00
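As a rough illustration of the first-pass routing in bacac067cf (before criticals were moved into the digest), the decision could look like the sketch below. The function names, the return strings, and the order of the checks are assumptions; the commit only gives the env-var names, the defaults for BATCH_INTERVAL_MIN and BATCH_MAX_PENDING, and the four routing outcomes. The default for THERMAL_PRINT_ENABLED is not stated, so "true" here is a guess.

```python
import os

IMMEDIATE_SEVERITIES = {"critical", "disaster", "page"}


def load_config(env=None):
    """Read the three settings named in the commit from an env mapping."""
    env = os.environ if env is None else env
    return {
        # assumption: enabled unless explicitly switched off
        "thermal_print_enabled": env.get("THERMAL_PRINT_ENABLED", "true").lower()
        in ("1", "true", "yes"),
        "batch_interval_min": int(env.get("BATCH_INTERVAL_MIN", "60")),
        "batch_max_pending": int(env.get("BATCH_MAX_PENDING", "50")),
    }


def route_thermal(status, labels):
    """Pick the thermal-printer action for one webhook alert."""
    # assumption: resolutions are checked first so they never print
    if status == "resolved":
        return "remove_from_digest"
    channel = labels.get("alert_channel")
    if labels.get("severity") in IMMEDIATE_SEVERITIES or channel == "thermal_print_immediate":
        return "print_immediately"
    if channel == "thermal_print":
        return "enqueue_digest"
    return "irc_only"   # IRC delivery itself stays per-event in every case
```

For example, `route_thermal("firing", {"alert_channel": "thermal_print"})` yields `"enqueue_digest"`, while an alert with no `alert_channel` label and a non-critical severity goes to IRC only.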
bluejay	914fed08d8	fix(brochure): scale brochure-web to 0 — wrong codebase shipped (Intranet.Web binary in fc-brochure-web image, CrashLoopBackOff 296 restarts on /data read-only). Re-enable after Sprint 34 Cx-3 rebuild per docs/ai-agents/codex-prompts/2026-05-18-fc-brochure-web-rebuild-pack.md	2026-05-19 14:45:01 +00:00
Andrew Stoltz	200aeab032	ttsreader: deploy study mode repair image	2026-05-18 16:33:08 -05:00
Andrew Stoltz	8182616d4c	ttsreader: point render piper to edge1 demo endpoint	2026-05-18 16:06:37 -05:00
Andrew Stoltz	f0862ac03c	ttsreader: deploy sprint36 demo audio image	2026-05-18 16:04:59 -05:00
Andrew Stoltz	46c392605e	monitoring: mirror PuppetServiceFailed alert from Notes (Sprint 33 Cx-7 Phase B) Mirrors the live `puppet` alert group from FlowerCore.Notes/scripts/monitoring/alerts.yml into the K8s ConfigMap so a future in-cluster Prometheus inherits the ruleset automatically. Source of truth remains the Notes file (live Podman Prometheus on noc1). See feedback_monitoring_k8s_target_vs_live_podman. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 11:11:07 -05:00
bluejay	89b147bbdd	docs(openvox): document quadlet durability smoke (#12)	2026-05-18 04:53:02 +00:00
bluejay	d7238a5e3b	feat(brochure): add public brochure GitOps app (#13)	2026-05-18 04:52:37 +00:00
bluejay	fc444a02a1	feat(chat): add public twin ingress (#11)	2026-05-18 04:52:20 +00:00
bluejay	83d4883d55	feat(worldbuilder): pin k8s demo to fake backend (#10)	2026-05-18 04:52:11 +00:00
bluejay	f8fe3b2688	feat(github-runner): add final long-tail runners (#9)	2026-05-18 04:52:01 +00:00
bluejay	f2ab892ebc	feat(github-runner): add Marquee + TtsReader per-repo runners (#8)	2026-05-18 03:27:14 +00:00
bluejay	fef68a9560	feat(fc-devicemgmt): add Kubernetes deployment manifests (#1) Sprint 8 IMPL lane Cx-5: fc-devicemgmt K8s manifests (rebased onto main 2026-05-18; 13 files, +944). Namespace + Web Deployment (replicas:2, MySQL backend) + Operator Deployment (replicas:1, KubeOps leader-elect) + Service + Certificate (step-ca-acme ClusterIssuer) + Traefik IngressRoute (devices.iamworkin.lan internal) + ServiceAccount + ClusterRole + ClusterRoleBinding + NetworkPolicy (CNI DNAT-aware backend ports) + OnePasswordItem (5-field consolidated) + ArgoCD Application bootstrap shape + lint coverage. Follow-ups (not merge blockers): - localhost/fc-devicemgmt-{web,operator}:v20260512-cx5 must be imported to all 3 RKE2 nodes; pods will ErrImageNeverPull until imported. - 1Password vault item 'FlowerCore DeviceManagement Runtime' must be created with 5 fields before pods can start. - DNS devices.iamworkin.lan -> 10.0.56.200 already present. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 02:56:23 +00:00
Andrew Stoltz	6fe77225ae	fix(github-runner): dedupe DOTNET_INSTALL_DIR+NUGET_PACKAGES on base+sharedpos PR #5 rebase concatenated PR #5 env additions onto PR #7 env additions on the base + sharedpos Deployments, producing duplicate-key validation errors in ArgoCD's structured merge. The DOTNET_INSTALL_DIR and NUGET_PACKAGES values are identical between PR #5 and PR #7; keep the PR #7 originals and retain only the unique new env vars from PR #5 (DOTNET_CLI_TELEMETRY_OPTOUT, DOTNET_NOLOGO, DOTNET_GENERATE_ASPNET_CERTIFICATE). No behavioral change — same final env var set, no duplicates.	2026-05-17 21:53:05 -05:00
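The duplicate-key problem fixed in 6fe77225ae can be illustrated with a small check over a container `env` list. This is only a sketch: `find_duplicate_env` and `dedupe_env` are hypothetical helpers, the paths are made-up placeholder values, and the actual fix was a manual manifest edit rather than a script.

```python
def find_duplicate_env(env):
    """Return env var names that appear more than once, in first-seen order."""
    seen, dupes = set(), []
    for entry in env:
        name = entry["name"]
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


def dedupe_env(env):
    """Keep the first entry for each name (the earlier, original definition)."""
    seen, result = set(), []
    for entry in env:
        if entry["name"] in seen:
            continue
        seen.add(entry["name"])
        result.append(entry)
    return result


# Placeholder values; the real manifest values are not shown in the log.
env = [
    {"name": "DOTNET_INSTALL_DIR", "value": "/opt/dotnet"},
    {"name": "NUGET_PACKAGES", "value": "/opt/nuget"},
    {"name": "DOTNET_INSTALL_DIR", "value": "/opt/dotnet"},
    {"name": "NUGET_PACKAGES", "value": "/opt/nuget"},
    {"name": "DOTNET_NOLOGO", "value": "1"},
]
```

Here `find_duplicate_env(env)` reports `DOTNET_INSTALL_DIR` and `NUGET_PACKAGES`, and `dedupe_env(env)` leaves three entries. A check like the first helper could run in a lint step so duplicate keys are caught before ArgoCD rejects the manifest.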
bluejay	634b9c4169	feat(github-runner): harden Linux runner fleet (#5)	2026-05-18 02:51:02 +00:00