WIP: Cx-12 Longhorn PVC growth alarm for RD volumes #17

bluejay wants to merge 1 commit from sprint39/cx-12-longhorn-pvc-growth-alarm into main

Summary

  • Add LonghornPVCGrowthRapid to the K8s monitoring migration target in apps/monitoring/noc-monitoring.yaml.
  • Alert when a RemoteDesktop Longhorn PVC's actual size grows by more than 20% in 1h or exceeds 80% of its capacity.
  • Route with alert_channel: thermal_print plus service: remotedesktop for Grafana/IRC thermal handling.
  • Include a TODO annotation documenting the current live Prometheus metric gate.
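
For orientation, the rule described above might look roughly like the sketch below. It is not the rule committed in this PR. The group name, the `for` duration, the `pvc_namespace="remotedesktop"` selector, and the annotation text are all assumptions for illustration, and it assumes the Longhorn exporter attaches a `pvc_namespace` label to both byte metrics. It is written as a plain Prometheus rule group; the wrapping used in apps/monitoring/noc-monitoring.yaml may differ.

```yaml
groups:
  - name: longhorn-pvc-growth          # hypothetical group name
    rules:
      - alert: LonghornPVCGrowthRapid
        # Fires if actual size grew by more than 20% versus one hour ago,
        # or if actual size exceeds 80% of the volume's capacity.
        # The pvc_namespace selector is an assumed way to scope to RD volumes.
        expr: |
          (
            longhorn_volume_actual_size_bytes{pvc_namespace="remotedesktop"}
              / (longhorn_volume_actual_size_bytes{pvc_namespace="remotedesktop"} offset 1h)
            > 1.20
          )
          or
          (
            longhorn_volume_actual_size_bytes{pvc_namespace="remotedesktop"}
              / longhorn_volume_capacity_bytes{pvc_namespace="remotedesktop"}
            > 0.80
          )
        for: 5m                        # assumed hold time, not from the PR
        labels:
          alert_channel: thermal_print
          service: remotedesktop
        annotations:
          summary: "Longhorn PVC {{ $labels.pvc }} is growing rapidly or is above 80% of capacity"
          todo: "Gated: live Prometheus does not yet expose the Longhorn byte metrics"
```

With this shape, a volume whose actual size was 0 bytes an hour ago yields a ratio of +Inf and would match the growth branch, so a real rule may want a minimum-size guard. Until the Longhorn byte metrics are scraped, both branches return an empty vector and the alert stays inactive.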

Verification

  • git fetch --all --prune completed; default Gitea branch is main.
  • PromQL expression parsed through http://10.0.56.10:9090/api/v1/query and returned success with an empty result, because the live Longhorn byte metrics are missing.
  • http://10.0.56.10:9090/api/v1/rules reachable: 34 groups / 133 rules; LonghornPVCGrowthRapid is not live yet.
  • dotnet test tests/bluejay-infra-lint/BluejayInfraLint.Tests.csproj -c Release ran: 57 passed, 3 pre-existing failures unrelated to this change.

Known Gate

Live noc1 Prometheus currently exposes kube_persistentvolumeclaim_info and kube_persistentvolumeclaim_resource_requests_storage_bytes; it does not expose longhorn_volume_actual_size_bytes, longhorn_volume_capacity_bytes, kube_persistentvolumeclaim_labels, or kubelet_volume_stats_used_bytes yet.

bluejay added 1 commit 2026-05-19 17:36:15 +00:00
bluejay changed title from Draft: Cx-12 Longhorn PVC growth alarm for RD volumes to WIP: Cx-12 Longhorn PVC growth alarm for RD volumes 2026-05-19 17:36:31 +00:00
This pull request is marked as a work in progress.
This branch is out-of-date with the base branch

Reference: bluejay/bluejay-infra#17