fix(ci1): switch ISO delivery to containerDisk OCI image (Path C)

OCI image: localhost/win-server-2025:1.0 (8.27 GB)

Built on noc1 (FROM scratch + ADD disk.img → /disk/disk.img), exported
with podman save as a tar (8.27 GB), SCP'd in parallel to all 3 RKE2
nodes, and imported via ctr into the k8s.io namespace. Verified present
on all 3 schedulable nodes (rke2-server, rke2-agent1, rke2-agent2).
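
The build-and-distribute flow above could be scripted roughly as follows.
This is a sketch, not the script that was actually run: it assumes the ISO
has already been copied to ./disk.img, that the three node names resolve
over SSH, and it copies sequentially rather than in parallel. The helper
names (run, containerfile, build_and_distribute) and the RUN/TAG variables
are hypothetical. It is a dry run by default and only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the containerDisk build + distribution flow (dry-run by default).
# Assumptions: ./disk.img is a copy of the ISO; nodes are reachable via SSH.
set -euo pipefail

IMAGE="localhost/win-server-2025"
TAG="${TAG:-1.0}"
TAR="win-server-2025-${TAG}.tar"
NODES=(rke2-server rke2-agent1 rke2-agent2)
CTR="/var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock -n k8s.io"

# Print a command; execute it only when RUN=1.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Emit the two-line Containerfile for a KubeVirt containerDisk image.
containerfile() {
  printf 'FROM scratch\nADD --chown=107:107 disk.img /disk/disk.img\n'
}

build_and_distribute() {
  run podman build --tag "${IMAGE}:${TAG}" .
  run podman save -o "${TAR}" "${IMAGE}:${TAG}"
  for node in "${NODES[@]}"; do
    run scp "${TAR}" "${node}:/tmp/${TAR}"
    run ssh "${node}" "sudo ${CTR} images import /tmp/${TAR}"
    # Verify the image is now visible to containerd on this node.
    run ssh "${node}" "sudo ${CTR} images ls -q | grep -F ${IMAGE}:${TAG}"
  done
}

# In a real run, write the Containerfile to disk; in a dry run, just show it.
if [ "${RUN:-0}" = "1" ]; then
  containerfile > Containerfile
else
  containerfile
fi
build_and_distribute
```

Running it with RUN=1 TAG=1.0 on noc1 would execute the printed commands;
without RUN=1 it only shows what would be done.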

Why containerDisk over the prior PVC paths:
- Path A (Longhorn Filesystem PVC, sata): OVMF BdsDxe SATA-CDROM
  read timeout. A PVC-backed CDROM is too slow for OVMF's
  first-sector read window.
- Path B (Synology NFS): uid 107 (qemu) denied at directory level by
  the Synology export ACL despite file mode 0777. Memory:
  feedback_synology_iso_export_root_only_uid_107_denied.
- Path B+SCSI: same OVMF timeout, just on the SCSI controller. Bus
  choice was not load-bearing; the issue was always the slow PVC
  backing.
- Path C (this commit): containerDisk delivers the ISO bytes from
  a tmpfs view of the OCI layer, with no PVC controller in the read
  path. qemu reads at native FS speed, so the OVMF first-sector read
  completes well within the timeout. This is also the
  KubeVirt-recommended pattern for installer ISOs.

Connects to the FlowerCore.Distribution / Provisioning USB story: this
is the same "OCI image of the OS installer + autounattend on a sysprep
CDROM" pattern that the USB provisioning agent will use. The Windows
install proceeds hands-off via the existing autounattend.xml in the
ci1-autounattend ConfigMap (RDP enabled, WinRM, UAC disabled,
Administrator password from 1Password vault item
h3ix4mgfk65gmkcmvh6ly3d3hu).

Image lifecycle: bump the tag (1.1, 1.2, ...) when the ISO version
changes, rebuild on noc1, redistribute to the RKE2 nodes, and update
the image: line.
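
The tag bump in the lifecycle step can be sketched as a small helper.
This is illustrative only: next_tag and bump_image_line are hypothetical
names, and the sketch assumes tags are always MAJOR.MINOR with a numeric
minor part. It rewrites only the image: line for the named image and
leaves other containerDisk images (such as virtio-drivers) untouched.

```shell
#!/usr/bin/env bash
# Sketch: bump a MAJOR.MINOR image tag and rewrite the manifest image: line.
set -euo pipefail

# Bump the minor component: 1.0 -> 1.1, 1.9 -> 1.10.
next_tag() {
  local tag="$1"
  local major="${tag%%.*}"
  local minor="${tag#*.}"
  # 10# forces base 10 so a minor like "08" is not parsed as octal.
  echo "${major}.$((10#${minor} + 1))"
}

# Read a manifest on stdin and replace "image: IMAGE:OLD" with the next tag.
bump_image_line() {
  local image="$1" old="$2" new image_re old_re
  new="$(next_tag "${old}")"
  # Escape dots so they match literally in the sed pattern.
  image_re="${image//./\\.}"
  old_re="${old//./\\.}"
  sed "s|image: ${image_re}:${old_re}\$|image: ${image}:${new}|"
}

printf 'image: localhost/win-server-2025:1.0\n' \
  | bump_image_line localhost/win-server-2025 1.0
```

The rebuild and redistribution still have to happen before the VM is
restarted, since imagePullPolicy: Never means each node must already
hold the new tag.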

Legacy NFS PVC + PV manifest and CDI Longhorn PVC are RETAINED for
this commit so prior states are recoverable. They will be pruned in a
follow-up once the containerDisk boot is proven.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -396,11 +396,12 @@ spec:
 # Confirmed via debug pod: PVC content IS a real bootable ISO9660
 # (file: "ISO 9660 CD-ROM filesystem data ... (bootable)"), so the
 # only bug was boot priority.
-# 2026-05-08 PM: cdrom bus flipped sata→scsi for windows-iso to address
-# the OVMF SATA-CDROM read timeout (`BdsDxe: failed to start Boot0001 ...
-# Time out`). The SCSI CDROM uses virtio-scsi controller which has a
-# longer read window and works cleanly on Filesystem-backed PVCs.
-# See diagnostic chain in HANDOFF.md / CODEX-STATUS.md "OPEN — ci1".
+# 2026-05-08 PM: cdrom bus is SCSI (virtio-scsi controller). Bus
+# choice is no longer load-bearing since the ISO is delivered via
+# containerDisk (see volumes block below) — both SATA and SCSI
+# work fine when the cdrom backing isn't a slow PVC. SCSI is kept
+# because it's the modern bus and matches the standard FC
+# KubeVirt VM template.
 - name: windows-iso
 bootOrder: 1
 cdrom:
@@ -435,25 +436,40 @@ spec:
 persistentVolumeClaim:
 claimName: ci1-rootdisk
 - name: windows-iso
-# 2026-05-08 PM: REVERTED from NFS Path B back to the original CDI
-# Longhorn Filesystem PVC. NFS Path B (commit fc2aca0) failed at the
-# storage layer because the Synology export `/volume1/ISOs` denies
-# non-root client UIDs at the directory level (qemu uid 107 cannot
-# `ls /iso/` even with file mode 0777). Confirmed via uid-107 +
-# uid-0 busybox probe pods on rke2-agent2 — same export-only-root
-# pattern as `/volume1/kubernetes` documented in
-# `feedback_synology_nfs_kubernetes_export_root_only`. Memory:
-# `feedback_synology_iso_export_root_only_uid_107_denied.md`.
+# 2026-05-08 PM (Path C, CONTAINERDISK): the ISO is now packaged as
+# a KubeVirt containerDisk OCI image baked from
+# `FROM scratch ; ADD --chown=107:107 disk.img /disk/disk.img`.
+# The qemu user (uid 107) reads the ISO directly from a tmpfs view
+# of the OCI layer, bypassing both:
+# - Synology NFS export ACL (Path B failed: uid 107 denied at
+# directory level even with mode 0777, see memory
+# feedback_synology_iso_export_root_only_uid_107_denied)
+# - OVMF cdrom read-window timeout (Path A and Path B's SCSI
+# retry both hit `BdsDxe: failed to start Boot0001 ... Time out`
+# when the cdrom was backed by a PVC the storage controller
+# couldn't satisfy reads from fast enough).
 #
-# The Longhorn PVC `windows-server-2025-iso` (CDI Filesystem mode,
-# 10Gi) was confirmed to contain valid ISO bytes that uid 107 CAN
-# read (mode 0660 root:107). The OVMF SATA-CDROM read timeout from
-# the original Path A is now addressed by the `bus: scsi` swap on
-# the disks block above. The NFS PVC + PV are RETAINED on disk so
-# the Path B state is recoverable; they can be pruned in a
-# follow-up commit once SCSI boot is proven.
-persistentVolumeClaim:
-claimName: windows-server-2025-iso
+# Image build (one-time, per ISO version):
+# 1. Copy ISO to disk.img, write Dockerfile
+# 2. podman build --tag localhost/win-server-2025:1.0 . (on noc1)
+# 3. podman save -o win-server-2025-1.0.tar localhost/win-server-2025:1.0
+# 4. SCP tar to all 3 RKE2 nodes (rke2-server, rke2-agent1, rke2-agent2)
+# 5. sudo /var/lib/rancher/rke2/bin/ctr -a /run/k3s/containerd/containerd.sock \
+# -n k8s.io images import /tmp/win-server-2025-1.0.tar
+# Standard FC pattern per `feedback_rke2_localhost_imagepullpolicy`.
+#
+# When a new Windows ISO version ships, bump the tag (1.1, 1.2, ...),
+# rebuild + redistribute, and update the image: line below in a new
+# commit. KubeVirt picks up the new image via a VM restart.
+#
+# The legacy NFS PVC + PV (apps/kubevirt-vms/win2025-iso-nfs-pv.yaml)
+# and CDI Longhorn PVC (`windows-server-2025-iso`) are RETAINED for
+# this commit so the prior states are recoverable. Once the
+# containerDisk path proves on a successful Windows install, both
+# legacy artifacts can be pruned in a follow-up commit.
+containerDisk:
+image: localhost/win-server-2025:1.0
+imagePullPolicy: Never
 - name: virtio-drivers
 containerDisk:
 # Pinned to v1.8.2 (latest stable as of 2026-05-08).