Compare commits

1 Commits

Author SHA1 Message Date
Codex
667777a653 revert(ci1): back to cdrom:scsi (virtio-blk disk hit QEMU flock)
The virtio-blk disk swap (commit 84c9feb) didn't help: qemu fails to
acquire the write lock on the rootdisk PVC because the previous
launcher's qemu process didn't release it cleanly. Same family of
bug as the "stale QEMU flock" already documented in
feedback_kubevirt_iso_first_install_bootorder_and_runstrategy, but
now triggered on rke2-agent1 instead of agent2.

The OVMF cdrom timeout is the real blocker and remains open. Current state:
  - Distribution pipeline (build → save → scp → ctr import on all
    3 RKE2 nodes) is proven. localhost/win-server-2025:1.0 lives in
    each node's containerd k8s.io namespace.
  - containerDisk + cdrom:scsi gets the qemu domain Running (no NFS
    "Permission denied", no rootdisk flock).
  - OVMF BdsDxe times out reading the cdrom regardless of the
    SecureBoot setting and the cdrom bus (SATA or SCSI).

Reverting the disk type to cdrom:scsi so the VM lands back on the
"qemu Running, OVMF stuck at Boot Manager" state — known-stable and
easier to attack than the QEMU-flock state we hit by trying
virtio-blk disk.

Operator decision for next architectural step (one of):
  - Custom OVMF firmware build with longer Boot0001 timeout
  - KubeVirt version bump (v1.5+ has OVMF fixes)
  - Hyper-V/VirtualBox install + export VHD to ci1
  - BIOS legacy boot (Win Server 2025 needs UEFI but install media
    has a BIOS path)
  - DataVolume HTTP datasource (CDI internalizes ISO bytes via
    different code path)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:35:00 -05:00
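
As an illustration of the last option listed in the commit message (DataVolume HTTP datasource), a CDI DataVolume that imports the ISO over HTTP might look roughly like the fragment below. This is a sketch, not part of the commit: the object name, URL, and requested size are hypothetical placeholders, and whether this import path avoids the OVMF cdrom timeout is untested here.

```yaml
# Hypothetical sketch: CDI imports the ISO bytes into a PVC over HTTP.
# The name, url, and storage size are placeholders, not values from this repo.
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: windows-iso-http            # hypothetical name
spec:
  source:
    http:
      url: "http://example.invalid/win-server-2025.iso"   # placeholder URL
  storage:
    resources:
      requests:
        storage: 8Gi                # assumption: must be at least the ISO size
```

The resulting PVC would then be referenced from the VM's volumes list in place of the containerDisk volume.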

@@ -411,24 +411,22 @@ spec:
 # Confirmed via debug pod: PVC content IS a real bootable ISO9660
 # (file: "ISO 9660 CD-ROM filesystem data ... (bootable)"), so the
 # only bug was boot priority.
-# 2026-05-08 PM: ISO presented as a virtio-blk DISK (not cdrom).
-# Both SATA and SCSI cdrom buses hit OVMF BdsDxe "starting Boot0001
-# ... Time out" regardless of storage backend (NFS, Longhorn PVC,
-# containerDisk tmpfs — all rule out IO speed). The qemu cdrom
-# emulation path appears to have a deep-seated read window issue
-# under KubeVirt v1.4.0's OVMF firmware.
-#
-# Workaround: present the ISO bytes as a regular virtio-blk disk
-# (model="virtio-non-transitional"). UEFI/OVMF still recognizes
-# ISO9660 + El Torito boot records on a regular disk, so it can
-# boot the EFI bootloader the same way it would from a USB stick.
-# This is also closer to the FlowerCore.Distribution USB-key
-# pattern: the ISO bytes live on a block device, UEFI boots from
-# the GPT/El Torito boot record, Windows installer runs.
+# 2026-05-08 PM: cdrom bus SCSI + containerDisk delivery. This
+# combination boots qemu cleanly and reaches OVMF, but OVMF
+# BdsDxe still hits "starting Boot0001 ... Time out" on the
+# cdrom — see HANDOFF.md / CODEX-STATUS.md "OPEN — ci1" for the
+# full diagnostic chain. virtio-blk disk swap was attempted as a
+# workaround but introduced a separate QEMU rootdisk flock issue
+# without fixing the underlying OVMF cdrom problem; reverted.
+# Operator decision needed for next architectural step (OVMF
+# custom build with extended timeout, KubeVirt version bump,
+# Hyper-V/VirtualBox-and-export, or BIOS legacy boot). The
+# containerDisk distribution pipeline (build/save/scp/ctr import)
+# is proven and ready to reuse for any of those.
 - name: windows-iso
   bootOrder: 1
-  disk:
-    bus: virtio
+  cdrom:
+    bus: scsi
 - name: rootdisk
   bootOrder: 2
   disk: