Issues booting a Virtual Machine with Single GPU Passthrough

I recently got a virtual machine working with single GPU passthrough, but it starts very unreliably. Most of the time it either kicks me back to the login screen (which is also what happens when the VM shuts down gracefully), or I lose all display output and the host seems to hang: it doesn't respond to Ctrl+Alt+Del and needs a hard power-off.

The last time I got the VM running, earlier today, was when I SSH'd into my host from a separate machine and ran sudo virsh start to see if I'd get any console output on a failure. I didn't have time to try it again to see whether it would fail with errors.
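For the next attempt over SSH, a rough sketch like the one below could keep whatever virsh prints in a file, so there is something to look at afterwards. This is only an illustration under assumptions: run_and_log is a made-up helper name, the output path is a placeholder, and "win10-2" is the domain name taken from my VM log.

```python
import subprocess


def run_and_log(cmd, log_path):
    """Run cmd, write its combined stdout/stderr to log_path, return the exit code."""
    with open(log_path, "w") as log:
        # stderr is merged into stdout so errors land in the same file
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example call (commented out so nothing starts by accident; the path is a placeholder):
# run_and_log(["sudo", "virsh", "start", "win10-2"], "/tmp/virsh-start.log")
```

A non-zero return value would mean virsh itself reported a failure, and the file should then contain its error message.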

I'm confident the start hook script works: when the VM does start, it boots into Windows and works perfectly fine, and the release script reliably rebinds the GPU and returns me to the host. I also don't get any errors when I run the scripts in an SSH session.

I've tried looking at the log in /var/log, but I can't make heads or tails of it, so I'll share it here along with my system info.

Kernel: 6.0.10-zen2-1-zen arch: x86_64 bits: 64 compiler: gcc v: 12.2.0
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=e9fb4e39-76f0-470b-b069-4298b6b21895 rw [email protected]
nvidia-drm.modeset=1 amd_iommu=on iommu=pt video=efifb:off quiet splash
rd.udev.log_priority=3 vt.global_cursor_default=0 loglevel=3 ibt=off
Desktop: KDE Plasma v: 5.26.3 tk: Qt v: 5.15.7 info: latte-dock
wm: kwin_x11 vt: 1 dm: SDDM Distro: Garuda Linux base: Arch Linux
Type: Desktop Mobo: ASUSTeK model: PRIME B350-PLUS v: Rev X.0x
serial: <superuser required> UEFI: American Megatrends v: 5602
date: 07/14/2020
Device-1: ps-controller-battery-4c:b9:9b:8e:d3:95 model: N/A serial: N/A
charge: N/A status: charging
Info: model: AMD Ryzen 5 3600 bits: 64 type: MT MCP arch: Zen 2 gen: 3
level: v3 note: check built: 2020-22 process: TSMC n7 (7nm)
family: 0x17 (23) model-id: 0x71 (113) stepping: 0 microcode: 0x8701021
Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache:
L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 3 MiB desc: 6x512 KiB
L3: 32 MiB desc: 2x16 MiB
Speed (MHz): avg: 3637 high: 3883 min/max: 2200/4208 boost: enabled
scaling: driver: acpi-cpufreq governor: performance cores: 1: 3600 2: 3600
3: 3883 4: 3697 5: 3595 6: 3654 7: 3622 8: 3600 9: 3600 10: 3600 11: 3600
12: 3600 bogomips: 86241
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Type: itlb_multihit status: Not affected
Type: l1tf status: Not affected
Type: mds status: Not affected
Type: meltdown status: Not affected
Type: mmio_stale_data status: Not affected
Type: retbleed mitigation: untrained return thunk; SMT enabled with STIBP
Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, STIBP:
always-on, RSB filling, PBRSB-eIBRS: Not affected
Type: srbds status: Not affected
Type: tsx_async_abort status: Not affected
Device-1: NVIDIA GA104 [GeForce RTX 3070] vendor: Palit Microsystems
driver: nvidia v: 520.56.06 alternate: nouveau,nvidia_drm non-free: 520.xx+
status: current (as of 2022-10) arch: Ampere code: GAxxx
process: TSMC n7 (7nm) built: 2020-22 pcie: gen: 2 speed: 5 GT/s lanes: 16
link-max: gen: 4 speed: 16 GT/s ports: active: none off: DP-1,HDMI-A-1
empty: DP-2,DP-3 bus-ID: 0a:00.0 chip-ID: 10de:2484 class-ID: 0300
Display: x11 server: X.Org v: 21.1.4 with: Xwayland v: 22.1.5
compositor: kwin_x11 driver: X: loaded: nvidia unloaded: modesetting
alternate: fbdev,nouveau,nv,vesa gpu: nvidia,nvidia-nvswitch
display-ID: :0 screens: 1
Screen-1: 0 s-res: 4480x1440 s-dpi: 108 s-size: 1054x342mm (41.50x13.46")
s-diag: 1108mm (43.63")
Monitor-1: DP-1 mapped: DP-0 note: disabled pos: primary,top-right
model: Samsung LC27G7xT serial: <filter> built: 2020 res: 2560x1440
dpi: 65024 gamma: 1.2 size: 1x1mm (0.04x0.04") diag: 686mm (27")
ratio: 16:9 modes: max: 2560x1440 min: 640x480
Monitor-2: HDMI-A-1 mapped: HDMI-0 note: disabled pos: bottom-l
model: Acer KG271 C serial: <filter> built: 2018 res: 1920x1080 dpi: 82
gamma: 1.2 size: 598x336mm (23.54x13.23") diag: 686mm (27") ratio: 16:9
modes: max: 1920x1080 min: 640x480
API: OpenGL v: 4.6.0 NVIDIA 520.56.06 renderer: NVIDIA GeForce RTX
3070/PCIe/SSE2 direct render: Yes
Device-1: NVIDIA GA104 High Definition Audio vendor: Palit Microsystems
driver: snd_hda_intel v: kernel bus-ID: 3-1:2 pcie: chip-ID: 1852:7022
class-ID: 0102 gen: 3 speed: 8 GT/s lanes: 16 link-max: gen: 4
speed: 16 GT/s bus-ID: 0a:00.1 chip-ID: 10de:228b class-ID: 0403
Device-2: AMD Starship/Matisse HD Audio vendor: ASUSTeK
driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
bus-ID: 0c:00.4 chip-ID: 1022:1487 class-ID: 0403
Device-3: GYROCOM C&C Fiio E10 type: USB
driver: hid-generic,snd-usb-audio,usbhid
Device-4: Sony DualSense wireless controller (PS5) type: USB
driver: playstation,snd-usb-audio,usbhid bus-ID: 3-4:4 chip-ID: 054c:0ce6
class-ID: 0300
Sound API: ALSA v: k6.0.10-zen2-1-zen running: yes
Sound Interface: sndio v: N/A running: no
Sound Server-1: PulseAudio v: 16.1 running: no
Sound Server-2: PipeWire v: 0.3.61 running: yes
Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
vendor: ASUSTeK PRIME B450M-A driver: r8169 v: kernel pcie: gen: 1
speed: 2.5 GT/s lanes: 1 port: f000 bus-ID: 04:00.0 chip-ID: 10ec:8168
class-ID: 0200
IF: enp4s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Device-2: Intel Wi-Fi 6 AX200 driver: iwlwifi v: kernel pcie: gen: 2
speed: 5 GT/s lanes: 1 bus-ID: 09:00.0 chip-ID: 8086:2723 class-ID: 0280
IF: wlp9s0 state: down mac: <filter>
Device-3: Sony DualSense wireless controller (PS5) type: USB
driver: playstation,snd-usb-audio,usbhid bus-ID: 3-4:4 chip-ID: 054c:0ce6
class-ID: 0300
IF-ID-1: virbr0 state: down mac: <filter>
IF-ID-2: wg-mullvad state: unknown speed: N/A duplex: N/A mac: N/A
Device-1: Intel AX200 Bluetooth type: USB driver: btusb v: 0.8 bus-ID: 1-9:3
chip-ID: 8087:0029 class-ID: e001
Report: bt-adapter ID: hci0 rfk-id: 1 state: up address: <filter>
Local Storage: total: 6.82 TiB used: 1.26 TiB (18.5%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Crucial model: CT1000P1SSD8
size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
lanes: 4 type: SSD serial: <filter> rev: P3CR013 temp: 34.9 C scheme: GPT
ID-2: /dev/sda maj-min: 8:0 vendor: Crucial model: CT1000BX500SSD1
size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: 030 scheme: GPT
ID-3: /dev/sdb maj-min: 8:16 vendor: Seagate model: ST4000DM004-2CV104
size: 3.64 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
type: HDD rpm: 5425 serial: <filter> rev: 0001 scheme: GPT
ID-4: /dev/sdc maj-min: 8:32 vendor: Samsung model: SSD 870 EVO 500GB
size: 465.76 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: 1B6Q scheme: MBR
ID-5: /dev/sdd maj-min: 8:48 type: USB vendor: Seagate model: Portable
size: 931.51 GiB block-size: physical: 4096 B logical: 512 B type: N/A
serial: <filter> rev: 0712 scheme: MBR
ID-1: / raw-size: 923.02 GiB size: 923.02 GiB (100.00%)
used: 326.36 GiB (35.4%) fs: btrfs dev: /dev/nvme0n1p3 maj-min: 259:3
ID-2: /boot/efi raw-size: 498 MiB size: 497 MiB (99.80%)
used: 152.7 MiB (30.7%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
ID-3: /home raw-size: 923.02 GiB size: 923.02 GiB (100.00%)
used: 326.36 GiB (35.4%) fs: btrfs dev: /dev/nvme0n1p3 maj-min: 259:3
ID-4: /var/log raw-size: 923.02 GiB size: 923.02 GiB (100.00%)
used: 326.36 GiB (35.4%) fs: btrfs dev: /dev/nvme0n1p3 maj-min: 259:3
ID-5: /var/tmp raw-size: 923.02 GiB size: 923.02 GiB (100.00%)
used: 326.36 GiB (35.4%) fs: btrfs dev: /dev/nvme0n1p3 maj-min: 259:3
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
ID-1: swap-1 type: zram size: 31.28 GiB used: 1.5 MiB (0.0%) priority: 100
dev: /dev/zram0
ID-2: swap-2 type: partition size: 4 GiB used: 0 KiB (0.0%) priority: -2
dev: /dev/nvme0n1p4 maj-min: 259:4
System Temperatures: cpu: 49.2 C mobo: N/A gpu: nvidia temp: 32 C
Fan Speeds (RPM): N/A gpu: nvidia fan: 0%
Processes: 439 Uptime: 25m wakeups: 2 Memory: 31.28 GiB
used: 5.69 GiB (18.2%) Init: systemd v: 252 default: graphical
tool: systemctl Compilers: gcc: 12.2.0 clang: 14.0.6 Packages: pm: pacman
pkgs: 1961 libs: 494 tools: octopi,pamac,paru Shell: fish v: 3.5.1
running-in: yakuake inxi: 3.3.23
Garuda (2.6.9-1):
System install date:     2022-03-03
Last full system update: 2022-11-29 ↻
Is partially upgraded:   No
Relevant software:       NetworkManager
Windows dual boot:       Probably (Run as root to verify)
Snapshots:               Snapper
Failed units:            plymouth-deactivate.service plymouth-start.service

Here are the contents of the VM log file.

2022-11-29 17:33:46.224+0000: starting up libvirt version: 8.9.0, qemu version: 7.1.0, kernel: 6.0.10-zen2-1-zen, hostname: BaulMachine2
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-1-win10-2 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-win10-2/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-win10-2/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-win10-2/.config \
/usr/bin/qemu-system-x86_64 \
-name guest=win10-2,debug-threads=on \
-S \
-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-win10-2/master-key.aes"}' \
-blockdev '{"driver":"file","filename":"/usr/share/edk2-ovmf/x64/OVMF_CODE.secboot.4m.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/win10-2_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}' \
-machine pc-q35-7.1,usb=off,vmport=off,smm=on,dump-guest-core=off,memory-backend=pc.ram,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format \
-accel kvm \
-cpu host,migratable=on,hv-time=on,hv-relaxed=on,hv-vapic=on,hv-spinlocks=0x1fff \
-global driver=cfi.pflash01,property=secure,value=on \
-m 16384 \
-object '{"qom-type":"memory-backend-memfd","id":"pc.ram","share":true,"x-use-canonical-path-for-ramblock-id":false,"size":17179869184}' \
-overcommit mem-lock=off \
-smp 6,sockets=1,dies=1,cores=6,threads=1 \
-uuid 2add6cf0-b813-44c7-b482-66e1d2e54dd3 \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=30,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=localtime,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-global ICH9-LPC.disable_s3=1 \
-global ICH9-LPC.disable_s4=1 \
-boot menu=off,strict=on \
-device '{"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"}' \
-device '{"driver":"pcie-root-port","port":17,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x2.0x1"}' \
-device '{"driver":"pcie-root-port","port":18,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2.0x2"}' \
-device '{"driver":"pcie-root-port","port":19,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x2.0x3"}' \
-device '{"driver":"pcie-root-port","port":20,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x2.0x4"}' \
-device '{"driver":"pcie-root-port","port":21,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x2.0x5"}' \
-device '{"driver":"pcie-root-port","port":22,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x2.0x6"}' \
-device '{"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"}' \
-device '{"driver":"pcie-root-port","port":24,"chassis":9,"id":"pci.9","bus":"pcie.0","multifunction":true,"addr":"0x3"}' \
-device '{"driver":"pcie-root-port","port":25,"chassis":10,"id":"pci.10","bus":"pcie.0","addr":"0x3.0x1"}' \
-device '{"driver":"pcie-root-port","port":26,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x3.0x2"}' \
-device '{"driver":"pcie-root-port","port":27,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x3.0x3"}' \
-device '{"driver":"pcie-root-port","port":28,"chassis":13,"id":"pci.13","bus":"pcie.0","addr":"0x3.0x4"}' \
-device '{"driver":"pcie-root-port","port":29,"chassis":14,"id":"pci.14","bus":"pcie.0","addr":"0x3.0x5"}' \
-device '{"driver":"pcie-root-port","port":30,"chassis":15,"id":"pci.15","bus":"pcie.0","addr":"0x3.0x6"}' \
-device '{"driver":"pcie-pci-bridge","id":"pci.16","bus":"pci.1","addr":"0x0"}' \
-device '{"driver":"qemu-xhci","p2":15,"p3":15,"id":"usb","bus":"pci.2","addr":"0x0"}' \
-device '{"driver":"virtio-serial-pci","id":"virtio-serial0","bus":"pci.3","addr":"0x0"}' \
-blockdev '{"driver":"file","filename":"/mnt/Speedy/QEMU/win10-2.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
-device '{"driver":"virtio-blk-pci","bus":"pci.6","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}' \
-netdev tap,fd=32,id=hostnet0 \
-device '{"driver":"rtl8139","netdev":"hostnet0","id":"net0","mac":"52:54:00:4d:4c:b9","bus":"pci.16","addr":"0x1"}' \
-chardev pty,id=charserial0 \
-device '{"driver":"isa-serial","chardev":"charserial0","id":"serial0","index":0}' \
-chardev spicevmc,id=charchannel0,name=vdagent \
-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"com.redhat.spice.0"}' \
-device '{"driver":"usb-tablet","id":"input0","bus":"usb.0","port":"1"}' \
-audiodev '{"id":"audio1","driver":"spice"}' \
-vnc,audiodev=audio1 \
-spice port=0,disable-ticketing=on,image-compression=off,seamless-migration=on \
-device '{"driver":"virtio-vga","id":"video0","max_outputs":1,"bus":"pcie.0","addr":"0x1"}' \
-device '{"driver":"ich9-intel-hda","id":"sound0","bus":"pcie.0","addr":"0x1b"}' \
-device '{"driver":"hda-duplex","id":"sound0-codec0","bus":"sound0.0","cad":0,"audiodev":"audio1"}' \
-chardev spicevmc,id=charredir0,name=usbredir \
-device '{"driver":"usb-redir","chardev":"charredir0","id":"redir0","bus":"usb.0","port":"2"}' \
-chardev spicevmc,id=charredir1,name=usbredir \
-device '{"driver":"usb-redir","chardev":"charredir1","id":"redir1","bus":"usb.0","port":"3"}' \
-device '{"driver":"vfio-pci","host":"0000:0a:00.0","id":"hostdev0","bus":"pci.7","addr":"0x0"}' \
-device '{"driver":"vfio-pci","host":"0000:0a:00.1","id":"hostdev1","bus":"pci.8","addr":"0x0"}' \
-device '{"driver":"usb-host","hostdevice":"/dev/bus/usb/001/004","id":"hostdev2","bus":"usb.0","port":"4"}' \
-device '{"driver":"usb-host","hostdevice":"/dev/bus/usb/003/003","id":"hostdev3","bus":"usb.0","port":"5"}' \
-device '{"driver":"usb-host","hostdevice":"/dev/bus/usb/003/002","id":"hostdev4","bus":"usb.0","port":"6"}' \
-device '{"driver":"usb-host","hostdevice":"/dev/bus/usb/001/002","id":"hostdev5","bus":"usb.0","port":"7"}' \
-device '{"driver":"usb-host","hostdevice":"/dev/bus/usb/003/004","id":"hostdev6","bus":"usb.0","port":"8"}' \
-device '{"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.4","addr":"0x0"}' \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/0 (label charserial0)
2022-11-29 17:33:52.702+0000: shutting down, reason=failed
2022-11-29T17:33:52.703313Z qemu-system-x86_64: terminating on signal 15 from pid 87246 (/usr/bin/libvirtd)
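
The only lines in that log that look like an actual failure to me are the last two. A small sketch like this could pull just those lines out of a long log; summarize is a hypothetical helper name, and the regular expressions are only written against the two line formats seen above, so they may miss other messages.

```python
import re

# Matches e.g. "2022-11-29 17:33:52.702+0000: shutting down, reason=failed"
SHUTDOWN_RE = re.compile(r"^(?P<ts>\S+ \S+): shutting down, reason=(?P<reason>\w+)")
# Matches e.g. "terminating on signal 15 from pid 87246 (/usr/bin/libvirtd)"
SIGNAL_RE = re.compile(
    r"terminating on signal (?P<sig>\d+) from pid (?P<pid>\d+) \((?P<proc>[^)]+)\)"
)


def summarize(log_text):
    """Return a list of shutdown/signal events found in a libvirt QEMU log."""
    events = []
    for line in log_text.splitlines():
        m = SHUTDOWN_RE.search(line)
        if m:
            events.append(("shutdown", m.group("ts"), m.group("reason")))
            continue
        m = SIGNAL_RE.search(line)
        if m:
            events.append(("signal", int(m.group("sig")), m.group("proc")))
    return events
```

Run on the log above, this should report one shutdown with reason=failed followed by signal 15 (SIGTERM) sent by /usr/bin/libvirtd. As far as I can tell, that means libvirtd itself stopped QEMU after the start failed, so the underlying error may be in libvirtd's own log or the journal rather than in this file.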

