Can't start VFIO VMs

For some unknown reason, I can no longer start VMs that use VFIO passthrough. It had been working flawlessly until now. If I remove the video card I am passing through, the VM starts.

When I start the VM, the screen shows no error; the VM attempts to start and immediately stops. If I boot into Windows, the card I am passing through works fine.

What I have tried:

reboots
updating
verifying vfio_pci.ids is correct
verified grub line has not changed in the last 6 months
verified pci group is the same
verified IOMMU and VT-d are on in system BIOS.
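
For anyone repeating the group and driver checks above, here is a minimal sketch that reads them straight from sysfs. The function name `list_iommu_groups` and its optional directory argument are made up for illustration; the only assumption about the system is the standard `/sys/kernel/iommu_groups` layout.

```shell
#!/usr/bin/env bash
# List every PCI device by IOMMU group together with the driver bound to it.
# Optional argument: the iommu_groups directory (defaults to the real sysfs path).
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local dev group addr driver
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue        # glob did not match: IOMMU off or no groups
        group="${dev#"$root"/}"          # strip the root prefix
        group="${group%%/*}"             # keep only the group number
        addr="${dev##*/}"                # PCI address, e.g. 0000:04:00.0
        if [ -L "$dev/driver" ]; then
            driver="$(basename "$(readlink "$dev/driver")")"
        else
            driver="none"                # no driver currently bound
        fi
        printf 'group %s  %s  driver=%s\n' "$group" "$addr" "$driver"
    done
}

list_iommu_groups
```

On a working passthrough setup, the GPU and its audio function (04:00.0 and 04:00.1 in the inxi output below) would be expected to show `driver=vfio-pci`.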

Any tips on what else to look at would be greatly appreciated.

Brett

╭─brett@I9 in ~
╰─λ garuda-inxi
System:
Kernel: 6.13.3-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 14.2.1
clocksource: tsc avail: hpet,acpi_pm
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=c31be1c8-8b6d-4962-95cf-1bed6c6e8ce7 rw rootflags=subvol=@
rd.driver.pre=vfio-pci intel_iommu=on rd.udev.log_priority=3
vt.global_cursor_default=0 loglevel=3 vfio_pci.ids=10de:1c82,10de:0fb9
pcie_acs_override=downstream,multifunction video=efifb:off iommu=pt
ibt=off
Desktop: KDE Plasma v: 6.3.1 tk: Qt v: N/A info: frameworks v: 6.11.0
wm: kwin_wayland tools: avail: hypridle,hyprlock vt: 1 dm: SDDM
Distro: Garuda base: Arch Linux
Machine:
Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
Mobo: ASUSTeK model: TUF GAMING Z590-PLUS WIFI v: Rev 1.xx
serial: <superuser required> part-nu: SKU uuid: <superuser required>
UEFI: American Megatrends v: 1801 date: 12/26/2022
CPU:
Info: model: 11th Gen Intel Core i9-11900KF bits: 64 type: MT MCP
arch: Rocket Lake gen: core 11 level: v4 note: check built: 2021+
process: Intel 14nm family: 6 model-id: 0xA7 (167) stepping: 1
microcode: 0x63
Topology: cpus: 1x dies: 1 clusters: 8 cores: 8 threads: 16 tpc: 2
smt: enabled cache: L1: 640 KiB desc: d-8x48 KiB; i-8x32 KiB L2: 4 MiB
desc: 8x512 KiB L3: 16 MiB desc: 1x16 MiB
Speed (MHz): avg: 800 min/max: 800/5100:5300 scaling: driver: intel_pstate
governor: powersave cores: 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800
8: 800 9: 800 10: 800 11: 800 12: 800 13: 800 14: 800 15: 800 16: 800
bogomips: 112128
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Vulnerabilities: <filter>
Graphics:
Device-1: NVIDIA GA106 [GeForce RTX 3060 Lite Hash Rate] vendor: Dell
driver: nvidia v: 570.86.16 alternate: nouveau,nvidia_drm
non-free: 550/565.xx+ status: current (as of 2025-01; EOL~2026-12-xx)
arch: Ampere code: GAxxx process: TSMC n7 (7nm) built: 2020-2023 pcie:
gen: 4 speed: 16 GT/s lanes: 16 ports: active: none
off: DP-2,DP-3,HDMI-A-1 empty: DP-1 bus-ID: 01:00.0 chip-ID: 10de:2504
class-ID: 0300
Device-2: NVIDIA GP107 [GeForce GTX 1050 Ti] vendor: Micro-Star MSI
driver: vfio-pci v: N/A alternate: nouveau,nvidia_drm,nvidia
non-free: 550/565.xx+ status: current (as of 2025-01; EOL~2026-12-xx)
arch: Pascal code: GP10x process: TSMC 16nm built: 2016-2021 pcie: gen: 1
speed: 2.5 GT/s lanes: 4 link-max: gen: 3 speed: 8 GT/s lanes: 16
bus-ID: 04:00.0 chip-ID: 10de:1c82 class-ID: 0300
Device-3: Adomax Nuroum C40
driver: hid-generic,snd-usb-audio,usbhid,uvcvideo type: USB rev: 2.0
speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-7.4.2:15 chip-ID: 0627:a6bf
class-ID: 0102 serial: <filter>
Device-4: Logitech HD Pro Webcam C920 driver: snd-usb-audio,uvcvideo
type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-7.4.3:16
chip-ID: 046d:082d class-ID: 0102 serial: <filter>
Display: wayland server: X.org v: 1.21.1.15 with: Xwayland v: 24.1.5
compositor: kwin_wayland driver: X: loaded: nvidia unloaded: modesetting
alternate: fbdev,nouveau,nv,vesa gpu: nvidia,nvidia-nvswitch
d-rect: 6400x2520 display-ID: 0
Monitor-1: DP-2 pos: bottom-l model: HP 27es serial: <filter> built: 2016
res: mode: 1920x1080 hz: 60 scale: 100% (1) dpi: 82 gamma: 1.2
size: 598x336mm (23.54x13.23") diag: 686mm (27") ratio: 16:9 modes:
max: 1920x1080 min: 640x480
Monitor-2: DP-3 pos: top-center model: Gigabyte AORUS FI32Q
serial: <filter> built: 2021 res: mode: 2560x1440 hz: 120 scale: 100% (1)
dpi: 93 gamma: 1.2 size: 698x392mm (27.48x15.43") diag: 801mm (31.5")
ratio: 16:9 modes: max: 2560x1440 min: 640x480
Monitor-3: HDMI-A-1 pos: bottom-r model: HP 27es serial: <filter>
built: 2016 res: mode: 1920x1080 hz: 60 scale: 100% (1) dpi: 82 gamma: 1.2
size: 598x336mm (23.54x13.23") diag: 686mm (27") ratio: 16:9 modes:
max: 1920x1080 min: 640x480
API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia gbm:
drv: nvidia surfaceless: drv: nvidia wayland: drv: nvidia x11: drv: nvidia
API: OpenGL v: 4.6.0 vendor: nvidia v: 570.86.16 glx-v: 1.4
direct-render: yes renderer: NVIDIA GeForce RTX 3060/PCIe/SSE2
memory: 11.72 GiB display-ID: :1.0
API: Vulkan v: 1.4.303 layers: 5 device: 0 type: discrete-gpu
name: NVIDIA GeForce RTX 3060 driver: N/A device-ID: 10de:2504
surfaces: xcb,xlib,wayland
Info: Tools: api: clinfo, eglinfo, glxinfo, vulkaninfo
de: kscreen-console,kscreen-doctor gpu: nvidia-settings,nvidia-smi
wl: nwg-displays,wayland-info x11: xdpyinfo, xprop, xrandr
Audio:
Device-1: Intel Tiger Lake-H HD Audio vendor: ASUSTeK driver: snd_hda_intel
v: kernel alternate: snd_soc_avs,snd_sof_pci_intel_tgl bus-ID: 00:1f.3
chip-ID: 8086:43c8 class-ID: 0403
Device-2: NVIDIA GA106 High Definition Audio vendor: Dell
driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
bus-ID: 01:00.1 chip-ID: 10de:228e class-ID: 0403
Device-3: NVIDIA GP107GL High Definition Audio vendor: Micro-Star MSI
driver: vfio-pci alternate: snd_hda_intel pcie: speed: Unknown lanes: 63
link-max: gen: 3 speed: 8 GT/s bus-ID: 04:00.1 chip-ID: 10de:0fb9
class-ID: 0403
Device-4: SteelSeries ApS Arctis 7
driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 1.1 speed: 12 Mb/s
lanes: 1 mode: 1.1 bus-ID: 1-7.2:10 chip-ID: 1038:12ad class-ID: 0300
Device-5: GN Netcom Jabra Engage 75 driver: jabra,snd-usb-audio,usbhid
type: USB rev: 2.0 speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-7.4.1:14
chip-ID: 0b0e:1112 class-ID: 0300 serial: <filter>
Device-6: Adomax Nuroum C40
driver: hid-generic,snd-usb-audio,usbhid,uvcvideo type: USB rev: 2.0
speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-7.4.2:15 chip-ID: 0627:a6bf
class-ID: 0102 serial: <filter>
Device-7: Logitech HD Pro Webcam C920 driver: snd-usb-audio,uvcvideo
type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-7.4.3:16
chip-ID: 046d:082d class-ID: 0102 serial: <filter>
API: ALSA v: k6.13.3-zen1-1-zen status: kernel-api tools: N/A
Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
Server-2: PipeWire v: 1.2.7 status: active with: 1: pipewire-pulse
status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
Device-1: Intel Tiger Lake PCH CNVi WiFi driver: iwlwifi v: kernel
bus-ID: 00:14.3 chip-ID: 8086:43f0 class-ID: 0280
IF: wlo1 state: down mac: <filter>
Device-2: Intel Ethernet I225-V vendor: ASUSTeK driver: igc v: kernel
pcie: gen: 2 speed: 5 GT/s lanes: 1 port: N/A bus-ID: 07:00.0
chip-ID: 8086:15f3 class-ID: 0200
IF: enp7s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
IF-ID-1: virbr0 state: down mac: <filter>
Info: services: NetworkManager, smbd, systemd-timesyncd, wpa_supplicant
Bluetooth:
Device-1: Intel AX201 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0
speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-14:13 chip-ID: 8087:0026
class-ID: e001
Report: btmgmt ID: hci0 rfk-id: 0 state: down bt-service: enabled,running
rfk-block: hardware: no software: yes address: <filter> bt-v: 5.2 lmp-v: 11
status: discoverable: no pairing: no
Drives:
Local Storage: total: 9.48 TiB used: 605.33 GiB (6.2%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:11 model: PCIe SSD size: 931.51 GiB
block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4
tech: SSD serial: <filter> fw-rev: EHFM60.0 temp: 29.9 C scheme: MBR
ID-2: /dev/nvme1n1 maj-min: 259:3 vendor: Samsung model: SSD 970 EVO 1TB
size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
lanes: 4 tech: SSD serial: <filter> fw-rev: 2B2QEXE7 temp: 35.9 C
scheme: GPT
ID-3: /dev/nvme2n1 maj-min: 259:0 vendor: Western Digital
model: WDS100T2B0C-00PXH0 size: 931.51 GiB block-size: physical: 512 B
logical: 512 B speed: 31.6 Gb/s lanes: 4 tech: SSD serial: <filter>
fw-rev: 233010WD temp: 28.9 C scheme: MBR
ID-4: /dev/sda maj-min: 8:0 vendor: SanDisk model: SSD PLUS 1000GB
size: 931.52 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: SSD serial: <filter> fw-rev: 00RL scheme: MBR
ID-5: /dev/sdb maj-min: 8:16 vendor: SanDisk model: SSD PLUS 1000GB
size: 931.52 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: SSD serial: <filter> fw-rev: 00RL scheme: GPT
ID-6: /dev/sdc maj-min: 8:32 vendor: SanDisk model: SDSSDH3 512G
size: 476.94 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: SSD serial: <filter> fw-rev: 1000 scheme: MBR
ID-7: /dev/sdd maj-min: 8:48 vendor: ASMedia model: T ASM236X NVME
size: 119.24 GiB block-size: physical: 512 B logical: 512 B type: USB
rev: 2.1 spd: 480 Mb/s lanes: 1 mode: 2.0 tech: SSD serial: <filter>
scheme: GPT
ID-8: /dev/sde maj-min: 8:64 vendor: Samsung model: SSD 860 EVO 500G
size: 465.76 GiB block-size: physical: 512 B logical: 512 B type: USB
rev: 3.0 spd: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 tech: SSD serial: <filter>
fw-rev: 0301 scheme: GPT
SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure?
ID-9: /dev/sdf maj-min: 8:80 vendor: Samsung model: SSD 860 EVO 500GB
size: 465.76 GiB block-size: physical: 512 B logical: 512 B type: USB
rev: 3.0 spd: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 tech: SSD serial: <filter>
fw-rev: 0301 scheme: GPT
SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure?
ID-10: /dev/sdg maj-min: 8:96 vendor: Samsung model: SSD 860 EVO 500G
size: 465.76 GiB block-size: physical: 512 B logical: 512 B type: USB
rev: 3.0 spd: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 tech: SSD serial: <filter>
fw-rev: 0301 scheme: GPT
SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure?
ID-11: /dev/sdh maj-min: 8:112 vendor: Samsung model: SSD 860 EVO 500G
size: 465.76 GiB block-size: physical: 512 B logical: 512 B type: USB
rev: 3.0 spd: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 tech: SSD serial: <filter>
fw-rev: 0301 scheme: GPT
SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure?
ID-12: /dev/sdi maj-min: 8:128 model: SATA SSD size: 476.94 GiB
block-size: physical: 512 B logical: 512 B type: USB rev: 3.0 spd: 5 Gb/s
lanes: 1 mode: 3.2 gen-1x1 tech: SSD serial: <filter> fw-rev: 0301
scheme: GPT
ID-13: /dev/sdj maj-min: 8:144 model: External USB3.0 size: 223.57 GiB
block-size: physical: 512 B logical: 512 B type: USB rev: 3.0 spd: 5 Gb/s
lanes: 1 mode: 3.2 gen-1x1 tech: N/A serial: <filter> fw-rev: 0301
scheme: GPT
ID-14: /dev/sdk maj-min: 8:160 model: External USB3.0 size: 476.94 GiB
block-size: physical: 512 B logical: 512 B type: USB rev: 3.0 spd: 5 Gb/s
lanes: 1 mode: 3.2 gen-1x1 tech: N/A serial: <filter> fw-rev: 0301
scheme: GPT
ID-15: /dev/sdl maj-min: 8:176 model: External USB3.0 size: 931.51 GiB
block-size: physical: 512 B logical: 512 B type: USB rev: 3.0 spd: 5 Gb/s
lanes: 1 mode: 3.2 gen-1x1 tech: N/A serial: <filter> fw-rev: 0301
scheme: MBR
ID-16: /dev/sdm maj-min: 8:192 vendor: Samsung model: SSD 850 PRO 256GB
size: 238.47 GiB block-size: physical: 512 B logical: 512 B type: USB
rev: 3.1 spd: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 tech: SSD serial: <filter>
fw-rev: 0103 scheme: MBR
ID-17: /dev/sdn maj-min: 8:208 vendor: Samsung model: SSD 850 PRO 256GB
size: 238.47 GiB block-size: physical: 512 B logical: 512 B type: USB
rev: 3.1 spd: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 tech: SSD serial: <filter>
fw-rev: 0103 scheme: GPT
Partition:
ID-1: / raw-size: 205.13 GiB size: 205.13 GiB (100.00%)
used: 178.45 GiB (87.0%) fs: btrfs dev: /dev/nvme1n1p5 maj-min: 259:8
ID-2: /boot/efi raw-size: 100 MiB size: 96 MiB (96.00%)
used: 44.9 MiB (46.8%) fs: vfat dev: /dev/nvme1n1p1 maj-min: 259:4
ID-3: /home raw-size: 205.13 GiB size: 205.13 GiB (100.00%)
used: 178.45 GiB (87.0%) fs: btrfs dev: /dev/nvme1n1p5 maj-min: 259:8
ID-4: /var/log raw-size: 205.13 GiB size: 205.13 GiB (100.00%)
used: 178.45 GiB (87.0%) fs: btrfs dev: /dev/nvme1n1p5 maj-min: 259:8
ID-5: /var/tmp raw-size: 205.13 GiB size: 205.13 GiB (100.00%)
used: 178.45 GiB (87.0%) fs: btrfs dev: /dev/nvme1n1p5 maj-min: 259:8
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default) zswap: no
ID-1: swap-1 type: zram size: 31.15 GiB used: 0 KiB (0.0%) priority: 100
comp: zstd avail: lzo-rle,lzo,lz4,lz4hc,deflate,842 max-streams: 16
dev: /dev/zram0
Sensors:
System Temperatures: cpu: 35.0 C mobo: 27.0 C
Fan Speeds (rpm): fan-1: 0 fan-2: 1396 fan-3: 0 fan-4: 0 fan-5: 0
fan-6: 4041 fan-7: 0
Info:
Memory: total: 32 GiB note: est. available: 31.15 GiB used: 4.4 GiB (14.1%)
Processes: 444 Power: uptime: 0m states: freeze,mem,disk suspend: deep
avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
suspend, test_resume image: 12.39 GiB services: org_kde_powerdevil,
power-profiles-daemon, upowerd Init: systemd v: 257 default: graphical
tool: systemctl
Packages: 1974 pm: pacman pkgs: 1967 libs: 503
tools: octopi,pacseek,pamac,paru,yay pm: flatpak pkgs: 7 Compilers:
clang: 19.1.7 gcc: 14.2.1 Shell: garuda-inxi default: fish v: 3.7.1
running-in: konsole inxi: 3.3.37
Garuda (2.6.26-1.1):
System install date:     2024-11-23
Last full system update: 2025-02-22
Is partially upgraded:   No
Relevant software:       snapper NetworkManager dracut nvidia-dkms
Windows dual boot:       Probably (Run as root to verify)
Failed units:

Looking at the libvirt logs, I noticed DMA errors. After the most recent QEMU upgrade, these lines in all my VM .xml configs were changed, for some reason, to the version shown below under “Current: (corrupt)”:

After I replace the lines, it puts them back to the mangled “Current: (corrupt)” version. I think this is the issue, but I am not sure why it happens or how to stop it.

The VM will run and I can hook up a monitor to it; I just can’t use Looking Glass now.

Here is the setup doc I followed. It has been working, but it clearly hates something now.

https://looking-glass.io/docs/B7-rc1-21-ecd3692e/ivshmem_kvmfr/

Current: (corrupt)

See screen shot:

What it should be:

<qemu:commandline>
<qemu:arg value='-device'/>
<qemu:arg value='{"driver":"ivshmem-plain","id":"shmem0","memdev":"looking-glass"}'/>
<qemu:arg value='-object'/>
<qemu:arg value='{"qom-type":"memory-backend-file","id":"looking-glass","mem-path":"/dev/kvmfr0","size":33554432,"share":true}'/>
</qemu:commandline>

Well, after visiting some forums, it turns out this is not the issue. In my VM logs I am getting the following error:

2025-02-23 00:07:16.253+0000: Domain id=4 is tainted: high-privileges
2025-02-23 00:07:16.253+0000: Domain id=4 is tainted: custom-argv
char device redirected to /dev/pts/2 (label charserial0)
2025-02-23T00:07:17.601135Z qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
2025-02-23T00:07:17.601156Z qemu-system-x86_64: vfio_container_dma_map(0x5573edd15bf0, 0x7100000000, 0x2000000, 0x7aa8a1fff000) = -2 (No such file or directory)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
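
A quick way to read that log line, assuming the third argument of `vfio_container_dma_map` is the size of the region being mapped (my reading of the message, not something confirmed in this thread): convert it from hex and compare it with the sizes in the VM config.

```shell
# Convert the third argument of the vfio_container_dma_map log line to decimal.
failed_size=$((0x2000000))
echo "$failed_size bytes"
echo "$((failed_size / 1024 / 1024)) MiB"
```

This prints 33554432 bytes and 32 MiB, which is the same value as the "size" of the looking-glass memory backend in the XML above. That would be consistent with the failing mapping being the Looking Glass shared memory on /dev/kvmfr0 rather than the GPU itself.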

It would appear that, for some reason, the kvmfr kernel module is now missing. For the life of me, I cannot get it back.

insmod: ERROR: could not load module kvmfr.ko: No such file or directory
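
A minimal sketch for checking whether a kvmfr module file exists at all for a given kernel, which is worth doing after a kernel update. The function name `module_built_for` and its two optional arguments are hypothetical; it only assumes modules live under /lib/modules/<kernel release>.

```shell
#!/usr/bin/env bash
# Return success if a kvmfr module file exists for the given kernel release.
# Arguments (both optional): kernel release, modules root directory.
module_built_for() {
    local kver="${1:-$(uname -r)}"
    local root="${2:-/lib/modules}"
    # Matches kvmfr.ko as well as compressed variants such as kvmfr.ko.zst
    find "$root/$kver" -name 'kvmfr.ko*' 2>/dev/null | grep -q .
}

if module_built_for; then
    echo "kvmfr is installed for $(uname -r)"
else
    echo "no kvmfr module found for $(uname -r)"
fi
```

If nothing is found for the running kernel, the module was probably never built for it, and `dkms status` should show which kernel versions it was built against.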

I narrowed it down to this error. Can anyone direct me on how to fix it? I am missing my Looking Glass ;-).

( 4/12) Install DKMS modules

==> dkms install --no-depmod kvmfr/0.0.11 -k 6.13.4-zen1-1-zen

Error! Bad return status for module build on kernel: 6.13.4-zen1-1-zen (x86_64)

Consult /var/lib/dkms/kvmfr/0.0.11/build/make.log for more information.

==> WARNING: `dkms install --no-depmod kvmfr/0.0.11 -k 6.13.4-zen1-1-zen' exited 10
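
The DKMS message points at make.log, so the actual compiler error should be in there. A minimal sketch for pulling out the relevant lines; the function name `show_build_errors` is made up, and the default path is simply the one from the message above.

```shell
#!/usr/bin/env bash
# Print every line of a DKMS build log that mentions an error, with line numbers.
# Optional argument: path to the log (defaults to the path from the DKMS message).
show_build_errors() {
    local log="${1:-/var/lib/dkms/kvmfr/0.0.11/build/make.log}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    grep -n -i 'error' "$log"
}

show_build_errors || true   # a non-zero status here just means nothing was found
```

Posting those lines along with the kernel version usually makes it much easier for others to spot a known incompatibility.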

I have no idea about the subject.

I recently came across the vfio-lts kernel and installed* it out of curiosity. It worked, but I immediately uninstalled it because I don’t use a VM.

*Of course not without reading this documentation: VFIO - “Virtual Function I/O” — The Linux Kernel documentation

I was able to resolve this. There is a post on Looking Glass about having to patch a file so that kernel 6.13 loads the module.

We can make this post the solution again if you post the patch or the link to the solution.

We want to avoid frustration from those seeking help in our forum.

Thanks :)


Solution:

For those finding this who have a 6.13 kernel, instructions for fixing the dkms kvmfr module:

  1. Go to your Looking Glass source folder (you probably have it already if you installed it; if not, re-download the bleeding-edge source, looking-glass-B7-rc1-34-e25492a3).
  2. cd into that directory and download this patch:
     wget -O linux-613-patch.patch https://github.com/gnif/LookingGlass/commit/4251a5c5fe7723c5dc068839debd76a5148953b2.patch
  3. Patch the files (it’s just two files: a one-line change in the module source and a version bump):
     patch -p1 -i linux-613-patch.patch
  4. Re-compile the module, then re-add it to DKMS (Dynamic Kernel Module Support):
     cd module
     make
     sudo dkms install "."
  5. Finally, load the module as normal, or if you have it set up to auto-load on boot, just reboot. See the manual for the right number to put in static_size_mb=x:
     modprobe kvmfr static_size_mb=[YOUR MEMORY FOR SCREEN SIZE HERE]
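
For the static_size_mb value in the last step, here is a rough sketch of the sizing rule as I understand it from the Looking Glass documentation (please verify against the manual): width x height x 4 bytes x 2 frames, plus 10 MiB, rounded up to the next power of two. The function name `kvmfr_size_mb` is made up, and this assumes an SDR guest.

```shell
#!/usr/bin/env bash
# Estimate static_size_mb for an SDR guest resolution.
# Assumed formula: width * height * 4 bytes * 2 frames, plus 10 MiB,
# rounded up to the next power of two.
kvmfr_size_mb() {
    local width="$1" height="$2"
    local bytes=$((width * height * 4 * 2))
    local need=$(( (bytes + 1048575) / 1048576 + 10 ))   # round up to whole MiB, add 10 MiB
    local size=1
    while [ "$size" -lt "$need" ]; do
        size=$((size * 2))
    done
    echo "$size"
}

kvmfr_size_mb 1920 1080   # prints 32
```

32 MiB is the same as the 33554432 bytes used for the looking-glass memory backend in the VM XML earlier in this thread. Whatever value you use here should match the size in the VM config.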
