[Intel+Nvidia] Unable to start Xorg/Wayland in Discrete GPU mode

For the last 6 months or so I have been using Garuda without problems in my laptop's Dynamic Graphics mode (i.e. Hybrid Graphics). Today I tried switching to Discrete Graphics mode in the BIOS, and I can't get any graphical environment to load. It freezes at the splash screen and never loads the login manager. I am able to switch to another TTY, but I can't start any X or Wayland sessions. Looking at /dev, I don't see any GPUs listed, and the /dev/dri folder is absent, so I suspect it is a driver loading issue.
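For anyone who wants to run the same check from a TTY, here is a minimal sketch of it. It assumes the proprietary driver's usual module names (nvidia, nvidia_modeset, nvidia_uvm, nvidia_drm); the function names are made up for illustration and are not part of any Garuda tool.

```python
from pathlib import Path

# Assumption: these are the modules the proprietary NVIDIA driver normally provides.
NVIDIA_MODULES = ("nvidia", "nvidia_modeset", "nvidia_uvm", "nvidia_drm")


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return module names from text in /proc/modules format (name is the first field)."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def missing_nvidia_modules(proc_modules_text: str) -> list[str]:
    """List the expected NVIDIA modules that are not currently loaded."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in NVIDIA_MODULES if name not in loaded]


def dri_nodes(dri_dir: str = "/dev/dri") -> list[str]:
    """Return the entries under /dev/dri, or an empty list if the directory is absent."""
    path = Path(dri_dir)
    return sorted(p.name for p in path.iterdir()) if path.is_dir() else []


if __name__ == "__main__":
    modules_file = Path("/proc/modules")
    text = modules_file.read_text() if modules_file.exists() else ""
    print("DRI nodes:", dri_nodes() or "none (/dev/dri is missing or empty)")
    print("NVIDIA modules not loaded:", missing_nvidia_modules(text) or "none")
```

An empty DRI list together with missing modules would point at the driver not loading; an empty DRI list with all modules loaded would suggest looking elsewhere.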

I have a fresh vanilla Arch install on another partition that boots in Discrete Graphics mode with no issue. I checked the NVIDIA-related settings in my GRUB and mkinitcpio configs and tried to match Garuda's settings to what I set in Arch, but no dice. Any ideas?

Btw, the garuda-inxi output listed below was run in Dynamic mode with a working system. I'm not in a position to reboot now, but I will re-run in Discrete mode later and post output.

❯ sudo garuda-inxi
System:
  Kernel: 5.17.9-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 12.1.0
    parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
    root=UUID=85934eef-40e6-42a1-94c8-6961030da57f rw rootflags=subvol=@
    quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0
    systemd.unified_cgroup_hierarchy=1 loglevel=3 intel_pstate=passive
  Console: pty pts/2 DM: SDDM Distro: Garuda Linux base: Arch Linux
Machine:
  Type: Laptop System: LENOVO product: 81Y6 v: LEGION 5 26AMR5 LAPTOP
    serial: <filter> Chassis: type: 10 v: LEGION 5 26AMR5 LAPTOP
    serial: <filter>
  Mobo: LENOVO model: INVALID v: SDK0K17763 Win8 Pro MBR IPG
    serial: <filter> UEFI: LENOVO v: EFCN30WW date: 04/21/2020
Battery:
  ID-1: BAT0 charge: 51.8 Wh (99.0%) condition: 52.3/60.0 Wh (87.2%)
    volts: 17.2 min: 15.4 model: SMP L19M4PC0 type: Li-poly serial: <filter>
    status: N/A cycles: 234
  Device-1: ps-controller-battery-4c:b9:9b:f5:5a:46 model: N/A
    serial: N/A charge: N/A status: discharging
CPU:
  Info: model: Intel Core i7-10750H socket: U3E1 bits: 64 type: MT MCP
    arch: Comet Lake family: 6 model-id: 0xA5 (165) stepping: 2
    microcode: 0xF0
  Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache:
    L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 1.5 MiB desc: 6x256 KiB
    L3: 12 MiB desc: 1x12 MiB
  Speed (MHz): avg: 4073 high: 4266 min/max: 800/5000
    base/boost: 2475/8300 scaling: driver: intel_cpufreq
    governor: performance volts: 0.8 V ext-clock: 100 MHz cores: 1: 4200
    2: 4200 3: 4200 4: 4200 5: 4034 6: 4200 7: 4200 8: 4095 9: 4200
    10: 4263 11: 4266 12: 2829 bogomips: 62399
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: itlb_multihit status: KVM: VMX disabled
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: spec_store_bypass
    mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1
    mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2
    mitigation: Enhanced IBRS, IBPB: conditional, RSB filling
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: Intel CometLake-H GT2 [UHD Graphics] vendor: Lenovo
    driver: i915 v: kernel ports: active: eDP-1 empty: none bus-ID: 00:02.0
    chip-ID: 8086:9bc4 class-ID: 0300
  Device-2: NVIDIA TU116M [GeForce GTX 1660 Ti Mobile] vendor: Lenovo
    driver: nvidia v: 515.43.04 alternate: nouveau,nvidia_drm
    non-free: 515.xx+ status: current (as of 2022-05) arch: Turing pcie:
    gen: 1 speed: 2.5 GT/s lanes: 16 link-max: gen: 3 speed: 8 GT/s ports:
    active: none empty: DP-1,HDMI-A-1,eDP-2 bus-ID: 01:00.0
    chip-ID: 10de:2191 class-ID: 0300
  Device-3: Chicony Integrated Camera type: USB driver: uvcvideo
    bus-ID: 1-6:3 chip-ID: 04f2:b6c2 class-ID: 0e02 serial: <filter>
  Display: server: X.org v: 1.21.1.3 with: Xwayland v: 22.1.2 driver: X:
    loaded: modesetting,nvidia gpu: i915 display-ID: :1
  Monitor-1: eDP-1 model: BOE Display 0x0900 built: 2019 res: 1920x1080
    dpi: 142 gamma: 1.2 size: 344x194mm (13.54x7.64") diag: 395mm (15.5")
    ratio: 16:9 modes: 1920x1080
  Message: GL data unavailable for root.
Audio:
  Device-1: Intel Comet Lake PCH cAVS vendor: Lenovo driver: snd_hda_intel
    v: kernel alternate: snd_soc_skl,snd_sof_pci_intel_cnl bus-ID: 00:1f.3
    chip-ID: 8086:06c8 class-ID: 0403
  Device-2: NVIDIA TU116 High Definition Audio vendor: Lenovo
    driver: snd_hda_intel v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 16
    link-max: gen: 3 speed: 8 GT/s bus-ID: 01:00.1 chip-ID: 10de:1aeb
    class-ID: 0403
  Sound Server-1: ALSA v: k5.17.9-zen1-1-zen running: yes
  Sound Server-2: sndio v: N/A running: no
  Sound Server-3: PulseAudio v: 15.0 running: no
  Sound Server-4: PipeWire v: 0.3.51 running: yes
Network:
  Device-1: Intel Comet Lake PCH CNVi WiFi driver: iwlwifi v: kernel
    bus-ID: 00:14.3 chip-ID: 8086:06f0 class-ID: 0280
  IF: wlp0s20f3 state: up mac: <filter>
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: Lenovo driver: r8168 v: 8.050.00-NAPI modules: r8169 pcie:
    gen: 1 speed: 2.5 GT/s lanes: 1 port: 3000 bus-ID: 08:00.0
    chip-ID: 10ec:8168 class-ID: 0200
  IF: enp8s0 state: down mac: <filter>
Bluetooth:
  Device-1: Intel AX201 Bluetooth type: USB driver: btusb v: 0.8
    bus-ID: 1-14:6 chip-ID: 8087:0026 class-ID: e001
  Report: bt-adapter ID: hci0 rfk-id: 2 state: up address: <filter>
Drives:
  Local Storage: total: 1.38 TiB used: 776.76 GiB (55.2%)
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Smart Modular Tech.
    model: SHGP31-1000GM size: 931.51 GiB block-size: physical: 512 B
    logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: 41062C20 temp: 44.9 C scheme: GPT
  SMART: yes health: PASSED on: 7d 22h cycles: 330
    read-units: 2,079,703 [1.06 TB] written-units: 3,432,021 [1.75 TB]
  ID-2: /dev/nvme1n1 maj-min: 259:5 vendor: Toshiba
    model: KBG40ZNT512G MEMORY size: 476.94 GiB block-size: physical: 512 B
    logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: 0105AELA temp: 41.9 C scheme: GPT
  SMART: yes health: PASSED on: 144d 13h cycles: 13,687
    read-units: 176,645,009 [90.4 TB] written-units: 42,284,609 [21.6 TB]
Partition:
  ID-1: / raw-size: 328.07 GiB size: 328.07 GiB (100.00%)
    used: 206.66 GiB (63.0%) fs: btrfs block-size: 4096 B
    dev: /dev/nvme1n1p5 maj-min: 259:10
  ID-2: /boot/efi raw-size: 260 MiB size: 256 MiB (98.46%)
    used: 218.3 MiB (85.3%) fs: vfat block-size: 512 B dev: /dev/nvme1n1p1
    maj-min: 259:6
  ID-3: /home raw-size: 328.07 GiB size: 328.07 GiB (100.00%)
    used: 206.66 GiB (63.0%) fs: btrfs block-size: 4096 B
    dev: /dev/nvme1n1p5 maj-min: 259:10
  ID-4: /var/log raw-size: 328.07 GiB size: 328.07 GiB (100.00%)
    used: 206.66 GiB (63.0%) fs: btrfs block-size: 4096 B
    dev: /dev/nvme1n1p5 maj-min: 259:10
  ID-5: /var/tmp raw-size: 328.07 GiB size: 328.07 GiB (100.00%)
    used: 206.66 GiB (63.0%) fs: btrfs block-size: 4096 B
    dev: /dev/nvme1n1p5 maj-min: 259:10
Swap:
  Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
  ID-1: swap-1 type: zram size: 15.49 GiB used: 33 MiB (0.2%)
    priority: 100 dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 72.0 C pch: 51.0 C mobo: N/A
  Fan Speeds (RPM): N/A
Info:
  Processes: 421 Uptime: 1h 21m wakeups: 2 Memory: 15.49 GiB
  used: 8.45 GiB (54.6%) Init: systemd v: 251 tool: systemctl Compilers:
  gcc: 12.1.0 clang: 13.0.1 Packages: 2705 pacman: 2672 lib: 596
  flatpak: 25 snap: 8 Shell: garuda-inxi (sudo) default: Bash v: 5.1.16
  running-in: foot inxi: 3.3.16
Garuda (2.6.3-2):
  System install date:     2021-11-12
  Last full system update: 2022-05-25
  Is partially upgraded:   No
  Relevant software:       NetworkManager
  Windows dual boot:       Yes
  Snapshots:               Snapper
  Failed units:            

But... why? Why not just use the officially supported "hybrid mode" as you call it?

I've found a Wayland compositor (Hyprland) that I really like, and I'm trying to get it to the point where I can use it as my daily-driver. The one deal-breaker is that I can't get multiple monitors working with wlroots in Hybrid mode. I found out while tinkering in vanilla Arch that Hyprland/wlroots works perfectly in Discrete mode, but I can't get Garuda to launch in that mode.

This seems very warm for the fan to not be kicking on. Is that right?

This seems like a good idea to me. What do you mean by “no dice”? What differences did you find, or were they exactly the same?

What kernel is your Arch install running?

Running hot because I've got Final Fantasy XIV running in the background. Fans are definitely on. I don't think fan reporting works on my system.

mkinitcpio didn't have the nvidia kernel modules loading in Garuda, so I added nvidia nvidia_modeset nvidia_uvm nvidia_drm to mkinitcpio.conf. GRUB didn't have anything that seemed relevant. Adding the nvidia modules in Garuda didn't change the situation.

Running linux-zen on both installs.

For the record, here is the MODULES part of my mkinitcpio.conf
MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm crc32c-intel intel_agp i915 amdgpu radeon nouveau)
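To make the comparison between the two installs less error-prone, the MODULES line can be parsed and diffed. This is only a rough sketch: parse_modules and extra_modules are hypothetical helper names, the parser handles only a single-line unquoted MODULES=(...) array, and the "wanted" list assumes the Arch install loads exactly the four nvidia modules.

```python
import re


def parse_modules(conf_text: str) -> list[str]:
    """Return the entries of the last uncommented single-line MODULES=(...) array, in order."""
    modules: list[str] = []
    for line in conf_text.splitlines():
        match = re.match(r"^\s*MODULES=\((.*?)\)", line)
        if match:
            modules = match.group(1).split()
    return modules


def extra_modules(conf_text: str, wanted: list[str]) -> list[str]:
    """Return modules listed in the config that are not in the wanted list."""
    return [m for m in parse_modules(conf_text) if m not in wanted]


if __name__ == "__main__":
    garuda_conf = (
        "MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm "
        "crc32c-intel intel_agp i915 amdgpu radeon nouveau)"
    )
    # Assumption: the Arch install lists only these four modules.
    arch_modules = ["nvidia", "nvidia_modeset", "nvidia_uvm", "nvidia_drm"]
    print(extra_modules(garuda_conf, arch_modules))
```

Whatever this prints is the set of modules Garuda puts in the initramfs that the working Arch install does not.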

I added the nvidia modules in Garuda during my troubleshooting. My Arch install only has the nvidia modules loading. And for grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0 systemd.unified_cgroup_hierarchy=1 loglevel=3"

Arch only has loglevel=3 quiet
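The same kind of diff works for the kernel parameters. A small sketch, with hypothetical helper names, that lists which parameters appear on one command line but not on the other:

```python
import shlex


def cmdline_params(value: str) -> list[str]:
    """Split a kernel command line string into individual parameters."""
    return shlex.split(value)


def only_in_first(first: str, second: str) -> list[str]:
    """Return parameters present in the first command line but absent from the second."""
    second_params = set(cmdline_params(second))
    return [p for p in cmdline_params(first) if p not in second_params]


if __name__ == "__main__":
    garuda = (
        "quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0 "
        "systemd.unified_cgroup_hierarchy=1 loglevel=3"
    )
    arch = "loglevel=3 quiet"
    for param in only_in_first(garuda, arch):
        print(param)
```

Note that GRUB_CMDLINE_LINUX_DEFAULT is not necessarily everything the kernel booted with: the garuda-inxi output also shows intel_pstate=passive. Feeding the contents of /proc/cmdline from each install into the same function gives the comparison for what was actually used.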

output of garuda-inxi in discrete mode, run from a tty:

System:
  Kernel: 5.17.9-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 12.1.0
    parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen root=UUID=85934eef-40e6-42a1-94c8-6961030da57f
    rw rootflags=subvol=@ quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0
    systemd.unified_cgroup_hierarchy=1 loglevel=3 intel_pstate=passive
  Console: tty 2 Distro: Garuda Linux base: Arch Linux
Machine:
  Type: Laptop System: LENOVO product: 81Y6 v: LEGION 5 26AMR5 LAPTOP serial: <filter> Chassis:
    type: 10 v: LEGION 5 26AMR5 LAPTOP serial: <filter>
  Mobo: LENOVO model: INVALID v: SDK0K17763 Win8 Pro MBR IPG serial: <filter> UEFI: LENOVO
    v: EFCN30WW date: 04/21/2020
Battery:
  ID-1: BAT0 charge: 50.0 Wh (98.0%) condition: 51.0/60.0 Wh (84.9%) volts: 17.1 min: 15.4
    model: SMP L19M4PC0 type: Li-poly serial: <filter> status: N/A cycles: 234
  Device-1: ps-controller-battery-4c:b9:9b:f5:5a:46 model: N/A serial: N/A charge: N/A
    status: charging
CPU:
  Info: model: Intel Core i7-10750H socket: U3E1 bits: 64 type: MT MCP arch: Comet Lake family: 6
    model-id: 0xA5 (165) stepping: 2 microcode: 0xF0
  Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache: L1: 384 KiB
    desc: d-6x32 KiB; i-6x32 KiB L2: 1.5 MiB desc: 6x256 KiB L3: 12 MiB desc: 1x12 MiB
  Speed (MHz): avg: 1674 high: 3571 min/max: 800/5000 base/boost: 2475/8300 scaling:
    driver: intel_cpufreq governor: schedutil volts: 0.8 V ext-clock: 100 MHz cores: 1: 3571
    2: 3200 3: 867 4: 800 5: 800 6: 800 7: 903 8: 1154 9: 800 10: 800 11: 3200 12: 3200
    bogomips: 62399
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: itlb_multihit status: KVM: VMX disabled
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Enhanced IBRS, IBPB: conditional, RSB filling
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA TU116M [GeForce GTX 1660 Ti Mobile] vendor: Lenovo driver: nvidia v: 515.43.04
    alternate: nouveau,nvidia_drm non-free: 515.xx+ status: current (as of 2022-05) arch: Turing
    pcie: gen: 1 speed: 2.5 GT/s lanes: 16 link-max: gen: 3 speed: 8 GT/s bus-ID: 01:00.0
    chip-ID: 10de:2191 class-ID: 0300
  Device-2: Chicony Integrated Camera type: USB driver: uvcvideo bus-ID: 1-6:3
    chip-ID: 04f2:b6c2 class-ID: 0e02 serial: <filter>
  Display: server: X.org v: 1.21.1.3 with: Xwayland v: 22.1.2 driver: X:
    loaded: modesetting,nvidia gpu: nvidia tty: 192x54
  Message: GL data unavailable in console for root.
Audio:
  Device-1: Intel Comet Lake PCH cAVS vendor: Lenovo driver: snd_hda_intel v: kernel
    alternate: snd_soc_skl,snd_sof_pci_intel_cnl bus-ID: 00:1f.3 chip-ID: 8086:06c8 class-ID: 0403
  Device-2: NVIDIA TU116 High Definition Audio vendor: Lenovo driver: snd_hda_intel v: kernel
    pcie: gen: 1 speed: 2.5 GT/s lanes: 16 link-max: gen: 3 speed: 8 GT/s bus-ID: 01:00.1
    chip-ID: 10de:1aeb class-ID: 0403
  Device-3: Sony DualSense wireless controller (PS5) type: USB
    driver: playstation,snd-usb-audio,usbhid bus-ID: 1-4:2 chip-ID: 054c:0ce6 class-ID: 0300
  Sound Server-1: ALSA v: k5.17.9-zen1-1-zen running: yes
  Sound Server-2: sndio v: N/A running: no
  Sound Server-3: PulseAudio v: 15.0 running: no
  Sound Server-4: PipeWire v: 0.3.51 running: yes
Network:
  Device-1: Intel Comet Lake PCH CNVi WiFi driver: iwlwifi v: kernel bus-ID: 00:14.3
    chip-ID: 8086:06f0 class-ID: 0280
  IF: wlp0s20f3 state: up mac: <filter>
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Lenovo driver: r8168
    v: 8.050.00-NAPI modules: r8169 pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: 3000
    bus-ID: 08:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: enp8s0 state: down mac: <filter>
  Device-3: Sony DualSense wireless controller (PS5) type: USB
    driver: playstation,snd-usb-audio,usbhid bus-ID: 1-4:2 chip-ID: 054c:0ce6 class-ID: 0300
Bluetooth:
  Device-1: Intel AX201 Bluetooth type: USB driver: btusb v: 0.8 bus-ID: 1-14:5
    chip-ID: 8087:0026 class-ID: e001
  Report: bt-adapter ID: hci0 rfk-id: 2 state: down bt-service: enabled,running rfk-block:
    hardware: no software: no address: <filter>
Drives:
  Local Storage: total: 1.38 TiB used: 776.59 GiB (55.1%)
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Smart Modular Tech. model: SHGP31-1000GM
    size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD
    serial: <filter> rev: 41062C20 temp: 47.9 C scheme: GPT
  SMART: yes health: PASSED on: 8d 2h cycles: 332 read-units: 2,137,412 [1.09 TB]
    written-units: 3,432,074 [1.75 TB]
  ID-2: /dev/nvme1n1 maj-min: 259:5 vendor: Toshiba model: KBG40ZNT512G MEMORY size: 476.94 GiB
    block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: 0105AELA temp: 42.9 C scheme: GPT
  SMART: yes health: PASSED on: 144d 15h cycles: 13,689 read-units: 176,657,569 [90.4 TB]
    written-units: 42,300,680 [21.6 TB]
Partition:
  ID-1: / raw-size: 328.07 GiB size: 328.07 GiB (100.00%) used: 206.49 GiB (62.9%) fs: btrfs
    block-size: 4096 B dev: /dev/nvme1n1p5 maj-min: 259:10
  ID-2: /boot/efi raw-size: 260 MiB size: 256 MiB (98.46%) used: 218.3 MiB (85.3%) fs: vfat
    block-size: 512 B dev: /dev/nvme1n1p1 maj-min: 259:6
  ID-3: /home raw-size: 328.07 GiB size: 328.07 GiB (100.00%) used: 206.49 GiB (62.9%)
    fs: btrfs block-size: 4096 B dev: /dev/nvme1n1p5 maj-min: 259:10
  ID-4: /var/log raw-size: 328.07 GiB size: 328.07 GiB (100.00%) used: 206.49 GiB (62.9%)
    fs: btrfs block-size: 4096 B dev: /dev/nvme1n1p5 maj-min: 259:10
  ID-5: /var/tmp raw-size: 328.07 GiB size: 328.07 GiB (100.00%) used: 206.49 GiB (62.9%)
    fs: btrfs block-size: 4096 B dev: /dev/nvme1n1p5 maj-min: 259:10
Swap:
  Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
  ID-1: swap-1 type: zram size: 15.53 GiB used: 0 KiB (0.0%) priority: 100 dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 58.0 C pch: 49.0 C mobo: N/A gpu: nvidia temp: 50 C
  Fan Speeds (RPM): N/A
Info:
  Processes: 296 Uptime: 2m wakeups: 2 Memory: 15.53 GiB used: 1.23 GiB (7.9%) Init: systemd
  v: 251 tool: systemctl Compilers: gcc: 12.1.0 clang: 13.0.1 Packages: 2705 pacman: 2672
  lib: 596 flatpak: 25 snap: 8 Shell: garuda-inxi (sudo) default: Bash v: 5.1.16
  running-in: tty 2 inxi: 3.3.16
Garuda (2.6.3-2):
  System install date:     2021-11-12
  Last full system update: 2022-05-25
  Is partially upgraded:   No
  Relevant software:       NetworkManager
  Windows dual boot:       Yes
  Snapshots:               Snapper
  Failed units:            systemd-guest-user.service 

I would remove all those unnecessary modules. It doesn’t seem like it would prevent your system from coming up, but then again they are obviously not helping so it might be worth a shot.

Are they both on the same kernel version? Same graphics driver? Are they both booting to SDDM?

This could be a clue, and you could peel off the other parameters to test, but then again it could be a red herring. If you remove a parameter and it doesn’t fix the issue, I would put it back. I’m honestly not sure I understand what the cgroup parameter is doing, but at least half of those parameters are related to quiet boot functions; it seems unlikely they would break anything either way, but you never know.


Alright, I figured this out. It turns out this was caused by optimus-manager or one of its dependencies. Replacing garuda-optimus-manager-config with garuda-nvidia-prime-config resolved the issue. Optimus manager doesn't work in Wayland sessions anyway, so it's not a big loss.
