Blank screen if left alone for too long! (GPU has fallen off the bus!)

So, I turned on my computer and went to put the baby to sleep. When I came back, the screens were off. I thought they had just gone to sleep as usual, so I tried moving the mouse. Nothing happened. I tried the keyboard, even Ctrl+Alt+F1 through F4 to switch terminals, and still nothing. So I hard-reset the computer.

Once I got back in, I ran journalctl -o short-precise -k -b -1 (thank you, Stack Exchange) and found a huge number of nvidia-modeset error messages:

out 21 21:42:24.001362 johnny-g55590 kernel: NVRM: GPU at PCI:0000:01:00: GPU-699bec45-43f5-ce35-be5f-6f624a2850a7
out 21 21:42:24.001434 johnny-g55590 kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
out 21 21:42:24.001454 johnny-g55590 kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
out 21 21:42:24.001467 johnny-g55590 kernel: NVRM: A GPU crash dump has been created. If possible, please run
                                             NVRM: nvidia-bug-report.sh as root to collect this data before
                                             NVRM: the NVIDIA kernel module is unloaded.
out 21 21:42:29.123205 johnny-g55590 kernel: NVRM: Error in service of callback 
out 21 21:42:37.680253 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57d:0:0:0x00>
out 21 21:42:37.680338 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x00>
out 21 21:42:37.680357 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:1:0:0x00>
out 21 21:42:37.680370 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:2:0:0x00>
out 21 21:42:37.680381 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:3:0:0x00>
out 21 21:42:37.680393 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:4:0:0x00>
out 21 21:42:37.680404 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:5:0:0x00>
out 21 21:42:37.680419 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x00>
out 21 21:42:37.680451 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:7:0:0x00>
out 21 22:13:33.941283 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Error while waiting for GPU progress: 0x0000c57d:0 2:0:4048:4040

The last line repeats a LOT of times.
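
In case it helps anyone reading a log like this, here is a minimal Python sketch that pulls out the Xid codes and collapses the repeated nvidia-modeset lines into counts. It is not part of any NVIDIA or systemd tooling: the function name summarize and the <state> placeholder are made up for illustration, and it assumes lines in the journalctl -o short-precise format shown above.

```python
import re
from collections import Counter

# Matches NVRM Xid lines such as "NVRM: Xid (PCI:0000:01:00): 79, pid=..."
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+),")
# Everything after "kernel: " is treated as the message body.
MSG_RE = re.compile(r"kernel: (.*)$")
# Hex state words (e.g. 0x0000c57e:3:0:0x00) differ per line; mask them
# so that otherwise identical messages are grouped together.
VARIABLE_RE = re.compile(r"0x[0-9a-fA-F]+(?::[0-9a-fA-F:x]+)*")


def summarize(lines):
    """Return (xid_events, message_counts) for an iterable of kernel log lines.

    Hypothetical helper for illustration only. Continuation lines without
    a "kernel: " prefix are skipped.
    """
    xids = []
    counts = Counter()
    for line in lines:
        match = MSG_RE.search(line)
        if not match:
            continue
        msg = match.group(1).strip()
        xid = XID_RE.search(msg)
        if xid:
            xids.append((xid.group(1), int(xid.group(2))))
        counts[VARIABLE_RE.sub("<state>", msg)] += 1
    return xids, counts


# Small demo using lines copied from the log above.
sample = [
    "out 21 21:42:24.001434 johnny-g55590 kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.",
    "out 21 21:42:37.680253 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57d:0:0:0x00>",
    "out 21 21:42:37.680338 johnny-g55590 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x00>",
]
events, grouped = summarize(sample)
for device, code in events:
    print(f"Xid {code} on {device}")
for message, n in grouped.most_common():
    print(f"{n:>4}x {message}")
```

To use it on a real log, the output of the journalctl command could be saved to a file and its lines passed to summarize.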

I see the message in there saying “please run nvidia-bug-report”, but I can’t interact with the computer once this happens, since nothing shows up on any screen.

I’m wondering whether I should revert the changes I made here to disable the Intel graphics, to help investigate this. Looking around the internet, “GPU has fallen off the bus” seems to be a pretty vague error that can have numerous causes, such as the power supply or the graphics card’s physical connection to the motherboard…

On both counts, I don’t think that’s the case, since this is a laptop (Dell G5) and it’s properly maintained (I think I open and clean it once a month; OK, maybe it’s over-zealously maintained).

I don’t remember ever having this kind of problem before I disabled the Intel graphics and followed the steps from the Arch Wiki to use only my NVIDIA card.

(thinking about other topics I’ve been opening… am I cursed? =p )

System:
  Kernel: 6.5.8-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
    clocksource: tsc available: acpi_pm
    parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
    root=UUID=468e3250-834f-4678-85b1-f50f268e557d rw rootflags=subvol=@
    quiet console=tty0 console=ttyS0,115200n8 cryptomgr.notests
    initcall_debug intel_iommu=igfx_off kvm-intel.nested=1 no_timer_check
    noreplace-smp page_alloc.shuffle=1 rcupdate.rcu_expedited=1
    rootfstype=ext4,btrfs,xfs,f2fs tsc=reliable rd.udev.log_priority=3
    vt.global_cursor_default=0
    resume=UUID=92d5bc58-440e-4eab-9f01-4fa35d34e02b loglevel=3 rw ibt=off
  Desktop: KDE Plasma v: 5.27.8 tk: Qt v: 5.15.11 wm: kwin_x11 vt: 2
    dm: SDDM Distro: Garuda Linux base: Arch Linux
Machine:
  Type: Laptop System: Dell product: G5 5590 v: N/A
    serial: <superuser required> Chassis: type: 10 serial: <superuser required>
  Mobo: Dell model: 0F3T2G v: A00 serial: <superuser required> UEFI: Dell
    v: 1.22.0 date: 11/10/2022
Battery:
  ID-1: BAT0 charge: 45.4 Wh (100.0%) condition: 45.4/60.0 Wh (75.6%)
    volts: 16.9 min: 15.2 model: SMP DELL JJPFK87 type: Li-poly serial: <filter>
    status: full
CPU:
  Info: model: Intel Core i7-9750H bits: 64 type: MT MCP arch: Coffee Lake
    gen: core 9 level: v3 note: check built: 2018 process: Intel 14nm family: 6
    model-id: 0x9E (158) stepping: 0xA (10) microcode: 0xF4
  Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache:
    L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 1.5 MiB desc: 6x256 KiB
    L3: 12 MiB desc: 1x12 MiB
  Speed (MHz): avg: 2217 high: 4203 min/max: 800/4500 scaling:
    driver: intel_pstate governor: powersave cores: 1: 4203 2: 4203 3: 800
    4: 800 5: 800 6: 800 7: 4201 8: 800 9: 800 10: 4203 11: 800 12: 4203
    bogomips: 62399
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities: <filter>
Graphics:
  Device-1: Intel CoffeeLake-H GT2 [UHD Graphics 630] vendor: Dell
    driver: i915 v: kernel arch: Gen-9.5 process: Intel 14nm built: 2016-20
    ports: active: none off: eDP-1 empty: DP-1, DP-2, HDMI-A-1, HDMI-A-2
    bus-ID: 00:02.0 chip-ID: 8086:3e9b class-ID: 0300
  Device-2: NVIDIA TU106M [GeForce RTX 2060 Mobile] vendor: Dell
    driver: nvidia v: 535.113.01 alternate: nouveau,nvidia_drm non-free: 535.xx+
    status: current (as of 2023-09) arch: Turing code: TUxxx
    process: TSMC 12nm FF built: 2018-22 pcie: gen: 1 speed: 2.5 GT/s lanes: 8
    link-max: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 01:00.0
    chip-ID: 10de:1f11 class-ID: 0300
  Device-3: Microdia Integrated_Webcam_HD driver: uvcvideo type: USB
    rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-5:3 chip-ID: 0c45:671f
    class-ID: 0e02
  Display: x11 server: X.Org v: 21.1.8 with: Xwayland v: 23.2.1
    compositor: kwin_x11 driver: X: loaded: modesetting,nvidia unloaded: nouveau
    alternate: fbdev,intel,nv,vesa dri: iris gpu: i915 display-ID: :0
    screens: 1
  Screen-1: 0 s-res: 2560x2160 s-dpi: 114 s-size: 571x482mm (22.48x18.98")
    s-diag: 747mm (29.42")
  Monitor-1: DP-0 pos: primary,top res: 2560x1080 hz: 60 dpi: 81
    size: 798x334mm (31.42x13.15") diag: 865mm (34.06") modes: N/A
  Monitor-2: HDMI-0 pos: bottom res: 2560x1080 hz: 60 dpi: 96
    size: 677x290mm (26.65x11.42") diag: 736mm (29") modes: N/A
  Monitor-3: eDP-1-1 size-res: N/A modes: N/A
  API: EGL v: 1.5 hw: drv: nvidia platforms: gbm: drv: nvidia
  API: OpenGL v: 4.6.0 vendor: nvidia v: 535.113.01 glx-v: 1.4
    direct-render: yes renderer: NVIDIA GeForce RTX 2060/PCIe/SSE2
    memory: 5.86 GiB
  API: Vulkan v: 1.3.264 layers: 14 device: 0 type: integrated-gpu
    name: Intel UHD Graphics 630 (CFL GT2) driver: mesa intel v: 23.2.1-arch1.2
    device-ID: 8086:3e9b surfaces: xcb,xlib device: 1 type: discrete-gpu
    name: NVIDIA GeForce RTX 2060 driver: nvidia v: 535.113.01
    device-ID: 10de:1f11 surfaces: xcb,xlib device: 2 type: cpu name: llvmpipe
    (LLVM 16.0.6 256 bits) driver: mesa llvmpipe v: 23.2.1-arch1.2 (LLVM
    16.0.6) device-ID: 10005:0000 surfaces: xcb,xlib
Audio:
  Device-1: Intel Cannon Lake PCH cAVS vendor: Dell driver: snd_hda_intel
    v: kernel alternate: snd_soc_skl,snd_sof_pci_intel_cnl bus-ID: 00:1f.3
    chip-ID: 8086:a348 class-ID: 0403
  Device-2: NVIDIA TU106 High Definition Audio vendor: Dell
    driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 8
    link-max: lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:10f9 class-ID: 0403
  Device-3: Generalplus USB Audio Device
    driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 1.1 speed: 12 Mb/s
    lanes: 1 mode: 1.1 bus-ID: 1-4.1:4 chip-ID: 1b3f:2008 class-ID: 0300
  Device-4: Realtek USB Audio driver: snd-usb-audio type: USB rev: 2.0
    speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-4.5:8 chip-ID: 0bda:4014
    class-ID: 0102 serial: <filter>
  API: ALSA v: k6.5.8-zen1-1-zen status: kernel-api with: aoss
    type: oss-emulator tools: N/A
  Server-1: PipeWire v: 0.3.83 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Realtek vendor: Dell driver: r8169 v: kernel pcie: gen: 1
    speed: 2.5 GT/s lanes: 1 port: 3000 bus-ID: 3c:00.0 chip-ID: 10ec:2502
    class-ID: 0200
  IF: enp60s0 state: down mac: <filter>
  Device-2: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
    vendor: Dell driver: ath10k_pci v: kernel pcie: gen: 1 speed: 2.5 GT/s
    lanes: 1 bus-ID: 3d:00.0 chip-ID: 168c:003e class-ID: 0280 temp: 48.0 C
  IF: wlp61s0 state: down mac: <filter>
  Device-3: Realtek RTL8153 Gigabit Ethernet Adapter driver: r8152 type: USB
    rev: 3.0 speed: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 bus-ID: 6-1.2:3
    chip-ID: 0bda:8153 class-ID: 0000 serial: <filter>
  IF: enp58s0u1u2 state: up speed: 1000 Mbps duplex: full mac: <filter>
Bluetooth:
  Device-1: Qualcomm Atheros driver: btusb v: 0.8 type: USB rev: 2.0
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-14:7 chip-ID: 0cf3:e007
    class-ID: e001
  Report: btmgmt ID: hci0 rfk-id: 0 state: up address: <filter> bt-v: 4.2
    lmp-v: 8 status: discoverable: no pairing: no class-ID: 7c010c
Drives:
  Local Storage: total: 1.14 TiB used: 489.5 GiB (41.8%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Western Digital
    model: PC SN520 NVMe WDC 256GB size: 238.47 GiB block-size: physical: 512 B
    logical: 512 B speed: 15.8 Gb/s lanes: 2 tech: SSD serial: <filter>
    fw-rev: 20240012 temp: 55.9 C scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Western Digital
    model: WD10SPZX-75Z10T3 size: 931.51 GiB block-size: physical: 4096 B
    logical: 512 B speed: 6.0 Gb/s tech: HDD rpm: 5400 serial: <filter>
    fw-rev: 4514 scheme: GPT
Partition:
  ID-1: / raw-size: 221.19 GiB size: 221.19 GiB (100.00%)
    used: 105.23 GiB (47.6%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
    used: 632 KiB (0.2%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
  ID-3: /home raw-size: 221.19 GiB size: 221.19 GiB (100.00%)
    used: 105.23 GiB (47.6%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-4: /var/log raw-size: 221.19 GiB size: 221.19 GiB (100.00%)
    used: 105.23 GiB (47.6%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-5: /var/tmp raw-size: 221.19 GiB size: 221.19 GiB (100.00%)
    used: 105.23 GiB (47.6%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
Swap:
  Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default) zswap: no
  ID-1: swap-1 type: zram size: 15.31 GiB used: 11.8 MiB (0.1%)
    priority: 100 comp: zstd avail: lzo,lzo-rle,lz4,lz4hc,842 max-streams: 12
    dev: /dev/zram0
  ID-2: swap-2 type: partition size: 16.98 GiB used: 0 KiB (0.0%)
    priority: -2 dev: /dev/nvme0n1p3 maj-min: 259:3
Sensors:
  System Temperatures: cpu: 57.0 C pch: 71.0 C mobo: N/A gpu: nvidia
    temp: 52 C
  Fan Speeds (rpm): N/A
Info:
  Processes: 330 Uptime: 46m wakeups: 5 Memory: total: 16 GiB note: est.
  available: 15.31 GiB used: 5.95 GiB (38.9%) Init: systemd v: 254
  default: graphical tool: systemctl Compilers: gcc: 13.2.1 clang: 16.0.6
  Packages: 1946 pm: pacman pkgs: 1937 libs: 488
  tools: gnome-software,octopi,pamac,paru,yay pm: flatpak pkgs: 9 Shell: Zsh
  v: 5.9 running-in: kitty inxi: 3.3.30
Garuda (2.6.17-1):
  System install date:     2023-04-01
  Last full system update: 2023-10-21
  Is partially upgraded:   No
  Relevant software:       snapper NetworkManager dracut nvidia-dkms
  Windows dual boot:       No/Undetected
  Failed units:            
