Garuda Zen Mokka - apparently random System freezes - lots of errors

Hi, I’ve been a Garuda and Linux user for three days now.
I used to run Windows mainly for games (I use a Mac for work) and wanted to switch to Linux because I’m tired of Microsoft.

I am currently dual booting, since I’m still testing the environment.
I’ve been experimenting with Heroic Launcher to run previously installed games. I have my Windows system on one SSD, plus several other external drives, and I decided to test this on a drive that doesn’t hold the Windows system.

I experienced a freeze while playing a game (The Last Spell, on GOG):
everything on screen froze, the audio kept playing normally, and the computer was totally unresponsive. I waited a couple of minutes and tried every key combination I could find online, but nothing worked and I had to hard reboot.
Afterwards, three disks on the computer were “impossible to mount”; I had to log into Windows and run chkdsk on all three, which fixed it. (Those were the three SSDs on which I keep applications, including the Windows system drive.)

I was having problems reading the game’s original save file, which isn’t saved to the cloud, and figured I’d better not mess with those directories, so I decided to revert the changes I had made and just start a new game.
The freeze happened again, in the same way; however, there was no problem with the drives this time.

The last time, the freeze happened while just watching Netflix, with another browser open doing nothing. The behavior was slightly different: the playback froze in the player, both video and audio. Then it stuttered for a fraction of a second and froze again. Everything was unresponsive and I had to hard reboot. All of this happened over two days. I’ve checked system temperatures and they were nothing unusual.

I’ve checked some other threads about seemingly random freezes, but they described different situations (audio also freezing, black screens, which did not happen here). One thread that looked promising was abandoned; it referred to an older post on the Arch forum suggesting either downgrading the NVIDIA graphics driver to 460xx or changing the kernel (a process I still don’t really understand, which is what led me to ask here).

I want to point out that a deciding factor that made me want to try Linux was a freeze I got on Windows while playing the same game, which happened in exactly the same fashion as described for the first two. This makes me suspect it might have something to do with the GPU rather than the software, but that’s pure conjecture.

I also ran journalctl -p3 and encountered A LOT of errors, and I really wouldn’t know where to start troubleshooting them. I’ll post the output below the inxi.
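A long `journalctl -p3` dump can be easier to approach when the lines are first grouped by the program that reported them. The following is only a sketch: `summarize_errors` is a hypothetical helper name, and it assumes journalctl's default short output format, where the fifth whitespace-separated field is `program[pid]:`.

```shell
# Count error lines per reporting program.
# Reads journal text (default short format) from stdin.
summarize_errors() {
    # Skip "-- Boot ... --" separators, take the "program[pid]:" column,
    # strip the PID and the trailing colon, then count occurrences.
    awk 'NF >= 5 && $1 != "--" {
             src = $5
             sub(/\[[0-9]+\]/, "", src)
             sub(/:$/, "", src)
             n[src]++
         }
         END { for (s in n) printf "%6d %s\n", n[s], s }' | sort -rn
}
```

One possible use is `journalctl -p 3 -b -1 --no-pager | summarize_errors`, where `-b -1` selects the previous boot (the one that ended in the freeze) instead of the whole journal.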

I tried to be as thorough as possible; I hope I was not too thorough. Let me know if there’s any other relevant info I left out. Thank you very much in advance.

garuda-inxi

System:
Kernel: 6.14.4-zen1-2-zen arch: x86_64 bits: 64 compiler: gcc v: 15.1.1
clocksource: tsc avail: hpet,acpi_pm
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=f85fd176-bb04-4ae5-825e-710a7ed6402a rw rootflags=subvol=@
vt.default_red=30,243,166,249,137,245,148,186,88,243,166,249,137,245,148,166
vt.default_grn=30,139,227,226,180,194,226,194,91,139,227,226,180,194,226,173
vt.default_blu=46,168,161,175,250,231,213,222,112,168,161,175,250,231,213,200
quiet resume=UUID=7de06f12-1e0a-479b-9fff-1e950a46da81 loglevel=3 ibt=off
Desktop: KDE Plasma v: 6.3.4 tk: Qt v: N/A info: frameworks v: 6.13.0
wm: kwin_wayland vt: 1 dm: SDDM Distro: Garuda base: Arch Linux
Machine:
Type: Desktop Mobo: ASUSTeK model: ROG STRIX Z390-F GAMING v: Rev 1.xx
serial: <superuser required> part-nu: ASUS_MB_CNL uuid: <superuser required>
UEFI: American Megatrends v: 1302 date: 09/02/2019
CPU:
Info: model: Intel Core i5-9600KF bits: 64 type: MCP arch: Coffee Lake
gen: core 9 level: v3 note: check built: 2018 process: Intel 14nm family: 6
model-id: 0x9E (158) stepping: 0xD (13) microcode: 0x102
Topology: cpus: 1x dies: 1 clusters: 6 cores: 6 smt: <unsupported> cache:
L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 1.5 MiB desc: 6x256 KiB
L3: 9 MiB desc: 1x9 MiB
Speed (MHz): avg: 4400 min/max: 800/4600 scaling: driver: intel_pstate
governor: powersave cores: 1: 4400 2: 4400 3: 4400 4: 4400 5: 4400 6: 4400
bogomips: 44398
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Vulnerabilities: <filter>
Graphics:
Device-1: NVIDIA TU106 [GeForce RTX 2060 SUPER] vendor: Micro-Star MSI
driver: nvidia v: 570.144 alternate: nouveau,nvidia_drm
non-free: 550-570.xx+ status: current (as of 2025-04; EOL~2026-12-xx)
arch: Turing code: TUxxx process: TSMC 12nm FF built: 2018-2022 pcie:
gen: 3 speed: 8 GT/s lanes: 2 link-max: lanes: 16 ports: active: none
off: DP-3,HDMI-A-1 empty: DP-1,DP-2 bus-ID: 01:00.0 chip-ID: 10de:1f06
class-ID: 0300
Device-2: Logitech HD Pro Webcam C920 driver: snd-usb-audio,uvcvideo
type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-13:10
chip-ID: 046d:082d class-ID: 0102 serial: <filter>
Display: wayland server: X.org v: 1.21.1.16 with: Xwayland v: 24.1.6
compositor: kwin_wayland driver: X: loaded: nvidia unloaded: modesetting
alternate: fbdev,nouveau,nv,vesa gpu: nvidia,nvidia-nvswitch
d-rect: 5760x2160 display-ID: 0
Monitor-1: DP-3 pos: right model: ASUS VG289 serial: <filter> built: 2023
res: mode: 3840x2160 hz: 60 scale: 170% (1.7) to: 2259x1271 dpi: 157
gamma: 1.2 size: 621x341mm (24.45x13.43") diag: 708mm (27.9") ratio: 16:9
modes: max: 3840x2160 min: 640x480
Monitor-2: HDMI-A-1 pos: primary,left model: Dell S2240L serial: <filter>
built: 2013 res: mode: 1920x1080 hz: 60 scale: 105% (1.05) to: 1829x1029
dpi: 102 gamma: 1.2 size: 476x267mm (18.74x10.51") diag: 546mm (21.5")
ratio: 16:9 modes: max: 1920x1080 min: 640x480
API: EGL v: 1.5 hw: drv: nvidia nouveau drv: nvidia platforms: device: 0
drv: nvidia device: 1 drv: nouveau device: 2 drv: swrast gbm: drv: nvidia
surfaceless: drv: nvidia wayland: drv: nvidia x11: drv: nvidia
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 570.144
glx-v: 1.4 direct-render: yes renderer: NVIDIA GeForce RTX 2060
SUPER/PCIe/SSE2 memory: 7.81 GiB display-ID: :1.0
API: Vulkan v: 1.4.309 layers: 5 device: 0 type: discrete-gpu name: NVIDIA
GeForce RTX 2060 SUPER driver: nvidia v: 570.144 device-ID: 10de:1f06
surfaces: xcb,xlib,wayland
Info: Tools: api: clinfo, eglinfo, glxinfo, vulkaninfo
de: kscreen-console,kscreen-doctor gpu: nvidia-settings,nvidia-smi
wl: wayland-info x11: xdpyinfo, xprop, xrandr
Audio:
Device-1: Intel Cannon Lake PCH cAVS vendor: ASUSTeK driver: snd_hda_intel
v: kernel alternate: snd_soc_avs,snd_sof_pci_intel_cnl bus-ID: 00:1f.3
chip-ID: 8086:a348 class-ID: 0403
Device-2: NVIDIA TU106 High Definition Audio vendor: Micro-Star MSI
driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 2
link-max: lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:10f9 class-ID: 0403
Device-3: Focusrite-Novation Scarlett 2i4 driver: snd-usb-audio type: USB
rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-1:2 chip-ID: 1235:800a
class-ID: fe01
Device-4: Logitech HD Pro Webcam C920 driver: snd-usb-audio,uvcvideo
type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-13:10
chip-ID: 046d:082d class-ID: 0102 serial: <filter>
API: ALSA v: k6.14.4-zen1-2-zen status: kernel-api tools: N/A
Server-1: PipeWire v: 1.4.2 status: active with: 1: pipewire-pulse
status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
Device-1: Intel Ethernet I219-V vendor: ASUSTeK driver: e1000e v: kernel
port: N/A bus-ID: 00:1f.6 chip-ID: 8086:15bc class-ID: 0200
IF: eno1 state: up speed: 100 Mbps duplex: full mac: <filter>
Info: services: NetworkManager, smbd, systemd-timesyncd
Drives:
Local Storage: total: 6.58 TiB used: 33.59 GiB (0.5%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Crucial model: CT1000P3SSD8
size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
lanes: 4 tech: SSD serial: <filter> fw-rev: P9CR311 temp: 42.9 C
scheme: GPT
ID-2: /dev/sda maj-min: 8:0 vendor: Western Digital
model: WD40EFAX-68JH4N1 size: 3.64 TiB block-size: physical: 4096 B
logical: 512 B speed: 6.0 Gb/s tech: HDD rpm: 5400 serial: <filter>
fw-rev: 0A83 scheme: GPT
ID-3: /dev/sdb maj-min: 8:16 vendor: Crucial model: CT480BX500SSD1
size: 447.13 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: SSD serial: <filter> fw-rev: 054 scheme: GPT
ID-4: /dev/sdc maj-min: 8:32 vendor: Samsung model: SSD 840 EVO 250GB
size: 232.89 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: SSD serial: <filter> fw-rev: BB0Q scheme: GPT
ID-5: /dev/sdd maj-min: 8:48 vendor: Western Digital
model: WD1002FAEX-00Z3A0 size: 931.51 GiB block-size: physical: 512 B
logical: 512 B speed: 6.0 Gb/s tech: N/A serial: <filter> fw-rev: 1D05
scheme: MBR
ID-6: /dev/sde maj-min: 8:64 vendor: Crucial model: CT500MX500SSD1
size: 465.76 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
tech: SSD serial: <filter> fw-rev: 023 scheme: GPT
Partition:
ID-1: / raw-size: 198.21 GiB size: 198.21 GiB (100.00%)
used: 33.58 GiB (16.9%) fs: btrfs dev: /dev/sdc2 maj-min: 8:34
ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
used: 612 KiB (0.2%) fs: vfat dev: /dev/sdc1 maj-min: 8:33
ID-3: /home raw-size: 198.21 GiB size: 198.21 GiB (100.00%)
used: 33.58 GiB (16.9%) fs: btrfs dev: /dev/sdc2 maj-min: 8:34
ID-4: /var/log raw-size: 198.21 GiB size: 198.21 GiB (100.00%)
used: 33.58 GiB (16.9%) fs: btrfs dev: /dev/sdc2 maj-min: 8:34
ID-5: /var/tmp raw-size: 198.21 GiB size: 198.21 GiB (100.00%)
used: 33.58 GiB (16.9%) fs: btrfs dev: /dev/sdc2 maj-min: 8:34
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default) zswap: no
ID-1: swap-1 type: zram size: 31.25 GiB used: 0 KiB (0.0%) priority: 100
comp: zstd avail: lzo-rle,lzo,lz4,lz4hc,deflate,842 max-streams: 6
dev: /dev/zram0
ID-2: swap-2 type: partition size: 34.38 GiB used: 0 KiB (0.0%)
priority: -2 dev: /dev/sdc3 maj-min: 8:35
Sensors:
System Temperatures: cpu: 44.2 C mobo: N/A
Fan Speeds (rpm): cpu: 0
Info:
Memory: total: 32 GiB available: 31.25 GiB used: 5.44 GiB (17.4%)
Processes: 317 Power: uptime: 26m states: freeze,mem,disk suspend: deep
avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
suspend, test_resume image: 12.42 GiB services: org_kde_powerdevil,
power-profiles-daemon, upowerd Init: systemd v: 257 default: graphical
tool: systemctl
Packages: 1970 pm: pacman pkgs: 1964 libs: 473 tools: octopi,paru
pm: flatpak pkgs: 6 Compilers: clang: 19.1.7 gcc: 15.1.1 Shell: garuda-inxi
default: fish v: 4.0.2 running-in: konsole inxi: 3.3.38
Garuda (2.7.2-1):
System install date:     2025-05-01
Last full system update: 2025-05-03
Is partially upgraded:   No
Relevant software:       snapper NetworkManager dracut nvidia-dkms
Windows dual boot:       Probably (Run as root to verify)
Failed units:

journalctl -p3:

mag 03 13:48:55 home-tower-station systemd-tmpfiles[960]: Failed to write file "/sys/module/pcie_aspm/parameters/policy": Operation not permitted
mag 03 13:48:55 home-tower-station kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
mag 03 13:48:55 home-tower-station kernel: ucsi_ccg 5-0008: i2c_transfer failed -110
mag 03 13:48:55 home-tower-station kernel: ucsi_ccg 5-0008: ucsi_ccg_init failed - -110
mag 03 13:48:55 home-tower-station kernel: ucsi_ccg 5-0008: probe with driver ucsi_ccg failed with error -110
mag 03 14:01:52 home-tower-station dbus-broker-launch[972]: Activation request for 'org.freedesktop.nm_dispatcher' failed.
-- Boot 520d284ce235427eb54ab4ec1529100e --
mag 03 14:06:40 home-tower-station kernel: x86/cpu: SGX disabled or unsupported by BIOS.
mag 03 14:06:43 home-tower-station kernel:
mag 03 14:06:45 home-tower-station systemd-sslh-generator: Configuration directory '/etc/sslh/' does not exist! No units generated.
mag 03 14:06:45 home-tower-station systemd-udevd[682]: /usr/lib/udev/rules.d/75-davincipanel.rules:2 Unknown group 'resolve', ignoring.
mag 03 14:06:46 home-tower-station systemd-tmpfiles[963]: Failed to write file "/sys/module/pcie_aspm/parameters/policy": Operation not permitted
mag 03 14:06:47 home-tower-station kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
mag 03 14:06:47 home-tower-station kernel: ucsi_ccg 5-0008: i2c_transfer failed -110
mag 03 14:06:47 home-tower-station kernel: ucsi_ccg 5-0008: ucsi_ccg_init failed - -110
mag 03 14:06:47 home-tower-station kernel: ucsi_ccg 5-0008: probe with driver ucsi_ccg failed with error -110
mag 03 14:15:36 home-tower-station plasmashell[36388]: aurorae: Couldn't find QML Decoration  ""
mag 03 14:15:36 home-tower-station org_kde_powerdevil[36435]: [ 36435] Error(s) opening ddc devices
mag 03 14:15:36 home-tower-station org_kde_powerdevil[36435]: [ 36435] Error OK(0): success opening /dev/i2c-5
mag 03 14:16:02 home-tower-station kioworker[38015]: qt.imageformats.tiff: "TIFF directory is missing required \"StripOffsets\" field"
mag 03 14:16:02 home-tower-station kioworker[38015]: qt.imageformats.tiff: "TIFF directory is missing required \"StripOffsets\" field"
mag 03 14:16:02 home-tower-station kioworker[38015]: qt.imageformats.tiff: "TIFF directory is missing required \"StripOffsets\" field"
mag 03 14:16:02 home-tower-station kioworker[38015]: qt.imageformats.tiff: "TIFF directory is missing required \"StripOffsets\" field"
mag 03 14:16:22 home-tower-station dbus-broker-launch[976]: Activation request for 'org.freedesktop.Avahi' failed.
mag 03 14:16:22 home-tower-station plasmashell[36388]: qml: '/usr/share/plasma/plasmoids/luisbocanegra.panel.colorizer/contents/ui/tools/gdbus_get_signal.sh' session luisbocanegr>
mag 03 14:16:22 home-tower-station plasmashell[36388]: qml: '/usr/share/plasma/plasmoids/luisbocanegra.panel.colorizer/contents/ui/tools/gdbus_get_signal.sh' session luisbocanegr>
mag 03 14:16:22 home-tower-station plasmashell[36388]: qml: python3 '/usr/share/plasma/plasmoids/luisbocanegra.panel.colorizer/contents/ui/tools/service.py' 3 11 15 1
mag 03 14:16:22 home-tower-station plasmashell[36388]: qml: '/usr/share/plasma/plasmoids/luisbocanegra.panel.colorizer/contents/ui/tools/gdbus_get_signal.sh' session luisbocanegr>
mag 03 14:16:22 home-tower-station plasmashell[36388]: qml: '/usr/share/plasma/plasmoids/luisbocanegra.panel.colorizer/contents/ui/tools/gdbus_get_signal.sh' session luisbocanegr>
mag 03 14:16:22 home-tower-station plasmashell[36388]: qml: python3 '/usr/share/plasma/plasmoids/luisbocanegra.panel.colorizer/contents/ui/tools/service.py' 25 26 15 1
mag 03 14:16:22 home-tower-station dbus-broker-launch[36202]: Activation request for 'org.freedesktop.portal.Desktop' failed.
mag 03 14:17:02 home-tower-station dbus-broker-launch[976]: Activation request for 'org.freedesktop.nm_dispatcher' failed.
-- Boot 270eb1fde6b64d689a1b56431ab8189e --
mag 03 14:20:06 home-tower-station kernel: x86/cpu: SGX disabled or unsupported by BIOS.
mag 03 14:20:09 home-tower-station kernel:
mag 03 14:20:11 home-tower-station systemd-sslh-generator: Configuration directory '/etc/sslh/' does not exist! No units generated.
mag 03 14:20:11 home-tower-station systemd-udevd[671]: /usr/lib/udev/rules.d/75-davincipanel.rules:2 Unknown group 'resolve', ignoring.
mag 03 14:20:12 home-tower-station systemd-tmpfiles[958]: Failed to write file "/sys/module/pcie_aspm/parameters/policy": Operation not permitted
mag 03 14:20:13 home-tower-station kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
mag 03 14:20:13 home-tower-station kernel: ucsi_ccg 5-0008: i2c_transfer failed -110
mag 03 14:20:13 home-tower-station kernel: ucsi_ccg 5-0008: ucsi_ccg_init failed - -110
mag 03 14:20:13 home-tower-station kernel: ucsi_ccg 5-0008: probe with driver ucsi_ccg failed with error -110
mag 03 14:20:21 home-tower-station 30-systemd-environment-d-generator[3334]: /home/thomas/.config/environment.d/firefox.conf:2: invalid variable name "env MOZ_USE_XINPUT2", ignor>
mag 03 14:20:21 home-tower-station 30-systemd-environment-d-generator[3614]: /home/thomas/.config/environment.d/firefox.conf:2: invalid variable name "env MOZ_USE_XINPUT2", ignor>
mag 03 14:20:25 home-tower-station plasmashell[3796]: aurorae: Couldn't find QML Decoration  ""
mag 03 14:20:27 home-tower-station org_kde_powerdevil[3840]: [  3840] Error detecting VCP version using VCP feature xDF: Error_Info[DDCRC_RETRIES in ddc_write_read_with_retry, ca>
mag 03 14:20:27 home-tower-station org_kde_powerdevil[3840]: [  3840] Error(s) opening ddc devices
mag 03 14:20:27 home-tower-station org_kde_powerdevil[3840]: [  3840] Error OK(0): success opening /dev/i2c-5
mag 03 14:20:38 home-tower-station sudo[5453]:   thomas : a password is required ; TTY=pts/0 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 03 14:42:58 home-tower-station dbus-broker-launch[967]: Activation request for 'org.freedesktop.Avahi' failed.
mag 03 14:42:58 home-tower-station dbus-broker-launch[3611]: Activation request for 'org.freedesktop.portal.Desktop' failed.
mag 03 14:42:58 home-tower-station plasmashell[3796]: qml: '/usr/share/plasma/plasmoids/luisbocanegra.panel.colorizer/contents/ui/tools/gdbus_get_signal.sh' session luisbocanegra>
...skipping...
mag 04 20:44:34 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:35 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:35 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:35 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:35 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:35 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:35 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:36 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:36 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:36 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:36 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:36 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:36 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:37 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:37 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:37 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:37 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:37 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:37 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:38 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:38 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:38 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:38 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:38 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:38 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 04 20:44:39 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 04 20:44:39 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 04 20:44:39 home-tower-station kwin_wayland[3678]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
-- Boot dc52f84b69e14535b44fb676ed82c2a9 --
mag 04 20:45:03 home-tower-station kernel:
mag 04 20:45:05 home-tower-station systemd-sslh-generator: Configuration directory '/etc/sslh/' does not exist! No units generated.
mag 04 20:45:05 home-tower-station systemd-udevd[676]: /usr/lib/udev/rules.d/75-davincipanel.rules:2 Unknown group 'resolve', ignoring.
mag 04 20:45:06 home-tower-station systemd-tmpfiles[963]: Failed to write file "/sys/module/pcie_aspm/parameters/policy": Operation not permitted
mag 04 20:45:07 home-tower-station kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
mag 04 20:45:07 home-tower-station kernel: ucsi_ccg 5-0008: i2c_transfer failed -110
mag 04 20:45:07 home-tower-station kernel: ucsi_ccg 5-0008: ucsi_ccg_init failed - -110
mag 04 20:45:07 home-tower-station kernel: ucsi_ccg 5-0008: probe with driver ucsi_ccg failed with error -110
mag 04 20:45:15 home-tower-station 30-systemd-environment-d-generator[3363]: /home/thomas/.config/environment.d/firefox.conf:2: invalid variable name "env MOZ_USE_XINPUT2", ignor>
mag 04 20:45:16 home-tower-station 30-systemd-environment-d-generator[3661]: /home/thomas/.config/environment.d/firefox.conf:2: invalid variable name "env MOZ_USE_XINPUT2", ignor>
mag 04 20:45:19 home-tower-station wireplumber[3774]: spa.alsa: can't open control for card hw:3: No such file or directory
mag 04 20:45:20 home-tower-station plasmashell[3844]: aurorae: Couldn't find QML Decoration  ""
mag 04 20:45:22 home-tower-station org_kde_powerdevil[3888]: [  3888] Error detecting VCP version using VCP feature xDF: Error_Info[DDCRC_RETRIES in ddc_write_read_with_retry, ca>
mag 04 20:45:22 home-tower-station org_kde_powerdevil[3888]: [  3888] Error(s) opening ddc devices
mag 04 20:45:22 home-tower-station org_kde_powerdevil[3888]: [  3888] Error OK(0): success opening /dev/i2c-5
mag 04 20:45:25 home-tower-station kwin_wayland[3672]: kwin_scene_opengl: Invalid framebuffer status:  "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT"
mag 04 20:46:09 home-tower-station sudo[7126]:   thomas : a password is required ; TTY=pts/0 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 04 21:03:29 home-tower-station sudo[31560]:   thomas : a password is required ; TTY=pts/0 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 04 21:07:59 home-tower-station sudo[37754]:   thomas : a password is required ; TTY=pts/0 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 04 21:08:01 home-tower-station sudo[37819]:   thomas : a password is required ; TTY=pts/0 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 04 21:11:12 home-tower-station sudo[43161]:   thomas : a password is required ; TTY=pts/1 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 04 21:11:19 home-tower-station sudo[43463]:   thomas : a password is required ; TTY=pts/1 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 04 21:32:25 home-tower-station sudo[73240]:   thomas : a password is required ; TTY=pts/2 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true

First, please update the BIOS.

After that, check for the right settings inside the BIOS (fast boot and secure boot off), also disable Fast Startup inside Windows, and every time you start Windows, please shut it down fully after use. No warm boot from Windows into Linux. The reverse is no problem.

You know these drives are NTFS partitions. Permanent use of NTFS is a no-go on Linux.
And of course, after every hard reboot the NTFS partitions get flagged as corrupt.
If you test a game, install it on an ext4 or btrfs partition. Never run a game that is installed on an NTFS partition. Disable automounting of the NTFS drives (in Plasma System Settings, plus no entries for these drives in /etc/fstab, whether added by hand or via GParted or Partition Manager).
You described it yourself: “since I’m still testing the environment”.
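To check the /etc/fstab part of the advice above, something along these lines might help. It is a sketch under assumptions: `ntfs_fstab_entries` is a hypothetical helper name, and it treats any filesystem type starting with `ntfs` (`ntfs`, `ntfs3`, `ntfs-3g`) as an NTFS mount.

```shell
# List active (non-comment) fstab entries whose filesystem type is NTFS.
# $1: path to an fstab-style file; defaults to /etc/fstab.
ntfs_fstab_entries() {
    # Field 1 = device, field 2 = mount point, field 3 = filesystem type.
    awk '$1 !~ /^#/ && $3 ~ /^ntfs/ { print $1, $2, $3 }' "${1:-/etc/fstab}"
}
```

If `ntfs_fstab_entries` prints nothing, no NTFS partition is mounted through fstab. `lsblk -f` shows the filesystem type of every partition, which helps to tell which drives are NTFS in the first place.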

BIOS update → install the game inside your home folder → play → result?

Log messages → most are just informational and/or harmless.
A variety of errors can happen for all sorts of reasons (e.g. racy behaviour during multi-threaded startup, leading to a few things being initialized “too soon”).
Most of your error messages were generated by the hard reboot.
But take this one, for example:

Perhaps you can find this option inside the BIOS; if so, please enable it.
It is used for encryption, isolation and verification (CPU features), and that is one more reason to
keep your BIOS up to date.
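To see which firmware version is installed before and after an update, the values the kernel exposes under /sys/class/dmi/id can be read without root. A minimal sketch, with `bios_info` as a hypothetical helper name:

```shell
# Print the firmware version and release date from the kernel's DMI interface.
# The optional directory argument exists only so the function can be pointed
# at test data; by default it reads /sys/class/dmi/id.
bios_info() {
    dir="${1:-/sys/class/dmi/id}"
    printf 'version: %s\n' "$(cat "$dir/bios_version")"
    printf 'date: %s\n' "$(cat "$dir/bios_date")"
}
```

Running `bios_info` with no argument should agree with the UEFI line of the inxi output (v: 1302, date: 09/02/2019 in the first post), and the values should change once the update has been applied.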

Do not downgrade the driver. You are using the 6.14.4-zen kernel with the correct driver for your hardware.

Not at the moment. Once your system is running without issues, then of course this is an option (in my opinion).

One question from my side, too:

Which monitor is used?


First of all, thank you for your time and the very thorough reply.

I hadn’t realised my BIOS version was that old; I’m updating it right now.

I already have these settings in place. I’ll double-check Fast Startup in Windows.

Good to know, tysm.

The second and third freezes happened while running programs only on the ext4 Linux system drive. I will disable automount for the NTFS drives as well and see if that changes anything.

I have a two-monitor setup:
Asus VG289 (3840x2160, 60 Hz)
Dell S2240L (1920x1080, 60 Hz)
Both are plugged directly into the GPU, the first one HDMI-to-HDMI, the second one HDMI-to-DisplayPort.

The game freezes happened with the game on the Asus monitor; the Netflix freeze happened with Netflix on the Dell monitor and another (static) browser tab on the Asus monitor.

EDIT:
As I was writing this (on my laptop) I turned on my Linux system, and when I came back to it, it was frozen.

Update:

I’ve just updated my BIOS.
Everything went fine.
I went into the BIOS settings before booting anything and checked that everything was in order, disabling secure boot and fast boot again.
I booted into Windows first and then shut down.

I turned on the machine and booted into Linux.
The text is now much, much smaller on the screen when the “loading linux” line is displayed.

When entering my password, there was quite some input lag (half a second or so).
After pressing Enter, the screen froze, then the monitors turned off (not on with a black screen, but receiving no signal and entering power-save mode).

I waited a couple of minutes and had to hard reboot.
Now I’ve logged into the system and everything went fine.

I will edit this post to add the critical errors log:


mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:4:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:5:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:7:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57d:0:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:1:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:2:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:3:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:4:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:5:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:7:0:0x0000000f
mag 05 12:59:18 home-tower-station sddm[1612]: Failed to read display number from pipe
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:18 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:22 home-tower-station wireplumber[5947]: spa.alsa: Mapping hdmi-stereo-extra1: snd_pcm_info() failed Permission denied:
mag 05 12:59:22 home-tower-station wireplumber[5947]: spa.alsa: can't open control for card hw:1: No such file or directory
mag 05 12:59:22 home-tower-station wireplumber[5947]: spa.alsa: Card can't get card_name from card_index 3
mag 05 12:59:22 home-tower-station sddm[1612]: Failed to read display number from pipe
mag 05 12:59:22 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:22 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:22 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:24 home-tower-station sddm[1612]: Failed to read display number from pipe
mag 05 12:59:24 home-tower-station sddm[1612]: Could not start Display server on vt 2
mag 05 12:59:24 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:24 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:24 home-tower-station kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
mag 05 12:59:24 home-tower-station systemd[3330]: Failed to start KDE Plasma Workspace.
-- Boot 18fb5ee4067c4f24bf200a0f25bff269 --
mag 05 13:01:37 home-tower-station kernel: x86/cpu: VMX (outside TXT) disabled by BIOS
mag 05 13:01:37 home-tower-station kernel: x86/cpu: SGX disabled or unsupported by BIOS.
mag 05 13:01:40 home-tower-station kernel:
mag 05 13:01:42 home-tower-station systemd-sslh-generator: Configuration directory '/etc/sslh/' does not exist! No units generated.
mag 05 13:01:42 home-tower-station systemd-udevd[675]: /usr/lib/udev/rules.d/75-davincipanel.rules:2 Unknown group 'resolve', ignoring.
mag 05 13:01:43 home-tower-station systemd-tmpfiles[953]: Failed to write file "/sys/module/pcie_aspm/parameters/policy": Operation not permitted
mag 05 13:01:44 home-tower-station kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
mag 05 13:01:44 home-tower-station kernel: ucsi_ccg 5-0008: i2c_transfer failed -110
mag 05 13:01:44 home-tower-station kernel: ucsi_ccg 5-0008: ucsi_ccg_init failed - -110
mag 05 13:01:44 home-tower-station kernel: ucsi_ccg 5-0008: probe with driver ucsi_ccg failed with error -110
mag 05 13:02:20 home-tower-station 30-systemd-environment-d-generator[4313]: /home/thomas/.config/environment.d/firefox.conf:2: invalid variable name "env MOZ_USE_XINPUT2", ignor>
mag 05 13:02:20 home-tower-station 30-systemd-environment-d-generator[4610]: /home/thomas/.config/environment.d/firefox.conf:2: invalid variable name "env MOZ_USE_XINPUT2", ignor>
mag 05 13:02:24 home-tower-station plasmashell[4791]: aurorae: Couldn't find QML Decoration  ""
mag 05 13:02:27 home-tower-station org_kde_powerdevil[4835]: [  4835] busno=3, sleep-multiplier= 2,00, Testing for unsupported feature 0xdd returned Error_Info[DDCRC_RETRIES in d>
mag 05 13:02:27 home-tower-station org_kde_powerdevil[4835]: [  4835] Turning off dynamic sleep and retrying
mag 05 13:02:27 home-tower-station org_kde_powerdevil[4835]: [  4835] busno=3, sleep-multiplier = 1,00, Retesting for unsupported feature 0xdd returned Error_Info[DDCRC_REPORTED_>
mag 05 13:02:28 home-tower-station org_kde_powerdevil[4835]: [  4835] Error(s) opening ddc devices
mag 05 13:02:28 home-tower-station org_kde_powerdevil[4835]: [  4835] Error OK(0): success opening /dev/i2c-5
mag 05 13:02:30 home-tower-station sudo[5883]:   thomas : a password is required ; TTY=pts/0 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 05 13:02:31 home-tower-station kwin_wayland[4638]: kwin_scene_opengl: Invalid framebuffer status:  "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT"
mag 05 13:02:44 home-tower-station sudo[7010]:   thomas : a password is required ; TTY=pts/1 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true

You can ignore the SGX message: SGX is no longer implemented in many BIOSes, and where it is, it is often faulty. The kernel has printed this message for years.

The ucsi_ccg / i2c timeout errors are just an attempt to initialize a non-existent USB port on the NVIDIA GPU. You can ignore them, as well as the rest.

These error messages are the most important.

It looks like the wrong driver is installed. The recommended driver for Turing GPUs is nvidia-open-dkms. Install it with:

sudo pacman -S nvidia-open-dkms egl-wayland lib32-nvidia-utils lib32-opencl-nvidia nvidia-settings opencl-nvidia nvidia-utils

If there are conflicts during installation/replacement, uninstall the nvidia-dkms drivers (including garuda-nvidia-config, if present) first.
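
After the swap and a reboot, it may be worth confirming which kernel module flavor actually got loaded. A minimal sketch, assuming the driver exposes its version banner at /proc/driver/nvidia/version and that the open modules identify themselves with the words "Open Kernel Module" in that banner (both are assumptions on my part, not something stated above); nvidia_flavor is a made-up helper name, not part of any package:

```shell
#!/bin/sh
# nvidia_flavor: hypothetical helper, not shipped by any package.
# Reads the NVIDIA version banner (default: /proc/driver/nvidia/version)
# and prints "open", "proprietary", or "not-loaded".
nvidia_flavor() {
    banner_file="${1:-/proc/driver/nvidia/version}"
    if [ ! -r "$banner_file" ]; then
        echo "not-loaded"
        return 1
    fi
    # Assumption: the open modules identify themselves as "Open Kernel Module".
    if grep -q "Open Kernel Module" "$banner_file"; then
        echo "open"
    else
        echo "proprietary"
    fi
}

# List installed packages with "nvidia" in the name (only where pacman exists),
# then report which module flavor is currently loaded.
if command -v pacman >/dev/null 2>&1; then
    pacman -Qq | grep -i nvidia || true
fi
nvidia_flavor || true
```

If this prints "proprietary" after installing nvidia-open-dkms, the old module is probably still in use (for example because the system was not rebooted or the DKMS build failed).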

After updating the BIOS, I did this. I had to uninstall both garuda-nvidia-config and nvidia-dkms manually; after that, the correct driver installed without problems.
I will run some tests and post updates to see whether this fixes it.

Thank you so much for the help :)

Update: I left the computer idling with Netflix playing for the last 3-4 hours (just to see at a glance whether it would freeze, and to replicate past conditions).
I got back to it about 20 minutes ago, closed Netflix, and got another freeze.
I hadn’t disabled auto-mount for the NTFS drives yet, because I want to proceed step by step and rule out one thing at a time. The error journal from 16:08 to 16:37 is almost 340,000 characters long, so I can’t post it here in full. I’ll extract what I think could be relevant, adding some comments (the original file was over 3000 lines):

mag 05 16:08:12 home-tower-station kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
mag 05 16:08:12 home-tower-station kernel: CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Tainted: G           OE      6.14.4-zen1-2-zen #1 5ebf8709a7a4a4d9e2b75723b74cfc48d75c3151
mag 05 16:08:12 home-tower-station kernel: Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
mag 05 16:08:12 home-tower-station kernel: Hardware name: ASUS System Product Name/ROG STRIX Z390-F GAMING, BIOS 2102 05/03/2024
mag 05 16:08:12 home-tower-station kernel: Call Trace:
mag 05 16:08:12 home-tower-station kernel:  <IRQ>
mag 05 16:08:12 home-tower-station kernel:  dump_stack_lvl+0x5d/0x80
mag 05 16:08:12 home-tower-station kernel:  __report_bad_irq+0x35/0xa7
mag 05 16:08:12 home-tower-station kernel:  note_interrupt.cold+0xb/0x66
mag 05 16:08:12 home-tower-station kernel:  handle_irq_event+0x72/0x90
mag 05 16:08:12 home-tower-station kernel:  handle_fasteoi_irq+0xa3/0x200
mag 05 16:08:12 home-tower-station kernel:  __common_interrupt+0x3e/0xa0
mag 05 16:08:12 home-tower-station kernel:  common_interrupt+0x80/0xa0
mag 05 16:08:12 home-tower-station kernel:  </IRQ>
mag 05 16:08:12 home-tower-station kernel:  <TASK>
mag 05 16:08:12 home-tower-station kernel:  asm_common_interrupt+0x26/0x40
mag 05 16:08:12 home-tower-station kernel: RIP: 0010:cpuidle_enter_state+0xc2/0x7f0
mag 05 16:08:12 home-tower-station kernel: Code: 00 00 e8 f1 50 ec fe e8 dc f0 ff ff 49 89 c4 0f 1f 44 00 00 31 ff e8 8d 77 ea fe 45 84 ff 0f 85 d6 04 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 8c 02 00 00 49 63 f6 4c 89 e2 48 2b 14 24 48 6b ce
mag 05 16:08:12 home-tower-station kernel: RSP: 0018:ffffa646c0157e78 EFLAGS: 00000246
mag 05 16:08:12 home-tower-station kernel: RAX: ffff8d8d0dc00000 RBX: 0000000000000001 RCX: 0000000000000000
mag 05 16:08:12 home-tower-station kernel: RDX: 000006a08d097883 RSI: 000003a6f8d4397f RDI: 0000000000000000
mag 05 16:08:12 home-tower-station kernel: RBP: ffff8d8d0dc41878 R08: 0000000000000002 R09: 0000000000000005
mag 05 16:08:12 home-tower-station kernel: R10: 0000000000000006 R11: 0000000000000005 R12: 000006a08d097883
mag 05 16:08:12 home-tower-station kernel: R13: ffffffffbb9e93e0 R14: 0000000000000001 R15: 0000000000000000
mag 05 16:08:12 home-tower-station kernel:  ? cpuidle_enter_state+0xb3/0x7f0
mag 05 16:08:12 home-tower-station kernel:  cpuidle_enter+0x31/0x50
mag 05 16:08:12 home-tower-station kernel:  do_idle+0x1b3/0x210
mag 05 16:08:12 home-tower-station kernel:  cpu_startup_entry+0x29/0x30
mag 05 16:08:12 home-tower-station kernel:  start_secondary+0x11e/0x140
mag 05 16:08:12 home-tower-station kernel:  common_startup_64+0x13e/0x141
mag 05 16:08:12 home-tower-station kernel:  </TASK>
mag 05 16:08:12 home-tower-station kernel: handlers:
mag 05 16:08:12 home-tower-station kernel: [<0000000004182682>] i801_isr [i2c_i801]
mag 05 16:08:12 home-tower-station kernel: Disabling IRQ #16
mag 05 16:08:15 home-tower-station kernel: NVRM: GPU at PCI:0000:01:00: GPU-ba4e5f60-83e9-1b80-9112-23fd91f76e5d
mag 05 16:08:15 home-tower-station kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=3707, name=kwin_wayland, GPU has fallen off the bus.
mag 05 16:08:15 home-tower-station kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
mag 05 16:08:15 home-tower-station kernel: NVRM: kgspRcAndNotifyAllChannels_IMPL: RC all channels for critical error 79.
mag 05 16:08:15 home-tower-station kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Requested object not found [NV_ERR_OBJECT_NOT_FOUND] (0x00000057) returned from krcErrorWriteNotifier_HAL(pGpu, pKernelRc, pKernelChannel, exceptType, localRmEngineType, 0xffff , &flushFlags) @ kernel_rc_notification.c:329
mag 05 16:08:15 home-tower-station kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Requested object not found [NV_ERR_OBJECT_NOT_FOUND] (0x00000057) returned from krcErrorSetNotifier(pGpu, pKernelRc, pKernelChannel, exceptType, kchannelGetEngineType(pKernelChannel), RC_NOTIFIER_SCOPE_CHANNEL) @ kernel_gsp.c:710
// [...] //
mag 05 16:08:15 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 78!
mag 05 16:08:15 home-tower-station kernel: NVRM: nvCheckOkFailedNoLog: Check failed: GPU lost from the bus [NV_ERR_GPU_IS_LOST] (0x0000000F) returned from nvdEngineDumpCallbackHelper(pGpu, pPrbEnc, pNvDumpState,

// these last lines repeat a few times //

mag 05 16:08:16 home-tower-station kernel: NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d0006f; hObject=0xbeef0403; paramsStatus=0x00000000; status=0x0000000f
mag 05 16:08:16 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 10!
mag 05 16:08:16 home-tower-station kernel: NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d0006f; hObject=0xbeef4903; paramsStatus=0x00000000; status=0x0000000f
mag 05 16:08:16 home-tower-station kernel: NVRM: nvAssertFailedNoLog: Assertion failed: status == NV_OK @ rs_client.c:843
mag 05 16:08:16 home-tower-station kernel: NVRM: nvAssertFailedNoLog: Assertion failed: status == NV_OK @ rs_server.c:259
mag 05 16:08:16 home-tower-station kernel: NVRM: nvAssertFailedNoLog: Assertion failed: status == NV_OK @ rs_server.c:1375
mag 05 16:08:16 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 10!

// Also looping several times //

mag 05 16:08:16 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 76!

// repeating again //

mag 05 16:08:17 home-tower-station kwin_wayland[3707]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 05 16:08:17 home-tower-station kwin_wayland[3707]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux
mag 05 16:08:17 home-tower-station kwin_wayland[3707]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'
mag 05 16:08:18 home-tower-station systemd-coredump[186376]: Process 4114 (firedragon) of user 1000 dumped core.

                                                             Stack trace of thread 4544:
                                                             #0  0x0000795adb4a774c n/a (libc.so.6 + 0x9774c)
                                                             #1  0x0000795adb44ddc0 raise (libc.so.6 + 0x3ddc0)
                                                             #2  0x0000795ad38ac727 n/a (libxul.so + 0x6cac727)
                                                             #3  0x0000795ad43106f6 n/a (libxul.so + 0x77106f6)
                                                             #4  0x0000795adb44def0 n/a (libc.so.6 + 0x3def0)
                                                             #5  0x0000795ab7b16630 n/a (libnvidia-eglcore.so.570.144 + 0xb16630)
                                                             #6  0x0000795ab7b168b6 n/a (libnvidia-eglcore.so.570.144 + 0xb168b6)
                                                             #7  0x0000795ab76eabbd n/a (libnvidia-eglcore.so.570.144 + 0x6eabbd)
                                                             #8  0x0000795ab76f4d1f n/a (libnvidia-eglcore.so.570.144 + 0x6f4d1f)
                                                             #9  0x0000795ab77ffaaa n/a (libnvidia-eglcore.so.570.144 + 0x7ffaaa)
                                                             #10 0x0000795ab780090f n/a (libnvidia-eglcore.so.570.144 + 0x80090f)
                                                             #11 0x0000795ab7800b2e n/a (libnvidia-eglcore.so.570.144 + 0x800b2e)
                                                             #12 0x0000795ad4b4f123 n/a (libxul.so + 0x7f4f123)
                                                             #13 0x0000795ad49cd190 n/a (libxul.so + 0x7dcd190)
                                                             #14 0x0000795ad49ccb6c n/a (libxul.so + 0x7dccb6c)
                                                             #15 0x0000795ad49ccf04 n/a (libxul.so + 0x7dccf04)
                                                             #16 0x0000795ad4a72448 n/a (libxul.so + 0x7e72448)
                                                             #17 0x0000795ad4a5e51c n/a (libxul.so + 0x7e5e51c)
                                                             #18 0x0000795ad4a5d8e6 n/a (libxul.so + 0x7e5d8e6)
                                                             #19 0x0000795ad4868d8f n/a (libxul.so + 0x7c68d8f)
                                                             #20 0x0000795ad0c95907 n/a (libxul.so + 0x4095907)
                                                             #21 0x0000795ad0c95170 n/a (libxul.so + 0x4095170)
                                                             #22 0x0000795ad0c94ab0 n/a (libxul.so + 0x4094ab0)
                                                             #23 0x0000795ad0c98144 n/a (libxul.so + 0x4098144)
                                                             #24 0x0000795acfe3a64b n/a (libxul.so + 0x323a64b)
                                                             #25 0x0000795ad0508eba n/a (libxul.so + 0x3908eba)
                                                             #26 0x0000795acfe464c6 n/a (libxul.so + 0x32464c6)
                                                             #27 0x0000795ada6a968f n/a (libnspr4.so + 0x2a68f)
                                                             #28 0x0000795adb4a57eb n/a (libc.so.6 + 0x957eb)
                                                             #29 0x0000795adb52918c n/a (libc.so.6 + 0x11918c)

                                                             Stack trace of thread 4114:
                                                             #0  0x0000795adb4ade22 n/a (libc.so.6 + 0x9de22)
                                                             #1  0x0000795adb4a1fda n/a (libc.so.6 + 0x91fda)
                                                             #2  0x0000795adb4a2024 n/a (libc.so.6 + 0x92024)
                                                             #3  0x0000795adb51c05e __poll (libc.so.6 + 0x10c05e)
                                                             #4  0x0000795ad2e874a5 n/a (libxul.so + 0x62874a5)
                                                             #5  0x0000795ad9369dde n/a (libglib-2.0.so.0 + 0xc1dde)
                                                             #6  0x0000795ad9305615 g_main_context_iteration (libglib-2.0.so.0 + 0x5d615)
                                                             #7  0x0000795acfe39398 n/a (libxul.so + 0x3239398)
                                                             #8  0x0000795ad05085a6 n/a (libxul.so + 0x39085a6)
                                                             #9  0x0000795ad2df5f0e n/a (libxul.so + 0x61f5f0e)
                                                             #10 0x0000795ad2e865fe n/a (libxul.so + 0x62865fe)
                                                             #11 0x0000795ad37f12f5 n/a (libxul.so + 0x6bf12f5)
                                                             #12 0x0000795ad38c0486 n/a (libxul.so + 0x6cc0486)
                                                             #13 0x0000795ad38c0deb n/a (libxul.so + 0x6cc0deb)
                                                             #14 0x0000795ad38c1548 n/a (libxul.so + 0x6cc1548)
                                                             #15 0x0000627b915562ba n/a (/usr/lib/firedragon/firedragon + 0x1f2ba)
                                                             #16 0x0000795adb4376b5 n/a (libc.so.6 + 0x276b5)
                                                             #17 0x0000795adb437769 __libc_start_main (libc.so.6 + 0x27769)
                                                             #18 0x0000627b91555ad5 _start (/usr/lib/firedragon/firedragon + 0x1ead5)

                                                             // this bit goes on for very many lines, not a loop //

// [...] //

mag 05 16:08:18 home-tower-station kernel:  dm_region_hash dm_log dm_mod iscsi_tcp nvidia(OE) libiscsi_tcp libiscsi scsi_transport_iscsi i2c_dev crypto_user nfnetlink
mag 05 16:08:18 home-tower-station kernel: CPU: 3 UID: 1000 PID: 4114 Comm: firedragon Tainted: G        W  OE      6.14.4-zen1-2-zen #1 5ebf8709a7a4a4d9e2b75723b74cfc48d75c3151
mag 05 16:08:18 home-tower-station kernel: Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
mag 05 16:08:18 home-tower-station kernel: Hardware name: ASUS System Product Name/ROG STRIX Z390-F GAMING, BIOS 2102 05/03/2024
mag 05 16:08:18 home-tower-station kernel: RIP: 0010:nvidia_close+0x396/0x3a0 [nvidia]
mag 05 16:08:18 home-tower-station kernel: Code: bb d8 00 00 00 e9 fa fd ff ff 48 89 ef e8 92 dc ff ff 48 c7 c7 20 dc 75 c0 e8 d6 46 2d f9 e9 0e fd ff ff 0f 0b e9 07 fd ff ff <0f> 0b e9 68 fe ff ff 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90
mag 05 16:08:18 home-tower-station kernel: RSP: 0018:ffffa646c6be7918 EFLAGS: 00010202
mag 05 16:08:18 home-tower-station kernel: RAX: 0000000000000026 RBX: ffff8d8647734800 RCX: 0000000000000000
mag 05 16:08:18 home-tower-station kernel: RDX: ffffa646c6be7888 RSI: 0000000000000286 RDI: ffffa646c6be7848
mag 05 16:08:18 home-tower-station kernel: RBP: ffff8d85d5fc86a8 R08: ffffa646c6be7888 R09: ffffffffc0761f98
mag 05 16:08:18 home-tower-station kernel: R10: ffffa646c6be78b8 R11: ffffcbd9472d6ec0 R12: ffff8d86ea1529c0
mag 05 16:08:18 home-tower-station kernel: R13: 0000000000000000 R14: ffff8d85d5fc8000 R15: 0000000000000000
mag 05 16:08:18 home-tower-station kernel: FS:  0000000000000000(0000) GS:ffff8d8d0db80000(0000) knlGS:0000000000000000
mag 05 16:08:18 home-tower-station kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
mag 05 16:08:18 home-tower-station kernel: CR2: 00005e6816100ac7 CR3: 0000000421a22006 CR4: 00000000003706f0
mag 05 16:08:18 home-tower-station kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
mag 05 16:08:18 home-tower-station kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
mag 05 16:08:18 home-tower-station kernel: Call Trace:
mag 05 16:08:18 home-tower-station kernel:  <TASK>
mag 05 16:08:18 home-tower-station kernel:  __fput+0xe2/0x2b0
mag 05 16:08:18 home-tower-station kernel:  task_work_run+0x5a/0x90
mag 05 16:08:18 home-tower-station kernel:  do_exit+0x316/0xb70
mag 05 16:08:18 home-tower-station kernel:  do_group_exit+0x2d/0xc0
mag 05 16:08:18 home-tower-station kernel:  ? _raw_spin_lock_irq+0x2f/0x40
mag 05 16:08:18 home-tower-station kernel:  get_signal+0x930/0x930
mag 05 16:08:18 home-tower-station kernel:  arch_do_signal_or_restart+0x40/0x280
mag 05 16:08:18 home-tower-station kernel:  syscall_exit_to_user_mode+0x156/0x210
mag 05 16:08:18 home-tower-station kernel:  do_syscall_64+0x87/0x190
mag 05 16:08:18 home-tower-station kernel:  ? __x64_sys_poll+0xc6/0x190
mag 05 16:08:18 home-tower-station kernel:  ? syscall_exit_to_user_mode+0x10/0x210
mag 05 16:08:18 home-tower-station kernel:  ? do_syscall_64+0x87/0x190
mag 05 16:08:18 home-tower-station kernel:  ? __x64_sys_poll+0xc6/0x190
mag 05 16:08:18 home-tower-station kernel:  ? do_syscall_64+0x87/0x190
mag 05 16:08:18 home-tower-station kernel:  ? syscall_exit_to_user_mode+0x10/0x210
mag 05 16:08:18 home-tower-station kernel:  ? do_syscall_64+0x87/0x190
mag 05 16:08:18 home-tower-station kernel:  ? syscall_exit_to_user_mode+0x10/0x210
mag 05 16:08:18 home-tower-station kernel:  ? do_syscall_64+0x87/0x190
mag 05 16:08:18 home-tower-station kernel:  ? syscall_exit_to_user_mode+0x10/0x210
mag 05 16:08:18 home-tower-station kernel:  ? do_syscall_64+0x87/0x190
mag 05 16:08:18 home-tower-station kernel:  ? rcu_core+0x1a5/0x3e0
mag 05 16:08:18 home-tower-station kernel:  ? sched_clock+0x10/0x30
mag 05 16:08:18 home-tower-station kernel:  ? sched_clock_cpu+0xb/0x30
mag 05 16:08:18 home-tower-station kernel:  ? irqtime_account_irq+0x3c/0xc0
mag 05 16:08:18 home-tower-station kernel:  ? handle_softirqs+0x192/0x2a0
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
mag 05 16:08:18 home-tower-station kernel: RIP: 0033:0x795adb4ade22
mag 05 16:08:18 home-tower-station kernel: Code: Unable to access opcode bytes at 0x795adb4addf8.
mag 05 16:08:18 home-tower-station kernel: RSP: 002b:00007fff99ce6ec8 EFLAGS: 00000246 ORIG_RAX: 0000000000000007
mag 05 16:08:18 home-tower-station kernel: RAX: fffffffffffffdfc RBX: 0000795adb2aae00 RCX: 0000795adb4ade22
mag 05 16:08:18 home-tower-station kernel: RDX: 00000000000003e2 RSI: 0000000000000004 RDI: 0000795aa7deeca0
mag 05 16:08:18 home-tower-station kernel: RBP: 00007fff99ce6f00 R08: 0000000000000000 R09: 0000000000000000
mag 05 16:08:18 home-tower-station kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
mag 05 16:08:18 home-tower-station kernel: R13: 0000795adb279c80 R14: 0000795aa7deeca0 R15: 00000000000003e2
mag 05 16:08:18 home-tower-station kernel:  </TASK>
mag 05 16:08:18 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 10!

// this section repeats a couple of times //

mag 05 16:08:18 home-tower-station kernel: NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d0006d; hObject=0xfade0002; paramsStatus=0x00000000; status=0x0000000f
mag 05 16:08:18 home-tower-station kernel: NVRM: nvAssertFailedNoLog: Assertion failed: NV_OK == status @ vaspace_api.c:538
mag 05 16:08:18 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 10!

// this bit loops for many lines //

mag 05 16:08:18 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 76!

// also loops //

mag 05 16:08:18 home-tower-station kernel: NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d0007f; hObject=0xbeef9097; paramsStatus=0x00000000; status=0x0000000f
mag 05 16:08:18 home-tower-station kernel: NVRM: nvAssertFailedNoLog: Assertion failed: status == NV_OK @ rs_client.c:843

// loop //

mag 05 16:08:18 home-tower-station kernel:  intel_powerclamp snd_hda_codec_hdmi ac97_bus snd_pcm_dmaengine coretemp rapl uvcvideo asus_wmi intel_cstate snd_hda_intel platform_profile snd_usb_audio videobuf2_vmalloc snd_intel_dspcfg i8042 intel_uncore snd_usbmidi_lib uvc sparse_keymap videobuf2_memops videobuf2_v4l2 snd_ump snd_intel_sdw_acpi videobuf2_common snd_hda_codec serio snd_rawmidi rfkill videodev snd_hda_core snd_seq_device i2c_i801 mc mxm_wmi snd_hwdep wmi_bmof spi_intel_pci snd_pcm i2c_smbus mousedev i2c_mux spi_intel joydev snd_timer snd i2c_nvidia_gpu soundcore intel_pmc_core pmt_telemetry pmt_class intel_vsec acpi_pad acpi_tad mac_hid loop zram 842_decompress 842_compress lz4hc_compress lz4_compress ip_tables x_tables hid_corsair hid_generic usbhid polyval_clmulni polyval_generic ghash_clmulni_intel nvidia_drm(OE) sha512_ssse3 sha256_ssse3 nvme nvidia_modeset(OE) sha1_ssse3 aesni_intel e1000e crypto_simd nvme_core cryptd ptp nvme_auth drm_ttm_helper pps_core ttm video wmi pinctrl_cannonlake uinput nvidia_uvm(OE) sunrpc dm_mirror
mag 05 16:08:18 home-tower-station kernel:  dm_region_hash dm_log dm_mod iscsi_tcp nvidia(OE) libiscsi_tcp libiscsi scsi_transport_iscsi i2c_dev crypto_user nfnetlink
mag 05 16:08:18 home-tower-station kernel: CPU: 3 UID: 1000 PID: 4944 Comm: plasma-browser- Tainted: G        W  OE      6.14.4-zen1-2-zen #1 5ebf8709a7a4a4d9e2b75723b74cfc48d75c3151
mag 05 16:08:18 home-tower-station kernel: Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
mag 05 16:08:18 home-tower-station kernel: Hardware name: ASUS System Product Name/ROG STRIX Z390-F GAMING, BIOS 2102 05/03/2024
mag 05 16:08:18 home-tower-station kernel: RIP: 0010:nvidia_close+0x396/0x3a0 [nvidia]
mag 05 16:08:18 home-tower-station kernel: Code: bb d8 00 00 00 e9 fa fd ff ff 48 89 ef e8 92 dc ff ff 48 c7 c7 20 dc 75 c0 e8 d6 46 2d f9 e9 0e fd ff ff 0f 0b e9 07 fd ff ff <0f> 0b e9 68 fe ff ff 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90
mag 05 16:08:18 home-tower-station kernel: RSP: 0018:ffffa646cb783ad0 EFLAGS: 00010202
mag 05 16:08:18 home-tower-station kernel: RAX: 0000000000000026 RBX: ffff8d8730b08800 RCX: 0000000000000000
mag 05 16:08:18 home-tower-station kernel: RDX: ffffa646cb783a40 RSI: 0000000000000286 RDI: ffffa646cb783a00
mag 05 16:08:18 home-tower-station kernel: RBP: ffff8d85d5fc86a8 R08: ffffa646cb783a40 R09: ffffffffc0761f98
mag 05 16:08:18 home-tower-station kernel: R10: ffffa646cb783a70 R11: ffffcbd9472d6ec0 R12: ffff8d8689189240
mag 05 16:08:18 home-tower-station kernel: R13: 0000000000000000 R14: ffff8d85d5fc8000 R15: 0000000000000000
mag 05 16:08:18 home-tower-station kernel: FS:  000070cdd8003f40(0000) GS:ffff8d8d0db80000(0000) knlGS:0000000000000000
mag 05 16:08:18 home-tower-station kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
mag 05 16:08:18 home-tower-station kernel: CR2: 000070cdd5c8f000 CR3: 00000001795c6003 CR4: 00000000003706f0
mag 05 16:08:18 home-tower-station kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
mag 05 16:08:18 home-tower-station kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
mag 05 16:08:18 home-tower-station kernel: Call Trace:
mag 05 16:08:18 home-tower-station kernel:  <TASK>
mag 05 16:08:18 home-tower-station kernel:  __fput+0xe2/0x2b0
mag 05 16:08:18 home-tower-station kernel:  __x64_sys_close+0x91/0x120
mag 05 16:08:18 home-tower-station kernel:  do_syscall_64+0x7b/0x190
mag 05 16:08:18 home-tower-station kernel:  ? rmapiFreeWithSecInfoTls+0x67/0x80 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? os_acquire_spinlock+0x12/0x30 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? portSyncSpinlockAcquire+0x1d/0x50 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? free_os_events+0x25/0x90 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? RmIoctl+0x42e/0xdb0 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? os_get_current_tick+0x3b/0xa0 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? os_acquire_spinlock+0x12/0x30 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? portSyncSpinlockAcquire+0x1d/0x50 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? threadStateFree+0xd5/0x210 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? rm_ioctl+0x8c/0x5f0 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? __check_object_size+0x1ff/0x220
mag 05 16:08:18 home-tower-station kernel:  ? nvidia_unlocked_ioctl+0x164/0xa40 [nvidia a0f5be1cdec84cff9cd2a4147723833a993c4a18]
mag 05 16:08:18 home-tower-station kernel:  ? __x64_sys_ioctl+0x56/0xc0
mag 05 16:08:18 home-tower-station kernel:  ? syscall_exit_to_user_mode+0x10/0x210
mag 05 16:08:18 home-tower-station kernel:  ? do_syscall_64+0x87/0x190
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  ? clear_bhb_loop+0x25/0x80
mag 05 16:08:18 home-tower-station kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
mag 05 16:08:18 home-tower-station kernel: RIP: 0033:0x70cdde4ade22
mag 05 16:08:18 home-tower-station kernel: Code: 08 0f 85 21 41 ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66
mag 05 16:08:18 home-tower-station kernel: RSP: 002b:00007ffc4b0019e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
mag 05 16:08:18 home-tower-station kernel: RAX: ffffffffffffffda RBX: 000059abcf5b9300 RCX: 000070cdde4ade22
mag 05 16:08:18 home-tower-station kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000013
mag 05 16:08:18 home-tower-station kernel: RBP: 00007ffc4b001a20 R08: 0000000000000000 R09: 0000000000000000
mag 05 16:08:18 home-tower-station kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000059abcf5abf48
mag 05 16:08:18 home-tower-station kernel: R13: 00007ffc4b001c50 R14: 00007ffc4b001c48 R15: 000059abcf5914e0
mag 05 16:08:18 home-tower-station kernel:  </TASK>
mag 05 16:08:18 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 10!
mag 05 16:08:18 home-tower-station kernel: NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d0007d; hObject=0xfade0002; paramsStatus=0x00000000; status=0x0000000f
mag 05 16:08:18 home-tower-station kernel: NVRM: nvAssertFailedNoLog: Assertion failed: NV_OK == status @ vaspace_api.c:538
mag 05 16:08:18 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 10!
mag 05 16:08:18 home-tower-station kernel: NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d0007d; hObject=0xfade0001; paramsStatus=0x00000000; status=0x0000000f

// [...] //

mag 05 16:08:18 home-tower-station kwin_wayland[3707]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 05 16:08:18 home-tower-station kwin_wayland[3707]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux

// loop //

// then there's more of the above, tons of messages with time stamp 16:09:00 and then this //

mag 05 16:09:00 home-tower-station kernel: NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d00036; hObject=0xfade0002; paramsStatus=0x00000000; status=0x0000000f
mag 05 16:09:00 home-tower-station kernel: NVRM: nvAssertFailedNoLog: Assertion failed: NV_OK == status @ vaspace_api.c:538
mag 05 16:09:00 home-tower-station kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 10!
mag 05 16:09:00 home-tower-station kernel: NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d00036; hObject=0xfade0001; paramsStatus=0x00000000; status=0x0000000f
mag 05 16:09:00 home-tower-station kwin_wayland[3707]: kwin_core: The used windowing system requires compositing
mag 05 16:09:00 home-tower-station kwin_wayland[3707]: kwin_core: We are going to quit KWin now as it is broken
mag 05 16:09:03 home-tower-station kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 1
mag 05 16:11:16 home-tower-station systemd[1]: systemd-logind.service: Watchdog timeout (limit 3min)!
mag 05 16:11:42 home-tower-station kernel: INFO: task systemd-logind:1014 blocked for more than 122 seconds.
mag 05 16:11:42 home-tower-station kernel:       Tainted: G        W  OE      6.14.4-zen1-2-zen #1
mag 05 16:11:42 home-tower-station kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mag 05 16:11:42 home-tower-station kernel: INFO: task kwin_wayland:3707 blocked for more than 122 seconds.
mag 05 16:11:42 home-tower-station kernel:       Tainted: G        W  OE      6.14.4-zen1-2-zen #1
// [...] repeat //
mag 05 16:13:45 home-tower-station kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mag 05 16:13:45 home-tower-station kernel: INFO: task plasma-browser-:4821 blocked for more than 245 seconds.
mag 05 16:13:45 home-tower-station kernel:       Tainted: G        W  OE      6.14.4-zen1-2-zen #1
mag 05 16:13:45 home-tower-station kernel:       Blocked by coredump.
mag 05 16:13:45 home-tower-station kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
-- Boot 7a91f83c0b174cf8bee5f639d41eed70 --
mag 05 16:28:53 home-tower-station kernel: x86/cpu: VMX (outside TXT) disabled by BIOS
mag 05 16:28:53 home-tower-station kernel: x86/cpu: SGX disabled or unsupported by BIOS.
mag 05 16:28:56 home-tower-station kernel:
mag 05 16:28:58 home-tower-station systemd-sslh-generator: Configuration directory '/etc/sslh/' does not exist! No units generated.
mag 05 16:28:58 home-tower-station systemd-udevd[671]: /usr/lib/udev/rules.d/75-davincipanel.rules:2 Unknown group 'resolve', ignoring.
mag 05 16:28:59 home-tower-station systemd-tmpfiles[952]: Failed to write file "/sys/module/pcie_aspm/parameters/policy": Operation not permitted
mag 05 16:29:00 home-tower-station kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
mag 05 16:29:00 home-tower-station kernel: ucsi_ccg 5-0008: i2c_transfer failed -110
mag 05 16:29:00 home-tower-station kernel: ucsi_ccg 5-0008: ucsi_ccg_init failed - -110
mag 05 16:29:00 home-tower-station kernel: ucsi_ccg 5-0008: probe with driver ucsi_ccg failed with error -110
mag 05 16:29:07 home-tower-station 30-systemd-environment-d-generator[3340]: /home/thomas/.config/environment.d/firefox.conf:2: invalid variable name "env MOZ_USE_XINPUT2", ignoring.
mag 05 16:29:08 home-tower-station 30-systemd-environment-d-generator[3637]: /home/thomas/.config/environment.d/firefox.conf:2: invalid variable name "env MOZ_USE_XINPUT2", ignoring.
mag 05 16:29:12 home-tower-station plasmashell[3822]: aurorae: Couldn't find QML Decoration  ""
mag 05 16:29:14 home-tower-station org_kde_powerdevil[3866]: [  3866] Error detecting VCP version using VCP feature xDF: Error_Info[DDCRC_RETRIES in ddc_write_read_with_retry, causes: DDCRC_READ_ALL_ZERO(10)]
mag 05 16:29:14 home-tower-station org_kde_powerdevil[3866]: [  3866] Error(s) opening ddc devices
mag 05 16:29:14 home-tower-station org_kde_powerdevil[3866]: [  3866] Error OK(0): success opening /dev/i2c-5
mag 05 16:29:17 home-tower-station sudo[4589]:   thomas : a password is required ; TTY=pts/0 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 05 16:29:18 home-tower-station kwin_wayland[3665]: kwin_scene_opengl: Invalid framebuffer status:  "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT"
mag 05 16:29:25 home-tower-station sudo[5872]:   thomas : a password is required ; TTY=pts/1 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true
mag 05 16:37:42 home-tower-station sudo[18374]:   thomas : a password is required ; TTY=pts/1 ; PWD=/home/thomas ; USER=root ; COMMAND=/usr/bin/true

I’m assuming from these lines that there’s still a problem with the GPU driver. I will look into that.

This is the reason for your nvidia issues:

Xid 79 points to a driver or hardware problem, most often hardware. Go through these points:

You can also try a different power manager, for example tlp.

Edit: You can also try to disable all power management settings in the BIOS and in KDE (the same applies to overclocking/undervolting).
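
To see whether Xid events were logged at all, and with which codes, the kernel log can be filtered for them. The following is only a minimal sketch and not something posted in this thread: the function name xid_counts is hypothetical, and it assumes the usual "NVRM: Xid (PCI:...): <code>, ..." message layout.

```shell
# Count NVIDIA Xid codes found in log text read from stdin.
# xid_counts is a hypothetical helper name, not part of any package.
xid_counts() {
    # Keep only the "NVRM: Xid (<pci address>): <code>" part of each line
    grep -oE 'NVRM: Xid \([^)]*\): [0-9]+' \
        | sed -E 's/.*: ([0-9]+)$/\1/' \
        | sort -n | uniq -c \
        | awk '{print "Xid " $2 ": " $1 " time(s)"}'
}
```

It could be fed with the kernel messages of the previous boot, for example `journalctl -k -b -1 --no-pager | xid_counts`. Xid 79 is the code NVIDIA reports when the GPU has fallen off the bus, which is why checking the seating, slot, cables and power comes first.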

Update:

I found the recommended drivers as a .run file on the Nvidia website here.
I looked into how to install .run drivers and hit an error because the X server was running, so I looked into that as well.
I managed to stop it with sudo systemctl stop sddm.service, switched to a terminal with Ctrl+Alt+F2 and followed this guide, which did not go well: I had to restore from a snapshot.

Any help to properly install those drivers would be appreciated.

Great, not exactly what I was hoping for, but what I was kind of afraid of.
I will check what I can.

Could you please elaborate a bit on the power manager part and tlp?

On Arch and Arch-based Linux, never install drivers from the Nvidia website; install them only from the official repositories.

Power management - ArchWiki

Thank you so much for everything.

I’ll start with hardware troubleshooting first.
I changed the PCIe slot. I will check the cables and try a different GPU I have at home if the problems persist.
If nothing works I’ll look into software solutions.

I’ll post updates and/or edit this reply for minor things.

Just a shot in the dark, but try different kernels such as linux and linux-lts.

Install them both with: sudo pacman -S linux linux-headers linux-lts linux-lts-headers and reboot.

Running uname -r upon reboot should show you the booted version.
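
As a small illustration of reading that output, the sketch below maps a release string to the kernel package it most likely came from. The function name kernel_flavor is hypothetical, and the suffix patterns are an assumption based on how Arch names its linux, linux-lts and linux-zen builds (the -zen form matches the 6.14.4-zen1-2-zen string seen in the logs above).

```shell
# Guess which kernel package a release string (as printed by `uname -r`)
# belongs to. kernel_flavor is a hypothetical helper; the suffix patterns
# are assumptions based on Arch package naming.
kernel_flavor() {
    case "$1" in
        *-zen)   echo "linux-zen" ;;   # e.g. 6.14.4-zen1-2-zen
        *-lts)   echo "linux-lts" ;;   # e.g. 6.12.27-1-lts
        *-arch*) echo "linux" ;;       # e.g. 6.14.4-arch1-2
        *)       echo "unknown" ;;
    esac
}
```

After rebooting into the entry picked in the boot menu, `kernel_flavor "$(uname -r)"` should print the package name of the kernel that is actually running.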

Choose the kernel from the GRUB menu, then check afterwards with

mhwd-kernel -li

Why not read and do what’s in the last posts from @nepti?
Instead you’re going all over the place and potentially rendering your system completely unusable.

Maybe disabling all power management settings of your display is already enough.

For this reason:

It is a good idea to start troubleshooting at hardware level. :)

Sorry, I seem to have gotten something mixed up. I thought the monitor was in power save mode when the freeze happened.

Quick Update:

Last night I moved the GPU to a different PCIe slot. While performing the swap, I noticed the GPU had not been sitting at a right angle as it should have, and while reinstalling it I also noticed that some screws wouldn’t thread in all the way. So it is possible that the weight of the card bent the slot it was sitting in over the years. Now it’s installed properly and does not move around. The system has been running for the last 12 hours with no freezes. I’ve run journalctl a couple of times just to check. The errors are very few. Some persist (notably this one, although it occurred at one moment only)

And nothing major to report.

I haven’t changed anything at the software level yet, apart from running powerprofilesctl configure-battery-aware --disable (it was enabled). I’ve let the machine run at idle, with Netflix and gaming, and done some tests with Unigine to load the GPU a bit more.

I will keep testing over the next few days, but all in all it is very likely to have been a hardware issue.
If the system is stable I will try switching kernels to see if the nvidia driver error gets fixed.

UPDATE:
Just as I was going to turn off the PC, a new freeze.

The journalctl -p3 output is over 3000 lines; for 40 minutes it repeats these lines in alternating blocks:

mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia>
mag 06 22:22:26 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Please report this at https://forums.developer.>

The log ends with these lines:

mag 06 22:22:22 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:22 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:22 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:22 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 06 22:22:22 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-gra>
mag 06 22:22:22 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma->
mag 06 22:22:23 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 06 22:22:23 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-gra>
mag 06 22:22:23 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma->
mag 06 22:22:23 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:24 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 06 22:22:24 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-gra>
mag 06 22:22:24 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma->
mag 06 22:22:25 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 06 22:22:25 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-gra>
mag 06 22:22:25 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma->
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 06 22:22:26 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-gra>
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma->
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:26 home-tower-station kernel: NVRM: _instmemAddHashEntry: Display Hash table is FULL!!
mag 06 22:22:27 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Pageflip timed out! This is a bug in the nvidia-drm kernel driver
mag 06 22:22:27 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: Please report this at https://forums.developer.nvidia.com/c/gpu-gra>
mag 06 22:22:27 home-tower-station kwin_wayland[5978]: kwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma->
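
With thousands of near-identical lines like the ones above, it can help to collapse them into counts before posting. This is a rough sketch only: summarize_journal is a hypothetical helper name, and it assumes journalctl's default short layout of "<month> <day> <time> <host> <source>[pid]: <message>".

```shell
# Collapse repeated journal lines (default short format, read from stdin)
# into "<count> <source>: <message>" rows, most frequent first.
# summarize_journal is a hypothetical helper name.
summarize_journal() {
    # Drop the three timestamp fields and the hostname
    cut -d' ' -f5- \
        | sed -E 's/^([^:[]+)\[[0-9]+\]:/\1:/' \
        | sort | uniq -c | sort -rn
}
```

For the boot that froze, something like `journalctl -p 3 -b -1 --no-pager | summarize_journal | head` would show the handful of distinct messages and how often each one repeated. The sed step strips the PID after the source name so that lines from restarted processes are counted together.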

This time the event was a bit different. I pressed the Win key (I don’t know what it’s called in Linux yet) to shut down the machine, but the menu did not appear. I pressed it a couple more times and noticed the last active window flickering between active and background, then came the total freeze. I tried Ctrl+Alt+T and then typed sudo reboot to see if it was just a video freeze, but nothing happened. I did a hard reboot and reported back here.

I want to add that the PC had been idle for the last hour before the freeze, so the errors occurred while nothing was going on.

I also want to point out this edit: