GPU has fallen off the bus

Hi! I’ve posted about this issue before here, but it’s returned after a hiatus, and I figured I’d share what I’ve got and see if it’s possible to get an answer together.

garuda-inxi:

System:
Kernel: 6.8.5-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
clocksource: tsc avail: acpi_pm
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=ff218736-6729-4d49-a2d2-0dc7487c0fc3 rw rootflags=subvol=@
rd.udev.log_priority=3 vt.global_cursor_default=0 loglevel=3
mem_sleep_default=s2idle ibt=off
Desktop: KDE Plasma v: 6.0.3 tk: Qt v: N/A info: frameworks v: 6.1.0
wm: kwin_x11 vt: 2 dm: SDDM Distro: Garuda base: Arch Linux
Machine:
Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
Mobo: ASUSTeK model: ROG STRIX B760-G GAMING WIFI v: Rev 1.xx
serial: <superuser required> part-nu: SKU uuid: <superuser required>
UEFI: American Megatrends v: 1210 date: 07/14/2023
CPU:
Info: model: 13th Gen Intel Core i9-13900K bits: 64 type: MST AMCP
arch: Raptor Lake gen: core 13 level: v3 note: check built: 2022+
process: Intel 7 (10nm) family: 6 model-id: 0xB7 (183) stepping: 1
microcode: 0x122
Topology: cpus: 1x cores: 24 mt: 8 tpc: 2 st: 16 threads: 32 smt: enabled
cache: L1: 2.1 MiB desc: d-16x32 KiB, 8x48 KiB; i-8x32 KiB, 16x64 KiB
L2: 32 MiB desc: 8x2 MiB, 4x4 MiB L3: 36 MiB desc: 1x36 MiB
Speed (MHz): avg: 882 high: 1100 min/max: 800/5500:5800:4300 scaling:
driver: intel_pstate governor: powersave cores: 1: 1100 2: 800 3: 1100
4: 800 5: 1100 6: 800 7: 1100 8: 800 9: 1100 10: 1100 11: 1100 12: 800
13: 1028 14: 800 15: 1100 16: 800 17: 800 18: 800 19: 800 20: 800 21: 800
22: 800 23: 800 24: 800 25: 800 26: 800 27: 800 28: 800 29: 800 30: 800
31: 800 32: 800 bogomips: 191692
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Vulnerabilities: <filter>
Graphics:
Device-1: NVIDIA AD104 [GeForce RTX 4070 Ti] vendor: Gigabyte driver: nvidia
v: 550.67 alternate: nouveau,nvidia_drm non-free: 550.xx+
status: current (as of 2024-04) arch: Lovelace code: AD1xx
process: TSMC n4 (5nm) built: 2022+ pcie: gen: 4 speed: 16 GT/s lanes: 16
ports: active: none off: DP-1 empty: DP-2,DP-3,HDMI-A-1 bus-ID: 01:00.0
chip-ID: 10de:2782 class-ID: 0300
Display: x11 server: X.Org v: 21.1.13 with: Xwayland v: 23.2.6
compositor: kwin_x11 driver: X: loaded: nvidia unloaded: modesetting,nouveau
alternate: fbdev,nv,vesa gpu: nvidia,nvidia-nvswitch display-ID: :0
screens: 1
Screen-1: 0 s-res: 2560x1080 s-dpi: 81 s-size: 803x343mm (31.61x13.50")
s-diag: 873mm (34.38")
Monitor-1: DP-1 mapped: DP-0 note: disabled model: LG (GoldStar) HDR WFHD
serial: <filter> built: 2020 res: 2560x1080 hz: 60 dpi: 81 gamma: 1.2
size: 798x334mm (31.42x13.15") diag: 869mm (34.2") modes: max: 2560x1080
min: 640x480
API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia device: 2
drv: swrast gbm: drv: nvidia surfaceless: drv: nvidia x11: drv: nvidia
inactive: wayland,device-1
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 550.67
glx-v: 1.4 direct-render: yes renderer: NVIDIA GeForce RTX 4070 Ti/PCIe/SSE2
memory: 11.71 GiB
API: Vulkan v: 1.3.279 layers: 10 device: 0 type: discrete-gpu name: NVIDIA
GeForce RTX 4070 Ti driver: nvidia v: 550.67 device-ID: 10de:2782
surfaces: xcb,xlib device: 1 type: cpu name: llvmpipe (LLVM 17.0.6 256
bits) driver: mesa llvmpipe v: 24.0.5-arch1.1 (LLVM 17.0.6)
device-ID: 10005:0000 surfaces: xcb,xlib
Audio:
Device-1: Intel Raptor Lake High Definition Audio vendor: ASUSTeK
driver: snd_hda_intel v: kernel alternate: snd_sof_pci_intel_tgl
bus-ID: 00:1f.3 chip-ID: 8086:7a50 class-ID: 0403
Device-2: NVIDIA vendor: Gigabyte driver: snd_hda_intel v: kernel pcie:
gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:22bc
class-ID: 0403
Device-3: Blue Microphones Yeti Stereo Microphone
driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 1.1 speed: 12 Mb/s
lanes: 1 mode: 1.1 bus-ID: 1-7:5 chip-ID: b58e:9e84 class-ID: 0300
serial: <filter>
API: ALSA v: k6.8.5-zen1-1-zen status: kernel-api with: aoss
type: oss-emulator tools: N/A
Server-1: PipeWire v: 1.0.5 status: active with: 1: pipewire-pulse
status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
Device-1: Intel Raptor Lake-S PCH CNVi WiFi driver: iwlwifi v: kernel
bus-ID: 00:14.3 chip-ID: 8086:7a70 class-ID: 0280
IF: wlp0s20f3 state: up mac: <filter>
Device-2: Intel Ethernet I226-V vendor: ASUSTeK driver: igc v: kernel
pcie: gen: 2 speed: 5 GT/s lanes: 1 port: N/A bus-ID: 05:00.0
chip-ID: 8086:125c class-ID: 0200
IF: eno1 state: down mac: <filter>
IF-ID-1: wg0-mullvad state: unknown speed: N/A duplex: N/A mac: N/A
Info: services: NetworkManager, smbd, systemd-timesyncd, wpa_supplicant
Bluetooth:
Device-1: Intel AX211 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0
speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-14:8 chip-ID: 8087:0033
class-ID: e001
Report: btmgmt ID: hci0 rfk-id: 0 state: up address: <filter> bt-v: 5.3
lmp-v: 12 status: discoverable: no pairing: no class-ID: 6c0104
RAID:
Hardware-1: Intel Volume Management Device NVMe RAID Controller Intel
driver: vmd v: 0.6 port: N/A bus-ID: 00:0e.0 chip-ID: 8086:a77f rev:
class-ID: 0104
Drives:
Local Storage: total: 1.82 TiB used: 455.32 GiB (24.4%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 980 PRO with
Heatsink 2TB size: 1.82 TiB block-size: physical: 512 B logical: 512 B
speed: 63.2 Gb/s lanes: 4 tech: SSD serial: <filter> fw-rev: 5B2QGXA7
temp: 35.9 C scheme: GPT
Partition:
ID-1: / raw-size: 1.82 TiB size: 1.82 TiB (100.00%) used: 455.32 GiB (24.4%)
fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
used: 584 KiB (0.2%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
ID-3: /home raw-size: 1.82 TiB size: 1.82 TiB (100.00%)
used: 455.32 GiB (24.4%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
ID-4: /var/log raw-size: 1.82 TiB size: 1.82 TiB (100.00%)
used: 455.32 GiB (24.4%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
ID-5: /var/tmp raw-size: 1.82 TiB size: 1.82 TiB (100.00%)
used: 455.32 GiB (24.4%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default) zswap: no
ID-1: swap-1 type: zram size: 62.54 GiB used: 11.8 MiB (0.0%)
priority: 100 comp: zstd avail: lzo,lzo-rle,lz4,lz4hc,842 max-streams: 32
dev: /dev/zram0
Sensors:
System Temperatures: cpu: 27.0 C mobo: N/A gpu: nvidia temp: 33 C
Fan Speeds (rpm): N/A gpu: nvidia fan: 0%
Info:
Memory: total: 64 GiB available: 62.54 GiB used: 6.8 GiB (10.9%)
Processes: 596 Power: uptime: 5m states: freeze,mem,disk suspend: s2idle
avail: deep wakeups: 0 hibernate: platform avail: shutdown, reboot,
suspend, test_resume image: 24.96 GiB services: org_kde_powerdevil,
power-profiles-daemon, upowerd Init: systemd v: 255 default: graphical
tool: systemctl
Packages: pm: pacman pkgs: 2169 libs: 597 tools: octopi,paru,yay
Compilers: clang: 17.0.6 gcc: 13.2.1 Shell: garuda-inxi default: Bash
v: 5.2.26 running-in: konsole inxi: 3.3.34
Garuda (2.6.25-1):
System install date:     2023-08-22
Last full system update: 2024-04-16 ↻
Is partially upgraded:   No
Relevant software:       snapper NetworkManager dracut nvidia-dkms
Windows dual boot:       No/Undetected
Failed units:

Basically, after a recent update, the spinny-fan GPU thing started happening again. More specifically, the GPU fans start spinning really hard and, at the same time, the screen goes black. Unlike what I mentioned in the OP, I lose control of the session: keyboard input is not read.

The GPU itself is not hot, and the power supply remains low. I’ve checked the cables and everything seems plugged in properly.

Earlier today I saw the old "GPU has fallen off the bus" error around crash time in journalctl, but for the last two crashes I’m not seeing it. I also sometimes see this:

pcieport 0000:00:01.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)

I’ve searched on the errors to the best of my ability, and the general answers seem to be either a physical hardware issue or power management. I’ve disabled ASPM, have now set PowerMizer to max performance, and once again found the ReBar thing (UEFI Asus option) on, so I turned it off. I’m still getting crashes. Because of the crashes, every change mentioned here has been followed by at least one reboot, and probably many more.
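
For anyone double-checking the ASPM part, here is a minimal sketch that reads the policy the kernel is actually using. It assumes the standard sysfs file /sys/module/pcie_aspm/parameters/policy is present; the function name aspm_active_policy is made up for illustration. The kernel lists all policies on one line and puts the active one in square brackets.

```shell
# Print the active PCIe ASPM policy.
# $1 (optional): policy file to read; defaults to the usual sysfs path.
aspm_active_policy() {
    # Keep only the text between the square brackets.
    sed -n 's/.*\[\([^]]*\)].*/\1/p' \
        "${1:-/sys/module/pcie_aspm/parameters/policy}"
}
```

Calling aspm_active_policy with no argument prints a single word such as performance or default. Which value counts as "disabled" depends on how ASPM was switched off (UEFI setting, kernel parameter, or policy), so treat the output as a hint and not as proof.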

I feel like there is probably a way to zero in on better info on what’s causing the crashes in the logs, but idk what to look for. I can’t dump an Nvidia bug report when I have an event since I lose the session whenever it happens. I’m hoping someone here has some ideas, or even can point me to a better place to ask. Thanks!
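
For anyone landing here with the same symptoms, a rough sketch of one way to narrow the logs down after the fact: filter the kernel messages from the boot that crashed for the strings quoted above. The pattern list and the function name gpu_crash_lines are assumptions, not a complete set of things worth looking for.

```shell
# Read kernel log text on stdin and keep only lines that commonly show up
# when a GPU drops off the PCIe bus. The pattern list is a guess.
gpu_crash_lines() {
    grep -iE 'NVRM|Xid|fallen off the bus|PCIe Bus Error|AER:'
}
```

After rebooting from a crash, something like `journalctl -k -b -1 --no-pager | gpu_crash_lines` should show what the kernel logged during the previous boot. This only works if the journal is kept across reboots, and a hard hang may cut the log off before the interesting line is written.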

Link goes nowhere.

And please reboot.

I apologize about the link - it should be fixed now. I have rebooted since the last time any of the things I mentioned were tinkered with, as a result of the machine consistently crashing; I’ll edit the original to state that explicitly too.

Post the output from (replace the date with that of your update):

cat /var/log/pacman.log | grep '2024-04-16' | grep -E 'installed|upgraded|removed'

Sure!

[ALPM] upgraded alsa-card-profiles (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded xcb-proto (1.16.0-1 -> 1.17.0-1)
[ALPM] upgraded libxcb (1.16.1-1 -> 1.17.0-1)
[ALPM] upgraded libcap-ng (0.8.4-1 -> 0.8.5-1)
[ALPM] upgraded libwacom (2.10.0-1 -> 2.11.0-1)
[ALPM] upgraded attica (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kconfig (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kcoreaddons (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kcrash (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kdbusaddons (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded karchive (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kcodecs (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded ki18n (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded libpipewire (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded pipewire (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded libwireplumber (0.5.1-1 -> 0.5.1-2)
[ALPM] upgraded wireplumber (0.5.1-1 -> 0.5.1-2)
[ALPM] upgraded bluez-libs (5.73-4 -> 5.75-1)
[ALPM] upgraded pipewire-audio (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded pipewire-jack (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded libplacebo (6.338.2-2 -> 6.338.2-4)
[ALPM] upgraded kfilemetadata (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded kidletime (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kwindowsystem (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kauth (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded kguiaddons (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded kcolorscheme (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kwidgetsaddons (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kconfigwidgets (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kbookmarks (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kcompletion (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kiconthemes (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kitemviews (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded knotifications (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kjobwidgets (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kservice (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kwallet (6.0.0-3 -> 6.1.0-1)
[ALPM] upgraded solid (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kio (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded baloo (6.0.0-3 -> 6.1.0-1)
[ALPM] upgraded bluez (5.73-4 -> 5.75-1)
[ALPM] upgraded bluez-cups (5.73-4 -> 5.75-1)
[ALPM] upgraded bluez-hid2hci (5.73-4 -> 5.75-1)
[ALPM] upgraded bluez-qt (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded bluez-utils (5.73-4 -> 5.75-1)
[ALPM] upgraded breeze-icons (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded cachyos-ananicy-rules-git (20240409.r312.gde55e2f-1 -> 20240416.r313.g7abaddd-1)
[ALPM] upgraded candy-icons-git (r1132.d98566b-1 -> r1140.700085e-1)
[ALPM] upgraded code (1.88.0-1 -> 1.88.1-1)
[ALPM] upgraded dxvk-mingw-git (2.3.1.r8.g133f0794-1 -> 2.3.1.r9.g571948cf-1)
[ALPM] upgraded python-capng (0.8.4-1 -> 0.8.5-1)
[ALPM] upgraded firewalld (2.1.1-1 -> 2.1.2-1)
[ALPM] upgraded fish (3.7.1-1 -> 3.7.1-2)
[ALPM] upgraded kpackage (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded syndication (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded knewstuff (6.0.0-5 -> 6.1.0-1)
[ALPM] upgraded frameworkintegration (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded plasma5-themes-sweet-full-git (r362.1c31bf0-1 -> r363.460042a-1)
[ALPM] upgraded ttf-firacode-nerd (3.2.0-1 -> 3.2.1-1)
[ALPM] upgraded kirigami (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded kitemmodels (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded ksvg (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kglobalaccel (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kxmlgui (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kcmutils (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded kdeclarative (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kded (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kholidays (1:6.0.0-1 -> 1:6.1.0-1)
[ALPM] upgraded knotifyconfig (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kparts (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kquickcharts (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded krunner (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kstatusnotifieritem (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded sonnet (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded ktextwidgets (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded syntax-highlighting (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded ktexteditor (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded kuserfeedback (6.0.0-2 -> 6.1.0-1)
[ALPM] upgraded prison (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kpty (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kdesu (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kdnssd (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded qqc2-desktop-style (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded xorg-server-common (21.1.12-1 -> 21.1.13-1)
[ALPM] upgraded garuda-dr460nized (4.2.8-4 -> 4.3.0-4)
[ALPM] upgraded gst-plugin-pipewire (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded inxi (3.3.33.1-1 -> 3.3.34.1-1)
[ALPM] upgraded kcalendarcore (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kcontacts (1:6.0.0-1 -> 1:6.1.0-1)
[ALPM] upgraded kpeople (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kplotting (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded ktexttemplate (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded kunitconversion (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded less (1:643-1 -> 1:643-2)
[ALPM] upgraded lib32-libpipewire (1:1.0.4-1 -> 1:1.0.5-1)
[ALPM] upgraded lib32-pipewire (1:1.0.4-1 -> 1:1.0.5-1)
[ALPM] upgraded lib32-pipewire-jack (1:1.0.4-1 -> 1:1.0.5-1)
[ALPM] upgraded libsynctex (2024.0-1 -> 2024.2-1)
[ALPM] upgraded lutris (0.5.16-2 -> 0.5.17-1)
[ALPM] upgraded modemmanager-qt (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded pcsclite (2.0.3-1 -> 2.1.0-2)
[ALPM] upgraded networkmanager-qt (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded passim (0.1.5-1 -> 0.1.6-1)
[ALPM] upgraded pipewire-alsa (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded pipewire-pulse (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded pipewire-v4l2 (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded pipewire-x11-bell (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded pipewire-zeroconf (1:1.0.4-4 -> 1:1.0.5-1)
[ALPM] upgraded proton-ge-custom (2:GE.Proton9.2-1 -> 2:GE.Proton9.4-1)
[ALPM-SCRIPTLET]    version, installed according to upstream instructions, feel free to report it through
[ALPM] upgraded purpose (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded shiboken6 (6.7.0git20240406-1 -> 6.7.0-3)
[ALPM] upgraded pyside6 (6.7.0git20240406-1 -> 6.7.0-3)
[ALPM] upgraded python-pydantic-core (1:2.16.3-1 -> 1:2.18.1-1)
[ALPM] upgraded python-pydantic (2.6.4-2 -> 2.7.0-1)
[ALPM] upgraded reshade-shaders-git (r63.2b1dbb5-1 -> r66.9367bed-1)
[ALPM-SCRIPTLET] The files are now installed in a new location to be usable within Steam Proton.
[ALPM] upgraded retroarch-autoconfig-udev-git (r2295.4e1261f-1 -> r2298.525ae73-1)
[ALPM] upgraded rnnoise (0.4.1-1 -> 1:0.2-1)
[ALPM] upgraded spotube-bin (3.5.0-1 -> 3.6.0-1)
[ALPM] upgraded telegram-desktop (4.16.6-1 -> 4.16.7-1)
[ALPM] upgraded texlive-bin (2024.0-1 -> 2024.2-1)
[ALPM] upgraded texlive-basic (2024.0-3 -> 2024.2-1)
[ALPM] upgraded texlive-binextra (2024.0-3 -> 2024.2-1)
[ALPM] upgraded texlive-latex (2024.0-3 -> 2024.2-1)
[ALPM] upgraded texlive-latexrecommended (2024.0-3 -> 2024.2-1)
[ALPM] upgraded threadweaver (6.0.0-1 -> 6.1.0-1)
[ALPM] upgraded tree-sitter (0.22.2-1 -> 0.22.5-1)
[ALPM] upgraded ttf-fantasque-nerd (3.2.0-1 -> 3.2.1-1)
[ALPM] upgraded vivaldi (6.6.3271.57-1 -> 6.6.3271.61-1)
[ALPM] upgraded xf86-input-wacom (1.2.1-1 -> 1.2.2-1)
[ALPM] upgraded xorg-server (21.1.12-1 -> 21.1.13-1)
[ALPM] upgraded zoom (5.17.11-1 -> 6.0.0-1)

[ALPM] installed yyjson (0.9.0-2)
[ALPM] upgraded fastfetch (2.9.1-1 -> 2.9.2-1)
[ALPM] upgraded fwupd-efi (1.5-2 -> 1.6-1)

I separated out the last three since they were later in the day.

Just in case the clarification is useful: I had this issue a bunch a bit more than half a year ago, but it eventually went away. It resurfaced recently, but infrequently enough that I just kinda dealt with it. After today’s update it was crashing every few minutes.

I’ll also add that the current session has been going for ~2 hours, but I’m barely doing anything on the machine out of a voodoo-like fear that I’ll run something that will trigger it, even though I see no pattern whatsoever in when it happens.

Thanks!

I have read it.

Sounds like hardware.

Reboot and check again.

There have been many releases since version 1210; I would update your BIOS.

Updates since your current version include improvements to system stability, compatibility, and performance. Pretty safe bet this is a good place to start.
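
As a quick check before and after flashing, the version and date that inxi prints on its UEFI line can also be read straight from sysfs. A small sketch, assuming the usual /sys/class/dmi/id directory exists; the function name bios_info is only illustrative.

```shell
# Print the firmware version and date the kernel read from DMI.
# $1 (optional): DMI directory; defaults to the usual sysfs location.
bios_info() {
    dmi_dir=${1:-/sys/class/dmi/id}
    printf 'BIOS version: %s (%s)\n' \
        "$(cat "$dmi_dir/bios_version")" "$(cat "$dmi_dir/bios_date")"
}
```

On the machine in this thread, bios_info would be expected to report 1210 with the date 07/14/2023 before the update, matching the inxi output above, and a higher number once the flash has gone through.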

TIL that doesn’t happen on its own, big noob moment on my part.

I’m trying to see if there’s a way to do that from inside Garuda, but not finding anything - does that mean I need to flash from inside BIOS? I’m also scared by this, does that mean I need to update this ME thing separately?

(Update: looking at the latest firmware for this specific motherboard here seems to indicate that the Arch wiki is outdated, at least in this specific case, as it mentions that the firmware itself will also update ME, unlike the example linked in the Arch wiki. I really hope I’m reading this right. I like my computer.)

This might be straying from being a Garuda-specific question, I just want to make sure I’m not about to turn my station into a brick. Feel free to point me elsewhere.

How to flash the BIOS can be found in the manual for your motherboard. Short form:
“BIOS FlashBack™ is a safe and simple way to update BIOS. Just drop the (UEFI) BIOS file onto a FAT32-formatted USB stick, plug it into the USB BIOS FlashBack port, and press the button. Updates can even be performed without having memory or a CPU installed.”

Did you read the description?
ROG STRIX B760-G GAMING WIFI BIOS 1645:
“Updating this BIOS will simultaneously update the corresponding Intel ME to version 16.1.30.2307v4.”

A little more initiative would be nice :slight_smile:

Better off going into the BIOS and flashing from there. You get a progress bar, and it tells you when it’s done. The method on the back is for emergencies.

I did see that (and put it in the updates); the thing that threw me off was the Arch wiki. If this were a lower-stakes thing, I’d probably be less risk-averse, but I really don’t want to brick my machine. This is a process I don’t know anything about. I’m just saying this to say, please believe me when I say I put in over an hour trying to figure it all out before I responded. I don’t think it’s initiative I’m lacking; it’s that I’m a newbie out of their depth who barely understands what I’m reading.

Just to be clear, I’m not even blaming the Arch wiki, just my own intelligence and ability to parse it.

I really don’t want to be blunt, but I’m going to. The reason you have the BIOS feature on the back of your board is for occasions when the BIOS does go tits up and you can’t boot your system. In other words, you cannot brick your system. Either read the documentation that came with your board, go to your board’s ASUS page and read it there, or find a YouTube video that explains it.

I think you should put the arch-wiki aside for this case :wink:

This is nonsense :slight_smile:

I agree with @Locutus. Be smart and update your BIOS.


Do you also have the issue in a wayland session?

I’m working on getting the BIOS updated atm. Of all the possible holdups, I don’t have easy access to any USB drives, so that’s the step I’m on.

I haven’t tried launching in Wayland; I’ll give it a go after the next crash.

If you don’t have access to a flash drive then how did you install Garuda?

To be more precise, all flash drives that I currently have in my possession have previously been converted to live media. I might be misunderstanding, but I had thought that once they were converted to live media, reformatting could no longer turn them back into reliable storage. Is this another TIL for me?

Any flash drive can be reset to factory default and you’ll have no issue with it. I would suggest taking the largest one you have and doing so, then put Ventoy on it using a GPT partition scheme. This should have all the information you will need if you decide to do this.

Not just after the next crash, but now.
I don’t think that’s the cause, but it’s about ruling out x11 as the source of the error. That’s why you should use wayland and check whether you have the same issues with it.
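
To confirm which kind of session you actually ended up in after picking one at the SDDM login screen, a tiny sketch like this should do. It relies on the XDG_SESSION_TYPE variable that logind sets for graphical logins; the function name session_type is just for illustration.

```shell
# Print the current session type (typically x11, wayland, or tty),
# or "unknown" if the variable is not set in this shell.
session_type() {
    printf '%s\n' "${XDG_SESSION_TYPE:-unknown}"
}
```

Run session_type from a terminal inside the desktop session. The inxi output above shows kwin_x11, so it should say x11 there until the Wayland session is chosen at login.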

You’ve had the problems for a long time - and the error pattern and error messages point more to a driver issue or a hardware problem.

btw: If this results in a problem with Nvidia drivers, I’m out of here anyway - topics with Nvidia driver issues are on my blacklist :grin:

well, TIL again

I reformatted/erased/reset a drive I had around, and updated the BIOS. I was only sure I had irrevocably bricked my machine at one point during the process, which is better than expected. The computer has not crashed so far.

@nepti I had tried a Wayland session previously, and alas, it still crashed.

I’m hoping I don’t need to respond here again. Thanks for taking the time to explain things to me regardless, even if some of it is basic knowledge. As is, just the fact that I am running Linux makes people around me irl gawk at me like I’m some kind of wundernerd. Saying things like “I am actually terrified of updating BIOS” does not help, and generally makes things even worse. (“Whoa,” I hear them mutter, “the wundernerd uses the arcane word bye-ohs in a sentence!”) I do my best, but, you know, there’s always so much more out there to know.

Now you know how to flash a BIOS :slight_smile:

Then you can rule out x11 as the cause and hope that it was just the BIOS.
It’s best to continue using x11, nvidia + wayland is still pretty buggy at the moment.

You are definitely surrounded by Windows users :grin: