Hi Garuda community,
I'm opening this post because I couldn't find anything about my issue on the Garuda forum or on the net in general (except some RAID server issues, which are off-topic).
I've run into some weird problems with my Garuda install. It crashes unexpectedly and very hard, without a kernel panic. It can happen during light desktop tasks (such as writing this post on my Garuda system) or while gaming. There is just one game (Satisfactory) where it happens every time, so I can reproduce it.
So I reproduced it as often as I could to catch the messages printed during the crash. They say that a UUID can't be mounted. That points to the only SSD I use, which holds the whole system. Why on earth would my system unmount the file system it is running from? I've never had this issue before.
Additional issues (with estimated rates):
- 3 of 10 cases: the SSD with Garuda doesn't appear in the BIOS boot menu (fixed by restarting)
- about 1 per hour: unexpected crashes as described above
- every time: crashes as described above while playing Satisfactory (Lutris + Steam)
Meanwhile, I have no issues with heavy workloads, e.g. DaVinci Resolve Studio (GPU and CPU fully loaded, plus a ton of cache files; 200 GB of caching at full I/O speed is common).
What I've tried so far:
- I ran a forced btrfs check: no errors found.
- I ran a full scrub: no errors found.
- I ran a balance: no errors or issues found.
- I ran journalctl -b -1 -p3
As you can see here:
Journalctl
(It just showed me my btrfs-assistant Qt 5 issue, which I don't mind since I'm using the TTY anyway.)
Mär 30 23:36:30 Clockwork kernel: sp5100-tco sp5100-tco: Watchdog hardware is disabled
Mär 30 23:36:31 Clockwork systemd-coredump[593]: [🡕] Process 586 (plymouth) of user 0 dumped core.
Module linux-vdso.so.1 with build-id 48d1f2a0ba6b9bf33e371e464222a5d7f3ba1da9
Module libgcc_s.so.1 with build-id 5d817452a709ca3a213341555ddcf446ecee37fa
Module ld-linux-x86-64.so.2 with build-id c09c6f50f6bcec73c64a0b4be77eadb8f7202410
Module libm.so.6 with build-id 596b63a006a4386dcab30912d2b54a7a61827b07
Module libudev.so.1 with build-id 7dc938362569112855b6086de066cd6a18d1b978
Module libc.so.6 with build-id 85766e9d8458b16e9c7ce6e07c712c02b8471dbc
Module libply.so.5 with build-id 7872fe68a815672f245767251ea58cc6d807e48a
Module plymouth with build-id 5d41380191d68f361d3307b01cd3592755df9b97
Stack trace of thread 586:
#0 0x00007f98e39bbfd4 ply_list_get_first_node (libply.so.5 + 0x9fd4)
#1 0x00007f98e39ba04b n/a (libply.so.5 + 0x804b)
#2 0x00007f98e39ba696 ply_command_parser_free (libply.so.5 + 0x8696)
#3 0x000055a4bcd628d1 n/a (plymouth + 0x28d1)
#4 0x00007f98e37d5310 __libc_start_call_main (libc.so.6 + 0x2d310)
#5 0x00007f98e37d53c1 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2d3c1)
#6 0x000055a4bcd62fa5 n/a (plymouth + 0x2fa5)
ELF object binary architecture: AMD x86-64
Mär 30 23:36:32 Clockwork systemd-coredump[664]: [🡕] Process 661 (plymouth) of user 0 dumped core.
Module linux-vdso.so.1 with build-id 48d1f2a0ba6b9bf33e371e464222a5d7f3ba1da9
Module libgcc_s.so.1 with build-id 5d817452a709ca3a213341555ddcf446ecee37fa
Module ld-linux-x86-64.so.2 with build-id c09c6f50f6bcec73c64a0b4be77eadb8f7202410
Module libm.so.6 with build-id 596b63a006a4386dcab30912d2b54a7a61827b07
Module libudev.so.1 with build-id 7dc938362569112855b6086de066cd6a18d1b978
Module libc.so.6 with build-id 85766e9d8458b16e9c7ce6e07c712c02b8471dbc
Module libply.so.5 with build-id 7872fe68a815672f245767251ea58cc6d807e48a
Module plymouth with build-id 5d41380191d68f361d3307b01cd3592755df9b97
Stack trace of thread 661:
#0 0x00007ff65a0e80f4 ply_list_node_get_data (libply.so.5 + 0xa0f4)
#1 0x00007ff65a0e6029 n/a (libply.so.5 + 0x8029)
#2 0x00007ff65a0e6696 ply_command_parser_free (libply.so.5 + 0x8696)
#3 0x000055fe5c2768d1 n/a (plymouth + 0x28d1)
#4 0x00007ff659f01310 __libc_start_call_main (libc.so.6 + 0x2d310)
#5 0x00007ff659f013c1 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2d3c1)
#6 0x000055fe5c276fa5 n/a (plymouth + 0x2fa5)
ELF object binary architecture: AMD x86-64
Mär 30 23:37:04 Clockwork btrfs-assistant-bin[4820]: This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, xcb.
Mär 30 23:37:04 Clockwork kernel: ata9.00: exception Emask 0x10 SAct 0xffff00 SErr 0x4c0000 action 0x6 frozen
Mär 30 23:37:04 Clockwork kernel: ata9.00: irq_stat 0x08000000, interface fatal error
Mär 30 23:37:04 Clockwork kernel: ata9: SError: { CommWake 10B8B Handshk }
Mär 30 23:37:04 Clockwork kernel: ata9.00: failed command: WRITE FPDMA QUEUED
Mär 30 23:37:04 Clockwork kernel: ata9.00: cmd 61/28:40:f8:38:5f/00:00:07:00:00/40 tag 8 ncq dma 20480 out
res 40/00:44:f8:38:5f/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
Mär 30 23:37:04 Clockwork kernel: ata9.00: status: { DRDY }
Mär 30 23:37:04 Clockwork kernel: ata9.00: failed command: WRITE FPDMA QUEUED
Mär 30 23:37:04 Clockwork kernel: ata9.00: cmd 61/08:48:38:39:5f/00:00:07:00:00/40 tag 9 ncq dma 4096 out
res 40/00:44:f8:38:5f/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
Mär 30 23:37:04 Clockwork kernel: ata9.00: status: { DRDY }
Mär 30 23:37:04 Clockwork kernel: ata9.00: failed command: WRITE FPDMA QUEUED
Mär 30 23:37:04 Clockwork kernel: ata9.00: cmd 61/28:50:48:39:5f/00:00:07:00:00/40 tag 10 ncq dma 20480 out
res 40/00:44:f8:38:5f/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
Mär 30 23:37:04 Clockwork kernel: ata9.00: status: { DRDY }
Mär 30 23:37:04 Clockwork kernel: ata9.00: failed command: WRITE FPDMA QUEUED
Mär 30 23:37:04 Clockwork kernel: ata9.00: cmd 61/08:58:80:39:5f/00:00:07:00:00/40 tag 11 ncq dma 4096 out
res 40/00:44:f8:38:5f/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
Mär 30 23:37:04 Clockwork kernel: ata9.00: status: { DRDY }
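For anyone reading the ATA lines above: here is a minimal Python sketch that decodes the SErr value from the exception line into the flag names the kernel prints on the SError line, and counts the failed commands per device. The bit-to-name table is my assumption, based on the SATA SError register layout as used by the kernel's libata error handler. decode_serr and count_failed_commands are hypothetical helper names, not part of any tool. It only restates what the log already says and is not a diagnosis.

```python
import re
from collections import Counter

# Assumed mapping of SError register bits to the short names printed by
# the kernel's libata error handler (not taken from this post).
SERR_BITS = {
    0: "RecovData", 1: "RecovComm", 8: "UnrecovData", 9: "Persist",
    10: "Proto", 11: "HostInt", 16: "PHYRdyChg", 17: "PHYInt",
    18: "CommWake", 19: "10B8B", 20: "Dispar", 21: "BadCRC",
    22: "Handshk", 23: "LinkSeq", 24: "TrStaTrns", 25: "UnrecFIS",
    26: "DevExch",
}


def decode_serr(value: int) -> list[str]:
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


# Matches lines such as "... kernel: ata9.00: failed command: WRITE FPDMA QUEUED"
FAILED_CMD = re.compile(r"kernel: (ata\d+(?:\.\d+)?): failed command: (.+)$")


def count_failed_commands(journal_text: str) -> Counter:
    """Count 'failed command' lines per (ATA device, command name)."""
    counts = Counter()
    for line in journal_text.splitlines():
        match = FAILED_CMD.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts


# Three lines copied from the journal output in this post.
sample = """\
Mär 30 23:37:04 Clockwork kernel: ata9.00: exception Emask 0x10 SAct 0xffff00 SErr 0x4c0000 action 0x6 frozen
Mär 30 23:37:04 Clockwork kernel: ata9.00: failed command: WRITE FPDMA QUEUED
Mär 30 23:37:04 Clockwork kernel: ata9.00: failed command: WRITE FPDMA QUEUED
"""

serr = int(re.search(r"SErr (0x[0-9a-fA-F]+)", sample).group(1), 16)
print(decode_serr(serr))  # ['CommWake', '10B8B', 'Handshk']
print(count_failed_commands(sample))
```

To run it on a full log, the output of journalctl -b -1 -p3 could be saved to a file and its contents passed to count_failed_commands instead of sample.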
BTRFS is completely new to me. I mostly used ext4 before. Maybe I'm a n00b here xD
I would also rule out GPU (driver) issues, since I've never had any glitches and, as mentioned, I can push the GPU hard (including in games, except Satisfactory) without issues.
My inxi:
System:
Kernel: 5.17.1-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 11.2.0
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=be3ca4bc-7720-459e-8370-bd92450c06b0 rw rootflags=subvol=@
quiet quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0
loglevel=3
Desktop: KDE Plasma v: 5.24.4 tk: Qt v: 5.15.3 info: latte-dock
wm: kwin_x11 vt: 1 dm: SDDM Distro: Garuda Linux base: Arch Linux
Machine:
Type: Desktop Mobo: Micro-Star model: MPG X570 GAMING PLUS (MS-7C37) v: 2.0
serial: <superuser required> UEFI: American Megatrends LLC. v: A.F0
date: 12/16/2021
CPU:
Info: model: AMD Ryzen 7 5800X bits: 64 type: MT MCP arch: Zen 3
family: 0x19 (25) model-id: 0x21 (33) stepping: 0 microcode: 0xA201016
Topology: cpus: 1x cores: 8 tpc: 2 threads: 16 smt: enabled cache:
L1: 512 KiB desc: d-8x32 KiB; i-8x32 KiB L2: 4 MiB desc: 8x512 KiB
L3: 32 MiB desc: 1x32 MiB
Speed (MHz): avg: 3674 high: 4505 min/max: 2200/4850 boost: enabled
scaling: driver: acpi-cpufreq governor: performance cores: 1: 3662 2: 3682
3: 3607 4: 3599 5: 3599 6: 3621 7: 3641 8: 3599 9: 3641 10: 3673 11: 3609
12: 3613 13: 3588 14: 3552 15: 4505 16: 3598 bogomips: 121600
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Vulnerabilities:
Type: itlb_multihit status: Not affected
Type: l1tf status: Not affected
Type: mds status: Not affected
Type: meltdown status: Not affected
Type: spec_store_bypass
mitigation: Speculative Store Bypass disabled via prctl
Type: spectre_v1
mitigation: usercopy/swapgs barriers and __user pointer sanitization
Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, IBRS_FW,
STIBP: always-on, RSB filling
Type: srbds status: Not affected
Type: tsx_async_abort status: Not affected
Graphics:
Device-1: AMD Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] vendor: ASUSTeK
driver: amdgpu v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16 ports:
active: DP-2,HDMI-A-1 empty: DP-1,DP-3 bus-ID: 2f:00.0 chip-ID: 1002:73bf
class-ID: 0300
Display: x11 server: X.Org v: 1.21.1.3 compositor: kwin_x11 driver: X:
loaded: amdgpu unloaded: modesetting,radeon alternate: fbdev,vesa
gpu: amdgpu display-ID: :0 screens: 1
Screen-1: 0 s-res: 3840x1080 s-dpi: 96 s-size: 1016x285mm (40.00x11.22")
s-diag: 1055mm (41.54")
Monitor-1: DP-2 mapped: DisplayPort-1 pos: primary,left model: Asus VS248
serial: <filter> built: 2012 res: 1920x1080 hz: 60 dpi: 92 gamma: 1.2
size: 531x299mm (20.91x11.77") diag: 609mm (24") ratio: 16:9 modes:
max: 1920x1080 min: 720x400
Monitor-2: HDMI-A-1 mapped: HDMI-A-0 pos: primary,right model: Asus VS248
serial: <filter> built: 2012 res: 1920x1080 hz: 60 dpi: 92 gamma: 1.2
size: 531x299mm (20.91x11.77") diag: 609mm (24") ratio: 16:9 modes:
max: 1920x1080 min: 720x400
OpenGL: renderer: AMD Radeon RX 6800 XT (sienna_cichlid LLVM 13.0.1 DRM
3.44 5.17.1-zen1-1-zen)
v: 4.6 Mesa 22.0.0 direct render: Yes
Audio:
Device-1: AMD Navi 21 HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT]
driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
bus-ID: 2f:00.1 chip-ID: 1002:ab28 class-ID: 0403
Device-2: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI
driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
bus-ID: 31:00.4 chip-ID: 1022:1487 class-ID: 0403
Sound Server-1: ALSA v: k5.17.1-zen1-1-zen running: yes
Sound Server-2: PulseAudio v: 15.0 running: no
Sound Server-3: PipeWire v: 0.3.49 running: yes
Network:
Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
vendor: Micro-Star MSI X570-A PRO driver: r8169 v: kernel pcie: gen: 1
speed: 2.5 GT/s lanes: 1 port: d000 bus-ID: 27:00.0 chip-ID: 10ec:8168
class-ID: 0200
IF: enp39s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Drives:
Local Storage: total: 1.36 TiB used: 605.67 GiB (43.3%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Crucial model: CT1000P5SSD8
size: 931.51 GiB block-size: physical: 512 B logical: 512 B
speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter> rev: P4CR311
temp: 34.9 C scheme: GPT
ID-2: /dev/sda maj-min: 8:0 vendor: Crucial model: CT500MX500SSD1
size: 465.76 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: 023 scheme: GPT
Partition:
ID-1: / raw-size: 465.46 GiB size: 465.46 GiB (100.00%)
used: 86.35 GiB (18.6%) fs: btrfs dev: /dev/sda2 maj-min: 8:2
ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
used: 576 KiB (0.2%) fs: vfat dev: /dev/sda1 maj-min: 8:1
ID-3: /home raw-size: 465.46 GiB size: 465.46 GiB (100.00%)
used: 86.35 GiB (18.6%) fs: btrfs dev: /dev/sda2 maj-min: 8:2
ID-4: /var/log raw-size: 465.46 GiB size: 465.46 GiB (100.00%)
used: 86.35 GiB (18.6%) fs: btrfs dev: /dev/sda2 maj-min: 8:2
ID-5: /var/tmp raw-size: 465.46 GiB size: 465.46 GiB (100.00%)
used: 86.35 GiB (18.6%) fs: btrfs dev: /dev/sda2 maj-min: 8:2
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
ID-1: swap-1 type: zram size: 62.78 GiB used: 2.5 MiB (0.0%)
priority: 100 dev: /dev/zram0
Sensors:
System Temperatures: cpu: 48.5 C mobo: N/A gpu: amdgpu temp: 33.0 C
mem: 32.0 C
Fan Speeds (RPM): N/A gpu: amdgpu fan: 0
Info:
Processes: 348 Uptime: 39m wakeups: 0 Memory: 62.78 GiB
used: 3.75 GiB (6.0%) Init: systemd v: 250 tool: systemctl Compilers:
gcc: 11.2.0 clang: 13.0.1 Packages: pacman: 1770 lib: 516 Shell: fish
v: 3.3.1 default: Bash v: 5.1.16 running-in: konsole inxi: 3.3.14
Garuda (2.5.6-2):
System install date: 2022-03-24
Last full system update: 2022-03-30
Is partially upgraded: No
Relevant software: NetworkManager
Windows dual boot: Probably (Run as root to verify)
Snapshots: Snapper
Failed units: bluetooth-autoconnect.service
I'd be grateful for any advice or hints.
Atb and thanks in advance!
Bruce