How to reproduce (AMD GPUs only), on a Garuda install:
- Install mprime
- Select test #3 (large FFTs), which simulates a low-memory condition
- Suspend the system
- Wake the system up and observe that the GPU no longer works
- Hard reset the system
Tested on an AMD RX 580.
Here are the kernel log messages related to the issue:
Oct 24 03:11:04 RyzenCore kernel: amdgpu_ttm_tt_populate+0x39/0x90 [amdgpu 6ba956e1da631d21da560bce53ea1262da29f35f]
Oct 24 03:11:04 RyzenCore kernel: amdgpu_device_suspend+0xb0/0x150 [amdgpu 6ba956e1da631d21da560bce53ea1262da29f35f]
Oct 24 03:11:04 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: PCI CONFIG reset
Oct 24 03:11:04 RyzenCore kernel: [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
Oct 24 03:11:14 RyzenCore kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=17362, emitted seq=17365
Oct 24 03:11:14 RyzenCore kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process latte-dock pid 6552 thread latte-dock:cs0 pid 6633
Oct 24 03:11:14 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
Oct 24 03:11:15 RyzenCore kernel: amdgpu: cp is busy, skip halt cp
Oct 24 03:11:15 RyzenCore kernel: amdgpu: rlc is busy, skip halt rlc
Oct 24 03:11:15 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: BACO reset
Oct 24 03:11:16 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
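If anyone wants to check their own logs for the same pattern, here is a rough Python sketch that pulls the amdgpu *ERROR* lines and reset events out of kernel log text in the format shown above. The function name summarize_amdgpu_log and the SAMPLE constant are made up for illustration, and the matching rules are only an assumption based on the lines in this report, so other driver versions may word their messages differently. After a hard reset, the text from the failed session can usually be obtained with journalctl -k -b -1 (kernel messages from the previous boot).

```python
import re

# Matches journal-style kernel lines such as
# "Oct 24 03:11:14 RyzenCore kernel: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+kernel:\s+(.*)$")


def summarize_amdgpu_log(text):
    """Collect amdgpu *ERROR* lines and reset events from kernel log text.

    Hypothetical helper: the classification is a guess based on the
    messages quoted in this report, not on any official message list.
    """
    errors, resets = [], []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a kernel line in the expected format
        timestamp, _host, message = match.groups()
        if "amdgpu" not in message:
            continue  # only the amdgpu driver is of interest here
        if "*ERROR*" in message:
            errors.append((timestamp, message))
        elif "reset" in message.lower():
            resets.append((timestamp, message))
    return errors, resets


# A few lines copied from the log in this report.
SAMPLE = """\
Oct 24 03:11:04 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: PCI CONFIG reset
Oct 24 03:11:04 RyzenCore kernel: [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
Oct 24 03:11:14 RyzenCore kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=17362, emitted seq=17365
Oct 24 03:11:14 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
Oct 24 03:11:15 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: BACO reset
Oct 24 03:11:16 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
"""

if __name__ == "__main__":
    found_errors, found_resets = summarize_amdgpu_log(SAMPLE)
    for stamp, msg in found_errors:
        print("ERROR", stamp, msg)
    for stamp, msg in found_resets:
        print("RESET", stamp, msg)
```

To run it against a real log, save the journalctl output to a file and pass the file contents to summarize_amdgpu_log instead of SAMPLE. A ring gfx timeout followed by a reset sequence shortly after resume would suggest the same failure as described here.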