AMD gpu crashes when waking from suspend after low memory situation occurs

How to reproduce (AMD gpus only):
On a Garuda install

  1. Install Mprime
  2. Select test #3 (large FFTs) (simulates a low memory condition)
  3. Suspend system
  4. Wake system up and observe that your GPU is not working anymore
  5. Hard reset system

Tested on AMD rx 580

Here are the related messages to the issue

Oct 24 03:11:04 RyzenCore kernel:  amdgpu_ttm_tt_populate+0x39/0x90 [amdgpu 6ba956e1da631d21da560bce53ea1262da29f35f]
Oct 24 03:11:04 RyzenCore kernel:  amdgpu_device_suspend+0xb0/0x150 [amdgpu 6ba956e1da631d21da560bce53ea1262da29f35f]
Oct 24 03:11:04 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: PCI CONFIG reset
Oct 24 03:11:04 RyzenCore kernel: [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
Oct 24 03:11:14 RyzenCore kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=17362, emitted seq=17365
Oct 24 03:11:14 RyzenCore kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process latte-dock pid 6552 thread latte-dock:cs0 pid 6633
Oct 24 03:11:14 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
Oct 24 03:11:15 RyzenCore kernel: amdgpu: cp is busy, skip halt cp
Oct 24 03:11:15 RyzenCore kernel: amdgpu: rlc is busy, skip halt rlc
Oct 24 03:11:15 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: BACO reset
Oct 24 03:11:16 RyzenCore kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.