Hi all,
This is an update to my previous post about Garuda hanging. The hang appears to be related to the driver for my video card (amdgpu) or to the kernel. I am not sure of the exact cause, but from quick testing the card works with an older kernel: the Ubuntu test used kernel 5.8, while the log below is from 5.11.5-zen1-1-zen. Below is the log information (thank you @petsam). I can see the obvious from it: the SMU firmware fails to load, the sdma rings time out, and the kernel then hits a NULL pointer dereference inside amdgpu. My question is what the best way out is. Should I install another kernel and test, or go with the amdgpu-pro driver from the AUR? I would like to avoid the latter. Perhaps someone has a simpler solution.
-- Journal begins at Sat 2021-03-13 11:18:44 EST, ends at Sat 2021-03-13 15:53:03 EST. --
Mar 13 12:42:09 altair lightdm[1442]: gkr-pam: unable to locate daemon control file
Mar 13 12:42:11 altair nmbd[1539]: [2021/03/13 12:42:11.374536, 0] ../../lib/util/become_daemon.c:135(daemon_ready)
Mar 13 12:42:11 altair nmbd[1539]: daemon_ready: daemon 'nmbd' finished starting up and ready to serve connections
Mar 13 12:42:11 altair smbd[1541]: [2021/03/13 12:42:11.423338, 0] ../../lib/util/become_daemon.c:135(daemon_ready)
Mar 13 12:42:11 altair smbd[1541]: daemon_ready: daemon 'smbd' finished starting up and ready to serve connections
Mar 13 12:42:40 altair pulseaudio[1721]: GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Mar 13 13:00:47 altair kernel: amdgpu: SMU load firmware failed
Mar 13 13:00:47 altair kernel: amdgpu: fw load failed
Mar 13 13:00:47 altair kernel: amdgpu: smu firmware loading failed
Mar 13 13:00:47 altair kernel: amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_resume failed (-22).
Mar 13 13:00:47 altair kernel: amdgpu: Move buffer fallback to memcpy unavailable
Mar 13 13:00:47 altair kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Mar 13 13:00:47 altair kernel: amdgpu: Move buffer fallback to memcpy unavailable
Mar 13 13:00:47 altair kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Mar 13 13:00:47 altair kernel: snd_hda_intel 0000:01:00.1: CORB reset timeout#1, CORBRP = 0
Mar 13 13:00:57 altair kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=241, emitted seq=243
Mar 13 13:00:57 altair kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=601, emitted seq=603
Mar 13 13:00:57 altair kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Mar 13 13:00:57 altair kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Mar 13 13:00:57 altair kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
Mar 13 13:00:57 altair kernel: #PF: supervisor read access in kernel mode
Mar 13 13:00:57 altair kernel: #PF: error_code(0x0000) - not-present page
Mar 13 13:00:57 altair kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Mar 13 13:00:57 altair kernel: CPU: 1 PID: 14043 Comm: kworker/1:0 Not tainted 5.11.5-zen1-1-zen #1
Mar 13 13:00:57 altair kernel: Hardware name: Supermicro C7Z170-M/C7Z170-M, BIOS 2.2 01/07/2019
Mar 13 13:00:57 altair kernel: Workqueue: events drm_sched_job_timedout [gpu_sched]
Mar 13 13:00:57 altair kernel: RIP: 0010:kernel_queue_uninit+0xd/0xf0 [amdgpu]
Mar 13 13:00:57 altair kernel: Code: 28 48 88 c0 e8 a4 83 01 d9 e9 78 fe ff ff e8 0a 36 66 d9 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 47 10 48 89 fd <8b> 50 28 83 fa 02 74 78 83 fa 03 0f 84 b1 00 00 00 48 8b 7f 08 4c
Mar 13 13:00:57 altair kernel: RSP: 0018:ffffad174fe17d50 EFLAGS: 00010246
Mar 13 13:00:57 altair kernel: RAX: 0000000000000000 RBX: ffff93ac8ad75000 RCX: 0000000080800079
Mar 13 13:00:57 altair kernel: RDX: 000000008080007a RSI: 0000000000000001 RDI: ffff93ac8c27ad80
Mar 13 13:00:57 altair kernel: RBP: ffff93ac8c27ad80 R08: 0000000000000001 R09: 0000000000000001
Mar 13 13:00:57 altair kernel: R10: ffff93ac8c278040 R11: 0000000000000000 R12: ffff93ac8ad750d0
Mar 13 13:00:57 altair kernel: R13: ffff93ac8ad20000 R14: ffff93ac8142c000 R15: ffff93ac8142c0c8
Mar 13 13:00:57 altair kernel: FS: 0000000000000000(0000) GS:ffff93b3cec80000(0000) knlGS:0000000000000000
Mar 13 13:00:57 altair kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 13 13:00:57 altair kernel: CR2: 0000000000000028 CR3: 0000000248c10001 CR4: 00000000003706e0
Mar 13 13:00:57 altair kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 13 13:00:57 altair kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 13 13:00:57 altair kernel: Call Trace:
Mar 13 13:00:57 altair kernel: stop_cpsch+0xa0/0xc0 [amdgpu]
Mar 13 13:00:57 altair kernel: kgd2kfd_pre_reset+0x56/0x80 [amdgpu]
Mar 13 13:00:57 altair kernel: amdgpu_device_gpu_recover.cold+0x36e/0x98a [amdgpu]
Mar 13 13:00:57 altair kernel: amdgpu_job_timedout+0x121/0x140 [amdgpu]
Mar 13 13:00:57 altair kernel: drm_sched_job_timedout+0x64/0xe0 [gpu_sched]
Mar 13 13:00:57 altair kernel: process_one_work+0x214/0x3e0
Mar 13 13:00:57 altair kernel: worker_thread+0x4d/0x470
Mar 13 13:00:57 altair kernel: ? flush_delayed_work+0x40/0x40
Mar 13 13:00:57 altair kernel: kthread+0x181/0x1b0
Mar 13 13:00:57 altair kernel: ? __kthread_init_worker+0x50/0x50
Mar 13 13:00:57 altair kernel: ret_from_fork+0x22/0x30
Mar 13 13:00:57 altair kernel: Modules linked in: ufs hfsplus hfs minix vfat msdos fat jfs xfs ext4 crc16 mbcache jbd2 dm_mod zram rfkill intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg irqbypass soundwire_intel crct10dif_pclmul crc32_pclmul soundwire_generic_allocation ghash_clmulni_intel soundwire_cadence snd_hda_codec snd_hda_core snd_hwdep soundwire_bus iTCO_wdt aesni_intel intel_pmc_bxt ee1004 iTCO_vendor_support crypto_simd cryptd glue_helper rapl snd_soc_core intel_cstate snd_compress e1000e intel_uncore psmouse ac97_bus snd_pcm_dmaengine i2c_i801 i2c_smbus joydev snd_pcm mousedev snd_timer snd soundcore intel_pch_thermal acpi_pad mac_hid uinput crypto_user fuse bpf_preload ip_tables x_tables usbhid btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq crc32c_intel serio_raw sr_mod cdrom xhci_pci xhci_pci_renesas nouveau mxm_wmi wmi radeon
Mar 13 13:00:57 altair kernel: i915 video intel_agp intel_gtt amdgpu gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm agpgart
Mar 13 13:00:57 altair kernel: CR2: 0000000000000028
Mar 13 13:00:57 altair kernel: RIP: 0010:kernel_queue_uninit+0xd/0xf0 [amdgpu]
Mar 13 13:00:57 altair kernel: Code: 28 48 88 c0 e8 a4 83 01 d9 e9 78 fe ff ff e8 0a 36 66 d9 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 47 10 48 89 fd <8b> 50 28 83 fa 02 74 78 83 fa 03 0f 84 b1 00 00 00 48 8b 7f 08 4c
Mar 13 13:00:57 altair kernel: RSP: 0018:ffffad174fe17d50 EFLAGS: 00010246
Mar 13 13:00:57 altair kernel: RAX: 0000000000000000 RBX: ffff93ac8ad75000 RCX: 0000000080800079
Mar 13 13:00:57 altair kernel: RDX: 000000008080007a RSI: 0000000000000001 RDI: ffff93ac8c27ad80
Mar 13 13:00:57 altair kernel: RBP: ffff93ac8c27ad80 R08: 0000000000000001 R09: 0000000000000001
Mar 13 13:00:57 altair kernel: R10: ffff93ac8c278040 R11: 0000000000000000 R12: ffff93ac8ad750d0
Mar 13 13:00:57 altair kernel: R13: ffff93ac8ad20000 R14: ffff93ac8142c000 R15: ffff93ac8142c0c8
Mar 13 13:00:57 altair kernel: FS: 0000000000000000(0000) GS:ffff93b3cec80000(0000) knlGS:0000000000000000
Mar 13 13:00:57 altair kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 13 13:00:57 altair kernel: CR2: 0000000000000028 CR3: 0000000248c10001 CR4: 00000000003706e0
Mar 13 13:00:57 altair kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 13 13:00:57 altair kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 13 13:01:08 altair kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=601, emitted seq=603
Mar 13 13:01:08 altair kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
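In case it helps anyone comparing runs across kernels, here is a minimal Python sketch that pulls only the amdgpu failure lines out of a journal dump like the one above. The function name `amdgpu_errors`, the regex, and the keyword list (fail, ERROR, timeout) are my own assumptions for illustration, not anything provided by journalctl or the driver, and the sample lines are copied from the log above.

```python
import re

# Assumed layout of a short-format journal line:
# "Mon DD HH:MM:SS host unit[pid]: message" (the [pid] part is optional).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<unit>[^:\[]+)(?:\[\d+\])?:\s(?P<msg>.*)$"
)

def amdgpu_errors(lines):
    """Return (timestamp, message) pairs for kernel lines that
    mention amdgpu and contain a failure keyword."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        # Skip lines that do not parse or that come from other units.
        if not m or m.group("unit") != "kernel":
            continue
        msg = m.group("msg")
        if "amdgpu" in msg and re.search(r"fail|ERROR|timeout", msg, re.IGNORECASE):
            hits.append((m.group("ts"), msg))
    return hits

# Sample lines taken from the log in this post.
sample = [
    "Mar 13 12:42:09 altair lightdm[1442]: gkr-pam: unable to locate daemon control file",
    "Mar 13 13:00:47 altair kernel: amdgpu: SMU load firmware failed",
    "Mar 13 13:00:57 altair kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=241, emitted seq=243",
    "Mar 13 13:00:57 altair kernel: process_one_work+0x214/0x3e0",
]

for ts, msg in amdgpu_errors(sample):
    print(ts, "|", msg)
```

Feeding it the saved output from each kernel you test should make it easy to see whether the SMU firmware failure still shows up first.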