Radeon eGPU freezes the screen and logs me out

As a newcomer to Linux, I have almost fully transitioned to this wonderful environment, but there is one small problem I cannot solve. Can I count on your help?

I have:
Garuda Linux x86_64
Host: 82UT (Yoga Slim 7 Pro 14IAH7)
Kernel: Linux 6.9.10-zen1-1-zen
DE: GNOME 46.3.1
GPU: Intel Iris Xe Graphics @ 1.30 GHz [Integrated]
GPU: AMD Radeon RX 6600 [Discrete]

Problem: While using applications such as Brave, Thunderbird, the terminal, or an office suite, the screen freezes from time to time, every application closes, and I am dropped back to the login screen. The freezes happen at irregular intervals. There is no problem when I am playing a game from Steam.

What I have already done:

  1. Set the Intel card to handle everything by default and AMD only during games.
  2. Switched kernels from zen to the regular one and to the amd one. By the way, the amd kernel doesn’t detect the AMD card at all :slight_smile:
  3. Tried changing the environment from GNOME to KDE
  4. Played around with the GRUB configuration, e.g.:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amdgpu.runpm=0 amdgpu.exp_hw_support=1 pcie_aspm=off pci=noaer"
  5. Disabled starship
  6. Disabled all GNOME extensions
  7. Disabled power saving for the Thunderbolt port and the entire eGPU
  8. Obvious things like updating everything possible have also been done.
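
A side note on the GRUB step: the inxi output further down lists only `quiet splash pci=noaer loglevel=3 ibt=off` as kernel parameters, so it may be worth checking which options actually reached the running kernel (on Arch-based systems the GRUB config has to be regenerated, e.g. with `grub-mkconfig -o /boot/grub/grub.cfg`, before an edit takes effect). Below is a minimal bash sketch for that check. The function names `has_param` and `report_params` are made up for this example, and the parameter list is simply copied from the GRUB line above.

```shell
#!/usr/bin/env bash
# has_param CMDLINE PARAM
# Succeeds if PARAM appears as a whole, space-separated word in CMDLINE.
has_param() {
    local cmdline=$1 param=$2 word
    for word in $cmdline; do
        if [ "$word" = "$param" ]; then
            return 0
        fi
    done
    return 1
}

# report_params [CMDLINE]
# Prints, for each parameter from the GRUB line in this post, whether it is
# present. Reads /proc/cmdline when no command line is passed in.
report_params() {
    local cmdline=${1:-$(cat /proc/cmdline)} p
    for p in amdgpu.runpm=0 amdgpu.exp_hw_support=1 pcie_aspm=off pci=noaer; do
        if has_param "$cmdline" "$p"; then
            echo "active:  $p"
        else
            echo "missing: $p"
        fi
    done
}
```

Calling `report_params` with no arguments (from bash, not fish) checks the running kernel; passing a string checks that string instead, which is handy for comparing against a pasted inxi line.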

I am pasting the error log:

lip 23 18:33:54 potwor systemd[2057]: Started VTE child process 78309 launched by gnome-terminal-server process 3527.
lip 23 18:33:57 potwor sudo[78371]: pam_systemd_home(sudo:auth): New sd-bus connection (system-bus-pam-systemd-home-78371) opened.
lip 23 18:34:00 potwor sudo[78371]:  norfear : TTY=pts/2 ; PWD=/home/norfear ; USER=root ; TSID=000019 ; COMMAND=/usr/bin/pacman -S gnome->
lip 23 18:34:00 potwor sudo[78371]: pam_unix(sudo:session): session opened for user root(uid=0) by norfear(uid=1000)
lip 23 18:34:04 potwor sudo[78371]: pam_unix(sudo:session): session closed for user root
lip 23 18:34:12 potwor sudo[78470]: pam_systemd_home(sudo:account): New sd-bus connection (system-bus-pam-systemd-home-78470) opened.
lip 23 18:34:12 potwor sudo[78470]:  norfear : TTY=pts/2 ; PWD=/home/norfear ; USER=root ; TSID=00001A ; COMMAND=/usr/bin/pacman -S gnome->
lip 23 18:34:12 potwor sudo[78470]: pam_unix(sudo:session): session opened for user root(uid=0) by norfear(uid=1000)
lip 23 18:34:15 potwor sudo[78470]: pam_unix(sudo:session): session closed for user root
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32771)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:  in process gnome-shell pid 2343 thread gnome-shel:cs0 pid 2389
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:   in page starting at address 0x0000800a44647000 from client 0x1b (UTCL2)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201031
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          MORE_FAULTS: 0x1
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          WALKER_ERROR: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          MAPPING_ERROR: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          RW: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32771)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:  in process gnome-shell pid 2343 thread gnome-shel:cs0 pid 2389
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:   in page starting at address 0x0000800a44612000 from client 0x1b (UTCL2)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201031
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          MORE_FAULTS: 0x1
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          WALKER_ERROR: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          MAPPING_ERROR: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          RW: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32771)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:  in process gnome-shell pid 2343 thread gnome-shel:cs0 pid 2389
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:   in page starting at address 0x0000800a4464a000 from client 0x1b (UTCL2)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201031
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          MORE_FAULTS: 0x1
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          WALKER_ERROR: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          MAPPING_ERROR: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          RW: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32771)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:  in process gnome-shell pid 2343 thread gnome-shel:cs0 pid 2389
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:   in page starting at address 0x0000800a44610000 from client 0x1b (UTCL2)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201030
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          MORE_FAULTS: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          WALKER_ERROR: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          MAPPING_ERROR: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu:          RW: 0x0
lip 23 18:35:51 potwor kernel: amdgpu 0000:2f:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32771)
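
Since the same fault block repeats many times, a short summary can be easier to read than the raw log. The following is a rough bash/awk sketch, assuming the kernel messages look like the lines pasted above; `count_faults` is a name invented for this example. It counts the `[gfxhub] page fault` lines and tallies the process named in each `in process ...` line.

```shell
#!/usr/bin/env bash
# count_faults: read kernel log lines on stdin, print the number of amdgpu
# page faults and how often each process is named in an "in process" line.
count_faults() {
    awk '
        /amdgpu: \[gfxhub\] page fault/ { faults++ }
        /amdgpu: +in process / {
            # The process name is the word right after "in process".
            for (i = 1; i < NF; i++)
                if ($i == "in" && $(i + 1) == "process") procs[$(i + 2)]++
        }
        END {
            printf "page faults: %d\n", faults
            for (p in procs) printf "%s: %d\n", p, procs[p]
        }
    '
}

# Typical use on a live system (kernel messages of the current boot):
#   journalctl -k -b | count_faults
```

In the excerpt above every fault names gnome-shell, which fits the symptom of the whole session dropping back to the login screen.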

And more information from inxi

System:
  Kernel: 6.10.0-zen1-2-zen arch: x86_64 bits: 64 compiler: gcc v: 14.1.1
    clocksource: tsc avail: acpi_pm
    parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
    root=UUID=f6fe44ad-75a1-48d3-b9fc-7bcd21b1126f rw rootflags=subvol=@
    quiet splash pci=noaer loglevel=3 ibt=off
  Desktop: GNOME v: 46.3.1 tk: GTK v: 3.24.43 wm: gnome-shell
    tools: gsd-screensaver-proxy dm: GDM v: 46.2 Distro: Garuda base: Arch Linux
Machine:
  Type: Laptop System: LENOVO product: 82UT v: Yoga Slim 7 Pro 14IAH7
    serial: <superuser required> Chassis: type: 10 v: Yoga Slim 7 Pro 14IAH7
    serial: <superuser required>
  Mobo: LENOVO model: LNVNB161216 v: SDK0T76463 WIN
    serial: <superuser required> part-nu: LENOVO_MT_82UT_BU_idea_FM_Yoga Slim 7
    Pro 14IAH7 uuid: <superuser required> UEFI: LENOVO v: KRCN14WW
    date: 11/02/2022
Battery:
  ID-1: BAT0 charge: 61.2 Wh (100.0%) condition: 61.2/61.0 Wh (100.4%)
    volts: 16.9 min: 15.4 model: Sunwoda L19D4PH3 type: Li-poly serial: <filter>
    status: full cycles: 21
  ID-2: hidpp_battery_0 charge: 89% condition: N/A volts: 4.1 min: N/A
    model: Logitech G903 LIGHTSPEED Wireless Gaming Mouse w/ HERO type: N/A
    serial: <filter> status: discharging
CPU:
  Info: model: 12th Gen Intel Core i5-12500H bits: 64 type: MST AMCP
    arch: Alder Lake gen: core 12 level: v3 note: check built: 2021+
    process: Intel 7 (10nm ESF) family: 6 model-id: 0x9A (154) stepping: 3
    microcode: 0x433
  Topology: cpus: 1x cores: 12 mt: 4 tpc: 2 st: 8 threads: 16 smt: enabled
    cache: L1: 1.1 MiB desc: d-8x32 KiB, 4x48 KiB; i-4x32 KiB, 8x64 KiB
    L2: 9 MiB desc: 4x1.2 MiB, 2x2 MiB L3: 18 MiB desc: 1x18 MiB
  Speed (MHz): avg: 587 high: 2081 min/max: 400/4500:3300 scaling:
    driver: intel_pstate governor: powersave cores: 1: 400 2: 400 3: 400 4: 400
    5: 400 6: 400 7: 400 8: 400 9: 400 10: 400 11: 400 12: 2081 13: 400
    14: 1724 15: 400 16: 400 bogomips: 99532
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities: <filter>
Graphics:
  Device-1: Intel Alder Lake-P GT2 [Iris Xe Graphics] vendor: Lenovo
    driver: i915 v: kernel alternate: xe arch: Gen-12.2 process: Intel 10nm
    built: 2021-22+ ports: active: eDP-1 empty: DP-1, DP-2, DP-3, DP-4
    bus-ID: 00:02.0 chip-ID: 8086:46a6 class-ID: 0300
  Device-2: AMD Navi 23 [Radeon RX 6600/6600 XT/6600M]
    vendor: Micro-Star MSI driver: amdgpu v: kernel arch: RDNA-2 code: Navi-2x
    process: TSMC n7 (7nm) built: 2020-22 pcie: gen: 4 speed: 16 GT/s
    lanes: 16 ports: active: HDMI-A-1 empty: DP-5, DP-6, DP-7, Writeback-1
    bus-ID: 2f:00.0 chip-ID: 1002:73ff class-ID: 0300
  Device-3: Chicony Integrated Camera driver: uvcvideo type: USB rev: 2.0
    speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 3-8:2 chip-ID: 04f2:b756
    class-ID: fe01 serial: <filter>
  Display: wayland server: X.org v: 1.21.1.13 with: Xwayland v: 24.1.1
    compositor: gnome-shell driver: gpu: amdgpu,i915 display-ID: 0
  Monitor-1: HDMI-A-1 model: Philips PHL 272B7QPJ serial: <filter>
    built: 2018 res: 2560x1440 dpi: 109 gamma: 1.2 size: 597x336mm (23.5x13.23")
    diag: 685mm (27") ratio: 16:9 modes: max: 2560x1440 min: 720x400
  Monitor-2: eDP-1 model: BOE Display 0x0931 built: 2020 res: 2240x1400
    dpi: 188 gamma: 1.2 size: 302x189mm (11.89x7.44") diag: 356mm (14")
    ratio: 16:10 modes: 2240x1400
  API: EGL v: 1.5 hw: drv: intel iris drv: amd radeonsi platforms: device: 0
    drv: iris device: 1 drv: radeonsi device: 2 drv: swrast gbm: drv: kms_swrast
    surfaceless: drv: iris wayland: drv: iris x11: drv: iris
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: intel mesa v: 24.1.4-arch1.2
    glx-v: 1.4 direct-render: yes renderer: Mesa Intel Graphics (ADL GT2)
    device-ID: 8086:46a6 memory: 7.49 GiB unified: yes display-ID: :0.0
  API: Vulkan v: 1.3.279 layers: 7 device: 0 type: integrated-gpu name: Intel
    Graphics (ADL GT2) driver: mesa intel v: 24.1.4-arch1.2
    device-ID: 8086:46a6 surfaces: xcb,xlib,wayland device: 1
    type: discrete-gpu name: AMD Radeon RX 6600 (RADV NAVI23)
    driver: mesa radv v: 24.1.4-arch1.2 device-ID: 1002:73ff
    surfaces: xcb,xlib,wayland device: 2 type: cpu name: llvmpipe (LLVM
    18.1.8 256 bits) driver: mesa llvmpipe v: 24.1.4-arch1.2 (LLVM 18.1.8)
    device-ID: 10005:0000 surfaces: xcb,xlib,wayland
Audio:
  Device-1: Intel Alder Lake PCH-P High Definition Audio vendor: Lenovo
    driver: sof-audio-pci-intel-tgl alternate: snd_hda_intel, snd_soc_avs,
    snd_sof_pci_intel_tgl bus-ID: 00:1f.3 chip-ID: 8086:51c8 class-ID: 0401
  Device-2: AMD Navi 21/23 HDMI/DP Audio driver: snd_hda_intel v: kernel
    pcie: gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 2f:00.1 chip-ID: 1002:ab28
    class-ID: 0403
  API: ALSA v: k6.10.0-zen1-2-zen status: kernel-api tools: N/A
  Server-1: PipeWire v: 1.2.1 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Alder Lake-P PCH CNVi WiFi driver: iwlwifi v: kernel
    bus-ID: 00:14.3 chip-ID: 8086:51f0 class-ID: 0280
  IF: wlp0s20f3 state: up mac: <filter>
  Device-2: ASIX AX88179 Gigabit Ethernet driver: ax88179_178a type: USB
    rev: 3.0 speed: 5 Gb/s lanes: 1 mode: 3.2 gen-1x1 bus-ID: 6-1.2:4
    chip-ID: 0b95:1790 class-ID: ff00 serial: <filter>
  IF: enp48s0u1u2 state: down mac: <filter>
  Info: services: NetworkManager, systemd-timesyncd, wpa_supplicant
Bluetooth:
  Device-1: Intel AX211 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 3-10:3 chip-ID: 8087:0033
    class-ID: e001
  Report: btmgmt ID: hci0 rfk-id: 2 state: up address: <filter> bt-v: 5.3
    lmp-v: 12 status: discoverable: no pairing: no class-ID: 6c010c
Drives:
  Local Storage: total: 931.51 GiB used: 238.39 GiB (25.6%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 980 PRO 1TB
    size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: 5B2QGXA7 temp: 33.9 C
    scheme: GPT
Partition:
  ID-1: / raw-size: 931.22 GiB size: 931.22 GiB (100.00%)
    used: 238.39 GiB (25.6%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
    used: 4.7 MiB (1.6%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
  ID-3: /home raw-size: 931.22 GiB size: 931.22 GiB (100.00%)
    used: 238.39 GiB (25.6%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-4: /var/log raw-size: 931.22 GiB size: 931.22 GiB (100.00%)
    used: 238.39 GiB (25.6%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-5: /var/tmp raw-size: 931.22 GiB size: 931.22 GiB (100.00%)
    used: 238.39 GiB (25.6%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
Swap:
  Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default) zswap: no
  ID-1: swap-1 type: zram size: 15.35 GiB used: 0 KiB (0.0%) priority: 100
    comp: zstd avail: lzo,lzo-rle,lz4,lz4hc,842 max-streams: 16 dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 45.0 C mobo: N/A gpu: amdgpu temp: 31.0 C
    mem: 32.0 C
  Fan Speeds (rpm): N/A gpu: amdgpu fan: 0
Info:
  Memory: total: 16 GiB note: est. available: 15.35 GiB used: 3.06 GiB (19.9%)
  Processes: 384 Power: uptime: 1h 41m states: freeze,mem,disk
    suspend: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
    suspend, test_resume image: 6.13 GiB services: gsd-power,
    power-profiles-daemon, upowerd Init: systemd v: 256 default: graphical
    tool: systemctl
  Packages: pm: pacman pkgs: 1304 libs: 463 tools: octopi,pamac,paru
    Compilers: gcc: 14.1.1 Shell: garuda-inxi default: fish v: 3.7.1
    running-in: gnome-terminal inxi: 3.3.35
Garuda (2.6.26-1):
  System install date:     2024-07-22
  Last full system update: 2024-07-23
  Is partially upgraded:   No
  Relevant software:       snapper(custom) NetworkManager dracut
  Windows dual boot:       No/Undetected
  Failed units:            thunderbolt.service 

Do you have any solution for this cancer? :slight_smile:

Edit your post and paste the output of garuda-inxi

OK, I added it.

Does this also happen when running X11?

I found a solution.

  1. For work, I connected the laptop directly to the second monitor (Thunderbolt to HDMI), bypassing the eGPU that holds the Radeon. This is enough to keep the system stable.
  2. For gaming, I move the Thunderbolt port from the HDMI adapter to the eGPU, and then I have a very stable gaming experience.
    2.1 Additionally, for the eGPU-only setup, I changed the following:
  • GRUB:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=noaer loglevel=3"
  • Gave priority to the Intel card, and I select the Radeon eGPU card in the Steam launch options with:
DRI_PRIME=1 %command%
  • When I don’t need the eGPU card, I turn it off with:
echo 1 | sudo tee /sys/bus/pci/devices/0000:0a:00.0/remove
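
One caveat about the remove command: the address in it (0000:0a:00.0) is not the 2f:00.0 that inxi reported earlier, and a Thunderbolt device can end up at a different PCI address after replugging. The sketch below is one possible way to look the address up instead of hard-coding it. It assumes the Radeon is the only AMD display controller in the machine (true here, since the internal GPU is Intel); `find_amd_gpus` is a name invented for this example, and the vendor ID 0x1002 matches the chip-ID 1002:73ff from the inxi output.

```shell
#!/usr/bin/env bash
# find_amd_gpus [SYSFS_PCI_DIR]
# Prints the PCI address of every device whose vendor is AMD (0x1002) and
# whose class is a display controller (0x03xxxx). The directory argument is
# optional and only there so the function can be tried on a copy of sysfs.
find_amd_gpus() {
    local root=${1:-/sys/bus/pci/devices} dev vendor class
    for dev in "$root"/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/class" ] || continue
        vendor=$(cat "$dev/vendor")
        class=$(cat "$dev/class")
        # "${class#0x03}" differs from "$class" only if class starts with 0x03.
        if [ "$vendor" = "0x1002" ] && [ "${class#0x03}" != "$class" ]; then
            basename "$dev"
        fi
    done
}

# Example (needs root; detaches the card until the next PCI rescan):
#   for addr in $(find_amd_gpus); do
#       echo 1 | sudo tee "/sys/bus/pci/devices/$addr/remove"
#   done
```

The card's HDMI audio function (2f:00.1 in the inxi output) has a different class and is not matched. Writing 1 to `/sys/bus/pci/rescan` should bring a removed device back without a reboot.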

So far it is holding up: yesterday I worked on the laptop for 10 hours without a single frozen screen or logout.
