NVIDIA freezes randomly

I know that NVIDIA and Linux don't get along well, but I want to know if there is any way to use the NVIDIA GPU. When I switch to NVIDIA via Optimus Manager in Cinnamon, random freezes occur. Can this be solved by any means?

Please always provide the output of your garuda-inxi when opening a new topic, as text, formatted with a
~~~
before and after, as instructed by the template.
Regarding your problem, you might consider reading this guide:

System:
  Kernel: 5.17.5-zen1-2-zen arch: x86_64 bits: 64 compiler: gcc v: 12.1.0
    parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
    root=UUID=8736464e-32e6-464b-aa69-5aa97de64cf4 rw [email protected]
    quiet quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0
    resume=UUID=4425a162-e3ad-42f1-bc0d-aa3b380435b8 loglevel=3
  Desktop: Cinnamon v: 5.2.7 tk: GTK v: 3.24.33 wm: Muffin dm: LightDM
    v: 1.30.0 Distro: Garuda Linux base: Arch Linux
Machine:
  Type: Laptop System: HP product: HP Pavilion Gaming Laptop 15-ec2xxx v: N/A
    serial: <filter> Chassis: type: 10 serial: <filter>
  Mobo: HP model: 88DE v: 96.31 serial: <filter> UEFI: AMI v: F.15
    date: 08/18/2021
Battery:
  ID-1: BAT0 charge: 50.3 Wh (100.0%) condition: 50.3/50.3 Wh (100.0%)
    volts: 12.8 min: 11.6 model: HP Primary type: Li-ion serial: <filter>
    status: N/A
CPU:
  Info: model: AMD Ryzen 5 5600H with Radeon Graphics socket: FP6 bits: 64
    type: MT MCP arch: Zen 3 family: 0x19 (25) model-id: 0x50 (80) stepping: 0
    microcode: 0xA50000C
  Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache:
    L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 3 MiB desc: 6x512 KiB
    L3: 16 MiB desc: 1x16 MiB
  Speed (MHz): avg: 1550 high: 3300 min/max: 1200/4280 boost: enabled
    base/boost: 3300/4250 scaling: driver: acpi-cpufreq governor: schedutil
    volts: 1.2 V ext-clock: 100 MHz cores: 1: 3300 2: 1200 3: 1200 4: 1200
    5: 3300 6: 1200 7: 1200 8: 1200 9: 1200 10: 1200 11: 1200 12: 1200
    bogomips: 79050
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: spec_store_bypass
    mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1
    mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, IBRS_FW,
    STIBP: always-on, RSB filling
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA GA107M [GeForce RTX 3050 Mobile] vendor: Hewlett-Packard
    driver: nvidia v: 510.68.02 alternate: nouveau,nvidia_drm pcie: gen: 1
    speed: 2.5 GT/s lanes: 8 link-max: gen: 4 speed: 16 GT/s lanes: 16 ports:
    active: none empty: HDMI-A-1 bus-ID: 01:00.0 chip-ID: 10de:25a2
    class-ID: 0300
  Device-2: AMD Cezanne vendor: Hewlett-Packard driver: amdgpu v: kernel
    pcie: gen: 3 speed: 8 GT/s lanes: 16 link-max: gen: 4 speed: 16 GT/s ports:
    active: eDP-1 empty: none bus-ID: 05:00.0 chip-ID: 1002:1638
    class-ID: 0300
  Device-3: Luxvisions Innotech HP TrueVision HD Camera type: USB
    driver: uvcvideo bus-ID: 3-3:2 chip-ID: 30c9:0035 class-ID: fe01
    serial: <filter>
  Display: x11 server: X.Org v: 21.1.3 driver: X:
    loaded: modesetting,nvidia gpu: amdgpu display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x286mm (20.00x11.26")
    s-diag: 583mm (22.95")
  Monitor-1: eDP-1 mapped: eDP-1-1 model: AU Optronics 0x2992 built: 2020
    res: 1920x1080 hz: 144 dpi: 142 gamma: 1.2 size: 344x193mm (13.54x7.6")
    diag: 394mm (15.5") ratio: 16:9 modes: max: 1920x1080 min: 640x480
  OpenGL: renderer: NVIDIA GeForce RTX 3050 Laptop GPU/PCIe/SSE2
    v: 4.6.0 NVIDIA 510.68.02 direct render: Yes
Audio:
  Device-1: NVIDIA vendor: Hewlett-Packard driver: snd_hda_intel v: kernel
    pcie: gen: 3 speed: 8 GT/s lanes: 8 link-max: gen: 4 speed: 16 GT/s
    lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:2291 class-ID: 0403
  Device-2: AMD ACP/ACP3X/ACP6x Audio Coprocessor vendor: Hewlett-Packard
    driver: N/A alternate: snd_pci_acp3x, snd_rn_pci_acp3x, snd_pci_acp5x,
    snd_pci_acp6x, snd_sof_amd_renoir
    pcie: gen: 3 speed: 8 GT/s lanes: 16 link-max: gen: 4 speed: 16 GT/s
    bus-ID: 05:00.5 chip-ID: 1022:15e2 class-ID: 0480
  Device-3: AMD Family 17h/19h HD Audio vendor: Hewlett-Packard
    driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16
    link-max: gen: 4 speed: 16 GT/s bus-ID: 05:00.6 chip-ID: 1022:15e3
    class-ID: 0403
  Sound Server-1: ALSA v: k5.17.5-zen1-2-zen running: yes
  Sound Server-2: PulseAudio v: 15.0 running: no
  Sound Server-3: PipeWire v: 0.3.51 running: yes
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: Hewlett-Packard driver: r8169 v: kernel pcie: gen: 1
    speed: 2.5 GT/s lanes: 1 port: e000 bus-ID: 02:00.0 chip-ID: 10ec:8168
    class-ID: 0200
  IF: eno1 state: down mac: <filter>
  Device-2: Realtek RTL8852AE 802.11ax PCIe Wireless Network Adapter
    vendor: Hewlett-Packard driver: rtw89_pci v: kernel pcie: gen: 1
    speed: 2.5 GT/s lanes: 1 port: d000 bus-ID: 03:00.0 chip-ID: 10ec:8852
    class-ID: 0280
  IF: wlo1 state: up mac: <filter>
Bluetooth:
  Device-1: Realtek Bluetooth Radio type: USB driver: btusb v: 0.8
    bus-ID: 1-4:3 chip-ID: 0bda:2852 class-ID: e001 serial: <filter>
  Report: bt-adapter ID: hci0 rfk-id: 1 state: up address: <filter>
Drives:
  Local Storage: total: 476.94 GiB used: 45.01 GiB (9.4%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Micron
    model: MTFDHBA512TDV-1AZ1AABHA size: 476.94 GiB block-size: physical: 512 B
    logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: HPS0032 temp: 49.9 C scheme: GPT
Partition:
  ID-1: / raw-size: 60 GiB size: 60 GiB (100.00%) used: 21.13 GiB (35.2%)
    fs: btrfs block-size: 4096 B dev: /dev/nvme0n1p6 maj-min: 259:6
  ID-2: /boot/efi raw-size: 615 MiB size: 613.8 MiB (99.80%)
    used: 576 KiB (0.1%) fs: vfat block-size: 512 B dev: /dev/nvme0n1p8
    maj-min: 259:8
  ID-3: /home raw-size: 95.4 GiB size: 95.4 GiB (100.00%)
    used: 23.88 GiB (25.0%) fs: btrfs block-size: 4096 B dev: /dev/nvme0n1p7
    maj-min: 259:7
  ID-4: /var/log raw-size: 60 GiB size: 60 GiB (100.00%)
    used: 21.13 GiB (35.2%) fs: btrfs block-size: 4096 B dev: /dev/nvme0n1p6
    maj-min: 259:6
  ID-5: /var/tmp raw-size: 60 GiB size: 60 GiB (100.00%)
    used: 21.13 GiB (35.2%) fs: btrfs block-size: 4096 B dev: /dev/nvme0n1p6
    maj-min: 259:6
Swap:
  Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
  ID-1: swap-1 type: partition size: 4 GiB used: 0 KiB (0.0%) priority: -2
    dev: /dev/nvme0n1p10 maj-min: 259:10
  ID-2: swap-2 type: zram size: 7.18 GiB used: 0 KiB (0.0%) priority: 100
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 66.0 C mobo: N/A
  Fan Speeds (RPM): cpu: 0 fan-2: 0
  GPU: device: nvidia screen: :0.0 temp: 51 C device: amdgpu temp: 51.0 C
Info:
  Processes: 375 Uptime: 1m wakeups: 1 Memory: 7.18 GiB
  used: 2.09 GiB (29.0%) Init: systemd v: 250 tool: systemctl Compilers:
  gcc: 12.1.0 Packages: pacman: 1368 lib: 320 Shell: garuda-inxi (sudo)
  default: fish v: 3.4.1 running-in: gnome-terminal inxi: 3.3.15
Garuda (2.6.2-1):
  System install date:     2022-05-08
  Last full system update: 2022-05-11
  Is partially upgraded:   No
  Relevant software:       NetworkManager
  Windows dual boot:       Yes
  Snapshots:               Snapper
  Failed units:            

Can you suggest a common possible solution?

Try filo's guide (created by tbg) above and see if it gets you anywhere. Those are the first troubleshooting steps.


Your mistake here is switching to nvidia. Just keep it on hybrid and run the programs you need to use the GPU with prime-run. That's NVIDIA's official method. Don't mess around with tools like optimus-manager that try to do all kinds of wacky things to get stuff to work. This method also helps with battery life because it will turn off the GPU when not in use.
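
For reference, on Arch-based systems `prime-run` is a small wrapper that sets NVIDIA's PRIME render offload environment variables for one command. A minimal sketch of the same idea, assuming the proprietary driver is installed and the session is in hybrid mode (the function name `nv_offload` is made up for illustration):

```shell
# Hypothetical helper that mirrors what prime-run does: set NVIDIA's
# PRIME render offload variables for a single command only, so the
# rest of the session keeps running on the integrated GPU.
nv_offload() {
    env __NV_PRIME_RENDER_OFFLOAD=1 \
        __GLX_VENDOR_LIBRARY_NAME=nvidia \
        __VK_LAYER_NV_optimus=NVIDIA_only \
        "$@"
}
```

If `glxinfo` is installed, `nv_offload glxinfo | grep "OpenGL renderer"` should name the NVIDIA GPU, while plain `glxinfo | grep "OpenGL renderer"` should name the AMD one. That is a quick way to confirm offloading works before testing real applications.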


It doesn't give any better performance. If I run a browser with prime-run, it doesn't support hardware acceleration. But if NVIDIA is set as the primary GPU, hardware acceleration is available and works, with 10x performance for any app.

That's why I want to set NVIDIA as the primary GPU: it gives 10x performance.

Enabling this^^^ is known to cause freezing on some systems.

From the FAQ @filo already provided the link for you to read:


I have tried many things. Changing the kernel, I/O schedulers, and kernel parameters didn't work. Also, no performance tweaks are enabled. But one thing helped a little.

Adding the kernel parameter nvidia-drm.modeset=1 in the GRUB file changed the behavior: the display still freezes, except for the cursor. Before, the whole display froze randomly after some time of use. Now at least the cursor does not freeze.
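
A minimal sketch for confirming that the parameter is actually active after a reboot, assuming the usual spelling `nvidia-drm.modeset=1` for the option described above (the helper name `has_kernel_param` is made up for illustration):

```shell
# Hypothetical helper: succeed if the exact parameter appears as a whole
# word in a kernel command line. With no second argument it reads the
# command line of the running kernel from /proc/cmdline.
has_kernel_param() {
    param=$1
    cmdline=${2:-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage: `has_kernel_param nvidia-drm.modeset=1 && echo "modeset is on the command line"`. The parameter typically goes into `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, followed by regenerating the config (for example `sudo grub-mkconfig -o /boot/grub/grub.cfg`). Once the module is loaded, `cat /sys/module/nvidia_drm/parameters/modeset` should print `Y`.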

So can you please suggest similar changes that could stop the whole display from freezing?

Let me explain a little more...

You have a Laptop.
A PC/Laptop can function with one GPU.
Your Laptop vendor, like other vendors, added an additional GPU.
Why do you think they would want that? If nvidia is powerful, they could just ship only nvidia.

Laptop vendors add a 2nd GPU and they use one for low demand usage and the other one for high demand applications. This helps conserve battery power!!!!!!!

If you want to use ONLY the power-hungry nvidia GPU, first look in the BIOS for such a setting.
Making nvidia the primary GPU does not make sense. Ask your vendor to explain more, if my English is not good (BTW I am Greek :smile: ).
Optimus-manager advertises that it can do the same thing (an nvidia-only mode), but it uses advanced software techniques that, because there are countless HW/SW combinations, sometimes don't work as expected.

If you think optimus-manager may be your solution, then I would suggest you ask its developers for support on your issue. We know a lot, but they are the creators and they will know right away what is happening.

OTOH

this doesn't make sense. At least, not if everything is configured well. The Arch Wiki describes very well how to set up and confirm hardware acceleration.

You have to try more..., or run Mint, Ubuntu, Fedora or WinOS. I hear they have a lot of success with Optimus Laptops.


Other distros don't provide what Garuda Linux provides. And before Garuda, I tried them all. Ubuntu also freezes the way Garuda does. The same goes for Debian-based distros. Fedora doesn't even boot with the NVIDIA drivers. The only option was Arch or an Arch-based distro, and I had 2 options: Garuda Linux and Manjaro. Manjaro has modified Arch to a great extent, which causes issues. And Garuda is just great and simple to use, with all the features anyone can imagine.

I have tried everything I could on my side. Can you please suggest something else I can try, or is this purely a problem on NVIDIA's side?

@yash everyone else in this thread has a more detailed and informative response than you, and it's your thread. Why is that?

It's impossible to identify things that could use modification in your setup if you don't describe the changes you have been trying and what the results have been. This:

That's so unhelpful. There is basically no information here. No one knows what you tried. No one can tell what has helped or what has made things worse. Frankly you haven't even bothered to describe the problem very well.

So far the most detail you've put forward is you tried some random kernel parameter and something changed--but it's still broken. If I'm expressing that wrongly it's because I don't have a lot of information to base my synapses on.

Go through these links, take notes, describe what you've tried and what happens. Put some details into the thread, instead of just asking someone to come up with a magic snippet of code you can paste into your grub configuration without actually doing any work.
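
As a starting point for those notes, here is a rough sketch for pulling driver-related lines out of the kernel log after a freeze. It assumes the NVIDIA kernel driver's usual `NVRM` log prefix (which also precedes `Xid` error reports), and the function name `gpu_log_lines` is made up for illustration:

```shell
# Hypothetical filter: read log text on stdin and keep only lines that
# mention the NVIDIA kernel driver (NVRM / Xid reports) or nvidia modules.
gpu_log_lines() {
    grep -iE 'NVRM|Xid|nvidia'
}
```

After rebooting from a freeze, `journalctl -k -b -1 --no-pager | gpu_log_lines` filters the kernel messages of the previous boot. Posting that output together with the exact change you made before the freeze gives people something concrete to work with.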
