Lots of system freezes, followed by crashes

mudman · 6 March 2022 22:48

Hello, I've been getting ~2-3 crashes per day for a while now and I actually have some time to troubleshoot it. I'm somewhat new to Arch and Garuda, so I know only a couple of places to begin searching for my issue.

To start, each time the system crashes, right after GRUB and the bootloader are done with their thing, I receive a handful of lines displaying error codes. I've discovered that they are almost exactly what I get when I run journalctl -p 3 -b (I will post the output of this command as well).

Are there any other logs I should be aware of? I have researched some of the issues returned from journalctl and found that they should be acceptable to ignore, though there are other errors I have yet to investigate.

I should also mention that crashes typically happen during gaming, and I think it's interesting that it does not matter whether the game is native or not. However, crashes also happen during system maintenance and while I'm doing some work.

System:
Kernel: 5.16.12-zen1-1-zen x86_64 bits: 64 compiler: gcc v: 11.2.0
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=52aaf3ab-2bd7-42e7-bbf6-f58f185bd8d7 rw rootflags=subvol=@
quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0 loglevel=3
Desktop: KDE Plasma 5.24.2 tk: Qt 5.15.2 info: latte-dock wm: kwin_x11
vt: 1 dm: SDDM Distro: Garuda Linux base: Arch Linux
Machine:
Type: Desktop System: ASUS product: All Series v: N/A
serial: <superuser required>
Mobo: ASUSTeK model: RAMPAGE V EXTREME v: Rev 1.xx
serial: <superuser required> UEFI: American Megatrends v: 4101
date: 07/10/2019
CPU:
Info: model: Intel Core i7-6950X bits: 64 type: MT MCP arch: Broadwell
family: 6 model-id: 0x4F (79) stepping: 1 microcode: 0xB000040
Topology: cpus: 1x cores: 10 tpc: 2 threads: 20 smt: enabled cache:
L1: 640 KiB desc: d-10x32 KiB; i-10x32 KiB L2: 2.5 MiB desc: 10x256 KiB
L3: 25 MiB desc: 1x25 MiB
Speed (MHz): avg: 1773 high: 4106 min/max: 1200/4100 scaling:
driver: intel_cpufreq governor: performance cores: 1: 1480 2: 1201 3: 1202
4: 1202 5: 2219 6: 4106 7: 1361 8: 3496 9: 1203 10: 1202 11: 1253
12: 1200 13: 1858 14: 2300 15: 1204 16: 1202 17: 3157 18: 2221 19: 1202
20: 1203 bogomips: 120001
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Vulnerabilities:
Type: itlb_multihit status: KVM: VMX disabled
Type: l1tf
mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
Type: mds mitigation: Clear CPU buffers; SMT vulnerable
Type: meltdown mitigation: PTI
Type: spec_store_bypass
mitigation: Speculative Store Bypass disabled via prctl
Type: spectre_v1
mitigation: usercopy/swapgs barriers and __user pointer sanitization
Type: spectre_v2 mitigation: Full generic retpoline, IBPB: conditional,
IBRS_FW, STIBP: conditional, RSB filling
Type: srbds status: Not affected
Type: tsx_async_abort mitigation: Clear CPU buffers; SMT vulnerable
Graphics:
Device-1: NVIDIA GP104 [GeForce GTX 1070] vendor: ASUSTeK driver: nvidia
v: 510.54 alternate: nouveau,nvidia_drm pcie: gen: 3 speed: 8 GT/s
lanes: 16 bus-ID: 01:00.0 chip-ID: 10de:1b81 class-ID: 0300
Display: x11 server: X.Org v: 1.21.1.3 compositor: kwin_x11 driver: X:
loaded: nvidia unloaded: modesetting alternate: fbdev,nouveau,nv,vesa
gpu: nvidia display-ID: :0 screens: 1
Screen-1: 0 s-res: 3840x1080 s-dpi: 88 s-size: 1108x312mm (43.6x12.3")
s-diag: 1151mm (45.3")
Monitor-1: HDMI-0 pos: primary,left res: 1920x1080 hz: 60 dpi: 88
size: 553x309mm (21.8x12.2") diag: 633mm (24.9")
Monitor-2: HDMI-1 pos: right res: 1920x1080 hz: 60 dpi: 88
size: 553x309mm (21.8x12.2") diag: 633mm (24.9")
OpenGL: renderer: NVIDIA GeForce GTX 1070/PCIe/SSE2
v: 4.6.0 NVIDIA 510.54 direct render: Yes
Audio:
Device-1: Intel C610/X99 series HD Audio vendor: ASUSTeK
driver: snd_hda_intel v: kernel bus-ID: 00:1b.0 chip-ID: 8086:8d20
class-ID: 0403
Device-2: NVIDIA GP104 High Definition Audio vendor: ASUSTeK
driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16
bus-ID: 01:00.1 chip-ID: 10de:10f0 class-ID: 0403
Device-3: Kingston HyperX Cloud Stinger Core Wireless + 7.1 type: USB
driver: hid-generic,snd-usb-audio,usbhid bus-ID: 3-9.4:9 chip-ID: 0951:170b
class-ID: 0300 serial: <filter>
Sound Server-1: ALSA v: k5.16.12-zen1-1-zen running: yes
Sound Server-2: PulseAudio v: 15.0 running: no
Sound Server-3: PipeWire v: 0.3.48 running: yes
Network:
Device-1: Intel Ethernet I218-V vendor: ASUSTeK driver: e1000e v: kernel
port: f000 bus-ID: 00:19.0 chip-ID: 8086:15a1 class-ID: 0200
IF: eno1 state: up speed: 1000 Mbps duplex: full mac: <filter>
Device-2: Broadcom BCM4360 802.11ac Wireless Network Adapter
vendor: ASUSTeK driver: wl v: kernel modules: bcma pcie: gen: 1
speed: 2.5 GT/s lanes: 1 bus-ID: 05:00.0 chip-ID: 14e4:43a0
class-ID: 0280
IF: wlp5s0 state: dormant mac: <filter>
Bluetooth:
Device-1: ASUSTek Broadcom BCM20702 Single-Chip Bluetooth 4.0 + LE
type: USB driver: btusb v: 0.8 bus-ID: 3-6:3 chip-ID: 0b05:180a
class-ID: fe01 serial: <filter>
Report: bt-adapter ID: hci0 rfk-id: 1 state: up address: <filter>
Drives:
Local Storage: total: 3.81 TiB used: 1.26 TiB (33.0%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Intel model: SSDPEDMW012T4
size: 1.09 TiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
lanes: 4 type: SSD serial: <filter> rev: 8EV10171 temp: 36.9 C
scheme: GPT
ID-2: /dev/nvme1n1 maj-min: 259:2 vendor: Intel model: SSDPED1D480GA
size: 447.13 GiB block-size: physical: 512 B logical: 512 B
speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter> rev: E2010325
temp: 44.9 C scheme: MBR
ID-3: /dev/sda maj-min: 8:0 vendor: SanDisk model: SD8SBAT256G1122
size: 238.47 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: 0000 scheme: MBR
ID-4: /dev/sdb maj-min: 8:16 vendor: Samsung model: SSD 850 EVO 250GB
size: 232.89 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: 1B6Q scheme: GPT
ID-5: /dev/sdc maj-min: 8:32 vendor: Western Digital
model: WD20EZRZ-00Z5HB0 size: 1.82 TiB block-size: physical: 4096 B
logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 5400 serial: <filter>
rev: 0A80 scheme: GPT
Partition:
ID-1: / raw-size: 213.31 GiB size: 213.31 GiB (100.00%)
used: 42.44 GiB (19.9%) fs: btrfs dev: /dev/sdb3 maj-min: 8:19
ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
used: 26 MiB (8.7%) fs: vfat dev: /dev/sdb1 maj-min: 8:17
ID-3: /home raw-size: 213.31 GiB size: 213.31 GiB (100.00%)
used: 42.44 GiB (19.9%) fs: btrfs dev: /dev/sdb3 maj-min: 8:19
ID-4: /var/log raw-size: 213.31 GiB size: 213.31 GiB (100.00%)
used: 42.44 GiB (19.9%) fs: btrfs dev: /dev/sdb3 maj-min: 8:19
ID-5: /var/tmp raw-size: 213.31 GiB size: 213.31 GiB (100.00%)
used: 42.44 GiB (19.9%) fs: btrfs dev: /dev/sdb3 maj-min: 8:19
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
ID-1: swap-1 type: zram size: 125.71 GiB used: 1.8 MiB (0.0%)
priority: 100 dev: /dev/zram0
Sensors:
System Temperatures: cpu: 72.0 C mobo: N/A gpu: nvidia temp: 49 C
Fan Speeds (RPM): N/A gpu: nvidia fan: 0%
Info:
Processes: 390 Uptime: 13m wakeups: 0 Memory: 125.71 GiB
used: 4.57 GiB (3.6%) Init: systemd v: 250 tool: systemctl Compilers:
gcc: 11.2.0 Packages: pacman: 1951 lib: 367 Shell: fish v: 3.3.1
default: Bash v: 5.1.16 running-in: konsole inxi: 3.3.13

mudman · 6 March 2022 22:49

Here's the journalctl output:

journalctl -p 3 -b

Mar 06 17:17:47 dr4g0n kernel:
Mar 06 17:17:47 dr4g0n systemd-modules-load[452]: Failed to find module 'vboxpci'
Mar 06 17:17:47 dr4g0n kernel: usb 3-9.4: device descriptor read/64, error -32
Mar 06 17:17:48 dr4g0n kernel: wlan0: Broadcom BCM43a0 802.11 Hybrid Wireless Controller 6.30.223.271 (r587334)
Mar 06 17:17:48 dr4g0n kernel:
Mar 06 17:17:48 dr4g0n systemd-udevd[492]: could not read from '/sys/module/pcc_cpufreq/initstate': No such device
Mar 06 17:17:48 dr4g0n kernel: Bluetooth: hci0: BCM: firmware Patch file not found, tried:
Mar 06 17:17:48 dr4g0n kernel: Bluetooth: hci0: BCM: 'brcm/BCM20702A1-0b05-180a.hcd'
Mar 06 17:17:48 dr4g0n kernel: Bluetooth: hci0: BCM: 'brcm/BCM-0b05-180a.hcd'
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Couldn't find mci handler
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Failed to register device with error -19.
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Couldn't find mci handler
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Failed to register device with error -19.
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Couldn't find mci handler
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Failed to register device with error -19.
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Couldn't find mci handler
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Failed to register device with error -19.
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Couldn't find mci handler
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Failed to register device with error -19.
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Couldn't find mci handler
Mar 06 17:17:48 dr4g0n kernel: EDAC sbridge: Failed to register device with error -19.

tbg · 7 March 2022 03:26

Below is a first draft of a guide I’m compiling to help in these types of situations.

Hope it helps.

Troubleshooting System Stutter, Freezes, Lags, and Hangs:

Important forum threads regarding this topic to read:

Disk IO reaches 100%, causing system hangs

Temporary system hangs/freezes when updating

Load over the CPU is too High | SLOW Response | Freezing | Hang | Crash

If you use Docker:

[btrfs] Docker and Subvolumes

Sluggishnes of system and weird journal errors

Troubleshooting Tips:

System lags, freezes, and hangs have a very long list of possible causes, ranging from faulty hardware to hundreds of software, firmware, kernel, and BIOS issues. It is often best to eliminate BIOS and kernel causes first, as those are two of the most promising (and least time-consuming) leads to pursue. Be sure to test at least 3 or 4 of the most recommended kernels. Also ensure your BIOS (and your system) is fully up to date.

Search the forum first, (and then the internet) for possible causes/fixes to your issue.

Start by running a forum search on variations of the following terms:

froze
frozen
freeze
crash
hang
freezing
system crash
system freeze
system hang
lock up
locked up
hard lock
hard lock up
hard power down
hard power off
force power down
system unresponsive
computer unresponsive

Any variations of those terms should return lots of hits to comb through.

Here are some further tips to aid with your search efforts:

How to search for solutions the right way

Next, start writing a list of all suggestions for fixes you uncover and record all your efforts. Record all inputs and outputs of any commands run. Report in full detail every fix you attempt and post any relevant logs. Regularly post the results of your progress. There have been many suggestions already given on the forum (and internet), to correct freezing issues. It is your job to sift through all the possible fixes already posted to increase your probability of finding a solution.

If you are thorough and precise about reporting all your troubleshooting efforts, there is a good chance a solution will be found to your issue. The more proactive you are in this respect, the more likely you are to receive assistance from forum experts to find a solution. The less information and documentation you provide, the less likely you are to ever find a resolution to your issue. Optics are important if you desire assistance: the more of an effort you make, the more likely others are to make extra efforts to help you.

The first step you need to take to diagnose your issue is to start monitoring your resource usage. This may help determine what might be causing your issues. Install and learn how to use monitoring utilities such as top, htop, iotop or other diagnostic utilities to help pin down a cause.

The following diagnostic commands may help identify a cause:

sudo ps_mem -S -w 5 
sudo sensors-detect
sudo dmesg | grep oom-killer
swapon --show
cat /proc/sys/vm/swappiness
cat /proc/meminfo
top -o '%MEM'
while true ; do top -b -n 1 | tee -a ~/top.log; sleep 5; done

The first command requires ps_mem to be installed.
The second command requires lm_sensors to be installed.
The last command appends its output to a log file at ~/top.log.
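As a lighter-weight alternative to logging full top output, the sketch below pulls just the MemAvailable figure out of /proc/meminfo so it can be logged over time. The function name mem_available_mib and the ~/mem.log path are my own choices for illustration, not part of any existing tool.

```shell
# Print MemAvailable in MiB, reading /proc/meminfo-style text from stdin.
# (mem_available_mib is a hypothetical helper name.)
mem_available_mib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }'
}

# Example use (runs until interrupted with Ctrl+C):
#   while true; do
#       echo "$(date '+%F %T') $(mem_available_mib < /proc/meminfo) MiB available"
#       sleep 5
#   done >> ~/mem.log
```

If the logged figure falls steadily toward zero before a freeze, memory pressure becomes a likely suspect.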

Post the outputs that aren’t excessively long on the forum if you require assistance with your problem. Very long outputs may better be posted through a pastebin/hastebin type service. The forum also has its own PrivateBin service for this.

Further troubleshooting suggestions:

Hardware related troubleshooting steps:

Install smartmontools to check and report on the health of all your storage drives. However, software testing might not identify problematic hardware with 100% accuracy. I have encountered several hard drives that caused lockups in the past (even though they passed smartmontools testing). Another option you can try to verify drive health is the whdd hard drive diagnostic utility. To fully eliminate the possibility of errant test results, it would be best to disconnect any attached drives.
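A rough sketch of a quick health check with smartctl (part of smartmontools) follows. The helper name smart_verdict and the device /dev/sda are placeholders of mine. The pattern matches the summary line that smartctl -H prints for ATA and NVMe drives; SCSI/SAS drives word it differently, so treat an empty result as "unknown" rather than "bad".

```shell
# Extract the overall verdict (e.g. PASSED) from `smartctl -H` output on stdin.
# (smart_verdict is a hypothetical helper name.)
smart_verdict() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Example use, with /dev/sda standing in for your own drive:
#   sudo smartctl -H /dev/sda | smart_verdict
# A longer self-test can be started with:
#   sudo smartctl -t long /dev/sda
# and its result read later with:
#   sudo smartctl -a /dev/sda
```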

There is another method to eliminate the possibility of software testing not identifying problematic hardware correctly. Rather than checking each individual component via software, you can remove hardware from the equation through a process of elimination. Power off, and then disconnect the computer’s power plug. Remove or disconnect as much internal hardware as possible. Disconnect any HDD or SSD. Remove PCIe add-in cards from their slots. This includes any add-in GPU, but only if you have onboard graphics (switch to onboard in the BIOS). Disable in the BIOS any devices that cannot be removed, such as onboard network adapters. Leave only one RAM stick inserted. Boot from a live disk to check for any lockups. Eliminate hardware possibilities by gradually adding back devices. If the issue returns after adding back any piece of hardware that was removed, you have found your culprit.

Do you think your power supply could possibly be faulty or underpowered? The only sure way to know is to swap out the power supply (if you have a spare). Cleaning the vents and fan on the PSU with compressed air is a good habit to get into.

Try using an alternate GPU: onboard graphics if you have them, or a different add-in graphics card (if available). If using the open source Nvidia driver, switch to the proprietary driver, and vice versa.

Check that your system temperatures are not high, as this can result in an unstable system. The CPU temperature can be a major factor in contributing to freezes. Have you cleaned the exhaust ports, power supply vents, fans, and heat sinks inside your computer recently? Install the lm_sensors package, issue the sudo sensors-detect command to detect your sensor chips, and then run sensors to read the temperatures. If the CPU is running in excess of 75 deg C you should be concerned. If your CPU temps are still higher than recommended after cleaning your computer thoroughly, you may need to re-seat your CPU heatsink with new thermal paste.
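If you would rather not eyeball the readings, here is a minimal sketch that flags anything above the 75 deg C mark mentioned above. It assumes the kernel exposes temperatures in millidegrees under /sys/class/thermal, which is common but not guaranteed, and those zones do not always correspond to the CPU package. The name temp_status is made up for this example.

```shell
# Classify one temperature reading given in millidegrees Celsius.
# Usage: temp_status MILLIDEG [LIMIT_C]   (limit defaults to 75)
# (temp_status is a hypothetical helper name.)
temp_status() {
    temp_c=$(( $1 / 1000 ))
    limit_c=${2:-75}
    if [ "$temp_c" -gt "$limit_c" ]; then
        echo "HOT ${temp_c}C"
    else
        echo "ok ${temp_c}C"
    fi
}

# Example use over every thermal zone the kernel reports:
#   for zone in /sys/class/thermal/thermal_zone*/temp; do
#       echo "$zone: $(temp_status "$(cat "$zone")")"
#   done
```

Run it while the system is under load (during a game, for example), since idle temperatures say little about stability.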

Remove/disconnect all peripherals.

Check that all internal cables are seated properly and undamaged.

Check all USB cables for damage as pets often like to chew cables and this can cause lockups.

Disconnect all but your main monitor.

Disconnect all peripherals such as USB hubs, USB hard drives, printers, web cams, etc. as a test to see if the freezes still occur.

Reboot into your bios and (if possible) disable your Ethernet and WiFi in your bios temporarily as a test…

If you are using a wireless mouse or KB, replace them with a wired version for troubleshooting purposes.

Ram:

Remove all ram sticks, and then reinsert them, making sure all are seated properly.

Are your ram sticks all the same matching type recommended by your mobo manufacturer?

Run the long memory test with memtest86. Sometimes memory sticks can pass memtest86, but can still crash Linux. To discount this possibility, remove all ram sticks except one. Cycle through testing each individual ram stick, allowing sufficient time to see if the freezes continue or end.

Software, firmware, kernel, scheduler, bios, troubleshooting suggestions:

Kernels

Check your logs for any instances of kernel panic, as this is a definite indicator that something is amiss with your kernel.
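One way to look for this is sketched below, assuming your journal is persistent so that the previous boot is still available. The helper name panic_lines and the exact list of search patterns are my own picks. Keep in mind that a hard freeze can stop the kernel before anything reaches the disk, so an empty result does not prove the kernel was innocent.

```shell
# Keep only lines that look like a kernel panic, oops, or lockup report.
# (panic_lines is a hypothetical helper name.)
panic_lines() {
    grep -iE 'kernel panic|oops|BUG:|soft lockup|hard lockup'
}

# Example use on the kernel messages of the previous boot (-b -1):
#   journalctl -k -b -1 --no-pager | panic_lines
# All errors from the previous boot, for posting on the forum:
#   journalctl -p 3 -b -1 --no-pager
```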

Changing kernels is one of the easiest troubleshooting steps you can perform and it resolves far more issues than you’d ever expect. Whenever you start to experience unusual issues with your system, the first thing you should do is test at least three alternate kernels. For those experiencing severe system freezes/crashes testing out alternate kernels should always be your first step.

You can install various kernels via the terminal with the following commands:

sudo pacman -Syu linux-lts linux-lts-headers
sudo pacman -Syu linux linux-headers
sudo pacman -Syu linux-mainline linux-mainline-headers
sudo pacman -Syu linux-cacule linux-cacule-headers
sudo pacman -Syu linux-xanmod linux-xanmod-headers
sudo pacman -Syu linux-hardened linux-hardened-headers

I would suggest starting at the top of the kernel list and working your way down, (if your issue hasn’t improved).

You can switch to a newly installed kernel after a reboot via the grub boot menu at startup. Simply choose the kernel you wish to boot into from the kernel choices listed in the menu. Also, be sure to test the “fallback” version of each installed kernel from the grub boot menu as sometimes this can correct severe issues. After installing a new kernel it is best practice not to immediately uninstall your old kernel. It is always best to have at least two kernels installed in case one kernel experiences an issue booting. The LTS kernel is the recommended choice to keep installed as a backup kernel.

Test a Different Scheduler:

Sometimes a kernel change alone will not resolve some stubborn freezing problems. For those that have tested multiple different kernels and are still experiencing freezes, it is a good idea to also test out different I/O schedulers. Some kernel versions, (such as cacule) come preconfigured with different I/O schedulers, otherwise you must manually change schedulers yourself.

Try monitoring your disk I/O activity with the iotop utility. Excessive I/O activity can lead to slowdowns or freezes. If this appears to be happening you might want to try changing your I/O scheduler. This is an especially worthwhile troubleshooting step if you find any I/O errors in your logs. Test out different schedulers to see if there is any performance improvement.

To identify the scheduler in use for all drives, run:

grep . /sys/class/block/*/queue/scheduler

Switching the default scheduler in use may seem a little confusing if you’ve never attempted it before, however it is actually quite simple.

See the Archwiki documentation for information on:

Changing the I/O scheduler
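As a rough illustration, the scheduler file lists the available schedulers with the active one in square brackets. The sketch below extracts the active name and shows how a temporary switch is usually done. The helper name active_scheduler is mine, and sda and mq-deadline are only example values; use a device and scheduler that your own system actually lists.

```shell
# Print the active scheduler (the bracketed entry) from a scheduler line on stdin.
# (active_scheduler is a hypothetical helper name.)
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Example use, with sda standing in for your own drive:
#   active_scheduler < /sys/block/sda/queue/scheduler
# Switch scheduler until the next reboot (not persistent):
#   echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler
```

A change made this way is lost on reboot, which makes it a safe way to test before making anything permanent.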

Investigate and test various kernel parameters

There are a lot of kernel parameters that may help your issue. Unfortunately, the boot parameters are usually very specific to the type of hardware in question. It is very hard to recommend exactly which kernel parameters to test, as there are many of them. It may require a fair bit of searching on your specific hardware to find parameters that may help.

The most likely way to locate what you might need is to search for:

Arch Linux freezes kernel parameter “your motherboard model”

Or:

Arch Linux freezes kernel parameter “your laptop model”

BIOS

Is your Bios up to date?

Most default BIOS settings are intended for Windows. Depending on your hardware, you may need to modify your bios settings for your system to be stable when using Linux.

Rectify Freezes on Ryzen 9 mobos:

Ryzen 9 - Freezing, crashing

Disable BTRFS Quota (qgroups)

Garuda and other distributions have numerous reports linking system freezes with btrfs quotas being enabled. Disabling btrfs quotas would be a logical step if you experience freezes during balancing operations. If you do experience a freeze during a balancing operation, try waiting as long as possible to let things hopefully resolve on their own.

Read the link below for information on how to disable qgroups:

BTRFS quota is automatically re-enabled if I disable it

To disable BTRFS quotas, run:

sudo btrfs quota disable /

Disabling qgroups will impact timeshift’s ability to gauge the remaining space left for creating snapshots. However, with qgroups enabled on your computer it might feel sluggish, or even grind completely to a halt. It has also been reported that the more snapshots you have, the worse this issue can become.

Documentation regarding quota support in BTRFS:

BTRFS Quota support

There has been some discussion about making qgroups disabled by default with Garuda. At the time of writing this, I believe BTRFS quotas are still enabled by default in all editions.

Unfortunately, it seems some updates may re-enable qgroups even though they were manually disabled. Therefore, you will need to check regularly to be sure they stay disabled.

Troubleshoot Garuda’s performance tuning packages:

To test if any of Garuda’s performance enhancements are causing issues on your system, you may want to try disabling/masking some of these services one at a time. The performance tuning packages Garuda has installed by default have changed over time. Depending on how old your install is, you could have a few of the older services that are no longer in current use running on your system. If you suspect any of these services are causing issues on your system, you can temporarily disable them via masking to test for improvements.

You can find out if any of these services are installed and running on your system with the following command:

systemctl status prelockd auto-cpufreq ananicy-cpp irqbalance preload memavaild

To stop/disable/mask any individual service that is running on your system, execute:

sudo systemctl disable --now prelockd.service && sudo systemctl mask prelockd.service && sudo systemctl daemon-reload

sudo systemctl disable --now auto-cpufreq.service && sudo systemctl mask auto-cpufreq.service && sudo systemctl daemon-reload

sudo systemctl disable --now ananicy-cpp.service && sudo systemctl mask ananicy-cpp.service && sudo systemctl daemon-reload

sudo systemctl disable --now irqbalance.service && sudo systemctl mask irqbalance.service && sudo systemctl daemon-reload

sudo systemctl disable --now preload.service && sudo systemctl mask preload.service && sudo systemctl daemon-reload

sudo systemctl disable --now memavaild.service && sudo systemctl mask memavaild.service && sudo systemctl daemon-reload

The service’s state should be automatically refreshed by the included sudo systemctl daemon-reload command.

After testing your system’s performance with a service masked, the service can easily be made operational again if you wish. To reinitialize any of the service(s) you masked, repeat the above command(s) substituting “unmask” in place of “mask” and “enable” in place of “disable”, and swapping the order so the unmask runs first, as in the example below:

sudo systemctl unmask irqbalance.service && sudo systemctl enable --now irqbalance.service && sudo systemctl daemon-reload

In some instances you may need to reboot to fully initialize the service, as reloading may not be sufficient in all cases.
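To avoid retyping those long lines for each service, a small helper along these lines can print the matching command for review before you run it. This is only a sketch: svc_toggle_cmd is a name I made up, and it prints the command rather than executing it, so nothing changes on your system until you paste the output into a terminal yourself.

```shell
# Print (do not run) the mask or unmask command line for one service.
# Usage: svc_toggle_cmd mask|unmask SERVICE
# (svc_toggle_cmd is a hypothetical helper name.)
svc_toggle_cmd() {
    case "$1" in
        mask)
            printf 'sudo systemctl disable --now %s.service && sudo systemctl mask %s.service && sudo systemctl daemon-reload\n' "$2" "$2"
            ;;
        unmask)
            printf 'sudo systemctl unmask %s.service && sudo systemctl enable --now %s.service && sudo systemctl daemon-reload\n' "$2" "$2"
            ;;
        *)
            echo "usage: svc_toggle_cmd mask|unmask SERVICE" >&2
            return 1
            ;;
    esac
}

# Example use: print the mask command for each service, one per line.
#   for svc in prelockd auto-cpufreq ananicy-cpp irqbalance preload memavaild; do
#       svc_toggle_cmd mask "$svc"
#   done
```

Mask one service at a time and test in between, otherwise you will not know which one made the difference.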

BTRFS Balancing Tips:

Something that sometimes helps correct stutters and freezes is deleting all your snapshots and then performing a btrfs balance. The more snapshots you have on your system the worse it seems to affect some systems. Generally I delete all my snapshots and perform a btrfs balance after I have 5 or so snapshots stored.

If you are experiencing lagging/freezing issues it may be helpful to disable btrfs quotas along with deleting your system snapshots. Oh and of course, be sure to make a fresh timeshift snapshot after you’ve wiped the old ones.

If you are experiencing a system slowdown after doing a large update or deleting files that took up a lot of space, performing a balance can sometimes help greatly. Be sure to reboot after the balancing is complete.

The handy command below will launch a 60% balance operation on / (root) and will also give continual updates on how far along the process is to completion:

bash -c "sudo btrfs balance start -musage=60 -dusage=60 / & sudo watch -t -n5 btrfs balance status / && fg"

There are numerous threads on the forum dealing with freezing issues with many different suggestions posted on how to hopefully correct the issue. Please search the forum and report in detail on every fix you attempt and post relevant logs and command outputs. To troubleshoot any issue effectively forum assistants must know all the troubleshooting steps that have been tested to have any chance of finding a solution. Threads already covering this topic on the forum should provide plenty of information on the steps you need to take to troubleshoot this issue.

If a thorough search of the Garuda forum doesn’t turn up a solution, then searching other Arch based forums is usually the next step. If you can’t turn up what you need on the forums of Arch derivative distros, then a general internet search is your next move.

Feedback you should provide:

Have you tested multiple alternate kernels?

Is your system fully up to date?

Have you checked your resource usage with htop, iotop, etc?

Is this a fresh install, or did this start recently after an update?

Have you tried disabling the baloo file indexer temporarily?

If you press the caps or NUMLOCK key, does your KB state light change?

Is your caps lock LED blinking, (possible kernel panic)?

Can you move your mouse cursor?

Do you have full keyboard functionality?

Does pressing CTRL+ALT+T open a terminal?

Is this a complete freeze up with no keyboard or mouse responsiveness?

Does pressing CTRL+ALT+F3 take you to the TTY ?

Have you tried restarting your system from the terminal or TTY ?

Can you use the Magic SysRq keys to restart/shutdown?

Have you tried restarting (KDE) plasma from the terminal?

Is there a specific program or action that often triggers a freeze?

Does it happen out of the blue, completely random, how frequently?

If a freeze does occur, does it resolve on its own if you wait a long time?

Have you tried changing your compositor settings?

Have you tried disabling your compositor entirely?

Have you tried removing all plasma widgets you’ve installed?

Have you checked (and posted) your log errors/crash dumps?

Have you tried installing linux-firmware-git and rebooting?

Have you made any overclocking/undervolting or similar modifications?

Do you use full disk encryption?

Do you use a swap partition?

Have you installed multiple Desktop Environments/WM’s?

If this started recently, have you tried performing a rollback via a snapshot?

Have you merged any/all pacnew files?

How many applications are generally running when the freezing occurs?

Do the freezes happen even if the system is idle?

Which applications have you installed from the AUR or Chaotic repo?

Have you experienced similar issues with this hardware on other OS’s?

Did, (or does) freezing also occur on Windows?

Did, (or does) freezing also occur on other Linux distros?

Have you tried booting live disks, with other Garuda DE’s or other distros?

Does freezing also occur in live environments, (which ones)?

Does freezing also occur if you create a new user account?

Does the system get progressively slower before freezing?

Does sound continue playing during a freeze?

Are you using tlp? If so, uninstall it.

If your computer has a disk activity LED, is it blinking during the freeze?

Please answer as many of the above questions as possible. We need to have the complete picture in order to help resolve your issue. Also, please provide feedback on all suggestions put to you (whether you feel they are relevant or not).
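On the Magic SysRq question in the list above: the keys only work if the kernel allows them, which is controlled by the value in /proc/sys/kernel/sysrq (1 enables everything, otherwise it is a bitmask in which 128 permits reboot/poweroff). A minimal sketch for checking that ahead of the next freeze follows. The helper name sysrq_allows_reboot is my own invention.

```shell
# Succeed (exit 0) if the given kernel.sysrq value permits reboot/poweroff.
# A value of 1 enables all functions; otherwise bit 128 must be set.
# (sysrq_allows_reboot is a hypothetical helper name.)
sysrq_allows_reboot() {
    sysrq_value=$1
    [ "$sysrq_value" -eq 1 ] || [ $(( sysrq_value & 128 )) -ne 0 ]
}

# Example use:
#   if sysrq_allows_reboot "$(cat /proc/sys/kernel/sysrq)"; then
#       echo "SysRq reboot is allowed"
#   else
#       echo "SysRq reboot is blocked"
#   fi
# Enable all SysRq functions until the next reboot:
#   echo 1 | sudo tee /proc/sys/kernel/sysrq
```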

system · 21 March 2022 03:27

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

tbg · 22 March 2022 00:40

Off to the trash bin with your thread, as this thread is worthless to others without some effort and feedback on your part.

I posted a ton of info for you, and you couldn't even be bothered to respond within two weeks' time.

It's quite rude to not respond to someone who went to that amount of effort to help you.

Why bother opening a help request if you have no intention of responding once help arrives.

I hope your next post shows better form than your inaugural thread.