Also, are you sure you are not running out of RAM when your system locks up? Your system is reported as having under 4 GB of RAM (4 GB is the minimum we recommend). Your CPU usage could be responsible for a lockup, but so could your RAM.
Just to be extra sure, try running these commands separately:
sudo systemctl stop NetworkManager; sudo ip link set wlp3s0 down; sudo modprobe -rv rtl8821ae; sleep 3; sudo modprobe -v rtl8821ae msi=0; sudo ip link set wlp3s0 up; sudo systemctl start NetworkManager
sudo systemctl stop NetworkManager; sudo ip link set wlp3s0 down; sudo modprobe -rv rtl8821ae; sleep 3; sudo modprobe -v rtl8821ae msix=0; sudo ip link set wlp3s0 up; sudo systemctl start NetworkManager
Test each command on its own, without rebooting after running it. Be sure to test each command on different kernels as well. The LTS kernel is the most likely to be successful in your case.
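If you want to confirm that an option actually took effect after reloading the module, something like the sketch below may help. `show_module_params` is a hypothetical helper name, not an existing tool; it just prints whatever the kernel exposes under /sys/module for a loaded module. Which parameters exist depends on the driver and kernel version, so treat the output as the source of truth rather than the option names used above.

```shell
# Sketch only: show_module_params is a hypothetical helper, not a standard command.
# It prints name=value for every parameter a loaded module exposes in sysfs.
show_module_params() {
    mod="$1"
    root="${2:-/sys/module}"    # second argument exists only to ease testing
    for f in "$root/$mod/parameters"/*; do
        [ -e "$f" ] || continue    # module not loaded or exposes no parameters
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f" 2>/dev/null)"
    done
}

# Typical use after one of the modprobe commands:
#   show_module_params rtl8821ae
# To list the parameters the driver accepts at all:
#   modinfo -p rtl8821ae
```

If an option you passed does not appear in either listing, the driver most likely does not support it on that kernel.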
I have not yet experienced running out of RAM using Garuda KDE Dr460nized even under these conditions. I have not seen it hit the 3.0 GiB mark just yet but I will recheck the RAM usage the next time my system freezes using htop on TTY.
Whenever my system freezes, I can still move my cursor around. It usually freezes when I keep hovering over latte dock or when I hover over "Networks" in the system tray.
The highest RAM usage I've seen with Garuda while having this high CPU usage & errors in the background was, I believe, around 1.8 GiB - 2.1 GiB as shown by System Monitor. This was during a new install (with the Wi-Fi enabled by default on boot), when I was exploring & opening applications on Garuda KDE Dr460nized, unaware that Wi-Fi was the cause of the high CPU usage & errors in the background.
The only thing confusing me about monitoring RAM usage is that System Monitor, htop, KSysGuard & Stacer are showing me different readings, System Monitor most of all. With Wi-Fi disabled since last reboot and KSysGuard, Konsole, Dolphin, Kate, System Monitor, Fire Dragon & Stacer open,
System Monitor shows Memory is using 2.7 GiB/3.7 GiB
htop shows Mem[1.69G/3.71G] & Swp[519M/11.7G]
KSysGuard shows Memory: 1.6 GiB of 3.7 GiB & Swap: 0.51 GiB of 11.7 GiB
Have you tested changing the kernel boot parameters yet?
If disabling msi/msix with the realtek wifi driver option does not correct your issue, then disabling msi with a boot parameter seems your only option left.
Add the GRUB kernel boot parameter pci=nomsi to /etc/default/grub:
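A minimal sketch of that edit is below. `add_kernel_param` is a hypothetical helper written for this post, not a Garuda tool, and it assumes the GRUB_CMDLINE_LINUX_DEFAULT value in your file is wrapped in double quotes. It makes a backup copy first so the change can be reverted easily.

```shell
# Sketch only: add_kernel_param is a hypothetical helper, not a Garuda tool.
# Assumes the GRUB_CMDLINE_LINUX_DEFAULT value is wrapped in double quotes.
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"
    # Do nothing if the parameter is already on the line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    cp -- "$file" "$file.bak"    # keep a backup so the change is easy to revert
    # Append the parameter just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file"
}

# Typical use (as root), then regenerate the GRUB config and reboot:
#   add_kernel_param pci=nomsi
#   update-grub
# After rebooting, /proc/cmdline should list the parameter:
#   cat /proc/cmdline
```

Editing the line by hand in a text editor works just as well; the important parts are the backup, running update-grub afterwards, and checking /proc/cmdline after the reboot.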
Because that includes buffers. I'd suggest you read up on how Linux handles memory. It's quite interesting. Anyway, just open a terminal and type free -M which will give you a truer picture.
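To see roughly where the different readings come from, you can also look at /proc/meminfo directly. The sketch below is only an approximation (`mem_summary` is a hypothetical helper, and each monitoring tool uses its own formula), but it separates memory used by programs from memory the kernel is using for buffers and cache, which can be reclaimed when needed.

```shell
# Sketch only: mem_summary is a hypothetical helper; tools such as htop or
# free use their own, slightly different formulas.
mem_summary() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemFree:/      { free = $2 }
        /^MemAvailable:/ { avail = $2 }
        /^Buffers:/      { buffers = $2 }
        /^Cached:/       { cached = $2 }
        END {
            # /proc/meminfo values are in kB; convert to MiB.
            printf "used_excl_cache_MiB=%d buff_cache_MiB=%d available_MiB=%d\n",
                (total - free - buffers - cached) / 1024,
                (buffers + cached) / 1024,
                avail / 1024
        }
    ' "${1:-/proc/meminfo}"
}

# Typical use:
#   mem_summary
```

A tool that counts buffers and cache as "used" will report a noticeably higher figure than one that does not, while the "available" figure is the better guide to how close the system is to running out.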
sudo systemctl stop NetworkManager; sudo ip link set wlp3s0 down; sudo modprobe -rv rtl8821ae; sleep 3; sudo modprobe -v rtl8821ae msi=0; sudo ip link set wlp3s0 up; sudo systemctl start NetworkManager
rmmod rtl8821ae
rmmod rtl_pci
rmmod btcoexist
insmod /lib/modules/5.19.5-zen1-1-zen/kernel/drivers/net/wireless/realtek/rtlwifi/rtl_pci.ko.zst
insmod /lib/modules/5.19.5-zen1-1-zen/kernel/drivers/net/wireless/realtek/rtlwifi/btcoexist/btcoexist.ko.zst
insmod /lib/modules/5.19.5-zen1-1-zen/kernel/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/rtl8821ae.ko.zst msi=0
sudo systemctl stop NetworkManager; sudo ip link set wlp3s0 down; sudo modprobe -rv rtl8821ae; sleep 3; sudo modprobe -v rtl8821ae msix=0; sudo ip link set wlp3s0 up; sudo systemctl start NetworkManager
rmmod rtl8821ae
rmmod rtl_pci
rmmod btcoexist
insmod /lib/modules/5.19.5-zen1-1-zen/kernel/drivers/net/wireless/realtek/rtlwifi/rtl_pci.ko.zst
insmod /lib/modules/5.19.5-zen1-1-zen/kernel/drivers/net/wireless/realtek/rtlwifi/btcoexist/btcoexist.ko.zst
insmod /lib/modules/5.19.5-zen1-1-zen/kernel/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/rtl8821ae.ko.zst msix=0
linux-lts:
sudo systemctl stop NetworkManager; sudo ip link set wlp3s0 down; sudo modprobe -rv rtl8821ae; sleep 3; sudo modprobe -v rtl8821ae msi=0; sudo ip link set wlp3s0 up; sudo systemctl start NetworkManager
rmmod rtl8821ae
rmmod rtl_pci
rmmod btcoexist
insmod /lib/modules/5.15.63-1-lts/kernel/drivers/net/wireless/realtek/rtlwifi/rtl_pci.ko.zst
insmod /lib/modules/5.15.63-1-lts/kernel/drivers/net/wireless/realtek/rtlwifi/btcoexist/btcoexist.ko.zst
insmod /lib/modules/5.15.63-1-lts/kernel/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/rtl8821ae.ko.zst msi=0
sudo systemctl stop NetworkManager; sudo ip link set wlp3s0 down; sudo modprobe -rv rtl8821ae; sleep 3; sudo modprobe -v rtl8821ae msix=0; sudo ip link set wlp3s0 up; sudo systemctl start NetworkManager
rmmod rtl8821ae
rmmod rtl_pci
rmmod btcoexist
insmod /lib/modules/5.15.63-1-lts/kernel/drivers/net/wireless/realtek/rtlwifi/rtl_pci.ko.zst
insmod /lib/modules/5.15.63-1-lts/kernel/drivers/net/wireless/realtek/rtlwifi/btcoexist/btcoexist.ko.zst
insmod /lib/modules/5.15.63-1-lts/kernel/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/rtl8821ae.ko.zst msix=0
Same issues & CPU%. The only thing I notice is that midway through the output, where it briefly pauses at rmmod btcoexist, CPU% drops below 10%, but it goes back up again when insmod appears and the Wi-Fi is automatically re-enabled.
Yes, I hope that would be the case too. I'm just concerned: I had the same issue back in 2017-2018 when I used Manjaro as I do now in 2022, and the kernel I was using then hadn't even reached version 5.0. Will any kernel version from between those years even have the fix?
I might have to install every old kernel back through versions 4.3 - 3.14 as shown on this website, which shows the source changes of the rtl8821ae driver in each kernel version.
Or perhaps I shouldn't? I'm not even sure if there are any driver changes at all between those versions & it might do more harm than good to my system.
I get that. I was on my phone when creating this post & for the past few days, I have had to copy and paste, save text files with commands and send them via Bluetooth from phone to laptop & vice versa.
I have not tried it yet. I've only tried sudo reboot on TTY and if it takes forever to reboot, I just force shutdown which I know I should avoid doing. I'll save that command for when my system freezes.
I've actually been trying to reproduce the system freeze but it hasn't happened lately. Maybe because I was tinkering a bit with the compositor, latte dock and desktop effects or because of an update.
I tried before but I wasn't sure if I did it correctly when following this guide. I will be trying those commands. It should be fine to edit it, right? GRUB was updated together with a system update (garuda-update) and showed a red warning message to reinstall manually, but unlike for other users, my system still didn't break after rebooting.
Also, do you need my PCIe Bus Error logs? Using sudo dmesg | grep "pcie", I found these errors repeatedly shown:
I see. I'd definitely do some reading about it in the future, since I'm also quite interested in zram & swap, seeing as my laptop only has 3804 MiB.
I couldn't use it with the capital M, it gives me invalid option. Had to input free -m. I will be using this command more often.
This was something I was considering as a possibility before you mentioned it. I was wondering if bluetooth was part of the problem as your adapter is a wifi/BT combo chip. I would suggest testing your system with Bluetooth blacklisted. Search the forum, it is a simple conf file required to blacklist the btusb module. Restart after creating the btusb blacklist file.
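For reference, a minimal sketch of creating such a file is below. The helper name `write_btusb_blacklist` and the file name blacklist-btusb.conf are my own choices for illustration; any file name ending in .conf under /etc/modprobe.d should work.

```shell
# Sketch only: write_btusb_blacklist and the file name are hypothetical choices.
write_btusb_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    # A single "blacklist <module>" line stops the module from auto-loading.
    printf 'blacklist btusb\n' > "$dir/blacklist-btusb.conf"
}

# Typical use (as root), followed by a reboot:
#   write_btusb_blacklist
# After the reboot this should print nothing if the blacklist took effect:
#   lsmod | grep btusb
# To undo the test, delete the file and reboot again:
#   rm /etc/modprobe.d/blacklist-btusb.conf
```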
It should be just fine, even if there was a problem it could be easily reverted as I have the grub config file backed up before any changes are made.
It is always best to make a backup copy of any configuration file before you do any editing. If you do this, even if there are any negative side effects then things can be easily reversed.
Regarding running really old kernels, that's not a great idea when using a rolling distro.
Blacklisting Bluetooth still gave me the same issue. I created a file rtl8821ae-blacklist.conf at /etc/modprobe.d/ with blacklist usb written in it, rebooted and it still had the same issue. I later added blacklist btrtl & rebooted, same thing. I removed the file later with sudo rm /etc/modprobe.d/rtl8821ae-blacklist.conf and rebooted.
What did make a change was putting the pci=nomsi kernel parameter in GRUB. It stopped the CPU% from soaring, but I noticed that after letting my laptop sleep and turning it back on, Wi-Fi shows a notification that it got deactivated & activated. Sometimes it does this twice. I also noticed it took a bit longer to process & load a website, though that could just be my internet.
Then I replaced pci=nomsi with pci=noaer, then input sudo update-grub & reboot. This also fixed the CPU% problem, but same as pci=nomsi, Wi-Fi deactivates & activates after turning laptop back on from sleep.
Then I read about another kernel parameter that can disable or enable the Active-State Power Management, it is pcie_aspm=off.
I tried this kernel parameter because it said something about disabling ASPM to allow PCIe links to operate with maximum performance and about latency, which I thought could be the issue. To my surprise, pcie_aspm=off also fixes the high CPU issue, but it still has the same issue with Wi-Fi after sleep plus this screenshot below.
Screenshot
With ASPM disabled, this shows:
ASUS-X441URK kernel: PCIe ASPM is disabled
ASUS-X441URK kernel: ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
ASUS-X441URK kernel: acpi PNP0A08:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]
ASUS-X441URK kernel: acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
With ASPM enabled, this shows:
ASUS-X441URK kernel: ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
ASUS-X441URK kernel: acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
ASUS-X441URK kernel: acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
ASUS-X441URK kernel: r8169 0000:02:00.0: can't disable ASPM; OS doesn't have ASPM control
This made me think that maybe MSI and ASPM don't work well together on my system.
I want to dig deeper to the root of the problem and maybe I can find a way to report a bug issue about it somewhere.
Using the three kernel parameters, they had the same problem when connecting to Wi-Fi from sleep. When I try to manually connect, it loads for a minute until it finally shows a notification that Wi-Fi is connected.
Other things I've done were disabling MAC address randomization and enabling Wifi powersave from Garuda Network Assistant, enabling the firewall from KDE settings, enabling UFW from Garuda Assistant too and these:
I recently tried this when my system froze. It turns out the freezing wasn't fixed at all, even though I thought it was. The command didn't restart Plasma; it showed an error on the TTY, and when I went back to the desktop using Alt + Left Arrow, the system was still frozen. I also tried it with sudo and still got an error.
You now have a side effect that your wifi is not coming up correctly after a suspend operation. Have you tried disabling your wifi before going into suspend? If you can initiate your wifi connection properly (manually after resuming), then this should be able to be automated by creating a systemd service.
Search the forum and the Internet for "systemd wifi resume service" if you would like to write a service to automate this procedure.
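As a rough starting point, a service of that kind could look like the sketch below. It is untested on your hardware and rests on several assumptions: the unit name wifi-resume.service and the helper `write_wifi_resume_unit` are made up for this post, modprobe is assumed to live at /usr/bin/modprobe, and reloading the rtl8821ae module is assumed to be enough to bring the connection back. If restarting NetworkManager turns out to be what actually helps, swap the ExecStart lines accordingly.

```shell
# Sketch only: write_wifi_resume_unit and wifi-resume.service are hypothetical names.
# The unit reloads the Wi-Fi driver once the system has resumed from suspend.
write_wifi_resume_unit() {
    dir="${1:-/etc/systemd/system}"
    cat > "$dir/wifi-resume.service" <<'EOF'
[Unit]
Description=Reload rtl8821ae after resume (sketch, adjust as needed)
After=suspend.target

[Service]
Type=oneshot
ExecStart=/usr/bin/modprobe -r rtl8821ae
ExecStart=/usr/bin/modprobe rtl8821ae

[Install]
WantedBy=suspend.target
EOF
}

# Typical use (as root):
#   write_wifi_resume_unit
#   systemctl daemon-reload
#   systemctl enable wifi-resume.service
```

Test the two modprobe commands by hand after a resume first; if they do not restore the connection manually, the service will not either.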