CPU Load Too High | Slow Response | Freezing | Hang | Crash

And welcome to the forums.

3 Likes

@ServoGamer and @kamiryu-sama

It’s nice sharing a solution and confirming said solution, but I am not a fan of copy-pasting a solution without background knowledge of what it actually does.

Like I wrote in my topic [HELP/CHECK] btrfs - Add new drive to the system - #2 by BrutalBirdie, I am new to btrfs and a bit hesitant to make changes to the filesystem (/etc/fstab).

Here is what I can gather from the Arch wiki and the manpage:
https://wiki.archlinux.org/index.php/btrfs#Commit_interval
https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)

So after reading up on the wiki and the manual, I assume the following:

My default commit= is 30, since it's not defined in my /etc/fstab.
Your suggested fix commit=15 0 2 would halve the interval between periodic transaction commits, when data is synchronized to permanent storage.
The 0 2 are the <dump> and <pass> fields from /etc/fstab?

My /etc/fstab has these default options:

# <file system>                           <mount point>  <type>  <options>                                                      <dump>  <pass>
UUID=d3c75445-c596-456b-b7f9-a7bc48e9126b /              btrfs   subvol=@,defaults,noatime,space_cache,autodefrag,compress=zstd 0       1
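
If that reading is right, applying the fix would just mean appending commit=15 to the options field of that line, something like this (a sketch based on my line above, keeping my existing <dump> and <pass> values; not a tested config):

UUID=d3c75445-c596-456b-b7f9-a7bc48e9126b /              btrfs   subvol=@,defaults,noatime,space_cache,autodefrag,compress=zstd,commit=15 0       1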

If you could elaborate on why you have chosen these options and why this can help, that would be awesome; it would make a newbie like me feel safer handling my Garuda Linux.

Cheers :beers:

3 Likes

No real reason to fear editing fstab. Simply make a backup before editing, which can be restored from the terminal if you mess things up.
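
For example (the backup path is just a suggestion; any copy you can reach from a terminal or live session works):

sudo cp /etc/fstab /etc/fstab.bak    # back up before editing
sudo cp /etc/fstab.bak /etc/fstab    # restore from a terminal if things break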

Here are some good btrfs informational links:

If you are nervous about changing your commit interval via fstab, you can test an alternate method: a similar effect can be achieved with the kernel's sysctl interface.

You can change how often dirty data is committed to disk from every 30 seconds to every 5 seconds.

Change /proc/sys/vm/dirty_expire_centisecs from the default of 3000 (30 seconds) to 500 (5 seconds).

Use su to log in as root (this change is only temporary):

echo 500 > /proc/sys/vm/dirty_expire_centisecs

You can check the /proc/sys/vm/dirty_expire_centisecs setting with:

cat /proc/sys/vm/dirty_expire_centisecs

If this improves your performance, the setting can be made permanent with a conf file in the /etc/sysctl.d/ directory.
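
A minimal sketch of such a conf file (the file name here is arbitrary; only the key matters):

# /etc/sysctl.d/99-dirty-expire.conf (hypothetical name)
# Commit dirty data every 5 seconds instead of the default 30
vm.dirty_expire_centisecs = 500

It can then be applied without a reboot via sudo sysctl --system.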

6 Likes

Along with all the other fixes I'd tested, I forgot to mention this step, which I applied today and which helped immensely:

A day ago I followed the recommended procedure of restoring a timeshift snapshot from the grub boot menu; I then did a GUI timeshift restore of my last snapshot once I booted into my desktop (as recommended). Afterwards, shutdown and startup took much longer and my system was extremely sluggish. Today I repeated what I thought may have helped with this type of system slowdown once before.

I opened the timeshift GUI and deleted all my backup snapshots. Some snapshots required being deleted twice before they were removed from the menu. All my snapshots are manually created (not autosnapshots), and quotas are disabled both in the system and in timeshift. After all snapshots were removed, I performed a btrfs balance. I ran the balance twice; the first run can take a fair while, depending on how long it's been since the last one. The second balance (although redundant) I simply ran to be extra sure everything was in order. This is likely an excessive use of balancing, and is probably neither recommended nor required.
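
For anyone wanting to reproduce this, the balance step presumably boils down to something like the following (assuming / is the btrfs root; run as root, and note a full balance with no filters can take a long time):

btrfs balance start /       # full, unfiltered balance
btrfs balance status /      # check progress from another terminal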

After the balancing is complete I reboot, and the change is noticeable immediately. The system shuts down in the normal amount of time, and the reboot time is normal as well. As soon as the system was fully started, a series of tests proved the system performance was back as well. I have performed this sequence several times now, and in each case it has definitely improved things considerably. The first time I performed this along with many other troubleshooting steps, so I really wasn't sure what improved matters.

In this case I'm fairly confident this was the key step in correcting the performance issues. I can't say whether this will help others, as I'm using "on demand" snapshots with quotas disabled (not the standard timeshift default setup). What led me to this was reading numerous posts saying that an excessive number of snapshots could lead to system slowdowns. In my case I only had 4-6 snapshots (hardly what I'd consider excessive). However, the difference was like night and day after the deletion and re-balancing steps.

Perhaps that may help some.

10 Likes

Well I just followed that and a normal balance with no filters took over 20 minutes.

2 Likes

Yes, it will take a while, depending on how long ago your last balancing was.

2 Likes

Well I can report that helped A LOT.
I was running into smaller freezes here and there.

And upgrading the system could take up to 30 minutes when the dkms installs happened.

Well I just updated my kernels and it took 5 minutes.

:+1:

4 Likes

Well, that's encouraging - it seems this has wider application than just my own system.

I don't know whether this is applicable to systems that have never used the system restore feature, or if it simply helps after you've performed a restore. All I know is my system was working fine, then I got a balky update where I couldn't log in. I rolled the system back and it worked fine, but performance sucked.

After performing this procedure everything seemed to be back to normal. Updates afterwards were fine. I assume whatever caused the issue initially was quickly rectified.

Oh and of course, be sure to make a fresh timeshift snapshot after you've wiped the old ones (after the reboot).
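
If you prefer the terminal, timeshift's CLI can do that too, e.g. (the comment text is arbitrary):

sudo timeshift --create --comments "fresh post-balance snapshot"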

3 Likes

I did not restore a snapshot, ever.

2 Likes

If this is related to BTRFS snapshots then I wonder whether a "gentle" balance (e.g. -dusage=50 -musage=50) might be a candidate for a systemd timer.

Basic initial script:

#!/usr/bin/bash

# Run a filtered balance on every mounted btrfs filesystem.
# findmnt is less error-prone than grepping `mount` output.
for _mount in $(findmnt -t btrfs -n -o TARGET); do
        btrfs balance start -dusage=50 -musage=50 "$_mount"
done
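
If this were hooked up to a timer, a minimal sketch could look like the following (the unit and script names here are assumptions, not an existing Garuda convention):

# /etc/systemd/system/btrfs-balance.service (hypothetical name)
[Unit]
Description=Filtered btrfs balance on all mounted btrfs filesystems

[Service]
Type=oneshot
ExecStart=/usr/local/bin/btrfs-balance.sh

# /etc/systemd/system/btrfs-balance.timer (hypothetical name)
[Unit]
Description=Run a filtered btrfs balance on a schedule

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target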
4 Likes

I did my first balance earlier this evening, more as an experience-building exercise than to solve any problems.

Like @BrutalBirdie, I neither deleted nor restored any snapshots, but system snappiness improved on reboot.

Thanks to one and all for the very interesting read and information. :+1:

3 Likes

In my case I do periodic balancing fairly regularly. So I guess the question would be how often is enough?

The answer would seem to vary greatly depending on the percentage of file churn on a weekly basis. As I do not use my home directory for storing large data files such as HD movies, my amount of data churn is usually minimal.

The only large files stored on my system partitions are my system snapshots. When I delete my snaps, which I do whenever I reach 4 or 5 backups, I perform a balance.

I'm sure others must have far greater data churn than myself as I store nothing in my home directory (it is all symlinked).

2 Likes

Same here - 2 x 500 GB SSDs symlinked under home, as backup_store and data_store - I doubt my post added much to this, but evidence is evidence, so I submitted my results to hopefully help. :slightly_smiling_face:

1 Like

This is starting to seem like a good option to implement.

What kind of time frame do you think might be suitable for this timer?

I’m guessing how often it would be needed would vary greatly depending on the size of the drive(s) and the percentage of data being flushed on a regular basis.

The only problem is that, on systems prone to freezing, a balance operation is one of the commonly mentioned triggers. So the question would be: is it better to have the timer fire infrequently, at a low-usage time such as midweek at 4 am? Or would it be less likely to trigger a freeze if run more often (as each run would take far less time to finish)?
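
For what it's worth, the midweek 4 am idea maps directly onto a systemd calendar spec (a sketch):

OnCalendar=Wed *-*-* 04:00:00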

Thoughts?

1 Like

If 24/7 use was presumed, then yes - that sounds sensible, but usage (like mileage) will obviously vary. :grin: And I can’t envision a one-size-fits-all solution (but hell, I’m a noob).

My personal use is not 24/7, but per-day instances. In my case, manual would seem better IMHO.

Just my two pennies' worth.

2 Likes

Timers can of course be easily enabled and disabled. For those unfamiliar with systemd usage, an option to enable/disable a balance timer could probably be added to the Garuda Assistant app.
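
Assuming the unit names from the sketch earlier in the thread, it is a one-liner each way:

sudo systemctl enable --now btrfs-balance.timer
sudo systemctl disable --now btrfs-balance.timer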

2 Likes

What about using this by default?

8 Likes

From your posted source:

Typical use cases

A rolling distro

  • frequency of updates: high, multiple times per week
  • amount of data changed between updates: high

Suggested values:

TIMELINE_LIMIT_HOURLY="12"
TIMELINE_LIMIT_DAILY="5"
TIMELINE_LIMIT_WEEKLY="2"
TIMELINE_LIMIT_MONTHLY="1"
TIMELINE_LIMIT_YEARLY="0"

The size of root partition should be at least 30GiB, but more is better.

4 Likes

I haven’t looked at it super closely yet, but they all seem well thought out.

4 Likes

I just installed this as a package from the AUR.
I will see if this makes life easier with btrfs.

6 Likes