Btrfs-cleaner gobbling up CPU but quota is disabled!

Post your terminal/konsole in- and output as text (no pictures) from:

System:
Kernel: 5.16.0-zen1-1-zen x86_64 bits: 64 compiler: gcc v: 11.1.0
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=03da7a37-4ea3-4808-b057-6f1ef916effa rw [email protected]
splash rd.udev.log_priority=3 vt.global_cursor_default=0
systemd.unified_cgroup_hierarchy=1
resume=UUID=6b5e9134-5814-43fa-a3ec-627a454e7d9c loglevel=3
Desktop: KDE Plasma 5.23.5 tk: Qt 5.15.2 info: latte-dock wm: kwin_x11
vt: 1 dm: SDDM Distro: Garuda Linux base: Arch Linux
Machine:
Type: Desktop Mobo: Micro-Star
model: MPG X570 GAMING PRO CARBON WIFI (MS-7B93) v: 1.0
serial: <superuser required> UEFI: American Megatrends LLC. v: 1.E0
date: 12/17/2021
CPU:
Info: model: AMD Ryzen 9 5950X bits: 64 type: MT MCP arch: Zen 3
family: 0x19 (25) model-id: 0x21 (33) stepping: 0 microcode: 0xA201016
Topology: cpus: 1x cores: 16 tpc: 2 threads: 32 smt: enabled cache:
L1: 1024 KiB desc: d-16x32 KiB; i-16x32 KiB L2: 8 MiB desc: 16x512 KiB
L3: 64 MiB desc: 2x32 MiB
Speed (MHz): avg: 3865 high: 4473 min/max: 2200/5083 boost: enabled
scaling: driver: acpi-cpufreq governor: performance cores: 1: 3582 2: 3532
3: 3575 4: 4471 5: 3787 6: 3577 7: 3578 8: 3573 9: 4117 10: 3572 11: 3980
12: 4450 13: 3554 14: 4423 15: 4462 16: 3606 17: 3635 18: 3578 19: 3708
20: 4465 21: 3589 22: 3598 23: 3579 24: 3646 25: 4428 26: 3584 27: 4389
28: 4473 29: 3685 30: 3868 31: 3755 32: 3878 bogomips: 217612
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Vulnerabilities:
Type: itlb_multihit status: Not affected
Type: l1tf status: Not affected
Type: mds status: Not affected
Type: meltdown status: Not affected
Type: spec_store_bypass
mitigation: Speculative Store Bypass disabled via prctl
Type: spectre_v1
mitigation: usercopy/swapgs barriers and __user pointer sanitization
Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional,
IBRS_FW, STIBP: always-on, RSB filling
Type: srbds status: Not affected
Type: tsx_async_abort status: Not affected
Graphics:
Device-1: AMD Vega 10 XL/XT [Radeon RX Vega 56/64] vendor: ASUSTeK
driver: amdgpu v: kernel bus-ID: 2f:00.0 chip-ID: 1002:687f class-ID: 0300
Device-2: ARC Camera type: USB driver: snd-usb-audio,uvcvideo
bus-ID: 5-1.2.2:9 chip-ID: 05a3:9331 class-ID: 0102 serial: <filter>
Device-3: ET13R type: USB driver: snd-usb-audio,uvcvideo bus-ID: 5-4:4
chip-ID: 1e4f:1301 class-ID: 0102 serial: <filter>
Display: x11 server: X.Org 1.21.1.3 compositor: kwin_x11 driver:
loaded: amdgpu,ati unloaded: modesetting alternate: fbdev,vesa
display-ID: :0 screens: 1
Screen-1: 0 s-res: 4240x1440 s-dpi: 96 s-size: 1121x381mm (44.1x15.0")
s-diag: 1184mm (46.6")
Monitor-1: DisplayPort-0 res: 2560x1440 dpi: 109
size: 597x336mm (23.5x13.2") diag: 685mm (27")
Monitor-2: DisplayPort-1 res: 1680x1050 hz: 60 dpi: 71
size: 598x336mm (23.5x13.2") diag: 686mm (27")
OpenGL: renderer: AMD Radeon RX Vega (VEGA10 DRM 3.44.0 5.16.0-zen1-1-zen
LLVM 13.0.0)
v: 4.6 Mesa 21.3.4 direct render: Yes
Audio:
Device-1: AMD Vega 10 HDMI Audio [Radeon Vega 56/64] driver: snd_hda_intel
v: kernel bus-ID: 2f:00.1 chip-ID: 1002:aaf8 class-ID: 0403
Device-2: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI
driver: snd_hda_intel v: kernel bus-ID: 31:00.4 chip-ID: 1022:1487
class-ID: 0403
Device-3: Razer USA Nari Ultimate type: USB
driver: hid-generic,snd-usb-audio,usbhid bus-ID: 1-1:2 chip-ID: 1532:051a
class-ID: 0300
Device-4: ARC Camera type: USB driver: snd-usb-audio,uvcvideo
bus-ID: 5-1.2.2:9 chip-ID: 05a3:9331 class-ID: 0102 serial: <filter>
Device-5: ET13R type: USB driver: snd-usb-audio,uvcvideo bus-ID: 5-4:4
chip-ID: 1e4f:1301 class-ID: 0102 serial: <filter>
Sound Server-1: ALSA v: k5.16.0-zen1-1-zen running: yes
Sound Server-2: JACK v: 1.9.20 running: no
Sound Server-3: PulseAudio v: 15.0 running: yes
Sound Server-4: PipeWire v: 0.3.43 running: yes
Network:
Device-1: Intel I211 Gigabit Network vendor: Micro-Star MSI driver: igb
v: kernel port: d000 bus-ID: 26:00.0 chip-ID: 8086:1539 class-ID: 0200
IF: enp38s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
IF-ID-1: anbox0 state: down mac: <filter>
IF-ID-2: docker0 state: down mac: <filter>
Bluetooth:
Device-1: Intel AX200 Bluetooth type: USB driver: btusb v: 0.8
bus-ID: 1-4:4 chip-ID: 8087:0029 class-ID: e001
Report: bt-adapter note: tool can't run ID: hci0 rfk-id: 0 state: down
bt-service: disabled rfk-block: hardware: no software: no address: N/A
Drives:
Local Storage: total: 5.2 TiB used: 1.75 TiB (33.6%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:4 vendor: Samsung
model: SSD 970 EVO Plus 2TB size: 1.82 TiB block-size: physical: 512 B
logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
rev: 2B2QEXM7 temp: 70.8 C scheme: GPT
ID-2: /dev/nvme1n1 maj-min: 259:0 vendor: Samsung
model: SSD 970 EVO Plus 2TB size: 1.82 TiB block-size: physical: 512 B
logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
rev: 3B2QEXM7 temp: 69.8 C scheme: GPT
ID-3: /dev/sda maj-min: 8:0 vendor: Mushkin model: MKNSSDRE1TB
size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: 7C scheme: GPT
ID-4: /dev/sdb maj-min: 8:16 vendor: Mushkin model: MKNSSDCR480GB
size: 447.13 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: BBF0 scheme: MBR
ID-5: /dev/sdc maj-min: 8:32 vendor: OCZ model: AGILITY3 size: 223.57 GiB
block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s type: SSD
serial: <filter> rev: 2.15 scheme: MBR
Partition:
ID-1: / raw-size: 1.75 TiB size: 3.57 TiB (203.87%) used: 1.02 TiB (28.7%)
fs: btrfs dev: /dev/nvme1n1p2 maj-min: 259:2
ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
used: 576 KiB (0.2%) fs: vfat dev: /dev/nvme1n1p1 maj-min: 259:1
ID-3: /home raw-size: 1.75 TiB size: 3.57 TiB (203.87%)
used: 1.02 TiB (28.7%) fs: btrfs dev: /dev/nvme1n1p2 maj-min: 259:2
ID-4: /var/log raw-size: 1.75 TiB size: 3.57 TiB (203.87%)
used: 1.02 TiB (28.7%) fs: btrfs dev: /dev/nvme1n1p2 maj-min: 259:2
ID-5: /var/tmp raw-size: 1.75 TiB size: 3.57 TiB (203.87%)
used: 1.02 TiB (28.7%) fs: btrfs dev: /dev/nvme1n1p2 maj-min: 259:2
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
ID-1: swap-1 type: partition size: 69.06 GiB used: 0 KiB (0.0%)
priority: -2 dev: /dev/nvme1n1p3 maj-min: 259:3
ID-2: swap-2 type: zram size: 62.78 GiB used: 9.5 MiB (0.0%)
priority: 100 dev: /dev/zram0
Sensors:
System Temperatures: cpu: N/A mobo: N/A gpu: amdgpu temp: 57.0 C
mem: 54.0 C
Fan Speeds (RPM): N/A gpu: amdgpu fan: 1410
Info:
Processes: 684 Uptime: 13h 7m wakeups: 0 Memory: 62.78 GiB
used: 12.98 GiB (20.7%) Init: systemd v: 250 tool: systemctl Compilers:
gcc: 11.1.0 clang: 13.0.0 Packages: 1990 pacman: 1975 lib: 555 flatpak: 6
snap: 9 Shell: fish v: 3.3.1 default: Bash v: 5.1.16 running-in: konsole
inxi: 3.3.11

Hi,

I have the issue that btrfs-cleaner is constantly using a lot of CPU and also slows down I/O. I already searched for this issue on the forums and on the net. Usually this happens if quota is enabled.
But, to the best of my knowledge, I do not have quota enabled and never had (the installation is quite recent, from January 2022, and it never had Timeshift but started with snapper right away).

Is there any log where I can have a look what btrfs-cleaner does?

I copied a lot of files and data to my home drive (at least a TB of Steam games and also a backup of my Windows drive) and also had snapper take some snapshots of that drive. Since I noticed btrfs-cleaner slowing down my system, I already disabled timeline snapshots for my home subvolume (and now also for root) and hoped it would finish whatever it was doing and calm down a bit. It has now been running the whole night and is still consuming CPU time.

I also see a lot of activity from baloo, but after understanding what it is (file indexing), that seems to be expected. I suspended indexing (balooctl suspend) for now, and it came to an end, but btrfs-cleaner is still very active... :-/

thanks for any hints.
Garfonso

Sometimes quota can get re-enabled.

Check if btrfs quota is enabled, and find how much space a btrfs snapshot is taking up:

sudo btrfs qgroup show /

Please post output.

Disable btrfs quotas:

sudo btrfs quota disable /
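Quota is a property of the whole filesystem rather than of individual subvolumes, so one check per mounted btrfs filesystem is enough. A minimal sketch, assuming btrfs-progs prints "quotas not enabled" when quota is off (the wording may differ between versions); the function names `quota_state` and `check_all_btrfs_quota` are made up for illustration:

```shell
# Classify the result of `btrfs qgroup show <mountpoint>`.
# $1 = exit status of the command, $2 = its combined stdout/stderr.
# Assumption: btrfs-progs reports "quotas not enabled" when quota is off.
quota_state() {
    case "$2" in
        *"quotas not enabled"*) echo "disabled"; return ;;
    esac
    if [ "$1" -eq 0 ]; then
        echo "enabled"
    else
        echo "unknown"   # some other error, e.g. missing permissions
    fi
}

# Hypothetical helper: check every mounted btrfs filesystem (run as root).
# Subvolumes of the same filesystem all report the same state.
check_all_btrfs_quota() {
    findmnt -rn -t btrfs -o TARGET | while read -r mnt; do
        out=$(btrfs qgroup show "$mnt" 2>&1)
        rc=$?
        echo "$mnt: $(quota_state "$rc" "$out")"
    done
}
```

Calling `check_all_btrfs_quota` as root prints one line per btrfs mount point; nothing is changed on disk.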

Sure:

[🔴] × sudo btrfs qgroup show /
ERROR: can't list qgroups: quotas not enabled

Well that answers that question.

Try this:

Change /proc/sys/vm/dirty_expire_centisecs to 500 (5 seconds) from 3000 (30 seconds is the default).

Use su to log in as root (this change is only temporary):

echo 500 > /proc/sys/vm/dirty_expire_centisecs
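To see the current value before changing it, and to make the change through sysctl instead of echoing into /proc, something like the following should work. This is only a sketch: the helper names are invented here, and the drop-in file name in the last comment is just an example.

```shell
# Convert a centisecond value (the unit of vm.dirty_expire_centisecs) to seconds.
centisecs_to_seconds() {
    echo $(( $1 / 100 ))
}

# Print the current setting; reading it needs no root.
show_dirty_expire() {
    cs=$(cat /proc/sys/vm/dirty_expire_centisecs)
    echo "vm.dirty_expire_centisecs = $cs ($(centisecs_to_seconds "$cs") s)"
}

# Set a new value until the next reboot (run as root).
set_dirty_expire() {
    sysctl -w vm.dirty_expire_centisecs="$1"
}

# To keep the value across reboots, a line such as
#   vm.dirty_expire_centisecs = 500
# can go into a file under /etc/sysctl.d/ (e.g. 99-dirty-expire.conf; the name is arbitrary).
```

For example, `show_dirty_expire` followed by `set_dirty_expire 500` applies the value suggested above.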

If the above does not help, then try:

btrfs quota disable /data

That did not change anything, sadly..

Says:

[[email protected] achim]# btrfs quota disable /data
ERROR: cannot access '/data': No such file or directory

I do not have a /data folder or volume, but I do have a home subvolume and already did that for home, too (a few hours ago)...

My bad, that was my path on my comp. Recorded the command and forgot it was specific to my system.

Just curious:

systemctl list-unit-files --state=enabled --no-pager

systemctl list-unit-files --state=failed --no-pager

One of your best bets might be to try lowering the priority of the btrfs-cleaner process.

This can be done in different ways. If you already have ananicy-cpp.service running you could try writing a rule for btrfs-cleaner.

You could also try the "cpulimit" tool and set a maximum CPU rate for btrfs-cleaner.

https://archlinux.org/packages/community/x86_64/cpulimit/

Hope that does the trick for you.

Should I try to disable quota for all the subvolumes?

[[email protected] achim]# systemctl list-unit-files --state=enabled --no-pager
UNIT FILE                                             STATE   VENDOR PRESET
var-lib-snapd-snap-bare-5.mount                       enabled disabled     
var-lib-snapd-snap-core-11993.mount                   enabled disabled     
var-lib-snapd-snap-core18-2253.mount                  enabled disabled     
var-lib-snapd-snap-core18-2284.mount                  enabled disabled     
var-lib-snapd-snap-gnome\x2d3\x2d28\x2d1804-161.mount enabled disabled     
var-lib-snapd-snap-gtk\x2dcommon\x2dthemes-1519.mount enabled disabled     
var-lib-snapd-snap-journey-23.mount                   enabled disabled     
var-lib-snapd-snap-onenote\x2ddesktop-13.mount        enabled disabled     
var-lib-snapd-snap-snapd-14295.mount                  enabled disabled     
grub-btrfs-snapper.path                               enabled disabled     
ananicy-cpp.service                                   enabled disabled     
anbox-container-manager.service                       enabled disabled     
avahi-daemon.service                                  enabled disabled     
docker.service                                        enabled disabled     
fancontrol.service                                    enabled disabled     
garuda-pacman-lock.service                            enabled disabled     
[email protected]                                        enabled enabled      
ipp-usb.service                                       enabled disabled     
irqbalance.service                                    enabled disabled     
libvirtd.service                                      enabled disabled     
linux-modules-cleanup.service                         enabled disabled     
lm_sensors.service                                    enabled disabled     
memavaild.service                                     enabled disabled     
ModemManager.service                                  enabled disabled     
NetworkManager-dispatcher.service                     enabled disabled     
NetworkManager-wait-online.service                    enabled disabled     
NetworkManager.service                                enabled disabled     
nmb.service                                           enabled disabled     
nohang-desktop.service                                enabled disabled     
preload.service                                       enabled disabled     
prelockd.service                                      enabled disabled     
sddm-plymouth.service                                 enabled disabled     
smb.service                                           enabled disabled     
snapd.service                                         enabled disabled     
sshd.service                                          enabled disabled     
systemd-network-generator.service                     enabled enabled      
systemd-networkd-wait-online.service                  enabled disabled     
systemd-networkd.service                              enabled enabled
systemd-timesyncd.service                             enabled enabled
teamviewerd.service                                   enabled disabled
uresourced.service                                    enabled disabled
avahi-daemon.socket                                   enabled disabled
cups.socket                                           enabled disabled
libvirtd-ro.socket                                    enabled disabled
libvirtd.socket                                       enabled disabled
saned.socket                                          enabled disabled
snapd.socket                                          enabled disabled
systemd-networkd.socket                               enabled disabled
virtlockd.socket                                      enabled disabled
virtlogd.socket                                       enabled disabled
remote-fs.target                                      enabled enabled
btrfs-balance.timer                                   enabled disabled
btrfs-defrag.timer                                    enabled disabled
btrfs-scrub.timer                                     enabled disabled
btrfs-trim.timer                                      enabled disabled
snapper-boot.timer                                    enabled disabled
snapper-cleanup.timer                                 enabled disabled
snapper-timeline.timer                                enabled disabled

58 unit files listed.
[[email protected] achim]# systemctl list-unit-files --state=failed --no-pager
UNIT FILE STATE VENDOR PRESET

0 unit files listed.

Learning new commands here. :)

I doubt that's necessary.

I see you have ananicy-cpp.service running, so you could use it for limiting btrfs-cleaner (or any other method from my last post).

ok, thanks. Will read up on that.


This may not help you, as on my system btrfs-cleaner's niceness level is already set as low as possible. YMMV.

https://man.archlinux.org/man/nice.1.en


Continuing from above:

You can however fine tune other systemd parameters. This gets rather complex if you're not familiar with systemd services. I'm plenty familiar with it, and I've never attempted tuning anything in this manner. Definitely not light reading:

Understanding systemd scheduling-related options

https://wiki.archlinux.org/title/systemd

Here are some other ideas I located that I hadn't seen before:

I did find one fairly recent thread claiming that removing the autodefrag mount option from btrfs partitions in /etc/fstab fixed a similar issue. That's unfortunately not a very good long-term solution (but I guess it's worth a try).

See:

autodefrag mount option causes high cpu load
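Whether a filesystem is actually mounted with autodefrag can be checked from the live mount table rather than from /etc/fstab alone. A small sketch (the function names are invented for illustration); it only reads /proc/mounts and changes nothing:

```shell
# Succeed if the comma-separated mount option string contains the exact
# option "autodefrag" (so "noautodefrag" does not match).
has_autodefrag() {
    case ",$1," in
        *,autodefrag,*) return 0 ;;
        *)              return 1 ;;
    esac
}

# List mounted btrfs filesystems that currently use autodefrag.
# /proc/mounts fields: device mountpoint fstype options dump pass
list_autodefrag_mounts() {
    while read -r _dev mnt fstype opts _rest; do
        if [ "$fstype" = "btrfs" ] && has_autodefrag "$opts"; then
            echo "$mnt"
        fi
    done < /proc/mounts
}
```

If `list_autodefrag_mounts` prints nothing, no btrfs mount is using the option at the moment.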

I also found another thread claiming that modifying snapper's config file could help.

See:

Yet another btrfs-cleaner freezing issue

I'll let you know if I come across anything further.

Edit:

Oh ya, I can't believe this wasn't covered already. Be sure to test some alternate kernels such as linux-lts and linux-mainline; perhaps a real-time (rt) kernel may help, too.


Did you balance the btrfs filesystem?
It might need to be balanced, just in case.


I definitely run a balance more often than the systemd timer's monthly setting.


Hi,

Thanks for the additional information.

I did some more experiments on my side, too. I disabled the snapper timeline, and I think the hourly snapshots were the problem. After some time btrfs-cleaner stopped whatever it was doing. One problem for me was that I did not have a clue what it was doing and how much progress it had made. For that I found the tool btrfs-orphan-cleaner-progress from the python-btrfs package. It showed the number of "orphans" that were still to be cleaned, which was declining. So at least I saw progress. ;)
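Besides btrfs-orphan-cleaner-progress, another way to watch the cleanup is to ask btrfs which deleted subvolumes (snapshots included) are still waiting to be cleaned. A sketch, assuming the `subvolume list -d` and `subvolume sync` subcommands of btrfs-progs; the helper names are invented here:

```shell
# Count non-empty lines on stdin; meant for the output of
# `btrfs subvolume list -d <mountpoint>` (one deleted subvolume per line).
count_pending_deletions() {
    grep -c . || true   # grep exits non-zero on a count of 0; keep the status clean
}

# Show how many deleted subvolumes are still waiting to be cleaned (run as root).
pending_deletions() {
    btrfs subvolume list -d "$1" | count_pending_deletions
}

# Block until all deleted subvolumes on the filesystem are cleaned (run as root).
wait_for_cleaner() {
    btrfs subvolume sync "$1"
}
```

For example, `pending_deletions /` gives a number that should shrink over time, and `wait_for_cleaner /` returns once the backlog is gone.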

I then did a full balance.

After that I felt lucky and activated snapper timeline snapshots again (with only very few to keep), but btrfs-cleaner was back again, clogging my hard drive performance... :-/
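For reference, "only very few to keep" translates into the timeline limits in the snapper config of the subvolume. A sketch of the relevant keys, assuming a config file such as /etc/snapper/configs/home (the config name and the numbers are examples, not the values used above):

```shell
# Fragment of a snapper config file (shell-style KEY="value" assignments).
# Create hourly timeline snapshots ...
TIMELINE_CREATE="yes"
# ... and let the cleanup job prune them.
TIMELINE_CLEANUP="yes"
# How many snapshots of each age class to keep (example values).
TIMELINE_LIMIT_HOURLY="2"
TIMELINE_LIMIT_DAILY="2"
TIMELINE_LIMIT_WEEKLY="0"
TIMELINE_LIMIT_MONTHLY="0"
TIMELINE_LIMIT_YEARLY="0"
```

Setting TIMELINE_CREATE="no" switches timeline snapshots off for that config. The same keys can also be changed from the command line with `snapper -c <config> set-config KEY=value`.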

Now I tried removing autodefrag and rebooted, and btrfs-cleaner was done instantly, and btrfs-orphan-cleaner-progress shows that there are no orphans to clean. Yay. :)
So I think I'll leave snapper timeline disabled and enable autodefrag again for the time being.

Thanks for all the help, in any case.
Garfonso


I'm so glad one of my suggestions helped you get this situation under control. You stuck with it, and that's the key thing. I have experienced these types of issues in the past and if you keep working at it, they can almost always be resolved.

Unfortunately, many people simply throw their hands in the air when they come across an issue like this. Most usually give up and go back to Windows or Ubuntu. They then proceed to post nasty reviews about how unstable Garuda is all over the internet, when in reality Garuda is extremely stable. IMO most of these people lacking in perseverance simply aren't cut out for dealing with the complexities of an Arch based rolling distro.

@Garfonso, you've proved you're one of the people cut from the right cloth, capable of dealing with complex issues and following them through to resolution. Kudos to you for sticking with it to the end, and for reporting back on all your findings.

Welcome to the Garuda community, my friend. 👍

