Kernel Benchmarking Results - Zen, cacULE, TKG, BORE

DISCLAIMER: I'm not a kernel guy, or at least haven't been for a while. The last time I was building/tweaking/optimizing/etc for kernels was on the Nexus 5X. I may have made some sub-optimal decisions here, so if that's the case let me know. This represents tests on one machine with one benchmarking tool and is not a comprehensive analysis.

Benchmark Details

All tests were done with Geekbench 5, with no other windows open, run in GNOME Terminal under the fish shell, on AC power with a mouse and keyboard plugged in. Desktop session was GNOME under Wayland for all. No performance or powersave tweaks changed from default in Garuda Assistant. Fans returned to idle and Terminal re-launched between runs.

Full relevant hardware details are in the geekbench links, but my machine is a ThinkPad X1 Carbon 6th gen, with 16GB of RAM and a Kaby Lake R Intel i7. I'm running a relatively fresh install of the Garuda GNOME edition.

Zen Kernel 5.17.2.zen3-1

Default Garuda kernel, up to date with current repos. No kernel modifications have been made.

Showed small increases across the board switching to performance. Fans ran sooner and higher under performance.

cacULE Kernel 5.17.2-2

Installed via Garuda Settings manager, Chaotic-AUR package.

Felt a tad snappier, probably placebo. According to Geekbench, it was slightly worse than Zen on multi-core but better on single-core. Fans were peaking faster, but it could just have been a bit hotter from repeated benchmarking despite the fans returning to idle. Launching my standard-load apps quickly as I normally do also felt faster. With further normal-load use, it does definitely feel snappier despite the quantitative results; further use and benchmarking are necessary. :person_shrugging:

CachyOS cacULE Kernel 5.17.3-1

Built from AUR with appropriate headers. I didn't edit the build files. I probably could have added some flags or at least used modprobed-db to slim it down, but given that I used a prebuilt for the other cacULE it felt a fairer comparison.

Felt similar to the other cacULE kernel. I've given up on going off my intuition though, it's subtle enough that I could just be reading into it too much. Benchmarks consistently slightly worse than both Zen and the other cacULE across the board.

Custom-Built TKG-CFS 5.17.3-256

Built from here with Glitched CFS. Might try building with cacULE as well, but stayed with the default for now. Selected the full tickless option and applied Skylake-specific microarch tuning (this is the applicable option for Kaby Lake R processors, I checked). Did not apply the ACS override patch, as I'm not completely clear on what it does practically speaking. It appears to be included in Zen, but has some associated (security?) risk and didn't seem relevant to my use-case. Did not use menuconfig or do any additional configuration. Built with gcc using the -O3 compiler flag and whatever else TKG's build script does by default.

Note that I did not test the other profiles, as this kernel uses a modified, aggressive ondemand governor by default and the profile changer did not seem to modify this. Honestly, it felt worse than the cacULE options. The only specific qualitative observation I have in this regard is that there seemed to be slightly more stuttering when going in and out of the overview with all my windows open and when opening all my apps in rapid succession.

Closing Notes

Where mentioned, "normal working load" refers to being logged in to the GNOME Shell under Wayland, and having open 1-2 FireFox windows with a combined 20-30 tabs, Discord, Fractal, Spotify, Obsidian, Code-OSS, and GNOME Terminal.

I plan to take a day or so to use each kernel for my normal working load and take more extensive subjective notes, but as mentioned the validity of these is potentially questionable. Regardless, I'll update this post with my observations and anything else I try. I may try some other options, and my intuitive observations do make me want to try building the TKG with cacULE instead.

If there are any other benchmarks or adjustments that anyone wants to see or recommends that I try out, let me know! It's not really super sustainable for me to recompile a whole lot as these builds take forever and redline my CPU the whole time, but if anyone wants to see other configs and I have the time I'll give it a shot. I didn't test the Power Saver profile because I'm not really using my machine on the go right now, but I may run some battery life assessments if there's any interest in that.

Realistically speaking, trying to trick out a kernel to squeeze a little more juice out of my machine is stupid because a) it's a laptop, b) Garuda Assistant's tuning options are more than sufficient if I want to optimize, c) the only games I play are Minecraft and osu!lazer, d) out of the box performance is pretty staggering already and handles even my heaviest use perfectly well, and e) the changes between kernels and profiles seem relatively small.

Cheers!

And you're bored as heck? :slight_smile: How about re-compiling each kernel with the appropriate flags or whatever these kids call 'em nowadays?

I've always applied the ideal at the heart of the phrase "unused RAM is wasted RAM" to many aspects of my life. I've tended to phrase it as some variation of "if you're not using the whole machine, what'd you get it for?" Machine in this case being anything - your brain, your phone, your car, your actual computer, etc.

Something tugs at my brain, deep down, whispering "run right up to the edge and drop a fork into the abyss while we're at it, Nietzsche's dead anyway" and it (sometimes) blends nicely with a (sometimes) more rational drive towards general inquiry and self-documentation.

Headed that way :rofl: maybe I'll just finally heed l'appel du vide Gentoo

I feel that, aww the good old Gentoo times :smiling_face:

I'm curious about your results ngl :star_struck:

I've not run any more real benchmarks yet, nor have I taken a run at playing any games, but I have been rocking with the 5.17.2-2-cacule kernel on the balanced profile since I posted this and I must say it really does feel smoother and snappier. I'll have to branch out into some other benchmarks to get a little more comprehensive picture I guess.

If I get the chance in the next couple days, I am planning to do another TKG build with cacULE in place of CFS and more concerted attention to the configs and flags to see if that makes any difference.

I'll keep y'all posted!

That's cool, I'm looking forward to it! :smiley:

That's surely tedious to test - respect for that :+1:t2: :muscle:t2:

Think you can benchmark the linux-bore kernel? It's the best performing kernel I've ever run and it's from the CachyOS team, using the BORE scheduler. Would be interesting to see the numbers in comparison.

Interesting. Some cursory searches didn't reveal any information about what the BORE scheduler actually is besides "Burst-Oriented Response Enhancer", with none of the READMEs providing more details. No qualms about giving it a shot, though; if you have any more info you could point me at, I am curious.

I see that 5.16.16-1 is available via the Chaotic-AUR and the latest 5.17.3-1 is on the AUR, but increasingly the CachyOS repo is cropping up and seems to have pretty quick turnaround on generic prebuilds. Any recommendation? I'd install the same one you've got if you want to see, or I could take a stab at CachyOS' fancy build script to optimize a little more.

EDIT: Did find a little description in the patchfile itself, lol. Sounds like some clever stuff is going on, I'll look into it.

Yeah, it's a very interesting project, to be honest. From what I understand, they all use a similar base, and names such as cacule and bore just indicate the scheduler used. I talked to the developer himself and he even suggested that I run BORE. Since that recommendation I have been on it exclusively lol. My laptop boots up a hell of a lot faster and it's all very snappy for me.

I'm on the latest AUR build, as they're generally fine to run, although occasionally there's an issue, like a previous build that had trouble with the syncing of the Chaotic-AUR. I reverted back, though, and have had no problems since. I'd definitely recommend just building it like a normal kernel, e.g. using paru to grab the package and then running makepkg -si.

I definitely need to check out that patch file you mentioned, though, because I don't fully understand how this scheduler works. I just know that it's bloody fast haha. I'd appreciate you checking it out when you get the time, as I love hearing others' feedback and I don't really have good enough hardware to be compiling kernels all the time, since it takes multiple hours. So if you have the time, I'd love to know your opinions on the snappiness, boot times and overall feel, as well as the single- and multi-core benchmark scores.

Yeah, that's been mostly my impression too, they are all Linux kernels after all. Zen, TKG and the Cachy kernels appear to use differing patchsets and a handful of other exclusive tweaks respectively, with both of the latter pulling in Zen's stuff among their other configurations. The main change though is the scheduler, dropping in replacements for Linux's CFS or in the case of TKG's default just applying a tuning set and some mods onto CFS.

Are you booting from an SSD or an HDD? I've not looked at boot times, but Garuda boots in like 2 seconds flat on my machine, so it's hard to evaluate the difference without measuring it. More generally, what specs are you working with? Curious what differences manifest there. If you're interested, geekbench is available in the AUR and probably shouldn't take more than ~10ish minutes if you wanted a quantitative comparison.

I still don't either :rofl:, but here is the patch I mentioned, description starts on line 39. I'm sure my understanding will grow and some of my old Android kernel knowledge might get dredged up as I mess with this stuff again, but the sorcery at play is going over my head right now.

Kernel builds seem to take me around an hour and obvs redline my entire CPU for that time, but if I don't care about extending that a bit my machine does remain remarkably usable during that time. If it wasn't beyond the scope of what I care to take the time to do, I'd be curious if the different kernels make a difference in these times but :person_shrugging: seems a hassle to test lol.

Tempted as I am to build out a cacULE from the TKG buildscript, I've already got the two so I'll do the BORE next.

I am staunchly ignoring the "evil Sonar" whispering the promises of Gentoo in my ear.

By boot time I don't mean boot itself, but you know that bar that loads before the login screen? That's like twice the speed for me compared to other kernels.

Damn, just an hour? That's the dream :joy:

Oh, huh, I don't have a loading bar at login, but I do use the GNOME edition. GDM's behavior might just be a tad different, not sure on this.

Optimally, anyway. I could slim it down wayyy more if I took a couple days and some careful intentionality to build a modprobed-db database. If I keep doing this kernel nonsense for much longer I will, but for now I think it's worth testing the default monoliths without having to dig through kernel messages to troubleshoot some bizarre issue because I didn't load some specific module I didn't catch.

So, small update on this. I just installed the linux-bore kernel available from the chaotic AUR that shows up in the Garuda Settings Manager. It's still on 5.16.16-1, but I wanted to give the prebuilt a shot before I sent it on a full build. Somewhat bizarrely, it doesn't seem to load the driver/module for my wireless card? Unfortunately, geekbench needs an internet connection, so I wasn't able to benchmark it at all and didn't have any other tools installed. I would put some effort into troubleshooting this but I won't have time until later, at which point I may as well just build my own on the current stable.

@dr460nf1r3 thought I'd mention you on this just in case it's something that requires any attention. If you want me to investigate like I do and make a real post let me know!

So yeah, no linux-bore yet, but my other progress continues. Went ahead and removed my AUR-built cacULE since the chaotic version got updated in line with the current stable and it was benchmarking better than my own build. Plus, I was using the chaotic version anyway for my first longer-use evaluation. It's a bit unscientific to apply the update midway through the eval, so I may check if anything changed with the bump.

With more "normal load" use on cacULE, the sense that it really is perceptibly quicker and more responsive is growing, but :person_shrugging:

I am trying out Linux garuda-iMac 5.17.3-3-cachyos-bore since yesterday. So far, real steady as far as my core loads. Very consistent and steadier than normal according to my core graphs/charts. It sure does feel snappy(er) :wink: and for sure more steady as I watch my core loads with all my normal-use apps I keep open. I've had none/nada/zilch pauses (I used to get very slight pauses when the system would load up) on this cachyos-bore, at all.

So far, very nice. As long as it runs this steady/smooth/snappy... I'm in! I am not a kernel expert, but I can say, again, watching my core graph in System Monitor, the loads rise and fall evenly ...and... in concert, as opposed to one or two cores jumping real high from time to time, which (at least visually) represents an uneven workload to me, but I could be wrong here. That is not happening here with the cachyos-bore. I have a more even spread of the workload across all 8 cores. This is something I can actually observe. Is this a good or bad thing? I can't see how that would be a bad thing. :wink: Maybe someone can elaborate.
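For anyone who wants a number to go with the eyeballed graphs, here is a minimal sketch that samples /proc/stat twice and reports how evenly the load was spread across cores. The function names (parse_proc_stat, per_core_utilization, load_spread, measure) are made up for illustration, and using the standard deviation as the "evenness" measure is just one assumption about what "even" should mean; it says nothing about whether an even spread is actually better.

```python
import time
from statistics import pstdev


def parse_proc_stat(text):
    """Map 'cpuN' -> (busy_jiffies, total_jiffies) from /proc/stat content."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-core lines ('cpu0', 'cpu1', ...); skip the aggregate 'cpu' line.
        if not fields or not fields[0].startswith("cpu") or not fields[0][3:].isdigit():
            continue
        values = [int(v) for v in fields[1:9]]  # user .. steal
        idle = values[3] + values[4]            # idle + iowait
        total = sum(values)
        cores[fields[0]] = (total - idle, total)
    return cores


def per_core_utilization(before, after):
    """Fraction of time each core was busy between two snapshots."""
    usage = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before.get(core, (0, 0))
        elapsed = total1 - total0
        usage[core] = (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return usage


def load_spread(usage):
    """Population standard deviation of per-core utilization (0 = perfectly even)."""
    return pstdev(list(usage.values()))


def measure(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart; return (usage, spread)."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    usage = per_core_utilization(before, after)
    return usage, load_spread(usage)
```

Calling measure() a few times under the same workload on each kernel would give spread values that can be compared directly, which is a bit less prone to cognitive bias than watching the graph.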

I will be keeping an eye on this cachyOS project's kernels, for sure.

-Peace :peace:

Sorry I'm just bumping this thread again, but I did say I'd keep it updated! Interesting results today, and some further observations.

BORE Kernel 5.17.3-3

Wanting to follow up on @Grimy1928's recommendation, I found that linux-bore had been updated in the Chaotic-AUR (thanks dr460nf1r3!!!) to be in line with current stable. So I installed it, along with the headers, and rebooted. Following the same protocols as my initial post here, I benchmarked it with geekbench.

Yes, you read that right, it threw me off too and I double-checked I hadn't gotten them backwards. Bizarrely, switching over to the performance profile made it bench worse. More on this in the last section.

From the more subjective-qualitative end of things, I must say: damn does this one feel fantastic. Maybe even more than cacULE. I'll need some more side-by-side use and observation, though.

So, I saw this as I started writing this post, and this is an interesting observation. I checked my own CPU graphs and did a little fiddling, and you may be right. I'd been keeping a loose eye on them while doing my observations with cacULE, and under BORE it does seem to rise more evenly across my cores. I'll have to pay better attention to this to see if this is not just some cognitive bias of my own, though.

Other Updates

To try and see if the cacULE kernel actually did feel better than the Zen, I booted, logged in as quickly as I could (as I normally do), opened my standard load apps in rapid succession, and did some window-switching/loading/random stuff to make the CPU do a little work. I checked this a couple times with each kernel, and I'm now sure that the cacULE is faster/snappier/more responsive somehow.

Next Steps

  • I need to find some other benchmarks, and maybe do some targeted stress-testing. It's clear that the geekbench results do not strictly correlate with actual, normal use.
  • I'm going to switch over to the BORE kernel for now, keeping the "Balanced" profile set I guess, since that's where it was benchmarking the best. We'll see.
  • In general, there's a lot to figure out here, and it's not quite as high a priority as the rest of what I'm working on Garuda-wise. Still, I'll spread my work between the kernels and keep taking notes.

Further Things

Clearly, I didn't do my reading as thoroughly as I should have, so I looked into things a little deeper. power-profiles-daemon was working (at least, clearly changing some behaviors), and I combed my logs to see if anything funky was happening. There wasn't. It was correctly changing the active profile as far as I could verify.

So I tried to figure out what power-profiles-daemon actually does and... Well, I have absolutely no idea. Besides the config file and related commands telling me the name of the profile set and the fact that the "intel_pstate" driver is used, I'm entirely unclear. I've got a wad of documentation built up to look into on this, but mostly on other things mentioned in relation to the daemon. The project's README shed little light, but mentions that other behaviors can be hooked to the profiles as well to change other device behavior. :person_shrugging:
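One way to poke at this without documentation is to snapshot the CPU frequency knobs in sysfs before and after switching profiles and see what differs. The sketch below is only that, a sketch: the names KNOBS, snapshot and changed are made up, the listed paths are the usual cpufreq and intel_pstate sysfs entries but not all of them exist on every machine or driver mode, and only cpu0 is sampled on the assumption that the other cores follow it.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")

# Standard cpufreq / intel_pstate sysfs entries; any of them may be absent.
KNOBS = [
    "cpu0/cpufreq/scaling_driver",
    "cpu0/cpufreq/scaling_governor",
    "cpu0/cpufreq/energy_performance_preference",
    "cpu0/cpufreq/scaling_min_freq",
    "cpu0/cpufreq/scaling_max_freq",
    "intel_pstate/status",
    "intel_pstate/no_turbo",
    "intel_pstate/min_perf_pct",
    "intel_pstate/max_perf_pct",
]


def snapshot(root=CPU_ROOT):
    """Read every knob under `root`; missing or unreadable entries become None."""
    state = {}
    for knob in KNOBS:
        try:
            state[knob] = (Path(root) / knob).read_text().strip()
        except OSError:
            state[knob] = None
    return state


def changed(before, after):
    """Return {knob: (old, new)} for every knob whose value differs."""
    return {
        knob: (before[knob], after.get(knob))
        for knob in before
        if before[knob] != after.get(knob)
    }
```

The idea would be to call snapshot() on the balanced profile, switch with powerprofilesctl set performance, call snapshot() again, and print changed(first, second). An empty result would suggest the daemon is acting on something outside this list.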

To go alongside this, I'm also going to look over the various tuning tweaks offered by the Assistant more thoroughly, both in the hopes that it'll broaden my knowledge a bit and just to familiarize myself better with Garuda's work as a whole.

If anyone has any insight on the things I've mentioned, please do let me know. Sorry if anyone who actually knows how kernels work has to read this, I'd imagine there's a lot I'm missing here. I've always loved digging into documentation, but I am in over my head on this closer-to-metal stuff.

It's a neat little daemon which can set powersave, balanced or performance power profiles. It is also integrated with KDE. I don't know if it's that way on every device, but my laptop profits heavily from it: during powersave mode it substantially increases the battery life, for example. Iirc it works by limiting CPU frequencies :eyes:

The issue with all kernels maintained by @ptr1337 is that the patches used change their content and keep getting cached between builds, thus causing checksum mismatches. This time I manually corrected them, however I'll need to look into a permanent solution for this again.

Yeah, I found reference to it doing so for at least the Powersave profile, but didn't see specific details or anything. I got the sense that it's doing something else, but I haven't dug deep enough to figure out if/what. Seems to follow GNOME's grand tradition as far as documentation. :rofl: I'd been using it fine with Zen (and in the past with other GNOME distros since it was implemented) and not really paying much attention - switching to powersave on battery, leaving it mostly balanced or performance if it felt necessary on AC power. I definitely noticed a difference between them. Only reason it even occurred to me it wasn't doing exactly what I'd sort of assumed was the oddly counterintuitive result from comboing it with BORE. Like I said, I've no real idea what's afoot here yet. :person_shrugging: Might fully disable it and check out corectrl or kmon, see if I can take a look at what the kernels prefer without it interfering and re-bench if I have the time.

Ah, gotcha! That's unfortunate I suppose but they just started getting included, right? Makes sense that there'd be wrinkles to iron out, especially with custom kernels. Wish I could offer more than just understanding on that but alas.

Thanks for the response, I'll keep playing around and keep y'all posted if anything that feels really relevant comes up!

GNOME. Arch Linux - power-profiles-daemon 0.10.1-2 (x86_64)

(Which makes no difference.)

It replaces TLP. (I get to learn something new every day.) :wink:

Hey, the maintainer of the kernels here,

First, thanks for taking the time to compare the kernels in detail!

First, one thing about Geekbench: sadly, this benchmark is really bad at comparing kernels in detail. It's okay if you want to compare the Windows performance of your rig with its Linux performance, but it actually gives pretty random values.

Even if you benchmark 3 times with the same kernel, you'll get a difference of ±20-40 points.
The Phoronix Test Suite is the better alternative for comparing kernel performance in benchmarks.
In general, the "snappier" and more responsive a kernel is, the lower its throughput is most of the time.
So the goal is to achieve a kernel which is responsive and has good throughput.
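To make the run-to-run noise point concrete, here is a small sketch that compares the gap between two kernels' mean scores against the spread of repeated runs. The scores are hypothetical (not taken from this thread), the function names are made up, and the "gap larger than k standard deviations" rule is a rough heuristic, not a proper significance test.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) for repeated runs of one kernel."""
    return mean(scores), stdev(scores)


def difference_exceeds_noise(a, b, k=2.0):
    """True if the gap between the two means is larger than k times the
    larger of the two run-to-run standard deviations."""
    mean_a, sd_a = summarize(a)
    mean_b, sd_b = summarize(b)
    return abs(mean_a - mean_b) > k * max(sd_a, sd_b)


# Hypothetical single-core scores, three runs per kernel.
kernel_a = [1050, 1080, 1065]
kernel_b = [1070, 1045, 1090]

print(summarize(kernel_a))
print(summarize(kernel_b))
print(difference_exceeds_noise(kernel_a, kernel_b))
```

With a spread of a few dozen points per kernel, a gap of a handful of points between the means does not pass the check, which is the situation described above: at least several runs per kernel are needed before a small difference means anything.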

If you have any issues, please report them on GitHub, Discord, Telegram or here. I'll be looking in here more often from now on.

All the kernels are prebuilt in the CachyOS repo, in Generic, Generic-v3 and also with LTO.
The repo is way more stable than last time; we fixed many things which caused issues, and as far as I know from @dr460nf1r3 it is compatible with Garuda.

If you just want the prebuilt kernels, you can import our keyring and download the kernel and the headers from https://aur.cachyos.org
If you want to use the v3 optimized kernels, add the following to your /etc/pacman.conf:

#Architecture = auto
Architecture = x86_64 x86_64_v3
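Before enabling the v3 packages it is worth confirming that the CPU supports that level. The sketch below checks the flags line of /proc/cpuinfo for the features that x86-64-v3 adds on top of v2; the flag spellings are the ones Linux commonly reports (LZCNT shows up as abm, and xsave stands in for OSXSAVE), the baseline v1/v2 features are not checked, and the names V3_FLAGS, cpu_flags and missing_v3_flags are made up for illustration.

```python
# Features x86-64-v3 adds over v2, spelled as they usually appear in /proc/cpuinfo.
V3_FLAGS = {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def missing_v3_flags(cpuinfo_text):
    """Sorted list of v3 flags the CPU does not report (empty list = looks OK)."""
    return sorted(V3_FLAGS - cpu_flags(cpuinfo_text))
```

On a real machine, missing_v3_flags(open("/proc/cpuinfo").read()) returning an empty list suggests the v3 repo is safe to enable; anything else lists what is missing.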

If you build it yourself, the PKGBUILD will auto-detect your CPU's march and optimize the makeflags for it automatically. There are also several other options you can set in the PKGBUILD.
You can also use modprobed-db to greatly reduce the number of compiled modules, which decreases the build time a lot. Just read the Arch Wiki on how to use it. In the PKGBUILD you can set _localmodcfg=y, and then the modprobed-db config will be used.
https://wiki.archlinux.org/title/Modprobed-db

I've already prepared many changes for the upcoming 5.17.4 kernel, which showed great improvements for some testers and me. I think it will be released tomorrow, or today.

@SonarMonkey
That seems like a weird issue with your wireless card. Sadly, I don't have the config anymore.
Maybe check out a 5.17 kernel from the Chaotic-AUR or our repo to see if the problem still exists.

@Cannabis
Which CPU do you have in your iMac? It should be an Intel, right?
Are you using the pstate driver?

If you have any questions or issues, feel free to hit me up.

Regards
