Kernel Benchmarking Results - Zen, cacULE, TKG, BORE

By boot time I don't mean the boot itself, but you know that bar that loads before the login screen? That's about twice as fast for me compared to other kernels.

Damn, just an hour? That's the dream :joy:

Oh, huh, I don't have a loading bar at login, but I do use the GNOME edition. GDM's behavior might just be a tad different, not sure on this.

Optimally, anyway. I could slim it down wayyy more if I took a couple of days and some care to build a modprobed-db database. If I keep doing this kernel nonsense for much longer I will, but for now I think it's worth testing the default monoliths without having to dig through kernel messages to troubleshoot some bizarre issue caused by a module I didn't catch and so didn't load.

So, small update on this. I just installed the linux-bore kernel available from the chaotic AUR that shows up in the Garuda Settings Manager. It's still on 5.16.16-1, but I wanted to give the prebuilt a shot before I sent it on a full build. Somewhat bizarrely, it doesn't seem to load the driver/module for my wireless card? Unfortunately, geekbench needs an internet connection, so I wasn't able to benchmark it at all and didn't have any other tools installed. I would put some effort into troubleshooting this but I won't have time until later, at which point I may as well just build my own on the current stable.
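
For anyone chasing a missing driver like this, here is a rough sketch of one way to check whether a network device has a kernel driver bound at all. It assumes the card sits on the PCI bus (a USB adapter would not show up here) and relies on PCI base class `0x02` meaning "network controller". The function name `pci_net_drivers` and its optional sysfs-root argument are made up for illustration.

```bash
# List PCI network controllers and the kernel driver bound to each one.
# $1 is the sysfs root (defaults to /sys) so the function can also be
# pointed at a copy or a fake tree.
pci_net_drivers() (
    root=${1:-/sys}
    for dev in "$root"/bus/pci/devices/*; do
        [ -r "$dev/class" ] || continue
        read -r class < "$dev/class"
        case $class in
            0x02*) ;;            # base class 0x02 = network controller
            *) continue ;;
        esac
        # A bound device has a "driver" symlink; an unbound one has none.
        if [ -L "$dev/driver" ]; then
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(no driver bound)"
        fi
        printf '%s %s\n' "${dev##*/}" "$drv"
    done
)
```

Running `pci_net_drivers` on both the working and the broken kernel and comparing the two listings should show whether the wireless card simply never got a driver on the broken one.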

@dr460nf1r3 thought I'd mention you on this just in case it's something that requires any attention. If you want me to investigate like I do and make a real post let me know!

So yeah, no linux-bore yet, but my other progress continues. Went ahead and removed my AUR-built cacULE since the chaotic version got updated in line with the current stable and it was benchmarking better than my own build. Plus, I was using the chaotic version anyway for my first longer-use evaluation. Bit unscientific, applying the update midway through the eval, so I may check if anything changed with the bump.

With more "normal load" use on cacULE, the sense that it really is perceptibly quicker and more responsive is growing, but :person_shrugging:


I am trying out Linux garuda-iMac 5.17.3-3-cachyos-bore since yesterday. So far, real steady as far as my core loads. Very consistent and steadier than normal according to my core graphs/charts. It sure does feel snappy(er) :wink: and for sure more steady as I watch my core loads with all my normal-use apps I keep open. I've had none/nada/zilch pauses (I used to get very slight pauses when the system would load up) on this cachyos-bore, at all.

So far, very nice. As long as it runs this steady/smooth/snappy... I'm In!

I am not a kernel expert, but I can say, again, watching my core/GUI/graph in system monitor, the loads rise and fall evenly ...and... In Concert, as opposed to one or two cores jumping real high from time to time, which (at least visually) represents an uneven workload to me, but I could be wrong here. That is not happening here with the cachyos-bore. I have a more even spread of the workload across all 8 cores. This is something I can actually observe. Is this a good or bad thing? I can't see how that would be a bad thing. :wink: Maybe someone can elaborate.

I will be keeping an eye on this cachyOS project's kernels, for sure.

-Peace :peace:


Sorry I'm just bumping this thread again, but I did say I'd keep it updated! Interesting results today, and some further observations.

BORE Kernel 5.17.3-3

Wanting to follow up on @Grimy1928's recommendation, I found that linux-bore had been updated in the chaotic AUR (thanks dr460nf1r3!!!) to be in line with current stable. So I installed it, along with the headers, and rebooted. Following the same protocols as my initial post here, I benchmarked it with geekbench.

Yes, you read that right, it threw me off too and I double-checked I hadn't gotten them backwards. Bizarrely, switching over to the performance profile made it bench worse. More on this in the last section.

From the more subjective-qualitative end of things, I must say: damn does this one feel fantastic. Maybe even more than cacULE. I'll need some more side-by-side use and observation, though.

So, I saw this as I started writing this post, and this is an interesting observation. I checked my own CPU graphs and did a little fiddling, and you may be right. I'd been keeping a loose eye on them while doing my observations with cacULE, and under BORE it does seem to rise more evenly across my cores. I'll have to pay better attention to this to see if this is not just some cognitive bias of my own, though.

Other Updates

To try and see if the cacULE kernel actually did feel better than the Zen, I booted, logged in as quickly as I could (as I normally do), opened my standard load apps in rapid succession, and did some window-switching/loading/random stuff to make the CPU do a little work. I checked this a couple times with each kernel, and I'm now sure that the cacULE is faster/snappier/more responsive somehow.

Next Steps

  • I need to find some other benchmarks, and maybe do some targeted stress-testing. It's clear that the geekbench results do not strictly correlate with actual, normal use.
  • I'm going to switch over to the BORE kernel for now, keeping the "Balanced" profile set I guess, since that's where it was benchmarking the best. We'll see.
  • In general, there's a lot to figure out here, and it's not quite as high a priority as the rest of what I'm working on Garuda-wise. Still, I'll spread my work between the kernels and keep taking notes.

Further Things

Clearly, I didn't do my reading as thoroughly as I should have, so I looked into things a little deeper. power-profiles-daemon was working (at least, clearly changing some behaviors), and I combed my logs to see if anything funky was happening. There wasn't. It was correctly changing the active profile as far as I could verify.

So I tried to figure out what power-profiles-daemon actually does and... Well, I have absolutely no idea. Besides the config file and related commands telling me the name of the profile set and the fact that the "intel_pstate" driver is used, I'm entirely unclear. I've got a wad of documentation built up to look into on this, but mostly on other things mentioned in relation to the daemon. The project's README shed little light, but mentions that other behaviors can be hooked to the profiles as well to change other device behavior. :person_shrugging:

To go alongside this, I'm also going to look over the various tuning tweaks offered by the Assistant more thoroughly, both in the hopes that it'll broaden my knowledge a bit and just to familiarize myself better with Garuda's work as a whole.

If anyone has any insight on the things I've mentioned, please do let me know. Sorry if anyone who actually knows how kernels work has to read this, I'd imagine there's a lot I'm missing here. I've always loved digging into documentation, but I am in over my head on this closer-to-metal stuff.


It's a neat little daemon which can set powersave, balanced or performance power profiles. It is also integrated with KDE. I don't know if it's that way on every device, but my laptop profits heavily from it: in powersave mode it substantially increases the battery life, for example. Iirc it works by limiting CPU frequencies :eyes:
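
One way to see what a profile switch actually changes is to snapshot the standard cpufreq sysfs attributes before and after and compare them. This is only a sketch: the helper name `cpufreq_snapshot` is made up, it looks at `cpu0` only, and `energy_performance_preference` is only present with drivers that expose it (intel_pstate in active mode, for example), so it may show up as not available.

```bash
# Print a few cpufreq knobs for cpu0, one "name=value" per line.
# $1 is the sysfs root (defaults to /sys).
cpufreq_snapshot() (
    root=${1:-/sys}
    policy=$root/devices/system/cpu/cpu0/cpufreq
    for knob in scaling_driver scaling_governor \
                energy_performance_preference \
                scaling_min_freq scaling_max_freq; do
        if [ -r "$policy/$knob" ]; then
            printf '%s=%s\n' "$knob" "$(cat "$policy/$knob")"
        else
            printf '%s=(not available)\n' "$knob"
        fi
    done
)
```

A possible way to use it: run `powerprofilesctl set balanced`, then `cpufreq_snapshot > balanced.txt`, repeat with `performance`, and `diff` the two files. If nothing differs, the daemon is probably acting through some other mechanism on that machine.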

The issue with all kernels maintained by @ptr1337 is that the patches used change their content and keep getting cached between builds, thus causing checksum mismatches. This time I manually corrected them; however, I'll need to look into a permanent solution for this again.


Yeah, I found reference to it doing so for at least the Powersave profile, but didn't see specific details or anything. I got the sense that it's doing something else, but I haven't dug deep enough to figure out if/what. Seems to follow GNOME's grand tradition as far as documentation. :rofl: I'd been using it fine with Zen (and in the past with other GNOME distros since it was implemented) and not really paying much attention - switching to powersave on battery, leaving it mostly balanced or performance if it felt necessary on AC power. I definitely noticed a difference between them. Only reason it even occurred to me it wasn't doing exactly what I'd sort of assumed was the oddly counterintuitive result from comboing it with BORE. Like I said, I've no real idea what's afoot here yet. :person_shrugging: Might fully disable it and check out corectrl or kmon, see if I can take a look at what the kernels prefer without it interfering and re-bench if I have the time.

Ah, gotcha! That's unfortunate I suppose but they just started getting included, right? Makes sense that there'd be wrinkles to iron out, especially with custom kernels. Wish I could offer more than just understanding on that but alas.

Thanks for the response, I'll keep playing around and keep y'all posted if anything that feels really relevant comes up!


GNOME. Arch Linux - power-profiles-daemon 0.10.1-2 (x86_64)

(Which makes no difference.)

It replaces TLP. (I get to learn something new every day.) :wink:


Hey, the maintainer of the kernels here,

First, thanks for taking the time to compare the kernels in detail!

One thing about geekbench first: sadly, this benchmark is really poor at benchmarking kernels in detail. It's okay if you want to compare the Windows performance of your rig with its Linux performance, but it gives out really random values.

Even if I benchmark 3 times with the same kernel, I get a difference of ±20-40 points.
The Phoronix benchmark is the better alternative for comparing kernel performance.
In general, the "snappier" and more responsive a kernel is, the lower its throughput tends to be.
So the goal is to achieve a kernel which is responsive and still has good throughput.
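
Given that kind of run-to-run noise, it can help to repeat a benchmark several times per kernel and look at the spread before reading anything into a difference. A minimal sketch, assuming one score per line on standard input; the function name `score_stats` is made up, and the standard deviation is the sample (n-1) form.

```bash
# Summarize benchmark scores read from stdin, one number per line:
# count, mean, sample standard deviation, minimum and maximum.
score_stats() {
    awk '
        NF {
            n++; sum += $1; sumsq += $1 * $1
            if (n == 1 || $1 < min) min = $1
            if (n == 1 || $1 > max) max = $1
        }
        END {
            if (n == 0) { print "no scores"; exit 1 }
            mean = sum / n
            var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
            if (var < 0) var = 0     # guard against rounding error
            printf "n=%d mean=%.1f sd=%.1f min=%d max=%d\n",
                   n, mean, sqrt(var), min, max
        }'
}
```

With the scores for one kernel saved to a file, `score_stats < zen-scores.txt` gives one summary line per kernel. If the gap between two kernels' means is no bigger than the spread within each, the benchmark is not really telling them apart.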

If you have any issues, please report them on GitHub, Discord, Telegram or here. I will take a look here more often from now on.

All the kernels are prebuilt in the CachyOS repo, in Generic, Generic-v3 and also with LTO.
The repo is way more stable than last time; we fixed many things which caused issues, and as far as I know from @dr460nf1r3 it is compatible with Garuda.

If you just want the prebuilt kernels, you can import our keyring and download the kernel and the headers from https://aur.cachyos.org
If you want to use the v3-optimized kernels, add the following to your /etc/pacman.conf:

#Architecture = auto
Architecture = x86_64 x86_64_v3

If you build it yourself, the PKGBUILD will auto-detect your CPU's march and automatically optimize the makeflags for it. There are also several other options you can set in the PKGBUILD.
You can also use modprobed-db to greatly reduce the number of compiled modules, which cuts the build time a lot. Just read on the Arch wiki how to use it. In the PKGBUILD you can set _localmodcfg=y, and then the modprobed-db config will be used.
https://wiki.archlinux.org/title/Modprobed-db
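
Before trusting the database for a build, it may be worth checking that every module loaded right now is already recorded in it. A rough sketch, with assumptions: the default path `~/.config/modprobed.db` (or under `$XDG_CONFIG_HOME`) and the one-module-name-per-line format are what I understand modprobed-db to use, the function name `modules_missing_from_db` is made up, and hyphens are normalized to underscores on both sides in case the two sources differ.

```bash
# Print modules that are loaded now but not recorded in the database.
# $1: database file (assumed default location if omitted)
# $2: loaded-module list in /proc/modules format (defaults to /proc/modules)
modules_missing_from_db() (
    db=${1:-${XDG_CONFIG_HOME:-$HOME/.config}/modprobed.db}
    loaded=${2:-/proc/modules}
    awk '
        NF == 0 { next }
        { name = $1; gsub(/-/, "_", name) }
        FILENAME == ARGV[1] { known[name] = 1; next }
        !(name in known) { print name }
    ' "$db" "$loaded" | sort
)
```

If `modules_missing_from_db` prints nothing while the hardware you care about (wireless card, USB devices, filesystems) is in use, the database has at least caught up with the current session. Anything it does print would be dropped by a build that only uses the database.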

I have already prepared many changes for the upcoming 5.17.4 kernel, which showed great improvements for some testers and for me. I think it will be released tomorrow, or today.

@SonarMonkey
That seems like a weird issue with your wireless card. Sadly I don't have the config anymore.
Maybe check out a 5.17 kernel from the chaotic-aur, or from our repo, to see if the problem still exists.

@Cannabis
Which CPU do you have in your iMac? It should be an Intel, right?
Are you using the pstate driver?

If you have any questions or issues, feel free to hit me up.

Regards


Model name: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

╰─λ uname -a
Linux garuda-iMac 5.17.3-zen1-1-zen #1 ZEN SMP PREEMPT Thu, 14 Apr 2022 01:18:28 +0000 x86_64 GNU/Linux

[root@garuda-iMac boot]# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
intel_pstate
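
To make the same check across every core and every kernel you boot, something like the sketch below could be used. It counts CPUs per driver/governor pair from the standard cpufreq sysfs files; the function name `cpufreq_summary` and the optional sysfs-root argument are made up for illustration.

```bash
# Count CPUs per (scaling driver, governor) pair.
# $1 is the sysfs root (defaults to /sys).
cpufreq_summary() (
    root=${1:-/sys}
    for dir in "$root"/devices/system/cpu/cpu[0-9]*/cpufreq; do
        [ -r "$dir/scaling_driver" ] || continue
        printf '%s %s\n' \
            "$(cat "$dir/scaling_driver")" \
            "$(cat "$dir/scaling_governor" 2>/dev/null || echo unknown)"
    done | sort | uniq -c | awk '{ print $1, $2, $3 }'
)
```

Each output line reads as "count driver governor". Pairing it with `uname -r` under each kernel makes it easy to see whether the kernels differ in driver or governor before comparing how they feel.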

  • Right now I am booted into stock Garuda zen kernel while obtaining the results above. I will boot into cachyos-bore and check for intel_pstate there as well.

Thanks!


Oh sick, thanks for the detailed reply! Before anything, I must say I really appreciate your work.

Yeah, I was starting to see geekbench's sort of uselessness, lol. I'll check out phoronix, and I'm looking into some stress/load testers as well.

I saw mention of this, but I wasn't exactly clear on the difference. I found steps to check if v3 is supported, but hadn't followed through with that yet.

Yeah, I looked into it, I'll probably start building a db here soon so I can more feasibly build kernels regularly. I didn't do so initially because I was itching to test, and for modprobed-db I wanted to follow the recommendation of letting the db fill out for a few days with everything I need, so as to avoid accidentally leaving out a module that I needed for something more specific.

I'll have to add the repo, I saw it come up but it felt a better idea to use the ones you'd built for Garuda initially. What do you mean by "our keyring" though? Isn't it an external repo?

I only experienced this with the 5.16.16-1 build of linux-bore from chaotic - other kernels and the updated build of linux-bore didn't have this, so it wasn't a big issue. I'd be willing to look into it more thoroughly if you're curious, though!

Ah man, I was hoping I had a little more time before I ran down this kernel rabbit hole a little deeper :rofl: Suppose I'll wait for this to hit and just run this whole experiment from the top with the update and better benching tools.

Nothing right now, unless you have any specific insight on the actions/mechanisms of power-profiles-daemon and its interplay with non-standard schedulers and such.

Thanks again! Let me know if there's anything you want me to poke at more specifically, and I'll keep updating this thread. :peace_symbol:


Which is the best kernel to use with blender as a real world test? :smiley:


Blender's performance is probably pretty dependent on RAM+GPU+CPU performance all together. It's also a more specific kind of load, so stuff that's meant for interactivity/general use might not necessarily carry over to heavy work in Blender? That's just conjecture though, :person_shrugging:

You'd probably just have to try a few out. Zen since it's the default shipped with Garuda, LTS if you want a properly stock baseline, then maybe branch out into the other custom ones. May just wanna read up on what all is offered in Garuda's kernel manager, there's all kinds of wavy stuff.

Anecdotally, I've continued to have the best experience on the linux-cacULE build provided by ptr1337 above for Garuda.

If you do test this stuff (or wanna look into the mentioned phoronix tools) keep us updated here!


I agree, but testing a kernel against real-world scenarios is better than benchmarking (Blender was just a use case) :smiley:


It will be worth looking at linux-lqx too. It's a more "standard" kernel than the more obscure ones here (essentially a more aggressive linux-zen) but with enough differences to make it interesting.


Hey @ptr1337 nice to see you on this topic considering your kernels are a main subject. I do have a question regarding the CPU optimisation script. So I originally built the kernel through makepkg and the package build detected my CPU and compiled it with the necessary flags. But a new build was precompiled on the chaotic aur. So as I compiled it with my optimisations and then updated it through chaotic aur, do I lose my optimisations because the optimisations would be of the person, in this case, @dr460nf1r3 who compiled it? Does this question make sense? I'm basically asking if my optimisations are lost because I updated it from a different person who compiled it with their different optimisations based on the automatic CPU detection. Thanks

Edit: Sorry for any tags. Just wanted to show who is doing what so it's clearer in my question.

Actually, CPUs have different microarchitectures. Compilers can use different compile flags which are optimized for the CPU, and most of the time this results in better performance.
The difference between x86-64 generic and x86-64-v3 should be around 10%.
That's the reason why many people also compile their packages themselves (Gentoo is the best example) and use march=native to get their compiled packages optimized for their system.

The CachyOS repo provides most Arch Linux packages, and also the kernel, in Generic-v3.
Here you can read more about it:

And here is a how-to for adding the repo:

I already pushed the kernels with the new changes. I already told @dr460nf1r3 to build them :slight_smile:

Actually, I think that's probably an issue with the config, since most governors are disabled. I'll fix this with the 5.17.4 push.

Thanks for your feedback! I'll let you know!

Actually, for testing a kernel's performance, a small script from @anon34128669 is really good.
Just be sure to have all dependencies installed, but I think it's also in the AUR.

https://aur.archlinux.org/packages/mini-benchmarker

Hey @Grimy1928,

In general, chaotic-aur builds them with the generic optimization.
If you use a kernel that another person compiled, that person should have the same CPU architecture as you (for example zen3, ...) or should choose the generic-v3 optimization when building the kernel.

Before building, you can switch between auto-detection and a manual selection:

In general, the Generic-v3 optimization is currently the best way to share kernels/packages between systems. Every CPU released since around Intel Haswell should support the x86-64-v3 optimization.

You can check it with the following command:

/lib/ld-linux-x86-64.so.2 --help | grep "x86-64-v3 (supported, searched)" 

If you get any output, then it's supported.
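
As a cross-check that also tells you which feature is missing when v3 is not supported, the CPU flags can be compared against the v3 feature list. This is a sketch under assumptions: the flag names below are my mapping of the x86-64-v3 features onto `/proc/cpuinfo` names (LZCNT appears as `abm`), OSXSAVE is left out because as far as I know it is not listed as its own flag there, and the older baseline features are assumed to be present. The function name `missing_v3_flags` is made up.

```bash
# Print the x86-64-v3 feature flags that are absent from a
# /proc/cpuinfo-style file ($1, defaults to /proc/cpuinfo).
# Prints an empty line when none are missing.
missing_v3_flags() (
    cpuinfo=${1:-/proc/cpuinfo}
    # Assumed /proc/cpuinfo names for the features added by v3.
    required="avx avx2 bmi1 bmi2 f16c fma abm movbe"
    flags=$(grep -m1 '^flags' "$cpuinfo" | cut -d: -f2)
    missing=""
    for f in $required; do
        case " $flags " in
            *" $f "*) ;;                    # flag present
            *) missing="$missing $f" ;;
        esac
    done
    echo "${missing# }"
)
```

An empty result from `missing_v3_flags` should agree with the loader check above reporting v3 as supported; any names it prints are the ones the CPU lacks.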

Regards.


I'd be interested in that answer! :smiley:
