MCE errors on boot; also can't boot into some kernels, such as the included zen kernel

Here is my inxi -Faz

System:    Kernel: 5.13.19-204-tkg-muqss x86_64 bits: 64 compiler: gcc v: 11.1.0
parameters: intel_pstate=passive BOOT_IMAGE=/@/boot/vmlinuz-linux-tkg-muqss
root=UUID=f7a8fd4e-227a-4071-b245-a6e78f058ae8 rw [email protected] quiet splash
rd.udev.log_priority=3 vt.global_cursor_default=0 systemd.unified_cgroup_hierarchy=1 loglevel=3
Desktop: KDE Plasma 5.23.2 tk: Qt 5.15.2 info: latte-dock wm: kwin_x11 vt: 1 dm: SDDM
Distro: Garuda Linux base: Arch Linux
Machine:   Type: Desktop Mobo: Gigabyte model: Z590 AORUS ELITE serial: <filter>
UEFI: American Megatrends LLC. v: F4 date: 08/23/2021
CPU:       Info: 10-Core model: Intel Core i9-10850K bits: 64 type: MT MCP arch: Comet Lake family: 6
model-id: A5 (165) stepping: 5 microcode: EC cache: L2: 20 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 144000
Speed: 2107 MHz min/max: 800/5200 MHz Core speeds (MHz): 1: 2107 2: 1395 3: 4564 4: 1541
5: 4842 6: 4813 7: 4942 8: 4904 9: 4961 10: 2056 11: 2078 12: 3421 13: 4916 14: 4916 15: 4959
16: 4979 17: 4861 18: 4885 19: 3057 20: 4509
Vulnerabilities: Type: itlb_multihit status: KVM: VMX disabled
Type: l1tf status: Not affected
Type: mds status: Not affected
Type: meltdown status: Not affected
Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl and seccomp
Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
Type: spectre_v2 mitigation: Enhanced IBRS, IBPB: conditional, RSB filling
Type: srbds status: Not affected
Type: tsx_async_abort status: Not affected
Graphics:  Device-1: NVIDIA GA102 [GeForce RTX 3080] vendor: eVga.com. driver: nvidia v: 470.74
alternate: nouveau,nvidia_drm bus-ID: 01:00.0 chip-ID: 10de:2206 class-ID: 0300
Device-2: Logitech Logi Webcam C920e type: USB driver: uvcvideo bus-ID: 1-7.2:6
chip-ID: 046d:08b6 class-ID: 0e02 serial: <filter>
Display: x11 server: X.Org 1.20.13 compositor: kwin_x11 driver: loaded: nvidia
unloaded: modesetting,nouveau alternate: fbdev,nv,vesa display-ID: :0 screens: 1
Screen-1: 0 s-res: 2560x1440 s-dpi: 108 s-size: 602x352mm (23.7x13.9") s-diag: 697mm (27.5")
Monitor-1: DP-2 res: 2560x1440 dpi: 108 size: 600x350mm (23.6x13.8") diag: 695mm (27.3")
OpenGL: renderer: NVIDIA GeForce RTX 3080/PCIe/SSE2 v: 4.6.0 NVIDIA 470.74 direct render: Yes
Audio:     Device-1: Intel vendor: Gigabyte driver: snd_hda_intel v: kernel bus-ID: 00:1f.3
chip-ID: 8086:f0c8 class-ID: 0403
Device-2: NVIDIA GA102 High Definition Audio vendor: eVga.com. driver: snd_hda_intel v: kernel
bus-ID: 01:00.1 chip-ID: 10de:1aef class-ID: 0403
Device-3: C-Media Schiit Modi 3 type: USB driver: hid-generic,snd-usb-audio,usbhid
bus-ID: 1-5:2 chip-ID: 0d8c:0066 class-ID: 0300
Device-4: Kingston HyperX 7.1 Audio type: USB driver: hid-generic,snd-usb-audio,usbhid
bus-ID: 1-8:4 chip-ID: 0951:16a4 class-ID: 0300 serial: <filter>
Sound Server-1: ALSA v: k5.13.19-204-tkg-muqss running: yes
Sound Server-2: JACK v: 1.9.19 running: no
Sound Server-3: PulseAudio v: 15.0 running: no
Sound Server-4: PipeWire v: 0.3.39 running: yes
Network:   Device-1: Realtek RTL8125 2.5GbE vendor: Gigabyte driver: r8169 v: kernel port: 3000
bus-ID: 03:00.0 chip-ID: 10ec:8125 class-ID: 0200
IF: enp3s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Drives:    Local Storage: total: 4.55 TiB used: 114.98 GiB (2.5%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 980 1TB size: 931.51 GiB
block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
rev: 1B4QFXO7 temp: 37.9 C scheme: GPT
ID-2: /dev/sda maj-min: 8:0 vendor: Seagate model: ST4000DM004-2CV104 size: 3.64 TiB
block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 5425
serial: <filter> rev: 0001 scheme: GPT
Partition: ID-1: / raw-size: 449.22 GiB size: 449.22 GiB (100.00%) used: 114.96 GiB (25.6%) fs: btrfs
dev: /dev/nvme0n1p5 maj-min: 259:5
ID-2: /boot/efi raw-size: 100 MiB size: 96 MiB (96.00%) used: 25.8 MiB (26.9%) fs: vfat
dev: /dev/nvme0n1p2 maj-min: 259:2
ID-3: /home raw-size: 449.22 GiB size: 449.22 GiB (100.00%) used: 114.96 GiB (25.6%) fs: btrfs
dev: /dev/nvme0n1p5 maj-min: 259:5
ID-4: /var/log raw-size: 449.22 GiB size: 449.22 GiB (100.00%) used: 114.96 GiB (25.6%)
fs: btrfs dev: /dev/nvme0n1p5 maj-min: 259:5
ID-5: /var/tmp raw-size: 449.22 GiB size: 449.22 GiB (100.00%) used: 114.96 GiB (25.6%)
fs: btrfs dev: /dev/nvme0n1p5 maj-min: 259:5
Swap:      Kernel: swappiness: 133 (default 60) cache-pressure: 50 (default 100)
ID-1: swap-1 type: zram size: 31.22 GiB used: 31 MiB (0.1%) priority: 100 dev: /dev/zram0
Sensors:   System Temperatures: cpu: 16.8 C mobo: 16.8 C gpu: nvidia temp: 53 C
Fan Speeds (RPM): N/A gpu: nvidia fan: 0%
Info:      Processes: 411 Uptime: 4h 36m wakeups: 0 Memory: 31.22 GiB used: 24.33 GiB (77.9%)
Init: systemd v: 249 tool: systemctl Compilers: gcc: 11.1.0 clang: 12.0.1 Packages: 1589 apt: 1
pacman: 1588 lib: 479 flatpak: 0 Shell: fish v: 3.3.1 default: Bash v: 5.1.8
running-in: konsole inxi: 3.3.08

Also, here are my journal errors for the MCE issue:

-- Journal begins at Sun 2021-10-17 09:16:19 PDT, ends at Fri 2021-10-29 20:09:07 PDT. --

Oct 29 15:38:04 Jacobs-PC kernel: x86/cpu: SGX disabled by BIOS.

Oct 29 15:38:04 Jacobs-PC kernel: psi: task underflow! cpu=1 t=2 tasks=[0 0 0 1] clear=4 set=0

Oct 29 15:38:04 Jacobs-PC kernel: ACPI BIOS Error (bug): Failure creating named object [\ADBG], AE_ALREADY_EXISTS (20210331/dswload2-326)

Oct 29 15:38:04 Jacobs-PC kernel: ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20210331/psobject-220)

Oct 29 15:38:04 Jacobs-PC kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.PGON.PBGE], AE_NOT_FOUND (20210331/psargs-330)

Oct 29 15:38:04 Jacobs-PC kernel: ACPI Error: Aborting method \_SB.PC00.PGON due to previous error (AE_NOT_FOUND) (20210331/psparse-529)

Oct 29 15:38:04 Jacobs-PC kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PG01._ON due to previous error (AE_NOT_FOUND) (20210331/psparse-529)

Oct 29 15:38:04 Jacobs-PC kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 6: ee0000000040110a

Oct 29 15:38:04 Jacobs-PC kernel: mce: [Hardware Error]: TSC 0 ADDR fef20300 MISC 3880000086

Oct 29 15:38:04 Jacobs-PC kernel: mce: [Hardware Error]: PROCESSOR 0:a0655 TIME 1635547078 SOCKET 0 APIC 0 microcode ec

Oct 29 15:38:04 Jacobs-PC kernel:

Oct 29 15:38:04 Jacobs-PC systemd[1]: Failed to start systemd-guest-user.service.

Oct 29 15:38:04 Jacobs-PC systemd[1]: Failed to start systemd-guest-user.service.

Oct 29 15:38:05 Jacobs-PC systemd-tmpfiles[577]: Failed to write file "/sys/module/pcie_aspm/parameters/policy": Operation not permitted

Oct 29 15:38:06 Jacobs-PC kernel: hid-generic 0003:0D8C:0066.0001: No inputs registered, leaving

Oct 29 15:38:07 Jacobs-PC kernel: pwc: Failed to set LED on/off time (-32)

Oct 29 15:38:07 Jacobs-PC kernel: pwc: send_video_command error -32

Oct 29 15:38:07 Jacobs-PC kernel: pwc: Failed to set video mode [email protected] fps; return code = -32

Oct 29 15:38:08 Jacobs-PC nmbd[2879]: [2021/10/29 15:38:08.588622, 0] ../../source3/nmbd/nmbd.c:900(main)

Oct 29 15:38:08 Jacobs-PC nmbd[2879]: nmbd version 4.15.1 started.

Oct 29 15:38:08 Jacobs-PC nmbd[2879]: Copyright Andrew Tridgell and the Samba Team 1992-2021

Oct 29 15:38:08 Jacobs-PC smbd[2881]: [2021/10/29 15:38:08.617832, 0] ../../source3/smbd/server.c:1738(main)

Oct 29 15:38:08 Jacobs-PC smbd[2881]: smbd version 4.15.1 started.

Oct 29 15:38:08 Jacobs-PC smbd[2881]: Copyright Andrew Tridgell and the Samba Team 1992-2021

Oct 29 15:38:14 Jacobs-PC systemd[3369]: Failed to start Profile-sync-daemon.

Oct 29 15:38:16 Jacobs-PC kernel: usb 1-8: 1:1: cannot get freq at ep 0x81

Oct 29 15:38:16 Jacobs-PC kernel: usb 1-8: 1:1: cannot get freq at ep 0x81

Oct 29 15:38:16 Jacobs-PC kernel: usb 1-8: 2:1: cannot get freq at ep 0x1

MCEs can occasionally be fixed with a BIOS update.

However, there's not enough detail in the log you provided (or in your post generally) to determine whether the MCE is a factor in the issue you're having.
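For anyone who wants to read a bit more out of the `Bank 6: ee0000000040110a` line, here is a rough sketch that splits the status value into its architectural flag bits. It assumes the value is the 16-hex-digit IA32_MCi_STATUS register as printed in the journal, with bit positions taken from the Intel SDM; `decode_mce_status` is a made-up name, not an existing tool. It only shows which flags are set and does not say what caused the error.

```sh
#!/bin/sh
# Sketch: split a 64-bit IA32_MCi_STATUS value (exactly 16 hex digits, as
# printed in the journal) into its flag bits and error-code fields.
# The value is handled as two 32-bit halves to stay within signed shell
# arithmetic. decode_mce_status is a hypothetical helper name.
decode_mce_status() {
    status=$1
    hi=$(( 0x${status%????????} ))   # upper 32 bits (bits 63..32)
    lo=$(( 0x${status#????????} ))   # lower 32 bits (bits 31..0)
    # Bits 63..57: VAL, OVER, UC, EN, MISCV, ADDRV, PCC
    printf 'VAL=%d OVER=%d UC=%d EN=%d MISCV=%d ADDRV=%d PCC=%d\n' \
        $(( (hi >> 31) & 1 )) $(( (hi >> 30) & 1 )) $(( (hi >> 29) & 1 )) \
        $(( (hi >> 28) & 1 )) $(( (hi >> 27) & 1 )) $(( (hi >> 26) & 1 )) \
        $(( (hi >> 25) & 1 ))
    # Bits 31..16: model-specific error code; bits 15..0: MCA error code
    printf 'MSCOD=0x%04x MCACOD=0x%04x\n' \
        $(( (lo >> 16) & 0xffff )) $(( lo & 0xffff ))
}

decode_mce_status ee0000000040110a
# prints:
# VAL=1 OVER=1 UC=1 EN=0 MISCV=1 ADDRV=1 PCC=1
# MSCOD=0x0040 MCACOD=0x110a
```

Decoding the MCA error code itself is model-specific, so treat this as a starting point for looking it up rather than a diagnosis.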


I just checked my BIOS and I am on the latest version. Let me know what information would be helpful, as I'm not really sure.

Which kernels boot? Which ones don't? What do you mean by "don't boot"? Did those kernels ever work? What did you change from the time they did? Which driver packages did you install?


5.13.19-204-tkg-muqss boots, as it is the one I'm currently using. The included zen kernel refuses to boot and gets stuck on "loading initial ram disk". The normal 5.14 linux kernel does the same, and these kernels have never worked. I have tried both the open and proprietary NVIDIA drivers, but neither works.

I know this is a long shot, but just check that you don't have fast boot and secure boot enabled in the BIOS, as I have had a similar problem before.
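The Secure Boot half of that can also be checked from the running system. This is only a sketch: it assumes efivarfs is mounted at the usual path and reads the standard `SecureBoot` UEFI variable, and `secure_boot_state` is a made-up helper name. Fast boot is a firmware setting and, as far as I know, has to be checked in the BIOS setup screen itself.

```sh
#!/bin/sh
# Sketch: report Secure Boot state from the UEFI variable exposed by efivarfs.
# The file holds 4 attribute bytes followed by one data byte (1 = enabled).
# secure_boot_state is a hypothetical helper; the optional argument exists
# only so the function can be pointed at a different file.
secure_boot_state() {
    var=${1:-/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c}
    [ -r "$var" ] || { echo "unknown"; return 0; }
    # Skip the 4 attribute bytes and read the single data byte as a number.
    byte=$(od -An -tu1 -j4 -N1 "$var" | tr -d ' \n')
    case $byte in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

secure_boot_state
```

"unknown" here just means the variable could not be read, for example on a system booted in legacy BIOS mode.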


That sounds like an interrupted update process.

Boot, then run:

sudo mkinitcpio -P
sudo update-grub

then try those kernels again.
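Before rebooting, it may be worth confirming that every kernel in /boot actually has its initramfs images. A minimal sketch, assuming the `vmlinuz-<name>` / `initramfs-<name>.img` / `initramfs-<name>-fallback.img` naming that Arch-based presets normally use; `check_initramfs` is a made-up helper name:

```sh
#!/bin/sh
# Sketch: for every kernel image in a boot directory, check that the matching
# default and fallback initramfs images exist and are non-empty.
# check_initramfs is a hypothetical helper, not part of mkinitcpio.
check_initramfs() {
    bootdir=${1:-/boot}
    for kernel in "$bootdir"/vmlinuz-*; do
        [ -e "$kernel" ] || continue          # glob matched nothing
        name=${kernel##*/vmlinuz-}
        for img in "initramfs-$name.img" "initramfs-$name-fallback.img"; do
            if [ -s "$bootdir/$img" ]; then
                echo "ok      $img"
            else
                echo "MISSING $img"
            fi
        done
    done
}

check_initramfs /boot
```

Note that "ok" only means the file exists and is not empty. An image built with errors can still be incomplete, so the mkinitcpio output remains the thing to read.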

OK, running sudo mkinitcpio -p linux-zen results in:

sudo mkinitcpio -p linux-zen
==> Building image from preset: /etc/mkinitcpio.d/linux-zen.preset: 'default'
-> -k /boot/vmlinuz-linux-zen -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-zen.img
==> Starting build: 5.14.15-zen1-1-zen
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [autodetect]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> ERROR: module not found: `nvidia'
==> ERROR: module not found: `nvidia_modeset'
==> ERROR: module not found: `nvidia_uvm'
==> ERROR: module not found: `nvidia_drm'
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-zen.img
==> WARNING: errors were encountered during the build. The image may not be complete.
==> Building image from preset: /etc/mkinitcpio.d/linux-zen.preset: 'fallback'
-> -k /boot/vmlinuz-linux-zen -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-zen-fallback.img -S autodetect
==> Starting build: 5.14.15-zen1-1-zen
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: aic94xx
==> WARNING: Possibly missing firmware for module: wd719x
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> ERROR: module not found: `nvidia'
==> ERROR: module not found: `nvidia_modeset'
==> ERROR: module not found: `nvidia_uvm'
==> ERROR: module not found: `nvidia_drm'
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-zen-fallback.img
==> WARNING: errors were encountered during the build. The image may not be complete.

Your nvidia driver didn't build. Reinstall it, and check the output of dkms status and dkms autoinstall. You can also manually build it for the failed kernel, then rebuild the initramfs.
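To make the "check the output of dkms status" step less of an eyeball job, something like the following could be used. It is a sketch that assumes `dkms status` prints one line per module and kernel in the form `module/version, kernel, arch: state`; `dkms_has` is a made-up helper name, and the kernel release is the one from the mkinitcpio output above.

```sh
#!/bin/sh
# Sketch: read `dkms status` text on stdin and succeed only if the given module
# is reported as installed for the given kernel release.
# Assumes the "module/version, kernel, arch: state" line format.
# dkms_has is a hypothetical helper, not a dkms subcommand.
dkms_has() {
    awk -F', ' -v mod="$1" -v kver="$2" '
        index($1, mod "/") == 1 && $2 == kver && $3 ~ /: installed/ { found = 1 }
        END { exit !found }
    '
}

# Example use (commented out because it needs dkms and, for the rebuild, root):
# dkms status | dkms_has nvidia 5.14.15-zen1-1-zen \
#     || echo "nvidia module missing for that kernel; try: sudo dkms autoinstall"
```

If the module turns out to be missing, rebuilding it and then re-running mkinitcpio for that kernel's preset should clear the "module not found" errors.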


It seems the old source was kept for some reason. I just deleted the problem file and it did build correctly; however, it still refuses to boot with the affected kernels.

I have secure boot disabled (I think that's what you meant), and I also have fast boot off, but thanks for the suggestion.

What is the output of dkms status and of mkinitcpio -P?


This is dkms status

nvidia/495.44, 5.10.77-3-lts, x86_64: installed
nvidia/495.44, 5.13.19-204-tkg-muqss, x86_64: installed (original_module exists)
nvidia/495.44, 5.14.16-arch1-1, x86_64: installed
nvidia/495.44, 5.14.16-zen1-1-zen, x86_64: installed
vmware-workstation/16.2.0_18760230, 5.10.77-3-lts, x86_64: installed
vmware-workstation/16.2.0_18760230, 5.13.19-204-tkg-muqss, x86_64: installed
vmware-workstation/16.2.0_18760230, 5.14.16-arch1-1, x86_64: installed
vmware-workstation/16.2.0_18760230, 5.14.16-zen1-1-zen, x86_64: installed

and here is mkinitcpio -P

==> Building image from preset: /etc/mkinitcpio.d/linux-lts.preset: 'default'
-> -k /boot/vmlinuz-linux-lts -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-lts.img
==> Starting build: 5.10.77-3-lts
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [autodetect]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-lts.img
==> Image generation successful
==> Building image from preset: /etc/mkinitcpio.d/linux-lts.preset: 'fallback'
-> -k /boot/vmlinuz-linux-lts -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-lts-fallback.img -S autodetect
==> Starting build: 5.10.77-3-lts
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: aic94xx
==> WARNING: Possibly missing firmware for module: wd719x
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-lts-fallback.img
==> Image generation successful
==> Building image from preset: /etc/mkinitcpio.d/linux.preset: 'default'
-> -k /boot/vmlinuz-linux -c /etc/mkinitcpio.conf -g /boot/initramfs-linux.img
==> Starting build: 5.14.16-arch1-1
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [autodetect]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux.img
==> Image generation successful
==> Building image from preset: /etc/mkinitcpio.d/linux.preset: 'fallback'
-> -k /boot/vmlinuz-linux -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-fallback.img -S autodetect
==> Starting build: 5.14.16-arch1-1
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: aic94xx
==> WARNING: Possibly missing firmware for module: wd719x
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-fallback.img
==> Image generation successful
==> Building image from preset: /etc/mkinitcpio.d/linux-tkg-muqss.preset: 'default'
-> -k /boot/vmlinuz-linux-tkg-muqss -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-tkg-muqss.img
==> Starting build: 5.13.19-204-tkg-muqss
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [autodetect]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-tkg-muqss.img
==> Image generation successful
==> Building image from preset: /etc/mkinitcpio.d/linux-tkg-muqss.preset: 'fallback'
-> -k /boot/vmlinuz-linux-tkg-muqss -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-tkg-muqss-fallback.img -S autodetect
==> Starting build: 5.13.19-204-tkg-muqss
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: aic94xx
==> WARNING: Possibly missing firmware for module: wd719x
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-tkg-muqss-fallback.img
==> Image generation successful
==> Building image from preset: /etc/mkinitcpio.d/linux-zen.preset: 'default'
-> -k /boot/vmlinuz-linux-zen -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-zen.img
==> Starting build: 5.14.16-zen1-1-zen
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [autodetect]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-zen.img
==> Image generation successful
==> Building image from preset: /etc/mkinitcpio.d/linux-zen.preset: 'fallback'
-> -k /boot/vmlinuz-linux-zen -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-zen-fallback.img -S autodetect
==> Starting build: 5.14.16-zen1-1-zen
-> Running build hook: [base]
-> Running build hook: [udev]
-> Running build hook: [modconf]
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: aic94xx
==> WARNING: Possibly missing firmware for module: wd719x
==> WARNING: Possibly missing firmware for module: xhci_pci
-> Running build hook: [keyboard]
-> Running build hook: [keymap]
-> Running build hook: [consolefont]
-> Running build hook: [plymouth]
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating zstd-compressed initcpio image: /boot/initramfs-linux-zen-fallback.img
==> Image generation successful

Last one, sudo update-grub and then see if they will boot.


Unfortunately, they still won't boot with my 3080, but I just tried booting with integrated graphics and they do boot, so it's something to do with the GPU.

What do you mean here by "don't boot"? What stage does the boot process get to? What do the boot messages show if you remove quiet from the boot line? What does the journal show for that boot attempt?
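In case "remove quiet from the boot line" is unclear: at the GRUB menu, press `e` on the entry and edit the line that starts with `linux`. The sketch below only illustrates what that edited line should end up containing; `strip_quiet` is a made-up helper and the sample parameters are shortened from the ones in the first post.

```sh
#!/bin/sh
# Sketch: print a kernel command line with "quiet" and "splash" removed, i.e.
# what the edited `linux` line in the GRUB editor would end up containing.
# strip_quiet is a hypothetical helper; it runs in a subshell so that `set -f`
# (no filename globbing while splitting words) does not leak into the caller.
strip_quiet() (
    set -f
    out=""
    for word in $1; do
        case $word in
            quiet|splash) ;;                      # drop these two
            *) out="${out:+$out }$word" ;;        # keep everything else
        esac
    done
    printf '%s\n' "$out"
)

strip_quiet "intel_pstate=passive rw quiet splash loglevel=3"
# prints: intel_pstate=passive rw loglevel=3
```

`loglevel=3` also limits what reaches the console, so raising or removing it for the test boot may show more. After the failed attempt, `journalctl -b -1 -p err` from a working kernel lists errors from the previous boot, although if the hang happens before the root filesystem is mounted there may be no journal for that boot at all.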


With quiet removed, it still just gets stuck at loading initial ram disk. Here is the journal for that boot:
