Kernel Panic - Not Syncing: Fatal Exception In Interrupt when trying to install/boot from USB?

I just thought I would provide an update on this, as I have resolved the issue. The way I got around it was installing another distro with a different kernel. It makes sense, as the kernel is stored in the memory. The only way anyone should ever run into a kernel panic during installation is if the kernel was corrupted somehow. So by installing a different distro with a different kernel, I "overwrote" the corrupted kernel, clearing it from the memory. After that was done, I removed the currently installed distro, wiped the drive, and ended up installing Garuda with zero issues.

Magic?
Just format the disk or install new. Done.

2 Likes

Just to add a note here as a new Garuda user. I had the same issue on my laptop which uses Optimus (I also tried EndeavourOS on the laptop too, which I use on my desktop - since Endeavour wouldn't boot, I thought I'd try Garuda). Tried KDE and Gnome versions.

It seems that the recent stable versions of the 5.12 kernel have a pretty serious bug with Nvidia hardware and Optimus is even worse in particular (surprise surprise, optimus sux). Seems to happen regardless of driver or GRUB kernel boot option setup (I'm not an inexperienced Linux user and tried all the normal tricks, and exhausted every option on the various Nvidia related Arch Wiki pages), and found that the only way to boot is with nomodeset, and even then, I could only get video output using an external HDMI connection, no image on the internal laptop screen. I did find an issue on the nvidia bug tracker/mailing list while troubleshooting on the live image, but I can't find it right now.

Anyway once I realised the kernel version might be to blame I downloaded the XFCE edition of Garuda which uses the LTS kernel, and all was well, booted right up, I'm installed and everything's running smoothly.

I realise it's already a monumental effort to build and maintain these versions, but I think it might be a good idea to give all the editions an LTS version alongside the mainline version. This isn't the first problem I've run into in 5.12, it seems to be quite a buggy kernel so far.

(alternatively, is it possible to somehow patch a live image built with the stable kernel to use LTS?)

3 Likes

This happens only in your imagination :sleeping_bed: Read the comment from @lewiji to see what troubleshooting is about.

IIRC the barebones ISOs are built on linux-lts.

3 Likes

Had nothing to do with the disk, the installation never got far enough to install it. I had wiped and reformatted the disk multiple times with zero effect.

All of kernel memory and user process memory is stored in physical memory in the computer (or perhaps on disk if data has been swapped from memory).

The kernel is a computer program at the core of a computer's operating system that has complete control over everything in the system. It is the "portion of the operating system code that is always resident in memory", and facilitates interactions between hardware and software components.

That's the literal definition of what a kernel is.

Please refrain from giving Kernel lessons.
No problem for me and maybe others, but at least 3 people from the "Garuda Team" answered here ...
By the way, maybe you haven't considered that the memory you're talking about is RAM, volatile, so once you shut down your PC after the first kernel panic, everything should have come back clean, with no need to install a different distro to clean-up.
I don't know what solved your issue, most likely when you wiped your drive, but for sure it is not a matter of RAM IMO.

4 Likes

Let me elaborate on how I arrived at my conclusion, for clarity. First off, I think that petsam was correct in identifying the initial cause of the issue, which was that the USB drive was removed prior to the restart, causing the corruption.

This is what we ruled out during the troubleshooting process:

  • The image was not the issue as we verified the checksum, downloaded a different image and verified that one's checksum as well.
  • The USB drive was not the issue as we replaced that with a different one and attempted formatting in MBR and GPT.
  • The storage disk/drive was not the issue as that was wiped clean after the initial failed installation in Windows with diskpart (Zero'd out the disk, no more operating system, no partitions, nothing).
  • We ruled out BIOS concerns with secure boot, fast boot and XMP. Also made sure AHCI was configured properly.
  • Hardware incompatibilities are ruled out due to a clean pass with memtest86+ and the fact that the OS works on the hardware now without issues.
  • As a last ditch effort, we tried to modify the grub command line.
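For anyone landing here later, the checksum step in the list above can be scripted. This is only a minimal sketch in Python: it assumes the download page publishes a SHA-256 hash (the thread does not say which algorithm was used), and the function names are made up for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read 1 MiB at a time so a multi-GB ISO never has to fit in RAM.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with a published hash (case/whitespace tolerant)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `matches()` with the path of the downloaded image and the hash copied from the download page should return `True` only when the two are identical. A match says the download is intact; it says nothing about whether the USB stick was written correctly afterwards.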

So, there's no Operating System installed, nothing on the disk, BIOS seems good, hardware checks out, and the image is clean. Where does that leave us?

The kernel.

This is why I think the kernel wasn't wiped from the memory. I understand how volatile memory works, it should wipe when the system is powered down. I agree with that.

A computer requires power to run, this you know. A PC is powered from the wall in AC (alternating current) but computer parts require DC (direct current). Inside the desktop PC is a power supply unit that converts AC to DC. As long as the desktop PC is plugged into the wall it always receives AC power.

In the early days a PC had a 'AT' power supply with a switch on the front. The 'AT' type power supply had a push button switch that stopped the DC power. The problem with this was that users would turn off the computer while it was writing to the hard drive. Turning off the power during hard drive write would cause the hard drive to become corrupted.

So, the next iteration of PC design had an ATX power supply. In this design the power supply connected to the motherboard and the switch on the front of the PC was connected to the motherboard. For the ATX design pushing the off switch sends a signal to the motherboard, the Operating System (in this case there is no OS) reads the signal on the motherboard and sends a signal to the power supply.

The power supply has multiple DC outputs. The hard drive (and floppy) used 12 Volts. The CPU took 5 Volts and later 3.3 Volts. The different voltages are independent, so different parts of the computer may be switched off while other parts are on.

When you press the power button on the front of the PC or select turn-off from the Operating System, there are always at least one or two powered components. At the very least, the circuit on the motherboard that receives the power button signal and relays it to the power supply must be powered, and it is as long as the PC is plugged into the wall. (Admittedly, I never unplugged the power from the wall).

The component in question is the RAM (actually DRAM), and it is typically not easy to tell whether the power to the RAM is off, or which method of turning the computer off will stop the supply of power to the RAM. (I did not put this in the solution because I didn't think about it until now, but the RGB LEDs on my RAM never shut off when I powered the system down).

The only way to be absolutely sure there is no power to the RAM is to disconnect the PC from the wall. In hindsight, I could have done this and that probably would have fixed the kernel panic.

As long as power is supplied to the RAM the RAM will retain the contents of whatever was last in it.

In this case, the kernel.

2 Likes

Well, now it is elaborated in a much more complete and plausible way... :slight_smile:
I hope you didn't take me wrong before, if the post had been this one I would not have written anything :wink:

:laughing:

:+1:

Thanks a lot!!! It's been a long time since I laughed like this!!!

It's great that you dive that deep, and I applaud it, although you use false assumptions that lead you to a false result.

Please, give me a minute to detail an answer.

3 Likes

Probably this is the issue. I'll explain later.

IIUC it's wrong. And this is probably your bias from the "NSAs and CIAs that can recover data from RAM chips of terrorists" rumor, which is not a rumor.
Memory chips are like a group of 1/0 bits that don't need power to live, but you need power to use them (difficult for my English to explain). They keep their values when there is no power, so the NSAs can read those values when they are dead (not on a powered up system). They read the values on a HW chip, like using a magnifying glass (imagine Sherlock Holmes :wink: ).
If/when you power up your system, BIOS powers up RAM and resets/reuses the previously full memory, overwriting old values and skipping those that it is not using (until it needs them).
If your theory was valid, we would never be able to boot up a crashed PC (of any OS).

I didn't know there was such a thing. You can always learn!..

As I explained above, RAM keeps its data/values with or without power (IIUC).

Then what is the issue?

Unplugging the USB drive left unfinished business from the installer OS.
Even if the installer announced that it was finished, and believes that to be true and programmatically correct, the installer is a program (I've heard this before...), and as such it does not communicate directly with the HW but through the OS, which in turn has its own ways of sending data to the HW and of deciding when to send it. When the OS (in which we can include the Desktop Manager and toolkits, to keep it short) confirms to the installer that it has finished the job, it often lies (like parents often do to their kids, to make their lives easier :grinning_face_with_smiling_eyes: ), having cached a few or a lot of the writes to the disk for performance reasons, intending to finally write them in due time. Also, since the running applications are physically on the USB drive, unplugging the drive removes some essentials that may not already exist in RAM, or that are needed by some other action that requires the USB disk to continue the greater job. For example, in dd jobs, we use sync to make sure that what is announced as a finished job is actually synced to the disk.
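To make the "it often lies" part concrete, here is a minimal Python sketch of the difference between "the write call returned" and "the data was pushed towards the device". The function name `write_durably` is made up for illustration; `flush()` and `os.fsync()` are the standard calls, and this is not what the Garuda installer itself does.

```python
import os


def write_durably(path, data):
    """Write bytes to path and ask the kernel to push them to the device."""
    with open(path, "wb") as f:
        f.write(data)          # may only reach Python's own buffer
        f.flush()              # hands the data to the kernel's page cache
        os.fsync(f.fileno())   # asks the kernel to write it out to the device
    return len(data)
```

Without the `os.fsync()` line the function would still return normally, but the bytes might still be sitting in a cache when the drive is pulled. On Unix, `os.sync()` is the whole-system counterpart and plays the same role as running `sync` after a dd job.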

The second disk related possible issue is the swap (partition or RAMed).
It may not be the case with yours, but... if you had set up hibernation swap on your installed system and the existing system disks include a swap partition...
When systemd starts, if there is an existing swap partition on the system, even if it is not listed in fstab, it tries to use it for runtime swap, unless you have explicitly configured it otherwise. So when the installer boots up, it starts using this existing swap partition. An abrupt system crash (as happened in your case) could have left the swap full of broken data that might not look immediately broken to the next booted system/OS (when trying to boot the installed Garuda system), and thus it could make it crash because of a supposed hibernated image in swap, which is not really one.
This is a stretch, but a possible scenario, so I guess the 1st one is most probably your case.
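If you want to check whether a live session has picked up a swap partition on its own, the kernel lists active swap areas in /proc/swaps. Below is a small Python sketch that parses that file; the function names are made up, and it assumes the usual five-column layout (Filename, Type, Size, Used, Priority).

```python
def parse_swaps(text):
    """Parse the contents of /proc/swaps into a list of dicts."""
    entries = []
    # The first line is the column header, so skip it.
    for line in text.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 5:
            entries.append({
                "device": fields[0],
                "type": fields[1],
                "size_kib": int(fields[2]),
                "used_kib": int(fields[3]),
                "priority": int(fields[4]),
            })
    return entries


def active_swaps(path="/proc/swaps"):
    """Return active swap areas, or an empty list if the file is missing."""
    try:
        with open(path) as f:
            return parse_swaps(f.read())
    except FileNotFoundError:
        return []
```

If `active_swaps()` on the live image returns an entry of type `partition` that you never configured, the installer session is using the swap partition already on your disk.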

Keep on thinking deep! :+1:

5 Likes

Okay, great! I'm totally okay with being wrong, by the way. I don't have a PhD in CompSci, or Electrical Engineering or Mathematics. I just primarily troubleshoot hardware issues and I'm trying to wrap my head around what could have caused this because I want to learn.

I know there's a huge scope of knowledge I just don't have and would appreciate any constructive education on the matter. I'm just here for help and wanted to post what resolved the issue for me because I probably won't be the last person that runs into this issue and would like to provide some insight for someone else down the line that could run into this.

So thanks for explaining this to me.

4 Likes

My first reaction after reading your synopsis was OMG, you used a Windows utility to wipe your disk. Always use dd to zero out a drive if you want the job done right IMO.

The next thing was, yes you definitely should have completely removed all power sources and reset your bios to factory to start with a clean slate.

1 Like

That's good information to have! I literally just started using Linux a couple weeks ago and didn't know this. This is kind of why I'm trying to learn, there's a lot of things that are safer, more efficient and more secure when it's done through Linux from my understanding.

Specifically, I gravitated to Garuda because it's Arch based (which I know is unstable, but the Arch Wiki seems like it has great documentation, and this will put me in situations that will force me to learn more about Linux as I will need to put more effort into maintaining my system and educating myself to do so).

Additionally, it seems like the way you've integrated Timeshift will allow me to reset and try again when I make the countless, inevitable mistakes along the way.

That's kind of why my heart was set on using this distribution.

2 Likes

I think there is a bit of confusion generated by the overlap of what is a BIOS and what is RAM.

3 Likes