System missing from GRUB after update

Hello, today I updated my system and while shutting down, I noticed a strange error like “No EFI device found”. Then when I rebooted, the grub entry for Garuda Linux was gone. I made a Garuda USB and booted into a live session, then restored the snapshot from before the update, but the grub entry was still missing. Now, I do have a very peculiar system setup, so let me explain.

My /boot and /home directories are located on a 500GB SATA SSD, while my root partition and everything else is on a 1TB NVME SSD installed using a PCIe adapter. This setup is necessary because my motherboard is too old to natively support booting from NVME drives, but from what I understand, GRUB can load the necessary drivers to interface with the drive and load the operating system from it. I assume that the latest update messed up my GRUB config somehow, and perhaps my /boot directory is excluded from snapshots since it’s on another partition/device which would explain why restoring a snapshot didn’t fix it.

I’m currently in a live session using a USB with a Garuda ISO on it. The NVME drive is listed in the partition manager, so it’s likely still functional. My SATA drive shows up too, and I can mount the drive and browse the GRUB configuration files, but I’m not sure what to do. I searched for other related topics, and one suggested to use the Garuda Boot Options tool, but the “Boot to” dropdown doesn’t list any options. Another suggestion was to use the Garuda Boot Repair tool, but I’m hesitant to take any course of action that might break my configuration further.

In my two years of using Garuda (exactly two years, funny enough… this issue happened almost down to the hour of the day I installed it) I’ve never had any serious issues like this. Any advice would be appreciated, and I’d be happy to provide any additional information. Here is the output of garuda-inxi below:

System:
Kernel: 5.19.7-zen2-1-zen arch: x86_64 bits: 64 compiler: gcc v: 12.2.0
parameters: BOOT_IMAGE=/boot/vmlinuz-x86_64 lang=en_US keytable=us tz=UTC
misobasedir=garuda misolabel=GARUDA_DR460NIZEDGAMING_TALON quiet
systemd.show_status=1 driver=nonfree nouveau.modeset=0 i915.modeset=1
radeon.modeset=1
Desktop: KDE Plasma v: 5.25.5 tk: Qt v: 5.15.6 info: latte-dock
wm: kwin_x11 vt: 1 dm: SDDM Distro: Garuda Linux base: Arch Linux
Machine:
Type: Desktop Mobo: ASUSTeK model: P6T SE v: Rev 1.xx
serial: <superuser required> BIOS: American Megatrends v: 0908
date: 09/21/2010
CPU:
Info: model: Intel Core i7 920 bits: 64 type: MT MCP arch: Nehalem
gen: core 1 built: 2008-10 process: Intel 45nm family: 6
model-id: 0x1A (26) stepping: 5 microcode: 0x1D
Topology: cpus: 1x cores: 4 tpc: 2 threads: 8 smt: enabled cache:
L1: 256 KiB desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB
L3: 8 MiB desc: 1x8 MiB
Speed (MHz): avg: 1906 high: 2668 min/max: 1600/2668 boost: enabled
scaling: driver: acpi-cpufreq governor: performance cores: 1: 1604 2: 2668
3: 1604 4: 1602 5: 1899 6: 1604 7: 1605 8: 2668 bogomips: 42763
Flags: ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Vulnerabilities:
Type: itlb_multihit status: KVM: VMX disabled
Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT
vulnerable
Type: mds status: Vulnerable: Clear CPU buffers attempted, no microcode;
SMT vulnerable
Type: meltdown mitigation: PTI
Type: mmio_stale_data status: Unknown: No mitigations
Type: retbleed status: Not affected
Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
prctl
Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
sanitization
Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, IBRS_FW,
STIBP: conditional, RSB filling, PBRSB-eIBRS: Not affected
Type: srbds status: Not affected
Type: tsx_async_abort status: Not affected
Graphics:
Device-1: AMD Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
vendor: Gigabyte driver: amdgpu v: kernel arch: GCN-4 code: Arctic Islands
process: GF 14nm built: 2016-20 pcie: gen: 2 speed: 5 GT/s lanes: 16
link-max: gen: 3 speed: 8 GT/s ports: active: DP-2,DP-3
empty: DP-1,HDMI-A-1 bus-ID: 02:00.0 chip-ID: 1002:67df class-ID: 0300
Display: x11 server: X.Org v: 21.1.4 with: Xwayland v: 22.1.3
compositor: kwin_x11 driver: X: loaded: amdgpu unloaded: modesetting
alternate: fbdev,vesa gpu: amdgpu display-ID: :0 screens: 1
Screen-1: 0 s-res: 3840x1080 s-dpi: 96 s-size: 1016x285mm (40.00x11.22")
s-diag: 1055mm (41.54")
Monitor-1: DP-2 mapped: DisplayPort-1 pos: primary,left
model: Acer XF243Y serial: <filter> built: 2021 res: 1920x1080 hz: 60
dpi: 93 gamma: 1.2 size: 527x296mm (20.75x11.65") diag: 604mm (23.8")
ratio: 16:9 modes: max: 1920x1080 min: 720x400
Monitor-2: DP-3 mapped: DisplayPort-2 pos: right model: Acer XF243Y P
serial: <filter> built: 2021 res: 1920x1080 hz: 144 dpi: 93 gamma: 1.2
size: 527x296mm (20.75x11.65") diag: 604mm (23.8") ratio: 16:9 modes:
max: 1920x1080 min: 720x400
OpenGL: renderer: AMD Radeon RX 580 Series (polaris10 LLVM 14.0.6 DRM
3.47 5.19.7-zen2-1-zen) v: 4.6 Mesa 22.1.7 direct render: Yes
Audio:
Device-1: Intel 82801JI HD Audio vendor: ASUSTeK driver: snd_hda_intel
bus-ID: 7-1:2 v: kernel bus-ID: 00:1b.0 chip-ID: 0951:16a4
chip-ID: 8086:3a3e class-ID: 0300 serial: <filter> class-ID: 0403
Device-2: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
vendor: Gigabyte driver: snd_hda_intel v: kernel pcie: gen: 2 speed: 5 GT/s
lanes: 16 link-max: gen: 3 speed: 8 GT/s bus-ID: 02:00.1
chip-ID: 1002:aaf0 class-ID: 0403
Device-3: Kingston HyperX 7.1 Audio type: USB
driver: hid-generic,snd-usb-audio,usbhid
Sound Server-1: ALSA v: k5.19.7-zen2-1-zen running: yes
Sound Server-2: PulseAudio v: 16.1 running: no
Sound Server-3: PipeWire v: 0.3.57 running: yes
Network:
Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: e800
bus-ID: 06:00.0 chip-ID: 10ec:8168 class-ID: 0200
IF: enp6s0 state: up speed: 100 Mbps duplex: full mac: <filter>
Drives:
Local Storage: total: 2.31 TiB used: 0 KiB (0.0%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Addlink model: M.2 PCIE G3x4
NVMe size: 953.87 GiB block-size: physical: 512 B logical: 512 B
speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter> rev: ECFM32.1
temp: 27.9 C scheme: GPT
ID-2: /dev/sda maj-min: 8:0 vendor: Seagate model: ST1000DM010-2EP102
size: 931.51 GiB block-size: physical: 4096 B logical: 512 B
speed: 1.5 Gb/s type: HDD rpm: 7200 serial: <filter> rev: CC43
scheme: MBR
ID-3: /dev/sdb maj-min: 8:16 vendor: Western Digital
model: WDS500G2B0A-00SM50 size: 465.76 GiB block-size: physical: 512 B
logical: 512 B speed: 3.0 Gb/s type: SSD serial: <filter> rev: 20WD
scheme: GPT
ID-4: /dev/sdc maj-min: 8:32 type: USB vendor: SanDisk
model: Cruzer Force size: 14.32 GiB block-size: physical: 512 B
logical: 512 B speed: 3.0 Gb/s type: N/A serial: <filter> rev: 1.00
scheme: MBR
SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure?
Partition:
Message: No partition data found.
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
ID-1: swap-1 type: zram size: 11.68 GiB used: 0 KiB (0.0%) priority: 100
dev: /dev/zram0
Sensors:
System Temperatures: cpu: 36.5 C mobo: 37.0 C gpu: amdgpu temp: 51.0 C
Fan Speeds (RPM): cpu: 1795 psu: 0 case-1: 1328 case-2: 803 gpu: amdgpu
fan: 770
Power: 12v: 12.14 5v: N/A 3.3v: 3.30 vbat: N/A gpu: amdgpu watts: 30.08
Info:
Processes: 244 Uptime: 39m wakeups: 0 Memory: 11.68 GiB used: 3.2 GiB
(27.4%) Init: systemd v: 251 default: graphical tool: systemctl
Compilers: gcc: 12.2.0 Packages: pacman: 1837 lib: 515 Shell: fish v: 3.5.1
default: Bash v: 5.1.16 running-in: konsole inxi: 3.3.20
warning: database file for 'garuda' does not exist (use '-Sy' to download)
warning: database file for 'core' does not exist (use '-Sy' to download)
warning: database file for 'extra' does not exist (use '-Sy' to download)
warning: database file for 'community' does not exist (use '-Sy' to download)
warning: database file for 'multilib' does not exist (use '-Sy' to download)
warning: database file for 'chaotic-aur' does not exist (use '-Sy' to download)
Garuda (2.6.7-1):
System install date:     2023-09-22
Last full system update: 2023-09-22 ↻
Is partially upgraded:   No
Relevant software:       NetworkManager
Windows dual boot:       <superuser required>
Snapshots:               Snapper
Failed units:            snapper-cleanup.service

Use live ISO, chroot, install-grub, update-grub.
Use forum search please :slight_smile:


Thank you for pointing me in the right direction, I would have been searching for the wrong things for a long time. That sent me to this forum post which seems to be exactly what I’m looking for: How to chroot Garuda Linux

But now I’m a bit worse off than I started. Entering the chroot environment was easy enough. I was able to mount my boot partition with mount /dev/sdb1 /boot, and update-grub worked fine. There was a “Kernels not found” message which I wasn’t sure was an error, but googling turned up no results so I figured it’s normal. I rebooted and behold, I had a grub entry for Garuda Linux again! But it didn’t boot.

  Booting `Garuda Linux (on /dev/nvme0n1p1)`

error: no such device: ca2fa708-e61a-4167-9312-ffbc040fa3bd.
error: filename expected.

Press any key to continue...

  Failed to boot both default and fallback entries.

Press any key to continue...

So I entered chroot again, and this time installed grub with grub-install /dev/sdb. I ran update-grub and got the same results as last time, seemed fine. I rebooted, and this time there was no fancy grub splash screen waiting for me. It went straight into grub rescue mode.

GRUB loading...
Welcome to GRUB!

error: no such device: ca2fa708-e61a-4167-9312-ffbc040fa3bd.
error: unknown filesystem.
Entering rescue mode...
grub rescue> 

I found this forum post about reinstalling grub: Cannot Reinstall Grub Bootloader - #4 by FGD and it mentioned that the user had a GPT partition table, so they should be installing the UEFI version of grub instead of the legacy BIOS version. My boot drive is also partitioned with GPT, but I was using the legacy BIOS version before so I don’t know if I should be using the UEFI version. My motherboard might not even support UEFI.
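One way to settle the UEFI-versus-legacy question from the live session is to check whether the kernel exposes the EFI firmware directory. A minimal sketch, assuming a Linux live environment; the function name `boot_mode` and its optional argument are made up for illustration and are not part of any Garuda tool:

```shell
#!/bin/sh
# Report whether the running system was started by UEFI or legacy BIOS firmware.
# The kernel only creates /sys/firmware/efi when it was booted through UEFI.
# boot_mode is a hypothetical helper; the optional argument exists only so the
# check can be pointed at a different directory.
boot_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}
```

Running `boot_mode` prints a single word. This only reports how the live USB itself was booted, so a USB started in legacy mode on UEFI-capable hardware would still print `BIOS`.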

I also saw in an askubuntu question that when installing grub, you have to point it to a location to read data from the hard disk at the beginning. I am not sure how this works in a chroot environment, or when the root partition is on a separate drive. I’m not sure where to go from here.

This is not correct. It should be:

sudo mkdir -p /mnt/broken
sudo mount /dev/sdb1 /mnt/broken
sudo garuda-chroot /mnt/broken/@

Then after chrooting:

mount /dev/sdb2 /boot/efi

(replace sdb2 with your ESP; check as suggested in the tutorial). Then install the GRUB bootloader and update GRUB:

grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=garuda --recheck
update-grub

Or, if booting in legacy (BIOS) mode, use this instead:

grub-install /dev/sdb

Do you mean /boot/efi is on the SATA SSD? Or where is the EFI partition?

/boot is where the kernels and initramfs images will be. The default for Garuda is for boot to be on Btrfs, and it would be included in the normal system snapshots.

/boot/efi is where the bootloader stuff is stored–so Grub related files, etc.

It might be helpful if you can post some more information:

lsblk -f
sudo parted -l
efibootmgr

Edit: my mistake, I see it’s a BIOS board so there actually is not an EFI partition in this case.


Yes, I did those steps to enter the chroot environment. I installed in legacy mode using grub-install /dev/sdb just like you said. The only difference is I used the command mount /dev/sdb1 /boot so that update-grub would run correctly, otherwise there is no mounted boot directory to update. Then I exited chroot with exit and rebooted. What did I do wrong?


Edit: clarified the mounting process

So I found this post about whether GRUB can access NVME drives if the BIOS doesn’t have support for it. The answer is: in theory, yes, but in practice GRUB doesn’t have a driver for it. However, GRUB can load the kernel which, if it has an NVME driver, can then access the drive and load the rest of the operating system. The problem is that I just checked my /boot directory and it doesn’t contain a vmlinuz image or an initramfs. So if GRUB doesn’t have a linux kernel it can load, then I’m screwed. I’m not sure why update-grub gave me the message “No kernels found”, but that could have something to do with my issue.
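That observation can be confirmed from inside the chroot by testing for the files directly. A minimal sketch, assuming the Arch naming scheme (`vmlinuz-*` for kernels, `initramfs-*.img` for ramdisks); `has_boot_files` is a hypothetical helper name:

```shell
#!/bin/sh
# has_boot_files (hypothetical helper) succeeds only if the given directory
# holds at least one kernel image (vmlinuz-*) and at least one initramfs
# image (initramfs-*.img). Defaults to /boot when no argument is given.
has_boot_files() {
    dir="${1:-/boot}"
    kernel_found=no
    initramfs_found=no
    for f in "$dir"/vmlinuz-*; do
        if [ -f "$f" ]; then kernel_found=yes; fi
    done
    for f in "$dir"/initramfs-*.img; do
        if [ -f "$f" ]; then initramfs_found=yes; fi
    done
    [ "$kernel_found" = yes ] && [ "$initramfs_found" = yes ]
}
```

For example, `has_boot_files /boot || echo "no kernel or initramfs in /boot"` run after mounting the boot partition would flag the situation described above before update-grub is run.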

After a lot of digging, I figured it out! Since I didn’t have a ramdisk or kernel images in my /boot directory, update-grub generated a configuration that searched for a kernel on the NVME drive, which GRUB cannot see. So to get GRUB to boot the linux kernel right away, I had to manually set up the initial ramdisk environment using mkinitcpio:

# Enter chroot using live ISO
sudo mkdir -p /mnt/broken
sudo mount /dev/nvme0n1p1 /mnt/broken
sudo garuda-chroot /mnt/broken/@

# Mount the boot volume (the drive was at sda1 this time)
mount /dev/sda1 /boot

# Install kernel and generate .preset files in /etc/mkinitcpio.d
pacman -S linux-zen
ls /etc/mkinitcpio.d
# output: linux-zen.preset

# Create initial ramdisk and check that the kernel and image files are in /boot
mkinitcpio -p linux-zen
ls /boot
# output:
#   grub                              initramfs-linux-zen.img  memtest86+
#   initramfs-linux-zen-fallback.img  intel-ucode.img          vmlinuz-linux-zen

# Re-install and update GRUB
grub-install /dev/sda
update-grub

And finally, after exiting chroot and rebooting, I had a working system again! It seems the cause of this issue was that when updating my system, the .preset files in /etc/mkinitcpio.d didn’t get generated for some reason. So when the GRUB configuration was updated, I didn’t have an initial ramdisk environment, leaving me with an unbootable system.
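To spot a missing preset before rebooting, each installed kernel can be compared against /etc/mkinitcpio.d. A rough sketch, assuming the Arch packaging convention that every kernel package leaves a `pkgbase` file under /usr/lib/modules/<version>/ containing the package name; `missing_presets` is a hypothetical helper:

```shell
#!/bin/sh
# missing_presets (hypothetical helper) prints the name of every installed
# kernel package that has no matching <name>.preset file.
# Assumption: each kernel package ships /usr/lib/modules/<version>/pkgbase
# holding its package name (for example "linux-zen").
missing_presets() {
    modules_dir="${1:-/usr/lib/modules}"
    preset_dir="${2:-/etc/mkinitcpio.d}"
    for pkgbase_file in "$modules_dir"/*/pkgbase; do
        # Skip the unexpanded glob when no kernel is installed at all.
        if [ ! -f "$pkgbase_file" ]; then continue; fi
        name=$(cat "$pkgbase_file")
        if [ ! -f "$preset_dir/$name.preset" ]; then
            echo "$name"
        fi
    done
}
```

Run inside the chroot with no arguments, it prints nothing when every kernel has a preset. Any name it does print is a candidate for reinstalling, as was done with `pacman -S linux-zen` above.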


You should have checked the contents of the mounted partition (/boot/) before you ran grub-install. It seems the mounted partition was the wrong one (or it failed to mount?). Drive names are not persistent.

Which partition is that?
The more info you provide, the better help you will get.

After you chroot from Live ISO, check that you have your original (initially configured on installation) boot partition mounted. Compare original fstab contents with your currently mounted partitions. Correct /boot/ if needed.
How did you initially install? :joy:
The /boot/ directory holds the kernel images and initramfs images (the latter created by mkinitcpio/dracut), and the grub folder with grub.cfg, modules, themes, and other folders. If needed, you can recreate these contents with the appropriate commands.
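Because names like /dev/sda1 can change between boots, the fstab entry is a more reliable pointer to the boot partition than the device name. A minimal sketch for reading it out; `fstab_source` is a hypothetical helper, not an existing tool:

```shell
#!/bin/sh
# fstab_source (hypothetical helper) prints the device field (first column)
# of the first fstab entry whose mount point (second column) equals $2.
# Comment lines and blank lines are skipped.
fstab_source() {
    fstab="$1"
    mountpoint="$2"
    awk -v mp="$mountpoint" '
        $1 ~ /^#/ { next }                    # skip comment lines
        NF >= 2 && $2 == mp { print $1; exit }
    ' "$fstab"
}
```

Inside the chroot, `fstab_source /etc/fstab /boot` would typically print something like `UUID=...`, which `mount` accepts directly as the source. The result can be compared with `findmnt -no SOURCE /boot` to see what is actually mounted there.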


Sorry to ninja you with a solution! But I’d be happy to clarify some things for completeness’ sake.

That is possible. Indeed, in my solution, the drive name had changed since last time. It’s possible that after several tries at different solutions I forgot to check that the drive name was correct. But I do know that my actual boot volume did not contain a linux kernel or initram image, which was part of the problem.

That is the root partition on the NVME drive.

It’s been a while. I used the graphical installer on a live USB, set up some custom partitions, with a 1GB legacy boot volume on the SATA SSD and the root volume on the NVME SSD. I wisely took a screenshot of the install screen for future reference, though it might not provide much clarity.

I suppose you mean something that I cannot understand :smile: .
The system (bootloader) cannot boot without kernel images.
Kernel images are replaced (overwritten) during kernel upgrades, so some image should be there if the partition is the correct one. The grub folder should be there as well.
I think something else must have happened, maybe even a bug. If you had kept notes, you could help find a possible bug.


I’m sorry, I tend to mix up terminology sometimes. I’m referring to what you’re talking about here, the kernel images that should be in the /boot directory:

They were missing. I thought that kernel images could also be located somewhere else on the root partition, and that GRUB was searching for them. But maybe that isn’t the case, since the broken grub.cfg was just invoking linux without specifying a file containing the kernel.

As I found out when manually invoking mkinitcpio, there were no .preset files in /etc/mkinitcpio.d, so something must have gone wrong in the upgrade. It could simply be that I ran out of disk space, since there were some errors about some files failing to write. Running df -h revealed that my NVME drive was 99% full, with only a couple gigabytes remaining. I’m not sure that would cause the .preset files not to generate, though.
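A check like the one below, run before an upgrade, would have flagged the nearly full drive. It is only a sketch: `usage_percent_from_df` is a hypothetical helper that reads `df -P` output on standard input.

```shell
#!/bin/sh
# usage_percent_from_df (hypothetical helper) reads `df -P <path>` output on
# stdin and prints the fill level as a bare integer (e.g. 99).
# df -P forces the POSIX one-line-per-filesystem format, so the capacity
# column is the fifth field of the second line.
usage_percent_from_df() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, `df -P / | usage_percent_from_df` prints the percentage for the root filesystem, which a script can compare against a threshold of its choosing. On Btrfs the figure from df can be misleading, so `btrfs filesystem usage /` is worth checking as well.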

I did find the culprit taking up all my disk space, however. I found out I had system snapshots over a year old from timeshift, before Garuda switched to snapper as its snapshot management tool! I looked up how to delete these snapshot subvolumes, and freed up a few hundred gigabytes of space by doing so.
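For anyone hunting for the same leftovers, the subvolume list can be filtered for Timeshift paths. This is a list-only sketch, assuming the usual `btrfs subvolume list` line format ending in `path <subvolume>` and that Timeshift's snapshots contain "timeshift" in their path; `timeshift_subvolumes` is a hypothetical helper:

```shell
#!/bin/sh
# timeshift_subvolumes (hypothetical helper) reads `btrfs subvolume list`
# output on stdin and prints the path of every subvolume whose path contains
# "timeshift". It only lists; it does not delete anything.
timeshift_subvolumes() {
    awk '{
        p = $0
        sub(/^.* path /, "", p)   # keep only the text after " path "
        if (p ~ /timeshift/) print p
    }'
}
```

Something like `sudo btrfs subvolume list / | timeshift_subvolumes` shows the candidates. Each one should be reviewed by hand before removing it with `btrfs subvolume delete`.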


It is a possibility.

Well done with your recovery. You worked like a real Arch user:tm: !

