Strange booting behaviour

I have been using Garuda for a few months and have been having a good time with it. Recently, however, I have been unable to boot into the regular "Garuda Linux" entry from GRUB. I am able to go into the snapshots, select the latest one, and then choose the regular linux-zen kernel and initramfs (not the fallback ones), and it boots just fine. I think this started happening after I did a pacman -Syu, but I am not entirely sure. When it first happened I thought I would just restore a snapshot. So I booted into an older snapshot (anywhere from a week to a month old) and clicked the "Restore" button on the popup that says I am running on a snapshot and asks whether I would like to restore it. I immediately rebooted and still nothing: stuck on loading initial ramdisk at boot.

I initially thought it was something with my GRUB install, so I blew the old partition away and reinstalled GRUB, to no avail. Then I thought that maybe I should just rebuild the initramfs, so I ran sudo mkinitcpio -P, and that still didn't work. Then I tried reinstalling the kernel and headers using sudo pacman -S linux-zen linux-zen-headers, to maybe grab a newer version that might fix it, but still nothing. I also went into btrfs-assistant and checked the diffs on each of my files in the /boot directory. It showed no changes between my snapshots and the latest snapshot, which I thought was weird, since from the GRUB menu it boots successfully using one of my older snapshots' /boot folder.

So far my solution has been to go into the "Garuda Snapshots" entry within GRUB and select the vmlinuz-linux-zen & initramfs-linux-zen.img & amd-ucode.img entry, which boots using initrd "/@_backup_20220112230608429/boot/amd-ucode.img" "/@_backup_20220112230608429/boot/initramfs-linux-zen.img". I am able to get into my computer using this method, and it seems like it isn't detected as a snapshot: I can actually install things and have them persist, so I am not complaining a whole lot. But I think it would be great to get back to a state where I am able to just press Enter without thinking about it. Is there anything that I am missing? Or is my only option to reinstall?

This is the output from garuda-inxi as well. I tried to include as much as I could, but if I missed anything I would be glad to add! Thanks!

garuda-inxi


System:
  Kernel: 6.1.1-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 12.2.0
    parameters: BOOT_IMAGE=/@_backup_20222712220721796/boot/vmlinuz-linux-zen
    root=UUID=926ca3a3-eac8-45e1-b0cb-ec0ecc0d7019 quiet quiet splash
    rd.udev.log_priority=3 vt.global_cursor_default=0 loglevel=3
    rootflags=defaults,noatime,compress=zstd,discard=async,ssd,subvol=@_backup_20222712220721796
  Desktop: i3 v: 4.21.1 info: i3bar dm: LightDM v: 1.32.0
    Distro: Garuda Linux base: Arch Linux
Machine:
  Type: Desktop Mobo: ASUSTeK model: ROG STRIX B450-I GAMING v: Rev 1.xx
    serial: <filter> UEFI: American Megatrends v: 4901 date: 07/25/2022
Battery:
  Device-1: hidpp_battery_0 model: Logitech G305 Lightspeed Wireless Gaming
    Mouse serial: <filter> charge: 55% (should be ignored) rechargeable: yes
    status: discharging
CPU:
  Info: model: AMD Ryzen 7 3700X socket: AM4 bits: 64 type: MT MCP arch: Zen 2
    gen: 3 level: v3 note: check built: 2020-22 process: TSMC n7 (7nm)
    family: 0x17 (23) model-id: 0x71 (113) stepping: 0 microcode: 0x8701021
  Topology: cpus: 1x cores: 8 tpc: 2 threads: 16 smt: enabled cache:
    L1: 512 KiB desc: d-8x32 KiB; i-8x32 KiB L2: 4 MiB desc: 8x512 KiB
    L3: 32 MiB desc: 2x16 MiB
  Speed (MHz): avg: 3600 min/max: 2200/4426 boost: enabled
    base/boost: 3600/4400 scaling: driver: acpi-cpufreq governor: performance
    volts: 1.1 V ext-clock: 100 MHz cores: 1: 3600 2: 3600 3: 3600 4: 3600
    5: 3600 6: 3600 7: 3600 8: 3600 9: 3600 10: 3600 11: 3600 12: 3600
    13: 3600 14: 3600 15: 3600 16: 3600 bogomips: 115203
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: retbleed mitigation: untrained return thunk; SMT enabled with STIBP
    protection
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, STIBP:
    always-on, RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA GA104 [GeForce RTX 3070] driver: nvidia v: 520.56.06
    alternate: nouveau,nvidia_drm non-free: 520.xx+
    status: current (as of 2022-10) arch: Ampere code: GAxxx
    process: TSMC n7 (7nm) built: 2020-22 pcie: gen: 2 speed: 5 GT/s lanes: 16
    link-max: gen: 4 speed: 16 GT/s bus-ID: 07:00.0 chip-ID: 10de:2484
    class-ID: 0300
  Display: x11 server: X.Org v: 21.1.4 compositor: Picom v: git-98a5c
    driver: N/A display-ID: :0 screens: 1
  Screen-1: 0 s-res: 5120x1440 s-dpi: 108 s-size: 1204x333mm (47.40x13.11")
    s-diag: 1249mm (49.18")
  Monitor-1: DP-4 pos: right res: 2560x1440 hz: 60 dpi: 93
    size: 697x392mm (27.44x15.43") diag: 800mm (31.48") modes: N/A
  Monitor-2: HDMI-0 pos: primary,left res: 2560x1440 hz: 60 dpi: 109
    size: 597x335mm (23.5x13.19") diag: 685mm (26.95") modes: N/A
  API: OpenGL Message: Unable to show GL data. Required tool glxinfo
    missing.
Audio:
  Device-1: NVIDIA GA104 High Definition Audio driver: snd_hda_intel v: kernel
    pcie: gen: 2 speed: 5 GT/s lanes: 16 link-max: gen: 4 speed: 16 GT/s
    bus-ID: 07:00.1 chip-ID: 10de:228b class-ID: 0403
  Device-2: AMD Starship/Matisse HD Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
    bus-ID: 09:00.4 chip-ID: 1022:1487 class-ID: 0403
  Sound API: ALSA v: k6.1.1-zen1-1-zen running: yes
  Sound Server-1: PulseAudio v: 16.1 running: no
  Sound Server-2: PipeWire v: 0.3.60 running: yes
Network:
  Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: kernel
    pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: d000 bus-ID: 04:00.0
    chip-ID: 8086:1539 class-ID: 0200
  IF: enp4s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Device-2: Realtek RTL8822BE 802.11a/b/g/n/ac WiFi adapter vendor: ASUSTeK
    driver: rtw_8822be v: N/A modules: rtw88_8822be pcie: gen: 1 speed: 2.5 GT/s
    lanes: 1 port: c000 bus-ID: 05:00.0 chip-ID: 10ec:b822 class-ID: 0280
  IF: wlp5s0 state: down mac: <filter>
  IF-ID-1: virbr0 state: down mac: <filter>
Bluetooth:
  Device-1: ASUSTek Bluetooth Radio type: USB driver: btusb v: 0.8
    bus-ID: 1-8:5 chip-ID: 0b05:185c class-ID: e001 serial: <filter>
  Report: bt-adapter ID: hci0 rfk-id: 1 state: up address: <filter>
Drives:
  Local Storage: total: 1.82 TiB used: 196.46 GiB (10.5%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung
    model: SSD 970 EVO Plus 2TB size: 1.82 TiB block-size: physical: 512 B
    logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: 2B2QEXM7 temp: 51.9 C scheme: GPT
Partition:
  ID-1: / raw-size: 1022.45 GiB size: 1022.45 GiB (100.00%)
    used: 196.46 GiB (19.2%) fs: btrfs block-size: 4096 B dev: /dev/nvme0n1p2
    maj-min: 259:2
  ID-2: /home raw-size: 1022.45 GiB size: 1022.45 GiB (100.00%)
    used: 196.46 GiB (19.2%) fs: btrfs block-size: 4096 B dev: /dev/nvme0n1p2
    maj-min: 259:2
  ID-3: /var/log raw-size: 1022.45 GiB size: 1022.45 GiB (100.00%)
    used: 196.46 GiB (19.2%) fs: btrfs block-size: 4096 B dev: /dev/nvme0n1p2
    maj-min: 259:2
  ID-4: /var/tmp raw-size: 1022.45 GiB size: 1022.45 GiB (100.00%)
    used: 196.46 GiB (19.2%) fs: btrfs block-size: 4096 B dev: /dev/nvme0n1p2
    maj-min: 259:2
Swap:
  Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
  ID-1: swap-1 type: zram size: 31.27 GiB used: 2 MiB (0.0%) priority: 100
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 68.0 C mobo: 52.0 C gpu: nvidia temp: 38 C
  Fan Speeds (RPM): cpu: 1708 case-1: 1196 gpu: nvidia fan: 0%
  Power: 12v: 9.87 5v: N/A 3.3v: N/A vbat: 3.27
Info:
  Processes: 399 Uptime: 4m wakeups: 2 Memory: 31.27 GiB
  used: 3.94 GiB (12.6%) Init: systemd v: 252 default: graphical
  tool: systemctl Compilers: gcc: 12.2.0 Packages: pm: pacman pkgs: 1753
  libs: 430 tools: pamac,paru Shell: garuda-inxi (sudo) default: Bash
  v: 5.1.16 running-in: kitty inxi: 3.3.23
Garuda (2.6.9-1):
  System install date:     2022-11-26
  Last full system update: 2022-11-29
  Is partially upgraded:   Yes
  Relevant software:       NetworkManager
  Windows dual boot:       Yes
  Snapshots:               Snapper
  Failed units:

Hi, welcome to the forum! :wave:
Having a look at your inxi, it says

Garuda (2.6.9-1):
  System install date:     2022-11-26
  Last full system update: 2022-11-29
  Is partially upgraded:   Yes
  Relevant software:       NetworkManager
  Windows dual boot:       Yes

Partial upgrades aren't supported in Arch due to very problematic dependency issues that result from its rolling release nature. You mentioned that installed stuff persists in your booted snapshot, so I reckon you should try to update the system by running

update -a

and then reboot to see if it works. If not, maybe try restoring the snapshot after updating the system, or chroot from a live thumb drive session and update.

If updating doesn't work, at GRUB, go to the snapshot entry you normally boot from, press E, and remove all the quiet from the kernel parameters. Let us know where it stops at, and if it's consistent. Try some other kernels if you can, such as LTS and Hardened. ^^

Also, if the Windows dual boot part is legit, that could possibly be part of the problem as well, but hopefully not. :grimacing:


Sorry for such a late reply, but it doesn't look like that fixed it :confused: I am thinking that it is the Windows install, since it isn't detected by os-prober and doesn't appear in GRUB upon boot, which I didn't even think to check for some reason. I think I am just going to get a separate drive and keep the two isolated from one another. Thank you for the help!

I'm wary of giving advice because I haven't done it before.

Anyway, I think you'd have to boot the live media, and properly restore the @_backup_20220112230608429 as @, then fix the grub and fstab entries if they still refer to the backup. Then reboot, update, reboot, and update remote reset-snapper.
Apologies in advance for the hand waving.
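
To see whether a boot is still pointing at a backup subvolume, one rough check is to pull the subvol= option out of the kernel command line. This is only a sketch: root_subvol is a made-up helper name (not a Garuda tool), and the sample string is copied from the garuda-inxi output earlier in the thread.

```shell
# Hypothetical helper: print the btrfs subvolume named in rootflags=...subvol=...
# Prints nothing if the command line has no such option.
root_subvol() {
    # [^ ]* keeps the match inside the rootflags token; [^ ,]* stops the
    # captured name at the next comma or space.
    printf '%s\n' "$1" | sed -n 's/.*rootflags=[^ ]*subvol=\([^ ,]*\).*/\1/p'
}

# Sample command line taken from the inxi output in the first post.
sample='BOOT_IMAGE=/@_backup_20222712220721796/boot/vmlinuz-linux-zen root=UUID=926ca3a3-eac8-45e1-b0cb-ec0ecc0d7019 quiet quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0 loglevel=3 rootflags=defaults,noatime,compress=zstd,discard=async,ssd,subvol=@_backup_20222712220721796'

root_subvol "$sample"   # prints @_backup_20222712220721796
```

On the running system the same helper could be fed "$(cat /proc/cmdline)". If it prints anything other than @, the boot entry is still using a backup subvolume, and grepping /etc/fstab for _backup_ would show whether the mounts refer to it too.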

This thread was a similar case with a "sticky" snapshot, maybe there is some useful insight there.
There are likely other threads with a similar problem.

If you're considering a reinstall anyway (yes, keeping the two on separate drives is safer) you sort of have nothing to lose trying, just make sure your data is safe (backup that is).


For information, as far as I can tell there are two kinds of snapshots that are made automatically: the ones in .snapshots, taken pre/post package manager operations and/or periodically, and those named something with _backup_ and a timestamp in the name, taken before restoring one of the "regular" ones.
Apparently the snapper tools only manage the ones in .snapshots, not the "one off" kind.
After all they could be anything.


Thank you for the tips! I removed all Windows partitions from my disk (Windows is now on a completely different drive), deleted all my _backup_ snapshots, which broke GRUB, and reinstalled GRUB. It took a little bit of time to sort it all out, and it is working better than it was before, but it isn't 100% yet. I did a sudo update -a, but after that I wasn't able to boot with the default GRUB config that was generated. In order to boot, I need to remove the parameters quiet quiet splash from the GRUB entry. Am I OK to manually edit and keep the changed GRUB config file so that it just omits these parameters? Or will it get overwritten by a system update? Thanks again for the help!


Yes, you can remove the quiet quiet splash permanently, either from the Garuda Boot Options (which will update GRUB automatically) or by editing the GRUB_CMDLINE_LINUX_DEFAULT= line in /etc/default/grub and then running sudo update-grub.
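
As a rough illustration of what the edited value should look like, here is a small sketch that drops quiet and splash from a parameter string and leaves everything else alone. drop_quiet_splash is a hypothetical helper name, and the sample parameters are the ones from the inxi output above; it only prints the result and does not touch /etc/default/grub.

```shell
# Hypothetical helper: print a kernel parameter string without "quiet"/"splash".
drop_quiet_splash() {
    printf '%s\n' "$1" |
        tr -s ' ' '\n' |                    # one parameter per line
        grep -v -x -e quiet -e splash |     # drop exact matches only
        paste -sd ' ' -                     # join back into a single line
}

drop_quiet_splash 'quiet quiet splash rd.udev.log_priority=3 vt.global_cursor_default=0 loglevel=3'
# prints: rd.udev.log_priority=3 vt.global_cursor_default=0 loglevel=3
```

The printed string is what would go between the quotes of GRUB_CMDLINE_LINUX_DEFAULT= before running sudo update-grub. Matching whole words with grep -x avoids mangling a parameter that merely contains "quiet" as part of a longer name.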

Just remember if an upgrade leaves a .pacnew file next to it (it will say so and "please check and merge") compare the two and apply the changes from the .pacnew to the real one.
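
If it helps, a minimal sketch for finding and comparing those files is below. list_pacnew and show_pacnew_diffs are made-up helper names; the pacdiff tool from pacman-contrib covers the same ground more thoroughly. The sketch only reads files and prints diffs, it merges nothing.

```shell
# Hypothetical helper: list .pacnew files under a directory (default /etc).
list_pacnew() {
    find "${1:-/etc}" -type f -name '*.pacnew' 2>/dev/null
}

# Hypothetical helper: show a unified diff of each config against its .pacnew.
show_pacnew_diffs() {
    list_pacnew "$1" | while IFS= read -r new; do
        current=${new%.pacnew}              # e.g. /etc/default/grub
        echo "== $current vs $new"
        diff -u "$current" "$new" || true   # diff exits 1 when files differ
    done
}

# Example call (commented out so the sketch has no side effects when pasted):
# show_pacnew_diffs /etc
```

Lines starting with - are in your current file and lines starting with + are in the .pacnew, so each difference can be judged on its own before you edit the real file by hand.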


Thank you!!

Unfortunately, applying the changes from the .pacnew is not always the right thing to do; if it were, you could skip the check and simply overwrite the old file.

E.g. if in /etc/default/grub the value

GRUB_DISTRIBUTOR='Arch' 

instead of

GRUB_DISTRIBUTOR='Garuda'

would be entered.

The easiest way to check is with

