Initially I was planning to add this as a comment on this thread, but I eventually felt like it didn't fit so I decided to toss it into its own thread.
I recently installed Gnome 42 (garuda-gnome-linux-zen-220329) and, after reading the wiki post here which warns that enabling Wayland will break stuff, I had no choice but to immediately enable Wayland.
I enabled it with Garuda Assistant (Settings -> enable GDM Wayland) and took a reboot. After selecting the grub entry, I got the familiar "Loading Linux linux-zen ... / Loading initial ramdisk ..." message, then the Plymouth screen with the eagle head and the status bar. After the status bar was mostly full, it reverted to "Loading Linux linux-zen ... / Loading initial ramdisk ..." and got stuck.
I gave it a few minutes just in case, then was going to take Raising Elephants and restore a snapshot to backtrack a little. I only got out the "R" and the "E" when the screen snapped back to life and the login screen (GDM?) came up. I logged in normally and poof, there I was in a Wayland session.
Oddly, I wasn't connected to my network and I quickly discovered that NetworkManager wasn't running at all. I ran sudo systemctl restart NetworkManager and it popped back on and automatically connected to the network.
I had to look it up because I had forgotten, but Magic SysRq key "R" switches the keyboard from raw to XLATE mode and "E" sends the SIGTERM signal to all processes except init. It appears NetworkManager is one of the processes getting killed in this case, but other than that I'm not sure if anything is missing. The session seems fine; the obvious things are all working normally.
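For anyone trying to reproduce this: whether "R" and "E" are honoured at all depends on the kernel.sysrq setting. Here is a minimal sketch for decoding that value, assuming the bit meanings from the kernel's sysrq documentation (4 for keyboard control, which covers unraw, and 64 for signalling processes). sysrq_allows is just a hypothetical helper name, not an existing tool.

```shell
# Report whether a kernel.sysrq value permits a given class of SysRq functions.
# Assumed bit meanings (kernel sysrq documentation):
#   4  = keyboard control, which covers "R" (unraw)
#   64 = signalling of processes, which covers "E" (SIGTERM to all)
# A value of 1 enables every function; 0 disables all of them.
sysrq_allows() {
    value=$1   # contents of /proc/sys/kernel/sysrq
    bit=$2     # bit to test, e.g. 4 or 64
    if [ "$value" -eq 1 ] || [ $(( value & bit )) -ne 0 ]; then
        echo yes
    else
        echo no
    fi
}

# Example against the live setting:
#   sysrq_allows "$(cat /proc/sys/kernel/sysrq)" 64
```

If the second call prints "no", the "E" key would be ignored on that system and the boot would presumably stay stuck.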
I took an update and rebooted and discovered the issue recurs exactly: the boot gets stuck at initial ramdisk, Alt+(Print,R,E) breaks it free and login screen loads normally, the session loads without NetworkManager started.
I understand there are still some kinks to be ironed out with the Gnome 42/Wayland stack, and frankly this issue has a very small impact on me personally so I'm not calling for support at all. I just thought I would offer it up to the community in case it would be helpful to troubleshoot. For now, the issue is easily and reliably reproducible.
Yes, I think so. That's why I started up the thread, my friend. Let me know if you have any ideas for how I can dig in and expose the problem a bit; your experience runs quite a bit deeper than mine.
I think it is very unlikely this is an issue with Garuda's implementation, but as you mention it might be worth sending a bug report upstream.
I'm at work now, but when I get home I am planning on firing up a different kernel to see what happens. Let me know if you've got some other thoughts and I'd be more than happy to take a crack at it.
sudo find /etc -type f -print0 | sudo xargs -0 stat > etc_find_2.txt
Meld was not installed so I installed it real quick (had to restart NetworkManager first), then ran it against etc_find_1.txt and etc_find_2.txt. There were only five hits, so I'll just list them out:
/etc/environment and /etc/gdm/custom.conf were both modified. /etc/gdm/custom.conf gets the WaylandEnable=false line commented out when you enable Wayland. The new entries in /etc/environment do not seem particularly noteworthy, although it does seem odd that the file is basically empty to begin with (see .bak version below).
/etc/environment
#
# This file is parsed by pam_env module
#
# Syntax: simple "KEY=VAL" pairs on separate lines
#
_JAVA_AWT_WM_NONREPARENTING=1
#MOZ_ENABLE_WAYLAND=1
EDITOR=/usr/bin/micro
BROWSER=firefox
TERM=alacritty
MAIL=geary
/etc/gdm/custom.conf
# GDM configuration storage
[daemon]
AutomaticLoginEnable=False
# Uncomment the line below to force the login screen to use Xorg
#WaylandEnable=false
DefaultSession=gnome-xorg.desktop
[security]
[xdmcp]
[chooser]
[debug]
# Uncomment the line below to turn on debugging
#Enable=true
/etc/ld.so.cache did not change, but it was accessed at some point between running find 1 and find 2. I do not know what this file is.
/etc/environment.bak is a new file that was created--makes sense, although as I mentioned the file is basically empty.
/etc/environment.bak
#
# This file is parsed by pam_env module
#
# Syntax: simple "KEY=VAL" pairs on separate lines
#
/etc/resolv.conf changed, which at first interested me but then I realized when I restarted NetworkManager to install meld it would have whipped up a fresh copy of resolv.conf so that checks out.
At first glance these findings seem unremarkable to me, but let me know if I'm overlooking something.
I think next I'll throw another kernel in here and see if anything interesting happens.
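The before/after comparison of /etc can also be done without Meld. This is only a sketch, assuming GNU stat (for --format); snapshot_tree is a made-up helper name. It records path, size and modification time but deliberately leaves out access time, so a file that was merely read between the two runs (as /etc/ld.so.cache was) does not show up as a difference.

```shell
# Record path, size and modification time for every regular file under a tree.
# Output is sorted so two snapshots can be compared line by line with diff.
# Assumes GNU stat; access time is omitted on purpose.
snapshot_tree() {
    dir=$1
    find "$dir" -type f -exec stat --format='%n %s %Y' {} + | sort
}

# Typical use, from a root shell so find can descend into restricted directories:
#   snapshot_tree /etc > etc_find_1.txt
#   ... toggle the setting and reboot ...
#   snapshot_tree /etc > etc_find_2.txt
#   diff etc_find_1.txt etc_find_2.txt
```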
This doesn't make sense for a GDM Wayland setting.
Disable DefaultSession in custom.conf
GDM logs might provide debug info. They are at
/var/log/gdm/
and I guess they would be the equivalent of Xorg logs, since there is no Xorg running. If, on the other hand, /var/log/Xorg.?.log files exist and their timestamps fall within the Wayland boot, they would also add to our understanding: potentially Xorg was trying to start.
If there are no logs, or they are not enough, there is a setting in custom.conf to enable debugging (default: disabled)
[debug]
# Uncomment the line below to turn on debugging
Enable=true
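When comparing custom.conf between boots, it can help to look only at the lines that actually take effect, since most of the file is comments. A rough sketch (active_settings is a hypothetical helper, and it assumes comments start with "#" as in the file above):

```shell
# Print only the lines of an INI-style file that take effect:
# section headers and KEY=VALUE lines, with comments and blank lines dropped.
active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example:
#   active_settings /etc/gdm/custom.conf
```

Running it before and after a change and diffing the two outputs shows whether a setting such as WaylandEnable, DefaultSession or Enable really switched state.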
You might also try booting to a TTY and starting GDM from there, to see any errors directly, or just revert the failing changes/tests.
I commented out the line and took a reboot just in case this could be breaking something (I honestly wasn't sure). However, if this has made a change it is not one that is obvious to me.
garuda-inxi from the Wayland session does mention X.org as a server that is running. Does that seem right?
There are not too many logs in here at the moment. The gdm directory doesn't contain a single file.
λ ls -l /var/log
drwx------ - root 21 Apr 06:34 audit
drwx--x--x - root 12 May 18:45 gdm
drwxr-xr-x - root 14 Jan 2021 gssproxy
drwxr-sr-x@ - root 12 May 18:45 journal
drwxr-xr-x - root 6 Dec 2021 old
drwx------ - root 28 Apr 01:30 private
.rw------- 52k root 13 May 09:49 boot.log
.rw-rw---- 0 root 12 May 18:45 btmp
.rw-rw-r-- 0 root 12 May 18:45 lastlog
.rw-r--r-- 118k root 12 May 21:45 pacman.log
.rw-r----- 36k root 13 May 09:38 snapper.log
.rw-rw-r-- 10.0k root 13 May 09:50 wtmp
So:
I'll fire up this debug option and see if we can get anybody to talk.
UPDATE 1:
This seems a little odd, but after enabling debug mode and rebooting it did not hang at initial ramdisk like usual, but it did automatically load an X11 session for some reason.
/etc/gdm/custom.conf has not changed:
# GDM configuration storage
[daemon]
AutomaticLoginEnable=False
# Uncomment the line below to force the login screen to use Xorg
#WaylandEnable=false
#DefaultSession=gnome-xorg.desktop
[security]
[xdmcp]
[chooser]
[debug]
# Uncomment the line below to turn on debugging
Enable=true
"Enable GDM Wayland" is still checked in Garuda Assistant.
/var/log/ is still completely empty except for boot.log (no errors noted), pacman.log, snapper.log, and wtmp.
I'll see if I can reboot and get back into a Wayland session from the login screen.
UPDATE 2:
I was not able to get back into a Wayland session while the debug mode was enabled.
It seems odd, but something about the debug mode appears to disable Wayland somehow. I tried the Zen and LTS kernels (why not?) but it boots straight to X11. The cogwheel on the login screen is also missing (the cogwheel gives the option to switch to X11 if you want to when booting Wayland).
When I commented out the debugging line in /etc/gdm/custom.conf and rebooted, I was back to: stuck at ramdisk → Raising Elephants → login to Wayland without NetworkManager.
This setting does not exist in my default Arch package custom.conf
There is no man page for gdm to help with settings.
It might mean Xwayland, or just that a $DISPLAY exists. I don't know. Mutter takes control after gdm.
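One way to check the session type directly, rather than inferring it from the inxi line, is to look at the variables the session itself exports. This is a sketch under the assumption that pam_systemd sets XDG_SESSION_TYPE as usual; session_kind is a made-up name. Note that DISPLAY being set does not rule out Wayland, since Xwayland normally provides it for X11 clients.

```shell
# Classify the current login session from its environment.
# XDG_SESSION_TYPE is normally set by pam_systemd. WAYLAND_DISPLAY and DISPLAY
# are set by the compositor / X server. Under Wayland, DISPLAY is usually set
# as well, because Xwayland serves legacy X11 clients.
session_kind() {
    case "${XDG_SESSION_TYPE:-}" in
        wayland) echo "wayland" ;;
        x11)     echo "x11" ;;
        *)
            if [ -n "${WAYLAND_DISPLAY:-}" ]; then
                echo "wayland"
            elif [ -n "${DISPLAY:-}" ]; then
                echo "x11"
            else
                echo "unknown"
            fi
            ;;
    esac
}

# Example, run from a terminal inside the graphical session:
#   session_kind
```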
Look in ~/.local/share/, in relevant folders. Xorg saves at /xorg/, maybe gdm or other?
Did you check as root? It is not readable by anyone else…
There are some automation rules for gdm, that disable Wayland in some cases. It could be something from those that (wrongly or not) does this.
$ grep "^#" /usr/lib/udev/rules.d/61-gdm.rules
# identify virtio graphics cards to find passthrough setups
# identify virtio graphics cards to find passthrough setups
# cirrus
# vga
# qxl
# disable Wayland on Hi1710 chipsets
# disable Wayland on Matrox chipsets
# disable Wayland on aspeed chipsets
# disable Wayland if modesetting is disabled
# but keep it enabled for simple framebuffer drivers
# The vendor nvidia driver has multiple modules that need to be loaded before GDM can make an
# informed choice on which way to proceed, so force GDM to wait until NVidia's modules are
# loaded before starting up.
# Check if suspend/resume services necessary for working wayland support is available
# If this machine has an internal panel, take note, since it's probably a laptop
# FIXME: It could be "ghost connectors" make this pop positive for some workstations
# in the wild. If so, we may have to fallback to looking at the chassis type from
# dmi data or acpi
# If this is a hybrid graphics setup, take note
# If this is a hybrid graphics laptop with vendor nvidia driver, disable wayland
# Disable wayland in situation where we're in a guest with a virtual gpu and host passthrough gpu
# Disable wayland when there are multiple virtual gpus
# Disable wayland when nvidia modeset is disabled or when drivers are a lower
# version than 470,
# For versions above 470 but lower than 510 prefer Xorg,
# Above 510, prefer Wayland.
# disable wayland if nvidia-drm modeset is not enabled
# disable wayland for nvidia drivers versions lower than 470
# For nvidia drivers versions Above 510, keep Wayland by default
# For nvidia drivers versions 470-495, prefer Xorg by default
I have some coding to do, so I won't try GDM for now. Maybe later, or someone else would want to test and confirm this issue, or not.
As long as it doesn't seem to be a Garuda settings issue, there is no real rush.
Probably your HW/SW combination, unless others reproduce it…
Gnome devs are smarter than any of us, anyway!
Yes, I switched to root and confirmed the directory is empty.
That is interesting. I didn't see anything that jumped out at me on this list as far as GDM debug mode goes, but I take your point: there could be something somewhere disabling Wayland on purpose, for any number of reasons.
This thread got me combing through the output of journalctl -b looking for clues, but it is quite a lengthy file and I have a bit of an untrained eye. I could identify the moment when the sysrq commands were entered, but nothing surrounding that event really stood out to me.
I thought it might be more useful to look at the journal if there were a more pronounced time difference between all the stuff that is working normally and whatever the system is hanging up on, so I took a reboot and left the system "stuck" on the initial ramdisk screen while I took my son to the park to run around like an insane maniac (as he does) before dinner.
When I came back a little over an hour later, sure enough it was still on the initial ramdisk screen. I sent through the sysrq commands and logged in to review the journalctl -b output. This time I noticed a lengthy series of "state changed new lease" outputs leading up to the sysrq commands:
May 13 17:02:25 fw-gnome NetworkManager[533]: <info> [1652475745.6991] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:03:19 fw-gnome NetworkManager[533]: <info> [1652475799.6993] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:04:13 fw-gnome NetworkManager[533]: <info> [1652475853.6977] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:05:07 fw-gnome NetworkManager[533]: <info> [1652475907.7023] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:06:02 fw-gnome NetworkManager[533]: <info> [1652475962.6986] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:06:57 fw-gnome NetworkManager[533]: <info> [1652476017.7014] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:07:52 fw-gnome NetworkManager[533]: <info> [1652476072.7075] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:08:47 fw-gnome NetworkManager[533]: <info> [1652476127.7027] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:08:48 fw-gnome wpa_supplicant[1158]: wlp170s0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-56 noise=9999 txrate=245000
May 13 17:08:50 fw-gnome wpa_supplicant[1158]: wlp170s0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-56 noise=9999 txrate=245000
May 13 17:08:56 fw-gnome wpa_supplicant[1158]: wlp170s0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-41 noise=9999 txrate=245000
May 13 17:09:41 fw-gnome NetworkManager[533]: <info> [1652476181.7081] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:10:35 fw-gnome NetworkManager[533]: <info> [1652476235.7015] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:11:30 fw-gnome NetworkManager[533]: <info> [1652476290.7023] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:12:26 fw-gnome NetworkManager[533]: <info> [1652476346.7029] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:13:20 fw-gnome NetworkManager[533]: <info> [1652476400.7036] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:14:16 fw-gnome NetworkManager[533]: <info> [1652476456.7021] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:15:09 fw-gnome NetworkManager[533]: <info> [1652476509.7068] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:16:05 fw-gnome NetworkManager[533]: <info> [1652476565.6837] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:16:59 fw-gnome NetworkManager[533]: <info> [1652476619.7022] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:17:53 fw-gnome NetworkManager[533]: <info> [1652476673.7013] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:18:46 fw-gnome NetworkManager[533]: <info> [1652476726.7035] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:19:39 fw-gnome NetworkManager[533]: <info> [1652476779.7086] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:20:32 fw-gnome NetworkManager[533]: <info> [1652476832.7021] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:21:25 fw-gnome NetworkManager[533]: <info> [1652476885.7051] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:22:20 fw-gnome NetworkManager[533]: <info> [1652476940.7044] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:23:16 fw-gnome NetworkManager[533]: <info> [1652476996.7594] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:24:09 fw-gnome NetworkManager[533]: <info> [1652477049.7146] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:25:04 fw-gnome NetworkManager[533]: <info> [1652477104.7047] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:25:59 fw-gnome NetworkManager[533]: <info> [1652477159.7068] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:26:53 fw-gnome NetworkManager[533]: <info> [1652477213.7086] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:27:48 fw-gnome NetworkManager[533]: <info> [1652477268.7048] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:28:41 fw-gnome NetworkManager[533]: <info> [1652477321.7076] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:29:35 fw-gnome NetworkManager[533]: <info> [1652477375.7603] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:30:30 fw-gnome NetworkManager[533]: <info> [1652477430.7111] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:31:24 fw-gnome NetworkManager[533]: <info> [1652477484.7066] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:32:19 fw-gnome NetworkManager[533]: <info> [1652477539.6799] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:33:12 fw-gnome NetworkManager[533]: <info> [1652477592.7093] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:34:06 fw-gnome NetworkManager[533]: <info> [1652477646.7073] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:35:00 fw-gnome NetworkManager[533]: <info> [1652477700.7195] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:35:55 fw-gnome NetworkManager[533]: <info> [1652477755.7062] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:36:50 fw-gnome NetworkManager[533]: <info> [1652477810.7093] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:37:46 fw-gnome NetworkManager[533]: <info> [1652477866.7117] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:38:41 fw-gnome NetworkManager[533]: <info> [1652477921.6847] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:39:34 fw-gnome NetworkManager[533]: <info> [1652477974.7082] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:40:27 fw-gnome NetworkManager[533]: <info> [1652478027.7083] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:41:22 fw-gnome NetworkManager[533]: <info> [1652478082.7099] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:42:15 fw-gnome NetworkManager[533]: <info> [1652478135.7088] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:43:11 fw-gnome NetworkManager[533]: <info> [1652478191.7068] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:44:04 fw-gnome NetworkManager[533]: <info> [1652478244.7082] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:44:58 fw-gnome NetworkManager[533]: <info> [1652478298.7133] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:45:51 fw-gnome NetworkManager[533]: <info> [1652478351.7114] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:46:45 fw-gnome NetworkManager[533]: <info> [1652478405.7129] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:47:41 fw-gnome NetworkManager[533]: <info> [1652478461.7097] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:48:37 fw-gnome NetworkManager[533]: <info> [1652478517.7100] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:49:33 fw-gnome NetworkManager[533]: <info> [1652478573.7088] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:50:27 fw-gnome NetworkManager[533]: <info> [1652478627.7114] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:51:22 fw-gnome NetworkManager[533]: <info> [1652478682.7118] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:52:15 fw-gnome NetworkManager[533]: <info> [1652478735.7099] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:53:10 fw-gnome NetworkManager[533]: <info> [1652478790.7105] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:54:05 fw-gnome NetworkManager[533]: <info> [1652478845.7129] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:55:01 fw-gnome NetworkManager[533]: <info> [1652478901.6874] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:55:56 fw-gnome NetworkManager[533]: <info> [1652478956.7152] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:56:51 fw-gnome NetworkManager[533]: <info> [1652479011.7123] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:57:47 fw-gnome NetworkManager[533]: <info> [1652479067.7144] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:58:41 fw-gnome NetworkManager[533]: <info> [1652479121.7144] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 17:59:36 fw-gnome NetworkManager[533]: <info> [1652479176.7795] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:00:29 fw-gnome NetworkManager[533]: <info> [1652479229.6937] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:01:25 fw-gnome NetworkManager[533]: <info> [1652479285.7180] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:02:19 fw-gnome NetworkManager[533]: <info> [1652479339.7173] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:03:12 fw-gnome NetworkManager[533]: <info> [1652479392.7151] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:04:08 fw-gnome NetworkManager[533]: <info> [1652479448.7194] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:05:01 fw-gnome NetworkManager[533]: <info> [1652479501.7186] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:05:55 fw-gnome NetworkManager[533]: <info> [1652479555.7166] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:06:48 fw-gnome NetworkManager[533]: <info> [1652479608.6890] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:07:43 fw-gnome NetworkManager[533]: <info> [1652479663.7147] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:08:37 fw-gnome NetworkManager[533]: <info> [1652479717.7171] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:09:33 fw-gnome NetworkManager[533]: <info> [1652479773.7170] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:10:27 fw-gnome NetworkManager[533]: <info> [1652479827.7145] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:11:23 fw-gnome NetworkManager[533]: <info> [1652479883.7169] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:12:17 fw-gnome NetworkManager[533]: <info> [1652479937.7138] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:13:13 fw-gnome NetworkManager[533]: <info> [1652479993.7170] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:14:07 fw-gnome NetworkManager[533]: <info> [1652480047.6925] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:15:02 fw-gnome NetworkManager[533]: <info> [1652480102.7168] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:15:58 fw-gnome NetworkManager[533]: <info> [1652480158.7179] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:16:51 fw-gnome NetworkManager[533]: <info> [1652480211.7158] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:17:46 fw-gnome NetworkManager[533]: <info> [1652480266.7182] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:18:42 fw-gnome NetworkManager[533]: <info> [1652480322.6949] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:19:35 fw-gnome NetworkManager[533]: <info> [1652480375.7190] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:20:30 fw-gnome NetworkManager[533]: <info> [1652480430.7248] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:21:25 fw-gnome NetworkManager[533]: <info> [1652480485.7209] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:22:19 fw-gnome NetworkManager[533]: <info> [1652480539.7179] dhcp4 (wlp170s0): state changed new lease, address=192.168.0.199
May 13 18:23:02 fw-gnome kernel: sysrq: Keyboard mode set to system default
May 13 18:23:04 fw-gnome kernel: sysrq: Terminate All Tasks
May 13 18:23:04 fw-gnome systemd-journald[299]: Received SIGTERM.
May 13 18:23:04 fw-gnome kernel: fbcon: Taking over console
May 13 18:23:04 fw-gnome bluetoothd[511]: Terminating
May 13 18:23:04 fw-gnome wpa_supplicant[1158]: p2p-dev-wlp170s: CTRL-EVENT-DSCP-POLICY clear_all
May 13 18:23:04 fw-gnome wpa_supplicant[1158]: p2p-dev-wlp170s: CTRL-EVENT-DSCP-POLICY clear_all
May 13 18:23:04 fw-gnome wpa_supplicant[1158]: nl80211: deinit ifname=p2p-dev-wlp170s disabled_11b_rates=0
May 13 18:23:04 fw-gnome NetworkManager[533]: <info> [1652480584.6689] caught SIGTERM, shutting down normally.
May 13 18:23:04 fw-gnome avahi-daemon[509]: Got SIGTERM, quitting.
May 13 18:23:04 fw-gnome boltd[549]: Error releasing name org.freedesktop.bolt: The connection is closed
May 13 18:23:04 fw-gnome ModemManager[548]: <info> caught signal, shutting down...
May 13 18:23:04 fw-gnome audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-oomd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May 13 18:23:04 fw-gnome kernel: audit: type=1131 audit(1652480584.671:127): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-oomd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May 13 18:23:04 fw-gnome systemd[1]: systemd-oomd.service: Deactivated successfully.
May 13 18:23:04 fw-gnome audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=bluetooth comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May 13 18:23:04 fw-gnome avahi-daemon[509]: Leaving mDNS multicast group on interface wlp170s0.IPv6 with address 2601:18d:4500:4b00::b9f3.
May 13 18:23:04 fw-gnome kernel: audit: type=1131 audit(1652480584.672:128): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=bluetooth comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May 13 18:23:04 fw-gnome systemd[1]: systemd-oomd.service: Consumed 4.825s CPU time.
May 13 18:23:04 fw-gnome avahi-daemon[509]: Leaving mDNS multicast group on interface wlp170s0.IPv4 with address 192.168.0.199.
That is obviously just a snippet of the journal; I tossed the whole file into the pastebin if anyone fancies a peek: Garuda's PrivateBin
I'm sure this message was there before when I reviewed the log, just not quite so many. A new NetworkManager "state changed new lease" entry is listed every minute or so for the whole time we were at the park.
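To put a number on "every minute or so", the renewal lines can be counted and averaged straight from the journal. A minimal sketch, assuming the default journalctl output format shown above; lease_summary is a hypothetical helper name, not a NetworkManager tool. It relies on the epoch timestamp NetworkManager prints in square brackets, so it does not need to parse the "May 13 ..." date.

```shell
# Read journal text on stdin and summarise the DHCP lease renewals that
# NetworkManager logged. Only lines containing "dhcp4 ... state changed new
# lease" are used; the epoch seconds in [brackets] give the timing.
lease_summary() {
    sed -n 's/.*\[\([0-9][0-9]*\)\.[0-9]*\] dhcp4 .*state changed new lease.*/\1/p' |
    awk '
        NR == 1 { first = $1 }
                { last = $1; n++ }
        END {
            if (n < 2) { printf "%d renewals\n", n; exit }
            printf "%d renewals, mean interval %d s\n", n, (last - first) / (n - 1)
        }'
}

# Example against the current boot:
#   journalctl -b -u NetworkManager | lease_summary
```

Fed the first three lease lines of the excerpt above, this reports 3 renewals with a mean interval of 54 s.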
I started thinking maybe the issue is related to the network card in the laptop. Bizarre, I would say (because what business does Wayland have interfering with the network card?), but then again NetworkManager appears to be the only service not starting when broken out of the initial ramdisk and into the Wayland session.
I decided to test by installing garuda-gnome-linux-zen-220329 on a spare partition on my desktop system, which has an altogether different hardware setup including an older network card.
No hanging at the Plymouth screen, no sysrq intervention required, it just boots normally.
So… my suspicious eye has come to rest on the relatively new network card (Intel Wi-Fi 6E AX210) in the laptop, although it is still rather unclear to me what is happening.
It doesn't seem like Wayland is to blame, because not only does that just seem insane but also my primary installation on this device is Garuda Sway. I haven't lifted a finger configuring the network card on Sway; it worked OOTB and has never so much as flinched.
Buried in the Gnome/Wayland stack somewhere, something gets changed when you enable GDM Wayland that appears to have an effect on the kernel or firmware level.
Blacklist the iwlwifi module to test if your wifi is responsible. If you can then boot into wayland successfully, manually modprobe iwlwifi to initiate your network connection.
From the logs, it seems Wayland is tried first and fails. Gnome/gdm error out and then try Xorg, which also fails. After reset, it seems to work, so it's not a terminal/final error. Messages talk about drm something. Maybe a Gnome bug, or network, which seems to start errors.
Relevant log
May 13 16:46:46 fw-gnome gnome-shell[717]: Running GNOME Shell (using mutter 42.1) as a Wayland display server
May 13 16:46:46 fw-gnome NetworkManager[533]: <info> [1652474806.3470] device (wlp170s0): set-hw-addr: set MAC address to 1E:71:53:A0:29:A7 (scanning)
May 13 16:46:46 fw-gnome kernel: iwlwifi 0000:aa:00.0: WRT: Failed to set DRAM buffer for alloc id 1, ret=-1
May 13 16:46:46 fw-gnome kernel: iwlwifi 0000:aa:00.0: WRT: Failed to set DRAM buffer for alloc id 2, ret=-1
May 13 16:46:46 fw-gnome kernel: iwlwifi 0000:aa:00.0: WRT: Failed to set DRAM buffer for alloc id 3, ret=-1
May 13 16:46:46 fw-gnome gnome-shell[717]: Failed to open atomic modesetting backend: GDBus.Error:System.Error.EBUSY: Device or resource busy
May 13 16:46:46 fw-gnome gnome-shell[717]: g_hash_table_destroy: assertion 'hash_table != NULL' failed
May 13 16:46:46 fw-gnome gnome-shell[717]: Failed to open legacy modesetting backend: GDBus.Error:System.Error.EBUSY: Device or resource busy
May 13 16:46:46 fw-gnome gnome-shell[717]: Failed to open gpu '/dev/dri/card0': No suitable mode setting backend found
May 13 16:46:46 fw-gnome org.gnome.Shell.desktop[717]: Failed to setup: No GPUs found
May 13 16:46:46 fw-gnome gnome-session[640]: gnome-session-binary[640]: WARNING: App 'org.gnome.Shell.desktop' exited with code 1
May 13 16:46:46 fw-gnome gnome-session-binary[640]: WARNING: App 'org.gnome.Shell.desktop' exited with code 1
May 13 16:46:46 fw-gnome gnome-session-binary[640]: Unrecoverable failure in required component org.gnome.Shell.desktop
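To pull the same kind of lines out of a full boot log, something like the following could work. It is only a sketch: display_failures is a made-up helper, and the pattern is based purely on the messages quoted above, so failures worded differently would be missed.

```shell
# Keep only display-stack failure lines from journal text on stdin:
# messages from gnome-shell, gnome-session or org.gnome.Shell that contain
# "Failed to", "Unrecoverable" or "exited with code".
display_failures() {
    grep -E '(gnome-shell|gnome-session|org\.gnome\.Shell)[^:]*: .*(Failed to|Unrecoverable|exited with code)'
}

# Example:
#   journalctl -b | display_failures
```

Kernel messages such as the iwlwifi "Failed to set DRAM buffer" lines are intentionally not matched, which makes it easier to see the GNOME-side errors on their own.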
I hate to admit it, but I really fumbled with trying to get this module blacklisted.
Following along with the wiki post here, I made file /etc/modprobe.d/bl_iwlwifi.conf and added blacklist iwlwifi, then regenerated the initramfs using sudo mkinitcpio -P and rebooted.
This did not change the broken Plymouth experience, and after logging in mkinitcpio -M still listed iwlwifi. No problem; the wiki notes:
Note: The blacklist command will blacklist a module so that it will not be loaded automatically, but the module may be loaded if another non-blacklisted module depends on it or if it is loaded manually.
However, there is a workaround for this behaviour; the install command instructs modprobe to run a custom command instead of inserting the module in the kernel as normal, so you can force the module to always fail loading with:
/etc/modprobe.d/blacklist.conf
… install module_name /bin/true …
This will effectively blacklist that module and any other that depends on it.
I added install iwlwifi /bin/true to my bl_iwlwifi.conf, regenerated the initramfs again and rebooted.
This also did not affect the Plymouth hang, and mkinitcpio -M still listed iwlwifi. I started wondering if sudo mkinitcpio -P was perhaps the incorrect way to regenerate the initramfs, and after reading through the man pages a little I switched to sudo mkinitcpio -p linux-zen, but this did not yield a different result.
I tried commenting out the "blacklist" line, thinking perhaps I am not supposed to have both the "blacklist" and "install" lines, but that did not appear to change anything. I found this old post that says to use install module /bin/false (instead of /bin/true) so I tried that. It's unclear to me if this distinction makes any difference.
After running through all possible combinations of these noted variations, I started to wonder if mkinitcpio -M was a reliable method for determining if the module was blacklisted (perhaps it shows up on the module list whether it is blacklisted or not?) so I checked lspci -v and found this output:
aa:00.0 Network controller: Intel Corporation Wi-Fi 6 AX210/AX211/AX411 160MHz (rev 1a)
Subsystem: Intel Corporation Wi-Fi 6 AX210 160MHz
Flags: bus master, fast devsel, latency 0, IRQ 17, IOMMU group 18
Memory at 7a200000 (64-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: iwlwifi
Kernel modules: iwlwifi
Hmm… doesn't look blacklisted, does it?
But then I restarted NetworkManager and fired up the browser, only to discover that it was not connected to my network and, oh by the way, WiFi was not even an option in the network settings.
I backtracked and commented out the install iwlwifi /bin/false line (leaving just the blacklist line) and the behavior was the same: no change to the Plymouth/login/NetworkManager situation, but WiFi is not an available resource.
So I guess the blacklist is working correctly? I was expecting it to be a bit more obvious (for example, that it would come off of the mkinitcpio -M and lspci -v outputs), but there is no denying that blacklisting the module takes the network card down.
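A more direct check than either of those outputs is whether the module is actually loaded right now, which is what lsmod reports from /proc/modules. A minimal sketch, assuming a hypothetical helper named module_loaded (the optional second argument is only there so the function can be tried against a sample file):

```shell
#!/bin/sh
# Sketch: succeed if the named kernel module appears in /proc/modules
# (the same data lsmod prints). module_loaded is a hypothetical helper.
module_loaded() {
    # Loaded modules are listed with underscores, so normalise hyphens.
    name=$(printf '%s' "$1" | sed 's/-/_/g')
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}
```

Usage would be along the lines of `module_loaded iwlwifi && echo loaded || echo "not loaded"`.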
All that to say, it looks like WiFi is perhaps not responsible for the Plymouth/initial ramdisk hang after all.
That was expected. It doesn't make sense for a network driver to break GDM or a video module.
Nevertheless, you may want to know about a bug with this (I think) wifi card/module. There are some wacky workarounds, but the bug is open.
I would suggest testing SDDM (and/or KDE) or LightDM, to see whether the problem is GDM only or Gnome itself.
My experience with Wayland in the past was awful, but I recently tried Gnome and KDE on Wayland again. They feel very stable and improved, with KDE having more small usability issues.
I say this because my video setup is far from usual/normal and still Wayland works fine.
I use LightDM and can start any DM, any session type with the same user account, on a desktop with Intel and nvidia390, one monitor on each GPU output, without issues.
I don't know if GDM has to start in Wayland or in Xorg, or which is better. I don't listen to Gnome or KDE devs. I just use what works best for my workflow, and that is BSPWM.
A last thought is to check Intel gpu troubleshooting, for possible workarounds, something like this or this.
That is curious about the WiFi card bug. I have never had any issues with it myself (unless you count this weird GDM thing), although I do know there are two versions of the AX210; one is the "vPro". The vPro version enables a bunch of enterprise-specific functionality that is apparently rather broken on Linux. Some poor folks scooped up the vPro thinking the "pro" meant it was better, but then they have trouble getting normal functionality out of it.
In the bug report, not a single person mentions one way or the other which version they are using, so it makes me wonder if perhaps some of those folks are running the vPro without realizing it's not fully supported on Linux.
That first link appears to be related to Xorg configuration. The second link says:
If using "late start" KMS and the screen goes blank when "Loading modules", it may help to add i915 and intel_agp to the initramfs.
I started to look into this, but it appears those modules are included in the initramfs by default. I am not certain, but is it possible Garuda does not use "late start" KMS?
I spent about an hour reading through @SonarMonkey's exhaustive Gnome/GDM/Wayland threads again (by the way, I haven't seen him in a while…) searching for clues I may have missed before circling back to this:
I slapped SDDM on the box and wouldn't you know it? It booted up just fine. Gnome warned me that the lock screen will be disabled because it relies on GDM, but other than that it works, with no hang on the initial ramdisk. Correct me if I am misinterpreting this, but that seems to strongly imply the misconfiguration lies in GDM.
Is there any way to review specifically what happens when a user checks the box in Garuda Assistant -> Settings -> Enable GDM Wayland? The user-facing changes appear to be limited to commenting out the one line in /etc/gdm/custom.conf, and that's pretty much it. But there must be something else happening.
The reason I ask is that if the way the Garuda tool implements Wayland on Gnome has a problem, perhaps we could find it and fix it. If not, I guess I'm at the point where maybe I'll just collect my notes, put together a bug report for Gnome, and… move on with my life!
We have already checked that.
It just runs one of the two scripts on/off
IIRC, while they are included, they start later, whereas adding them to MODULES makes them start early. Having tested this much already, you might want to test this as well…
I see what you mean; the script is very simple, with not really any room for unintended consequences.
How come MOZ_ENABLE_WAYLAND=1 is a commented-out line in the Wayland enable script? I'm just curious; it seems like it should be on (I'm guessing it was causing problems somewhere?).
I've read through the initramfs material a few times trying to figure out if I am missing something, but I guess I'm still thinking these are listed in "modules" and should be starting early.
This is the first chunk of my /etc/mkinitcpio.conf:
# vim:set ft=sh
# MODULES
# The following modules are loaded before any boot hooks are
# run. Advanced users may wish to specify all system modules
# in this array. For instance:
# MODULES=(crc32c-intel intel_agp i915 amdgpu radeon nouveau)
MODULES=(crc32c-intel intel_agp i915 amdgpu radeon nouveau)
i915 and intel_agp are already listed there (I did not add them).
Am I looking at the right thing?
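One way to double-check which modules the active (uncommented) MODULES line names is to pull that line out of the config. This is only a sketch: list_early_modules is a hypothetical helper, and it assumes the array is written on a single line without quotes, as in the excerpt above.

```shell
#!/bin/sh
# Sketch: print the modules named in the uncommented MODULES=(...) line
# of a mkinitcpio.conf, one per line. list_early_modules is a
# hypothetical helper and assumes the array sits on a single line.
list_early_modules() {
    conf="${1:-/etc/mkinitcpio.conf}"
    # Lines starting with "#" do not match ^MODULES, so the example is skipped.
    line=$(sed -n 's/^MODULES=(\(.*\)).*$/\1/p' "$conf" | tail -n 1)
    # Unquoted on purpose: split the list on whitespace.
    for m in $line; do
        printf '%s\n' "$m"
    done
}
```

For example, `list_early_modules | grep -x i915` should print i915 only if it is in the active array rather than just in the commented example.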
Small update:
Instead of breaking the initial ramdisk hang with the SysRq SIGTERM, if I switch to tty2, log in, and run
sudo systemctl restart gdm.service
it pops right to the normal GDM login screen. From there, logging in (again) loads a proper Gnome desktop in a Wayland session.
I'm not sure why I didn't think to try this sooner, just from a troubleshooting standpoint. Obviously this isn't much of an improvement; it's just as hacky as the SysRq/login/sudo NetworkManager restart method (and about twice as much typing, if you count the double login), although logging in with this method has the benefit of automatically connecting to the network (since NetworkManager never gets killed off).
It seems the NetworkManager/state changed new lease error message is a red herring; the GDM service is what hangs.
This almost sounds like a race condition is happening. I am not familiar with Gnome in the least, as I have been a KDE fan since day 1 of using Linux. My take on this is that perhaps adding a 2 or 3 second delay before starting the GDM service may help work around this issue. Sure, that's still hacky, but at least it won't require any more manual intervention if you use a systemd drop-in.
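A minimal sketch of what such a drop-in could look like, written as a shell helper so it can be tried in a scratch directory first. The helper name write_gdm_delay and the 3-second default are assumptions for illustration; ExecStartPre= is the systemd setting that runs a command before the main service process starts.

```shell
#!/bin/sh
# Sketch: create a systemd drop-in that delays GDM's start by a few seconds.
# write_gdm_delay is a hypothetical helper; the 3-second default is a guess.
write_gdm_delay() {
    seconds="${1:-3}"
    dir="${2:-/etc/systemd/system/gdm.service.d}"   # default location needs root
    mkdir -p "$dir" || return 1
    # The drop-in only adds a pre-start sleep; everything else in the
    # stock gdm.service stays as it is.
    cat > "$dir/override.conf" <<EOF
[Service]
ExecStartPre=/usr/bin/sleep $seconds
EOF
}
```

After writing the file for real, `sudo systemctl daemon-reload` makes systemd pick it up, and `systemctl cat gdm.service` should then list the drop-in underneath the stock unit. Whether a delay actually avoids the hang is untested here.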
I am interested in trying this; however, this is a bit beyond my depth. I don't think I had ever heard of a systemd drop-in until now. I've read through this, which is clear enough and led me to discover sudo systemctl edit gdm:
### Editing /etc/systemd/system/gdm.service.d/override.conf
### Anything between here and the comment below will become the new contents of the file
### Lines below this comment will be discarded
### /usr/lib/systemd/system/gdm.service
# [Unit]
# Description=GNOME Display Manager
#
# # replaces the getty
# Conflicts=getty@tty1.service
# After=getty@tty1.service
#
# # replaces plymouth-quit since it quits plymouth on its own
# Conflicts=
# After=
#
# # Needs all the dependencies of the services it's replacing
# # pulled from getty@.service and
# # (except for plymouth-quit-wait.service since it waits until
# # plymouth is quit, which we do)
# After=rc-local.service plymouth-start.service systemd-user-sessions.service
#
# # GDM takes responsibility for stopping plymouth, so if it fails
# # for any reason, make sure plymouth still stops
# OnFailure=plymouth-quit.service
#
# [Service]
# ExecStart=/usr/bin/gdm
# KillMode=mixed
# Restart=always
# IgnoreSIGPIPE=no
# BusName=org.gnome.DisplayManager
# EnvironmentFile=-/etc/locale.conf
# ExecReload=/bin/kill -SIGHUP $MAINPID
# KeyringMode=shared
#
# [Install]
# Alias=display-manager.service
It looks like a nice easy way to make a drop-in file, with some suggested lines and everything. Nice!
I noticed the "restart" option mentioned in both the edit gdm file and the wiki post, which sounded like it might work, so I tossed this in there:
[Service]
Restart=always
RestartSec=10
It did not work, although I honestly am not sure how to even check if this syntax is correct, or if what I am asking for is what I think I am asking for.
I backed up and took a look here to investigate the "wait" command (that seems like it would be a cleaner solution anyhow), but I'm not sure how to introduce this command into the drop-in file.
If the wait utility is invoked with no operands, it shall wait until all process IDs known to the invoking shell have terminated and exit with a zero exit status.
Okay, fine; that sounds good. But to use the wait command it looks like you are meant to specify a PID. I'm not certain there is a way to determine what the PID of GDM will be, is there?
I read a little further into the wait man pages, but it quickly gets a bit dense (my understanding of bash scripting is very rudimentary).
Hoping to get lucky, I tossed wait (just the single word, by itself) into the edit gdm file, but I'm afraid it's not quite that simple.
Do you have any advice for how to set this up properly?
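For context on the wait command: it is a shell builtin that only knows about background jobs started by the same shell, which is why a bare wait in a unit file (which is not a shell script) has nothing to act on. A small sketch of how it behaves inside a script, with a subshell standing in for a long-running child (the exit code 3 is arbitrary):

```shell
#!/bin/sh
# Sketch: wait collects the exit status of a background job that was
# started by this same shell.
( sleep 1; exit 3 ) &      # stand-in for some long-running child process
pid=$!                     # $! holds the PID of the most recent background job
status=0
wait "$pid" || status=$?   # blocks until that child exits, then keeps its status
echo "child $pid exited with status $status"
```

The PID comes from `$!` right after the job is started, so there is no need to know it in advance, but this only works for children of the current shell.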
I suspect (and hope…) you haven't added any of those yourself. Otherwise, explain why you added crc32c-intel. Read this.
Since you have an Intel GPU, I would suggest removing all those modules and rebuilding the images.
Unless you need amdgpu, nouveau, radeon, which suggests you are… hiding some critical info.
Restart=always is already set in the stock unit, so it would not be needed if you do end up creating a drop-in.
Wise choice, Watson! Wrong translation of mine. Don't play with those things… unless you know what you are doing.