Hello,
I am posting here because this happened on TWO independent computers, both running Garuda Linux, within two days and during the same procedure.
I have the following setup.
- PHP project in a VirtualBox/Vagrant setup
- One folder mounted via NFS with the following parameters (Vagrantfile)
config.nfs.map_uid = Process.uid
config.nfs.map_gid = Process.gid
config.vm.synced_folder "/home/foo/projects/php-project", "/home/project-root/current", # left being host machine, right being vm
id: "i",
type: 'nfs',
nfs_version: 4,
nfs_udp: false,
mount_options: ["actimeo=1", "nolock"],
:nfs => { :mount_options => ["dmode=0755","fmode=0664"] }
- I run
composer update
in the project folder to install all dependencies. It aborts because it cannot delete an old dependency folder.
Result:
- The BTRFS partition becomes read-only.
Error message I could extract from journalctl:
Jan 04 09:04:10 foo kernel: BTRFS critical (device dm-0): corrupt leaf: root=18446744073709551610 block=64411566080 slot=31, bad key order, prev (107391 96 39>
Jan 04 09:04:10 foo kernel: BTRFS info (device dm-0): leaf 64411566080 gen 17047 total ptrs 181 free space 1727 owner 18446744073709551610
Jan 04 09:04:10 foo kernel: item 0 key (470 1 0) itemoff 16123 itemsize 160
Jan 04 09:04:10 foo kernel: inode generation 52 size 2559483904 mode 100644
Everything else is basically lost, because after that the filesystem becomes read-only and dmesg
is full of messages saying that things cannot be written.
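Side note on the numbers in that excerpt: root=18446744073709551610 looks like a small negative value printed as an unsigned 64-bit integer. Below is a minimal sketch to decode it. The helper name is made up for illustration, and reading -6 as the btrfs tree-log ID is an assumption based on the BTRFS_TREE_LOG_OBJECTID constant in the kernel headers, not something confirmed for this case.

```python
def as_signed64(value: int) -> int:
    """Return the two's-complement signed reading of an unsigned 64-bit value."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    # Values with the top bit set wrap around to negative numbers.
    return value - 2**64 if value >= 2**63 else value


# "root=" / "owner" value copied from the journalctl lines
root = 18446744073709551610
print(as_signed64(root))  # -6
```

If that reading is right, the corrupt leaf would belong to the log tree rather than to a regular subvolume, which might be worth mentioning when asking for help.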
What I have tried:
First machine
- Running
btrfsck --force /dev/md-0
Result: Invalid inodes on specific files in the vendor folder (the PHP project's equivalent of node_modules, where packages are installed when I run composer XXX). Sorry, I do not have the exact output anymore, because this was yesterday and it seemed like a contained problem then.
- I tried to restore a snapshot from that day, only to realize that snapshots are just files and the filesystem itself was broken. It did nothing and was probably a bad idea.
- Then I ran
btrfsck --repair /dev/md-0
which is strongly advised against basically everywhere (don't create that command then, I guess). It fixed problems that did not show up in the earlier btrfsck run, as well as the inode issues.
- Afterwards the system booted one time. I retried the PHP project, the filesystem went read-only again, and now it fails to boot completely.
Second machine
- Running:
# sudo btrfsck --force /dev/mapper/luks-8bbf76ae-7d61-4f11-baf6-975ba4e35aaa
Opening filesystem to check...
WARNING: filesystem mounted, continuing because of --force
Checking filesystem on /dev/mapper/luks-8bbf76ae-7d61-4f11-baf6-975ba4e35aaa
UUID: ccdd6485-2326-4b0e-a016-810aab378edd
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 76418887680 bytes used, no error found
total csum bytes: 68161524
total tree bytes: 2291154944
total fs tree bytes: 2104164352
total extent tree bytes: 99516416
btree space waste bytes: 360352256
file data blocks allocated: 1905105891328
 referenced 116512391168
That seems fine, but then why did the filesystem go into read-only mode?
Maybe it healed itself in the meantime.
I am basically lost right now.
I cannot really use the tools from the live stick on machine #1, because it is an old installation which still has Timeshift in use. I cannot use the old live system for it, because I cannot update it properly. Also, even after updating, it does not use the latest kernel, which I understand is strongly advised for working with BTRFS.
Machine #2 has not been investigated enough yet. As seen above, btrfsck currently reports no error, but that means nothing to me. I wanted to reinstall machine #1 today, but if neither machine works properly, I am hesitant.
I am pretty lost right now.
What do you guys recommend in this scenario?
P.S.: I will add garuda-inxi as soon as I find a way to copy & paste it without a file in between.
Update (garuda-inxi from machine #2):
# sudo garuda-inxi | xsel --clipboard --logfile /dev/null
System:
Kernel: 6.1.1-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 12.2.0
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
root=UUID=ccdd6485-2326-4b0e-a016-810aab378edd rw rootflags=subvol=@
quiet
cryptdevice=UUID=8bbf76ae-7d61-4f11-baf6-975ba4e35aaa:luks-8bbf76ae-7d61-4f11-baf6-975ba4e35aaa
root=/dev/mapper/luks-8bbf76ae-7d61-4f11-baf6-975ba4e35aaa quiet splash
rd.udev.log_priority=3 vt.global_cursor_default=0 loglevel=3 ibt=off
Desktop: KDE Plasma v: 5.26.4 tk: Qt v: 5.15.7 info: latte-dock
wm: kwin_x11 dm: SDDM Distro: Garuda Linux base: Arch Linux
Machine:
Type: Laptop System: LENOVO product: xxxxxxxxxx v: ThinkPad P51
serial: <filter> Chassis: type: 10 serial: <filter>
Mobo: LENOVO model: xxxxxxxxxx serial: <filter> UEFI: LENOVO
v: N1UET85W (1.59 ) date: 07/18/2022
Battery:
ID-1: BAT0 charge: 15.2 Wh (20.5%) condition: 74.1/90.0 Wh (82.4%)
volts: 10.6 min: 11.2 model: SMP 00NY493 type: Li-poly serial: <filter>
status: discharging cycles: 327
CPU:
Info: model: Intel Xeon E3-1505M v6 socket: BGA1440 (U3E1) note: check
bits: 64 type: MT MCP arch: Kaby Lake level: v3 note: check built: 2018
process: Intel 14nm family: 6 model-id: 0x9E (158) stepping: 9
microcode: 0xF0
Topology: cpus: 1x cores: 4 tpc: 2 threads: 8 smt: enabled cache:
L1: 256 KiB desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB
L3: 8 MiB desc: 1x8 MiB
Speed (MHz): avg: 3144 high: 3150 min/max: 800/4000 base/boost: 3000/3000
scaling: driver: intel_pstate governor: powersave volts: 1.1 V
ext-clock: 100 MHz cores: 1: 3150 2: 3150 3: 3150 4: 3150 5: 3103 6: 3150
7: 3150 8: 3150 bogomips: 48000
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Vulnerabilities:
Type: itlb_multihit status: KVM: Split huge pages
Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT
vulnerable
Type: mds mitigation: Clear CPU buffers; SMT vulnerable
Type: meltdown mitigation: PTI
Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
Type: retbleed mitigation: IBRS
Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
prctl
Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
sanitization
Type: spectre_v2 mitigation: IBRS, IBPB: conditional, RSB filling,
PBRSB-eIBRS: Not affected
Type: srbds mitigation: Microcode
Type: tsx_async_abort mitigation: TSX disabled
Graphics:
Device-1: Intel HD Graphics P630 vendor: Lenovo driver: i915 v: kernel
arch: Gen-9.5 process: Intel 14nm built: 2016-20 ports: active: eDP-1
empty: none bus-ID: 00:02.0 chip-ID: 8086:591d class-ID: 0300
Device-2: NVIDIA GM206GLM [Quadro M2200 Mobile] vendor: Lenovo
driver: nvidia v: 525.60.11 alternate: nouveau,nvidia_drm non-free: 525.xx+
status: current (as of 2022-12) arch: Maxwell code: GMxxx
process: TSMC 28nm built: 2014-19 pcie: gen: 1 speed: 2.5 GT/s lanes: 16
link-max: gen: 3 speed: 8 GT/s bus-ID: 01:00.0 chip-ID: 10de:1436
class-ID: 0302
Device-3: Acer Integrated Camera type: USB driver: uvcvideo bus-ID: 1-8:2
chip-ID: 5986:111c class-ID: 0e02 serial: <filter>
Display: x11 server: X.Org v: 21.1.6 with: Xwayland v: 22.1.7
compositor: kwin_x11 driver: X: loaded: modesetting,nvidia unloaded: nouveau
alternate: fbdev,intel,nv,vesa dri: iris gpu: i915 display-ID: :0
screens: 1
Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x285mm (20.00x11.22")
s-diag: 582mm (22.93")
Monitor-1: eDP-1 model: AU Optronics 0x61ed built: 2016 res: 1920x1080
hz: 60 dpi: 142 gamma: 1.2 size: 344x193mm (13.54x7.6") diag: 394mm (15.5")
ratio: 16:9 modes: 1920x1080
API: OpenGL v: 4.6 Mesa 22.3.1 renderer: Mesa Intel HD Graphics P630 (KBL
GT2) direct render: Yes
Audio:
Device-1: Intel CM238 HD Audio vendor: Lenovo driver: snd_hda_intel
v: kernel bus-ID: 00:1f.3 chip-ID: 8086:a171 class-ID: 0403
Device-2: NVIDIA GM206 High Definition Audio driver: snd_hda_intel
v: kernel pcie: speed: Unknown lanes: 63 link-max: gen: 6 speed: 64 GT/s
bus-ID: 01:00.1 chip-ID: 10de:0fba class-ID: 0403
Sound API: ALSA v: k6.1.1-zen1-1-zen running: yes
Sound Server-1: PulseAudio v: 16.1 running: no
Sound Server-2: PipeWire v: 0.3.63 running: yes
Network:
Device-1: Intel Ethernet I219-LM vendor: Lenovo driver: e1000e v: kernel
port: N/A bus-ID: 00:1f.6 chip-ID: 8086:15e3 class-ID: 0200
IF: enp0s31f6 state: down mac: <filter>
Device-2: Intel Wireless 8265 / 8275 driver: iwlwifi v: kernel pcie:
gen: 1 speed: 2.5 GT/s lanes: 1 bus-ID: 04:00.0 chip-ID: 8086:24fd
class-ID: 0280
IF: wlp4s0 state: up mac: <filter>
IF-ID-1: vboxnet0 state: up speed: 10 Mbps duplex: full mac: <filter>
Drives:
Local Storage: total: 704.24 GiB used: 73.49 GiB (10.4%)
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Lenovo
model: LENSE20256GMSP34MEAT2TA size: 238.47 GiB block-size: physical: 512 B
logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
rev: 2.8.8341 temp: 37.9 C scheme: GPT
SMART: yes health: PASSED on: 72d 21h cycles: 1,621
read-units: 9,158,435 [4.68 TB] written-units: 23,437,439 [11.9 TB]
ID-2: /dev/sda maj-min: 8:0 vendor: Seagate model: ST500LM021-1KJ152
family: Laptop HDD size: 465.76 GiB block-size: physical: 4096 B
logical: 512 B sata: 3.0 speed: 6.0 Gb/s type: HDD rpm: 7200
serial: <filter> rev: LIM1 temp: 36 C scheme: GPT
SMART: yes state: enabled health: PASSED on: 90d 22h cycles: 1508
Pre-Fail: attribute: Spin_Retry_Count value: 100 worst: 100 threshold: 97
Partition:
ID-1: / raw-size: 237.26 GiB size: 237.26 GiB (100.00%)
used: 73.47 GiB (31.0%) fs: btrfs block-size: 4096 B dev: /dev/dm-0
maj-min: 254:0 mapped: luks-8bbf76ae-7d61-4f11-baf6-975ba4e35aaa
ID-2: /boot/efi raw-size: 512 MiB size: 511 MiB (99.80%)
used: 14.2 MiB (2.8%) fs: vfat block-size: 512 B dev: /dev/nvme0n1p1
maj-min: 259:1
ID-3: /home raw-size: 237.26 GiB size: 237.26 GiB (100.00%)
used: 73.47 GiB (31.0%) fs: btrfs block-size: 4096 B dev: /dev/dm-0
maj-min: 254:0 mapped: luks-8bbf76ae-7d61-4f11-baf6-975ba4e35aaa
ID-4: /var/log raw-size: 237.26 GiB size: 237.26 GiB (100.00%)
used: 73.47 GiB (31.0%) fs: btrfs block-size: 4096 B dev: /dev/dm-0
maj-min: 254:0 mapped: luks-8bbf76ae-7d61-4f11-baf6-975ba4e35aaa
ID-5: /var/tmp raw-size: 237.26 GiB size: 237.26 GiB (100.00%)
used: 73.47 GiB (31.0%) fs: btrfs block-size: 4096 B dev: /dev/dm-0
maj-min: 254:0 mapped: luks-8bbf76ae-7d61-4f11-baf6-975ba4e35aaa
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default)
ID-1: swap-1 type: zram size: 31.07 GiB used: 0 KiB (0.0%) priority: 100
dev: /dev/zram0
Sensors:
System Temperatures: cpu: 48.0 C pch: 44.5 C mobo: N/A
Fan Speeds (RPM): fan-1: 2314 fan-2: 2329
Info:
Processes: 303 Uptime: 57m wakeups: 1 Memory: 31.07 GiB
used: 9.93 GiB (32.0%) Init: systemd v: 252 default: graphical
tool: systemctl Compilers: gcc: 12.2.0 Packages: 1430 pm: pacman pkgs: 1425
libs: 341 tools: pamac,paru pm: flatpak pkgs: 5 Shell: garuda-inxi (sudo)
default: Bash v: 5.1.16 running-in: yakuake inxi: 3.3.24
Garuda (2.6.12-1):
  System install date:     2022-09-04
  Last full system update: 2022-12-23
  Is partially upgraded:   No
  Relevant software:       NetworkManager
  Windows dual boot:       No/Undetected
  Snapshots:               Snapper
  Failed units:
Update #2
"Good" news: The problem seems to be reproducible.
This time I could extract the following from sudo dmesg --follow:
[ 1692.455310] BTRFS: Transaction aborted (error -17)
[ 1692.455340] WARNING: CPU: 5 PID: 48505 at fs/btrfs/inode.c:6508 btrfs_create_new_inode.cold+0x14c/0x1b8 [btrfs]
[ 1692.455415] Modules linked in: rpcrdma rdma_cm iw_cm ib_cm ib_core nfsd auth_rpcgss nfs_acl lockd grace sunrpc ccm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device qrtr bnep vboxnetflt(OE) intel_rapl_msr vboxnetadp(OE) intel_rapl_common vboxdrv(OE) uinput intel_tcc_cooling joydev x86_pkg_temp_thermal nvidia_drm(POE) mousedev intel_powerclamp iTCO_wdt nvidia_uvm(POE) nvidia_modeset(POE) ee1004 intel_pmc_bxt coretemp iTCO_vendor_support snd_hda_codec_hdmi snd_ctl_led btusb kvm_intel btrtl snd_hda_codec_realtek iwlmvm snd_hda_codec_generic uvcvideo btbcm kvm videobuf2_vmalloc snd_hda_intel irqbypass mac80211 btintel videobuf2_memops snd_intel_dspcfg rapl videobuf2_v4l2 intel_cstate snd_intel_sdw_acpi btmtk libarc4 snd_hda_codec intel_uncore psmouse videobuf2_common bluetooth snd_hda_core iwlwifi think_lmi videodev intel_lpss_pci snd_hwdepvfat i2c_i801 firmware_attributes_class thinkpad_acpi intel_lpss ecdh_generic wmi_bmof snd_pcm fat intel_wmi_thunderbolt mc crc16 e1000e cfg80211
[ 1692.455453] i2c_smbus ie31200_edac snd_timer intel_pch_thermal idma64 ledtrig_audio platform_profile rfkill snd soundcore nvidia(POE) i2c_hid_acpi i2c_hid acpi_pad mac_hid crypto_user fuse zram bpf_preload ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq dm_crypt cbc encrypted_keys trusted asn1_encoder tee dm_mod crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic rtsx_pci_sdmmc gf128mul ghash_clmulni_intel mmc_core serio_raw sha512_ssse3 atkbd aesni_intel libps2 nvme crypto_simd vivaldi_fmap cryptd nvme_core xhci_pci rtsx_pci nvme_common xhci_pci_renesas i8042 serio radeon amdgpu gpu_sched drm_ttm_helper intel_agp crc32c_intel i915 drm_buddy video wmi ttm drm_display_helper cec intel_gtt
[ 1692.455486] CPU: 5 PID: 48505 Comm: nfsd Tainted: P OE 6.1.1-zen1-1-zen #1 14158625220d9969ff9ca1425845f6e6a542f208
[ 1692.455488] Hardware name: LENOVO 20HJS1ER00/20HJS1ER00, BIOS N1UET85W (1.59 ) 07/18/2022
[ 1692.455489] RIP: 0010:btrfs_create_new_inode.cold+0x14c/0x1b8 [btrfs]
[ 1692.455543] Code: 19 00 00 eb cb 89 cf 89 8d 60 ff ff ff e8 59 a0 ff ff 8b 8d 60 ff ff ff 84 c0 74 4f 89 ce 48 c7 c7 30 21 a3 c1 e8 ca 18 9b cf <0f> 0b 8b 8d 60 ff ff ff 41 b8 01 00 00 00 eb ba 66 90 e9 7a ff ff
[ 1692.455544] RSP: 0018:ffffb356031839f8 EFLAGS: 00010286
[ 1692.455546] RAX: 0000000000000000 RBX: ffff8a2da0a1d428 RCX: 0000000000000027
[ 1692.455547] RDX: ffff8a339f961668 RSI: 0000000000000001 RDI: ffff8a339f961660
[ 1692.455548] RBP: ffffb35603183ad0 R08: 0000000000000001 R09: 00000000ffffffea
[ 1692.455549] R10: ffffffff9265b780 R11: 00000000fffff000 R12: ffffb35603183ae0
[ 1692.455550] R13: ffff8a2da0a1d23c R14: ffff8a2d2bc55428 R15: ffff8a2c50046a28
[ 1692.455551] FS: 0000000000000000(0000) GS:ffff8a339f940000(0000) knlGS:0000000000000000
[ 1692.455552] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1692.455553] CR2: 00007ffba024b000 CR3: 0000000643210006 CR4: 00000000003726e0
[ 1692.455555] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1692.455555] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1692.455557] Call Trace:
[ 1692.455558] <TASK>
[ 1692.455562] btrfs_create_common+0xd9/0x1d0 [btrfs 78b153f35e51259f27b8ddd90cff8ef77f51246f]
[ 1692.455610] vfs_mkdir+0x1e9/0x2c0
[ 1692.455615] nfsd_create_locked+0x1fd/0x2c0 [nfsd d77d7273737fb0c95bafb71c5fe56f732c41d9e8]
[ 1692.455643] nfsd_create+0x133/0x180 [nfsd d77d7273737fb0c95bafb71c5fe56f732c41d9e8]
[ 1692.455667] nfsd4_create+0x17c/0x3f0 [nfsd d77d7273737fb0c95bafb71c5fe56f732c41d9e8]
[ 1692.455696] nfsd4_proc_compound+0x3ad/0x6f0 [nfsd d77d7273737fb0c95bafb71c5fe56f732c41d9e8]
[ 1692.455722] nfsd_dispatch+0x16b/0x280 [nfsd d77d7273737fb0c95bafb71c5fe56f732c41d9e8]
[ 1692.455745] svc_process_common+0x284/0x5e0 [sunrpc a6ab4bea3b72c1c4d365975153183bc30e8e96ba]
[ 1692.455781] ? svc_recv+0x54c/0x910 [sunrpc a6ab4bea3b72c1c4d365975153183bc30e8e96ba]
[ 1692.455815] ? nfsd_svc+0x3b0/0x3b0 [nfsd d77d7273737fb0c95bafb71c5fe56f732c41d9e8]
[ 1692.455837] ? nfsd_shutdown_threads+0xa0/0xa0 [nfsd d77d7273737fb0c95bafb71c5fe56f732c41d9e8]
[ 1692.455859] svc_process+0xb1/0x100 [sunrpc a6ab4bea3b72c1c4d365975153183bc30e8e96ba]
[ 1692.455891] nfsd+0xd9/0x190 [nfsd d77d7273737fb0c95bafb71c5fe56f732c41d9e8]
[ 1692.455913] kthread+0xdb/0x110
[ 1692.455916] ? kthread_complete_and_exit+0x20/0x20
[ 1692.455918] ret_from_fork+0x1f/0x30
[ 1692.455922] </TASK>
[ 1692.455923] ---[ end trace 0000000000000000 ]---
[ 1692.455924] BTRFS: error (device dm-0: state A) in btrfs_create_new_inode:6508: errno=-17 Object already exists
[ 1692.455928] BTRFS info (device dm-0: state EA): forced readonly
[ 1752.987829] audit: type=1701 audit(1672823986.535:223): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=50432 comm="smbd" exe="/usr/bin/smbd" sig=6 res=1
[ 1752.988507] systemd-journald[372]: /var/log/journal/d8450ad6b6774f22ae0a6dc735559d4b/system.journal: Journal file corrupted, rotating.
[ 1752.988513] systemd-journald[372]: Failed to write entry to /var/log/journal/d8450ad6b6774f22ae0a6dc735559d4b/system.journal (24 items, 719 bytes), rotating before retrying: Bad message
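For anyone reading the trace above: the -17 in "Transaction aborted (error -17)" is a negated errno value. Below is a quick sketch to look up its symbolic name. The function name is hypothetical, and it assumes the script runs on Linux, where the userspace errno numbering matches the kernel's.

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Map a negative kernel return code such as -17 to its errno name and text."""
    number = abs(code)
    # errno.errorcode maps numeric values to names like "EEXIST".
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


print(describe_kernel_error(-17))
```

This should report EEXIST, consistent with the "Object already exists" text in the BTRFS error line, i.e. the mkdir issued by nfsd tried to create an inode item that was already present.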