Professional Opinions: Disk Failure, How long do I have?

This has been happening for a while, but I never really paid attention to the SMART disk check on my HP. I decided to look into it and got the output below. I just want to know whether this thing is bluffing, and how long I might have to back everything up, if you have ever been down this rabbit hole.

Smartctl health check

Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   008   008   036    Pre-fail  Always   FAILING_NOW 11130
System:
  Kernel: 6.8.9-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc v: 13.2.1 clocksource: tsc
    avail: acpi_pm parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-zen
    root=UUID=0cf7cffe-ae81-4b74-bea3-a5a644ba91bf rw rootflags=subvol=@ quiet loglevel=3 ibt=off
  Desktop: Hyprland v: 0.39.1 with: waybar tools: avail: hyprlock,xautolock vt: 8 dm: LightDM
    v: 1.32.0 Distro: Garuda base: Arch Linux
Machine:
  Type: Laptop System: Hewlett-Packard product: HP EliteBook 840 G2 v: A3009D510203
    serial: <superuser required> Chassis: type: 10 serial: <superuser required>
  Mobo: Hewlett-Packard model: 2216 v: KBC Version 96.56 serial: <superuser required>
    part-nu: N0Y03UC#ABU uuid: <superuser required> UEFI: Hewlett-Packard v: M71 Ver. 01.09
    date: 09/01/2015
Battery:
  ID-1: BAT0 charge: 11.1 Wh (98.2%) condition: 11.3/11.3 Wh (100.0%) volts: 12.4 min: 11.4
    model: Hewlett-Packard Primary type: Li-ion serial: <filter> status: not charging
CPU:
  Info: model: Intel Core i5-5300U bits: 64 type: MT MCP arch: Broadwell gen: core 5 level: v3
    note: check built: 2015-18 process: Intel 14nm family: 6 model-id: 0x3D (61) stepping: 4
    microcode: 0x2F
  Topology: cpus: 1x cores: 2 tpc: 2 threads: 4 smt: enabled cache: L1: 128 KiB
    desc: d-2x32 KiB; i-2x32 KiB L2: 512 KiB desc: 2x256 KiB L3: 3 MiB desc: 1x3 MiB
  Speed (MHz): avg: 2694 min/max: 500/2900 scaling: driver: intel_cpufreq governor: schedutil
    cores: 1: 2694 2: 2694 3: 2694 4: 2694 bogomips: 18357
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities: <filter>
Graphics:
  Device-1: Intel HD Graphics 5500 vendor: Hewlett-Packard ZBook 15u G2 Mobile Workstation
    driver: i915 v: kernel arch: Gen-8 process: Intel 14nm built: 2014-15 ports: active: eDP-1
    empty: DP-1, DP-2, HDMI-A-1, HDMI-A-2 bus-ID: 00:02.0 chip-ID: 8086:1616 class-ID: 0300
  Device-2: Cheng Uei Precision Industry (Foxlink) HP EliteBook integrated HD Webcam
    driver: uvcvideo type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 2-7:5
    chip-ID: 05c8:0374 class-ID: 0e02
  Display: wayland server: X.org v: 1.21.1.13 with: Xwayland v: 23.2.6 compositor: Hyprland
    v: 0.39.1 driver: X: loaded: modesetting alternate: fbdev,intel,vesa dri: iris gpu: i915
    display-ID: 1
  Monitor-1: eDP-1 model: AU Optronics 0x233e built: 2012 res: 1600x900 dpi: 132 gamma: 1.2
    size: 309x174mm (12.17x6.85") diag: 355mm (14") ratio: 16:9 modes: 1600x900
  API: Vulkan v: 1.3.279 layers: 4 device: 0 type: integrated-gpu name: Intel HD Graphics 5500
    (BDW GT2) driver: mesa intel v: 24.0.6-arch1.2 device-ID: 8086:1616 surfaces: xcb,xlib,wayland
    device: 1 type: cpu name: llvmpipe (LLVM 17.0.6 256 bits) driver: mesa llvmpipe
    v: 24.0.6-arch1.2 (LLVM 17.0.6) device-ID: 10005:0000 surfaces: xcb,xlib,wayland
  API: EGL Message: EGL data requires eglinfo. Check --recommends.
Audio:
  Device-1: Intel Broadwell-U Audio vendor: Hewlett-Packard driver: snd_hda_intel v: kernel
    bus-ID: 00:03.0 chip-ID: 8086:160c class-ID: 0403
  Device-2: Intel Wildcat Point-LP High Definition Audio vendor: Hewlett-Packard
    driver: snd_hda_intel v: kernel bus-ID: 00:1b.0 chip-ID: 8086:9ca0 class-ID: 0403
  API: ALSA v: k6.8.9-zen1-1-zen status: kernel-api tools: N/A
  Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
  Server-2: PipeWire v: 1.0.5 status: active with: 1: pipewire-pulse status: active
    2: wireplumber status: active 3: pipewire-alsa type: plugin 4: pw-jack type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Ethernet I218-LM vendor: Hewlett-Packard driver: e1000e v: kernel port: 5080
    bus-ID: 00:19.0 chip-ID: 8086:15a2 class-ID: 0200
  IF: enp0s25 state: down mac: <filter>
  Device-2: Intel Wireless 7265 driver: iwlwifi v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1
    bus-ID: 03:00.0 chip-ID: 8086:095a class-ID: 0280
  IF: wlo1 state: up mac: <filter>
  Info: services: NetworkManager, systemd-timesyncd, wpa_supplicant
Bluetooth:
  Device-1: Intel Bluetooth wireless interface driver: btusb v: 0.8 type: USB rev: 2.0
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 2-4:3 chip-ID: 8087:0a2a class-ID: e001
  Report: btmgmt ID: hci0 rfk-id: 0 state: down bt-service: enabled,running rfk-block:
    hardware: no software: yes address: <filter> bt-v: 4.0 lmp-v: 6 status: discoverable: no
    pairing: no
RAID:
  Hardware-1: Intel 82801 Mobile SATA Controller [RAID mode] driver: ahci v: 3.0 port: 5060
    bus-ID: 00:1f.2 chip-ID: 8086:282a rev: N/A class-ID: 0104
Drives:
  Local Storage: total: 931.51 GiB used: 26.83 GiB (2.9%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/sda maj-min: 8:0 vendor: Seagate model: ST1000LM035-1RK172 size: 931.51 GiB
    block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s tech: HDD rpm: 5400 serial: <filter>
    fw-rev: RSM7 scheme: GPT
Partition:
  ID-1: / raw-size: 491.68 GiB size: 491.68 GiB (100.00%) used: 26.83 GiB (5.5%) fs: btrfs
    dev: /dev/sda2 maj-min: 8:2
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%) used: 584 KiB (0.2%) fs: vfat
    dev: /dev/sda1 maj-min: 8:1
  ID-3: /home raw-size: 491.68 GiB size: 491.68 GiB (100.00%) used: 26.83 GiB (5.5%) fs: btrfs
    dev: /dev/sda2 maj-min: 8:2
  ID-4: /var/log raw-size: 491.68 GiB size: 491.68 GiB (100.00%) used: 26.83 GiB (5.5%) fs: btrfs
    dev: /dev/sda2 maj-min: 8:2
  ID-5: /var/tmp raw-size: 491.68 GiB size: 491.68 GiB (100.00%) used: 26.83 GiB (5.5%) fs: btrfs
    dev: /dev/sda2 maj-min: 8:2
Swap:
  Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default) zswap: no
  ID-1: swap-1 type: zram size: 11.56 GiB used: 0 KiB (0.0%) priority: 100 comp: zstd
    avail: lzo,lzo-rle,lz4,lz4hc,842 max-streams: 4 dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 59.0 C mobo: N/A
  Fan Speeds (rpm): N/A
Info:
  Memory: total: 12 GiB available: 11.56 GiB used: 2.37 GiB (20.5%)
  Processes: 202 Power: uptime: 13m states: freeze,mem,disk suspend: deep avail: s2idle
    wakeups: 0 hibernate: platform avail: shutdown, reboot, suspend, test_resume image: 4.61 GiB
    services: upowerd Init: systemd v: 255 default: graphical tool: systemctl
  Packages: 1666 pm: pacman pkgs: 1650 libs: 443 tools: pacseek,paru pm: flatpak pkgs: 16
    Compilers: clang: 17.0.6 gcc: 13.2.1 Client: shell wrapper v: 5.2.26-release inxi: 3.3.34
Garuda (2.6.26-1):
  System install date:     2024-04-29
  Last full system update: 2024-05-08
  Is partially upgraded:   No
  Relevant software:       snapper NetworkManager dracut
  Windows dual boot:       No/Undetected
  Failed units: 

How long? Nobody can say, but not very long, I think.
I would immediately create an image of the device using dd, then unplug it and try to save the data from the image. Once that succeeds, you can do further analysis, e.g. with manufacturer tools.
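
A minimal sketch of that dd imaging step, not a definitive recipe. The function name image_disk and the example paths are placeholders, and conv=noerror,sync is one common choice for failing disks rather than anything the poster specified. Note that the image is the size of the whole disk (931.51 GiB here), not the 26.83 GiB in use, and it has to land on a different physical drive.

```shell
# image_disk SRC IMG [extra dd operands...]
# Copies SRC to IMG in 64 KiB blocks without stopping on read errors.
image_disk() {
    src=$1
    img=$2
    shift 2
    # conv=noerror: keep going after a read error instead of aborting
    # conv=sync:    pad failed or short reads with zeros so offsets stay aligned
    dd if="$src" of="$img" bs=64K conv=noerror,sync "$@"
}

# Example (as root; /mnt/external is a hypothetical mount point on another disk):
#   image_disk /dev/sda /mnt/external/sda.img status=progress
```

With these options a failed read zero-fills the whole 64 KiB block, so a smaller bs loses less data around each bad spot at the cost of a slower copy.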


Likely not much. SSDs fail VERY rapidly, within minutes to hours (and you never really know when they start to die), so it’s best to already have backups of your data by that point.

What about HDDs?

Same, but you’ll likely have more time :wink:
Usually they fail much more gracefully…unless it’s physical damage or something.

Hope it’s months? :joy: I have to back up 1 TB.

Who knows…but the faster you do it, the better.

These days I always buy two HDDs of exactly the same model to store data…no matter if it’s 1 or 16 TB, I always keep a full clone as a physically separate backup.


I do the same, but with a backup from 1 SSD to another SSD to an HDD.

What I have learned after a lifetime of making backups…

  • My data is most in danger from me.
  • Backups always fail.
  • They always fail when most needed.
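
One way to catch a bad backup before it is needed is to compare the copy against the source right after making it. A rough sketch, assuming both sides are readable as plain files or block devices of the same size; verify_copy is a made-up helper name and the example paths are placeholders.

```shell
# verify_copy SRC COPY
# Succeeds only if COPY is byte-for-byte identical to SRC.
verify_copy() {
    if cmp -s "$1" "$2"; then
        echo "OK: $2 matches $1"
    else
        echo "MISMATCH: $2 differs from $1" >&2
        return 1
    fi
}

# Example (hypothetical paths):
#   verify_copy /mnt/backup1/photos.tar /mnt/backup2/photos.tar
```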

Yeah, you should never trust yourself. As a :clown_face: :postal_horn: I fully approve! :rofl:

If the data is important, and you don’t have another backup, do not make this backup yourself.

Instead, turn off your computer as soon as possible, remove the drive from it and take it to a data recovery professional. If you are not comfortable removing the drive from the computer yourself, take the whole computer.

The longer this drive continues to spin, the greater the chance of data loss.


If you are getting that message, it means exactly what it says. The drive is dead and just hanging on for random reasons. You should assume it’s done now, as of the first time you saw that message. You can’t predict beyond that, except to say I stopped buying Seagates years ago due to DOA issues. Spinning-disk failures can be very hard to judge, and the only rational course is to back up the data and get a new drive, or clone the existing drive to a new one if you are short on storage. 1 TiB drives are quite cheap now; even SSDs are finally cheap at that size. However, keep in mind that a failing drive can develop block errors, which means you will start to lose blocks of your data/OS at random, without any way to know it’s happened.

This is why sudo inxi -Da shows the smartctl data; that’s critical to know.
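
For anyone wanting to read the attribute table directly: a SMART attribute counts as failed once its normalized VALUE drops to or below THRESH, which is what happened above (008 against 036 for Reallocated_Sector_Ct). A rough sketch, assuming smartctl -A style rows on stdin; failing_attrs is a made-up helper name, not part of smartmontools, and treating a threshold of 0 as "informational only" is my assumption.

```shell
# failing_attrs: read SMART attribute rows on stdin and print those whose
# normalized VALUE (column 4) is at or below THRESH (column 6).
failing_attrs() {
    awk '
        # Attribute rows start with a numeric ID and have at least 10 columns.
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            value = $4 + 0
            thresh = $6 + 0
            # Assumption: a threshold of 0 means the attribute cannot fail.
            if (thresh > 0 && value <= thresh)
                print $2 ": value " $4 ", threshold " $6 ", raw " $10
        }
    '
}

# Example (needs root and smartmontools installed):
#   sudo smartctl -A /dev/sda | failing_attrs
```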

The question is not how long it will last, it’s how important the data on your system is to you. If not at all, then let it die and then replace it; if it has any value, you must replace it, that’s not optional. Go with Western Digital next time for a spinning HDD, and a good name brand for an SSD, one with very high write/read cycle life, like Crucial, Samsung, or Intel. I got a 2 TiB Crucial NVMe for $67 recently, so you’re talking probably close to $30 for a 1 TiB SSD. SSDs are not prone to failure from dropping or bumping, which is a significant problem for HDDs, particularly in laptops, where I’d put the HDD life at easily 1/2 to 1/4 of the equivalent full-size desktop drive. Since you have a laptop, you should assume your drive is only booting due to divine intervention at this point, so it’s simply up to you to decide how deep your faith in that divine hand is.


The best is when you’re doing the backup and it fails, and the drive you were backing up from dies. That’s happened to me several times. For important things I now back up to tape once a month.


Get a new 1 TB hard drive and clone your other drive to it using Clonezilla.


I wouldn’t advise that, it’s a very risky thing to do. It is quite likely that the old drive will just fail in the middle of that process.


It’s not any less risky than using dd to clone the drive to a file :person_shrugging:


Thanks @all for the input.

Update:

  • I noticed some of what you are saying. I use tmux, and for some reason accessing more than 32bit from CPU, I might not have phrased this right

In short, things were beginning to fail at the OS level.
Sadly, I had to reinstall Garuda on a spare drive I’ve got. I definitely should have written an Ansible script, since I have spent the whole day restoring my system. Anyway, thanks for everybody’s input. I think all the answers were great.


Let’s all raise our glasses for that drive!
R.I.P.

