Where to get information on a failed disk?

If your drives have individual LEDs, you can generate some disk activity to make the LEDs light up with:

dd if=/dev/sdb of=/dev/null 

Run this against each responsive disk in turn. The drive whose LED never lights up is the bad disk, found by process of elimination.
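The elimination step can be scripted so each disk is read for a fixed time, one after another. This is a minimal sketch, not part of the original answer: `blink_disk` is a hypothetical helper name, the 10-second default is arbitrary, and it assumes the `timeout` utility from GNU coreutils is available. It only reads from the device, so it does not modify any data.

```shell
#!/bin/sh
# blink_disk DEVICE [SECONDS]
# Hypothetical helper: read from DEVICE for a bounded time so its
# activity LED lights up. Read-only; nothing is written to the disk.
blink_disk() {
    dev=$1
    secs=${2:-10}   # arbitrary default duration
    echo "Reading from $dev for ${secs}s"
    # timeout (GNU coreutils) stops dd after $secs seconds. dd's exit
    # status is ignored because being cut short is the expected outcome.
    timeout "$secs" dd if="$dev" of=/dev/null bs=1M 2>/dev/null
    return 0
}

# Example (run as root), watching the drive bays while it runs:
#   for d in /dev/sd?; do blink_disk "$d" 10; done
```

Because each device name is printed before its read starts, you can match the name on screen to the bay whose LED is active at that moment.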

Author: Zhro

Updated on September 18, 2022

Comments

  • Zhro
    Zhro almost 2 years

    A disk inside my server has failed and I'm trying to figure out which one it is. I did not make a list of all the serial numbers, as I should have. I plan to do this, but in the meantime, can I pull any additional information from the running system?

    WARNING: Your hard drive is failing
    Device: /dev/sdc [SAT], unable to open device
    

    smartctl result:

    $smartctl --all /dev/sdc
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-514.2.2.el7.x86_64] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
    
    Smartctl open device: /dev/sdc failed: No such device
    

    Since the disk is no longer online, is there someplace I can still query information on it?

    Update

    Grepped dmesg for sdc:

    $dmesg | grep sdc
    [   12.074540] sd 0:0:2:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
    [   12.074542] sd 0:0:2:0: [sdc] 4096-byte physical blocks
    [   12.083407] sd 0:0:2:0: [sdc] Write Protect is off
    [   12.083410] sd 0:0:2:0: [sdc] Mode Sense: 7f 00 10 08
    [   12.084143] sd 0:0:2:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
    [   12.798801]  sdc: sdc1 sdc9
    [   12.807266] sd 0:0:2:0: [sdc] Attached SCSI disk
    [716178.562173] sd 0:0:2:0: [sdc] Synchronizing SCSI cache
    [716178.562252] sd 0:0:2:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
    

    Then grepped for those drives in fdisk:

    $fdisk -l 2>/dev/null | egrep -i '^disk /dev+.' | grep 3.00 | sort
    Disk /dev/sda: 3000.6 GB, 3000592982016 bytes, 5860533168 sectors
    Disk /dev/sdd: 3000.6 GB, 3000592982016 bytes, 5860533168 sectors
    Disk /dev/sds: 3000.6 GB, 3000592982016 bytes, 5860533168 sectors
    

    I only have three 3TB disks in this system and they are all online. However, the last one is all the way at the bottom of the fdisk list, at /dev/sds. If a disk drops out and then comes back online, is it reassigned the same dev id or a new one? This might be the drive.

    • Michael D.
      Michael D. over 7 years
      I would think that it's the same dev id because it's connected to the same physical socket on the mainboard. Maybe you can get more info from your disks using hdparm -I /dev/sdc
    • Michael D.
      Michael D. over 7 years
      Usually the SATA ports on the mainboard have numbers 1,2,3. I would assume that 1 is sda, 2 sdb and so forth.
    • Michael D.
      Michael D. over 7 years
      Try running hdparm -I on /dev/sda and /dev/sdb to get the serial numbers of the drives that are working; whichever drive is left over is the defective one.
    • Zhro
      Zhro over 7 years
      The mystery is that it was reported that /dev/sdc was failing. But there is no device by that ID and all drives are online.
  • peterh
    peterh over 6 years
    Good answer! Although most hard disks today no longer have LEDs.
  • bootbeast
    bootbeast over 6 years
    A lot of servers with front-panel disk slots still have some kind of indicator light.
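As an alternative to running hdparm -I on each working drive, the serial numbers suggested in the comments above can often be read from the symlink names udev creates. The following is a rough sketch under assumptions: `list_disk_ids` is a hypothetical helper name, and it assumes the system populates /dev/disk/by-id with links whose names embed the model and serial number, with partition links ending in `-partN`. Whether a given drive appears there depends on the distribution and controller.

```shell
#!/bin/sh
# list_disk_ids [DIR]
# Hypothetical helper: print "device<TAB>id" for every whole-disk
# symlink in DIR (default /dev/disk/by-id), sorted by device name.
list_disk_ids() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        # Skip partition links such as ata-MODEL_SERIAL-part1.
        case $link in *-part[0-9]*) continue ;; esac
        # The link target looks like ../../sdc; keep only the last component.
        printf '%s\t%s\n' "$(basename "$(readlink "$link")")" "$(basename "$link")"
    done | sort
}

# Example:
#   list_disk_ids
```

Any serial number printed on a drive label that does not show up in this list belongs to a drive the kernel currently cannot see, which narrows down the failed disk.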