How to recover a crashed Linux md RAID5 array?


Solution 1

First check the disks, try running smart selftest

for i in a b c d; do
    smartctl -s on -t long /dev/sd$i
done

It might take a few hours to finish, but you can check each drive's test status every few minutes, e.g.

smartctl -l selftest /dev/sda

If a disk's test status reports that it did not complete because of read errors, then that disk should be considered unsafe for md1 reassembly. After the self-tests finish, you can start trying to reassemble your array. Optionally, if you want to be extra cautious, move the disks to another machine before continuing (just in case of bad RAM/controller/etc.).
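
For example, a small loop like this sketch (same drive letters as above; adjust the list to your system) prints each drive's self-test log, so you can compare statuses such as "Completed without error", "Self-test routine in progress" or "Completed: read failure":

for i in a b c d; do
    echo "=== /dev/sd$i ==="
    smartctl -l selftest /dev/sd$i
done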

Recently, I had a case exactly like this one. One drive failed; I re-added it to the array, but during the rebuild 3 of the 4 drives failed altogether. The contents of /proc/mdstat were the same as yours (maybe not in the same order):

md1 : inactive sdc2[2](S) sdd2[4](S) sdb2[1](S) sda2[0](S)

But I was lucky and reassembled the array with this

mdadm --assemble /dev/md1 --scan --force

Looking at the --examine output you provided, I can tell the following scenario happened: sdd2 failed, you removed it and re-added it, so it became a spare drive trying to rebuild. But while rebuilding, sda2 failed and then sdb2 failed. So the event counter is highest on sdc2 and sdd2, the last active drives in the array (although sdd never got the chance to rebuild, so its data is the most outdated of all). Because of the differences in the event counters, --force will be necessary. So you could also try this:

mdadm --assemble /dev/md1 /dev/sd[abc]2 --force
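
Before forcing, it may be worth confirming how far apart the event counters really are; a quick sketch over the same partitions:

# Print each member's event counter
for d in /dev/sd[abcd]2; do
    echo -n "$d: "
    mdadm --examine $d | grep Events
done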

To conclude, I think that if the above command fails, you should try to recreate the array like this:

mdadm --create /dev/md1 --assume-clean -l5 -n4 -c64 /dev/sd[abc]2 missing

If you do the --create, the missing part is important: don't try to add a fourth drive to the array, because then reconstruction will begin and you will lose your data. Creating the array with a missing drive will not change its contents, and you'll have the chance to get a copy elsewhere (RAID 5 doesn't work the same way as RAID 1).
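
If the array does come up degraded, one way to take that copy before experimenting any further is to image the whole md device onto separate storage; a sketch, where the /mnt/backup paths are hypothetical and must live on a different disk:

# Image the degraded array onto separate storage; GNU ddrescue keeps a
# map of unreadable areas and can be resumed if interrupted.
ddrescue /dev/md1 /mnt/backup/md1.img /mnt/backup/md1.map

# Further experiments (fsck, mounting) can then be done against a copy
# of the image rather than the original disks.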

If the --create also fails to bring the array up, try the solution (a Perl script) described here: Recreating an array

If you finally manage to bring the array up, the filesystem will be unclean and probably corrupted. If one disk fails during a rebuild, the array is expected to stop and freeze without doing any writes to the remaining disks. In this case two disks failed, and the system may have been handling write requests that it wasn't able to complete, so there is a small chance you lost some data, but also a chance that you will never notice it :-)
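
A read-only filesystem check is a cheap way to gauge the damage before mounting anything; a sketch assuming an ext2/3/4 filesystem on md1 (the question never states the filesystem type):

# Dry run: report problems but answer "no" to every repair prompt
fsck -n /dev/md1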

edit: some clarification added.

Solution 2

I experienced many problems while using mdadm, but I never lost data. You should avoid the --force option, or use it very carefully, because you can lose all of your data with it. Please post your /etc/mdadm/mdadm.conf.
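
For reference, something along these lines would collect the information worth posting (a sketch; mdadm --examine --scan reads the on-disk superblocks, so it works even while the array is inactive):

cat /etc/mdadm/mdadm.conf
cat /proc/mdstat
mdadm --examine --scan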



Comments

  • stribika
    stribika over 1 year

    Some time ago I had a RAID 5 system at home. One of the 4 disks failed, but after removing it and putting it back it seemed to be OK, so I started a resync. When it finished I realized, to my horror, that 3 out of 4 disks had failed. However, I don't believe that's possible. There are multiple partitions on the disks, each part of a different RAID array.

    • md0 is a RAID1 array comprised of sda1, sdb1, sdc1 and sdd1.
    • md1 is a RAID5 array comprised of sda2, sdb2, sdc2 and sdd2.
    • md2 is a RAID0 array comprised of sda3, sdb3, sdc3 and sdd3.

    md0 and md2 report all disks up, while md1 reports 3 failed (sdb2, sdc2, sdd2). It's my understanding that when hard drives fail, all their partitions should be lost, not just the middle ones.

    At that point I turned the computer off and unplugged the drives. Since then I have been using that computer with a smaller new disk.

    Is there any hope of recovering the data? Can I somehow convince mdadm that my disks are in fact working? The only disk that may really have a problem is sdc, but that one too is reported up by the other arrays.

    Update

    I finally got a chance to connect the old disks and boot this machine from SystemRescueCd. Everything above was written from memory. Now I have some hard data. Here is the output of mdadm --examine /dev/sd*2

    /dev/sda2:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 53eb7711:5b290125:db4a62ac:7770c5ea
      Creation Time : Sun May 30 21:48:55 2010
         Raid Level : raid5
      Used Dev Size : 625064960 (596.11 GiB 640.07 GB)
         Array Size : 1875194880 (1788.33 GiB 1920.20 GB)
       Raid Devices : 4
      Total Devices : 4
    Preferred Minor : 1
    
        Update Time : Mon Aug 23 11:40:48 2010
              State : clean
     Active Devices : 3
    Working Devices : 4
     Failed Devices : 1
      Spare Devices : 1
           Checksum : 68b48835 - correct
             Events : 53204
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     0       8        2        0      active sync   /dev/sda2
    
       0     0       8        2        0      active sync   /dev/sda2
       1     1       8       18        1      active sync   /dev/sdb2
       2     2       8       34        2      active sync   /dev/sdc2
       3     3       0        0        3      faulty removed
       4     4       8       50        4      spare   /dev/sdd2
    /dev/sdb2:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 53eb7711:5b290125:db4a62ac:7770c5ea
      Creation Time : Sun May 30 21:48:55 2010
         Raid Level : raid5
      Used Dev Size : 625064960 (596.11 GiB 640.07 GB)
         Array Size : 1875194880 (1788.33 GiB 1920.20 GB)
       Raid Devices : 4
      Total Devices : 4
    Preferred Minor : 1
    
        Update Time : Mon Aug 23 11:44:54 2010
              State : clean
     Active Devices : 2
    Working Devices : 3
     Failed Devices : 1
      Spare Devices : 1
           Checksum : 68b4894a - correct
             Events : 53205
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     1       8       18        1      active sync   /dev/sdb2
    
       0     0       0        0        0      removed
       1     1       8       18        1      active sync   /dev/sdb2
       2     2       8       34        2      active sync   /dev/sdc2
       3     3       0        0        3      faulty removed
       4     4       8       50        4      spare   /dev/sdd2
    /dev/sdc2:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 53eb7711:5b290125:db4a62ac:7770c5ea
      Creation Time : Sun May 30 21:48:55 2010
         Raid Level : raid5
      Used Dev Size : 625064960 (596.11 GiB 640.07 GB)
         Array Size : 1875194880 (1788.33 GiB 1920.20 GB)
       Raid Devices : 4
      Total Devices : 4
    Preferred Minor : 1
    
        Update Time : Mon Aug 23 11:44:54 2010
              State : clean
     Active Devices : 1
    Working Devices : 2
     Failed Devices : 2
      Spare Devices : 1
           Checksum : 68b48975 - correct
             Events : 53210
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     2       8       34        2      active sync   /dev/sdc2
    
       0     0       0        0        0      removed
       1     1       0        0        1      faulty removed
       2     2       8       34        2      active sync   /dev/sdc2
       3     3       0        0        3      faulty removed
       4     4       8       50        4      spare   /dev/sdd2
    /dev/sdd2:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 53eb7711:5b290125:db4a62ac:7770c5ea
      Creation Time : Sun May 30 21:48:55 2010
         Raid Level : raid5
      Used Dev Size : 625064960 (596.11 GiB 640.07 GB)
         Array Size : 1875194880 (1788.33 GiB 1920.20 GB)
       Raid Devices : 4
      Total Devices : 4
    Preferred Minor : 1
    
        Update Time : Mon Aug 23 11:44:54 2010
              State : clean
     Active Devices : 1
    Working Devices : 2
     Failed Devices : 2
      Spare Devices : 1
           Checksum : 68b48983 - correct
             Events : 53210
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     4       8       50        4      spare   /dev/sdd2
    
       0     0       0        0        0      removed
       1     1       0        0        1      faulty removed
       2     2       8       34        2      active sync   /dev/sdc2
       3     3       0        0        3      faulty removed
       4     4       8       50        4      spare   /dev/sdd2
    

    It appears that things have changed since the last boot. If I'm reading this correctly, sda2, sdb2 and sdc2 are working and contain synchronized data, and sdd2 is a spare. I distinctly remember seeing 3 failed disks, but this is good news. Yet the array still isn't working:

    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md125 : inactive sda2[0](S) sdb2[1](S) sdc2[2](S)
          1875194880 blocks
    
    md126 : inactive sdd2[4](S)
          625064960 blocks
    
    md127 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
          64128 blocks [4/4] [UUUU]
    
    unused devices: <none>
    

    md0 appears to have been renamed to md127. md125 and md126 are very strange. They should be one array, not two. That used to be called md1. md2 is completely gone, but that was my swap so I don't care.

    I can understand the different names and it doesn't really matter. But why is an array with 3 "active sync" disks unreadable? And what's up with sdd2 being in a separate array?

    Update

    I tried the following after backing up the superblocks:

    root@sysresccd /root % mdadm --stop /dev/md125
    mdadm: stopped /dev/md125
    root@sysresccd /root % mdadm --stop /dev/md126
    mdadm: stopped /dev/md126
    

    So far so good. Since sdd2 is spare I don't want to add it yet.

    root@sysresccd /root % mdadm --assemble /dev/md1 /dev/sd{a,b,c}2 missing 
    mdadm: cannot open device missing: No such file or directory
    mdadm: missing has no superblock - assembly aborted
    

    Apparently I can't do that.

    root@sysresccd /root % mdadm --assemble /dev/md1 /dev/sd{a,b,c}2        
    mdadm: /dev/md1 assembled from 1 drive - not enough to start the array.
    root@sysresccd /root % cat /proc/mdstat 
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md1 : inactive sdc2[2](S) sdb2[1](S) sda2[0](S)
          1875194880 blocks
    
    md127 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
          64128 blocks [4/4] [UUUU]
    
    unused devices: <none>
    

    That didn't work either. Let's try with all the disks.

    mdadm --stop /dev/md1
    mdadm: stopped /dev/md1
    root@sysresccd /root % mdadm --assemble /dev/md1 /dev/sd{a,b,c,d}2
    mdadm: /dev/md1 assembled from 1 drive and 1 spare - not enough to start the array.
    root@sysresccd /root % cat /proc/mdstat                           
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md1 : inactive sdc2[2](S) sdd2[4](S) sdb2[1](S) sda2[0](S)
          2500259840 blocks
    
    md127 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
          64128 blocks [4/4] [UUUU]
    
    unused devices: <none>
    

    No luck. Based on this answer I'm planning to try:

    mdadm --create /dev/md1 --assume-clean --metadata=0.90 --bitmap=/root/bitmapfile --level=5 --raid-devices=4 /dev/sd{a,b,c}2 missing
    mdadm --add /dev/md1 /dev/sdd2
    

    Is it safe?

    Update

    I published the superblock parser script I used to make that table in my comment. Maybe someone will find it useful. Thanks for all your help.

    • Admin
      Admin about 13 years
      I guess mdadm --re-add isn't what you're looking for. Did you do a memory test recently? Do you have any log message related to the array failure?
    • Admin
      Admin about 13 years
      @Gilles: I don't have logs from before the crash since they were stored on the failed array. And I don't think I can fix it with the standard mdadm interface. Any sort of operation that involves a resync is impossible with 1 of 4 disks. I think the 3 "failed" disks contain enough information to restore everything. For example I can read them with dd. The "good" one could be out of sync. I will do a memtest but that machine is now working perfectly with a new disk.
    • Admin
      Admin about 13 years
      Did you try stopping the array and reassembling a new one with mdadm -A /dev/md1 /dev/sd{b,c,d}2 (perhaps --force)? (If you haven't, back up the superblocks first.)
    • Admin
      Admin about 13 years
      @Gilles: I updated my question with up to date information. What do I need to back up exactly? The first few blocks of the disks or is there a specific tool for this?
    • Admin
      Admin about 13 years
      @stribika: The superblock is the last full 64kB block aligned on a 64kB boundary on the partition. I have no idea how /dev/sdd2 can be in a separate array despite having the same UUID as sd{a,b,c}2.
    • Admin
      Admin about 13 years
      @Gilles: Tried it but didn't work. I posted the parsed superblocks (based on the mdadm source code) here: sprunge.us/iAFh The different superblocks store different information about the number of active disks.
    • Admin
      Admin about 13 years
      @stribika: Do you get any output from mdadm --detail /dev/md1?
    • Admin
      Admin about 13 years
      @forcefsck: No, it says the array is inactive.
  • stribika
    stribika about 13 years
    mdadm --assemble /dev/md1 /dev/sd[abc]2 --force worked. Thank you. You saved my data! :) I will not attempt to add the fourth disk because the first 3 are not as good as I previously thought. The self-test revealed that each has 10-20 unreadable blocks. I feel stupid for not checking this first.
  • 0xC0000022L
    0xC0000022L about 11 years
    Thanks for a comprehensive answer. Rewarded with 50 rep from me.
  • poige
    poige over 10 years
    "If you do the --create, the missing part is important, don't try to add a fourth drive in the array, because then construction will begin and you will lose your data. " — BS. If you specified --assume-clean (you did) it won't.