How to recover a crashed Linux md RAID5 array?
Solution 1
First check the disks by running a SMART self-test:
for i in a b c d; do
smartctl -s on -t long /dev/sd$i
done
It might take a few hours to finish, but you can check each drive's test status every few minutes, e.g.
smartctl -l selftest /dev/sda
If a disk's test status reports that it did not complete because of read errors, then that disk should be considered unsafe for md1 reassembly. After the self-test finishes, you can start trying to reassemble your array. Optionally, if you want to be extra cautious, move the disks to another machine before continuing (just in case of bad RAM/controller/etc.).
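A minimal sketch of that pass/fail decision; the log line below is a hypothetical example in smartctl's self-test log format, not output from these disks:

```shell
# Classify a drive from one line of its smartctl self-test log:
# anything other than "Completed without error" marks it unsafe
# for reassembly. (Sample line; in practice it would come from
# `smartctl -l selftest /dev/sdX`.)
line='# 1  Extended offline    Completed: read failure       90%      1234         5678'
case "$line" in
  *'Completed without error'*) verdict=OK ;;
  *)                           verdict=SUSPECT ;;
esac
echo "$verdict"
```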
Recently, I had a case exactly like this one. One drive failed; I re-added it to the array, but during the rebuild 3 of 4 drives failed altogether. The contents of /proc/mdstat were the same as yours (though maybe not in the same order):
md1 : inactive sdc2[2](S) sdd2[4](S) sdb2[1](S) sda2[0](S)
But I was lucky and reassembled the array with this
mdadm --assemble /dev/md1 --scan --force
By looking at the --examine output you provided, I can tell the following scenario happened: sdd2 failed, you removed it and re-added it, so it became a spare drive trying to rebuild. But while rebuilding, sda2 failed and then sdb2 failed. So the events counter is higher on sdc2 and sdd2, which were the last active drives in the array (although sdd2 didn't get the chance to rebuild and so is the most outdated of all). Because of the differences in the event counters, --force will be necessary. So you could also try this:
mdadm --assemble /dev/md1 /dev/sd[abc]2 --force
To conclude, I think that if the above command fails, you should try to recreate the array like this:
mdadm --create /dev/md1 --assume-clean -l5 -n4 -c64 /dev/sd[abc]2 missing
If you do the --create, the missing part is important: don't try to add a fourth drive to the array, because then reconstruction will begin and you will lose your data. Creating the array with a missing drive will not change its contents, and you'll have the chance to get a copy elsewhere (RAID5 doesn't work the same way as RAID1).
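For clarity, the short options in the --create command above expand to long ones (-l5 is --level=5, -n4 is --raid-devices=4, -c64 is --chunk=64). This sketch only builds and echoes the command (a dry run), so nothing is written to the disks:

```shell
# Dry run: assemble the recreate command with long options and the
# literal word "missing" holding the fourth slot so no rebuild starts.
CMD="mdadm --create /dev/md1 --assume-clean --level=5 --raid-devices=4 --chunk=64 /dev/sd[abc]2 missing"
echo "$CMD"
```

Run it by hand only after the superblocks are backed up.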
If that fails to bring the array up, try the solution (a Perl script) here: Recreating an array
If you finally manage to bring the array up, the filesystem will be unclean and probably corrupted. If one disk fails during a rebuild, the array is expected to stop and freeze, not doing any writes to the other disks. In this case two disks failed; maybe the system was performing write requests that it wasn't able to complete, so there is some small chance you lost some data, but also a chance that you will never notice it :-)
edit: some clarification added.
Solution 2
I experienced many problems while using mdadm, but I never lost data.
You should avoid the --force option, or use it very carefully, because you can lose all of your data.
Please post your /etc/mdadm/mdadm.conf.
Updated on September 17, 2022

Comments
-
stribika over 1 year
Some time ago I had a RAID5 system at home. One of the 4 disks failed, but after removing and putting it back it seemed to be OK, so I started a resync. When it finished I realized, to my horror, that 3 out of 4 disks had failed. However, I don't believe that's possible. There are multiple partitions on the disks, each part of a different RAID array.
- md0 is a RAID1 array comprised of sda1, sdb1, sdc1 and sdd1.
- md1 is a RAID5 array comprised of sda2, sdb2, sdc2 and sdd2.
- md2 is a RAID0 array comprised of sda3, sdb3, sdc3 and sdd3.
md0 and md2 report all disks up, while md1 reports 3 failed (sdb2, sdc2, sdd2). It's my understanding that when a hard drive fails, all of its partitions should be lost, not just the middle ones.
At that point I turned the computer off and unplugged the drives. Since then I have been using that computer with a smaller new disk.
Is there any hope of recovering the data? Can I somehow convince mdadm that my disks are in fact working? The only disk that may really have a problem is sdc but that one too is reported up by the other arrays.
Update
I finally got a chance to connect the old disks and boot this machine from SystemRescueCd. Everything above was written from memory. Now I have some hard data. Here is the output of
mdadm --examine /dev/sd*2
/dev/sda2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 53eb7711:5b290125:db4a62ac:7770c5ea
  Creation Time : Sun May 30 21:48:55 2010
     Raid Level : raid5
  Used Dev Size : 625064960 (596.11 GiB 640.07 GB)
     Array Size : 1875194880 (1788.33 GiB 1920.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1
    Update Time : Mon Aug 23 11:40:48 2010
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 68b48835 - correct
         Events : 53204
         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8        2        0      active sync   /dev/sda2

   0     0       8        2        0      active sync   /dev/sda2
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       0        0        3      faulty removed
   4     4       8       50        4      spare   /dev/sdd2

/dev/sdb2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 53eb7711:5b290125:db4a62ac:7770c5ea
  Creation Time : Sun May 30 21:48:55 2010
     Raid Level : raid5
  Used Dev Size : 625064960 (596.11 GiB 640.07 GB)
     Array Size : 1875194880 (1788.33 GiB 1920.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1
    Update Time : Mon Aug 23 11:44:54 2010
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 68b4894a - correct
         Events : 53205
         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       18        1      active sync   /dev/sdb2

   0     0       0        0        0      removed
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       0        0        3      faulty removed
   4     4       8       50        4      spare   /dev/sdd2

/dev/sdc2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 53eb7711:5b290125:db4a62ac:7770c5ea
  Creation Time : Sun May 30 21:48:55 2010
     Raid Level : raid5
  Used Dev Size : 625064960 (596.11 GiB 640.07 GB)
     Array Size : 1875194880 (1788.33 GiB 1920.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1
    Update Time : Mon Aug 23 11:44:54 2010
          State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 68b48975 - correct
         Events : 53210
         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       34        2      active sync   /dev/sdc2

   0     0       0        0        0      removed
   1     1       0        0        1      faulty removed
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       0        0        3      faulty removed
   4     4       8       50        4      spare   /dev/sdd2

/dev/sdd2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 53eb7711:5b290125:db4a62ac:7770c5ea
  Creation Time : Sun May 30 21:48:55 2010
     Raid Level : raid5
  Used Dev Size : 625064960 (596.11 GiB 640.07 GB)
     Array Size : 1875194880 (1788.33 GiB 1920.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1
    Update Time : Mon Aug 23 11:44:54 2010
          State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 68b48983 - correct
         Events : 53210
         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       50        4      spare   /dev/sdd2

   0     0       0        0        0      removed
   1     1       0        0        1      faulty removed
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       0        0        3      faulty removed
   4     4       8       50        4      spare   /dev/sdd2
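The interesting field in that wall of output is the Events counter. A quick sketch of finding the highest one, using the four values copied from the dump above as sample input:

```shell
# In practice: mdadm --examine /dev/sd[abcd]2 | grep Events
# Sample Events values taken from the --examine output above.
cat > events.txt <<'EOF'
/dev/sda2 Events : 53204
/dev/sdb2 Events : 53205
/dev/sdc2 Events : 53210
/dev/sdd2 Events : 53210
EOF
# Drives whose counter equals the maximum saw the array's final state;
# drives below it are out of date and need --force to assemble.
awk '{ if ($4 > max) max = $4 } END { print max }' events.txt
```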
It appears that things have changed since the last boot. If I'm reading this correctly, sda2, sdb2 and sdc2 are working and contain synchronized data, and sdd2 is a spare. I distinctly remember seeing 3 failed disks, but this is good news. Yet the array still isn't working:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md125 : inactive sda2[0](S) sdb2[1](S) sdc2[2](S)
      1875194880 blocks

md126 : inactive sdd2[4](S)
      625064960 blocks

md127 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      64128 blocks [4/4] [UUUU]

unused devices: <none>
md0 appears to have been renamed to md127. md125 and md126 are very strange. They should be one array, not two. That used to be called md1. md2 is completely gone, but that was my swap so I don't care.
I can understand the different names and it doesn't really matter. But why is an array with 3 "active sync" disks unreadable? And what's up with sdd2 being in a separate array?
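Before any risky mdadm operation it's worth backing up the md superblocks. For 0.90 metadata the superblock is the last full 64 KiB block aligned on a 64 KiB boundary of the partition. A sketch of computing that offset, using the partition size from the --examine output above; the dd line is only an illustration, shown as a comment:

```shell
# Partition size in KiB (the Used Dev Size figure from --examine).
part_kb=625064960
# 0.90 superblock offset: round down to a 64 KiB boundary,
# then back off one 64 KiB block.
sb_kb=$(( part_kb / 64 * 64 - 64 ))
echo "$sb_kb"
# Back it up with something like:
#   dd if=/dev/sda2 of=sda2-superblock.bak bs=1024 skip=$sb_kb count=64
```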
Update
I tried the following after backing up the superblocks:
root@sysresccd /root % mdadm --stop /dev/md125
mdadm: stopped /dev/md125
root@sysresccd /root % mdadm --stop /dev/md126
mdadm: stopped /dev/md126
So far so good. Since sdd2 is a spare, I don't want to add it yet.
root@sysresccd /root % mdadm --assemble /dev/md1 /dev/sd{a,b,c}2 missing
mdadm: cannot open device missing: No such file or directory
mdadm: missing has no superblock - assembly aborted
Apparently I can't do that.
root@sysresccd /root % mdadm --assemble /dev/md1 /dev/sd{a,b,c}2
mdadm: /dev/md1 assembled from 1 drive - not enough to start the array.
root@sysresccd /root % cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : inactive sdc2[2](S) sdb2[1](S) sda2[0](S)
      1875194880 blocks

md127 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      64128 blocks [4/4] [UUUU]

unused devices: <none>
That didn't work either. Let's try with all the disks.
root@sysresccd /root % mdadm --stop /dev/md1
mdadm: stopped /dev/md1
root@sysresccd /root % mdadm --assemble /dev/md1 /dev/sd{a,b,c,d}2
mdadm: /dev/md1 assembled from 1 drive and 1 spare - not enough to start the array.
root@sysresccd /root % cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : inactive sdc2[2](S) sdd2[4](S) sdb2[1](S) sda2[0](S)
      2500259840 blocks

md127 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      64128 blocks [4/4] [UUUU]

unused devices: <none>
No luck. Based on this answer I'm planning to try:
mdadm --create /dev/md1 --assume-clean --metadata=0.90 --bitmap=/root/bitmapfile --level=5 --raid-devices=4 /dev/sd{a,b,c}2 missing
mdadm --add /dev/md1 /dev/sdd2
Is it safe?
Update
I published the superblock parser script I used to make that table in my comment. Maybe someone will find it useful. Thanks for all your help.
-
Admin about 13 years: I guess mdadm --re-add isn't what you're looking for. Did you do a memory test recently? Do you have any log message related to the array failure?
-
Admin about 13 years: @Gilles: I don't have logs from before the crash since they were stored on the failed array. And I don't think I can fix it with the standard mdadm interface. Any sort of operation that involves a resync is impossible with 1 of 4 disks. I think the 3 "failed" disks contain enough information to restore everything. For example I can read them with dd. The "good" one could be out of sync. I will do a memtest, but that machine is now working perfectly with a new disk.
-
Admin about 13 years: Did you try stopping the array and reassembling a new one with mdadm -A /dev/md1 /dev/sd{b,c,d}2 (perhaps with --force)? (If you haven't, back up the superblocks first.)
-
Admin about 13 years: @Gilles: I updated my question with up-to-date information. What do I need to back up exactly? The first few blocks of the disks, or is there a specific tool for this?
-
Admin about 13 years: @stribika: The superblock is the last full 64kB block aligned on a 64kB boundary on the partition. I have no idea how /dev/sdd2 can be in a separate array despite having the same UUID as sd{a,b,c}2.
-
Admin about 13 years: @Gilles: Tried it, but it didn't work. I posted the parsed superblocks (based on the mdadm source code) here: sprunge.us/iAFh The different superblocks store different information about the number of active disks.
-
Admin about 13 years: @stribika: Do you get any output from mdadm --detail /dev/md1?
-
Admin about 13 years: @forcefsck: No, it says the array is inactive.
-
stribika about 13 years: mdadm --assemble /dev/md1 /dev/sd[abc]2 --force worked. Thank you. You saved my data! :) I will not attempt to add the fourth disk because the first 3 are not as good as I previously thought. The self-test revealed each has 10-20 unreadable blocks. I feel stupid for not checking this first.
-
0xC0000022L about 11 years: Thanks for a comprehensive answer. Rewarded with 50 rep from me.
-
poige over 10 years: "If you do the --create, the missing part is important, don't try to add a fourth drive in the array, because then construction will begin and you will lose your data." — BS. If you specified --assume-clean (you did), it won't.