mdadm: /dev/md0 assembled from 3 drives - not enough to start the array

First off, the fact that you're backing up the drives using dd is a good thing—a very sensible first action.
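
For example, something along these lines; the image path is just a placeholder for wherever you are keeping the copies:

    # Image each member drive before touching the array.
    # conv=noerror,sync keeps dd going past read errors and pads unreadable
    # sectors so the image stays the same size as the source drive.
    dd if=/dev/sda of=/mnt/backup/sda.img bs=1M conv=noerror,sync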

You can see from the Events counter and the Update Time that sda dropped out of the array, and that not much changed between when it dropped out and the last write to the array. You can also see the device states: sda says devices 0–3 are active and 4 is missing (mdadm counts from 0), while the other drives say 0–2 are active and 3 & 4 are missing. So you suffered a second disk failure on RAID5, which of course stopped the array.
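
If you want to see that at a glance, one quick way (just filtering the --examine output you already have) is:

    # Pull out only the fields that matter for this comparison
    mdadm --examine /dev/sd[adef]1 | grep -E '^/dev|Update Time|Events|Array State'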

So basically you need to ask mdadm to ignore the fact that sda is outdated and assemble the array anyway, which is what --force does, so that should have worked. It's possible you just need to add --run as well (or use --scan), since without it mdadm may decline to start an array that isn't complete; I suspect that's what's happening here.
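
Concretely, that would look something like the following (member devices taken from your --examine output; --run tells mdadm to start the array even though it is degraded):

    mdadm --assemble --force --run --verbose /dev/md0 /dev/sdf1 /dev/sde1 /dev/sdd1 /dev/sda1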

Another possibility is that the boot scripts have half-assembled the array, making devices busy. Check (e.g., cat or less) /proc/mdstat to make sure that hasn't happened, and mdadm --stop any unwanted arrays present.
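
For example (assuming the half-assembled array also shows up as /dev/md0; substitute whatever /proc/mdstat actually lists):

    cat /proc/mdstat
    # stop any inactive or unwanted array that is holding the members busy
    mdadm --stop /dev/md0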

You could also add --verbose to get a better idea of why mdadm isn't assembling the array.

Once the array is assembled, you can use mdadm -a to add your new disk to it, and a rebuild should start immediately. You should also consider replacing sda, as it'd appear to be flaky (it dropped out before).
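
Something like the following, where /dev/sdX1 is only a placeholder for whatever name the replacement drive gets once it's partitioned to match the others:

    mdadm /dev/md0 --add /dev/sdX1
    cat /proc/mdstat    # recovery/rebuild progress shows up here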

In any case, zeroing a superblock is a drastic, near-last-ditch approach to recovering an array. It shouldn't be needed here.

Comments

  • mock (almost 2 years)

    I had a drive in my RAID5 array fail a while back. I believe the problem was due to a power failure at the time, but I originally thought it might be the hard drive controllers on the motherboard (this is a system I put together myself).

    Since then, I've built a replacement system, transferred the drives over, and attempted to start them up. What I'm getting now is that one drive is still not in a good enough state for the array to start.

    Here is what I get when trying to assemble:

    [root@localhost ~]# mdadm --assemble --force /dev/md0 /dev/sdf1 /dev/sde1 /dev/sdd1 /dev/sda1 -v
    mdadm: looking for devices for /dev/md0
    mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 0.
    mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 1.
    mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 2.
    mdadm: /dev/sda1 is identified as a member of /dev/md0, slot 3.
    mdadm: added /dev/sde1 to /dev/md0 as 1
    mdadm: added /dev/sdd1 to /dev/md0 as 2
    mdadm: added /dev/sda1 to /dev/md0 as 3 (possibly out of date)
    mdadm: no uptodate device for slot 8 of /dev/md0
    mdadm: added /dev/sdf1 to /dev/md0 as 0
    mdadm: /dev/md0 assembled from 3 drives - not enough to start the array.
    

    When I examine the drives, I get this:

    [root@localhost ~]# mdadm --examine /dev/sd[a-z]1
    /dev/sda1:
              Magic : a92b4efc
            Version : 1.1
        Feature Map : 0x1
         Array UUID : 491fdb85:372da78e:8022a675:04a2932c
               Name : kenya:0
      Creation Time : Wed Aug 21 14:18:41 2013
         Raid Level : raid5
       Raid Devices : 5
    
     Avail Dev Size : 3906764800 (1862.89 GiB 2000.26 GB)
         Array Size : 7813527552 (7451.56 GiB 8001.05 GB)
      Used Dev Size : 3906763776 (1862.89 GiB 2000.26 GB)
        Data Offset : 262144 sectors
       Super Offset : 0 sectors
       Unused Space : before=262072 sectors, after=1024 sectors
              State : clean
        Device UUID : 879d0ddf:9f9c91c5:ffb0185f:c69dd71f
    
    Internal Bitmap : 8 sectors from superblock
        Update Time : Thu Feb  5 06:05:09 2015
           Checksum : 758a6362 - correct
             Events : 624481
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 3
       Array State : AAAA. ('A' == active, '.' == missing, 'R' == replacing)
    
    mdadm: No md superblock detected on /dev/sdb1.
    
    /dev/sdd1:
              Magic : a92b4efc
            Version : 1.1
        Feature Map : 0x1
         Array UUID : 491fdb85:372da78e:8022a675:04a2932c
               Name : kenya:0
      Creation Time : Wed Aug 21 14:18:41 2013
         Raid Level : raid5
       Raid Devices : 5
    
     Avail Dev Size : 3906764800 (1862.89 GiB 2000.26 GB)
         Array Size : 7813527552 (7451.56 GiB 8001.05 GB)
      Used Dev Size : 3906763776 (1862.89 GiB 2000.26 GB)
        Data Offset : 262144 sectors
       Super Offset : 0 sectors
       Unused Space : before=262072 sectors, after=1024 sectors
              State : clean
        Device UUID : 3a403437:9a1690ea:f6ce8525:730d1d9c
    
    Internal Bitmap : 8 sectors from superblock
        Update Time : Thu Feb  5 06:07:11 2015
           Checksum : 355d0e32 - correct
             Events : 624485
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 2
       Array State : AAA.. ('A' == active, '.' == missing, 'R' == replacing)
    
    /dev/sde1:
              Magic : a92b4efc
            Version : 1.1
        Feature Map : 0x1
         Array UUID : 491fdb85:372da78e:8022a675:04a2932c
               Name : kenya:0
      Creation Time : Wed Aug 21 14:18:41 2013
         Raid Level : raid5
       Raid Devices : 5
    
     Avail Dev Size : 3906764800 (1862.89 GiB 2000.26 GB)
         Array Size : 7813527552 (7451.56 GiB 8001.05 GB)
      Used Dev Size : 3906763776 (1862.89 GiB 2000.26 GB)
        Data Offset : 262144 sectors
       Super Offset : 0 sectors
       Unused Space : before=262072 sectors, after=1024 sectors
              State : clean
        Device UUID : 7d7ec5fe:b4b55c4e:4e903357:1aa3bae3
    
    Internal Bitmap : 8 sectors from superblock
        Update Time : Thu Feb  5 06:07:11 2015
           Checksum : da06428d - correct
             Events : 624485
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 1
       Array State : AAA.. ('A' == active, '.' == missing, 'R' == replacing)
    
    /dev/sdf1:
              Magic : a92b4efc
            Version : 1.1
        Feature Map : 0x1
         Array UUID : 491fdb85:372da78e:8022a675:04a2932c
               Name : kenya:0
      Creation Time : Wed Aug 21 14:18:41 2013
         Raid Level : raid5
       Raid Devices : 5
    
     Avail Dev Size : 3906764800 (1862.89 GiB 2000.26 GB)
         Array Size : 7813527552 (7451.56 GiB 8001.05 GB)
      Used Dev Size : 3906763776 (1862.89 GiB 2000.26 GB)
        Data Offset : 262144 sectors
       Super Offset : 0 sectors
       Unused Space : before=262072 sectors, after=1024 sectors
              State : clean
        Device UUID : c091025f:8296517b:0237935f:5cc03cfc
    
    Internal Bitmap : 8 sectors from superblock
        Update Time : Thu Feb  5 06:07:11 2015
           Checksum : 8819fa93 - correct
             Events : 624485
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 0
       Array State : AAA.. ('A' == active, '.' == missing, 'R' == replacing)
    /dev/sdg1:
       MBR Magic : aa55
    Partition[0] :       808960 sectors at            0 (type 17)
    

    and then there's this:

    [root@localhost ~]# cat /proc/mdstat
    Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] [linear]
    unused devices: <none>
    

    I gathered this info from booting into recovery. The system is CentOS 6.2. I have learned from some IRC help that the sda drive is out of sync with the rest of them. I believe the drive which had failed is now listed as sdg, but I'm not certain of that. I also know the order of the drives is now feda (sdf, sde, sdd, sda).

    I have a replacement drive for the dead one ready for insertion once I can get the rest of this assembled. I was originally going to try to mark the failed drive as removed from the array, but I cannot get that status to take.

    My attempts to sign up for and use the linux-raid mailing list have left me wondering whether it is even active anymore ("delivery to [email protected] has failed permanently"). The help from the CentOS IRC channel suggested getting further help from that source, so I'm now trying here.

    I have also read through this thread, but wanted to ask in another forum for a more specific opinion before attempting any of the suggestions toward the end of it: http://ubuntuforums.org/showthread.php?t=2276699.

    If there is a working mailing list for mdadm or linux-raid, I'm willing to post there. If I can provide more data about this situation, please let me know.

    • Sami Kuhmonen (almost 9 years)
      mdadm sees that sda1 has a different Events count and assumes it is out of date. It would fix things by syncing, but since one disk is missing it can't. That's why it shows only three out of five disks available and can't start the array. There are ways to use --zero-superblock and --assume-clean to force recreation of the array, since the event counts are very close.
    • mock (almost 9 years)
      Zeroing the superblocks is what the Ubuntu Forums link discusses. I have been considering that possibility, but it sounds like a final step in the attempt to restore the drives. I was looking for any other opinions before I walked down that road... which is probably the course of action I will end up taking.
    • Sami Kuhmonen (almost 9 years)
      It certainly seems like one. If there were any other way, I would do that, but unfortunately I can't find and don't know of any other method.
    • mock (over 8 years)
      OK, I backed everything up and zeroed out the superblocks on /dev/sd[feda]1. I was then able to create the array using: # mdadm --create /dev/md0 --level=5 --raid-devices=4 --assume-clean /dev/sdf1 /dev/sde1 /dev/sdd1 /dev/sda1. Now # cat /proc/mdstat gives me: Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] [linear] md0 : active raid5 sda1[3] sdd1[2] sde1[1] sdf1[0] 5860147200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU] bitmap: 0/15 pages [0KB], 65536KB chunk
    • mock (over 8 years)
      The array is put back together. I just now have a problem with LVM not being happy with things, but I think that's a separate matter. I appreciate the help.