How do I reactivate my MDADM RAID5 array?


Solution 1

The (S) labels mean the disks are regarded as "spare". You should try stopping and re-starting the array:

  mdadm --stop /dev/md0
  mdadm --assemble --scan

to re-assemble the array. If that doesn't work, you may need to update your mdadm.conf; see, for example, this question for details on how to do that.
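
If you do need to refresh mdadm.conf, here is a minimal sketch assuming a Debian/Ubuntu-style layout where the file lives at /etc/mdadm/mdadm.conf (the backup filename and the update-initramfs step are illustrative; adjust for your distribution):

  cp /etc/mdadm/mdadm.conf /etc/mdadm/mdadm.conf.bak   # keep a backup of the current config
  mdadm --detail --scan >> /etc/mdadm/mdadm.conf       # append ARRAY lines for the currently assembled arrays
  update-initramfs -u                                  # rebuild the initramfs so the new config is used at boot

Run the scan only once the array has assembled correctly, so the appended ARRAY line reflects the healthy configuration.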

Solution 2

This question is a bit old, but the answer might help someone facing a similar situation. Looking at the event counts from the mdadm --examine output you have provided, they seem close enough (955190 for sdb1 and sdf1, 955205 for sdd1, and 955219 for sde1). If the differences are below 40-50, this is OK, and in that case the recommended course of action is to assemble your array manually, forcing mdadm to accept the drives despite the event count difference (a quick way to compare the counts is shown at the end of this answer):

Stop the array:

mdadm --stop /dev/md0

Then try to reassemble the array manually:

mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdf1

Check the status of the array to see whether the drive list/structure is OK (the bottom of the command output shows the status and position of each drive in the array):

mdadm --detail /dev/md0
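
If you only want the overall state and the per-device table from that output, a simple filter like this (just a sketch) keeps the noise down:

  mdadm --detail /dev/md0 | grep -E 'State :|Rebuild Status|/dev/sd'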

If the structure is OK, check the rebuild progress:

cat /proc/mdstat
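
As a quick aside, the event counts mentioned above can be pulled out of the --examine output in one pass, which makes the comparison easier (a sketch, assuming the same four member partitions as in the question):

  mdadm --examine /dev/sd[bdef]1 | grep -E '^/dev|Events'

The rebuild progress can also be watched continuously rather than polled by hand:

  watch cat /proc/mdstat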

Comments

  • Jon Cage
    Jon Cage over 1 year

    I've just moved house which involved dismantling my server and re-connecting it. Since doing so, one of my MDADM RAID5 arrays is appearing as inactive:

    root@mserver:/tmp# cat /proc/mdstat 
    Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
    md1 : active raid5 sdc1[1] sdh1[2] sdg1[0]
          3907023872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
    
    md0 : inactive sdd1[0](S) sdf1[3](S) sde1[2](S) sdb1[1](S)
          3907039744 blocks
    
    unused devices: <none>
    

    It looks to me as though it's found all of the disks but for some reason doesn't want to use them.

    So what do the (S) labels mean and how can I tell MDADM to start using the array again?

    [Edit] I just tried stopping and assembling the array with -v:

    root@mserver:~# mdadm --stop /dev/md0
    mdadm: stopped /dev/md0
    
    root@mserver:~# mdadm --assemble --scan -v
    mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 2.
    mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 3.
    mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 0.
    mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
    mdadm: added /dev/sdd1 to /dev/md0 as 0 (possibly out of date)
    mdadm: added /dev/sdb1 to /dev/md0 as 1 (possibly out of date)
    mdadm: added /dev/sdf1 to /dev/md0 as 3 (possibly out of date)
    mdadm: added /dev/sde1 to /dev/md0 as 2
    mdadm: /dev/md0 assembled from 1 drive - not enough to start the array.
    

    ...and the output of cat /proc/mdstat looks no different.

    [Edit2] Not sure if it helps but this is the result of examining each disk:

    root@mserver:~# mdadm --examine /dev/sdb1

    /dev/sdb1:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 2f331560:fc85feff:5457a8c1:6e047c67 (local to host mserver)
      Creation Time : Sun Feb  1 20:53:39 2009
         Raid Level : raid5
      Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
         Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
       Raid Devices : 4
      Total Devices : 4
    Preferred Minor : 0
    
        Update Time : Sat Apr 20 13:22:27 2013
              State : clean
     Active Devices : 4
    Working Devices : 4
     Failed Devices : 0
      Spare Devices : 0
           Checksum : 6c8f71a3 - correct
             Events : 955190
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     1       8       17        1      active sync   /dev/sdb1
    
       0     0       8      113        0      active sync   /dev/sdh1
       1     1       8       17        1      active sync   /dev/sdb1
       2     2       8       97        2      active sync   /dev/sdg1
       3     3       8       33        3      active sync   /dev/sdc1
    

    root@mserver:~# mdadm --examine /dev/sdd1

    /dev/sdd1:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 2f331560:fc85feff:5457a8c1:6e047c67 (local to host mserver)
      Creation Time : Sun Feb  1 20:53:39 2009
         Raid Level : raid5
      Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
         Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
       Raid Devices : 4
      Total Devices : 2
    Preferred Minor : 0
    
        Update Time : Sat Apr 20 18:37:23 2013
              State : active
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 2
      Spare Devices : 0
           Checksum : 6c812869 - correct
             Events : 955205
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     0       8      113        0      active sync   /dev/sdh1
    
       0     0       8      113        0      active sync   /dev/sdh1
       1     1       0        0        1      faulty removed
       2     2       8       97        2      active sync   /dev/sdg1
       3     3       0        0        3      faulty removed
    

    root@mserver:~# mdadm --examine /dev/sde1

    /dev/sde1:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 2f331560:fc85feff:5457a8c1:6e047c67 (local to host mserver)
      Creation Time : Sun Feb  1 20:53:39 2009
         Raid Level : raid5
      Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
         Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
       Raid Devices : 4
      Total Devices : 2
    Preferred Minor : 0
    
        Update Time : Sun Apr 21 14:00:43 2013
              State : clean
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 2
      Spare Devices : 0
           Checksum : 6c90cc70 - correct
             Events : 955219
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     2       8       97        2      active sync   /dev/sdg1
    
       0     0       0        0        0      removed
       1     1       0        0        1      faulty removed
       2     2       8       97        2      active sync   /dev/sdg1
       3     3       0        0        3      faulty removed
    

    root@mserver:~# mdadm --examine /dev/sdf1

    /dev/sdf1:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 2f331560:fc85feff:5457a8c1:6e047c67 (local to host mserver)
      Creation Time : Sun Feb  1 20:53:39 2009
         Raid Level : raid5
      Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
         Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
       Raid Devices : 4
      Total Devices : 4
    Preferred Minor : 0
    
        Update Time : Sat Apr 20 13:22:27 2013
              State : clean
     Active Devices : 4
    Working Devices : 4
     Failed Devices : 0
      Spare Devices : 0
           Checksum : 6c8f71b7 - correct
             Events : 955190
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     3       8       33        3      active sync   /dev/sdc1
    
       0     0       8      113        0      active sync   /dev/sdh1
       1     1       8       17        1      active sync   /dev/sdb1
       2     2       8       97        2      active sync   /dev/sdg1
       3     3       8       33        3      active sync   /dev/sdc1
    

    I have some notes which suggest the drives were originally assembled as follows:

    md0 : active raid5 sdb1[1] sdc1[3] sdh1[0] sdg1[2]
          2930279808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
    

    [Edit3]

    Looking through the log it looks like the following happened (based on the Update Time in the --examine results):

    1. sdb and sdf were knocked out some time after 13:22 on the 20th
    2. sdd was knocked out some time after 18:37 on the 20th
    3. the server was shut down some time after 14:00 on the 21st

    Given that two disks went down (apparently) simultaneously, I think it should be reasonably safe to assume the array wouldn't have been written to after that point(?), and so it should be relatively safe to force it to reinstate in the correct order. What's the safest command to do that with, and is there a way to do it without writing any changes?

  • Jon Cage
    Jon Cage almost 11 years
    Tried that (and added -v to see what was going on), but all of the disks which should be added get responses along the following lines: mdadm: /dev/sdb1 is busy - skipping.
  • krizna
    krizna almost 11 years
    just stop md0 and re-assemble the array
  • Jon Cage
    Jon Cage almost 11 years
    tried that - still no luck (see my edit)
  • Stefan Seidel
    Stefan Seidel almost 11 years
    Ok, it looks like it thinks the RAID wasn't shut down properly. If you are sure it wasn't, try -R or -f. If that fails too, re-create the array using mdadm --create /dev/md0 --assume-clean <original create options> /dev/sd[dbfe]1. Be warned: all of these options may destroy your data.
  • Jon Cage
    Jon Cage almost 11 years
    We had a power cut not long before we moved house (which is the only reason I dismantled the array), so it's quite possible it wasn't shut down properly. Is this likely to be caused by the drives now being in a different order? i.e. would it be safer to simply swap the drives back into the correct order? I'd rather not do anything which could result in a dead array until that's the last option.
  • Jon Cage
    Jon Cage almost 11 years
    Do the resync before or after the assemble?
  • Stefan Seidel
    Stefan Seidel almost 11 years
    After, because it wouldn't be possible to do it before.
  • Jon Cage
    Jon Cage almost 11 years
    Well I went for it and mdadm --assemble --scan --force worked. The array is back up and running and I have access to my data :)
  • Nathan V
    Nathan V about 4 years
    --assemble --force is what I needed to get an array back online today.