How do I reactivate my MDADM RAID5 array?


Solution 1

The (S) labels mean the disks are regarded as "spare". You should try stopping and re-starting the array:

  mdadm --stop /dev/md0
  mdadm --assemble --scan

to re-assemble the array. If that doesn't work, you may need to update your mdadm.conf; see, for example, this question for details on how to do that.
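
If you do need to refresh mdadm.conf, here is a minimal sketch assuming a Debian/Ubuntu-style layout where the file lives at /etc/mdadm/mdadm.conf (the backup filename and the update-initramfs step are illustrative; adjust for your distribution):

  cp /etc/mdadm/mdadm.conf /etc/mdadm/mdadm.conf.bak   # keep a backup of the current config
  mdadm --detail --scan >> /etc/mdadm/mdadm.conf       # append ARRAY lines for the currently assembled arrays
  update-initramfs -u                                  # rebuild the initramfs so the new config is used at boot

Run the scan only once the array has assembled correctly, so the appended ARRAY line reflects the healthy configuration.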

Solution 2

This question is a bit old, but the answer might help someone facing a similar situation. Looking at the event counts from the mdadm --examine output you have provided, they seem close enough (955190 for sdb1 and sdf1, 955205 for sdd1, and 955219 for sde1). If the differences are below 40-50, this is OK, and in that case the recommended course of action is to assemble your array manually, forcing mdadm to accept the drives despite the event count difference (a quick way to compare the counts is shown at the end of this answer):

Stop the array:

mdadm --stop /dev/md0

Then try to reassemble the array manually:

mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdf1

Check the status of the array to see whether the drive list/structure is OK (the bottom of the command output shows the status and position of each drive in the array):

mdadm --detail /dev/md0
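
If you only want the overall state and the per-device table from that output, a simple filter like this (just a sketch) keeps the noise down:

  mdadm --detail /dev/md0 | grep -E 'State :|Rebuild Status|/dev/sd'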

If the structure is OK, check the rebuild progress:

cat /proc/mdstat
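
As a quick aside, the event counts mentioned above can be pulled out of the --examine output in one pass, which makes the comparison easier (a sketch, assuming the same four member partitions as in the question):

  mdadm --examine /dev/sd[bdef]1 | grep -E '^/dev|Events'

The rebuild progress can also be watched continuously rather than polled by hand:

  watch cat /proc/mdstat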

Comments

  • Jon Cage
    Jon Cage over 1 year

    I've just moved house which involved dismantling my server and re-connecting it. Since doing so, one of my MDADM RAID5 arrays is appearing as inactive:

    root@mserver:/tmp# cat /proc/mdstat 
    Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
    md1 : active raid5 sdc1[1] sdh1[2] sdg1[0]
          3907023872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
    
    md0 : inactive sdd1[0](S) sdf1[3](S) sde1[2](S) sdb1[1](S)
          3907039744 blocks
    
    unused devices: <none>
    

    It looks to me as though it's found all of the disks but for some reason doesn't want to use them.

    So what do the (S) labels mean and how can I tell MDADM to start using the array again?

    [Edit] I just tried stopping and assembling the array with -v:

    root@mserver:~# mdadm --stop /dev/md0
    mdadm: stopped /dev/md0
    
    root@mserver:~# mdadm --assemble --scan -v
    mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 2.
    mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 3.
    mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 0.
    mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
    mdadm: added /dev/sdd1 to /dev/md0 as 0 (possibly out of date)
    mdadm: added /dev/sdb1 to /dev/md0 as 1 (possibly out of date)
    mdadm: added /dev/sdf1 to /dev/md0 as 3 (possibly out of date)
    mdadm: added /dev/sde1 to /dev/md0 as 2
    mdadm: /dev/md0 assembled from 1 drive - not enough to start the array.
    

    ...and the output of cat /proc/mdstat looks no different.

    [Edit2] Not sure if it helps but this is the result of examining each disk:

    root@mserver:~# mdadm --examine /dev/sdb1

    /dev/sdb1:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 2f331560:fc85feff:5457a8c1:6e047c67 (local to host mserver)
      Creation Time : Sun Feb  1 20:53:39 2009
         Raid Level : raid5
      Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
         Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
       Raid Devices : 4
      Total Devices : 4
    Preferred Minor : 0
    
        Update Time : Sat Apr 20 13:22:27 2013
              State : clean
     Active Devices : 4
    Working Devices : 4
     Failed Devices : 0
      Spare Devices : 0
           Checksum : 6c8f71a3 - correct
             Events : 955190
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     1       8       17        1      active sync   /dev/sdb1
    
       0     0       8      113        0      active sync   /dev/sdh1
       1     1       8       17        1      active sync   /dev/sdb1
       2     2       8       97        2      active sync   /dev/sdg1
       3     3       8       33        3      active sync   /dev/sdc1
    

    root@mserver:~# mdadm --examine /dev/sdd1

    /dev/sdd1:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 2f331560:fc85feff:5457a8c1:6e047c67 (local to host mserver)
      Creation Time : Sun Feb  1 20:53:39 2009
         Raid Level : raid5
      Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
         Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
       Raid Devices : 4
      Total Devices : 2
    Preferred Minor : 0
    
        Update Time : Sat Apr 20 18:37:23 2013
              State : active
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 2
      Spare Devices : 0
           Checksum : 6c812869 - correct
             Events : 955205
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     0       8      113        0      active sync   /dev/sdh1
    
       0     0       8      113        0      active sync   /dev/sdh1
       1     1       0        0        1      faulty removed
       2     2       8       97        2      active sync   /dev/sdg1
       3     3       0        0        3      faulty removed
    

    root@mserver:~# mdadm --examine /dev/sde1

    /dev/sde1:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 2f331560:fc85feff:5457a8c1:6e047c67 (local to host mserver)
      Creation Time : Sun Feb  1 20:53:39 2009
         Raid Level : raid5
      Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
         Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
       Raid Devices : 4
      Total Devices : 2
    Preferred Minor : 0
    
        Update Time : Sun Apr 21 14:00:43 2013
              State : clean
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 2
      Spare Devices : 0
           Checksum : 6c90cc70 - correct
             Events : 955219
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     2       8       97        2      active sync   /dev/sdg1
    
       0     0       0        0        0      removed
       1     1       0        0        1      faulty removed
       2     2       8       97        2      active sync   /dev/sdg1
       3     3       0        0        3      faulty removed
    

    root@mserver:~# mdadm --examine /dev/sdf1

    /dev/sdf1:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 2f331560:fc85feff:5457a8c1:6e047c67 (local to host mserver)
      Creation Time : Sun Feb  1 20:53:39 2009
         Raid Level : raid5
      Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
         Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
       Raid Devices : 4
      Total Devices : 4
    Preferred Minor : 0
    
        Update Time : Sat Apr 20 13:22:27 2013
              State : clean
     Active Devices : 4
    Working Devices : 4
     Failed Devices : 0
      Spare Devices : 0
           Checksum : 6c8f71b7 - correct
             Events : 955190
    
             Layout : left-symmetric
         Chunk Size : 64K
    
          Number   Major   Minor   RaidDevice State
    this     3       8       33        3      active sync   /dev/sdc1
    
       0     0       8      113        0      active sync   /dev/sdh1
       1     1       8       17        1      active sync   /dev/sdb1
       2     2       8       97        2      active sync   /dev/sdg1
       3     3       8       33        3      active sync   /dev/sdc1
    

    I have some notes which suggest the drives were originally assembled as follows:

    md0 : active raid5 sdb1[1] sdc1[3] sdh1[0] sdg1[2]
          2930279808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
    

    [Edit3]

    Looking through the log it looks like the following happened (based on the Update Time in the --examine results):

    1. sdb and sdf were knocked out some time after 13:22 on the 20th
    2. sdd was knocked out some time after 18:37 on the 20th
    3. the server was shut down some time after 14:00 on the 21st

    Given that two disks went down (apparently) simultaneously, I think it should be reasonably safe to assume the array wouldn't have been written to after that point(?), and so it should be relatively safe to force it to reinstate in the correct order. What's the safest command to do that with, and is there a way to do it without writing any changes?

  • Jon Cage
    Jon Cage almost 11 years
    Tried that (and added -v to see what was going on), but all of the disks which should be added get responses along the following lines: mdadm: /dev/sdb1 is busy - skipping.
  • krizna
    krizna almost 11 years
    just stop md0 and re-assemble the array
  • Jon Cage
    Jon Cage almost 11 years
    tried that - still no luck (see my edit)
  • Stefan Seidel
    Stefan Seidel almost 11 years
    Ok, it looks like it thinks the RAID wasn't shut down properly. If you are sure it wasn't, try -R or -f. If that fails too, re-create the array using mdadm --create /dev/md0 --assume-clean <original create options> /dev/sd[dbfe]1. Be warned: all of these options may destroy your data.
  • Jon Cage
    Jon Cage almost 11 years
    We had a power cut not long before we moved house (which is the only reason I dismantled the array), so it's quite possible it wasn't shut down properly. Is this likely to be caused by the drives now being in a different order? i.e. would it be safer to simply swap the drives back into the correct order? I'd rather not do anything which could result in a dead array until that's the last option.
  • Jon Cage
    Jon Cage almost 11 years
    Do the resync before or after the assemble?
  • Stefan Seidel
    Stefan Seidel almost 11 years
    After, because it wouldn't be possible to do it before.
  • Jon Cage
    Jon Cage almost 11 years
    Well I went for it and mdadm --assemble --scan --force worked. The array is back up and running and I have access to my data :)
  • Nathan V
    Nathan V about 4 years
    --assemble --force is what I needed to get an array back online today.