What does mdadm's "spare" number mean?


Solution 1

For the sake of clarity, I'll aggregate the information given by derobert and Alexandre Alves, together with some further testing of my own, here:

mdadm's --spare-devices parameter does work as the man page states, i.e. it defines the number of "hot spare" drives in an array. A "hot spare", as in normal RAID terminology, does not have anything to do with the extra drives present in a RAID 5 or RAID 6 array -- it is an extra drive meant to take over as soon as a drive in the array has failed.

The number of spare drives is given at array creation time. Later, it can be checked with mdadm --detail --scan.
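For concreteness, here is a minimal sketch of creating an array with one hot spare and then verifying the count (the device names and the 3+1 layout are placeholders, not taken from the question):

# Create a 3-disk RAID 5 plus one dedicated hot spare (placeholder device names)
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1 \
      --spare-devices=1 /dev/sde1

# Once the initial sync has finished, this should report spares=1
mdadm --detail --scan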

However, during the brief initialization period of an mdadm-based RAID 5, there is an optimization, described at https://raid.wiki.kernel.org/index.php/Initial_Array_Creation, that makes an additional spare drive appear in the output of that command:

"For raid5 there is an optimisation: mdadm takes one of the disks and marks it as 'spare'; it then creates the array in degraded mode. The kernel marks the spare disk as 'rebuilding' and starts to read from the 'good' disks, calculate the parity and determines what should be on the spare disk and then just writes to it."

After array initialization has finished, the number of spares reported goes back to the number selected at creation time.
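To see this for yourself, you can watch the initial sync and re-check the spare count once it has finished (a sketch; /dev/md0 stands for whatever array you created):

# While the initial sync is running, the array shows up as degraded/recovering
cat /proc/mdstat

# After the sync completes, the reported spare count drops back to the requested value
mdadm --detail --scan
mdadm --detail /dev/md0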

Solution 2

That output is correct. You created a RAID 5 with 5 disks (only 4 of these will be "used" for space), and you added an extra spare drive.

So you actually have a RAID 5 that allows one disk failure, plus an extra spare drive.
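As a rough sanity check against the numbers in the question: with five active 500 GB members, RAID 5 usable capacity is (5 - 1) × 500 GB = 2 TB (ignoring metadata overhead), which matches the df output; the sixth 500 GB drive sits idle as the hot spare.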

If what you actually want is a RAID 5 with 6 disks, giving you the space of 5 disks, then you need to change your command to:

mdadm --create /dev/md0 --level=5 --raid-devices=6 \
/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1

But in this case you can tolerate only one disk failure as per the specs of RAID 5.
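If you later want a hot spare on top of such a 6-disk array, note that a disk added to a complete, non-degraded array becomes a spare (a sketch; /dev/sdh1 is a hypothetical extra device):

# Added to an already-complete array, the new device becomes a hot spare
mdadm --add /dev/md0 /dev/sdh1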

EDIT: Adding the link to the official RAID wiki page; you can see a RAID 5 with 6 disks there, and it states spares=1 during creation: https://raid.wiki.kernel.org/index.php/Initial_Array_Creation

UPDATE: I created a RAID 5 on my system, and the spare value disappears once the array is in a clean state:

    Raid Devices : 4
   Total Devices : 4
           State : clean, degraded, recovering
  Active Devices : 3
 Working Devices : 4
  Failed Devices : 0
   Spare Devices : 1

Clean state:

    Raid Devices : 4
   Total Devices : 4
           State : clean
  Active Devices : 4
 Working Devices : 4
  Failed Devices : 0
   Spare Devices : 0

So it is as the OP commented: during initial RAID 5 creation, the array shows an extra spare drive until the build/sync is completed.
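If you want to watch for that moment, one simple option (assuming the watch utility is available) is:

# Refresh the resync progress every few seconds until the array reports a clean state
watch -n 5 cat /proc/mdstat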


Comments

  • jstarek, over 1 year ago

    I created an mdadm-based RAID 5 from six hard drives using the following command:

    # mdadm --create /dev/md0 --level=5 --raid-devices=5 \
    /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 \
    --spare-devices=1 /dev/sdg1
    

    I expected the array to have one hot spare, namely /dev/sdg1. However, checking mdadm --detail shows 2 spares:

    # mdadm --detail --scan 
    ARRAY /dev/md0 metadata=1.2 spares=2 name=...
    

    Also, the array size as shown by df is 2 TB, which would correspond to only four of my 500 GB drives being used.

    So what exactly are the semantics of --spare-devices? The man page states that it "Specif[ies] the number of spare (eXtra) devices in the initial array", but that does not seem to be the case here.

    • user, almost 11 years ago
      RAID 5 uses one device's worth of parity. I don't know, which is why I'm making this a comment, but could that possibly have something to do with it? (mdadm --detail including the parity drive in the "spares" count.) You could check this by making a RAID 6 array with no hot spare; if my theory holds, it too will show spares=2.
    • 200_success, almost 11 years ago
      In standard RAID terminology, a spare is just an inactive, mostly blank disk that can automatically spring into action to help rebuild the array after another disk fails. A parity disk is not a spare disk. Also, the number of spare disks is not the number of disks that can fail before your data is at risk. (That number is not well defined: a four-disk RAID 10 could handle up to 2 disk failures, but it might also be dead after 2 disk failures.)
    • derobert, almost 11 years ago
      I'm guessing you're seeing two spares because it's still doing the initial array init; it is "rebuilding" onto one of the two spares. Once that is done (check progress with cat /proc/mdstat), I think you'll see the expected 1.
    • jstarek, almost 11 years ago
      @derobert, you are right: after the initial rebuild finished, I got ARRAY /dev/md0 metadata=1.2 spares=1 name=[...] from mdadm --detail --scan and equivalent information from /proc/mdstat.
  • jstarek, almost 11 years ago
    I think the key information from the wiki page you linked to is this: "For raid5 there is an optimisation: mdadm takes one of the disks and marks it as 'spare'; it then creates the array in degraded mode." -- this would fit nicely with the observation made by derobert in the comments above. Apart from that, I'm afraid you misunderstood my question: I did not want to have 5 drives' worth of space. Mentioning the 2 TB was just an observation.