mdadm: Which disk is being rebuilt?

Solution 1

When an actual rebuild is being performed, the output of mdadm --detail shows which disk is active and which disk is being rebuilt (at the bottom):

# mdadm --detail /dev/md4
/dev/md4:
        Version : 0.90
  Creation Time : Wed May  4 17:27:03 2016
     Raid Level : raid1
     Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 4
    Persistence : Superblock is persistent

    Update Time : Thu May  5 10:32:11 2016
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 5% complete

           UUID : ef8e2106:7247b48b:06921ed9:9b69639a
         Events : 0.4788

    Number   Major   Minor   RaidDevice State
       2       8       65        0      spare rebuilding   /dev/sde1
       1       8       81        1      active sync   /dev/sdf1

In this case we can see that /dev/sde1 (spare rebuilding) is being rebuilt from /dev/sdf1 (active sync).
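If you want to pick this out in a script, the device table at the bottom of the output can be parsed. The following is only a rough sketch, assuming the table layout shown above (Number, Major, Minor, RaidDevice, state words, device path); the function name rebuilding_devices and the SAMPLE text are made up for illustration and are not part of mdadm.

```python
# Sketch: find rebuild targets and in-sync sources in `mdadm --detail` text.
# Assumes the device-table layout shown above; other mdadm versions may differ.

SAMPLE = """\
    Number   Major   Minor   RaidDevice State
       2       8       65        0      spare rebuilding   /dev/sde1
       1       8       81        1      active sync   /dev/sdf1
"""


def rebuilding_devices(detail_text):
    """Return (targets, sources): devices being rebuilt and devices in sync."""
    targets, sources = [], []
    in_table = False
    for line in detail_text.splitlines():
        fields = line.split()
        if fields[:2] == ["Number", "Major"]:
            in_table = True  # device table starts after this header row
            continue
        # Skip everything before the table and rows without a device path
        # (for example a "removed" slot).
        if not in_table or len(fields) < 6 or not fields[-1].startswith("/dev/"):
            continue
        state = " ".join(fields[4:-1])  # e.g. "spare rebuilding", "active sync"
        if "rebuilding" in state:
            targets.append(fields[-1])
        elif state == "active sync":
            sources.append(fields[-1])
    return targets, sources


targets, sources = rebuilding_devices(SAMPLE)
print("being rebuilt:", targets)
print("in sync:", sources)
```

On a live system the text would come from running mdadm --detail on the array (as root), for example via subprocess.run with capture_output=True and text=True.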

Solution 2

I'll just go with the information presented by iostat since there's nothing else that would be accessing the individual disks outside of the software RAID:

# iostat
avg-cpu:  %user   %nice  %system  %iowait  %steal  %idle
          15.35   0.00   1.81     0.27     0.00    82.57

Device:         tps        Blk_read/s   Blk_wrtn/s   Blk_read    Blk_wrtn
sdb             219.27     996.77       19033.92     90847986    1734799374
sda             233.08     17037.32     3364.78      1552824003  306674334

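/dev/sdb is mostly being written to (Blk_wrtn/s) while /dev/sda is mostly being read from (Blk_read/s), so it looks like /dev/sdb is the drive that's being resynced here :)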
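The same comparison can be scripted. This is a hedged sketch, assuming the classic iostat column layout shown above (device, tps, read rate, write rate, ...); parse_iostat, write_share and likely_target are hypothetical helper names. Keep in mind that iostat run without an interval reports averages since boot, so running it with an interval (for example iostat 5 2) and using the second report gives a better picture of current activity.

```python
# Sketch: guess which disk is the resync target from iostat-style output.
# Assumes columns: Device, tps, read rate, write rate, ... as in the output above.

SAMPLE = """\
avg-cpu:  %user   %nice  %system  %iowait  %steal  %idle
          15.35   0.00   1.81     0.27     0.00    82.57

Device:         tps        Blk_read/s   Blk_wrtn/s   Blk_read    Blk_wrtn
sdb             219.27     996.77       19033.92     90847986    1734799374
sda             233.08     17037.32     3364.78      1552824003  306674334
"""


def parse_iostat(text):
    """Return {device: (read_rate, write_rate)} from the device section."""
    rates = {}
    in_devices = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            in_devices = False  # a blank line ends the device section
        elif fields[0].startswith("Device"):
            in_devices = True
        elif in_devices and len(fields) >= 4:
            rates[fields[0]] = (float(fields[2]), float(fields[3]))
    return rates


def write_share(read_rate, write_rate):
    """Fraction of the device's traffic that is writes."""
    total = read_rate + write_rate
    return write_rate / total if total else 0.0


def likely_target(rates):
    """Device with the highest share of writes (the probable resync target)."""
    return max(rates, key=lambda dev: write_share(*rates[dev]))


rates = parse_iostat(SAMPLE)
print("probable resync target:", likely_target(rates))
```

This is only a heuristic: it relies on nothing other than the RAID resync generating significant I/O on the member disks.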

Solution 3

The fact that both disks show as up (U) means that neither of them is degraded as far as md is concerned. Are you sure that this isn't just a regular array check action? If the array were recovering from a failure, I would expect it to say "recovery", not "resync".

https://raid.wiki.kernel.org/index.php/Resync

AFAIK any device that md considers to be "up" can receive reads/writes.
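To tell a resync from a recovery in a script, the progress line in /proc/mdstat can be parsed for the operation name. A minimal sketch, assuming the mdstat format shown in the question (sync_actions is a hypothetical helper name):

```python
# Sketch: report which operation (resync, recovery, check, reshape) each md
# array is running, based on /proc/mdstat text in the format shown in the question.
import re

SAMPLE = """\
md1 : active raid1 sda2[0] sdb2[1]
      955789176 blocks super 1.0 [2/2] [UU]
      [==============>......]  resync = 72.2% (690357504/955789176) finish=4025.9min speed=1098K/sec

md0 : active raid1 sda1[0] sdb1[1]
      20970424 blocks super 1.0 [2/2] [UU]

unused devices: <none>
"""


def sync_actions(mdstat_text):
    """Map array name -> (operation, percent done) for arrays showing progress."""
    actions = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)  # start of a new array stanza
            continue
        progress = re.search(
            r"\b(resync|recovery|check|reshape)\s*=\s*([\d.]+)%", line
        )
        if progress and current:
            actions[current] = (progress.group(1), float(progress.group(2)))
    return actions


print(sync_actions(SAMPLE))
```

On a live system the text would be read from /proc/mdstat. The kernel also exposes the current operation per array in /sys/block/<array>/md/sync_action, which may be simpler to read directly.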

Author: Richard Diaz

Updated on September 18, 2022

Comments

  • Richard Diaz over 1 year

    I noticed that my software RAID1 array was degraded and wanted to figure out which of the two disks in the array is being rebuilt/resynced, since both show as being up. I am hoping someone can shed some light on whether it's even possible to tell which disk in a software RAID1 is degraded and being rebuilt.

    # cat /proc/mdstat
    md1 : active raid1 sda2[0] sdb2[1]
      955789176 blocks super 1.0 [2/2] [UU]
      [==============>......]  resync = 72.2% (690357504/955789176) finish=4025.9min speed=1098K/sec
    
    md0 : active raid1 sda1[0] sdb1[1]
      20970424 blocks super 1.0 [2/2] [UU]
    
    unused devices: <none>
    
    # mdadm --detail /dev/md1
    /dev/md1:
            Version : 1.0
      Creation Time : Fri Dec  7 04:55:25 2012
         Raid Level : raid1
         Array Size : 955789176 (911.51 GiB 978.73 GB)
      Used Dev Size : 955789176 (911.51 GiB 978.73 GB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent
    
        Update Time : Fri Mar 29 23:41:16 2013
              State : active, resyncing 
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
    
      Resync Status : 72% complete
    
               Name : 
               UUID : 
             Events : 222
    
        Number   Major   Minor   RaidDevice State
           0       8        2        0      active sync   /dev/sda2
           1       8       18        1      active sync   /dev/sdb2
    

    Thanks in advance!

    • cjc about 11 years
      I don't think you can see in /proc/mdstat, but there should be something in the log files. Does /var/log/messages show a failure in sda or sdb?
    • Richard Diaz about 11 years
      I didn't find anything in /var/log/messages. Here's what I could find in dmesg: ata2.00: ACPI _SDD failed (AE 0x5) ata1.00: ACPI _SDD failed (AE 0x5)
    • tgharold over 10 years
      Install a tool like "atop" and see which drive is being heavily written to (the drive that is out of sync) or heavily read from (the source drive).