mdadm: Which disk is being rebuilt?

linux software-raid mdadm raid1

10,835

Solution 1

When an actual rebuild is being performed, the output of mdadm --detail shows which disk is active and which disk is being rebuilt (at the bottom):

# mdadm --detail /dev/md4
/dev/md4:
        Version : 0.90
  Creation Time : Wed May  4 17:27:03 2016
     Raid Level : raid1
     Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 4
    Persistence : Superblock is persistent

    Update Time : Thu May  5 10:32:11 2016
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 5% complete

           UUID : ef8e2106:7247b48b:06921ed9:9b69639a
         Events : 0.4788

    Number   Major   Minor   RaidDevice State
       2       8       65        0      spare rebuilding   /dev/sde1
       1       8       81        1      active sync   /dev/sdf1

In this case we can see that /dev/sde1 (spare rebuilding) is being rebuilt from /dev/sdf1 (active sync).
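If you want to pull just the member states out of that report, for example in a monitoring script, something along these lines may work. It is only a sketch: `md_member_states` is a hypothetical helper name, and it assumes the member table looks like the one above (a header row containing "RaidDevice", then one row per member ending in the device path).

```shell
# Hypothetical helper: reads `mdadm --detail` output on stdin and prints
# "<device> <state>" for each member row of the device table.
md_member_states() {
    awk '
        # The member table starts after the header row containing "RaidDevice".
        /RaidDevice/ { in_table = 1; next }
        # Member rows: four numeric columns, one or more state words, device path.
        in_table && NF >= 6 && $NF ~ /^\/dev\// {
            state = $5
            for (i = 6; i < NF; i++) state = state " " $i
            print $NF, state
        }
    '
}

# Usage (needs root and a real array):
#   mdadm --detail /dev/md4 | md_member_states
```

Piping the result through `grep rebuilding` would leave only the disk that is currently being rebuilt.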

Solution 2

I'll just go with the information presented by iostat since there's nothing else that would be accessing the individual disks outside of the software RAID:

# iostat
avg-cpu:  %user   %nice  %system  %iowait  %steal  %idle
          15.35   0.00   1.81     0.27     0.00    82.57

Device:         tps        Blk_read/s   Blk_wrtn/s   Blk_read    Blk_wrtn
sdb             219.27     996.77       19033.92     90847986    1734799374
sda             233.08     17037.32     3364.78      1552824003  306674334

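sda is mostly being read (17037.32 Blk_read/s) while sdb is mostly being written (19033.92 Blk_wrtn/s), so /dev/sdb looks like the degraded drive that's being rebuilt here :)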
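A rough way to automate that comparison is to sort the devices by write rate. This is a sketch under assumptions: `iostat_by_writes` is a hypothetical helper name, and it assumes column 3 is the read rate and column 4 the write rate, as in the report above (check the header line of your sysstat version). Keep in mind that the first report `iostat` prints covers the time since boot, so a report taken over an interval is more telling; if several reports are piped in, only the last one is used.

```shell
# Hypothetical helper: reads iostat device reports on stdin and prints
# "<device> <read rate> <write rate>", busiest writer first.
# Assumption: column 3 = read rate, column 4 = write rate.
iostat_by_writes() {
    awk '
        /^Device/ { n = 0; in_table = 1; next }  # new report: discard earlier rows
        NF == 0   { in_table = 0; next }         # a blank line ends the table
        in_table  { rows[++n] = $1 " " $3 " " $4 }
        END       { for (i = 1; i <= n; i++) print rows[i] }
    ' | LC_ALL=C sort -k3,3nr
}

# Usage: two reports 5 seconds apart; the second reflects current activity.
#   iostat -d 5 2 | iostat_by_writes
```

During a rebuild, the disk at the top of the list should be the one being written to, and the other disk should show the matching read rate.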

Solution 3

The fact that both disks show as up (U) means that neither of them is degraded as far as md is concerned. Are you sure that this isn't just a regular array check action? If the array were recovering from a failure, I would expect it to say recovery, not resync.

https://raid.wiki.kernel.org/index.php/Resync

AFAIK any device that md considers to be "up" can receive reads/writes.
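To check quickly which action md reports for each array, the progress line in /proc/mdstat can be parsed. A minimal sketch, assuming the progress line has the form shown in the question (`resync = 72.2% ...`); `mdstat_actions` is a hypothetical helper name.

```shell
# Hypothetical helper: reads /proc/mdstat-style text on stdin and prints
# "<array> <action> <percent>" for every array with a sync action in progress.
mdstat_actions() {
    awk '
        $1 ~ /^md/ && $2 == ":" { array = $1 }   # remember the current array name
        {
            for (i = 1; i < NF; i++)
                if ($i ~ /^(resync|recovery|check|reshape)$/ && $(i + 1) ~ /^=/) {
                    # The percentage is usually its own field ("= 72.2%"),
                    # but may be glued to the equals sign ("=100.0%").
                    pct = ($(i + 1) == "=") ? $(i + 2) : substr($(i + 1), 2)
                    print array, $i, pct
                }
        }
    '
}

# Usage:
#   mdstat_actions < /proc/mdstat
```

On kernels that expose it, `cat /sys/block/md1/md/sync_action` reports the current action for a single array as well.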


Richard Diaz

Updated on September 18, 2022

Comments

Richard Diaz over 1 year

I noticed that my software RAID1 had degraded and wanted to figure out which of the two disks in the array is being rebuilt or resynced, since both show as up. I'm hoping someone can shed some light on whether it's even possible to tell which disk in a software RAID1 is degraded and being rebuilt.

# cat /proc/mdstat
md1 : active raid1 sda2[0] sdb2[1]
  955789176 blocks super 1.0 [2/2] [UU]
  [==============>......]  resync = 72.2% (690357504/955789176) finish=4025.9min speed=1098K/sec

md0 : active raid1 sda1[0] sdb1[1]
  20970424 blocks super 1.0 [2/2] [UU]

unused devices: <none>

# mdadm --detail /dev/md1
/dev/md1:
        Version : 1.0
  Creation Time : Fri Dec  7 04:55:25 2012
     Raid Level : raid1
     Array Size : 955789176 (911.51 GiB 978.73 GB)
  Used Dev Size : 955789176 (911.51 GiB 978.73 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Fri Mar 29 23:41:16 2013
          State : active, resyncing 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

  Resync Status : 72% complete

           Name : 
           UUID : 
         Events : 222

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2

Thanks in advance!

cjc about 11 years

I don't think you can see in /proc/mdstat, but there should be something in the log files. Does /var/log/messages show a failure in sda or sdb?

Richard Diaz about 11 years

I didn't find anything in /var/log/messages. Here's what I could find in dmesg:

ata2.00: ACPI _SDD failed (AE 0x5)
ata1.00: ACPI _SDD failed (AE 0x5)

tgharold over 10 years

Install a tool like "atop" and see which drive is being heavily written to (the drive that is out of sync) or heavily read from (the source drive).