How do I recover files from a single degraded mdadm raid1 drive? "not enough to start the array"


If it's RAID 1, and if you know the data offset (e.g. 2048 sectors; you can find the exact data offset with mdadm --examine /dev/sdb1), then you can create a read-only (to be safe) loop device like so:

# losetup --find --show --read-only --offset $((2048*512)) /dev/sdb1

/dev/loop7

Then check the printed loop device and, if that looks fine, mount it:

# fsck.ext3 -n -v /dev/loop7
# mount -o ro /dev/loop7 /mnt/recovery

mount might be able to do this directly with the -o ro,loop,offset= options, but I prefer to create the loop device manually, just to make sure it's really read-only.
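
For reference, the direct variant would look something like this (assuming the same 2048-sector data offset converted to bytes; /mnt/recovery is just an example mount point):

# mount -o ro,loop,offset=$((2048*512)) /dev/sdb1 /mnt/recovery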

If the filesystem itself is damaged, you should make an image using dd and run experiments such as fsck on the image. Alternatively, you can use the Linux network block device to put a copy-on-write layer on top of the disk, so you can fsck that layer without actually writing anything to the disk itself (nbd-server -c / nbd-client will create a /dev/nbdX device for you to play with). It might be possible to do the same with device mapper - but I've never tried it.
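
A rough sketch of the dd route (the destination path and block size are only examples; make sure the target filesystem has room for a full copy of the partition):

# dd if=/dev/sdb1 of=/mnt/backup/sdb1.img bs=1M conv=noerror,sync    (noerror,sync keeps going past read errors, padding with zeros)
# losetup --find --show --offset $((2048*512)) /mnt/backup/sdb1.img    (prints the loop device, e.g. /dev/loop8)
# fsck.ext3 /dev/loop8    (any repairs touch only the image, not the original disk)
# mount -o ro /dev/loop8 /mnt/recovery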

Comments

  • Bryce
    Bryce over 1 year

    Given a single raid1 drive in degraded/rebuilding state, can it be force-mounted? I'd like to recover all the files before undertaking the dangerous operation of pairing it with the other drive and rebuilding. As far as I can tell the drive is in perfectly good shape, fully intact; its partner drive has partially failed.

    If the drive were not in a rebuilding state, I'd know exactly what to do. Here is what I have tried:

    # mdadm --verbose --assemble /dev/md8 /dev/sdb1  --force
    mdadm: looking for devices for /dev/md8
    mdadm: /dev/sdb1 is identified as a member of /dev/md8, slot 1.
    mdadm: no uptodate device for slot 0 of /dev/md8
    mdadm: added /dev/sdb1 to /dev/md8 as 1
    mdadm: /dev/md8 assembled from 0 drives and  1 rebuilding - not enough to start the array.
    
    # cat /proc/mdstat                       
    md8 : inactive sdb1[1](S)
          976759808 blocks super 1.2          
    md0 : active raid1 sdc1[0]
          976759672 blocks super 1.2 [2/1] [U_]
    
    # mdadm --stop /dev/md8
    mdadm: stopped /dev/md8
    
    # mount /dev/sdb1 /mnt/temp2
    mount: unknown filesystem type 'linux_raid_member'
    
    # mount -o ro -t ext3 -b 2048 /dev/sdb1 /mnt/temp1
    mount: wrong fs type, bad option, bad superblock on /dev/sdb1.
    
    # foremost -i /dev/sdb -o /tmp/foo    (this results in perfectly good files)
    

    In this particular case the foremost command recovers files, so the data is definitely on the drive; I just need to get the superblock offset correct.

    And in this particular case assembling both halves of the array crashes the kernel(!), so that's not a real option anyway (aside from the safety issues).


    UPDATE: added the output of mdadm --examine

    # mdadm --examine /dev/sdb1
    /dev/sdb1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x2
         Array UUID : e00a291e:016bbe47:09526c90:3be48df3
               Name : ubuntu:0
      Creation Time : Wed May 11 12:26:39 2011
         Raid Level : raid1
       Raid Devices : 2
    
     Avail Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
         Array Size : 1953519344 (931.51 GiB 1000.20 GB)
      Used Dev Size : 1953519344 (931.51 GiB 1000.20 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
    Recovery Offset : 0 sectors
              State : clean
        Device UUID : 41346f44:ccacbbf7:0c17c133:eb7b341f
    
        Update Time : Sat Apr 13 00:02:08 2013
           Checksum : 483a0a44 - correct
             Events : 402833
    
    
       Device Role : Active device 1
       Array State : AA ('A' == active, '.' == missing)
    
    • frostschutz
      frostschutz about 11 years
      Output of mdadm --examine /dev/sdb1?
    • Bryce
      Bryce about 11 years
      Added to question.
    • frostschutz
      frostschutz about 11 years
      blockdev --getsize /dev/sdb1 is 1953521664 or larger? cat /proc/mdstat includes Personalities : [raid1]?
    • Bryce
      Bryce about 11 years
      @frostschutz yes to both.
    • Bryce
      Bryce about 11 years
      Each is bad in its own way. Between the two I hope to get everything.
  • Bryce
    Bryce about 11 years
    # mdadm /dev/md8 --grow --raid-devices=1 --force gives "mdadm: /dev/md8 is not an active md array - aborting"
  • Bryce
    Bryce about 11 years
    "mount -o ro -t ext3 /dev/loop7 /mnt/temp1" gives "mount: wrong fs type, bad option, bad superblock on /dev/loop7". A similar raid1 drive works with "mount -o ro -t ext3 -b 2048 /dev/sde1 /mnt/temp1".
  • Anthon
    Anthon about 11 years
    Have you tried mdadm --assemble --scan? That might work because md8 shows up in /proc/mdstat. You would still have to mount md8 afterwards.
  • frostschutz
    frostschutz about 11 years
    What does file -s say for the created loop device?
  • Bryce
    Bryce about 11 years
    I'd really prefer to just skip the raid complexity and just mount the underlying fs.
  • Bryce
    Bryce about 11 years
    "file -s" shows "Linux rev 1.0 ext3 filesystem data, UUID=ad88ff39-8f6c-4bb9-80de-bf56feae31b1 (needs journal recovery) (large files)". See also superuser.com/questions/256251/…
  • frostschutz
    frostschutz about 11 years
    So the loop device / offset should be good. Anything in dmesg when you try to mount? The filesystem may be damaged somehow.
  • Anthon
    Anthon about 11 years
    In that case you have to go with the offset. I tried recreating the error you get with a script, but even though I get "no uptodate device for slot 1 of /dev/md8", /dev/md8 starts (in degraded mode). I had that both with just removing the other drive and with assigning the other one to /dev/md0. I did have to reboot the machine often, as somehow the kernel keeps information about the raid partitions even when a raid device is stopped and zeroed. But even without rebooting I was never able to reproduce the problem.
  • Bryce
    Bryce about 11 years
    Yeah, it looks like the drive is bad, even though the SMART self-check passes: "fsck.ext3: Attempt to read block from filesystem resulted in short read while trying to re-open /dev/loop1"
  • dvkch
    dvkch almost 10 years
    Wow. Just wow. I've just spent hours reading everything I could to mount my single-drive broken raid0 and this is the only thing that worked! I created a loop device using the data offset found by mdadm --examine /dev/DRIVE, and can now mount it. Thank you a thousand times!