Restore lost RAID1 volume on Synology with mdadm


Eventually (after days of exploration) I managed to force the array to work and to copy the data.

First, the cause was bad disk sectors - I suppose in the area of the RAID superblock and/or the partition table.

Second, I had to use dmesg to see errors during mdadm --assemble or mdadm --create:

 [Thu Mar 19 23:27:04 2015] end_request: I/O error, dev sdc, sector 9437194
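
A quick way to pull the affected sector numbers out of the kernel log (just a sketch; the grep patterns simply match the message above and may need adjusting for your kernel):

dmesg | grep -i "end_request: I/O error" | tail -n 20
# list only the affected sector numbers (assumes grep -o is available on the box)
dmesg | grep "I/O error" | grep -o "sector [0-9]*" | sort -u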

So I took the following steps to recover from the situation. Please keep in mind that I can't guarantee this approach is correct in all details, and it could even lead to data loss, but it helped me.

Bad Sectors

First of all, I took care of the bad disk sectors (I don't know why they weren't remapped automatically). This probably also caused some problems with the data on the other disk.

I checked several sectors around the faulty one:

hdparm --read-sector 9437191 /dev/sdc
...
hdparm --read-sector 9437195 /dev/sdc
....
hdparm --read-sector 9437199 /dev/sdc

And then fixed the bad ones (writing to a pending bad sector makes the drive remap it):

hdparm --yes-i-know-what-i-am-doing --write-sector 9437198 /dev/sdc
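
If many sectors are suspect, the same check can be scripted. A rough sketch, assuming hdparm exits non-zero when a sector cannot be read; the sector range here is only an example window around the one reported by dmesg:

for s in $(seq 9437184 9437199); do
    # read each sector; report the ones that fail
    hdparm --read-sector "$s" /dev/sdc >/dev/null 2>&1 || echo "unreadable sector: $s"
done
# each reported sector can then be overwritten (and remapped) with:
# hdparm --yes-i-know-what-i-am-doing --write-sector <sector> /dev/sdc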

/dev/sdc Partition Table

Next I wanted to restore and check the sdc partition table. I used testdisk, which is not part of the standard Synology distribution but can be installed from the SynoCommunity repository. After installation it can be run from the console as /usr/local/testdisk/bin/testdisk.

  1. Select a disk, then [EFI GPT] partition map.
  2. Analyze, then Quick Search. It found several partitions:
TestDisk 7.0-WIP, Data Recovery Utility, January 2015
Christophe GRENIER 
http://www.cgsecurity.org

Disk /dev/sdc - 3000 GB / 2794 GiB - CHS 364801 255 63
     Partition               Start        End    Size in sectors
 D MS Data                      256    4980607    4980352 [1.41.10-2219]
 P Linux Raid                   256    4980735    4980480 [md0]
 D Linux Swap               4980736    9174895    4194160
>P Linux Raid               4980736    9175039    4194304 [md1]
 P Linux Raid               9437184 5860523271 5851086088 [DiskStation:3]
  3. Marked all Linux Raid entries as P (primary) partitions and the rest as D (deleted), then wrote the partition table.

Finally, I ran partprobe /dev/sdc to update the kernel's partition table (without needing to reboot).
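
Before going further, it is worth double-checking that the kernel really picked up the restored layout (a quick sanity check, not strictly required):

# the restored sdc partitions should show up here
cat /proc/partitions | grep sdc
ls -l /dev/sdc*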

mdadm

Now it became possible to restore the RAID superblock.

mdadm --zero-superblock /dev/sdc3 

This cleared the old and probably damaged information about the RAID array. I think this action is dangerous in many cases.

mdadm --create /dev/md3 --verbose --assume-clean --metadata=1.2 --level=1 --raid-devices=2 /dev/sdc3 missing 

But in my case it restored the RAID1 with one disk available and without data loss. (Specifying missing for the second disk and --assume-clean prevent any resync from starting, and --metadata=1.2 matches the version reported by --examine.)
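
At this point I would verify that the re-created array looks sane before touching the file system (a sketch; the sizes should roughly match the ones from the old --examine output):

cat /proc/mdstat
mdadm --detail /dev/md3
mdadm --examine /dev/sdc3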

I don't know the reason, but the file system size (ext4) on md3 was slightly different from the physical size of md3. So I ran:

resize2fs /dev/md3

And then a file system check:

fsck.ext4 -f -C 0 /dev/md3
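
The size mismatch can be seen by comparing the ext4 block count with the size of the md device itself (a sketch, assuming dumpe2fs and blockdev are present on the box):

# file system size = block count * block size; it should not exceed the device size
dumpe2fs -h /dev/md3 | grep -i "block count"
dumpe2fs -h /dev/md3 | grep -i "block size"
blockdev --getsize64 /dev/md3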

And now it became possible to mount the array:

mount -t ext4 -o ro /dev/md3 /volume2

So I successfully copied all the data.
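
The final copy can be done with plain cp or rsync to any healthy volume; something along these lines (the destination /volume1/recovered is only an example, any healthy volume or external disk will do):

mkdir -p /volume1/recovered
cp -a /volume2/. /volume1/recovered/
# or, if rsync is installed:
# rsync -a /volume2/ /volume1/recovered/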



Comments

  • Andre, almost 2 years ago

    Several days ago I found my DS412+ in a fatal state. Volume1 had crashed, and the system volume too. Moreover, Volume2 disappeared from the system! It looks like Volume1 had no free space and could not relocate data from a couple of bad blocks to a new place, and that hurt the system data (it's just a theory).

    I managed to bring Volume1 back to life using the procedures described here (e2fsck, mdadm reassemble). By the way, I have to mention the new syno_poweroff_task command, which simplifies the process!

    Then I restored the system volume using the Synology GUI. Everything started working OK, except that I could not restore Volume2. It was a RAID1 array consisting of 2 disks of the same size. This is an excerpt from /etc/space_history*.xml dated right before the crash:

    <space path="/dev/md3" reference="/volume2" >
        <device>
            <raid path="/dev/md3" uuid="927afd83:*" level="raid1" version="1.2">
                <disks>
                    <disk status="normal" dev_path="/dev/sdc3" model="WD30EFRX-68AX9N0        " serial="WD-*" partition_version="7" slot="1">
                    </disk>
                    <disk status="normal" dev_path="/dev/sdd3" model="WD30EFRX-68AX9N0        " serial="WD-*" partition_version="7" slot="0">
                    </disk>
                </disks>
            </raid>
        </device>
        <reference>
            <volume path="/volume2" dev_path="/dev/md3">
            </volume>
        </reference>
    

    The RAID members (/dev/sdc3 and /dev/sdd3) are still in place, and it looks like they are OK, at least /dev/sdc3:

    DiskStation> mdadm --misc --examine /dev/sdc3
    /dev/sdc3:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 600cff1e:0e27a96d:883007c3:610e73ef
               Name : DiskStation:3  (local to host DiskStation)
      Creation Time : Thu Mar 19 22:21:08 2015
         Raid Level : raid1
       Raid Devices : 2
    
     Avail Dev Size : 5851088833 (2790.02 GiB 2995.76 GB)
         Array Size : 5851088512 (2790.02 GiB 2995.76 GB)
      Used Dev Size : 5851088512 (2790.02 GiB 2995.76 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : f0b910a0:1de7081f:dd65ec22:a2a16d58
    
        Update Time : Thu Mar 19 22:21:08 2015
           Checksum : a09b6690 - correct
             Events : 0
    
       Device Role : Active device 0
       Array State : A. ('A' == active, '.' == missing)
    

    I've tried a lot of tricks with mdadm, in many forms like these:

    mdadm -v --assemble /dev/md3 /dev/sdc3 /dev/sdd3
    mdadm --verbose --create /dev/md3 --level=1 --raid-devices=2 /dev/sdc3 /dev/sdd3 --force
    mdadm --verbose --create /dev/md3 --level=1 --raid-devices=2 /dev/sdc3 missing
    

    All of them resulted in something like this:

    mdadm: ADD_NEW_DISK for /dev/sdc3 failed: Invalid argument
    

    Is there any chance to restore the RAID volume? Or is there any chance to restore the data from the volume, for example by mounting the /dev/sdc3 member directly?

    More mdadm info:

    DiskStation> cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
    md2 : active raid1 sdb3[0]
          2925544256 blocks super 1.2 [1/1] [U]
    
    md1 : active raid1 sdb2[0] sdc2[1]
          2097088 blocks [4/2] [UU__]
    
    md0 : active raid1 sdb1[2] sdc1[0]
          2490176 blocks [4/2] [U_U_]