mdadm and 4k sectors (advanced format)

Yes, it is made for 4k sector alignment.

With 1.1 and 1.2 superblocks, space is reserved at the start of each disk so that the superblock doesn't get trampled. The superblock creation code forces this reserved space to be a multiple of 4 KiB (eight 512-byte sectors). All physical reads are offset from the end of this reserved space, not from the end of the superblock, so the length of the superblock itself does not affect alignment. This preserves the alignment for any sector size that divides evenly into 4 KiB.

If you're interested, here is the proof from the mdadm source code (super1.c):

/* force 4K alignment */
reserved &= ~7ULL;
sb->data_offset = __cpu_to_le64(reserved);
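
To make the effect of that mask concrete, here is a small Python sketch of the same bit operation. It assumes the offset is counted in 512-byte sectors, as the `--examine` output in the question suggests; the function name `force_4k_alignment` is made up for illustration and is not part of mdadm.

```python
SECTOR_BYTES = 512  # assumption: md offsets are counted in 512-byte sectors


def force_4k_alignment(reserved_sectors: int) -> int:
    """Python equivalent of `reserved &= ~7ULL`: clear the low three bits,
    which rounds the value down to a multiple of 8 sectors (4096 bytes)."""
    return reserved_sectors & ~7


if __name__ == "__main__":
    for reserved in (8, 13, 2048, 2055):
        aligned = force_4k_alignment(reserved)
        # Every result is a whole number of 4096-byte blocks.
        print(reserved, "->", aligned, "sectors,", aligned * SECTOR_BYTES, "bytes")
```

Values that are already multiples of 8 pass through unchanged; anything else is rounded down, for example 13 becomes 8.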

And this data_offset parameter is used by the RAID1 code in the kernel to offset the physical reads, e.g. in the read path:

read_bio->bi_sector = r1_bio->sector + mirror->rdev->data_offset;
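
The consequence of that addition can be illustrated with a short Python sketch. It is only a model of the arithmetic, not kernel code: the helper names are hypothetical, sectors are assumed to be 512 bytes, and the member device is assumed to be a whole disk (as in the question), so no partition start offset is added.

```python
SECTORS_PER_4K = 8  # 4096 bytes / 512-byte sectors


def physical_sector(array_sector: int, data_offset: int) -> int:
    """Model of the read path: the member-device sector is the array
    sector plus the per-device data offset."""
    return array_sector + data_offset


def is_4k_aligned(sector: int) -> bool:
    return sector % SECTORS_PER_4K == 0


if __name__ == "__main__":
    # With a data offset that is a multiple of 8 sectors, 4K-aligned
    # array sectors stay 4K-aligned on the member device.
    for sector in (0, 8, 4096):
        print(sector, is_4k_aligned(physical_sector(sector, 2048)))

    # Hypothetical misaligned offset, for contrast only.
    print(8, is_4k_aligned(physical_sector(8, 2047)))
```

If the array member were a partition instead of a whole disk, the partition's starting sector would also have to be a multiple of 8 for the same conclusion to hold.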
Author: Halfgaar

Updated on September 18, 2022

Comments

  • Halfgaar
    Halfgaar over 1 year

    There are numerous questions on Serverfault about aligning 4k sectors disks, but one thing is not really clear to me yet.

    I successfully aligned my RAID1+LVM. One of the things I did was use mdadm superblock version 1.0 (which stores the superblock at the end of the disk).

    The manpage says this:

    The different sub-versions store the superblock at different locations on the device, either at the end (for 1.0), at the start (for 1.1) or 4K from the start (for 1.2). "1" is equivalent to "1.0". "default" is equivalent to "1.2".

    Is the 1.2 version, which is the default, made for 4k sector drives? The way I see it, it is not, because 4k from the start plus the length of the superblock is not a multiple of 4k (the superblock is about 200 bytes long, if I remember correctly).

    Any insight into this is welcome.

    edit:

    The answer below says that mdadm superblock versions 1.1 and 1.2 are meant for 4k alignment. I just created a whole-device RAID with:

    mdadm --create /dev/md4 -l 1 -n 2 /dev/sdb /dev/sdd
    

    Then I created a volume group on it:

    vgcreate universe2 /dev/md4
    

    The array is syncing at 16 MB/s:

    md4 : active raid1 sdd[1] sdb[0]
          1465137424 blocks super 1.2 [2/2] [UU]
          [>....................]  resync =  0.8% (13100352/1465137424) finish=1471.6min speed=16443K/sec
    

    So I doubt it is properly aligned.

    (disks are 1.5 TB WD EARS. I have them in my desktop PC and they synced at about 80 MB/s.)

    Edit2:

    Here's --examine output:

    # mdadm --examine /dev/sdb
    /dev/sdb:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 79843828:7d939cce:1c8f0b32:cf339870
               Name : brick:4  (local to host brick)
      Creation Time : Sat Jul  9 10:47:33 2011
         Raid Level : raid1
       Raid Devices : 2
    
     Avail Dev Size : 2930275120 (1397.26 GiB 1500.30 GB)
         Array Size : 2930274848 (1397.26 GiB 1500.30 GB)
      Used Dev Size : 2930274848 (1397.26 GiB 1500.30 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : active
        Device UUID : dd2e3b5f:33214b96:1cb88169:25deb050
    
        Update Time : Sat Jul  9 10:49:06 2011
           Checksum : 4f7cd785 - correct
             Events : 1
    
    
       Device Role : Active device 0
       Array State : AA ('A' == active, '.' == missing)
    

    Data offset is 2048 sectors, which is divisible by 8, so one would think it's OK. The volume group has a physical extent size of 4 MiB, which is also a multiple of 4 KiB. But that wouldn't even matter, because the resync is not related to what the device contains.

    Another edit: it doesn't appear to be an alignment issue, since hdparm -t shows a very low read speed (30 MB/s) for one of the disks. Something else is amiss.

    Edit3: I never remember to update this post when I find the answer. All is nicely aligned. One of the disks was broken. Apparently it was on its last legs, and even those gave out at some point. A replacement disk worked fine.

  • Halfgaar
    Halfgaar almost 13 years
    If both 1.1 and 1.2 are suitable for 4k alignment, what is the 1.2 version good for? I mean, why would I want to have the superblock start 4k from the start?
  • Naman Bansal
    Naman Bansal almost 13 years
    It's so the start of the disk can be reserved for boot blocks, allowing the disk to be used as a boot disk.
  • Halfgaar
    Halfgaar almost 13 years
    I just updated my post. By the looks of it, my new array is not properly aligned.
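
The figures quoted from `mdadm --examine` in the question can be checked with a few lines of Python. This is only a sanity check of the reported numbers, assuming 512-byte sectors; the variable names are illustrative. It does not cover where LVM places its first physical extent inside the array, which the post does not show.

```python
SECTOR_BYTES = 512   # assumption: --examine reports 512-byte sectors
ALIGN_BYTES = 4096   # physical sector size of an Advanced Format drive

# Values copied from the --examine output and the post.
data_offset_sectors = 2048
super_offset_sectors = 8
pe_size_bytes = 4 * 1024 * 1024  # 4 MiB physical extent size

data_offset_bytes = data_offset_sectors * SECTOR_BYTES
super_offset_bytes = super_offset_sectors * SECTOR_BYTES

if __name__ == "__main__":
    print("data offset :", data_offset_bytes, "bytes,",
          "aligned" if data_offset_bytes % ALIGN_BYTES == 0 else "misaligned")
    print("super offset:", super_offset_bytes, "bytes")
    print("extent size :", pe_size_bytes, "bytes,",
          "aligned" if pe_size_bytes % ALIGN_BYTES == 0 else "misaligned")
```

The data offset works out to exactly 1 MiB and the superblock offset to 4 KiB, which matches the "4K from the start" wording in the manpage and the poster's final conclusion that the array was aligned.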