What's the point of hard drives reporting their physical sector size?

Solution 1

The 512-byte emulation is intended for compatibility with older systems. However, writes involving only part of a physical 4K sector can cause reduced performance because the sector needs to be read and modified before it can actually be written.

When a legacy operating system tries to write to an Advanced Format disk, performance issues can arise because the logical sectors written may not match up with the physical sectors.

  • When only part of a 4K physical sector is read, the data is simply read off the physical sector and there is no reduction in performance. However, when the system tries to write to part of a physical sector (e.g. an emulated 512-byte sector rather than the whole physical sector), the hard drive needs to read the whole physical sector, modify the changed portion in the hard drive's internal memory, and write it back to the platters. This is called read-modify-write (RMW), an operation which requires an extra rotation of the disk and therefore reduces performance. Seagate explains this as follows:

[...] the hard drive must first read the entire 4K sector containing the targeted location of the host write request, merge the existing data with the new data and then rewrite the entire 4K sector:

[Figure: Read-modify-write cycle]

In this instance, the hard drive must perform extra mechanical steps in the form of reading a 4K sector, modifying the contents and then writing the data. This process is called a read-modify-write cycle, which is undesirable because it has a negative impact on hard drive performance.
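
As a rough illustration of the point above, the following sketch works out which physical sectors a write covers only partially, and would therefore need a read-modify-write cycle on a 512e drive. The function name `rmw_sectors` and its parameters are hypothetical, and byte offsets are used for simplicity; real firmware behaviour (caching, write coalescing) may differ.

```python
def rmw_sectors(offset, length, physical=4096):
    """Return the physical sector numbers that a write covers only partially.

    offset and length are in bytes; physical is the physical sector size.
    Any sector in the result has to be read and merged before it is rewritten.
    """
    if length <= 0:
        return []
    end = offset + length              # exclusive end of the write
    first = offset // physical         # first physical sector touched
    last = (end - 1) // physical       # last physical sector touched
    partial = []
    if offset % physical != 0:         # write starts mid-sector
        partial.append(first)
    if end % physical != 0 and last not in partial:  # write ends mid-sector
        partial.append(last)
    return partial


# A single emulated 512-byte sector only covers part of physical sector 0.
print(rmw_sectors(0, 512))
# A full, aligned 4K write needs no read-modify-write at all.
print(rmw_sectors(0, 4096))
# A 4K write that starts 512 bytes in straddles two physical sectors.
print(rmw_sectors(512, 4096))
```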

Disk partitions that are not aligned to a 4K boundary can cause degraded performance as well.

  • Traditionally, the first partition on a hard disk starts at sector 63. Windows XP and older operating systems partitioned disks in this manner. Newer versions of Windows will create partitions on a 1 MB boundary, ensuring proper alignment to the physical sectors. This is called Alignment 0.

  • Because LBA 63 is not a multiple of 8 (eight 512-byte legacy sectors fit into a 4K sector), an Advanced Format disk which is formatted in the old manner will have clusters (the smallest unit of filesystem data allocation, typically 4K in size) that are not aligned to the physical sectors on a 4K disk, a condition called Alignment 1. As a result, an I/O operation that otherwise involves 4K of data now spans two physical sectors, leading to a read-modify-write operation that reduces performance.
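
The alignment arithmetic can be checked with a few lines of code. This is a minimal sketch assuming 512-byte logical sectors, 4K physical sectors and 4K clusters; the helper names are made up for illustration, and 2048 is simply the LBA that corresponds to a 1 MB boundary with 512-byte sectors.

```python
LOGICAL = 512     # bytes per logical (emulated) sector
PHYSICAL = 4096   # bytes per physical sector


def partition_is_aligned(start_lba, logical=LOGICAL, physical=PHYSICAL):
    """True if the partition's first byte falls on a physical sector boundary."""
    return (start_lba * logical) % physical == 0


def physical_sectors_touched(start_lba, cluster_index, cluster_size=4096,
                             logical=LOGICAL, physical=PHYSICAL):
    """Count the physical sectors that one filesystem cluster overlaps."""
    begin = start_lba * logical + cluster_index * cluster_size
    end = begin + cluster_size                     # exclusive
    return (end - 1) // physical - begin // physical + 1


for start in (63, 2048):
    print(start, partition_is_aligned(start), physical_sectors_touched(start, 0))
```

With a start of LBA 63 every 4K cluster overlaps two physical sectors; with a start of LBA 2048 each cluster maps onto exactly one.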

While information about physical sector size is unnecessary if the OS always writes data on a 4K boundary, this information may still be needed by applications which perform low-level I/O.

  • When a drive reports that its physical sector size is 4K, the OS or application can tell that it is an Advanced Format drive and therefore must avoid performing I/O operations that do not span full physical sectors. A drive that reports 512-byte native sectors does not impose this restriction. While newer operating systems will usually try to read or write data in 4K units whenever possible (making this information irrelevant), applications which perform low-level I/O may need to know the physical sector size so that they can adjust accordingly and avoid misaligned or partial-sector writes that cause slow RMW cycles.
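
For an application that wants this information, one way to obtain it on Linux is through sysfs, which exposes `logical_block_size` and `physical_block_size` under `/sys/block/<device>/queue/`. The sketch below assumes that layout; the function names and the `sysfs_root` parameter (there only to make the code testable) are illustrative. Note that the result reflects what the drive reports, which is not necessarily what the media really does.

```python
from pathlib import Path


def sector_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes as reported for a device."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


def classify(logical, physical):
    """Map the reported sizes onto the usual format names."""
    if logical == 512 and physical == 512:
        return "512n"   # reported as 512-byte native
    if logical == 512 and physical == 4096:
        return "512e"   # 4K physical sectors with 512-byte emulation
    if logical == 4096 and physical == 4096:
        return "4Kn"    # 4K native
    return "other"
```

On a Linux machine, `classify(*sector_sizes("sda"))` would then tell a low-level tool whether it should keep its writes to 4K-aligned multiples of 4K (the device name `sda` is just an example).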

Your SSD provides the ability to change the reported physical sector size because it is necessary for compatibility with certain storage arrays.

  • Datacenters often have storage arrays consisting of legacy 512n drives. 4K drives, even those that emulate 512-byte sectors, may not be compatible with such arrays, so this feature is necessary to ensure compatibility. See this forum thread:

    We can't just stick a 4K drive in an array formatted with 512b disks. Many arrays (most notably ZFS based storage, which is increasingly popular as software defined storage makes waves) will not accept a replacement drive with a different physical sector format.

    Note that better performance will be attained on modern systems if the drive is configured to use 4K sectors.

Solution 2

What benefit does an OS gain by being aware of the physical sector size when, regardless, the OS has to talk to the drive in 512-byte sectors?

The logical size is a minimum size to transfer data. Since this is a block device, any data transfer between host computer and drive will be in multiples of this logical block size.

The physical size is an optimal size to transfer data, and reflects the size of the actual read and write operations at the controller/drive level.

When the host computer requests a read of a logical sector, the controller/drive will perform a read operation of the physical sector that contains the logical sector.
When the logical sector size is equal to the physical sector size, the operation is simple. When the logical sector size is less than the physical sector size, the logical sector has to be extracted from the physical sector by the controller for transfer to the host computer.

When the host computer requests a write of a logical sector, the size of the physical sector matters.
When the logical sector size is equal to the physical sector size, the write operation is simple, and can proceed directly. The condition of the previous contents of the sector will not affect the write operation.

When the logical sector size is less than the physical sector size, the controller must first perform a read operation of the physical sector that contains the logical sector.
If the read is successful, then the logical sector is inserted into the physical sector, and the physical sector is written in its entirety.
If the read is not successful (even after retries), the write operation cannot be completed.

If the OS performs the read and write operations with the physical sector size (by utilizing the multi-sector operations available in the ATA command set), the write operations will be performed more efficiently (and without an unnecessary chance of incompletion).
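
The difference between the two write paths can be made concrete with a toy model. The class below is only a sketch of a 512e drive under simplifying assumptions (no cache, no read errors, an in-memory `bytearray` standing in for the media); the class and attribute names are invented for illustration. It counts the physical reads and writes each logical write request causes.

```python
class Emulated512eDrive:
    """Toy model: 512-byte logical sectors on top of 4096-byte physical sectors."""

    LOGICAL = 512
    PHYSICAL = 4096

    def __init__(self, physical_sectors):
        self.media = bytearray(physical_sectors * self.PHYSICAL)
        self.physical_reads = 0
        self.physical_writes = 0

    def write(self, lba, data):
        """Write data (a whole number of logical sectors) starting at logical sector lba."""
        if len(data) == 0 or len(data) % self.LOGICAL != 0:
            raise ValueError("data must be a non-empty multiple of the logical sector size")
        offset = lba * self.LOGICAL
        end = offset + len(data)
        if end > len(self.media):
            raise ValueError("write extends past the end of the drive")
        first = offset // self.PHYSICAL
        last = (end - 1) // self.PHYSICAL
        for sector in range(first, last + 1):
            start = sector * self.PHYSICAL
            stop = start + self.PHYSICAL
            lo, hi = max(start, offset), min(stop, end)
            if hi - lo < self.PHYSICAL:
                # Partial coverage: the old contents must be read first (RMW).
                self.physical_reads += 1
                buffer = bytearray(self.media[start:stop])
            else:
                # Full coverage: the previous contents do not matter.
                buffer = bytearray(self.PHYSICAL)
            buffer[lo - start:hi - start] = data[lo - offset:hi - offset]
            self.media[start:stop] = buffer
            self.physical_writes += 1


drive = Emulated512eDrive(physical_sectors=16)
drive.write(8, bytes(4096))    # aligned 4K write: no read, one physical write
drive.write(63, bytes(4096))   # misaligned 4K write: two reads, two physical writes
drive.write(0, bytes(512))     # single logical sector: one read, one physical write
print(drive.physical_reads, drive.physical_writes)
```

Only the first request, which matches a physical sector exactly, avoids the preliminary read.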

The LOGICAL sector size entirely defines how an OS can talk to a drive. No exceptions. What use is it knowing the physical sector size, when you're only allowed to communicate in logical sector size?

Your contention of "no exceptions" is incorrect.
The ATA command set, which was introduced with the IDE HDD, has always had the capability to perform read and write operations with a sector count parameter. This is merely an extension of existing disk and floppy controller interfaces that were also capable of multi-sector read/write operations (so long as the sectors were on the same track).

Solution 3

If the OS knows the underlying physical sector size, it can optimize its queries to require as few physical operations as possible. Particularly with SSDs, the physical operation limit (4KB IOPS limit) is often the ultimate limit of device speed, so being able to make best use of this capacity is important.

Solution 4

512/4096 (logical/physical) = the OS is responsible for alignment/optimization.

512/512 (logical/physical) = the drive is responsible for this.

See also : http://support.microsoft.com/en-us/kb/2510009

Solution 5

There are two different ways of accessing a location within a drive, one is the CHS scheme and the other is the LBA scheme.

CHS stands for Cylinder, Head, Sector and is the most low-level method of determining where to read or write from the drive. You tell it to use cylinder x, head y, and sector z and read or write the contents of that location to or from an address in the memory (a buffer). It is derived from the actual, physical components of a (traditional, spinning rust) hard drive, where you have physical cylinders and read heads. The sector is the smallest addressable unit, and was traditionally fixed at 512 bytes.

LBA is logical block addressing, wherein the drive reads from and writes to a sector identified by its offset from the start of the disk: for example, read the 123837th sector on the disk, or write this to the 123734th sector on the disk (counting from zero).

The problem? Each of these values is limited in range. In fact, because of how severely limited CHS was, LBA had to be introduced. For CHS, the cylinder number (C) can be at most 1023, the head count (H) at most 255, and the sector number (S) can only go up to 63, meaning you can have at most 1024 cylinders x 255 heads x 63 sectors x 512 bytes mapped in traditional CHS format, giving you a grand total of under 8 GiB! Using CHS, it's simply not possible to access a disk larger than 8 GiB!
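
The CHS ceiling is easy to verify. The sketch below takes 1024 cylinders (numbered 0 to 1023), 255 heads and 63 sectors per track (sector numbers start at 1, so 63 is both the highest number and the count); the variable names are just for illustration.

```python
SECTOR_BYTES = 512

cylinders = 1024           # cylinder numbers 0..1023
heads = 255                # usable head count
sectors_per_track = 63     # sector numbers 1..63

chs_limit_bytes = cylinders * heads * sectors_per_track * SECTOR_BYTES
chs_limit_gib = chs_limit_bytes / 2**30

print(chs_limit_bytes)            # 8422686720
print(round(chs_limit_gib, 2))    # 7.84
```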

So LBA was introduced. The MBR partition table stores LBA values in 32-bit fields, giving you 2^32 x 512 bytes or a 2 TiB limit on disk size. This is the reason an MBR disk cannot exceed 2 TiB: it uses CHS and 32-bit LBA to specify partition locations and sizes, and neither can address anything over 2 TiB.

Newer, better options have been introduced, like the GPT partitioning scheme, which extends LBA to 64 bits, giving you a heck of a lot more than you'll ever need at 2^64 x 512 bytes. But there's a catch: a lot of legacy hardware, operating systems, BIOS implementations, and drivers don't support UEFI or GPT, and a lot of people would like to have something that can be more easily upgraded to go past the 2 TiB limit without having to rewrite the entire stack from scratch. And, at long last, we reach the 4096-byte sector size.

See, throughout all the limitations discussed above, one thing has been a fixed assumption: the sector size. From day one, it has been 512 bytes and it's stayed that way ever since. But recently, hard disk manufacturers realized there's an opportunity to work some magic: take the traditional CHS or 32-bit LBA and simply replace the sector size with 4096 (4k) instead of 512 bytes. When an OS says "give me the 2nd sector on the disk" by requesting LBA 1 (because LBA 0 is the first), we aren't going to give it bytes 512 - 1023 but rather bytes 4096 - 8191.

Suddenly, our 2TiB limit is upgraded to 2^32 x 4096 bytes, or 16 TiB, without having to ditch MBR, switch to UEFI or GPT, or anything!
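
The same arithmetic for 32-bit LBA, as a small sketch (the helper name `max_addressable_bytes` is made up for this example):

```python
TIB = 1 << 40   # bytes in one TiB


def max_addressable_bytes(lba_bits, sector_bytes):
    """Largest capacity reachable with lba_bits of sector address."""
    return (1 << lba_bits) * sector_bytes


print(max_addressable_bytes(32, 512) // TIB)    # 2
print(max_addressable_bytes(32, 4096) // TIB)   # 16
```

Multiplying the sector size by eight multiplies the addressable capacity by eight, with no change to the width of the address field.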

The only catch is that if the OS isn't aware that this is a magic disk that uses 4096-byte sectors instead of 512-byte sectors, there's going to be a mismatch. Each time the OS says "hey, you, disk, write me these 512 bytes to offset xxx" the disk will use up 4096 bytes to store these 512 bytes (the rest being zeros or junk data, assuming you don't end up with a memory underflow) because they don't communicate in bytes, they communicate in sectors.

So BIOSes now (sometimes) include an option to let you manually specify that a 512-byte sector size should be used instead of the native 4096 byte sector size that newer disks are using - with the caveat that you cannot use it to access more than 2TiB of the disk on an MBR system, just like it was in the "good old days." But modern OSes that are 4k-aware can take advantage of all this to use this magic to read and write in 4096-byte chunks and voilà!

(An additional advantage is that things are a lot faster because if you're reading and writing 4096 bytes at a time, it's fewer operations to read or write, say, 4GiB of data.)

Author: misha256

Updated on September 18, 2022

Comments

  • misha256
    misha256 almost 2 years

    I have an SSD that can be configured to report its physical sector size to an OS in two different ways:

    Option 1: Logical = 512 Bytes, Physical = 512 Bytes

    Option 2: Logical = 512 Bytes, Physical = 4096 Bytes (4K)

    What benefit does an OS gain by being aware of the 4K physical sector size, considering:

    • The OS must talk to the drive in 512-byte sectors regardless

    • All modern OSes align to 4K and utilize 4K or multiples of 4K I/O regardless

    The setting seems pointless, because modern OSes are already optimized for 4K sector drives. Modern OSes don't need to "ask" a drive whether its sectors are 512b or 4K, because the OS does everything in a 4K-friendly way by default.

    For example, Windows 7 aligns partitions to 1MB (a multiple of 4K), NTFS cluster size is 4K or multiple thereof, and all I/O is done in 4K or multiple thereof. Windows doesn't give a damn what hard drive you have, it will apply the above behavior in all cases.

    Anyway... my SSD has this "physical sector size" setting and so it must be there for some good reason... it's the reason for this I'm looking for.

    BTW, for what it's worth, the drive is an Intel SSD DC S3510. The drive's datasheet says this (page 27):

    By using SCT command 0xD801 with State=0, Option=1, ID Word 106 can be changed from 0x6003 to 0x4000 (4KB physical sector size to 512B physical sector size support change).

    • Admin
      Admin over 8 years
      4096 bytes is the Advanced Format sector size. Advanced Format hard drives can do either; whether the drive emulates 512-byte sectors depends on the OS.
    • Admin
      Admin over 8 years
      Storage interfaces are treasure troves of legacy decisions... "4 KB Physical sector size" is not true either. Flash has physical sector sizes that usually exceed 256 kB. All reported sector sizes are (il)logical.
  • misha256
    misha256 over 8 years
    But why does an OS need to know the physical sector size, and what will it do differently, given that it has to talk to the drive in logical sectors anyway? It seems like absolutely useless information for an OS to know.
  • sawdust
    sawdust over 8 years
    "why does an OS need to know..." -- The logical size is a minimum size to transfer data. The physical size is an optimal size to transfer data, and reflects the size of the actual read/write operation at the drive level. "It seems like absolutely useless information..." -- Perhaps it seems "useless" to you because you're not developing or involved in an operating system?
  • sawdust
    sawdust over 8 years
    This really doesn't answer the question. Explaining CHS and LBA is irrelevant. This reads like a brain dump of what you know about "sectors". " From day one, it has been 512 bytes..." -- That's only true for the IBM PC.
  • davidgo
    davidgo over 8 years
    @sawdust I disagree - Even ignoring the (imho important) background about CHS and LBA, the succinct answer to your question is in the second last paragraph "But modern OSes that are 4k-aware can take advantage of all this to use this magic to read and write in 4096-byte chunks and voilà!" - ie the assumption in the question - that an OS has to talk in 512 byte chunks - is wrong.
  • misha256
    misha256 over 8 years
    @sawdust But the OS can't use the optimal transfer size you're talking about because the drive is hard-wired to 512-Byte logical sectors. Native 4K drives are different, they have 4K logical sectors and a supporting OS (e.g. Win 8.1) is forced to read and write in 4K logical sectors. But my drive is not a 4K logical drive. It's a 512-byte logical drive.
  • sawdust
    sawdust over 8 years
    @misha256 -- I've posted my own answer.
  • misha256
    misha256 over 8 years
    Argh, this can't be right. Modern OS' are inherently optimized. All of them use file-systems with "block" sizes (aka clusters) that are 2^n bytes, starting at 2^12 (i.e. 4K, think NTFS default). Following from this, all I/O operations end up being some multiple of 4K. Whether a disk is physically 512-Byte or 4K shouldn't make a difference. You can't optimize further than this, surely?
  • misha256
    misha256 over 8 years
    This might be the correct answer... but I'm still not convinced. Modern OS operate with file-systems and I/O block sizes of 4K and multiples of 4K. They are already optimized for use with hard drives that have 4K physical sectors. Furthermore, the I/O block sizes employed are still 4K and multiples of 4K even on a 512b physical hard drive. What gives?!
  • bwDraco
    bwDraco over 8 years
    What if the OS doesn't get the alignment right and an I/O operation winds up spanning two physical sectors? You'd get degraded performance.
  • misha256
    misha256 over 8 years
    Bingo! The sector count parameter you speak of... even the ancient Windows XP reads/writes in I/O block size of 8 sectors or multiples thereof. It's already fully optimized! That's why XP performs extremely well with SSDs so long as the partition is aligned. It's extremely 4K friendly. So the question still goes un-answered. What more can an OS do knowing the physical sector size is 4K. Remember, the OS is already optimized for 4K I/O.
  • Ross Ridge
    Ross Ridge over 8 years
    @misha256 If by default the OS happens to do all reads and writes in multiples of the physical sector size and these reads and writes are aligned on physical sector boundaries, then, yes, this information isn't that useful because it won't change the behaviour. If not, the information can be used to improve the performance of disk I/O.
  • sawdust
    sawdust over 8 years
    "They are already optimized ... " -- Not necessarily. The "start" sector would have to be always aligned with a physical sector. That is not guaranteed to be true when the OS was not cognizant of physical and logical sectors, but merely trying to be more efficient by using multi-sector operations.
  • misha256
    misha256 over 8 years
    @RossRidge I hear you, what you say makes good sense. So in the case of my SSD, the only tangible benefit I could ever see is that the "4K physical sector reporting" might inspire a modern OS to align partitions correctly. Windows aligns to 1MB (naturally a multiple of 4K) regardless, so at the end of the day I get zero benefit. Yet the SSD supports the changing of the setting. And only two options! 512b and 4K. Sheesh. What's the point.
  • misha256
    misha256 over 8 years
    @sawdust If you use a 4K file-system like NTFS (or multiple of 4K), on a partition that is aligned (all this is guaranteed with Windows Vista or later) you will NEVER see mis-aligned reads and writes. The "start" sector will, by definition, always align perfectly with the start of a physical sector.
  • Ross Ridge
    Ross Ridge over 8 years
    @misha256 No, it's not that simple. "Windows XP, Windows Server 2003, and Windows Server 2003 R2 do not support 512e or 4Kn media. While the system may boot up and be able to operate minimally, there may be unknown scenarios of functionality issues, data loss, or sub-optimal performance. Thus, Microsoft strongly cautions against using 512e media with Windows XP ..." msdn.microsoft.com/en-us/library/windows/desktop/…
  • misha256
    misha256 over 8 years
    @RossRidge No no, that's not true. Windows XP/2003 supports any drive that exposes 512-byte logical sector interface. You're interpreting that document in the wrong way. All that needs to be done (for best performance) is correct partition alignment. Some of MS' "cautions" are very misleading.
  • sawdust
    sawdust over 8 years
    @misha256 -- You're cherry picking the conditions, and then proclaiming that for all circumstances that this information is useless. Not everyone is going to use such an SSD with Windows and NTFS and >4k clusters. " NTFS doesn't even support less than 4K I/O" -- Not true. Cluster sizes of 512, 1024, and 2048 bytes are still options in my (up-to-date) copy of Win7 for NTFS. . .
  • Ross Ridge
    Ross Ridge over 8 years
    @misha256 If you know better than everyone, including Microsoft, then why are you wasting our time?
  • misha256
    misha256 over 8 years
    @davidgo At the driver level, the OS talks to the drive in n chunks of 512-byte chunks. The n is a number which, from Windows XP, is NEVER less than 8, and always a multiple of 8. Which means every OS from XP onwards, and I believe all modern Linux distros too, are already optimized for 4K drives. The smallest I/O is 4K, and all other I/O sizes are multiples of that.
  • misha256
    misha256 over 8 years
    @RossRidge I don't know better. That's why I'm asking. Most of the answers and comments here don't address the crux of the question. Intel put that setting onto my SSD for a reason. Not many drives even have this setting. Lots of SSDs report 512b logical AND 512b physical, even though physically they are 4K. Modern OS' don't have a problem with such drives. There will be a damn good reason for physical sector size reporting and I want to know what that reason is. If not from this forum, then I shall approach Intel directly.
  • misha256
    misha256 over 8 years
    @sawdust OK true, for certain drive sizes, you can manually force sector sizes less than 4K. But that would be a purposefully manual operation done by you. In all other cases the OS will go with 4K or multiples of 4K.
  • Mahmoud Al-Qudsi
    Mahmoud Al-Qudsi over 8 years
    I think I make it very clear that even if you group n sectors into one operation, you are still telling the disk to seek at 512-byte chunks meaning you're limited in how much you can seek. 4096 sectors solve the seek problem. I also clarified that the OS knowledge of block size is imperative, otherwise 512 bytes will be stored in 4096 chunks!
  • Mahmoud Al-Qudsi
    Mahmoud Al-Qudsi over 8 years
    Also, I think you're confused about logical vs physical. The physical is always either 512 or 4096. If the logical size is 4096 but the OS blindly assumes it's 512, you'll run into the problems I described. They must match.
  • misha256
    misha256 over 8 years
    The irony is that OS' that don't know how to align properly also won't be capable of querying a hard drive for "physical sector size". OS' that do know how to align properly don't need to query a hard drive for "physical sector size" because they align properly by default. E.g. Windows aligns to 1MB.
  • David Schwartz
    David Schwartz over 8 years
    @misha256 There's no incompatibility between what you said and what I said. It's true that beyond getting alignment right, most filesystems don't benefit much from knowing the physical sector size. Some databases do.
  • misha256
    misha256 over 8 years
    I have to say... I'm miffed. I've never seen a drive that lets you change the "physical sector size" reporting setting. I can't understand why such a setting needs to exist, considering the only options are 512b and 4K, and considering that modern OSs do everything the 4K way regardless of what kind of drive you use.
  • misha256
    misha256 over 8 years
    This is probably the best answer of the lot, but still, I think it's time to hunt down an Intel engineer and get an authoritative answer. Seems to me to be a highly esoteric thing.
  • misha256
    misha256 over 8 years
    @MahmoudAl-Qudsi You're talking about native 4K drives. I'm talking about 512 and 512e drives, where the logical sector size is 512-bytes. In those cases an OS must work with I/O commands in the form of n x 512-bytes. n can be any number up to 65535 I believe, but the optimal for drives using 4K physical sectors is 8, 16, 32, 64, etc. Windows XP, 7, 8, 10, Linux all do this by default out-of-the-box. They don't need to ask a drive for its physical sector size.
  • misha256
    misha256 over 8 years
    @DavidSchwartz Right, ok, so this may all be for the benefit of some esoteric OSes or file-systems used in data centers or the like? Some fancy RAID arrays maybe?
  • Wayne Jhukie
    Wayne Jhukie over 8 years
    Windows aligns new partitions, but upgrading from XP to 7 does not retroactively fix the partitions. And XP isn't dead yet. I have a point-of-sale system here running embedded XP off an SSD. I suspect Intel had some customer who needed a million SSDs that supported XP.
  • qasdfdsaq
    qasdfdsaq over 8 years
    While relevant for hard drives, this answer is irrelevant for SSDs. SSDs write/erase block sizes are several megabytes, so even the 4K "physical" isn't close to the real physical sector size.
  • MSalters
    MSalters over 8 years
    Confusing physical/logical drives with physical/logical sector sizes.
  • Wayne Jhukie
    Wayne Jhukie over 8 years
    @qasdfdsaq write size is not necessarily the same as erase size. 4K will be the granularity of the block "in use" tracking. Meanwhile I'm now convinced that the last part of this answer about ZFS is the correct one: utcc.utoronto.ca/~cks/space/blog/tech/…
  • Wayne Jhukie
    Wayne Jhukie over 8 years
    An article about raid controllers not supporting 4K: serverfault.com/questions/593742/…
  • Wayne Jhukie
    Wayne Jhukie over 8 years
    It does seem to be about non-windows ("esoteric") OS and RAID controllers.
  • David Schwartz
    David Schwartz over 8 years
    @misha256 Well, the biggest factor for typical systems is probably getting the alignment right. RAID is not particularly fancy these days, but most modern RAID controller firmware likely assumes 4KB physical sectors anyway. Disks aren't really designed around filesystems though. The OP asked the reason the disk reports this -- the reason is so that the OS can optimize around it. That it doesn't do much doesn't change that.
  • misha256
    misha256 over 8 years
    Excellent answer, I appreciate your research effort and for pointing me to the ZFS debacle.
  • DavidPostill
    DavidPostill almost 8 years
    Please quote the essential parts of the answer from the reference link(s), as the answer can become invalid if the linked page(s) change.
  • Small Boy
    Small Boy over 7 years
    @misha256 65535 x 512 bytes?! What you're saying is an OS is only accessing the first 32MiB !
  • misha256
    misha256 over 7 years
    @SmallBoy Good point, n can be far more than 65535, my mistake.
  • Ramhound
    Ramhound almost 7 years
    I am not entirely sure this answers the proposed question, it answers a question, just not the one asked by the author.
  • Ben N
    Ben N almost 7 years
    This is very interesting information, but it appears to be a response to a somewhat different question. Once you have sufficient reputation, you'll be able to comment everywhere. For an intro to our site, see the tour.
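
For reference, the two ID Word 106 values quoted from the datasheet in the question (0x6003 and 0x4000) can be decoded by hand. The sketch below assumes the usual layout of word 106 in the ATA IDENTIFY DEVICE data as I understand it: bit 15 clear and bit 14 set when the word is valid, bit 13 set when there are multiple logical sectors per physical sector, and bits 3:0 holding the exponent of that ratio as a power of two. The function name and the `logical_bytes` default are illustrative; check the ATA command set specification before relying on this.

```python
def decode_word_106(word, logical_bytes=512):
    """Return the reported physical sector size in bytes from IDENTIFY word 106."""
    if (word & 0xC000) != 0x4000:
        # Bit 15 must be 0 and bit 14 must be 1 for the word to carry valid data.
        raise ValueError("word 106 does not contain valid information")
    multiple_logical_per_physical = bool(word & 0x2000)   # bit 13
    exponent = (word & 0x000F) if multiple_logical_per_physical else 0
    return logical_bytes * (1 << exponent)


print(decode_word_106(0x6003))   # 4096
print(decode_word_106(0x4000))   # 512
```

Under those assumptions, 0x6003 means eight 512-byte logical sectors per physical sector (4K physical), and 0x4000 means one logical sector per physical sector, which matches the datasheet's description of the setting.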