How can I find out where a file is physically located on the disk (block numbers)?

16,522

Solution 1

You could use debugfs for this:

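debugfs -R "stat /path/to/myfile" /dev/hda1

Change the device/partition accordingly. Note that debugfs interprets the path relative to the root of the filesystem on that device, and that a ~ inside the quoted request is not expanded to your home directory. debugfs opens the filesystem read-only by default, so unmounting is a best practice rather than a requirement, but on an active filesystem the results may be out of date if the file is changing. You will get a list of all the blocks used: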

BLOCKS:
(0):1643532
TOTAL: 1

Solution 2

You can use the FIBMAP ioctl, as exemplified here, or use hdparm:

/ $ sudo /sbin/hdparm --fibmap /etc/X11/xorg.conf

/etc/X11/xorg.conf:
 filesystem blocksize 4096, begins at LBA 0; assuming 512 byte sectors.
 byte_offset  begin_LBA    end_LBA    sectors
           0    1579088    1579095          8
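
To compare the hdparm figures with the filesystem block numbers that debugfs reports, divide the LBA (relative to where the filesystem begins) by the number of sectors per block. The following is only a sketch, assuming the 4096-byte blocks, 512-byte sectors, and starting LBA of 0 shown in the output above; the function name lba_to_fs_block is made up for illustration:

```shell
# Convert an LBA reported by hdparm to a filesystem block number.
# Usage: lba_to_fs_block LBA [FS_START_LBA] [BLOCK_SIZE] [SECTOR_SIZE]
# Defaults mirror the example output: start LBA 0, 4096-byte blocks,
# 512-byte sectors. Adjust them to whatever hdparm prints on your system.
lba_to_fs_block() {
    lba=$1
    start=${2:-0}
    block_size=${3:-4096}
    sector_size=${4:-512}
    # Integer division: every LBA inside a block maps to the same block number.
    echo $(( (lba - start) * sector_size / block_size ))
}

# begin_LBA from the example above
lba_to_fs_block 1579088
```

With the example values this prints 197386, and end_LBA 1579095 maps to the same block, which is consistent with the 8 sectors reported.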

Solution 3

This thread may give you some insight into the ext4 file placement algorithm.

debugfs has a bmap function, which seems to give the data you want. You should be able to give it consecutive logical block numbers of a file and get the corresponding physical block numbers.
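
A minimal sketch of how the bmap request could be looped over the first few logical blocks of a file. The function name map_blocks, the DEBUGFS override, and the path and device in the usage comment are all hypothetical; the path is taken relative to the root of the filesystem on the device, and running debugfs against a real device typically needs root:

```shell
# Print the physical block for each of the first COUNT logical blocks of a file.
# Usage: map_blocks PATH_IN_FS DEVICE COUNT
# Set DEBUGFS=echo to preview the commands instead of running debugfs.
map_blocks() {
    path_in_fs=$1
    device=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # One debugfs invocation per logical block number.
        "${DEBUGFS:-debugfs}" -R "bmap $path_in_fs $i" "$device"
        i=$((i + 1))
    done
}

# Example (hypothetical path and device):
# map_blocks /home/user/myfile /dev/hda1 4
```

Starting debugfs once per block is slow for large files; it keeps the sketch simple, and for anything bigger the extent-based tools are likely a better fit.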

Solution 4

The question is rather old, but here is another answer that could be useful for those finding this via Google: filefrag (in Debian, it is in the e2fsprogs package).

# filefrag -eX /usr/bin/aptitude
Filesystem type is: ef53
File size of /usr/bin/aptitude is 4261400 (1041 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..     1fa:    15bd805..   15bd9ff:    1fb:            
   1:      1fb..     3f2:    15c6608..   15c67ff:    1f8:    15bda00:
   2:      3f3..     410:    15c8680..   15c869d:     1e:    15c6800: last,eof
/usr/bin/aptitude: 3 extents found

It has the advantage that it also works on other filesystems (I used it for UDF), which do not appear to be supported by the other tools described here.

The offsets in the output are expressed in multiples of the block size shown in the second line (4096 here); because of the -X option, they are printed in hexadecimal. Beware that logical offsets might not be contiguous, as a file can have holes in it (when supported by the filesystem).
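
To turn one of these hexadecimal block numbers into a byte offset on the device that holds the filesystem, multiply it by the block size. A small sketch, assuming the 4096-byte block size from the output above; the function name hex_block_to_bytes is made up for illustration:

```shell
# Convert a hexadecimal block number (as printed by filefrag -X) to a byte offset.
# Usage: hex_block_to_bytes HEX_BLOCK [BLOCK_SIZE]
# BLOCK_SIZE defaults to 4096, the value in the example output.
hex_block_to_bytes() {
    echo $(( 0x$1 * ${2:-4096} ))
}

# Start of the first extent in the example above
hex_block_to_bytes 15bd805
```

For the first extent this prints 93373616128. The three extent lengths (1fb, 1f8 and 1e, that is 507 + 504 + 30) add up to the 1041 blocks reported in the second line of the output.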

Author: Rick Koshi

Updated on September 18, 2022

Comments

  • Rick Koshi
    Rick Koshi over 1 year

    This is an obscure question, I know. I'm trying to do some performance testing of some disks on a Linux box. I'm getting some inconsistent results, running the same test on the same disk. I know that disks have different performance depending on which part of the disk is being accessed. In particular, reads and writes to the outside of the disk have much higher throughput than reads and writes to the inside part of the disk, due to near-constant data density and constant rotational speed.

    I'd like to see if my inconsistencies can be attributed to this geometry-induced variance in throughput. Is it possible, using existing tools, to find out where on the disk a file has been placed?

    If not, I suppose I can write something to directly seek, read, and write to the device file itself, bypassing (and destroying) the filesystem, but I'm hoping to avoid that. I'm currently using ext4 on a 3.0 kernel (Arch Linux, if it matters), but I'm interested in techniques for other filesystems as well.

    • Sirex
      Sirex over 12 years
      Who says files are in one place? If they get fragmented (which they usually do), they can end up all over.
    • Rick Koshi
      Rick Koshi over 12 years
      Absolutely. But they're still someplace :-) And in my particular case, writing files to a newly-created filesystem, they're quite likely to be (mostly) unfragmented.
    • Jason C
      Jason C almost 10 years
      You can't do this. The best you can get is the LBA block numbers of the files, which don't necessarily correspond to specified physical locations (at least not in a way that you can determine, as drives don't publish this mapping). There are other things, too, for example, blocks 3-5 may be consecutively numbered, but 4 may have been reallocated to a completely different location on the drive because the original sector at 4 was physically damaged, etc. You cannot get the information you are looking for unless the drive manufacturer is willing to give you detailed address specs.
  • Rick Koshi
    Rick Koshi over 12 years
    Unfortunately, nothing output by stat is the information I need. Size in bytes and blocks, inode number, permissions... None of these reflect which blocks contain the file's data. As an example, my test files (which are all the same size) all show exactly the same data, except for inode number and access/modification times.
  • ThexTallxDude
    ThexTallxDude over 12 years
    Yes, you are right, I'm sorry, I didn't read properly. I changed my answer to something more appropriate.
  • Rick Koshi
    Rick Koshi over 12 years
    This is perfect, thanks. I'm not sure why you said to make sure the drive is unmounted, though. According to the manual page, debugfs opens in read-only mode by default, so this command should be completely safe even on an active filesystem. It might provide questionable results if the queried file is being actively changed at the time, of course, but no other problems should result. Have I missed something?
  • Rick Koshi
    Rick Koshi over 12 years
    hdparm does indeed give me what I need, and in a somewhat more readable format than debugfs. I had to go find it, though, since it's not installed (on Arch Linux) by default. debugfs is part of e2fsprogs (same package that gives us mkfs and fsck), so is installed by default.
  • Rick Koshi
    Rick Koshi over 12 years
    Thanks for the pointer to the thread about ext4 file placement. That was enlightening. :-)
  • APR
    APR over 12 years
    No, you are right. It's more of a 'best practice' than a must. If you are doing it on an active filesystem, files may change, etc.
  • Jason C
    Jason C almost 10 years
    The LBA block number does not tell you where the file is physically located on the disk. These days conversion from LBA to physical location is generally not possible, due to the complexity of the physical geometry of modern drives, behind-the-scenes sector reallocations, etc. Generally speaking it's usually a safe bet that for disc-based media lower LBAs are towards the outside of the drive, but that's just because that layout has been typical in the past, back in CHS addressing days. Modern drives don't even publish real CHS geometry any more, because they can't.
  • Jason C
    Jason C almost 10 years
    The LBA does not tell you where the file is physically located on the drive. It is not possible to get information about actual physical mapping of LBAs.
  • dashesy
    dashesy over 8 years
    What about FAT file systems?
  • dashesy
    dashesy over 8 years
    I get this on fat: HDIO_GETGEO failed: Inappropriate ioctl for device
  • Admin
    Admin almost 2 years
    And if you want to read the contents of the file, given a 4096-byte block size and the physical_offset converted to decimal, you could use the following command:

    sudo dd if=/dev/<your disk> bs=8M skip=$((<physical offset in decimal>*4096)) iflag=skip_bytes | head -c 4096

    This prints the entire block (which contains part of the file).