Do hard links really take up so much disk space?

The second thing you said is exactly correct. The file contents only exist once on disk. A hard link is an extra reference, which costs very little space - the size of a directory entry, which is the length of the filename plus a few bytes.
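
To see the "extra reference" idea directly, here is a minimal Python sketch. It assumes a filesystem that supports hard links; the helper name link_facts and the file names f1 and f2 are only for illustration. It writes one file, links to it, and reports what stat says about the two names.

```python
import os
import tempfile

def link_facts(directory):
    """Create a file and a hard link to it in `directory`; return what stat reports."""
    original = os.path.join(directory, "f1")
    linked = os.path.join(directory, "f2")
    with open(original, "wb") as fh:
        fh.write(b"x" * 300324)  # arbitrary payload, sized like the file in this answer
    os.link(original, linked)    # adds a second directory entry, copies no data
    a, b = os.stat(original), os.stat(linked)
    return {
        # Both names resolve to the same inode on the same device.
        "same_inode": (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino),
        # The inode's link count goes from 1 to 2.
        "link_count": a.st_nlink,
        "size": b.st_size,
    }

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(link_facts(tmp))
```

Both names should report the same inode and a link count of 2, which is the same thing the second column of ls -l shows.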

I don't know if this applies to OSX, but in the version of GNU coreutils I have handy, du is aware of hard links, so you can use it to get an accurate report of the total size of a set of files. If it finds multiple links to a file, it only adds that file to the total once. ls -l, on the other hand, adds up every directory entry it sees for its total line, so a hard-linked file is counted once per link.

$ ls -sl
total 296
296 -rw-r--r-- 1 user group 300324 Feb 17 19:08 f1
$ du
296     .
$ ln f1 f2
$ ls -sl
total 592
296 -rw-r--r-- 2 user group 300324 Feb 17 19:08 f1
296 -rw-r--r-- 2 user group 300324 Feb 17 19:08 f2
$ du
296     .
$ cp f1 f3
$ ls -sl
total 888
296 -rw-r--r-- 2 user group 300324 Feb 17 19:08 f1
296 -rw-r--r-- 2 user group 300324 Feb 17 19:08 f2
296 -rw-r--r-- 1 user group 300324 Feb 17 19:08 f3
$ du
592     .
$
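
The difference between the two totals can be sketched in Python. This is not how du is implemented, only an illustration of the idea of counting each inode once; the function name tree_usage is hypothetical, and it sums apparent file sizes (st_size) rather than allocated blocks, so its numbers will not match du exactly.

```python
import os
import stat

def tree_usage(root):
    """Return (naive_bytes, deduplicated_bytes) for regular files under root."""
    seen = set()          # (device, inode) pairs already counted
    naive = unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and other non-regular entries
            naive += st.st_size            # counts every name, like the ls total
            key = (st.st_dev, st.st_ino)
            if key not in seen:            # counts each inode once, like du
                seen.add(key)
                unique += st.st_size
    return naive, unique
```

Run against a directory laid out like the session above (one file, one hard link to it, one copy), the naive total covers three files and the deduplicated total covers two.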

The ultimate demonstration would be to create a huge file, more than half the size of the disk, and then see how many hard links you can create to it. It should be quite a lot, since each link costs only a directory entry.


Author: JVC

Updated on September 18, 2022

Comments

  • JVC
    JVC almost 2 years

    I've found that I need to use hard links with a particular program (Ableton Live) that is unable to see aliases/symlinks, which is of course how I have all my working files organized. But making hard links is creating what appears to be duplicates of the original file.

    Do they actually take up as much space as the original? Or is the filesystem (OSX in this case) merely showing the size of the actual data on disk, and the fact that it is being referenced in two places does not actually double the amount of data?

    • Wildcard
      Wildcard over 6 years
      Are you sure that Ableton Live doesn't work with symlinks? Or does it just not work with aliases? They're quite different; see this post about their differences on the Apple stack exchange and in particular this comment.
    • JVC
      JVC over 6 years
      Yes positive… There is much consternation about this online and even amongst their own support staff. Their file browser absolutely does not support any method of pointing to files other than using the actual files themselves.
  • JVC
    JVC over 7 years
    Excellent, this is what made sense to me and is exactly what I was hoping for! The only catch of sorts is that I need to make sure I don't overlook the fact that these are hard links when I look at them, since OSX's GUI shows them as regular directories of a given size, which obviously they aren't exactly. But that shouldn't really be a problem as this is only my workstation and I'm doing this for a very specific use-case of my own. Thanks for confirming!
  • Byba
    Byba over 7 years
    Creating a single hard link should be practically instant. It's not possible to create a hard link to a directory, so I don't know what you're talking about with the recursive linking. If you're creating a tree of hard links, you'll be doing a mkdir() for every directory in the tree, and a link() for every regular file. The normal ln command doesn't do recursion though.
  • JVC
    JVC over 7 years
    Yeah, I came to the conclusion that what I was trying to do was really not going to work, and what I thought I had done was probably not what really happened. I've ended up having to just re-think the way I organize my files, which is a real shame because the alias/symlink approach I have now works beautifully on so many levels. All because one program won't recognize symlinks. Sigh.
  • Wildcard
    Wildcard over 6 years
    WumpusQ.Wumbley, actually, I believe I read long ago that Mac OS "Time Machine backups" are made by use of directory hard links. But that may be incorrect; I never investigated it personally.
  • Wildcard
    Wildcard over 6 years
    @JonathanvanClute, symlinks are entirely different from Mac OS "aliases." Symlinks are transparent at a much deeper level than aliases. I highly doubt the program you mention will reject symlinks. Rejecting aliases, on the other hand, is much less surprising. Be sure to test this again.
  • John Pancoast
    John Pancoast almost 6 years
    @WumpusQ.Wumbley, regarding your comment about hard links and if/how Time Machine works with directories, my hunch is that they handle it similarly to Git. Git maintains change history for files (never directories, just like TM), but because you have records of changed file paths, you can still tell that the parent directory changed. For example, Time Machine might record a change to a folder called "My Tree" only because it knows that its underlying file "My Branch File" was added or edited. It might allow you to restore that directory, but it's really just restoring the changed files there.
  • John Pancoast
    John Pancoast almost 6 years
    @WumpusQ.Wumbley, although that's definitely a hunch and leaves some open questions about how Apple handles Time Machine using hard links, if that's in fact what they do. For example, you can rename a directory from "foo" to "bar" and the inode won't change, but you'd think the user would still want that updated directory name if they restored from Time Machine. If Apple only went off of inodes, then they wouldn't see this change... They likely store metadata, but I've always been under the impression that they use hard links for their backups as well. Oh well, hope I added something useful to the convo.
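
Byba's comment above describes a tree of hard links as one mkdir() per directory and one link() per regular file. A minimal Python sketch of that idea follows. The function name link_tree is hypothetical, and it assumes dst is on the same filesystem as src and is not located inside src.

```python
import os

def link_tree(src, dst):
    """Mirror the directory tree at src under dst, hard-linking every file."""
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)  # real directories, one per source directory
        for name in filenames:
            # One link() per file: a new name for the same data, no copy.
            os.link(os.path.join(dirpath, name), os.path.join(target_dir, name))
```

The directories under dst are ordinary new directories; only the files inside them share data with the originals.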