How to get folder size ignoring hard links?


Solution 1

Total size in bytes of all files in hourly.2 which have only one link:

$ find ./hourly.2 -type f -links 1 -printf "%s\n" | awk '{s=s+$1} END {print s}'

From the find man page:

   -links n
          File has n links.

To get the total disk usage in 1 KiB blocks instead of the file sizes in bytes, use -printf "%k\n".

To list files with different link counts, play around with find -links +1 (more than one link), find -links -5 (fewer than five links) and so on.
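For anyone who wants to try this on a throwaway tree first, here is a minimal sketch. The function name sum_single_link_bytes and the demo file names are made up for illustration, and it assumes GNU find (for -printf).

```shell
# Sum the sizes (in bytes) of regular files under "$1" that have exactly one link.
# Assumes GNU find, since -printf is not part of POSIX.
sum_single_link_bytes() {
    find "$1" -type f -links 1 -printf '%s\n' |
        awk '{ s += $1 } END { print s + 0 }'   # "s + 0" prints 0 when nothing matched
}

# Throwaway demo tree (hypothetical names)
demo=$(mktemp -d)
mkdir "$demo/hourly.1" "$demo/hourly.2"
printf 'shared' > "$demo/hourly.1/file1"            # 6 bytes
ln "$demo/hourly.1/file1" "$demo/hourly.2/file1"    # hard link, so link count is 2
printf 'only-here' > "$demo/hourly.2/file2"         # 9 bytes, link count is 1

sum_single_link_bytes "$demo/hourly.2"
```

On the demo tree this prints 9: file1 is skipped because it has two links, and only file2 is counted. Remove the tree afterwards with rm -r "$demo".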

Solution 2

As @Gilles says, du counts only the first hard link it encounters for any given inode, so you can pass it both directories in order (dirA and dirB here stand for hourly.1 and hourly.2):

$ du -hc --max-depth=0 dirA dirB
29G /hourly.1
 1G /hourly.2
30G total

That is, any file in hourly.2 that references an inode (the "real" file) already referenced in hourly.1 is not counted again.

Solution 3

If you specifically want the size of the files that are present under hourly.2 but not under hourly.1, you can obtain it a little indirectly with du. If du processes the same file more than once (even under different names, i.e. hard links), it only counts the file the first time. So what du hourly.1 hourly.2 reports for hourly.2 is the size you're looking for. Thus:

du -ks hourly.1 hourly.2 | sed -n '2s/[^0-9].*//p'

(Works on any POSIX system and most other Unix variants. Assumes that the directory name hourly.1 doesn't contain any newline.)
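To see the effect of argument order for yourself, here is a small sketch on a scratch directory. The helper name unique_kb and the file sizes are assumptions for illustration; it uses awk instead of sed to pick the number from the second line, and it likewise assumes the first directory name contains no newline.

```shell
# Print the KiB that du charges to "$2" after "$1" has already been counted.
# Files in "$2" that are hard links to files in "$1" are not counted a second time.
unique_kb() {
    du -ks -- "$1" "$2" | awk 'NR == 2 { print $1 }'
}

# Throwaway demo tree (hypothetical names and sizes)
demo=$(mktemp -d)
mkdir "$demo/hourly.1" "$demo/hourly.2"
dd if=/dev/urandom of="$demo/hourly.1/file1" bs=1024 count=1024 2>/dev/null
ln "$demo/hourly.1/file1" "$demo/hourly.2/file1"    # shared via hard link
dd if=/dev/urandom of="$demo/hourly.2/file2" bs=1024 count=64 2>/dev/null

unique_kb "$demo/hourly.1" "$demo/hourly.2"   # file2 plus directory overhead only
du -ks "$demo/hourly.2"                       # on its own: includes file1 as well
```

The first number should be roughly 1 MiB smaller than the second. The exact values depend on the filesystem's block size, so treat them as approximate.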

Solution 4

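Simpler: run du once on the parent directory. Each subdirectory is then charged only for files that were not already counted in a subdirectory du visited earlier.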

du -hc --max-depth=1 path/

Example output:

9.4G    daily/users/rockspa/home/daily.21
3.6G    daily/users/rockspa/home/daily.30
4.2G    daily/users/rockspa/home/daily.11
1.1G    daily/users/rockspa/home/daily.4
4.2G    daily/users/rockspa/home/daily.9
3.0G    daily/users/rockspa/home/daily.25
3.5G    daily/users/rockspa/home/daily.20
4.2G    daily/users/rockspa/home/daily.13
913M    daily/users/rockspa/home/daily.5
2.8G    daily/users/rockspa/home/daily.26
1.4G    daily/users/rockspa/home/daily.1
2.6G    daily/users/rockspa/home/daily.28
4.2G    daily/users/rockspa/home/daily.15
3.8G    daily/users/rockspa/home/daily.19
327M    daily/users/rockspa/home/daily.8
4.2G    daily/users/rockspa/home/daily.17
3.1G    daily/users/rockspa/home/daily.23
...
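Since a shared file is charged to whichever snapshot du reaches first, and du does not promise any particular order within a directory, you may want to fix the order yourself. The following is only a sketch: the function name du_snapshots_in_order is made up, and it assumes GNU sort (-V), GNU xargs (-r, -d), a find with -mindepth/-maxdepth, and directory names without newlines.

```shell
# Run du over the immediate subdirectories of "$1" in version order
# (daily.1, daily.2, ..., daily.10), so each shared file is charged
# to the first snapshot in that order.
du_snapshots_in_order() {
    find "$1" -mindepth 1 -maxdepth 1 -type d |
        sort -V |
        xargs -r -d '\n' du -sch --
}

# Usage (path/ is a placeholder, as in the command above):
# du_snapshots_in_order path/
```

Use sort -rV instead if you would rather charge shared files to the highest-numbered snapshot.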

Solution 5

Annoyingly, BusyBox builds of find come without -printf support. Here is a modification of @grebneke's answer that uses ls instead:

find . -type f -links 1 -exec ls -l {} \; | awk '{s=s+$5} END {print s}'
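A slightly more defensive variant of the same idea, as a sketch only. The function name is made up, and it assumes your find accepts the POSIX -exec ... {} + form (some minimal builds may not, in which case fall back to \;) and that file names contain no newlines.

```shell
# Sum sizes (in bytes) of single-link regular files without find -printf.
# ls -n prints numeric owner and group IDs, so field 5 is the size even when
# user or group names contain spaces; "{} +" passes many files to each ls call.
sum_single_link_bytes_portable() {
    find "$1" -type f -links 1 -exec ls -ln {} + |
        awk '{ s += $5 } END { print s + 0 }'   # "s + 0" prints 0 when nothing matched
}

# Usage: sum_single_link_bytes_portable ./hourly.2
```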
Author: Benubird

Updated on September 18, 2022

Comments

  • Benubird
    Benubird over 1 year

    I use rsnapshot for backups, which generates a series of folders containing files of the same name. Some of the files are hard linked, while others are separate. For instance, hourly.1/file1 and hourly.2/file1 might be hard linked to the same file, while hourly.1/file2 and hourly.2/file2 are entirely separate files.

    I want to find the amount of space used by the folder hourly.2 ignoring any files which are hard links to files in hourly.1. So in the above example, I would want to get the size of file2, but ignore file1.

    I'm using bash on Linux, and I want to do this from the command line as simply as possible, so no big graphical or other-OS-only solutions please.

  • cuonglm
    cuonglm about 10 years
    If a file somewhere else is a hard link to a file in hourly.2, your command will produce the wrong answer.
  • grebneke
    grebneke about 10 years
    @Gnouc - Well yes - it depends on how the files end up in hourly.2. If they are copied there, they will not have extra links and my command will work. If they are hard-linked, obviously it will fail. I'm assuming new backup-files are copied.
  • akavel
    akavel over 7 years
    According to du --help, option --max-depth=0 is equivalent to -s, so above can be shortened as: $ du -hcs dirA dirB
  • Andreas Krey
    Andreas Krey over 5 years
    For some strange reason du doesn't always notice hardlinked files on RHEL5 - if I do 'du -sh dir/sub dir' the output for dir is the same as if I just said 'du -sh dir' - not excluding the size of 'dir/sub'.
  • TiberiusKirk
    TiberiusKirk over 4 years
    Thanks Abdel. This should be the accepted answer.
  • dimitarvp
    dimitarvp over 4 years
    Awesome. This worked for me on the first try on my macOS 10.15. Thank you.