How to compare the size of two directories?

Solution 1

It is not entirely clear to me what you want. Maybe this?
diff <(du -sh dir1 | cut -f1) <(du -sh dir2 | cut -f1)

Solution 2

If your version of find supports -printf, this may be quite a bit faster than reading every file with wc -c, since it only sums the sizes recorded in the file metadata.

find dir1 ! -type d -printf "%s\n" | awk '{sum += $1} END{print sum}'

There are at least two ways to avoid scientific notation for outputting large numbers in AWK.

END {OFMT = "%.0f"; print sum}

END {printf "%.0f\n", sum}
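As a quick check of the printf form, here is a minimal sketch that feeds awk the total reported in the comments below. The helper name awk_plain is hypothetical and only used for illustration.

```shell
# Hypothetical helper: print a number through awk without scientific notation.
# Uses the printf form, so it does not depend on how OFMT is handled.
awk_plain() {
    awk -v n="$1" 'BEGIN { printf "%.0f\n", n }'
}

# The total reported in the comments; plain "print" showed it as 2.11437e+11 in mawk.
awk_plain 211436502457   # prints 211436502457
```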

The .0 precision drops the decimal places, since the sum is really an integer. gawk's %d seems to incorrectly act like %g in version 3.1.5 (but not in 3.1.6 and later).

However, from the gawk documentation:

NOTE: When using the integer format-control letters for values that are outside the range of the widest C integer type, 'gawk' switches to the '%g' format specifier.

Beware of exceeding the maximum integer for your system/version of AWK.
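Putting the pieces together, the comparison itself could be sketched roughly as follows. This assumes GNU find (for -printf); the function names dir_bytes and same_size are hypothetical. It counts the bytes of every non-directory entry, so directory entries themselves and file-system block sizes do not affect the result.

```shell
#!/bin/sh
# Sum the byte sizes of all non-directory entries under a directory.
# Assumes GNU find; prints 0 for a directory that contains no files.
dir_bytes() {
    find "$1" ! -type d -printf '%s\n' |
        awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Report whether two directories hold the same total number of file bytes.
# Returns 0 when the totals match and 1 otherwise.
same_size() {
    a=$(dir_bytes "$1")
    b=$(dir_bytes "$2")
    if [ "$a" = "$b" ]; then
        echo "equal: $a bytes"
    else
        echo "differ: $1=$a $2=$b"
        return 1
    fi
}
```

Called as same_size dir1 dir2. Equal totals do not prove that the contents are identical; diff -r is still needed for that.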

Author by

rafak

Maths/Crypto Thesis

Updated on June 26, 2022

Comments

  • rafak
    rafak almost 2 years

    I want to compare the total size of two directories dir1 and dir2 on different file-systems so that if diff -r dir1 dir2 returns 0 then the total sizes will be equal. The du command returns the disk usage, and its option --apparent-size doesn't solve the problem. I now use something like

    find dir1 ! -type d |xargs wc -c |tail -1
    

    to know an approximation of dir1's size. Is there a better solution?

    edit: for example, I have (diff -r dir1 dir2 returns 0: they are equal):

    du -s dir1 --> 540
    du -s dir2 --> 166
    
    du -sb dir1 --> 250815 (the -b option is equivalent to --apparent-size -B1)
    du -sb dir2 --> 71495
    
    find dir1 ! -type d |xargs wc -c --> 62399
    find dir2 ! -type d |xargs wc -c --> 62399 
    
  • John McFarlane
    John McFarlane over 11 years
    I tried this and got an imprecise result: 2.11437e+11. Awk has a printf function so I tried: find dir1 ! -type d -printf "%s\n" | awk '{sum += $1} END{printf "%f\n", sum}' and I think it got a precise result: 211436502457.000000. Edit your answer and I'll definitely +1!
  • SourceSeeker
    SourceSeeker over 11 years
    @JMcF: What version of AWK are you using? I edited my answer to add a little more information.
  • John McFarlane
    John McFarlane over 11 years
    It's mawk 1.3.3 which comes with Ubuntu 12.04. It's the 64-bit version so these numbers should be well within range. Nevertheless, a very thorough answer, thanks.
  • Peter Mellett
    Peter Mellett over 10 years
    Maybe not, but it was exactly what I was searching for.
  • Andry
    Andry about 6 years
    This is not reliable, because du -sh can return different results for directories that are "visibly" (file by file) equal. For example, the . entry itself can have a non-zero size, which is included in the directory's total size.