Git: show total file size difference between two commits?
Solution 1
git cat-file -s will output the size in bytes of an object in git. git diff-tree can tell you the differences between one tree and another.
Putting this together into a script called git-file-size-diff located somewhere on your PATH will give you the ability to call git file-size-diff <tree-ish> <tree-ish>. We can try something like the following:
#!/bin/bash
USAGE='[--cached] [<rev-list-options>...]
Show file size changes between two commits or the index and a commit.'

# git-sh-setup provides usage() and checks that we are inside a repository.
. "$(git --exec-path)/git-sh-setup"
args=$(git rev-parse --sq "$@")
[ -n "$args" ] || usage

# Compare two trees by default; with --cached, compare the index against a tree.
cmd="diff-tree -r"
[[ $args =~ "--cached" ]] && cmd="diff-index"

eval "git $cmd $args" | {
    total=0
    # Raw diff fields: old mode, new mode, old object, new object, status, path.
    while read -r A B C D M P
    do
        case $M in
            M) bytes=$(( $(git cat-file -s "$D") - $(git cat-file -s "$C") )) ;;
            A) bytes=$(git cat-file -s "$D") ;;
            D) bytes=-$(git cat-file -s "$C") ;;
            *)
                echo >&2 "warning: unhandled mode $M in \"$A $B $C $D $M $P\""
                continue
                ;;
        esac
        total=$(( total + bytes ))
        printf '%d\t%s\n' "$bytes" "$P"
    done
    echo total $total
}
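To see what the while read loop receives, here is a minimal sketch that splits a single raw diff line the same way. The object IDs and the path are made-up placeholders, and the descriptive variable names are mine rather than the script's single letters.

```shell
# A sample raw diff line; the object IDs and path are made-up placeholders.
sample=$(printf ':100644 100644 %s %s M\t%s' 1111111 2222222 'docs/read me.txt')

# Split it the same way the script's loop does: the first five
# whitespace-separated fields go to named variables, the rest is the path.
parsed=$(printf '%s\n' "$sample" | {
    read -r old_mode new_mode old_id new_id status path
    printf 'status=%s old=%s new=%s path=%s' "$status" "$old_id" "$new_id" "$path"
})

echo "$parsed"
# prints: status=M old=1111111 new=2222222 path=docs/read me.txt
```

Because the path is the last variable given to read, it keeps any spaces it contains.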
In use this looks like the following:
$ git file-size-diff HEAD~850..HEAD~845
-234 Documentation/RelNotes/1.7.7.txt
112 Documentation/git.txt
-4 GIT-VERSION-GEN
43 builtin/grep.c
42 diff-lib.c
594 git-rebase--interactive.sh
381 t/t3404-rebase-interactive.sh
114 t/test-lib.sh
743 tree-walk.c
28 tree-walk.h
67 unpack-trees.c
28 unpack-trees.h
total 1914
By using git-rev-parse it should accept all the usual ways of specifying commit ranges.
EDIT: updated to record the cumulative total. Note that bash runs the while read in a subshell, hence the additional curly braces to avoid losing the total when the subshell exits.
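The subshell behaviour mentioned in that note can be seen in isolation. A small sketch using plain numbers instead of git output (the variable names are only for illustration):

```shell
# Without grouping: in bash the loop runs in a subshell, so the
# total it computes is gone once the pipeline finishes.
total=0
printf '%s\n' 1 2 3 | while read -r n; do total=$(( total + n )); done
lost=$total   # still 0 in bash

# With grouping: the loop and the final echo share one subshell,
# so the total is still available when it is printed.
grouped=$(printf '%s\n' 1 2 3 | {
    total=0
    while read -r n; do total=$(( total + n )); done
    echo "$total"
})

echo "without braces: $lost, with braces: $grouped"
```

This is why the script prints the total inside the braces rather than after the pipeline.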
EDIT: added support for comparing the index against another tree-ish by using a --cached argument to call git diff-index instead of git diff-tree, e.g.:
$ git file-size-diff --cached master
-570 Makefile
-134 git-gui.sh
-1 lib/browser.tcl
931 lib/commit.tcl
18 lib/index.tcl
total 244
Solution 2
You can pipe the output of git show into wc -c for each ref:
git show some-ref:some-path-to-file | wc -c
git show some-other-ref:some-path-to-file | wc -c
and compare the two numbers.
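As a rough sketch, the comparison can be wrapped in a small function. This version uses git cat-file -s on ref:path, which reports the blob size directly instead of counting bytes with wc -c. The function name file_size_delta and its argument order are my own choice, not part of git.

```shell
# Print the size difference in bytes of one file between two refs.
# Usage: file_size_delta <old-ref> <new-ref> <path>
# (file_size_delta is a hypothetical helper name, not a git command.)
file_size_delta() {
    local old_ref=$1 new_ref=$2 path=$3
    local old_size new_size
    # cat-file -s prints the object size in bytes; bail out if the
    # file does not exist at one of the refs.
    old_size=$(git cat-file -s "$old_ref:$path") || return 1
    new_size=$(git cat-file -s "$new_ref:$path") || return 1
    echo $(( new_size - old_size ))
}
```

For example, file_size_delta some-ref some-other-ref some-path-to-file prints a positive number if the file grew and a negative one if it shrank.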
Solution 3
I made a bash script to compare branches, commits, etc. by actual file/content size. It also detects file renames and can be found at https://github.com/matthiaskrgr/gitdiffbinstat.
Solution 4
Expanding on matthiaskrgr's answer, https://github.com/matthiaskrgr/gitdiffbinstat can be used like the other scripts:
gitdiffbinstat.sh HEAD..HEAD~4
In my opinion it works really well and is much faster than anything else posted here. Sample output:
$ gitdiffbinstat.sh HEAD~6..HEAD~7
HEAD~6..HEAD~7
704a8b56161d8c69bfaf0c3e6be27a68f27453a6..40a8563d082143d81e622c675de1ea46db706f22
Recursively getting stat for path "./c/data/gitrepo" from repo root......
105 files changed in total
3 text files changed, 16 insertions(+), 16 deletions(-) => [±0 lines]
102 binary files changed 40374331 b (38 Mb) -> 39000258 b (37 Mb) => [-1374073 b (-1 Mb)]
0 binary files added, 3 binary files removed, 99 binary files modified => [-3 files]
0 b added in new files, 777588 b (759 kb) removed => [-777588 b (-759 kb)]
file modifications: 39596743 b (37 Mb) -> 39000258 b (37 Mb) => [-596485 b (-582 kb)]
/ ==> [-1374073 b (-1 Mb)]
The output directory looks funky with ./c/data... because /c is actually the filesystem root.
Solution 5
A comment on the git-file-size-diff script suggested by patthoyts. The script is very useful; however, I have found two issues:
-
When a file's type changes (for example from a regular file to a symlink), git reports another status, T, which the case statement does not handle:
T) echo >&2 "Skipping change of type"; continue ;;
-
If a SHA-1 value no longer exists (for some reason), the script crashes. You need to validate the SHA before getting the file size:
git cat-file -e "$D" || continue
The complete case statement will then look like this:
case $M in
    M) git cat-file -e "$D" || continue
       git cat-file -e "$C" || continue
       bytes=$(( $(git cat-file -s "$D") - $(git cat-file -s "$C") )) ;;
    A) git cat-file -e "$D" || continue
       bytes=$(git cat-file -s "$D") ;;
    D) git cat-file -e "$C" || continue
       bytes=-$(git cat-file -s "$C") ;;
    T) echo >&2 "Skipping change of type"
       continue ;;
    *)
       echo >&2 "warning: unhandled mode $M in \"$A $B $C $D $M $P\""
       continue
       ;;
esac
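The repeated existence checks could also be folded into one helper. A sketch, assuming a hypothetical function name blob_size (it is not part of git or of the original script):

```shell
# Print the size in bytes of an object, or return non-zero without
# printing anything if the object does not exist.
# (blob_size is a hypothetical helper name.)
blob_size() {
    git cat-file -e "$1" 2>/dev/null || return 1
    git cat-file -s "$1"
}

# Inside the loop, a branch of the case statement could then read:
#   A) bytes=$(blob_size "$D") || continue ;;
```

This keeps each branch to one line, at the cost of one extra function.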
Mathias Bynens
Updated on September 07, 2020
Comments
-
Mathias Bynens over 3 years
Is it possible to show the total file size difference between two commits? Something like:
$ git file-size-diff 7f3219 bad418 # I wish this worked :)
-1234 bytes
I’ve tried:
$ git diff --patch-with-stat
And that shows the file size difference for each binary file in the diff — but not for text files, and not the total file size difference.
Any ideas?
-
Stas Dashkovsky almost 10 years
Here is a three-line bash script that gives you the size of a certain commit: stackoverflow.com/a/23985353/2062041
-
Mathias Bynens almost 12 years
+1 This is great for quickly checking the size difference of a file between versions. But how can this be used to get the total file difference between two commits? I want to see how many bytes were added/removed project-wide between two refs.
-
Mathias Bynens almost 12 years
+1 Thanks! This would be absolutely perfect if it would print out the total size difference at the bottom. I want to see how many bytes were added/removed project-wide between two refs (not just per file, but in total, too).
-
Mathias Bynens almost 12 years
Another question: why are you sourcing git-sh-setup here? You don’t seem to be using any of the functions it defines. Just wondering!
-
patthoyts almost 12 years
It does basic checks like producing a sensible message if you run this command in a directory that is not a git repository. It also can help abstract out some platform differences. Mostly habit though. When writing a git script, first bring in the git-sh-setup file.
-
AlecRust over 10 years
Got an example usage of this?
-
csch about 10 years
Thanks for the script! I archived it in a gist (gist.github.com/cschell/9386715), I hope you do not mind. Impatient ones can now do something like
curl -s https://gist.githubusercontent.com/cschell/9386715/raw/43996adb0f785a5afc17358be7a43ff7ee973215/git-file-size-diff | bash -s <tree-ish> <tree-ish>
-
Aziz Alto about 9 years
Thanks for the awesome script! I was looking for some way to monitor the increase in size after each commit, and this helps a lot. I made a small gist to show only the total increase between all (or some of) the commits in the repository: gist.github.com/iamaziz/1019e5a9261132ac2a9a Thanks again!
-
Mogsdad about 8 years
You didn't need to comment on Matthias' post; you could have suggested an edit to it instead, with these details that he didn't provide. By current standards, his answer would be considered a "link-only answer" and be deleted, so these sorts of details are important.
-
guest about 8 years
Who can take my answer and include it in Matthias' answer?
-
Mogsdad about 8 years
If you want, you can make a suggested edit yourself. (In my experience, it would tend to get rejected by reviewers, but a clear explanation in the Edit Summary could help.) But maybe I wasn't clear in my comment to you: your answer is a stand-alone answer, a good update of Matthias' older answer. You didn't need to include the text that explained that you meant to comment, is all. I edited the answer in a way that gives appropriate credit to Matthias. You don't need to do more.
-
escapecharacter over 7 years
The use case I'm looking for is to preview large commits before I make them. Is there a way I can find the size changes of the currently staged changes? I've read through the tree-ish documentation, and I could not find a way to reference "current staged changes".
-
patthoyts over 7 years
Added support for comparing against the index using git-diff-index.
-
Josh over 7 years
You can run echo $PATH to list your path directories and see where you can put this script file. I put mine in /usr/local/git/bin and it worked great. You can also add a path to your $PATH if you want to put the script somewhere else.
-
Dee Choksi over 6 years
You can skip the | wc -c if you use cat-file -s instead of show.
-
mr5 over 6 years
How do I use this? What is HEAD~850? Can I just use the commit id instead?
-
patthoyts over 6 years
@mr5 HEAD~850 is 850 commits before HEAD. It is just another notation for a commit and yes, you can use a specific commit id or a tag or anything that can be resolved to a commit. The script uses git rev-parse, so see the manual section "Specifying Revisions" in the git-rev-parse documentation for the full details (git-scm.com/docs/git-rev-parse).
-
webninja over 6 years
Using the improvement suggested by @neu242, I wrote this bash function:
gdbytes () {
    echo "$(git cat-file -s "$1:$3") -> $(git cat-file -s "$2:$3")"
}
This makes it easy to see how a file's size changed since the last commit with, e.g., gdbytes @~ @ index.html
-
40detectives almost 6 years
If the some-ref: part is skipped, do you obtain the file size in the working directory?
-
Philzen about 4 years
How would I be able to see what size the files had before? I am currently preparing a pull request that optimizes file output structure and would like to calculate a percentage of size decrease.