Show number of changed lines per author in git

Solution 1

This is an old post, but in case someone is still looking for an answer:

Install git-extras:

brew install git-extras

then run:

git summary --line

https://github.com/tj/git-extras

Solution 2

A one-liner (supports time range selection):

git log --since=4.weeks --numstat --pretty="%ae %H" | sed 's/@.*//g' | awk '{ if (NF == 1){ name = $1}; if(NF == 3) {plus[name] += $1; minus[name] += $2}} END { for (name in plus) {print name": +"plus[name]" -"minus[name]}}' | sort -k2 -gr

Explanation:

git log --since=4.weeks --numstat --pretty="%ae %H" \
    | sed 's/@.*//g'  \
    | awk '{ if (NF == 1){ name = $1}; if(NF == 3) {plus[name] += $1; minus[name] += $2}} END { for (name in plus) {print name": +"plus[name]" -"minus[name]}}' \
    | sort -k2 -gr

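# git log: query the log by time range; each commit prints "<email> <hash>" followed by its numstat lines
# sed: keep only the author email prefix (drops "@" and everything after it, including the hash)
# awk: sum the added / removed lines per author
# sort: order the result by added lines, descending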

Output:

user-a: +5455 -3471
user-b: +5118 -1934
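
For readers who find the awk stage hard to follow, here is a minimal Python sketch of the same aggregation. It is not the answer author's code: the function name tally_numstat and the sample addresses are made up for illustration, and it assumes the log was produced with --numstat --pretty="%ae %H" as above. Unlike the awk version, it splits numstat lines on tabs, so paths containing spaces are still counted, and binary files (reported as "-") are skipped.

```python
from collections import defaultdict


def tally_numstat(lines):
    """Sum added/removed lines per author from `git log --numstat` output.

    Assumes each commit starts with a header line "<email> <hash>"
    (i.e. the log was produced with --pretty="%ae %H").
    """
    added = defaultdict(int)
    removed = defaultdict(int)
    author = None
    for line in lines:
        line = line.rstrip("\n")
        parts = line.split("\t")
        if len(parts) == 3:
            # numstat line: "<added>\t<removed>\t<path>"; binary files show "-"
            plus, minus, _path = parts
            if author is not None and plus.isdigit() and minus.isdigit():
                added[author] += int(plus)
                removed[author] += int(minus)
        elif line.strip():
            # header line: keep only the part of the email before "@"
            author = line.split()[0].split("@", 1)[0]
    return {name: (added[name], removed[name]) for name in added}


# Hypothetical sample input, shaped like the git log output described above.
sample = [
    "user-a@example.com 1111111",
    "",
    "10\t2\tsrc/main.py",
    "-\t-\tlogo.png",
    "user-b@example.com 2222222",
    "",
    "3\t1\tREADME",
]
for name, (plus, minus) in sorted(tally_numstat(sample).items()):
    print("%s: +%d -%d" % (name, plus, minus))
```

To run it against a real repository, the lines could come from subprocess.check_output(['git', 'log', '--since=4.weeks', '--numstat', '--pretty=%ae %H']), decoded and split with splitlines().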

Solution 3

Since the SO question "How to count total lines changed by a specific author in a Git repository?" is not completely satisfactory, commandlinefu has alternatives (albeit not per branch):

git ls-files | while IFS= read -r i; do git blame "$i" | sed -e 's/^[^(]*(//' -e 's/^\([^[:digit:]]*\)[[:space:]]\+[[:digit:]].*/\1/'; done | sort | uniq -ic | sort -nr

It includes binary files, which is not good, so you could filter out known binary extensions first:

git ls-files | grep -v "\.\(pdf\|psd\|tif\)$"

(Note: as commented by trcarden, a -x or --exclude option wouldn't work.
From the git ls-files man page, git ls-files -x "*pdf" ... would only exclude untracked content, and only if --others or --ignored were added to the git ls-files command.)

Or:

git ls-files "*.py" "*.html" "*.css" 

to only include specific file types.
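
If the file list is post-processed in a script rather than with grep, the same deny-list idea might look like the sketch below. The names BINARY_EXTENSIONS and drop_binary_paths are hypothetical, and the extension set just mirrors the grep pattern above. One difference is deliberate: this version compares extensions case-insensitively, whereas the grep pattern is case-sensitive.

```python
import os

# Hypothetical deny-list mirroring the grep pattern above; extend as needed.
BINARY_EXTENSIONS = {".pdf", ".psd", ".tif"}


def drop_binary_paths(paths, extensions=BINARY_EXTENSIONS):
    """Return the paths whose extension is not in the deny-list."""
    return [
        path for path in paths
        # splitext gives ("name", ".ext"); compare the extension in lowercase
        if os.path.splitext(path)[1].lower() not in extensions
    ]


print(drop_binary_paths(["a.py", "doc/spec.pdf", "img/LOGO.PSD", "notes.txt"]))
```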


Still, a "git log"-based solution should be better, like:

git log --numstat --pretty="%H" --author="Your Name" commit1..commit2 | awk 'NF==3 {plus+=$1; minus+=$2} END {printf("+%d, -%d\n", plus, minus)}'

but again, this covers a single path through history (here, the range between two commits), not every branch.

Solution 4

On my repos I've gotten a lot of trash output from the one-liners floating around, so here is a Python script to do it right:

import subprocess
import collections
import sys


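def get_lines_from_call(command):
    # check_output returns bytes on Python 3; decode so the str-based
    # parsing below works, replacing any undecodable (binary) bytes.
    output = subprocess.check_output(command)
    return output.decode('utf-8', errors='replace').splitlines()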

def get_files(paths=()):
    command = ['git', 'ls-files']
    command.extend(paths)
    return get_lines_from_call(command)

def get_blame(path):
    return get_lines_from_call(['git', 'blame', path])


def extract_name(line):
    """
    Extract the author from a line of a standard git blame
    """
    return line.split('(', 1)[1].split(')', 1)[0].rsplit(None, 4)[0]


def get_file_authors(path):
    return [extract_name(line) for line in get_blame(path)]


def blame_stats(paths=()):
    counter = collections.Counter()
    for filename in get_files(paths):
        counter.update(get_file_authors(filename))
    return counter


def main():
    counter = blame_stats(sys.argv[1:])
    max_width = len(str(counter.most_common(1)[0][1]))
    for name, count in reversed(counter.most_common()):
        print('%s %s' % (str(count).rjust(max_width), name))

if __name__ == '__main__':
    main()

Note that the arguments to the script are passed to git ls-files, so to show only Python files, run: blame_stats.py '**/*.py'

To show only files in one subdirectory: blame_stats.py some_dir

And so on.
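
The script above parses the default git blame layout by splitting on parentheses. A possible alternative, sketched here and not part of the original answer, is to parse git blame --line-porcelain, where every blamed line carries its own "author <name>" header and the source text is prefixed with a tab. The function name count_porcelain_authors and the sample data are hypothetical.

```python
from collections import Counter


def count_porcelain_authors(lines):
    """Count blamed lines per author in `git blame --line-porcelain` output."""
    counter = Counter()
    prefix = "author "
    for line in lines:
        # Source lines start with a tab, so they can never match the prefix;
        # "author-mail", "author-time" etc. lack the trailing space.
        if line.startswith(prefix):
            counter[line[len(prefix):].rstrip("\n")] += 1
    return counter


# Hypothetical, abbreviated porcelain output for a three-line file.
sample = [
    "1111111111111111111111111111111111111111 1 1 2",
    "author Ada Lovelace",
    "author-mail <ada@example.com>",
    "\tauthor = 'not a header'",
    "1111111111111111111111111111111111111111 2 2",
    "author Ada Lovelace",
    "\tprint(author)",
    "2222222222222222222222222222222222222222 3 3 1",
    "author Bob (QA)",
    "\tx = 1",
]
for name, count in count_porcelain_authors(sample).most_common():
    print(count, name)
```

Because the author name sits on its own line, names containing parentheses or digits do not confuse the parser.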

Author: knittl

Updated on April 25, 2020

Comments

  • knittl
    knittl about 4 years

    I want to see the number of removed/added lines, grouped by author, for a given branch in git history. There is git shortlog -s, which shows me the number of commits per author. Is there anything similar to get an overall diffstat?

  • jjxtra
    jjxtra over 11 years
    git log is the only thing that doesn't barf for me, nice suggestion!
  • trcarden
    trcarden about 10 years
    You actually can't ignore binary files via the method specified. The -x option of ls-files only applies to untracked files. Common error.
  • VonC
    VonC about 10 years
    @trcarden Very good point. I have edited the answer and proposed an alternative way of excluding binaries.
  • user3167101
    user3167101 almost 8 years
    apt-get install git-extras for Linux users
  • Maghoumi
    Maghoumi almost 8 years
    fatal: unrecognized argument: --line I think they've removed the option in the newest release
  • dav
    dav over 7 years
    @M2X, it looks like git line-summary works, though the docs say it's deprecated in favor of --line github.com/tj/git-extras/blob/master/…
  • janeshs
    janeshs over 7 years
    I liked the output of this tool. Nice one.
  • Bacon
    Bacon about 7 years
    @alex you for people whose distribution uses apt to manage packets... :)
  • Mikhail Golubitsky
    Mikhail Golubitsky about 3 years
    I've visited this answer every time I need to ask this question for the entire life of the repo; all I do is change 4.weeks to 10.years
  • Sakari Cajanus
    Sakari Cajanus almost 2 years
    Is there a way to make the line version of the command show only changes from a certain commit onward? The help shows it allows only <committish> without --line.