How to get a list of all Subversion commit author usernames?

Solution 1

To filter out duplicates, take your output and pipe through: sort | uniq. Thus:

svn log --quiet | grep "^r" | awk '{print $3}' | sort | uniq

I would not be surprised if this is the way to do what you ask. Unix tools often expect the user to do fancy processing and analysis with other tools.

P.S. Come to think of it, you can merge the grep and awk...

svn log --quiet | awk '/^r/ {print $3}' | sort | uniq

P.P.S. Per Kevin Reid...

svn log --quiet | awk '/^r/ {print $3}' | sort -u

P3.S. Per kan, using the vertical bars instead of spaces as field separators, to properly handle names with spaces (also updated the Python examples)...

svn log --quiet | awk -F ' [|] ' '/^r/ {print $2}' | sort -u

For something more efficient, you could write a Perl one-liner. I don't know Perl that well, so I'd wind up doing it in Python:

#!/usr/bin/env python
import sys
authors = set()
for line in sys.stdin:
    if line[0] == 'r':
        authors.add(line.split('|')[1].strip())
for author in sorted(authors):
    print(author)

Or, if you wanted counts:

#!/usr/bin/env python
from __future__ import print_function # Python 2.6/2.7
import sys
authors = {}
for line in sys.stdin:
    if line[0] != 'r':
        continue
    author = line.split('|')[1].strip()
    authors.setdefault(author, 0)
    authors[author] += 1
for author in sorted(authors):
    print(author, authors[author])

Then you'd run:

svn log --quiet | ./authorfilter.py
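
As a variation on the counting script above, here is a minimal sketch that uses collections.Counter from the standard library. The function name count_authors and the sample lines are only illustrative, and it assumes the header format "rNNN | author | date" that svn log --quiet prints.

```python
from collections import Counter


def count_authors(lines):
    """Return a Counter mapping author name to number of commits."""
    counts = Counter()
    for line in lines:
        fields = line.split('|')
        # Header lines look like "r123 | author | date"; skip everything else.
        if line.startswith('r') and len(fields) >= 3:
            counts[fields[1].strip()] += 1
    return counts


# Illustrative header lines in the `svn log --quiet` format.
sample = [
    "r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)\n",
    "------------------------------------------------------------------------\n",
    "r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)\n",
    "------------------------------------------------------------------------\n",
    "r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)\n",
    "------------------------------------------------------------------------\n",
    "r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)\n",
    "------------------------------------------------------------------------\n",
    "r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)\n",
]

for author, commits in sorted(count_authors(sample).items()):
    print(author, commits)
```

To use it as a filter, pass sys.stdin to count_authors instead of the sample list. Splitting on '|' rather than on whitespace keeps author names that contain spaces intact.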

Solution 2

In PowerShell, set your location to the working copy and use this command.

svn.exe log --quiet |
? { $_ -notlike '-*' } |
% { ($_ -split ' \| ')[1] } |
Sort -Unique

The output format of svn.exe log --quiet looks like this:

r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
------------------------------------------------------------------------
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)

Filter out the horizontal rules with ? { $_ -notlike '-*' }.

r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)

Split by ' \| ' to turn a record into an array.

$ 'r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)' -split ' \| '
r20209
tinkywinky
2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)

The second element is the name.

Make an array of each line and select the second element with % { ($_ -split ' \| ')[1] }.

tinkywinky
dispy
lala
po
tinkywinky

Return unique occurrences with Sort -Unique. This sorts the output as a side effect.

dispy
lala
po
tinkywinky
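
For comparison, the same three steps can be sketched in Python. This is not part of the PowerShell answer; the function name unique_authors and the ignore_case switch are hypothetical, and the sketch assumes the input contains only header lines and horizontal rules, as svn log --quiet produces. The ignore_case option mimics the case-insensitive behaviour that a comment on this page attributes to Sort -Unique.

```python
def unique_authors(lines, ignore_case=False):
    """Filter out rule lines, take the second ' | ' field, return sorted unique names."""
    # Step 1: drop the horizontal rules (lines beginning with '-') and blank lines.
    headers = [ln for ln in lines if ln.strip() and not ln.startswith('-')]
    # Step 2: split each record on ' | ' and keep the second element (the author).
    names = [ln.split(' | ')[1].strip() for ln in headers]
    # Step 3: de-duplicate; with ignore_case the first spelling seen is kept.
    seen = {}
    for name in names:
        key = name.casefold() if ignore_case else name
        seen.setdefault(key, name)
    return sorted(seen.values(), key=str.casefold)


log_lines = [
    "r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)",
    "------------------------------------------------------------------------",
    "r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)",
    "------------------------------------------------------------------------",
    "r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)",
    "------------------------------------------------------------------------",
    "r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)",
    "------------------------------------------------------------------------",
    "r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)",
]

print(unique_authors(log_lines))
```

By default the comparison is case sensitive, so "Po" and "po" would be reported as two authors; passing ignore_case=True merges them.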

Solution 3

I had to do this in Windows, so I used the Windows port of Super Sed ( http://www.pement.org/sed/ ) - and replaced the AWK & GREP commands:

svn log --quiet --xml | sed -n -e "s/<\/\?author>//g" -e "/[<>]/!p" | sort | sed "$!N; /^\(.*\)\n\1$/!P; D" > USERS.txt

This uses the Windows "sort" command, which might not be present on all machines.
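
If Python is available, the XML output can also be read with a real XML parser instead of sed. The following is a rough sketch, assuming the usual svn log --xml layout of a log root element containing logentry elements, each with an optional author child. The function name authors_from_xml and the sample document are illustrative only.

```python
import xml.etree.ElementTree as ET


def authors_from_xml(xml_data):
    """Return the sorted unique author names found in `svn log --xml` output."""
    root = ET.fromstring(xml_data)
    # iter('author') visits every <author> element at any depth;
    # entries without an author element are simply not visited.
    return sorted({el.text.strip() for el in root.iter('author') if el.text})


# Illustrative document in the shape assumed above.
sample_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry revision="20209">
<author>tinkywinky</author>
<date>2013-12-05T08:56:29.000000Z</date>
</logentry>
<logentry revision="20208">
<author>dispy</author>
<date>2013-12-04T16:33:53.000000Z</date>
</logentry>
<logentry revision="20205">
<author>tinkywinky</author>
<date>2013-12-04T14:07:54.000000Z</date>
</logentry>
</log>
"""

print(authors_from_xml(sample_xml))
```

In practice the bytes would come from the output of svn log --quiet --xml, for example read from standard input or captured with subprocess. Because the parser works on elements rather than lines, author names containing spaces or angle-bracket-like text in other fields do not confuse it.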

Solution 4

On a remote repository you can use:

 svn log --quiet https://url/svn/project/ | grep "^r" | awk '{print $3}' | sort | uniq

Solution 5

svn log path-to-repo | grep '^r' | grep '|' | awk '{print $3}' | sort | uniq > committers.txt

This command has the additional grep '|' that eliminates false values. Without --quiet, the log includes commit messages, so any message line that starts with 'r' would otherwise be matched and words from commit messages would end up in the output.
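
A message line that starts with 'r' and also happens to contain '|' would still get through both greps. A stricter filter, sketched here in Python, matches the whole header shape instead. The pattern and the function name authors_from_full_log are assumptions for illustration, based on the header format "rNNN | author | date | N lines" shown elsewhere on this page.

```python
import re

# A header line is "r<digits> | <author> | <date> ...". Message lines that
# merely start with "r" fail the "r<digits> | " prefix and are skipped.
HEADER = re.compile(r'^r\d+ \| (.+?) \| ')


def authors_from_full_log(lines):
    """Return sorted unique authors from `svn log` output that includes messages."""
    found = set()
    for line in lines:
        match = HEADER.match(line)
        if match:
            found.add(match.group(1))
    return sorted(found)


# Illustrative log with commit messages, including lines that start with "r".
full_log = [
    "------------------------------------------------------------------------",
    "r2 | alice | 2015-08-24 18:05:58 +0100 (Mon, 24 Aug 2015) | 1 line",
    "",
    "refactor parser | cleanup",
    "------------------------------------------------------------------------",
    "r1 | Full Name | 2015-08-23 10:00:00 +0100 (Sun, 23 Aug 2015) | 2 lines",
    "",
    "removed dead code",
    "really",
]

print(authors_from_full_log(full_log))
```

The message line "refactor parser | cleanup" passes grep '^r' and grep '|', but it does not match the pattern above. A message line that exactly imitates a header would still be counted, so running svn log with --quiet remains the simpler safeguard.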

Author: Quinn Taylor

I'm a Computer Science nerd, longtime Mac addict, and software engineer in Silicon Valley. Happily, I work mostly with Objective-C and a bit in Swift, but I also enjoy Python, use Java when I must, and avoid C++.

Updated on February 17, 2021

Comments

  • Quinn Taylor
    Quinn Taylor over 3 years

    I'm looking for an efficient way to get the list of unique commit authors for an SVN repository as a whole, or for a given resource path. I haven't been able to find an SVN command specifically for this (and don't expect one), but I'm hoping there may be a better way than what I've tried so far in Terminal (on OS X):

    svn log --quiet | grep "^r" | awk '{print $3}'
    
    svn log --quiet --xml | grep author | sed -E "s:</?author>::g"
    

    Either of these will give me one author name per line, but they both require filtering out a fair amount of extra information. They also don't handle duplicates of the same author name, so for lots of commits by few authors, there's tons of redundancy flowing over the wire. More often than not I just want to see the unique author usernames. (It actually might be handy to infer the commit count for each author on occasion, but even in these cases it would be better if the aggregated data were sent across instead.)

    I'm generally working with client-only access, so svnadmin commands are less useful, but if necessary, I might be able to ask a special favor of the repository admin if strictly necessary or much more efficient. The repositories I'm working with have tens of thousands of commits and many active users, and I don't want to inconvenience anyone.

  • Quinn Taylor
    Quinn Taylor over 14 years
    +1 for the useful suggestion. I was aware of sort but not uniq, and it seems the latter takes a -c parameter that prepends the number of occurrences for each line. I'm still hoping for a more efficient (and scalable) way, but this does the trick in a pinch.
  • Kevin Reid
    Kevin Reid over 14 years
    By the way, if you have XPath handy, then the query //author/text() will get just the author names out of svn log --xml robustly. (Mac OS X has an xpath command which almost does this job, but produces extraneous text and can't be configured not to. Maybe there's something else.)
  • Quinn Taylor
    Quinn Taylor over 14 years
    @Kevin, you should add your own answer so people can vote for you. I like all your comments, particularly the sort/uniq tip.
  • Admin
    Admin over 13 years
    I've also made a batch file that iterates through a folder and compiles a unique list of all repositories: pastebin.com/CXiqLddp
  • v01pe
    v01pe about 11 years
    That's why the --quiet or -q argument is used in the other suggestions. It prints only the log headers (revision, author, date and time).
  • Mike DeSimone
    Mike DeSimone almost 11 years
    @ojblass: I don't ask many questions, but I still learn a lot on SO. I'm surprised some Perl ace hasn't posted a one-liner for this by now, though.
  • echristopherson
    echristopherson almost 9 years
    This would only look at cpp files that exist in the filesystem at the time this is run.
  • Tom Kuijsten
    Tom Kuijsten almost 9 years
    Sort -Unique is case insensitive. To get a case-sensitive check, use Sort-Object | Get-Unique -AsString or Select-Object -Unique instead.
  • kan
    kan almost 9 years
    As svn username could have spaces, it would be better to use more accurate filtering awk -F " \\\\| " '{print $2}'
  • Mike DeSimone
    Mike DeSimone almost 9 years
    @kan Could you give an example of how usernames with spaces appear in the output? I'd need to update the other examples to handle that, too.
  • kan
    kan almost 9 years
    Just appears as is, with spaces, nothing fancy: r114502 | Full Name | 2015-08-24 18:05:58 +0100 (Mon, 24 Aug 2015) | 1 line
  • MJar
    MJar almost 8 years
    great answer, though I had to change the last of the awk's to svn log --quiet | awk -F ' \\\\| ' '/^r/ {print $3}' | sort -u otherwise I was just getting empty line
  • Mike DeSimone
    Mike DeSimone almost 8 years
    It's been years and they might have changed the log output format. I don't have any SVN repos handy to test with any more...
  • Nathan Moinvaziri
    Nathan Moinvaziri almost 7 years
    Alternatively: ([xml](svn log --xml)).SelectNodes('//author') | % {$_.InnerText} | Select -Unique
  • seyfahni
    seyfahni about 4 years
    I didn't find this command till I figured it out by myself... If you just want to get the users of a remote repository to e.g. convert it to git (see git svn --help) this is really useful as a checkout only to execute this command can take way too much time.