Get the current git hash in a Python script
Solution 1
The git describe command is a good way of creating a human-presentable "version number" of the code. From the examples in the documentation:
With something like git.git current tree, I get:
[torvalds@g5 git]$ git describe parent
v1.0.4-14-g2414721
i.e. the current head of my "parent" branch is based on v1.0.4, but since it has a few commits on top of that, describe has added the number of additional commits ("14") and an abbreviated object name for the commit itself ("2414721") at the end.
From within Python, you can do something like the following:
import subprocess
label = subprocess.check_output(["git", "describe"]).strip()
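The describe output explained above follows the pattern <tag>-<commits since tag>-g<abbreviated hash>. If you need the parts separately, a minimal sketch could look like the following. The names DescribeLabel and parse_describe are made up for illustration, and the code assumes the default describe format; a label without the numeric suffix (for example when HEAD sits exactly on a tag, or a bare hash from --always) is returned whole as the tag with a count of 0.

```python
import re
from typing import NamedTuple, Optional


class DescribeLabel(NamedTuple):
    tag: str
    commits_since_tag: int
    short_hash: Optional[str]


# "<tag>-<n>-g<hex>": the tag itself may contain dashes, so anchor on the suffix.
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<hash>[0-9a-f]+)$")


def parse_describe(label: str) -> DescribeLabel:
    """Split a `git describe` label into tag, commit count and short hash."""
    label = label.strip()
    match = _DESCRIBE_RE.match(label)
    if match is None:
        # No "-<n>-g<hash>" suffix: treat the whole label as the tag.
        return DescribeLabel(label, 0, None)
    return DescribeLabel(match["tag"], int(match["count"]), match["hash"])


print(parse_describe("v1.0.4-14-g2414721"))
```

Anchoring the pattern on the trailing suffix keeps tags such as release-1.2 intact instead of splitting them at their own dashes.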
Solution 2
No need to hack around getting data from the git command yourself. GitPython is a very nice way to do this and a lot of other git stuff. It even has "best effort" support for Windows.
After pip install gitpython you can do:
import git
repo = git.Repo(search_parent_directories=True)
sha = repo.head.object.hexsha
Something to consider when using this library. The following is taken from gitpython.readthedocs.io:
Leakage of System Resources
GitPython is not suited for long-running processes (like daemons) as it tends to leak system resources. It was written in a time where destructors (as implemented in the __del__ method) still ran deterministically.
In case you still want to use it in such a context, you will want to search the codebase for __del__ implementations and call these yourself when you see fit.
Another way to assure proper cleanup of resources is to factor out GitPython into a separate process which can be dropped periodically.
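One way to limit the leakage described above, short of moving GitPython into a separate process, is to keep the Repo object open only for as long as the lookup takes. The sketch below assumes that Repo can be used as a context manager that calls close() on exit, which recent GitPython releases support; whether that is sufficient for a given long-running process is not something the quoted note promises. The helper name head_sha is made up for illustration, and returning None when GitPython or a repository is unavailable is a choice of this example, not library behavior.

```python
from typing import Optional


def head_sha(path: str = ".") -> Optional[str]:
    """Return the full hash of HEAD, or None if it cannot be determined."""
    try:
        import git  # third-party package: pip install gitpython
    except ImportError:
        # GitPython is not installed (or could not locate a git executable).
        return None
    try:
        # The with-block releases the repository's resources on exit.
        with git.Repo(path, search_parent_directories=True) as repo:
            return repo.head.object.hexsha
    except (git.exc.InvalidGitRepositoryError, git.exc.NoSuchPathError):
        # path is not inside a git repository, or does not exist.
        return None
```

Calling head_sha() from inside a working tree should give the same value as repo.head.object.hexsha in the snippet above; outside a repository it gives None instead of raising.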
Solution 3
This post contains the command, Greg's answer contains the subprocess command.
import subprocess
def get_git_revision_hash() -> str:
    return subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()
def get_git_revision_short_hash() -> str:
    return subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).decode('ascii').strip()
When running
print(get_git_revision_hash())
print(get_git_revision_short_hash())
you get output:
fd1cd173fc834f62fa7db3034efc5b8e0f3b43fe
fd1cd17
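The two functions above raise an exception when git is missing or the working directory is not inside a repository. A hedged variant that returns None in those cases, and that can also be pointed at a specific directory, might look like this. The helper name git_rev_parse and its cwd parameter are made up for this sketch; the git arguments are the ones used on this page.

```python
import subprocess
from typing import Optional


def git_rev_parse(*args: str, cwd: Optional[str] = None) -> Optional[str]:
    """Run `git rev-parse <args>` and return its output, or None on failure."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", *args],
            cwd=cwd,                    # directory inside the repo; None means current dir
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,  # hide git's "fatal: ..." message
            universal_newlines=True,    # decode the output to str
            check=True,                 # raise CalledProcessError on non-zero exit
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError: git is not installed, or cwd does not exist.
        # CalledProcessError: git ran but failed, e.g. not inside a repository.
        return None
    return result.stdout.strip()


full_hash = git_rev_parse("HEAD")
short_hash = git_rev_parse("--short", "HEAD")
branch = git_rev_parse("--abbrev-ref", "HEAD")
print(full_hash, short_hash, branch)
```

Whether None, an empty string, or a placeholder such as "unknown" is the better fallback depends on how the value is used in your output.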
Solution 4
Here's a more complete version of Greg's answer:
import subprocess
print(subprocess.check_output(["git", "describe", "--always"]).strip().decode())
Or, if the script is being called from outside the repo:
import subprocess, os
print(subprocess.check_output(["git", "describe", "--always"], cwd=os.path.dirname(os.path.abspath(__file__))).strip().decode())
Or, if the script is being called from outside the repo and you like pathlib:
import subprocess
from pathlib import Path
print(subprocess.check_output(["git", "describe", "--always"], cwd=Path(__file__).resolve().parent).strip().decode())
Solution 5
numpy has a nice looking multi-platform routine in its setup.py:
import os
import subprocess
# Return the git revision as a string
def git_version():
    def _minimal_ext_cmd(cmd):
        # construct minimal environment
        env = {}
        for k in ['SYSTEMROOT', 'PATH']:
            v = os.environ.get(k)
            if v is not None:
                env[k] = v
        # LANGUAGE is used on win32
        env['LANGUAGE'] = 'C'
        env['LANG'] = 'C'
        env['LC_ALL'] = 'C'
        out = subprocess.Popen(cmd, stdout=subprocess.PIPE, env=env).communicate()[0]
        return out
    try:
        out = _minimal_ext_cmd(['git', 'rev-parse', 'HEAD'])
        GIT_REVISION = out.strip().decode('ascii')
    except OSError:
        GIT_REVISION = "Unknown"
    return GIT_REVISION
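Note that Popen does not raise when git exits with an error, so the routine above appears to return an empty string rather than "Unknown" when git is installed but the directory is not a repository. A sketch of the same idea using subprocess.run, which also checks the exit status, is shown below. This is not numpy's code: the helper name minimal_git_env and the cwd parameter are assumptions made for illustration.

```python
import os
import subprocess
from typing import Dict, Optional


def minimal_git_env() -> Dict[str, str]:
    """Copy only SYSTEMROOT and PATH (if set) and force the C locale."""
    env = {k: os.environ[k] for k in ("SYSTEMROOT", "PATH") if k in os.environ}
    env.update(LANGUAGE="C", LANG="C", LC_ALL="C")
    return env


def git_version(cwd: Optional[str] = None) -> str:
    """Return the hash of HEAD, or "Unknown" if git is missing or fails."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=cwd,
            env=minimal_git_env(),
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            check=True,  # a non-zero exit status raises CalledProcessError
        )
    except (OSError, subprocess.CalledProcessError):
        return "Unknown"
    return result.stdout.strip().decode("ascii")


print(git_version())
```

Keeping the environment builder in its own function makes it easy to reuse for other git commands.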
Victor
Updated on July 31, 2022
Comments
-
Victor almost 2 years
I would like to include the current git hash in the output of a Python script (as the version number of the code that generated that output).
How can I access the current git hash in my Python script?
-
Mel Nicholson over 11 years
Start with git rev-parse HEAD from the command line. The output syntax should be obvious.
-
Charlie Parker almost 3 years
Do subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).decode('ascii').strip() after importing subprocess.
-
JosefAssad over 11 years
This has the drawback that the version printing code will be broken if the code is ever run without the git repo present. For example, in production. :)
-
Greg Hewgill over 11 years
@JosefAssad: If you need a version identifier in production, then your deployment procedure should run the above code and the result should be "baked in" to the code deployed to production.
-
JosefAssad over 11 years
That's only one way to accomplish that; there are other ways. git attributes can be used to inject version information upon checkout. Even though git attributes are not transferred to clones, as long as they're defined in the master copy and the ops take their code from master, this would be a simpler solution.
-
grasshopper almost 10 years
Add a strip() to the result to get this without line breaks :)
-
kynan almost 10 years
Note that git describe will fail if there are no tags present:
fatal: No names found, cannot describe anything.
-
pkamb over 9 years
How would you run this for a git repo at a particular path?
-
Leonardo over 9 years
git describe --always will fall back to the last commit if no tags are found.
-
Zac Crites almost 9 years
@pkamb Use os.chdir to cd to the path of the git repo you are interested in working with.
-
Christian Herenz over 8 years
Does this work if the script is somewhere in my $PATH variable, but I'm running it from somewhere else in the filesystem?
-
max over 8 years
Wouldn't that give the wrong answer if the currently checked out revision is not the branch head?
-
djangonaut over 8 years
To get a format like above:
<last tag>-<num commits after tag>-<hash>
I had to use git describe --long --tags
-
Charlie Parker almost 8 years
didn't work:
>>> label = subprocess.check_output(["git", "describe"])
fatal: No names found, cannot describe anything.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['git', 'describe']' returned non-zero exit status 128
-
Greg Hewgill almost 8 years
@CharlieParker: git describe normally requires at least one tag. If you don't have any tags, use the --always option. See the git describe documentation for more information.
-
user5359531 about 7 years
@crishoj Not sure how you can call it portable when this happens: ImportError: No module named gitpython. You cannot rely on the end user having gitpython installed, and requiring them to install it before your code works makes it not portable. Unless you are going to include automatic installation protocols, at which point it is no longer a clean solution.
-
crishoj about 7 years
@user5359531 I beg to differ. GitPython provides a pure Python implementation, abstracting away platform-specific details, and it is installable using standard package tools (pip / requirements.txt) on all platforms. What's not "clean"?
-
user5359531 about 7 years
pip is not available on all systems. For that matter, neither is the external internet access needed by pip to install said packages.
-
13aal over 6 years
I like this, pretty clean and no external libraries
-
OldTinfoil over 6 years
This is the normal way to do things in Python. If the OP needs those requirements, then they would have said so. We are not mind-readers, we can't predict every eventuality in each question. That way lies madness.
-
Jblasco about 6 years
@user5359531, I am unclear why import numpy as np can be assumed throughout the whole of stackoverflow but installing gitpython is beyond 'clean' and 'portable'. I think this is by far the best solution, because it does not reinvent the wheel, hides away the ugly implementation and does not go around hacking the answer of git from subprocess.
-
user5359531 about 6 years
If its not in the standard library, its not 'portable'. Numpy is no exception. subprocess is a standard method for interacting with CLI programs from within Python. Installing 3rd party libraries as a crutch to solve every simple problem in Python is not a great practice and causes issues the moment you need to run your code on any other system. If you want to hide the 'ugly implementation', then use a function. If the code is never going to be run by anyone or anywhere else, then of course use whatever solution you like.
-
Ryan Allen about 6 years
and subprocess.check_output(['git', 'rev-parse', '--abbrev-ref', 'HEAD']) for the branch name
-
MD004 almost 6 years
Yuji's answer provides a similar solution in only one line of code that produces the same result. Can you explain why numpy found it necessary to "construct a minimal environment"? (assuming they had good reason to)
-
Ryan almost 6 years
@user5359531 While I agree in general that you shouldn't throw a shiny new library at every small problem, your definition of "portability" seems to neglect modern scenarios where developers have full control over all environments said applications run in. In 2018 we have Docker containers, virtual environments, and machine images (e.g. AMIs) with pip or the ability to easily install pip. In these modern scenarios, a pip solution is just as portable as a "standard library" solution.
-
user5359531 almost 6 years
@Ryan thanks but I am a developer and have no control over my development environment, Docker is banned on company hardware due to security concerns, need to design & run programs on ancient servers, devs lack admin rights to server, etc. etc... The year might be 2018 now but plenty of systems out there haven't been updated since 2012 or earlier and not all devs have these luxuries you describe. Virtualenv also has compatibility issues between different Python versions.
-
ryanjdillon almost 6 years
I just noticed this in their repo, and decided to add it to this question for folks interested. I don't develop in Windows, so I haven't tested this, but I had assumed that setting up the env dict was necessary for cross-platform functionality. Yuji's answer does not, but perhaps that works on both UNIX and Windows.
-
Dims over 5 years
fatal: Not a valid object name parent
-
Greg Hewgill over 5 years
@Dims: The use of the branch name parent is an example provided in the documentation. You would use your own branch name there.
-
pfm over 5 years
Add a .decode('ascii').strip() to decode the binary string (and remove the line break).
-
Dims over 5 years
@GregHewgill then it should be specified <branch_name> or emphasised in text
-
José L. Patiño over 5 years
> If its not in the standard library, its not 'portable'.
I'm sorry, but that does not make any sense at all. Using a language implementation (package) that abstracts the programmer from the platform (s)he is using is by far more portable than calling subprocesses that rely on the underlying platform and on the existence of such exact commands, which can be different in Mac, Linux, Windows and BSDs. By definition, using abstraction interfaces is the very meaning of portable, while calling subcommands from your programs is absolutely not.
-
Syed Priom about 5 years
Doesn't work if there's no tag, as opposed to Yuji 'Tomita' Tomita's answer below, which exactly provides what the question asks for.
-
Marc almost 5 years
Instead of using os.chdir, the cwd= arg can be used in check_output to temporarily change the working directory before executing.
-
Nathaniel Ford over 4 years
Not being able to import libraries is the pathological case, not the common case. The common case in the vast majority of programming is using libraries - particularly to avoid clunky interaction with external programs. If you can't you should use subprocess, but failing that or another compelling reason this is the best-practice solution: use a battle-hardened library built to handle the use case in question.
-
Federico Motta over 4 years
Looking at the git blame, they did this as a bug fix for SVN 11 years ago: github.com/numpy/numpy/commit/… It's possible the bug fix is no longer necessary for git.
-
am9417 over 4 years
Sometimes the /refs/ is not found, but the current commit id is found in "packed-refs".
-
z0r about 4 years
@MD004 @ryanjdillon They set the locale so that .decode('ascii') works - otherwise the encoding is unknown.
-
z0r about 4 years
Or add universal_newlines=True to get a string.
-
chrislondon almost 4 years
This worked for me though I had to change the '\\' to '/'. Must be a Windows thing?
-
HamzDiou almost 4 years
Agree with both opinions. I went with subprocess because GitPython needs Python > 3.4 and I'm still using Python 2.7. Maybe I will use GitPython later...
-
HamzDiou almost 4 years
I have tested subprocess on my local machine, and when deploying it crashed on the dev machine! I understand the matter better now and will go back to raven, which simply allows that on Python 2.7: import raven -> raven.fetch_git_sha(BASE_DIR)
-
jlansey over 3 years
Is there any way to import this function and use it? I tried: from numpy.setup import git_version and it didn't work
-
ryanjdillon over 3 years
Being a function declared in setup.py, it is not part of the numpy package, so it isn't possible to import it from numpy. To use it, you would need to add this method to your own code somewhere.
-
RobinL over 3 years
If you always want the hash, git describe --always is no good because it returns the annotated tag if one exists
-
Yunnosch over 3 years
@Reishin I think you meant "environment-specific-coding". I think so because that would suffer less risk of being flagged for inappropriate language. (Which by the way I did not - for being too slow....)
-
Wayne Workman over 3 years
This avoids the resource leaks mentioned in gitpython and this is a pretty clean 2-line def to get the hash. I like it.
-
duhaime almost 3 years
The real solution is import subprocess; subprocess.check_output('pip install gitpython') 😎
-
Kipr over 2 years
If your code is run from another directory, you might want to add cwd=os.path.dirname(os.path.realpath(__file__)) as a parameter for check_output
-
John over 2 years
Thank you for including the case where the script is called from outside the repo. That just bit me.