How do we verify commit messages for a push?

Solution 1

Using the update hook

You know about hooks, so please read the documentation about them! The hook you probably want is update, which runs once per ref. (The pre-receive hook runs once for the entire push.) There are many questions and answers about these hooks already on SO; depending on what you want to do, you can probably find guidance there on how to write the hook if you need it.

To emphasize that this really is possible, a quote from the docs:

This hook can be used to prevent forced update on certain refs by making sure that the object name is a commit object that is a descendant of the commit object named by the old object name. That is, to enforce a "fast-forward only" policy.

It could also be used to log the old..new status.

And the specifics:

The hook executes once for each ref to be updated, and takes three parameters:

  • the name of the ref being updated,
  • the old object name stored in the ref,
  • and the new object name to be stored in the ref.
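
As a rough sketch of how those three parameters arrive, assuming the hook is written in Python instead of shell: they are ordinary command-line arguments. The function names `parse_update_args` and `main` are hypothetical helpers, not part of git.

```python
import sys


def parse_update_args(argv):
    """Return (refname, old, new) from the update hook's argument list.

    argv[0] is the hook path; the three parameters follow it.
    """
    if len(argv) != 4:
        raise ValueError("expected: update <refname> <old> <new>")
    _, refname, old, new = argv
    return refname, old, new


def main(argv):
    refname, old, new = parse_update_args(argv)
    # Anything a hook prints is relayed to the user who is pushing.
    print("checking %s: %s..%s" % (refname, old, new))
    return 0  # exit status 0 accepts the update of this ref
```

A real hook file would end with `sys.exit(main(sys.argv))` and return a non-zero status to refuse the update.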

So, for example, if you want to make sure that none of the commit subjects are longer than 80 characters, a very rudimentary implementation would be:

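#!/bin/bash
# Arguments: $1 = ref name, $2 = old object name, $3 = new object name.
# Note: for a newly created ref, $2 is all zeros and "$2..$3" is not a valid range.
# Find the first commit subject longer than 80 characters, if any.
long_subject=$(git log --pretty=%s "$2..$3" | grep -E -m 1 '.{81}')
if [ -n "$long_subject" ]; then
    echo "error: commit subject over 80 characters:" >&2
    echo "    $long_subject" >&2
    exit 1
fi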

Of course, that's a toy example; in the general case, you'd use a log output containing the full commit message, split it up per-commit, and call your verification code on each individual commit message.
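
A minimal sketch of that general case, written in Python under a few assumptions: the log is produced with `git log -z --format=%H%n%B` so that commits are separated by NUL bytes, and `check_message` stands in for whatever verification you actually need (here it just reuses the 80-character limit). All function names are hypothetical.

```python
import subprocess

MAX_SUBJECT = 80  # limit reused from the toy example


def split_log(raw):
    """Split NUL-separated `git log -z --format=%H%n%B` output into
    (commit_hash, message) pairs."""
    commits = []
    for record in raw.split("\0"):
        if not record.strip():
            continue  # skip the empty piece after a trailing separator
        commit_hash, _, message = record.partition("\n")
        commits.append((commit_hash, message.strip()))
    return commits


def check_message(message):
    """Placeholder policy: return an error string, or None if acceptable."""
    subject = message.split("\n", 1)[0]
    if len(subject) > MAX_SUBJECT:
        return "subject over %d characters" % MAX_SUBJECT
    return None


def check_range(old, new):
    """Return (commit_hash, error) pairs for the failing commits in old..new."""
    raw = subprocess.check_output(
        ["git", "log", "-z", "--format=%H%n%B", "%s..%s" % (old, new)],
        universal_newlines=True,
    )
    failures = []
    for commit_hash, message in split_log(raw):
        error = check_message(message)
        if error:
            failures.append((commit_hash, error))
    return failures
```

The hook would call `check_range` with its second and third arguments and exit non-zero if the returned list is not empty.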

Why you want the update hook

This has been discussed/clarified in the comments; here's a summary.

The update hook runs once per ref. A ref is a pointer to an object; in this case, we're talking about branches and tags, and generally just branches (people don't push tags often, since they're usually just for marking versions).

Now, if a user is pushing updates to two branches, master and experimental:

o - o - o (origin/master) - o - X - o - o (master)
 \
  o - o (origin/experimental) - o - o (experimental)

Suppose that X is the "bad" commit, i.e. the one which would fail the commit-msg hook. Clearly we don't want to accept the push to master. So, the update hook rejects that. But there's nothing wrong with the commits on experimental! The update hook accepts that one. Therefore, origin/master stays unchanged, but origin/experimental gets updated:

o - o - o (origin/master) - o - X - o - o (master)
 \
  o - o - o - o (origin/experimental, experimental)

The pre-receive hook runs only once, just before beginning to update refs (before the first time the update hook is run). If you used it, you'd have to cause the whole push to fail, thus saying that because there was a bad commit message on master, you somehow no longer trust that the commits on experimental are good even though their messages are fine!
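
The difference can be modelled in a few lines of Python. This is only a toy model of the two outcomes, not git's implementation; `ref_results` is a hypothetical mapping from ref name to whether that ref's commit messages passed the check.

```python
def update_policy(ref_results):
    """Per-ref decision: every ref is accepted or rejected on its own."""
    return dict(ref_results)


def pre_receive_policy(ref_results):
    """All-or-nothing decision: one failing ref rejects the whole push."""
    all_ok = all(ref_results.values())
    return {ref: all_ok for ref in ref_results}
```

With master failing and experimental passing, the first function accepts experimental only, while the second rejects both.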

Solution 2

You could do it with the following pre-receive hook. As the other answers have noted, this is a conservative, all-or-nothing approach. Note that it protects only the master branch and places no constraints on commit messages on topic branches.

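#! /usr/bin/perl

use strict;
use warnings;

# pre-receive reads one "<old> <new> <ref>" line per pushed ref on stdin.
my $errors = 0;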
while (<>) {
  chomp;
  next unless my($old,$new) =
    m[ ^ ([0-9a-f]+) \s+   # old SHA-1
         ([0-9a-f]+) \s+   # new SHA-1
         refs/heads/master # ref
       \s* $ ]x;

  chomp(my @commits = `git rev-list $old..$new`);
  if ($?) {
    warn "git rev-list $old..$new failed\n";
    ++$errors, next;
  }

  foreach my $sha1 (@commits) {
    my $msg = `git cat-file commit $sha1`;
    if ($?) {
      warn "git cat-file commit $sha1 failed\n";
      ++$errors, next;
    }

    $msg =~ s/\A.+? ^$ \s+//smx;
    unless ($msg =~ /\[\d+\]/) {
      warn "No bug number in $sha1:\n\n" . $msg . "\n";
      ++$errors, next;
    }
  }
}

exit $errors == 0 ? 0 : 1;
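
The substitution in the hook above strips the header lines that `git cat-file commit` prints before the message (tree, parent, author, committer), so that only the message itself is searched. A rough Python equivalent of that step, with hypothetical function names, might look like this:

```python
import re

BUG_NUMBER = re.compile(r"\[\d+\]")  # same pattern as the Perl hook


def message_from_raw_commit(raw):
    """Return the message part of `git cat-file commit <sha>` output.

    The headers end at the first empty line; the message follows it.
    """
    _headers, _, message = raw.partition("\n\n")
    return message.lstrip()


def has_bug_number(raw):
    """True if the commit message (not the headers) contains e.g. [123]."""
    return BUG_NUMBER.search(message_from_raw_commit(raw)) is not None
```

Searching only the message matters because a header such as the author name could otherwise satisfy the pattern by accident.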

It requires all commits in a push to have a bug number somewhere in their respective commit messages, not just the tip. For example:

$ git log --pretty=oneline origin/master..HEAD
354d783efd7b99ad8666db45d33e30930e4c8bb7 second [123]
aeb73d00456fc73f5e33129fb0dcb16718536489 no bug number

$ git push origin master
Counting objects: 6, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (5/5), 489 bytes, done.
Total 5 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (5/5), done.
No bug number in aeb73d00456fc73f5e33129fb0dcb16718536489:

no bug number

To file:///tmp/bare.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'file:///tmp/bare.git'

Say we fix the problem by squashing the two commits together and pushing the result:

$ git rebase -i origin/master
[...]

$ git log --pretty=oneline origin/master..HEAD
74980036dbac95c97f5c6bfd64a1faa4c01dd754 second [123]

$ git push origin master
Counting objects: 4, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 279 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
To file:///tmp/bare.git
   8388e88..7498003  master -> master

Solution 3

This is a Python version of a pre-receive hook. It took me a while to finish, so I hope it helps others. I mainly use it with Trac, but it could easily be modified for other purposes.

The hook also prints instructions for rewriting an existing commit message, which is a little more complicated than I expected.

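#!/usr/bin/env python3
import re
import subprocess
import sys

SEPARATOR = "----****----"


def pushed_commits(oldrev, newrev):
    """Yield (hash, message) for each commit introduced by one ref update."""
    if not oldrev.strip("0"):
        # New ref: the old revision is all zeros, so "old..new" is not a valid
        # range. Look only at commits not reachable from any existing ref.
        rev_args = [newrev, "--not", "--all"]
    else:
        rev_args = [oldrev + ".." + newrev]
    output = subprocess.check_output(
        ["git", "log", "--format=%H%n%s%n%b%n" + SEPARATOR] + rev_args,
        universal_newlines=True,
    )
    for chunk in output.split(SEPARATOR):
        line_list = chunk.strip().split("\n")
        if line_list[0]:  # skip the empty chunk after the last separator
            yield line_list[0], " ".join(line_list[1:])


def main():
    bad_commits = []
    # pre-receive gets one "<old> <new> <ref>" line on stdin per pushed ref.
    for line in sys.stdin:
        oldrev, newrev, refname = line.split()
        if not newrev.strip("0"):
            continue  # all-zero new revision: the ref is being deleted
        for commit_hash, content in pushed_commits(oldrev, newrev):
            if not re.search(r"refs *#[0-9]+", content):  # check for keyword
                bad_commits.append(commit_hash)

    if bad_commits:
        print("Please reference a Trac ticket when committing the source code!")
        print("Commits without a ticket reference:")
        for commit_hash in bad_commits:
            print("    " + commit_hash)
        print("Use these commands to change a commit message (one commit at a time):")
        # git log lists newest first, so the last entry is the oldest bad
        # commit of the last ref checked.
        print("1. run: git rebase --interactive " + bad_commits[-1] + "^")
        print("2. In the default editor, change 'pick' to 'edit' on the line of the commit you want to modify")
        print("3. run: git commit --amend")
        print("4. modify the commit message")
        print("5. run: git rebase --continue")
        print("6. remember to add the ticket number next time!")
        print("reference: http://stackoverflow.com/questions/1186535/how-to-modify-a-specified-commit")
        sys.exit(1)


if __name__ == "__main__":
    main()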

Solution 4

You need to write a pre-receive hook script.

That script receives the old and new revision of each ref being pushed. You can check every commit in between and exit with a non-zero status if any of them is bad.
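
As a minimal sketch of that idea in Python: pre-receive gets one line per pushed ref on standard input, each holding the old revision, the new revision and the ref name. The function names are hypothetical, and `range_is_ok` is a placeholder for whatever check you run on the commits between the two revisions.

```python
def parse_pre_receive(stdin_text):
    """Return a list of (old, new, refname) tuples, one per pushed ref."""
    updates = []
    for line in stdin_text.splitlines():
        if not line.strip():
            continue
        old, new, refname = line.split()
        updates.append((old, new, refname))
    return updates


def exit_status(updates, range_is_ok):
    """Return 0 to accept the push, or 1 to reject all of it.

    `range_is_ok(old, new)` is a caller-supplied check (an assumption here).
    """
    if all(range_is_ok(old, new) for old, new, _ in updates):
        return 0
    return 1
```

A hook would pass `sys.stdin.read()` to `parse_pre_receive` and hand the result of `exit_status` to `sys.exit`.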

Solution 5

You didn't mention which bug tracker you use, but if it is JIRA, the add-on named Commit Policy can do this for you without any programming.

You can set up a commit condition which requires the commit message to match a regular expression. If it doesn't, the push is rejected, and the developer must amend (fix) the commit message, then push again.
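
Which expression to configure is up to you. As an illustration only, the bracketed bug-number suffix described in the question ("... [9999]") could be matched with a pattern like the one below. It is tried here with Python's `re` module; the add-on's own regular expression dialect may differ, and `subject_conforms` is a hypothetical helper.

```python
import re

# Assumed policy: the subject line ends with a bug number in square brackets.
BUG_SUFFIX = re.compile(r"\[\d+\]\s*$")


def subject_conforms(message):
    """True if the first line of the commit message ends with e.g. [9999]."""
    subject = message.split("\n", 1)[0]
    return BUG_SUFFIX.search(subject) is not None
```

It is worth trying the pattern on a handful of real commit messages before enforcing it.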

Author: Dale Forester

Updated on July 09, 2022

Comments

  • Dale Forester almost 2 years

    Coming from CVS, we have a policy that commit messages should be tagged with a bug number (simple suffix "... [9999]"). A CVS script checks this during commits and rejects the commit if the message does not conform.

    The git hook commit-msg does this on the developer side but we find it helpful to have automated systems check and remind us of this.

    During a git push, commit-msg isn't run. Is there another hook during push that could check commit messages?

    How do we verify commit messages during a git push?

  • Can Berk Güder about 14 years
    I think the hook the OP is looking for is pre-receive, since s/he wants to reject the entire push depending on the commit message. However, AFAIK, neither pre-receive nor update receives the commit message as input. So using commit-msg will probably be the best solution.
  • Cascabel about 14 years
    @Can: I'm pretty sure the OP wants update, not pre-receive. "The whole push" means the push for all branches. If the user attempts to push updates to three branches, and only one contains invalid commit messages, the other two should still be accepted!
  • Cascabel about 14 years
    @Can: And no, the commit message is not part of the input, but the old and new object (commit) names (SHA1s) are. Note that the update hook is executed just before the refs are updated (after the commit objects have been received). The hook can therefore use git log to inspect whatever it wants to about the commits between old and new, including their commit messages.
  • John Feminella about 14 years
    @Jefromi » I'm not sure I agree, but I think this part is subjective. IMO I'd treat it as a transaction: if any part of something you did is bad, stop the whole thing so you can correct the mistakes.
  • Dale Forester about 14 years
    @John: That would be the most straightforward and desirable. The whole thing should fail if any one part is invalid.
  • Cascabel about 14 years
    @John: Well, you can make your own judgment call. Here's my general thought, though. It's consistent with the general philosophy of branches in git to treat each one as a transaction. You do stop the push of that individual branch if it has one bad commit, even if it has 500 new commits on it. But two different branches are two different things - different topics, different features. If you work on two things and make a mistake on one, it shouldn't affect the other.
  • Cascabel about 14 years
    @shovas: But what does an invalid commit message X on branch A have to do with branch B? B can't contain X, or it'd fail the hook too. So B is a series of commits, all of which are perfectly fine. Why should pushing B be refused simply because the developer also did something wrong on A? If they pushed branches individually (git push A, git push B) A would fail and B would succeed.
  • Cascabel about 14 years
    @shovas: I do agree that push of "the whole" should fail if "one part" is invalid, but the proper definitions in this context are that "the whole" is the branch, and the one part is a single commit. Content on one branch cannot invalidate content on another branch.
  • Dale Forester about 14 years
    @Jefromi: You make a good point. I guess my uncertainty is about the "once per ref" idea in the docs. Does that mean all commits on a branch are one ref?
  • Cascabel about 14 years
    @shovas: A ref is a pointer to an object, generally a commit. Examples of refs are tags and branches. So when the docs say once per ref updated in a push, they mean once per branch/tag. If you've made 50 commits on a branch since you pushed, git uploads all 50, then updates the ref. The old position of the ref is just before the first commit pushed; the new is the last commit pushed. The hook runs just before updating the ref; you can examine all 50 commits. If the hook fails, the ref won't move at all.
  • Zarathustra almost 8 years
    but git log --pretty=%s $2..$3 does not work when pushing a new branch to a remote. Error message: Invalid revision range 0000000000000000000000000000000000000000..d480cc0993800cadd6c23d00c608fe5272300896
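
One possible way to handle the new-branch case from the last comment, sketched in Python: when the old object name is all zeros there is no range to walk, so the hook can instead look at the commits reachable from the new revision but not from any existing ref (`--not --all`). This relies on the hook running before the refs are updated, and it skips commits that are already reachable from another ref. The function names are hypothetical.

```python
def is_null(rev):
    """True for the all-zero object name git uses to mean "no object"."""
    return rev != "" and rev.strip("0") == ""


def rev_range_args(old, new):
    """Return revision arguments for `git log` or `git rev-list`.

    Returns None when the ref is being deleted, since there is nothing to check.
    """
    if is_null(new):
        return None  # ref deletion
    if is_null(old):
        # New ref: only commits not reachable from any existing ref.
        return [new, "--not", "--all"]
    return ["%s..%s" % (old, new)]
```

The returned list can be appended to a `git log --pretty=%s` command line in place of the fixed `$2..$3` range.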