How can I tell what happened in a Git commit with two parents that did not merge in the changes from the second parent?


Solution 1

However, he says that he does sometimes do a pull origin master that results in everyone else's changes getting put in his index or working tree as if he had made those changes and not the actual authors of those changes.

It sounds like he was getting merge conflicts but does not understand what they are. This is an extremely common problem, and unfortunately, we don't know a good way to avoid it (switching back to SVN doesn't avoid it, for example).

How could this happen, exactly?

Let's call your developers Alice and Bob. Alice made commits 1-5, and Bob made commits A and X. Here is a plausible history.

  1. Bob makes commit A.

  2. Alice makes commits 1-5, and pushes them to the central repository.

  3. Bob tries to push A, but can't, because his repository is out of date.

    $ git push
     ! [rejected]        master -> master (non-fast-forward)
    
  4. Bob then does what you told him to do: he pulls first. However, he gets a merge conflict because commit A and commits 1-5 touch some of the same code.

    $ git pull
    Auto-merging file.txt
    CONFLICT (content): Merge conflict in file.txt
    Automatic merge failed; fix conflicts and then commit the result.
    
  5. Bob sees other people's changes in his working directory, and doesn't understand why the changes are there.

    $ git status
        both modified:   file.txt
    
  6. He thinks Git is doing something wrong, when in fact, Git is asking him to resolve a merge conflict. He tries to check out a fresh copy, but gets an error:

    $ git checkout HEAD file.txt  
    error: path 'file.txt' is unmerged
    
  7. Since it doesn't work, he tries -f:

    $ git checkout -f HEAD file.txt
    warning: path 'file.txt' is unmerged
    
  8. Success! He commits and pushes.

    $ git commit
    $ git push
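
The sequence above can be reproduced in a throwaway repository. This is only a sketch under stated assumptions: the file name, the branch name alice, and the commit messages are made up for illustration, and a local branch stands in for the central repository. It shows the symptoms from the question: X has two parents, X is identical to A, and a path-limited git log hides Alice's commits unless --full-history is given.

```shell
#!/bin/sh
# Reproduce a merge commit that silently drops its second parent's changes.
# All names here (file.txt, branch "alice", messages) are illustrative.
set -eu
cd "$(mktemp -d)"
git init -q repo
cd repo
git config user.name "Bob"
git config user.email "bob@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # master or main, depending on config

# Alice's work (stands in for commits 1-5).
git checkout -q -b alice
echo "alice" > file.txt
git commit -q -am "commit 5"

# Bob's conflicting commit A.
git checkout -q "$main"
echo "bob" > file.txt
git commit -q -am "commit A"

# The merge stops with a conflict, as in step 4.
git merge alice >/dev/null 2>&1 || true

# Steps 7 and 8: overwrite the conflicted file with Bob's version, then commit.
git checkout -f HEAD -- file.txt 2>/dev/null
git commit -q -m "commit X"

# X has two parents (three hashes are printed: X, A, 5)...
git rev-list --parents -n 1 HEAD
# ...but its content is identical to the first parent, A:
git diff --quiet HEAD^1 HEAD && echo "X matches A exactly"
# ...and it differs from the second parent by all of Alice's work:
git diff --stat HEAD^2 HEAD

# A path-limited log follows only the parent X agrees with, so Alice's
# commit is hidden; --full-history walks both parents and shows it.
git log --oneline -- file.txt
git log --oneline --full-history -- file.txt
```

On a real repository, the same three checks (git rev-list --parents, git diff against ^1, git diff against ^2) can be run against the hash of X instead of HEAD.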
    

The part where it gets harder

There are a lot of git tools out there. Seriously. Visual Studio and Xcode both come with Git integration, there are several other GUIs, and there are even multiple command-line clients. People are also sloppy with the way they describe how they use Git, and most developers are not quite comfortable enough with how Git works outside of the "pull commit push" workflow.

There was an excellent paper on this very subject not too long ago (I'm having a hard time finding it). Some of the conclusions were (forgive my memory):

  • Most developers don't really know how to use source control, except for a few really simple commands (commit, push).

  • When source control doesn't behave the way developers expect, they resort to tactics such as copy-pasting some command they don't quite understand to "fix things", adding the -f flag, or erasing the repository and starting again with a clean copy.

  • On development teams, it is often the case that only the lead developers really know what is going on in the repo.

So this is really an educational challenge.

I think the key lesson here that Bob needs to learn is that git pull is really just git fetch and git merge, and that you can get merge conflicts, and you need to act in a very conscientious and purposeful manner when resolving merges. This applies even when there are no reported conflicts... but let's not blow Bob's mind too much for now!
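
To make the fetch-then-merge point concrete, here is a rough sketch of what Bob could do instead of a bare git pull. It builds a scratch origin with two clones so it can be run anywhere; the directory names, the helper new_clone, and the file contents are assumptions made up for this example, not anything from the original repository.

```shell
#!/bin/sh
# Pull in two explicit steps (fetch, then merge) and resolve the conflict
# on purpose. Everything is created in a temporary directory.
set -eu
cd "$(mktemp -d)"

# new_clone is a hypothetical helper: clone origin.git and set an identity.
new_clone() {
    git clone -q origin.git "$1"
    git -C "$1" config user.name "$1"
    git -C "$1" config user.email "$1@example.com"
}

# Seed a central repository with one commit.
git init -q seed
git -C seed config user.name "seed"
git -C seed config user.email "seed@example.com"
echo "base" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "base"
branch=$(git -C seed symbolic-ref --short HEAD)
git clone -q --bare seed origin.git
new_clone alice
new_clone bob

# Alice changes the file and pushes first.
echo "alice's change" > alice/file.txt
git -C alice commit -q -am "commit 5"
git -C alice push -q origin "$branch"

# Bob commits a conflicting change locally.
echo "bob's change" > bob/file.txt
git -C bob commit -q -am "commit A"

cd bob
git fetch -q origin                        # step 1 of "pull": download only
git log --oneline "HEAD..origin/$branch"   # review what is about to be merged
git merge "origin/$branch" || true         # step 2 of "pull": stops on conflict
git status --short                         # conflicted paths are listed here

# Resolve deliberately (here: keep both sides), stage, and conclude the merge.
printf "alice's change\nbob's change\n" > file.txt
git add file.txt
git commit -q -m "Merge origin/$branch, keeping both changes"
git push -q origin "$branch"
```

If the merge turns out to be more than Bob wants to deal with right now, git merge --abort returns the working tree to its state before the merge, which is a safer escape hatch than -f or a fresh clone.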

The other key lesson here is that lead developers need to take the time to ensure that everyone on the team can use source control correctly, and understands how pulling, pushing, branching, and merging are all related. This is a great opportunity for a lunchtime lecture: put together some slides, buy pizza, and talk about how Git works.

Solution 2

There are several ways to end up with other people's changes in the index. A pull is a fetch followed by a merge. That merge can result in a conflict, which would look as you described, with other people's changes in your index. A user who doesn't understand conflict management can cause a lot of damage, and the result could be the bad merge.

Otherwise, they'd have to pass extra flags to git pull like --no-commit to make it behave as they describe.

Here's how I'd investigate...

Users are notorious for not reporting all the information. I'd find out exactly what they're doing when the problem happens and ask them to copy their terminal history the next time it does. Their shell history or reflog might be interesting, too.

Check their configuration. I would look at their ~/.gitconfig, project/.git/config and env | grep GIT to see if there's anything funny.

I'd also find out if they're using git on the command line or some tool; the tool could be causing the problem.

Find out what version of git they're using. Maybe it's an old or buggy release (though I have yet to encounter a situation caused by a git bug).

Check their remotes. It's possible they've got some other repository mixed in somehow.

Does the repository have any hooks? If so, are they using utilities that might not be working as expected on the user's machine?
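
Most of the checks above can be collected in one pass. The function below is only a sketch: git_env_report is a made-up name, and it assumes a reasonably recent Git (config --show-origin and rev-parse --git-path are not available in very old releases). It only reads information and changes nothing in the repository.

```shell
#!/bin/sh
# git_env_report (hypothetical helper): print version, configuration,
# GIT_* environment variables, remotes, recent reflog, and active hooks
# for the repository at the given path (default: current directory).
git_env_report() (
    cd "${1:-.}" || exit 1

    echo "== version =="
    git --version

    echo "== config, with the file each setting comes from =="
    git config --list --show-origin

    echo "== GIT_* environment variables =="
    env | grep '^GIT_' || echo "(none set)"

    echo "== remotes =="
    git remote -v

    echo "== last 20 reflog entries =="
    git reflog -n 20 --date=iso || echo "(no reflog)"

    echo "== hooks that are not samples =="
    hooks_dir=$(git rev-parse --git-path hooks)
    find "$hooks_dir" -type f ! -name '*.sample' 2>/dev/null || true
)

# Usage, run on the affected developer's machine:
#   git_env_report /path/to/their/clone > report.txt
```

The reflog section is usually the most telling: entries such as rebase, reset, or checkout around the time of the bad merge show what was actually run, regardless of what the user remembers.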

Author by

DAC

Java Web App Developer

Updated on September 22, 2022

Comments

  • DAC
    DAC over 1 year

    In Gitk I can see a team member's commit (X) that has two parents: the first parent is his own previous commit (A), and the other parent contains lots of other people's commits (1 through 5). After his merge, all changes made by other people (1 through 5 and others) are no longer present at X, B, C, etc...

    A------------
                  \
                   X - B - C 
                  /
    1--2--3--4--5
               /
    e--r--j--k
         /
    l--m
    

    If I diff commit X to commit A it shows no differences; if I diff commit X to commit 5 it shows all the missing changes. Also, at commit X, B, or C, git log does not show changes that were made to files in commits 1 through 5. However, if I do git log --full-history, then history does show the changes that were made in 1 through 5, but those changes are no longer present in the actual file and history does not show them being undone. So git log --full-history seems to contradict the current file contents.

    I talked to the user who made commit X. He says he did not do a reset or rebase and he says he hasn't reverted any commits during the time in question. However, he says that he does sometimes do a pull origin master that results in everyone else's changes getting put in his index or working tree as if he had made those changes and not the actual authors of those changes. He says when that happens he does a fresh clone and does not push anything from that local repo to master because he believes Git has done something wrong.

    Are the two things related (bad pull and bad merge)?

    How can I tell exactly what happened so that we can avoid this in the future?

    And what causes Git to sometimes put changes pulled from origin master to be placed in the local working directory or index as if they were local changes?

  • DAC
    DAC over 9 years
    The user in question knows the importance of merge resolutions and has a history of correct resolution. However, this user does use several different tools: Git Bash to pull and push, and a NetBeans client to commit. I'll review his environment vars and tool settings when the user returns to work tomorrow.
  • DAC
    DAC over 9 years
    I thought Bob knew how to resolve merge conflicts, however, since you suspect this, I'll investigate what Bob actually does when conflicts arise. Will comment again after doing so.
  • Dietrich Epp
    Dietrich Epp over 9 years
    He probably does know how to resolve merge conflicts, but some people don't realize that you can get merge conflicts when you do a git pull, and some people will run increasingly reckless commands when Git is doing something that they don't understand.
  • DAC
    DAC over 9 years
    Further discussion resulted in the user recalling that maybe he did do a rebase followed by a push after all. He also said that when a pull results in other people's changes in his working tree, he becomes convinced that Git has done something wrong and re-clones, and he uses the IDE's conflict resolution GUI tool.
  • DAC
    DAC almost 6 years
    I have come to realize how this usually happens. When pull origin master fails due to conflicts, git status will show all of the changes from the other branch as staged for commit. Bob thinks, "those aren't my changes," panics, and starts unstaging stuff. But actually, Bob must commit all those other people's changes after resolving the conflicts; otherwise, Git will never merge them in. The solution is to tell Bob: when you have conflicts, resolve them and commit. Don't worry about other people's changes in your staging area; that's normal.
  • Betlista
    Betlista over 4 years
    Hello, I'm Bob.