Git commits are duplicated in the same branch after doing a rebase

Solution 1

You should not be using rebase here; a simple merge will suffice. The Pro Git book that you linked explains essentially this exact situation. The inner workings might be slightly different, but here's how I visualize it:

  • C5 and C6 are temporarily pulled out of dev
  • C7 is applied to dev
  • C5 and C6 are played back on top of C7, creating new diffs and therefore new commits

So, in your dev branch, C5 and C6 effectively no longer exist: they are now C5' and C6'. When you push to origin/dev, git sees C5' and C6' as new commits and tacks them on to the end of the history. Indeed, if you compare C5 and C5' in origin/dev, you'll notice that although they introduce the same changes, they have different parents and snapshot different trees (C5' includes C7), which makes the commit hashes different.
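
To see this for yourself, here is a minimal sketch that rebuilds the C4..C7 situation in a throwaway repository. The `commit` helper, the file names, and the demo identity are assumptions made for illustration. It compares the hash and the `git patch-id` of C5 before and after the rebase: the change is the same, the commit is not.

```shell
#!/usr/bin/env bash
set -euo pipefail
# Throwaway identity so the script does not depend on your Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: each commit adds its own file, so the rebase never conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the unborn branch "master"

commit C4
git checkout -q -b dev
commit C5
commit C6
git checkout -q master
commit C7

# C5 before the rebase: its hash and the ID of the patch it introduces.
old_c5=$(git rev-parse dev~1)
old_patch=$(git show dev~1 | git patch-id | cut -d' ' -f1)

git checkout -q dev
git rebase -q master                      # dev becomes C4 C7 C5' C6'

# C5' after the rebase, plus the commit it now sits on.
new_c5=$(git rev-parse dev~1)
new_patch=$(git show dev~1 | git patch-id | cut -d' ' -f1)
new_parent=$(git rev-parse dev~2)

echo "C5  hash: $old_c5  patch-id: $old_patch"
echo "C5' hash: $new_c5  patch-id: $new_patch"
```

The patch IDs match while the hashes differ, and C5' now has C7 as its parent.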

I'll restate the Pro Git rule: never rebase commits that have ever existed anywhere but your local repository. Use merge instead.

Solution 2

Short answer

You omitted the fact that you ran git push, got the following error, and then proceeded to run git pull:

To git@bitbucket.org:username/test1.git
 ! [rejected]        dev -> dev (non-fast-forward)
error: failed to push some refs to 'git@bitbucket.org:username/test1.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

Despite Git trying to be helpful, its 'git pull' advice is most likely not what you want to do.

If you are:

  • Working on a "feature branch" or "developer branch" alone, then you can run git push --force to update the remote with your post-rebase commits (as per user4405677's answer).
  • Working on a branch with multiple developers at the same time, then you probably should not be using git rebase in the first place. To update dev with changes from master, you should, instead of running git rebase master dev, run git merge master whilst on dev (as per Justin's answer).
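
As a rough illustration of the merge option, the sketch below (throwaway repository, hypothetical `commit` helper and file names) merges master into dev and checks that the already-published tip of dev is still part of the history, so a later push would be a plain fast-forward.

```shell
#!/usr/bin/env bash
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

commit C4
git checkout -q -b dev
commit C5
commit C6
git checkout -q master
commit C7

# Pretend this is what origin/dev already points at.
published_tip=$(git rev-parse dev)

git checkout -q dev
git merge -q --no-edit master             # merge instead of rebase

first_parent=$(git rev-parse 'dev^1')
merge_count=$(git rev-list --merges --count dev)

# Exit status 0 means the published tip is an ancestor of the new dev.
git merge-base --is-ancestor "$published_tip" dev && still_ancestor=yes

echo "published tip kept: $still_ancestor, merge commits on dev: $merge_count"
```

C5 and C6 keep their hashes; the only new commit is the merge itself.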

A slightly longer explanation

Each commit hash in Git is based on a number of factors, one of which is the hash of the commit that comes before it.

If you reorder commits you will change commit hashes; rebasing (when it does something) will change commit hashes. Consequently, running git rebase master dev, where dev is out of sync with master, creates new commits (and thus new hashes) with the same content as those on dev, but with the commits on master inserted before them.
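
A small sketch of the "parent is part of the hash" point, using the plumbing command `git commit-tree`. The fixed dates, the demo identity, and the `commit` helper are assumptions made so the result is reproducible: the same tree, message, author, and date give a different hash as soon as the parent changes, and the same hash when nothing changes.

```shell
#!/usr/bin/env bash
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Pin both timestamps so only the parent varies between the commits below.
export GIT_AUTHOR_DATE="1577836800 +0000" GIT_COMMITTER_DATE="1577836800 +0000"

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

repo=$(mktemp -d)
cd "$repo"
git init -q
commit C0
commit C1

tree=$(git rev-parse 'HEAD^{tree}')       # the snapshot recorded by C1

# Same tree and message, different parent.
on_c1=$(git commit-tree "$tree" -p HEAD   -m "same content")
on_c0=$(git commit-tree "$tree" -p HEAD~1 -m "same content")
# Identical inputs a second time.
again=$(git commit-tree "$tree" -p HEAD   -m "same content")

echo "parent C1: $on_c1"
echo "parent C0: $on_c0"
echo "parent C1: $again (repeat)"
```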

You can end up in a situation like this in multiple ways. Two ways I can think of:

  • You could have commits on master that you want to base your dev work on
  • You could have commits on dev that have already been pushed to a remote, which you then proceed to change (reword commit messages, reorder commits, squash commits, etc.)

Let's better understand what happened—here is an example:

You have a repository:

2a2e220 (HEAD, master) C5
ab1bda4 C4
3cb46a9 C3
85f59ab C2
4516164 C1
0e783a3 C0

Initial set of linear commits in a repository

You then proceed to change commits.

git rebase --interactive HEAD~3 # Three commits before where HEAD is pointing

(This is where you'll have to take my word for it: there are a number of ways to change commits in Git. In this example I changed the time of C3, but you could be inserting new commits, changing commit messages, reordering commits, squashing commits together, etc.)

ba7688a (HEAD, master) C5
44085d5 C4
961390d C3
85f59ab C2
4516164 C1
0e783a3 C0

The same commits with new hashes

This is where it is important to notice that the commit hashes are different. This is expected behaviour since you have changed something (anything) about them. This is okay, BUT:

A graph log showing that master is out-of-sync with the remote

Trying to push will show you an error (and hint that you should run git pull).

$ git push origin master
To git@bitbucket.org:username/test1.git
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to 'git@bitbucket.org:username/test1.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

If we run git pull, we see this log:

070e71d (HEAD, master) Merge branch 'master' of bitbucket.org:username/test1
ba7688a C5
44085d5 C4
961390d C3
2a2e220 (origin/master) C5
85f59ab C2
ab1bda4 C4
4516164 C1
3cb46a9 C3
0e783a3 C0

Or, shown another way:

A graph log showing a merge commit

And now we have duplicate commits locally. If we were to run git push we would send them up to the server.
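
The whole sequence can be reproduced in a sandbox. This is only a sketch: the bare repository stands in for the remote, a `git commit --amend` with a changed date stands in for the interactive rebase, only the tip commit is rewritten, and the `commit` helper and paths are made up for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

work=$(mktemp -d)
git init -q --bare "$work/remote.git"     # stand-in for the server
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"

commit C3
commit C4
commit C5
git push -q -u origin master

# Rewrite the already-pushed tip: same content, new author date, new hash.
git commit -q --amend --no-edit --date="1577836800 +0000"

# A push would be rejected here; following the hint and pulling merges the
# old C5 from the remote with the rewritten C5.
git pull -q --no-rebase --no-edit origin master

copies=$(git log --format=%s | grep -c '^C5$')
merges=$(git rev-list --merges --count HEAD)
echo "commits titled C5: $copies, merge commits: $merges"
```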

To avoid getting to this stage, we could have run git push --force (where we instead ran git pull). This would have sent our commits with the new hashes to the server without issue. To fix the issue at this stage, we can reset back to before we ran git pull:

Look at the reflog (git reflog) to see what the commit hash was before we ran git pull.

070e71d HEAD@{1}: pull: Merge made by the 'recursive' strategy.
ba7688a HEAD@{2}: rebase -i (finish): returning to refs/heads/master
ba7688a HEAD@{3}: rebase -i (pick): C5
44085d5 HEAD@{4}: rebase -i (pick): C4
961390d HEAD@{5}: commit (amend): C3
3cb46a9 HEAD@{6}: cherry-pick: fast-forward
85f59ab HEAD@{7}: rebase -i (start): checkout HEAD~~~
2a2e220 HEAD@{8}: rebase -i (finish): returning to refs/heads/master
2a2e220 HEAD@{9}: rebase -i (start): checkout refs/remotes/origin/master
2a2e220 HEAD@{10}: commit: C5
ab1bda4 HEAD@{11}: commit: C4
3cb46a9 HEAD@{12}: commit: C3
85f59ab HEAD@{13}: commit: C2
4516164 HEAD@{14}: commit: C1
0e783a3 HEAD@{15}: commit (initial): C0

Above we see that ba7688a was the commit we were at before running git pull. With that commit hash in hand we can reset back to that (git reset --hard ba7688a) and then run git push --force.

And we're done.
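
Here is the recovery as a sandbox sketch, under the same assumptions as the reproduction (bare repository as the remote, an amend standing in for the rebase, hypothetical `commit` helper). It reads the pre-pull commit from the reflog as `HEAD@{1}`, and it swaps in `--force-with-lease`, the safer variant suggested in the comments, for the plain `--force`.

```shell
#!/usr/bin/env bash
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"

commit C4
commit C5
git push -q -u origin master

# Rewrite the tip, then make the mistake: pull instead of force-pushing.
git commit -q --amend --no-edit --date="1577836800 +0000"
rewritten=$(git rev-parse HEAD)
git pull -q --no-rebase --no-edit origin master

# The reflog entry just before the pull is the rewritten tip.
before_pull=$(git rev-parse 'HEAD@{1}')

# Drop the merge locally, then overwrite the remote branch.
git reset -q --hard "$before_pull"
git push -q --force-with-lease origin master

copies=$(git log --format=%s | grep -c '^C5$')
remote_tip=$(git --git-dir="$work/remote.git" rev-parse master)
echo "commits titled C5: $copies, remote tip: $remote_tip"
```

In a real repository, check `git reflog` by eye before resetting; `HEAD@{1}` is only correct if nothing else happened after the pull.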

But wait, I continued to base work off of the duplicated commits

If you somehow didn't notice that the commits were duplicated and continued working atop the duplicate commits, you've really made a mess for yourself. The size of the mess is proportional to the number of commits you have atop the duplicates.

What this looks like:

3b959b4 (HEAD, master) C10
8f84379 C9
0110e93 C8
6c4a525 C7
630e7b4 C6
070e71d (origin/master) Merge branch 'master' of bitbucket.org:username/test1
ba7688a C5
44085d5 C4
961390d C3
2a2e220 C5
85f59ab C2
ab1bda4 C4
4516164 C1
3cb46a9 C3
0e783a3 C0

Git log showing linear commits atop duplicated commits

Or, shown another way:

A log graph showing linear commits atop duplicated commits

In this scenario we want to remove the duplicate commits, but keep the commits that we have based on them—we want to keep C6 through C10. As with most things, there are a number of ways to go about this:

Either:

  • Create a new branch at the last duplicated commit [1], cherry-pick each commit (C6 through C10 inclusive) onto that new branch, and treat that new branch as canonical.
  • Or run git rebase --interactive $commit, where $commit is the commit prior to both sets of duplicated commits [2]. Here we can outright delete the lines for the duplicates.

[1] It doesn't matter which of the two you choose; either ba7688a or 2a2e220 works fine.

[2] In the example it would be 85f59ab.
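
The cherry-pick route might look like the sketch below. It is a simplified sandbox, not the exact history above: only the tip commit is duplicated, two commits (C6 and C7) sit on top of the merge, and the branch name `clean` and the `commit` helper are invented for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"

commit C4
commit C5
git push -q -u origin master

# Create the duplicate: rewrite C5, then pull the old C5 back in.
git commit -q --amend --no-edit --date="1577836800 +0000"
git pull -q --no-rebase --no-edit origin master
merge=$(git rev-parse HEAD)
rewritten=$(git rev-parse 'HEAD^1')       # the last duplicated commit

# Work that was (unknowingly) based on the duplicated history.
commit C6
commit C7

# New branch at the last duplicated commit, then replay C6..C7 onto it.
git checkout -q -b clean "$rewritten"
git cherry-pick "$merge..master" > /dev/null

copies=$(git log --format=%s clean | grep -c '^C5$')
merges=$(git rev-list --merges --count clean)
git diff --quiet master clean && same_files=yes
echo "C5 copies: $copies, merges: $merges, same files as master: $same_files"
```

The `clean` branch ends up with the same files as master but a linear history, with one C5 and no merge commit.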

TL;DR

Set advice.pushNonFastForward to false:

git config --global advice.pushNonFastForward false

Solution 3

I think you skipped an important detail when describing your steps. More specifically, your last step, git push on dev, would have actually given you an error, as you cannot normally push non-fast-forward changes.

So you did git pull before the last push, which resulted in a merge commit with C6 and C6' as parents, which is why both will remain listed in log. A prettier log format might have made it more obvious they are merged branches of duplicated commits.

Or you ran git pull --rebase (or a plain git pull, if rebasing is implied by your config) instead, which pulled the original C5 and C6 back into your local dev (and re-rebased the following commits to new hashes: C7' C5'' C6'').

One way out of this could have been git push -f to force the push when it gave the error and wipe C5 C6 from origin, but if anyone else also had them pulled before you wiped them, you'd be in for a whole lot more trouble... basically everyone that has C5 C6 would need to do special steps to get rid of them. Which is exactly why they say you should never rebase anything that's already published. It's still doable if said "publishing" is within a small team, though.
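
To make the "someone else already has them" risk concrete, here is a sandbox sketch with two hypothetical clones, alice and bob. It uses `--force-with-lease` (mentioned in the comments) rather than `-f`: because Bob pushed in the meantime and Alice has not fetched, the forced push is refused instead of silently discarding Bob's commit. The `commit` helper and directory names are made up.

```shell
#!/usr/bin/env bash
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/master

# Alice publishes C5.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"
commit C5
git push -q -u origin master

# Bob clones, builds on C5, and pushes.
git clone -q "$work/remote.git" "$work/bob"
cd "$work/bob"
commit C6
git push -q origin master
bob_tip=$(git rev-parse HEAD)

# Alice rewrites C5 without fetching, then tries to force the push.
cd "$work/alice"
git commit -q --amend -m "C5 reworded"
if git push -q --force-with-lease origin master 2>/dev/null; then
  lease_result=pushed
else
  lease_result=rejected
fi

remote_tip=$(git --git-dir="$work/remote.git" rev-parse master)
echo "force-with-lease: $lease_result"
```

A plain `git push -f` at the same point would have replaced the remote branch and dropped Bob's C6 from it.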

Solution 4

I found out that in my case, this issue was the consequence of a Git configuration problem (involving pull and merge).

Description of the problem:

Symptoms: Commits duplicated on child branch after rebase, implying numerous merges during and after rebase.

Workflow: Here are steps of the workflow I was performing:

  • Work on the "Features-branch" (child of "Develop-branch")
  • Commit and Push changes on "Features-branch"
  • Checkout "Develop-branch" (Mother branch of Features) and work with it.
  • Commit and push changes on "Develop-branch"
  • Checkout "Features-branch" and pull changes from repository (in case someone else has committed work)
  • Rebase "Features-branch" onto "Develop-branch"
  • Force-push the changes to "Features-branch"

As a consequence of this workflow, all commits of "Features-branch" since the previous rebase were duplicated... :-(

The issue was due to pulling changes into the child branch before the rebase. Git's default pull mode is "merge", which alters the history of the commits made on the child branch.

The solution: in Git configuration file, configure pull to work in rebase mode:

...
[pull]
    rebase = preserve
...
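
The same setting can be applied from the command line. The sketch below writes it to a throwaway repository's local config rather than the global one. It uses `true` instead of `preserve` as an assumption: as far as I know, recent Git releases no longer accept `preserve`, and `merges` is the closest replacement if you need merge commits kept during the rebase.

```shell
#!/usr/bin/env bash
set -euo pipefail

repo=$(mktemp -d)
cd "$repo"
git init -q

# Make "git pull" rebase local commits instead of merging (this repo only).
git config pull.rebase true

mode=$(git config --get pull.rebase)
echo "pull.rebase = $mode"
```

Add `--global` to the `git config` call to apply it to every repository for your user.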

Hope it can help. JN Grx

Solution 5

You may have pulled from a remote branch different from the one your current branch tracks. For example, you may have pulled from master while on develop, which tracks develop. Git will dutifully pull in duplicate commits if you pull from a non-tracked branch.

If this happens, you can do the following:

git reset --hard HEAD~n

where n is the number of duplicate commits that shouldn't be there.

Then make sure you are pulling from the correct branch and then run:

git pull upstream <correct remote branch> --rebase

Pulling with --rebase will ensure you aren't adding extraneous commits which could muddy up the commit history.
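
A sandbox sketch of that last point, with assumptions spelled out: the remote is a local bare repository named `origin` (not `upstream`), a second hypothetical clone called `other` pushes C2, and the `commit` helper is invented. After `git pull --rebase`, the local commit C3 sits on top of C2 and no merge commit is created.

```shell
#!/usr/bin/env bash
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/master

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"
commit C1
git push -q -u origin master

# Someone else pushes C2.
git clone -q "$work/remote.git" "$work/other"
cd "$work/other"
commit C2
git push -q origin master

# Meanwhile, a local commit that has not been pushed yet.
cd "$work/local"
commit C3

git pull -q --rebase origin master        # replay C3 on top of C2

merges=$(git rev-list --merges --count HEAD)
subjects=$(git log --format=%s | paste -sd' ' -)
echo "history (newest first): $subjects, merge commits: $merges"
```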

Here is a bit of hand holding for git rebase.

Author: elitalon

Updated on February 16, 2022

Comments

  • elitalon
    elitalon about 2 years

    I understand the scenario presented in Pro Git about The Perils of Rebasing. The author basically tells you how to avoid duplicated commits:

    Do not rebase commits that you have pushed to a public repository.

    I am going to tell you my particular situation because I think it does not exactly fit the Pro Git scenario and I still end up with duplicated commits.

    Let's say I have two remote branches with their local counterparts:

    origin/master    origin/dev
    |                |
    master           dev
    

    All four branches contain the same commits and I am going to start development in dev:

    origin/master : C1 C2 C3 C4
    master        : C1 C2 C3 C4
    
    origin/dev    : C1 C2 C3 C4
    dev           : C1 C2 C3 C4
    

    After a couple of commits I push the changes to origin/dev:

    origin/master : C1 C2 C3 C4
    master        : C1 C2 C3 C4
    
    origin/dev    : C1 C2 C3 C4 C5 C6  # (2) git push
    dev           : C1 C2 C3 C4 C5 C6  # (1) git checkout dev, git commit
    

    I have to go back to master to make a quick fix:

    origin/master : C1 C2 C3 C4 C7  # (2) git push
    master        : C1 C2 C3 C4 C7  # (1) git checkout master, git commit
    
    origin/dev    : C1 C2 C3 C4 C5 C6
    dev           : C1 C2 C3 C4 C5 C6
    

    And back to dev I rebase the changes to include the quick fix in my actual development:

    origin/master : C1 C2 C3 C4 C7
    master        : C1 C2 C3 C4 C7
    
    origin/dev    : C1 C2 C3 C4 C5 C6
    dev           : C1 C2 C3 C4 C7 C5' C6'  # git checkout dev, git rebase master
    

    If I display the history of commits with GitX/gitk I notice that origin/dev now contains two identical commits C5' and C6' which are different to Git. Now if I push the changes to origin/dev this is the result:

    origin/master : C1 C2 C3 C4 C7
    master        : C1 C2 C3 C4 C7
    
    origin/dev    : C1 C2 C3 C4 C5 C6 C7 C5' C6'  # git push
    dev           : C1 C2 C3 C4 C7 C5' C6'
    

    Maybe I don't fully understand the explanation in Pro Git, so I would like to know two things:

    1. Why does Git duplicate these commits while rebasing? Is there a particular reason to do that instead of just applying C5 and C6 after C7?
    2. How can I avoid that? Would it be wise to do it?
    • learning2learn
      learning2learn almost 3 years
      There is a smorgasbord of excellent answers all over this question. I wonder if someone has a succinct gist or article or wiki, not a book, that rolls up some latter-day best practices? I've hit the OP's problem and other branch/merge issues using GitLab & GitHub multiple times, and it seems like there is so much contradicting advice that it leads to repeated problems.
  • Wazery
    Wazery almost 12 years
    I have the same issue. How can I fix my remote branch history now? Is there any option other than deleting the branch and recreating it with cherry-picking?
  • Justin ᚅᚔᚈᚄᚒᚔ
    Justin ᚅᚔᚈᚄᚒᚔ almost 12 years
    @xdsy: Have a look at this and this.
  • KJ50
    KJ50 over 9 years
    You say "C5 and C6 are temporarily pulled out of dev... C7 is applied to dev". If this is the case, then why do C5 and C6 show up before C7 in the ordering of commits on origin/dev?
  • Whymarrh
    Whymarrh almost 9 years
    The omission of git pull is crucial. Your recommendation of git push -f, while dangerous, is probably what readers are looking for.
  • Justin ᚅᚔᚈᚄᚒᚔ
    Justin ᚅᚔᚈᚄᚒᚔ over 7 years
    @KJ50: Because C5 and C6 were already pushed to origin/dev. When dev is rebased, its history is modified (C5/C6 temporarily removed and reapplied after C7). Modifying history of pushed repos is generally a Really Bad Idea™ unless you know what you're doing. In this simple case, the issue could be solved by doing a force push from dev to origin/dev after the rebase and notifying anyone else working off of origin/dev that they're probably about to have a bad day. The better answer, again, is "don't do that... use merge instead"
  • G. Sylvie Davies
    G. Sylvie Davies over 7 years
    It's okay to follow the "git pull..." advice as long as one realizes the ellipsis hides the "--rebase" option (aka "-r"). ;-)
  • Özgür Murat Sağdıçoğlu
    Özgür Murat Sağdıçoğlu almost 7 years
    One thing to note: the hashes of C5 and C5' are certainly different, but not because the line numbers are different. Either of the following two facts is enough to explain the difference: 1) the hash we are talking about covers the entire source tree after the commit, not the delta, so C5' contains whatever comes from C7 while C5 doesn't; and 2) the parent of C5' is different from that of C5, and this information is also included in the commit object, affecting the resulting hash.
  • elitalon
    elitalon over 6 years
    Indeed. Back when I wrote the question I actually did git push --force, just to see what Git was going to do. I learnt a ton about Git since then and nowadays rebase is part of my normal workflow. However, I do git push --force-with-lease to avoid overwriting someone else's work.
  • Whymarrh
    Whymarrh about 6 years
    Using --force-with-lease is a good default, I'll leave a comment under my answer as well
  • Whymarrh
    Whymarrh about 6 years
    I would recommend using git push's --force-with-lease nowadays as it's a better default
  • ZeMoon
    ZeMoon over 5 years
    It's either this answer or a time machine. Thanks!
  • thepurpleowl
    thepurpleowl almost 5 years
    if git merge master is done, then the bullet points should be changed a little bit, like 1. C7 is already there in master , 2. C5 and C6 are played back on top of C7 in master
  • ScottyBlades
    ScottyBlades over 4 years
    What if you don't want to have a merge commit muddying up your git history. This can be irksome when working on big teams with big projects.
  • Justin ᚅᚔᚈᚄᚒᚔ
    Justin ᚅᚔᚈᚄᚒᚔ over 4 years
    @ScottyBlades Why would you not want a merge commit? Those are vitally important, particularly for large teams: they tell you when work done on another branch was merged into the mainline (or other branch) and provide a summary of what changed. This article may change your mind about merge commits.
  • ScottyBlades
    ScottyBlades over 4 years
    Clarification: Merge commits are great most of the time. There have been some situations when I created extraneous merge commits which didn’t accurately portray merges as intended, but I don’t remember how I screwed things up prior to that, to be frank. But I was glad --rebase was an option, because other devs were rejecting my PR based on them.
  • Dhruv Singhal
    Dhruv Singhal over 4 years
    Very neat explanation... I stumbled upon a similar issue that duplicated my code 5-6 times after I attempted rebase repeatedly... just to be sure the code is up-to-date with master... but every time it pushed new commits to my branch, duplicating my code as well. Can you please tell me if force push (with lease option) is safe to do here if I am the only developer working on my branch? Or merging master into mine instead rebasing is better way?
  • Itamar Katz
    Itamar Katz about 4 years
    Thanks @Justinᚅᚔᚈᚄᚒᚔ ! when you say "never rebase commits that have ever existed anywhere" you mean C5 and C6 in the example, right? (not C7). But that means I never push to remote commits I want to rebase? And why not use rebase in this situation? I thought this is exactly the 'rebase on branch, merge on master' workflow. (I also get duplicated commits after rebase on my branch). and one last question.. if I work on a branch on 2 machines (say my pc and a server), I MUST push commits, so I cannot follow 'never rebase...'. what's the best workflow in that case? thanks
  • Torge
    Torge almost 4 years
    I use rebasing like crazy, never merge, and it works like a charm. Best part of git! I would never rebase the master, but all branches in projects I control have to rebase on master before being allowed in. No merge mess, crystal clean history. What happened here was most likely another error, as stackoverflow.com/a/30927009/2075537 explains. You will have to force push to a remote branch, but use remote branches!! Else you don't have a backup!!
  • SungHo Choi
    SungHo Choi over 2 years
    Thanks! It helped me a lot.
  • thargenediad
    thargenediad over 2 years
    Holy crap! This is one of the most useful answers I've ever encountered on SO! I would only suggest one edit, because I was confused at first: The "Either:" bulleted list. I didn't realize at first that it was an either/or thing, so I thought initially that I needed to perform the first bulleted item and then the second.