How to remove previously added git subtree and its history


Solution 1

You need to use git filter-branch with the --prune-empty option, which removes any commits that no longer introduce changes once the subtree files are gone.

git filter-branch --index-filter 'git rm --cached --ignore-unmatch -rf dir1 dir2 dirN file1 file2 fileN' --prune-empty -f HEAD

After that, if you want to recover disk space, you will need to delete all the original refs that filter-branch saved (under refs/original/), expire the reflog, and garbage collect.
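The cleanup steps above can be sketched as a small shell function. This is a minimal sketch, not part of the original answer: the name purge_rewritten_history is hypothetical, and it assumes you run it inside the rewritten repository only after confirming the new history is what you want, because it makes the old commits unrecoverable.

```shell
# Hypothetical helper: run inside the repository after filter-branch.
purge_rewritten_history() {
    # 1. Delete the backup refs that filter-branch stored under refs/original/.
    git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done

    # 2. Expire every reflog entry so the old commits become unreachable.
    git reflog expire --expire=now --all

    # 3. Garbage-collect and prune the now-unreachable objects immediately.
    git gc --prune=now
}
```

Calling purge_rewritten_history from the top of the repository runs the three steps in order. It only affects the local repository; other clones keep their own copy of the old history.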

Solution 2

You can use git filter-branch to apply an operation to every commit in a branch. This includes deleting files from each commit as you rewrite the entire history. A good description and instructions are available here: http://gitready.com/beginner/2009/03/06/ignoring-doesnt-remove-a-file.html. You may find the following question useful as well: Completely remove files from Git repo and remote on GitHub (it is where I found the link). The actual command you will run with this solution is going to be something like git filter-branch --index-filter 'git rm --cached --ignore-unmatch -rf subtree_folder1 subtree_folder2' HEAD. The --ignore-unmatch flag matters here: without it, git rm fails and the filter aborts on the first commit that does not contain those paths.
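Once a filter-branch run like the one above has finished, it is worth confirming that the paths really are gone from the history. A minimal sketch, assuming you are on the rewritten branch; paths_in_history is a hypothetical helper name, and some_dir and some_file are placeholders.

```shell
# Hypothetical helper: list commits reachable from HEAD that still touch
# any of the given paths. Empty output means the paths no longer appear
# anywhere in the rewritten history.
paths_in_history() {
    git log --oneline HEAD -- "$@"
}

# Example call with placeholder path names:
# paths_in_history some_dir some_file
```

If the helper prints any commits, those paths were not fully filtered out, and the path list given to git rm in the index filter probably needs adjusting.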

Another way, which is probably overkill for what you are doing, is cherry-picking. Cherry-picking allows you to rewrite any portion of your history that you like with any level of detail you like: http://git-scm.com/docs/git-cherry-pick. You will want to do something like git reset --hard <HASH of commit that introduced the subtree>, followed by a series of

git cherry-pick -n <following commit hashes>
git reset
git add -p
git commit

for each subsequent commit you had made. This will allow you to remove the subtree from each commit you made in the past. You can refer to partly cherry-picking a commit with git for more information on effective cherry-picking. When you are done with this version of the deletion, you will want to remove the old commits that are no longer part of your branch:

git reflog expire --expire-unreachable=now --all
git gc --prune=now
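For the cherry-picking route described above, the <following commit hashes> can be listed oldest-first with git rev-list. This is a minimal sketch, assuming a linear history (no merges) and that you recorded the old branch tip before running git reset --hard; commits_to_replay is a hypothetical helper name.

```shell
# Hypothetical helper: print, oldest first, the commits that come after
# commit $1 and are reachable from $2 (for example, the saved old tip).
commits_to_replay() {
    git rev-list --reverse "$1".."$2"
}
```

For example, commits_to_replay <HASH of commit that introduced the subtree> <old branch tip> prints one hash per line, and each hash can then be passed to git cherry-pick -n in turn.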

Other referenced questions: "How to move a branch backwards in git?" and "Listing and deleting Git commits that are under no branch (dangling?)"


Author: jlconlin

Updated on October 08, 2020

Comments

  • jlconlin
    jlconlin over 3 years

    Many moons ago I added a subtree to my git repository. This subtree included several folders and files. I added the subtree instead of creating a submodule (as recommended). Now I realize I only want one of the files in the subtree and none of the rest. Even worse, when others clone my repository, what they get is not what is expected—there is some conflict with the subtree and the other code that I've created.

    I can get rid of the files/folders with

    git rm subtree_folder1 subtree_folder2 subtree_files.*
    

    however, I'm still left with a lengthy commit history from the subtree.

    I've done a fair amount of development since I originally added the subtree and can't lose the commit history that I've generated.

    In short this is what I would like:

    1. Remove all the subtree files/folders.
    2. Forget the history of all the subtree commits.
    3. Be left with only my code and my history.

    Is this possible?

    PS. One possible complication is that I moved the single header file I wanted to keep from the subtree to some folder in my code. I hope this is not what is keeping me from forgetting the subtree history.

    An Attempt

    After a fresh checkout from the remote server I have the following:

    $ ls
    .git             CMakeLists.txt   Read.cpp         logging.conf
    .gitignore       ENDF6            TestData         src
    .sparse-checkout LICENCE          doc              test
    .travis.yml      README.md        include          tools
    

    Where .gitignore only has: build/ debug/

    When I try the command as suggested I don't get a very happy response:

    $ git filter-branch --index-filter 'git rm --cached -rf test tools src doc LICENCE README.md .travis.yml' HEAD
    Rewrite 2fec85e41e40ae18efd1b130f55b14166a422c7f (1/1701)fatal: pathspec 'test' did not match any files
    index filter failed: git rm --cached -rf test tools src doc LICENCE README.md .travis.yml
    

    I'm not sure why it says it has a problem with test when it is clearly there. I'm baffled.

    • hunch_hunch
      hunch_hunch over 9 years
      Have you tried simply using git rm <subtree name> to remove the subtree?
    • jlconlin
      jlconlin over 9 years
      @hunch_hunch I did that as well as the commands here: stackoverflow.com/questions/15890047/… but I still have all the history of the subtree. Help!
    • Andrew C
      Andrew C over 9 years
      You are OK with rewriting your repository history via rebase or filter-branch?
    • jlconlin
      jlconlin over 9 years
      @AndrewC I'm okay with rewriting my repository history as long as I keep the history of my changes.
    • Andrew C
      Andrew C over 9 years
      Did you try either a tree or index filter operation? kernel.org/pub/software/scm/git/docs/git-filter-branch.html
    • jlconlin
      jlconlin over 9 years
      @AndrewC I've tried them, but haven't had much luck.
    • Andrew C
      Andrew C over 9 years
      I'd need to see the filter branch you ran, and a description of how it didn't work.
    • Andrew C
      Andrew C over 9 years
      You need to add " --ignore-unmatch" to the git rm. You probably also want to add --prune-empty to the filter branch
  • jlconlin
    jlconlin over 9 years
    Thanks for your help! It isn't shown here, but in our chat @AndrewC provided the answer needed.
  • llrs
    llrs over 7 years
    @SkylarSaveland See the chat jlconlin is referring to, although the solution should really be posted here.
  • fuzzyTew
    fuzzyTew almost 3 years
    Is this the right answer: git filter-branch --index-filter 'git rm --cached --ignore-unmatch -rf dir1 dir2 dirN file1 file2 fileN' --prune-empty -f HEAD? @AndrewC could you update the answer if it is incorrect, or comment?
  • Andrew C
    Andrew C almost 3 years
    @fuzzyTew if you look at the comment/chat history you can see it worked for the OP. That said, filter-branch has been supplanted in recent times by the 'filter-repo' command so if you were starting now you might want to consider that.
  • fuzzyTew
    fuzzyTew almost 3 years
    I infer you must have updated the answer from the chat some time ago. Great. filter-repo now.