Remove deleted files from git history

18,842

Solution 1

Here are some instructions to do what you want.

This removes file_to_remove from every commit on all branches and drops any commit left empty by the removal:

git filter-branch --index-filter 'git rm --cached --ignore-unmatch file_to_remove' --prune-empty -- --all
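To see what the command does before running it on real data, it can be tried on a throwaway repository. The following is a minimal sketch, not part of the original answer: the scratch repository, the file kept.txt and the commit messages are made up for illustration, and FILTER_BRANCH_SQUELCH_WARNING only matters on newer git versions that print a deprecation notice for filter-branch.

```shell
#!/bin/sh
# Hypothetical demo: build a scratch repo, then purge file_to_remove from its history.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation pause in newer git

demo=$(mktemp -d)                        # throwaway repository, safe to delete afterwards
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo keep > kept.txt
echo secret > file_to_remove
git add kept.txt file_to_remove
git commit -q -m "add both files"
git rm -q file_to_remove
git commit -q -m "delete file_to_remove"

# The file is gone from the working tree, but two commits still reference it.
before=$(git rev-list --count --all -- file_to_remove)

git filter-branch --index-filter \
    'git rm --cached --ignore-unmatch file_to_remove' \
    --prune-empty -- --all >/dev/null 2>&1

# Count only the rewritten branches: refs/original/ still holds a backup
# of the old history, and --all would include it.
after=$(git rev-list --count --branches -- file_to_remove)

echo "commits touching file_to_remove: before=$before after=$after"
```

The old commits stay reachable through refs/original/ until those backup refs are deleted, so the repository does not shrink until they are removed and git gc has run.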

Solution 2

I'm now trying the following technique and will report back on whether it worked, since it seems to take quite a long time. Run it in zsh or bash, and ONLY ON A CLONE of the repository.

git log --diff-filter=D --name-only --pretty=format: <start_commit>..HEAD | grep -v '^$' | sort -u > deleted.txt

This writes the path of every deleted file to deleted.txt. Then remove each of them from the history:

for del in `cat deleted.txt`
do
    git filter-branch --index-filter "git rm --cached --ignore-unmatch $del" --prune-empty -- --all
    # The following seems to be necessary every time
    # because otherwise git won't overwrite refs/original
    git reset --hard
    git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
    git reflog expire --expire=now --all
    git gc --aggressive --prune=now
done;

This might be extremely dangerous for your data, so only try it on clones.
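As a comment further down points out, the loop is slow because it rewrites the whole history once per file. The following is a rough sketch of a single-pass variant, tried here on a scratch repository. Everything in it beyond the commands already shown is an assumption for illustration: the demo files, the location .git/deleted.txt, and the use of git update-index --force-remove --stdin in place of git rm so that the whole path list can be fed in on stdin. It also subtracts the paths that currently exist in HEAD, so a file that was deleted and later re-added is kept.

```shell
#!/bin/sh
# Sketch: purge every deleted path in one filter-branch pass, on a scratch repo.
set -e
export LC_ALL=C                          # stable sort order for comm
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation pause in newer git

repo=$(mktemp -d)                        # throwaway repository for the demo
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir src old-stuff
echo a > src/kept.txt
echo b > old-stuff/gone_1.txt
echo c > gone_2.txt
git add .
git commit -q -m "initial import"
git rm -q -r old-stuff gone_2.txt
git commit -q -m "drop files the subproject does not need"

# Paths deleted at some point in history, minus paths that exist in HEAD.
# The list lives under .git so it is never tracked; the path must be absolute
# because the index filter runs in a different directory.
list="$repo/.git/deleted.txt"
git log --diff-filter=D --name-only --pretty=format: | grep -v '^$' | sort -u > "$list.all"
git ls-files | sort -u > "$list.current"
comm -23 "$list.all" "$list.current" > "$list"

# One rewrite for the whole list instead of one rewrite per file.
git filter-branch --index-filter \
    "git update-index --force-remove --stdin < '$list'" \
    --prune-empty -- --all >/dev/null 2>&1

# Same cleanup as in the loop, but executed only once.
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now

# Every path that still appears anywhere in the rewritten history.
remaining=$(git log --all --name-only --pretty=format: | grep -v '^$' | sort -u)
echo "$remaining"
```

On a real clone, only the part from the git log line onwards would be needed. The same warning applies: it rewrites every branch, so it should only be run on a copy.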

Author by

Niklas Schnelle

I'm a Software Engineering Student at the University of Stuttgart

Updated on June 16, 2022

Comments

  • Niklas Schnelle
    Niklas Schnelle almost 2 years

    I'm trying to split a subproject off of my git repository. However, unlike in Detach (move) subdirectory into separate Git repository, I don't have it in its own subdirectory (and moving it into one and doing the above only yields the history after the move).

    I've cloned the branch from which I want to split off the subproject into its own repository and removed everything that isn't used by the subproject, so basically I could use this as the repository of my subproject.

    Now I want to get rid of the history of all files that are no longer in this repository, so as to keep the file history only for the files that made it into the offspring.

    I think it must be possible with git-filter-branch, but I can't figure out how.

    Many thanks in advance

  • Niklas Schnelle
    Niklas Schnelle almost 12 years
    The thing is, I just want to keep the files that are in the working directory, along with their history, and have git forget about all the others. It would be quite cumbersome to first find all the deleted files and then remove them with the above command, so even though I had found it, it isn't of much use.
  • matbrgz
    matbrgz almost 11 years
    What did you end up finding?
  • Admin
    Admin over 10 years
    The reason it appears to run so slowly is that you're running git filter-branch once per file, along with a bunch of other commands (git gc is neither cheap nor fast), instead of running it once for all files, so it's probably extremely inefficient. See the comments at New repo with copied history of only currently tracked files.
  • oxygen
    oxygen over 6 years
    Will pushing to github or gitlab clean-up the remote repository?
  • Oyvind
    Oyvind over 4 years
    Note that you can use git rm -r for entire directories, deleting recursively.
  • David Maness
    David Maness over 3 years
    @Oyvind On its own, git rm -r only removes a file/directory from the working tree and the index; it doesn't remove any of the file's/directory's history. The deletion is just added on top of the existing history.