Delete file with all history from svn repository

40,694

Solution 1

This has recently become much more straightforward with the svndumpfilter command, which is described in the Subversion documentation. Basically, to avoid conflicts, it takes a repository dump and replays each commit, either including or excluding every path that starts with a given prefix. Basic syntax:

svndumpfilter exclude yourfileprefix < yourdump > yournewdump

Exclude is probably what the question asker is looking for, but you can also use include to, say, extract a subtree of the repo so as to spin it off as its own repository.
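A minimal sketch of the two modes side by side. The wrapper name filter_dump and the paths in the usage comment are hypothetical; only the svndumpfilter include and exclude subcommands come from the answer above.

```shell
#!/bin/sh
# Sketch only: a thin wrapper around the two svndumpfilter subcommands.
# filter_dump and every path in the usage comment are made up for illustration.

# usage: filter_dump include|exclude PREFIX IN_DUMP OUT_DUMP
filter_dump() {
    mode=$1
    prefix=$2
    src=$3
    dst=$4
    case $mode in
        include|exclude) ;;
        *) echo "mode must be include or exclude" >&2; return 2 ;;
    esac
    # svndumpfilter reads the dump on stdin and writes the filtered dump to stdout.
    svndumpfilter "$mode" "$prefix" < "$src" > "$dst"
}

# Example calls (not run here):
#   filter_dump exclude trunk/big.iso repo.dump repo_without_iso.dump
#   filter_dump include trunk/libfoo  repo.dump libfoo_only.dump
```

The second call is the "spin off a subtree" case: only history touching the given prefix survives in the new dump.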

The latest revision of Subversion in Subversion (very meta) can also take glob patterns through the --pattern option. I recently had to remove all PDFs from a repo, and it was easily done like so:

svndumpfilter exclude --pattern '*.pdf' < dump > dump_nopdfs

Further usage information can be found by calling svndumpfilter help and svndumpfilter help exclude.
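Before filtering by pattern, it can help to preview which paths a glob would hit. The sketch below lists the Node-path headers of a dump file and matches them with shell globbing. The function name paths_matching is hypothetical, and shell case globbing only approximates what svndumpfilter --pattern matches, so treat the output as a rough check rather than a guarantee.

```shell
#!/bin/sh
# Sketch only: preview the node paths in a dump that match a shell glob.
# Assumes a dump file on disk; paths_matching is a made-up helper name.
# File contents that happen to start with "Node-path: " would show up as
# false positives, so this is a sanity check, not a parser.

# usage: paths_matching GLOB DUMP_FILE
paths_matching() {
    pattern=$1
    dump=$2
    # Each node record in a dump starts with a "Node-path: <path>" header.
    LC_ALL=C sed -n 's/^Node-path: //p' "$dump" | sort -u |
    while IFS= read -r path; do
        case $path in
            $pattern) printf '%s\n' "$path" ;;
        esac
    done
}

# Example call (not run here):
#   paths_matching '*.pdf' dump
```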

Solution 2

"But this is too complicated and unreliable."

I don't see why this shouldn't be considered reliable. However, if you want to get rid of the file completely, history and all, regardless of the effect on the previous revisions it was part of, there is only one way to do so, and that way is indeed complicated. And rightly so. SVN is a tool with one single goal: never to lose any file, even after it has been deleted. Forcing it to do otherwise ought to be hard.

Solution 3

I was facing a similar issue, except that I needed to remove multiple files rather than just one, and we are on Subversion 1.6, which doesn't support the --pattern option.

-- backup current SVN

$ cp -R /svn /svnSAVE

-- dump repository

$ svnadmin dump /svn/root > svnDump

-- create new dump while excluding the very large file

$ svndumpfilter exclude "/path/file.csv" < svnDump > newSvnDump0
-- {note: should see a message like this}:
--          Dropped 1 node:
--                  '/path/file.csv'

-- create another new dump while excluding another very large file

$ svndumpfilter exclude "/path/anotherFile.csv" < newSvnDump0 > newSvnDump1

-- remove the old svn

$ rm -rf /svn

-- recreate the svn directories

$ mkdir -p /svn/root

-- recreate the SVN

$ svnadmin create /svn/root

-- repopulate the fresh repository with the dump

$ svnadmin load /svn/root < newSvnDump1

-- update the conf files from the saved copy into the new copy...

$ cp /svnSAVE/root/conf/* /svn/root/conf

Now the repository should no longer contain the two large files, "file.csv" and "anotherFile.csv".
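The two filtering passes can likely be merged, since svndumpfilter exclude accepts several path prefixes in one invocation. A minimal sketch, assuming the same dump and file names as above; the wrapper name exclude_paths is hypothetical.

```shell
#!/bin/sh
# Sketch only: drop several paths from a dump in a single svndumpfilter pass.
# exclude_paths is a made-up helper name.

# usage: exclude_paths IN_DUMP OUT_DUMP PREFIX...
exclude_paths() {
    src=$1
    dst=$2
    shift 2
    [ "$#" -gt 0 ] || { echo "need at least one path prefix" >&2; return 2; }
    # All remaining arguments are passed to svndumpfilter as path prefixes.
    svndumpfilter exclude "$@" < "$src" > "$dst"
}

# Example call (not run here):
#   exclude_paths svnDump newSvnDump /path/file.csv /path/anotherFile.csv
```

This avoids the intermediate newSvnDump0 file, which matters when the dump itself is large.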

Author by

altern

Updated on March 29, 2020

Comments

  • altern about 4 years

    Is there any way to delete a file from an svn repository, including all of its history? This issue arises when I want to get rid of a large binary file residing in the repo.

    I know of only one approach that might help in this situation:

    1. Dump the whole repo with the svnadmin utility.
    2. Filter the dump file with grep. Grep should match on the filename and write the result to another dump file.
    3. Import the filtered dump file with svnadmin.

    But this is too complicated and unreliable. Maybe there is another solution?

  • Shawn over 11 years
    So the entire process would be: svnadmin dump > myDump; svndumpfilter exclude myFile < myDump > newDump; cat newDump | svnadmin load myRepositoryURL; Correct?
  • Shawn over 11 years
    Ok, I tried it and the process looks rather like this: svnadmin dump path_to_repository > old.dump; svndumpfilter exclude file_prefix < old.dump > new.dump; rm -rf path_to_repository; svnadmin create path_to_repository; svnadmin load path_to_repository < new.dump; The main difference being that you have to delete the repository and re-create it before loading the filtered dump. Note also that path_to_repository is the path to the repository on the server, not the path to your working copy.
  • Znik over 9 years
    Yes, that works, but you can omit the temporary files and do it all in one pipeline: svnadmin create path_to_NEW_repository; svnadmin dump path_to_CURRENT_repository | svndumpfilter exclude file_prefix | svnadmin load path_to_NEW_repository; Of course, you must configure the NEW repository for web access and test it. If all is OK, wait until the repository is idle or take the web server down, rename CURRENT to OLD, then rename NEW to CURRENT, and re-enable access. If everything still works, you can keep the OLD directory as a backup if needed, and restore the previous web config. Don't remove any source data without thinking :)
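    A rough sketch of that rebuild step with one sanity check added before any renaming. The function name rebuild_without and the revision comparison are assumptions, not part of the comment above. The check relies on svndumpfilter keeping emptied revisions when --drop-empty-revs is not given, so the old and new repositories should report the same youngest revision.

```shell
#!/bin/sh
# Sketch only: build a filtered copy of a repository without temporary dump files,
# then compare youngest revisions before anyone swaps directories.
# rebuild_without is a made-up helper; the rename itself is left to the operator.

# usage: rebuild_without CURRENT_REPO NEW_REPO FILE_PREFIX
rebuild_without() {
    current=$1
    new=$2
    prefix=$3
    svnadmin create "$new" || return 1
    # Note: in plain sh only the exit status of the last pipeline stage is checked.
    svnadmin dump --quiet "$current" |
        svndumpfilter exclude "$prefix" |
        svnadmin load --quiet "$new" || return 1
    old_rev=$(svnlook youngest "$current") || return 1
    new_rev=$(svnlook youngest "$new") || return 1
    if [ "$old_rev" != "$new_rev" ]; then
        echo "revision mismatch: $old_rev vs $new_rev" >&2
        return 1
    fi
    echo "OK: $new is at revision $new_rev, same as $current"
}

# Example call (not run here):
#   rebuild_without path_to_CURRENT_repository path_to_NEW_repository file_prefix
```

    Matching revision numbers do not prove the filtered history is what you want, so a test checkout from the new repository is still worthwhile before the swap.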