rsync exclude according to .gitignore & .hgignore & svn:ignore like --filter=:C

34,026

Solution 1

As mentioned by luksan, you can do this with the --filter switch to rsync. I achieved this with --filter=':- .gitignore' (there's a space before ".gitignore") which tells rsync to do a directory merge with .gitignore files and have them exclude per git's rules. You may also want to add your global ignore file, if you have one. To make it easier to use, I created an alias to rsync which included the filter.
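A minimal sketch of such a wrapper, written as a shell function rather than an alias so that it also works inside scripts. The name rsync_gitignore is made up for illustration, and the paths in the usage comment are placeholders.

```shell
# Hypothetical wrapper: prepends the .gitignore filter to any rsync call.
rsync_gitignore() {
    # ':' = per-directory merge, '-' = treat each line of the file as an exclude
    rsync --filter=':- .gitignore' "$@"
}

# Usage (placeholder paths):
#   rsync_gitignore -a /source/project/ /destination/project/
```

All other arguments are passed through unchanged, so the wrapper can be combined with any of the usual rsync options.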

Solution 2

After hours of research I found exactly what I needed: sync the destination folder with the source folder (also deleting files in the destination if they were deleted in the source), without copying the files that are ignored by .gitignore to the destination, but also without deleting these files in the destination:

rsync -vhra /source/project/ /destination/project/ --include='**.gitignore' --exclude='/.git' --filter=':- .gitignore' --delete-after

In other words, this command completely ignores the files listed in .gitignore, both in the source and in the destination. You can omit the --exclude='/.git' part if you want to copy the .git folder too.

You MUST copy the .gitignore files from the source. If you use LordJavac's command, the .gitignore files will not be copied. Then, if you create a file in the destination folder that should be ignored by .gitignore, it will be deleted despite .gitignore, because the destination has no .gitignore files. If those files are present in the destination, the files described in .gitignore will not be deleted; they will be ignored, just as expected.
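Because --delete-after removes files, it may be worth previewing the run first. The sketch below wraps the same options in a hypothetical function (preview_mirror is a made-up name) and adds rsync's --dry-run and --itemize-changes, so nothing is copied or deleted and the planned changes are only listed.

```shell
# Hypothetical helper: show what the mirror command would do, without doing it.
# $1 = source directory, $2 = destination directory (placeholders).
preview_mirror() {
    # "${1%/}/" guarantees exactly one trailing slash, so the directory
    # contents are synced rather than the directory itself.
    rsync -vhra --dry-run --itemize-changes \
        --include='**.gitignore' --exclude='/.git' \
        --filter=':- .gitignore' --delete-after \
        "${1%/}/" "${2%/}/"
}

# Usage:
#   preview_mirror /source/project /destination/project
```

Once the listing looks right, the same command without --dry-run performs the actual sync.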

Solution 3

You can use git ls-files to build the list of files excluded by the repository's .gitignore files. https://git-scm.com/docs/git-ls-files

Options:

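  • --exclude-standard Apply the standard Git exclusions: .git/info/exclude, the .gitignore file in each directory, and the user's global exclusion file.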
  • -o Show other (i.e. untracked) files in the output.
  • -i Only output ignored files.
  • --directory Only output the directory path if the entire directory is ignored.

The only thing left for me to exclude explicitly was .git.

rsync -azP --exclude=.git --exclude=`git -C <SRC> ls-files --exclude-standard -oi --directory` <SRC> <DEST>
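One thing to check before relying on the command above: in a POSIX shell such as bash, an unquoted command substitution is split on whitespace, so when git lists more than one path only the first one stays attached to --exclude= and the rest become separate arguments. The sketch below illustrates the splitting with a made-up two-line list standing in for the git ls-files output; count_args is a hypothetical helper.

```shell
# Print how many arguments a command would receive.
count_args() { echo "$#"; }

# Stand-in for multi-line `git ls-files` output (made-up paths).
ignored='target/
build/'

count_args --exclude=$ignored     # unquoted: split into 2 words
count_args --exclude="$ignored"   # quoted: 1 word, but it contains a newline
```

Neither form gives rsync one pattern per path, which is why writing the list to a file and passing it with --exclude-from tends to be more robust.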

Solution 4

2018 solution confirmed

rsync -ah --delete \
    --include .git --exclude-from="$(git -C SRC ls-files \
        --exclude-standard -oi --directory >.git/ignores.tmp && \
        echo .git/ignores.tmp)" \
    SRC DST

Details: --exclude-from is required instead of --exclude because a multi-line exclude list would most likely not be parsed as a single argument. --exclude-from expects a file name, so the list is written to a file first.

The current solution saves the exclude file inside the .git folder so that it does not affect git status while keeping the command self-contained. If you prefer, you can use /tmp instead.
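For the /tmp variant, a rough sketch could look like the following. The function name rsync_without_ignored is made up, and mktemp is used here as an assumption so that parallel runs do not share one file. File names containing rsync wildcard characters are not handled.

```shell
# Hypothetical helper: sync $1 to $2, skipping everything Git ignores in $1.
# The body runs in a subshell, so its variables do not leak into the caller.
rsync_without_ignored() (
    src=$1
    dst=$2
    list=$(mktemp) || exit 1
    # Ignored paths, relative to $src; fully ignored directories are
    # collapsed into a single entry by --directory.
    git -C "$src" ls-files --exclude-standard -oi --directory >"$list" \
        || { rm -f "$list"; exit 1; }
    rsync -ah --delete --include .git --exclude-from="$list" "$src" "$dst"
    status=$?
    rm -f "$list"
    exit "$status"
)

# Usage (placeholders):
#   rsync_without_ignored SRC DST
```

The temporary file is removed whether or not rsync succeeds, and the function returns rsync's exit status.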

Solution 5

How about rsync --exclude-from='path/.gitignore' --exclude-from='path/myignore.txt' source destination?
It worked for me.
I believe you can pass more --exclude-from parameters too.
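This approach relies on the .gitignore happening to be valid as an rsync exclude list. One difference that is easy to check for is negation: a .gitignore line starting with ! re-includes a path, and within a .gitignore the last matching pattern wins, whereas rsync acts on the first matching rule. A small sketch (gitignore_negations is a hypothetical helper name) that lists such lines before you hand the file to --exclude-from:

```shell
# Hypothetical helper: print, with line numbers, the '!' re-include rules
# found in the given .gitignore. Prints nothing if there are none.
gitignore_negations() {
    grep -n '^!' "$1"
}

# Usage (placeholder path):
#   gitignore_negations path/.gitignore
```

If it prints anything, those rules would need to be translated by hand, or one of the git ls-files based approaches may be a better fit.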

Author by

Jesse Glick

Updated on July 08, 2022

Comments

  • Jesse Glick
    Jesse Glick almost 2 years

    Rsync includes a nifty option --cvs-exclude to “ignore files in the same way CVS does”, but CVS has been obsolete for years. Is there any way to make it also exclude files which would be ignored by modern version control systems (Git, Mercurial, Subversion)?

    For example, I have lots of Maven projects checked out from GitHub. Typically they include a .gitignore listing at least target, the default Maven build directory (which may be present at top level or in submodules). Since the contents of these directories are entirely disposable, and they can be far larger than source code, I would like to exclude them when using rsync for backups.

    Of course I can explicitly --exclude=target/ but that will accidentally suppress unrelated directories that just happen to be named target and are not supposed to be ignored.

    And I could supply a complete list of absolute paths for all file names and patterns mentioned in any .gitignore, .hgignore, or svn:ignore property on my disk, but this would be a huge list that would have to be produced by some sort of script.

    Since rsync has no built-in support for VCS checkouts other than CVS, is there any good trick for feeding it their ignore patterns? Or some kind of callback system whereby a user script can be asked whether a given file/directory should be included or not?

    Update: --filter=':- .gitignore' as suggested by LordJavac seems to work as well for Git as --filter=:C does for CVS, at least on the examples I have found, though it is unclear if the syntax is an exact match. --filter=':- .hgignore' does not work very well for Mercurial; e.g. an .hgignore containing a line like ^target$ (the Mercurial equivalent of Git /target/) is not recognized by rsync as a regular expression. And nothing seems to work for Subversion, for which you would have to parse .svn/dir-prop-base for a 1.6 or earlier working copy, and throw up your hands in dismay for a 1.7 or later working copy.

  • Jesse Glick
    Jesse Glick over 11 years
    A good start, though I hesitate to “accept” this answer as it only covers Git.
  • Jesse Glick
    Jesse Glick almost 11 years
    To the contrary, I definitely want to include .git/ directories, perhaps even more strongly than the working copy. What I want to exclude are build products.
  • VasiliNovikov
    VasiliNovikov over 9 years
    Also, this setting is not portable. It's per-user, not per-project.
  • VasiliNovikov
    VasiliNovikov over 9 years
    A more verbose version which also excludes .git files: --exclude='/.git' --filter="dir-merge,- .gitignore"
  • rolandow
    rolandow almost 9 years
    I have something like this now: rsync -rvv --exclude='.git*' --exclude='/rsync-to-dev.sh' --filter='dir-merge,-n /.gitignore' $DIR/ development.foobar.com:~/test/ .. but although it says [sender] hiding file .gitignore because of pattern .git*, the file is still sent to the destination
  • Johan Boulé
    Johan Boulé almost 9 years
    @JesseGlick I second you about keeping .git/ dirs included. Git being a distributed SCM, it's important to backup the whole local repository.
  • Jesse Glick
    Jesse Glick almost 9 years
    Trying this via locate -0e .gitignore | (while read -d '' x; do process_git_ignore "$x"; done), but has a lot of issues. Files in the same directory as .gitignore not correctly separated from the directory name with /. Blank lines and comments misinterpreted. Chokes on .gitignore files in paths with spaces (never mind the fiendish /opt/vagrant/embedded/gems/gems/rb-fsevent-0.9.4/spec/fixtures/custom 'path/.gitignore from the vagrant package for Ubuntu). Perhaps better done as a Perl script.
  • cobbzilla
    cobbzilla almost 9 years
    @JesseGlick I'm not sure why you're calling the function within the script. it's intended to be used as a drop-in replacement for rsync, for the specific reason that handling quoting/whitespace is such a pain. If you have an example of a gsync command line that is failing, and the .gitignore files associated with it, I would be happy to take a closer look.
  • Jesse Glick
    Jesse Glick almost 9 years
    I need to rsync an entire filesystem, with various Git repositories scattered around it. Perhaps your script works fine for the case of synchronizing a single repository.
  • cobbzilla
    cobbzilla almost 9 years
    yes, definitely. sorry I did not make that clear. With this script, you'd have to invoke it once per git repo, from within the repo directory.
  • Jesse Glick
    Jesse Glick over 8 years
    This will work insofar as your .gitignore files happen to use a syntax compatible with rsync.
  • marathon
    marathon over 7 years
    this doesn't work. it excludes the first file from the git subcommand and then treats the rest as part of the SRC list. this works: rsync -azP --exclude-from="$(git -C SRC ls-files --exclude-standard -oi --directory > /tmp/excludes; echo /tmp/excludes)" SRC DEST
  • ostrokach
    ostrokach about 6 years
    This is the only method that works if you have both exclude and include lines in your .gitignore (i.e. lines that start with !). It also rsyncs files that you --force added to your repo, which is usually a good thing.
  • sorin
    sorin about 6 years
    Indeed this answer does NOT WORK, so I ended up writing one that works: stackoverflow.com/a/50059607/99834
  • sorin
    sorin about 6 years
    @JesseGlick is right, rsync is not able to parse .gitignore files; see the workaround at stackoverflow.com/a/50059607/99834.
  • dbolotin
    dbolotin about 6 years
    If you also want to use --delete option, here is the working command line: rsync --delete-after --filter=":e- .gitignore" --filter "- .git/" -v -a .... This took me a while... e in filter and --delete-after are both important. I suggest reading the "PER-DIRECTORY RULES AND DELETE" chapter of rsync man page.
  • sylbru
    sylbru about 6 years
    1/ The sentence from the rsync man page quoted in this answer describes the --cvs-exclude option, so you have to use it explicitly. 2/ You may create .cvsignore files in any directory to have project-specific ignores, those are read as well. 3/ .git is already ignored when you use --cvs-exclude, according to the manual, so having it in $HOME/.cvsignore seems redundant.
  • Jesse Glick
    Jesse Glick almost 6 years
    This looks like it will work if you have a particular Git repository you want to synchronize—the SRC here—but not for the original problem I stated, which is a sprawling directory with thousands of Git repositories as subdirectories at various depths, many of which have idiosyncratic .gitignores.
  • kittygirl
    kittygirl over 5 years
    @VasyaNovikov, why not --exclude='/.git' --filter="dir-merge,+ .gitignore"? + means include the files listed in .gitignore.
  • Roland W
    Roland W over 4 years
    If you are using a shell with support for process substitution (bash, zsh, etc.) you can use --exclude-from=<(git -C SRC ls-files --exclude-standard -oi --directory)
  • Bampfer
    Bampfer over 4 years
    To sync deletes as well as adds & updates, you can simply add --delete-after to @VasiliNovikov's version of the command. (This seems equivalent to @dbolotin's version of the command, except @dbolotin uses :e, which I think excludes the .gitignore files from being copied, which is not what I wanted.)
  • redanimalwar
    redanimalwar about 4 years
    Does this assume running rsync from the directory with the .gitignore in it? Or does it pull it from the directory it syncs from? I guess I have to put in the full path to .gitignore to be safe?
  • Ng Sek Long
    Ng Sek Long about 3 years
    This solution is especially good for projects that have multiple .gitignore files scattered around their directories, which is how most modern Git repositories are structured. Glad I scrolled down to here.
  • sorin
    sorin almost 3 years
    Sorry, but this does not work as expected, because rsync is not able to properly read .gitignore files and gets confused by what it finds there. For example, if you have foo/* inside .gitignore, rsync will fail to sync src/foo/.* even though that is not covered by the Git ignore patterns.
  • Paul Praet
    Paul Praet over 2 years
    I omitted the exclude for .git but it still does not copy that directory.. EDIT: solved by --include '.git'
  • Heath Raftery
    Heath Raftery over 2 years
    Since it's non-obvious, might be worth noting that ':- .gitignore' means dir-merge (:), exclude patterns (-) from the file .gitignore. "dir-merge" is short for "per-directory merge", which means "rsync will scan every directory that it traverses for the named file, merging its contents when the file exists into the current list of inherited rules." In my case, I only have one .gitignore, and it's in a parent directory, so the correct option for me is: --filter='.- ../.gitignore', which is a "single-instance" (.) merge.
  • Heath Raftery
    Heath Raftery over 2 years
    On second thoughts I could just run the original command from the parent directory and adjust <src>. Comment left as guide to others.
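To make the filter notation discussed in these comments concrete, here is a small sketch of the two merge styles written with the long rule names. The function names are made up for illustration, and the arguments are placeholders.

```shell
# Per-directory merge: look for a .gitignore in every directory rsync visits.
# Short form of the same rule: --filter=':- .gitignore'
sync_per_dir() {
    rsync -a --filter='dir-merge,- .gitignore' "$1" "$2"
}

# Single-instance merge: read exclude rules once from one named file ($3).
# Short form of the same rule: --filter=".- $3"
sync_single_file() {
    rsync -a --filter="merge,- $3" "$1" "$2"
}

# Usage (placeholder paths):
#   sync_per_dir src/ dst/
#   sync_single_file src/ dst/ ../.gitignore
```

The per-directory form suits trees with .gitignore files at several levels; the single-instance form suits the case where one ignore file lives outside the directory being synced.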