Shell command to tar directory excluding certain files/folders

1,112,907

Solution 1

You can pass several --exclude options to tar, so

$ tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

will work. Make sure to put the --exclude options before the source and destination items.
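
To check the behaviour without touching real data, the command can be tried on a scratch tree. This is a minimal sketch, assuming GNU tar; the directory layout and file names (index.html, huge.bin, keep) are hypothetical stand-ins for the folders named in the answer.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout standing in for the directory to back up.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/folder" "$work/src/upload/folder2" \
         "$work/src/upload/keep" "$work/backup"
echo data > "$work/src/index.html"
echo big  > "$work/src/folder/huge.bin"
echo big  > "$work/src/upload/folder2/huge.bin"
echo data > "$work/src/upload/keep/photo.jpg"

# cd first so that "." and the ./-prefixed patterns share the same root.
cd "$work/src"
tar --exclude='./folder' --exclude='./upload/folder2' \
    -zcf "$work/backup/filename.tgz" .

# List the archive to confirm what was left out.
listing=$(tar -tzf "$work/backup/filename.tgz")
printf '%s\n' "$listing"
```

Listing the archive with tar -tzf afterwards is a cheap way to confirm the excludes took effect before relying on the backup.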

Solution 2

You can exclude directories with --exclude for tar.

If you want to archive everything except /usr you can use:

tar -zcvf /all.tgz / --exclude=/usr

In your case perhaps something like

tar -zcvf archive.tgz arc_dir --exclude=dir/ignore_this_dir
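
One caveat: by default GNU tar's exclude patterns are not anchored, so a pattern can match at any depth of the tree. The sketch below, assuming GNU tar (the --anchored option is not available in bsdtar), contrasts the two behaviours; the cache directories and file names are hypothetical.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical tree with the same directory name at two depths.
mkdir -p arc_dir/cache arc_dir/app/cache
echo tmp  > arc_dir/cache/junk
echo keep > arc_dir/app/cache/needed

# Default (unanchored): the bare name matches at any depth.
tar -czf loose.tgz --exclude='cache' arc_dir
loose=$(tar -tzf loose.tgz)

# Anchored: the pattern must match from the start of the member name,
# so only the top-level cache directory is skipped.
tar -czf strict.tgz --anchored --exclude='arc_dir/cache' arc_dir
strict=$(tar -tzf strict.tgz)

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$loose" "$strict"
```

Note that --anchored is placed before the --exclude it should affect.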

Solution 3

Possible options to exclude files/directories from backup using tar:

Exclude files using multiple patterns

tar -czf backup.tar.gz --exclude=PATTERN1 --exclude=PATTERN2 ... /path/to/backup

Exclude files using an exclude file filled with a list of patterns

tar -czf backup.tar.gz -X /path/to/exclude.txt /path/to/backup

Exclude files using tags by placing a tag file in any directory that should be skipped

tar -czf backup.tar.gz --exclude-tag-all=exclude.tag /path/to/backup
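
The exclude-file and tag-file variants can be sketched on a scratch tree as follows. This assumes GNU tar (--exclude-tag-all is a GNU option); the data directory, its contents, and the file name exclude.txt are hypothetical.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical data set.
mkdir -p data/logs data/scratch data/docs
echo a > data/docs/readme.txt
echo b > data/logs/app.log
echo c > data/scratch/tmp.bin
echo d > data/notes.bak

# One pattern per line; tar reads them with -X (long form: --exclude-from).
cat > exclude.txt <<'EOF'
*.bak
data/logs
EOF
tar -czf by-file.tgz -X exclude.txt data
by_file=$(tar -tzf by-file.tgz)

# Drop an empty tag file into any directory that should be skipped entirely.
# GNU tar may print a notice about the skipped directory on stderr.
: > data/scratch/exclude.tag
tar -czf by-tag.tgz --exclude-tag-all=exclude.tag data 2>/dev/null
by_tag=$(tar -tzf by-tag.tgz)

printf 'exclude file:\n%s\n\ntag file:\n%s\n' "$by_file" "$by_tag"
```

An exclude file scales better than a long row of --exclude options, while a tag file lets each directory opt out of the backup without editing the backup command.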

Solution 4

This is an old question with many answers, but I found that none were quite clear enough for me, so I would like to add my attempt.

If you have the following structure

/home/ftp/mysite/

with the following files/folders

/home/ftp/mysite/file1
/home/ftp/mysite/file2
/home/ftp/mysite/file3
/home/ftp/mysite/folder1
/home/ftp/mysite/folder2
/home/ftp/mysite/folder3

Say you want to make a tar file that contains everything inside /home/ftp/mysite (to move the site to a new server), but file3 is just junk and nothing in folder3 is needed, so we will skip those two.

We use the format

tar -czvf <name of tar file> <what to tar> <any excludes>

where c = create, z = compress with gzip, v = verbose (you can see the files as they are added, which is useful for checking that none of the files you excluded are being included), and f = file (the archive name follows it).

So my command would look like this

cd /home/ftp/
tar -czvf mysite.tar.gz mysite --exclude='file3' --exclude='folder3'

Note that the excluded files/folders are given relative to the root of your tar (I tried full paths relative to / here, but could not make that work).

I hope this will help someone (and me, the next time I google it).
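
The walkthrough above can be reproduced on a throwaway copy of the structure. This is a minimal sketch, assuming GNU tar; the temporary directory stands in for /home/ftp and the files inside the folders (a, b, c) are hypothetical. The excludes are placed before the source here because several tar versions reportedly require that order.

```shell
#!/bin/sh
set -eu

# A throwaway stand-in for /home/ftp, so nothing real is touched.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/mysite/folder1" "$work/mysite/folder2" "$work/mysite/folder3"
for f in file1 file2 file3 folder1/a folder2/b folder3/c; do
    echo x > "$work/mysite/$f"
done

cd "$work"
# Excludes go before the source directory; f is kept last in the bundle
# because the next word is taken as the archive name.
tar --exclude='file3' --exclude='folder3' -czf mysite.tar.gz mysite

# List the result instead of relying on -v output during creation.
listing=$(tar -tzf mysite.tar.gz)
printf '%s\n' "$listing"
```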

Solution 5

You can use standard "ant notation" to exclude directories by relative path.
This works for me and excludes any .git or node_modules directories:

tar -cvf myFile.tar --exclude=**/.git/* --exclude=**/node_modules/*  -T /data/txt/myInputFile.txt 2> /data/txt/myTarLogFile.txt

myInputFile.txt contains:

/dev2/java
/dev2/javascript
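
A self-contained variant of this approach, combining an input list (-T) with excludes, might look like the sketch below. It assumes GNU tar, uses plain directory-name patterns rather than the ** form, and builds a hypothetical tree in a temporary directory in place of /dev2; the file names inputs.txt and tar.log are also made up for the example.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical project roots standing in for /dev2/java and /dev2/javascript.
mkdir -p "$work/dev2/java/.git" "$work/dev2/java/src" \
         "$work/dev2/javascript/node_modules/lib" "$work/dev2/javascript/src"
echo ref  > "$work/dev2/java/.git/HEAD"
echo code > "$work/dev2/java/src/Main.java"
echo dep  > "$work/dev2/javascript/node_modules/lib/index.js"
echo code > "$work/dev2/javascript/src/app.js"

# The input list: one path per line, relative to the -C directory.
printf '%s\n' dev2/java dev2/javascript > "$work/inputs.txt"

# Quoting keeps the shell from expanding the patterns before tar sees them.
tar -cf "$work/myFile.tar" \
    --exclude='.git' --exclude='node_modules' \
    -C "$work" -T "$work/inputs.txt" 2> "$work/tar.log"

listing=$(tar -tf "$work/myFile.tar")
printf '%s\n' "$listing"
```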

Author by

deepwell

Updated on April 23, 2022

Comments

  • deepwell
    deepwell about 2 years

    Is there a simple shell command/script that supports excluding certain files/folders from being archived?

    I have a directory that needs to be archived, with a subdirectory that has a number of very large files I do not need to back up.

    Not quite solutions:

    The tar --exclude=PATTERN command matches the given pattern and excludes those files, but I need specific files & folders to be ignored (full file path), otherwise valid files might be excluded.

    I could also use the find command to create a list of files, exclude the ones I don't want to archive, and pass the list to tar, but that only works for a small number of files. I have tens of thousands.

    I'm beginning to think the only solution is to create a file with a list of files/folders to be excluded, then use rsync with --exclude-from=file to copy all the files to a tmp directory, and then use tar to archive that directory.

    Can anybody think of a better/more efficient solution?

    EDIT: Charles Ma's solution works well. The big gotcha is that the --exclude='./folder' MUST be at the beginning of the tar command. Full command (cd first, so backup is relative to that directory):

    cd /folder_to_backup
    tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .
    
    • Rekhyt
      Rekhyt about 12 years
      Another thing caught me out on that, might be worth a note: Trailing slashes at the end of excluded folders will cause tar to not exclude those folders at all.
    • earcam
      earcam about 12 years
      @Rekhyt thanks, I was staring at the command for 15 minutes ... then 30
    • Joel G Mathew
      Joel G Mathew over 11 years
      It seems the position of --exclude depends on the version of tar. For tar 1.23, --exclude needs to come after the main commands.
    • Meetai.com
      Meetai.com over 10 years
      Don't forget the "'" (quotation marks).
    • Brice
      Brice about 10 years
      I had to remove the single quotation marks in order to successfully exclude the directories. (tar -zcvf gatling-charts-highcharts-1.4.6.tar.gz /opt/gatling-charts-highcharts-1.4.6 --exclude=results --exclude=target)
    • guidod
      guidod over 8 years
      It only worked without the quotation marks for me as well
    • staffan
      staffan over 8 years
      The --exclude='./folder' syntax does not seem to work on OS X.
    • escape-llc
      escape-llc over 8 years
      It works with quotes; I use this one: tar --exclude "logs" --exclude "*.tar.gz" -zcvf "Archive.tar.gz" -C "/path/to/files" . The --exclude must go first; I had problems until I did that. Here I'm using the -C option (change to directory), which makes the command insensitive to the directory tar is run from.
    • wortwart
      wortwart almost 7 years
      --exclude doesn't have to be first but it has to come somewhere before the source directory (tested with tar 1.29 on Cygwin).
    • Mariano Paniga
      Mariano Paniga over 2 years
      In my case I needed also to remove the initial './' characters inside the "-exclude" options... but I think it depends on what you have specified as the last parameter (tar version "tar (GNU tar) 1.26"), for example: tar --exclude='wlserver_12.2/OPatch/patches' --exclude='wlserver_12.2/OPatch_20191007/patches' -cvf wlserver_12.2.backup.tar wlserver_12.2
  • Johan Soderberg
    Johan Soderberg about 15 years
    To clarify, you can use full path for --exclude.
  • jørgensen
    jørgensen over 12 years
    That can cause tar to be invoked multiple times - and will also pack files repeatedly. Correct is: find / -print0 | tar -T- --null --no-recursive -cjf tarfile.tar.bz2
  • GeertVc
    GeertVc over 10 years
    Just want to add to the above that it is important that the directory to be excluded does NOT end with a trailing slash. So, --exclude='/path/to/exclude/dir' is CORRECT, --exclude='/path/to/exclude/dir/' is WRONG.
  • Znik
    Znik over 10 years
    You can quote the exclude string, like this: 'somedir/filesdir/*'. Then the shell will not expand asterisks and other wildcard characters.
  • Sverre
    Sverre about 10 years
    I would have added a comment on some of the other good answers, but did not have the karma, so decided to make a full answer.
  • Mike
    Mike about 10 years
    In the second example above there should be asterisks after the last slash in each exclude clause, but the post did not take them.
  • hitautodestruct
    hitautodestruct almost 10 years
    Can you give an example of a pattern. Is this a regex pattern?
  • Nux
    Nux almost 10 years
    No. It's not a RegExp. The * here matches any sequence of characters (including /).
  • t0r0X
    t0r0X over 9 years
    Beware: zip does not pack empty directories, but tar does!
  • James O'Brien
    James O'Brien over 9 years
    This answer makes it look like --exclude comes first... tar cvfpz ../stuff.tgz --exclude='node_modules' --exclude='.git' .
  • Tuxdude
    Tuxdude over 9 years
    xargs -n 1 is another option to avoid xargs: Argument list too long error ;)
  • user276648
    user276648 over 9 years
    See @Stephen Donecker answer below to have a file containing the list of files to exclude.
  • Valentino
    Valentino over 9 years
    this is because the target archive target.tgz is an argument of the f switch, which it should follow
  • shasi kanth
    shasi kanth over 9 years
    As an example, if you are trying to backup your wordpress project folder, excluding the uploads folder, you can use this command: tar -cvf wordpress_backup.tar wordpress --exclude=wp-content/uploads
  • Anish Ramaswamy
    Anish Ramaswamy about 9 years
    This answer definitely helped me! The gotcha for me was that my command looked something like tar -czvf mysite.tar.gz mysite --exclude='./mysite/file3' --exclude='./mysite/folder3', and this didn't exclude anything.
  • Alfred Bez
    Alfred Bez almost 9 years
    I came up with the following command: tar -zcv --exclude='file1' --exclude='patter*' --exclude='file2' -f /backup/filename.tgz . note that the -f flag needs to precede the tar file see: superuser.com/a/559341/415047
  • Josiah
    Josiah almost 9 years
    A "/" on the end of the exclude directory will cause it to fail. I guess tar thinks an ending / is part of the directory name to exclude. BAD: --exclude=mydir/ GOOD: --exclude=mydir
  • Rik Smith-Unna
    Rik Smith-Unna almost 9 years
    @hitautodestruct The patterns are called file globs, see man7.org/linux/man-pages/man7/glob.7.html for documentation
  • Michael
    Michael almost 9 years
    Thanks for this answer, the tar on darwin definitely has a different syntax and it was driving me nuts why "--exclude=blah" in the other answers weren't working. This worked great on a mac.
  • Stphane
    Stphane over 8 years
    I read somewhere that when using xargs, one should use tar's r option instead of c, because when find returns a large number of results, xargs will split those results (based on the local command-line argument limit) into chunks and invoke tar on each part. This will result in an archive containing only the last chunk returned by xargs, not all the results found by the find command.
  • Benoit Duffez
    Benoit Duffez about 8 years
    Don't forget COPYFILE_DISABLE=1 when using tar, otherwise you may get ._ files in your tarball
  • NightKnight on Cloudinsidr.com
    NightKnight on Cloudinsidr.com over 7 years
    > Make sure to put --exclude before the source and destination items. OR use an absolute path for the exclude: tar -cvpzf backups/target.tar.gz --exclude='/home/username/backups' /home/username
  • Qorbani
    Qorbani over 7 years
    Your sample was very similar to what I had issue with! Thank you!
  • gdbj
    gdbj over 7 years
    the ordering of the exclude tag matters. But how is this behaviour not considered a bug? Would be an easy fix.
  • Admin
    Admin over 7 years
    I'm guessing bash expansion should work for exclude, --exclude={'folder1','folder2','folder3'} , saves from having to type too many excludes
  • manuc66
    manuc66 over 7 years
    I suggest this one for multi-core machines : tar --exclude='./folder' --exclude='./upload/folder2' -c --use-compress-program=pigz -f /backup/filename.tgz .
  • Hubert
    Hubert over 7 years
    Nice and clear, thank you. For me the issue was that other answers include absolute or relative paths. But all you have to do is add the name of the folder you want to exclude.
  • ericosg
    ericosg about 7 years
    i'm 100% sure everyone doubled back to this issue after not having put --exclude= before the rest. lets make that bold?
  • tripleee
    tripleee almost 7 years
    Requiring interactive input is a poor design choice for most shell scripts. Make it read command-line parameters instead and you get the benefit of the shell's tab completion, history completion, history editing, etc.
  • tripleee
    tripleee almost 7 years
    Additionally, your script does not work for paths which contain whitespace or shell metacharacters. You should basically always put variables in double quotes unless you specifically require the shell to perform whitespace tokenization and wildcard expansion. For details, please see stackoverflow.com/questions/10067266/…
  • not2qubit
    not2qubit about 6 years
    I believe this requires the Bash shell option globstar to be enabled. Check with shopt globstar (and enable it with shopt -s globstar). I think it is off by default on most Unix-based OSes. From the Bash manual: "globstar: If set, the pattern ** used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a ‘/’, only directories and subdirectories match."
  • fagiani
    fagiani almost 6 years
    This is a much more clear answer. Because of the example I was able to get it working as the paths were confusing at first. Thanks a bunch!
  • PicoutputCls
    PicoutputCls almost 6 years
    After much experimentation I've found more or less the same thing with my command in tar (GNU tar) 1.28.
  • cstamas
    cstamas over 5 years
    The ordering of parameters seems to matter and this form works for me.
  • arg
    arg over 5 years
    Me too. For tar (GNU tar) 1.28 on Ubuntu 16.04, only this specific order of parameters worked.
  • Brent Faust
    Brent Faust over 5 years
    As of tar 1.28, at least, the ordering does not appear to matter. tar -czf <dst>.tar.gz --exclude={*.mp4,.git} <srcdir> also works.
  • lucaferrario
    lucaferrario about 5 years
    it worked! Please remember not to add a trailing slash to the exclude. For example, while "file3" or "file3/subfolder" works, "file3/" and "file3/subfolder/" do not!
  • SherylHohman
    SherylHohman about 5 years
    Thanks for including your answer. It's always nice to include a link to the source where you found the answer. Bonus: if the source was another stackoverflow or stackexchange post, you get extra karma (either points or badges, I don't remember which). Either way, they get a smile, and everyone wins. No downsides :-) It also helps people who want to search out extra info. Sometimes people will upvote just because you included a source link. Finally, sharing the specific issue this addressed, or why this was a better solution, might help someone else with a unique problem.
  • SherylHohman
    SherylHohman about 5 years
    For me this worked when I did not surround the exclude files and directories with quotation marks.
  • Paul
    Paul almost 5 years
    Note that the path to the directory to exclude shouldn't end with a slash. --exclude='./folder' works, but --exclude='./folder/' doesn't work.
  • Nitai
    Nitai over 4 years
    Thank you for your answer. I was looking (what felt like a very long time) for a solution, and your answer guided me in the right direction. However, in my case (Ubuntu 18.04.3, Tar 1.29) I only could make it work with adding the folder name and NOT the path, e.g.: tar --exclude=folder1 --exclude=folder2 -czvf /opt/archieve.tgz folder
  • Kai Petzke
    Kai Petzke over 4 years
    I think, this is the best solution, as it even works in those cases, that the number of excludes is large. It is also possible to include the X option in the option pack, so the shortest form is probably: tar cXvfJ EXCLUDE-LIST ARCHIVE.tar.xz SOURCE-FOLDER
  • lobotmcj
    lobotmcj over 4 years
    in some instances, it is required that --exclude precede the files/folders to archive
  • F. Hauri  - Give Up GitHub
    F. Hauri - Give Up GitHub over 3 years
    Tar complexity! xkcd: tar
  • PJ Brunet
    PJ Brunet over 3 years
    The problem with braces, they can break your bash functions :-)
  • ygoe
    ygoe over 3 years
    This is not the answer. I tried putting the --exclude-from first and I tried providing absolute paths to exclude. Either nothing was excluded, or the excluded patterns also matched in subdirectories, excluding too much!
  • ygoe
    ygoe over 3 years
    Beware that --exclude=dir/ignore_this_dir will match in any subtree as well! You'll end up missing files you didn't expect to be excluded.
  • alper
    alper over 3 years
    can I exclude hidden folders as well?
  • sfink
    sfink almost 3 years
    It looks like there's an --anchored flag for preventing it from matching the middle of paths. Apparently the intended use case is --exclude="*.o" and you have to fight to get anything else to work.
  • wsams
    wsams over 2 years
    On a Mac as of at least 11.6 (with bsdtar 3.3.2) --exclude must precede -zcvf. For example: tar --exclude "./.git" --exclude "./node_modules" -zcvf archive.tar.gz some_directory. (Update: I just noticed @Jerinaw answer, that also works. Looks like -f is the problem and must be after the --exclude options)
  • Black
    Black over 2 years
    Does not work, it still includes the excluded folder
  • mattlangtree
    mattlangtree over 2 years
    I tried most options on this page and on tar version 1.27.1, this answer helped me.
  • Avio
    Avio about 2 years
    It's pretty funny that -zcvf works while -cvfz just creates an archive named z and gg to everyone :D