How to exclude files from TAR archive using regular expressions?

You can combine tar with additional tools such as find and egrep:

find directory/ -type f -print | egrep -v '[0-9]+x[0-9X]+\.jpg' | tar cvfz directory.tar.gz -T -

The drawback of this method is that it does not work for every possible file name: the list is passed to tar one name per line, so names that contain newline characters break it. Another option is to use tar's built-in exclude functionality:

tar -czvf directory.tar.gz --exclude='*x*X*.jpg' directory

Unfortunately, the second method does not support regular expressions, only shell-style wildcards.
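
The file-name limitation of the first method can probably be avoided by passing the list NUL-delimited instead of newline-delimited. The following is only a sketch, assuming GNU find, GNU grep (for `-z`) and GNU tar (for `--null`); the temporary directory and the sample file names are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: NUL-delimited variant of the find | grep | tar pipeline.
# Assumes GNU find, GNU grep (-z) and GNU tar (--null).
# The working directory and file names are hypothetical demo values.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p directory
touch directory/68x640X480.jpg directory/keep.jpg
# A file name containing a newline, which the newline-delimited list cannot represent.
touch "directory/odd
name.jpg"

# -print0 ends each name with a NUL byte, grep -z reads and writes
# NUL-terminated records, and tar --null expects the same from -T.
find directory/ -type f -print0 \
  | grep -zEv '[0-9]+x[0-9X]+\.jpg' \
  | tar -czf directory.tar.gz --null -T -

# List the archive contents to check what was included.
tar -tzf directory.tar.gz
```

The regular expression is the same one used above; only the record separator changes, so the filtering logic stays identical.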

Author: Frodik

Updated on September 18, 2022

Comments

  • Frodik
    Frodik over 1 year

    I have a simple question, yet I can't find the answer or work it out myself. I want to make a tar archive, but I want to exclude some files from it using a regular expression.

    Example of the file to exclude is this: 68x640X480.jpg

    I have tried this with no luck:

    tar cvf test.tar --exclude=[0-9]+x[0-9X]+\.jpg /data/foto
    

    Can anybody help?

  • Frodik
    Frodik over 12 years
    Thanks, this is what I was looking for. Can you please add a note about which file names wouldn't work, e.g. names containing which characters?
  • Vladimir Blaskov
    Vladimir Blaskov over 12 years
    You shouldn't worry too much about that - most file names work perfectly fine with that solution. The problem is that UNIX/Linux file names can include pretty much everything, even control characters - such obscure combinations will not work with the first solution.
  • Vladimir Blaskov
    Vladimir Blaskov over 12 years
    A nice read related to UNIX/Linux/POSIX file names: dwheeler.com/essays/fixing-unix-linux-filenames.html
  • MBasith
    MBasith almost 3 years
    What does the "-T -" at the end do?
  • AndOs
    AndOs over 2 years
    @MBasith the -T option enables tar to read the files to archive from another file (--files-from=FILE). The - (dash) refers in this case to standard input. This is useful when the file list needs to be generated from another process and piped into tar.
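
To illustrate the `-T -` behaviour described in the last comment, here is a minimal sketch that feeds tar an explicit list on standard input. The temporary directory and the file names (`a.txt`, `b.txt`, `c.txt`, `files.list`) are hypothetical and exist only for the demonstration.

```shell
#!/bin/sh
# Sketch: give tar its file list on standard input via "-T -".
# The directory and file names below are hypothetical demo values.
set -eu
demo=$(mktemp -d)
cd "$demo"
touch a.txt b.txt c.txt

# Only the names written to the pipe are archived; c.txt is left out.
printf '%s\n' a.txt b.txt | tar -cf list.tar -T -

# Long form of the same option, reading the list from a regular file.
printf '%s\n' a.txt b.txt > files.list
tar -cf list2.tar --files-from=files.list

# Show what ended up in the first archive.
tar -tf list.tar
```

Both archives should contain the same two entries, since the list is identical and only its source (pipe versus file) differs.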