How to specify level of compression when using tar -zcvf?


Solution 1

GZIP=-9 tar cvzf file.tar.gz /path/to/directory

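This assumes a Bourne-style shell such as bash. In general, set the GZIP environment variable to "-9" and run tar normally. Note that newer gzip releases print a warning that the GZIP environment variable is deprecated.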

Also, if you really want the best compression, don't use gzip. Use lzma or 7z.

And when you do use gzip (which is a good idea for various reasons anyway), consider using the pigz program instead of gzip itself.
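As a rough sketch of the pigz suggestion: pigz accepts the same -9 level flag as gzip and writes gzip-format output, so it can sit at the end of a pipe. The sample directory and file names below are made up for illustration, and the script falls back to plain gzip when pigz is not installed.

```shell
#!/bin/sh
# Sketch only: the sample data and file names are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Create a small, compressible sample directory to archive.
mkdir "$workdir/sample"
seq 1 200000 > "$workdir/sample/numbers.txt"

# Prefer pigz when it is installed; otherwise fall back to plain gzip.
if command -v pigz >/dev/null 2>&1; then
    compressor=pigz
else
    compressor=gzip
fi

# tar writes the archive to stdout; the compressor reads it from stdin.
tar -cf - -C "$workdir" sample | "$compressor" -9 > "$workdir/sample.tar.gz"

# Both tools write gzip format, so the usual tar -z can list the result.
tar -tzf "$workdir/sample.tar.gz"
```

Because the output is ordinary gzip data, whoever unpacks the archive does not need pigz.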

Solution 2

Instead of using tar's gzip flag, gzip the archive yourself after tar finishes. That way you can specify the compression level for the gzip program:

tar -cvf files.tar /path/to/file0 /path/to/file1 ; gzip -9 files.tar

Or you could use:

tar cvf - /path/to/file0 /path/to/file1 | gzip -9 - > files.tar.gz

The -9 in the gzip command line tells gzip to use the maximum possible compression level (default is -6).
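To see what the level actually buys you on your own data, a small loop over the pipe form can print the archive size at each level. This is a minimal sketch with made-up sample data; real directories will show different ratios.

```shell
#!/bin/sh
# Sketch only: compares gzip levels on a hypothetical sample directory.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Made-up sample input: one compressible text file.
mkdir "$workdir/sample"
seq 1 200000 > "$workdir/sample/numbers.txt"

for level in 1 6 9; do
    # -C switches into workdir so the archive stores the relative path "sample".
    tar -cf - -C "$workdir" sample | gzip "-$level" > "$workdir/sample-$level.tar.gz"
    size=$(wc -c < "$workdir/sample-$level.tar.gz" | tr -d ' ')
    echo "gzip -$level: $size bytes"
done
```

Wrapping each pipeline in `time` would show the other side of the trade-off.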

Edit: Fixed pipe command line based on @depesz comment.

Solution 3

Modern versions of tar support the xz archive format (GNU tar since 1.22 in 2009, BusyBox since 1.17.0 in 2010).

It's based on LZMA2, kind of like a 7-Zip version of gz. This gives better compression, provided you are OK with requiring xz support.

tar -Jcvf file.tar.xz /path/to/directory

I just found out here (basically a dupe of this question, but on the Unix Stack Exchange) that there is also an XZ_OPT=-9 environment variable that controls the xz compression level, similar to the GZIP variable in the other post.

XZ_OPT=-9 tar -Jcvf file.tar.xz /path/to/directory
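If your tar lacks -J, or you would rather not depend on XZ_OPT, the pipe form used with gzip elsewhere on this page should work with xz too. The following is a sketch with made-up sample data; it skips the compression step when xz is not installed.

```shell
#!/bin/sh
# Sketch only: pipe a tar stream through xz at an explicit level.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Made-up sample input.
mkdir "$workdir/sample"
seq 1 200000 > "$workdir/sample/numbers.txt"

if command -v xz >/dev/null 2>&1; then
    # xz reads stdin and writes stdout when no file is given, like gzip.
    tar -cf - -C "$workdir" sample | xz -9 > "$workdir/sample.tar.xz"
    # -t checks the integrity of the compressed file without unpacking it.
    xz -t "$workdir/sample.tar.xz"
    xz_status=ok
    echo "wrote $(wc -c < "$workdir/sample.tar.xz" | tr -d ' ') bytes"
else
    xz_status=skipped
    echo "xz is not installed; skipping" >&2
fi
```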

Solution 4

tar cv /path/to/directory | gzip --best > file.tar.gz

This is Matrix Mole's second solution, but slightly shortened:

When calling tar, the f option names the archive file. Setting it to - makes tar write the archive to stdout. In GNU tar that is also what happens when f is left out entirely, provided the TAPE environment variable is unset.

And as stated by the gzip man page, if no files are specified gzip will compress from standard input. There is no need for - in the gzip call.

Option --best (equivalent to -9) sets the highest compression level.
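One way to convince yourself that --best and -9 are the same setting is to compress the same tar stream with each and compare the results byte for byte. This is a sketch with made-up sample data; -n is used so that no timestamp is written into the gzip header.

```shell
#!/bin/sh
# Sketch only: check that gzip --best and gzip -9 give identical output.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Made-up sample input.
mkdir "$workdir/sample"
seq 1 50000 > "$workdir/sample/numbers.txt"

# Build the tar stream once so both gzip runs see exactly the same input.
tar -cf "$workdir/sample.tar" -C "$workdir" sample

# -n leaves the name and timestamp out of the gzip header.
gzip -n --best < "$workdir/sample.tar" > "$workdir/best.gz"
gzip -n -9     < "$workdir/sample.tar" > "$workdir/nine.gz"

if cmp -s "$workdir/best.gz" "$workdir/nine.gz"; then
    echo "--best and -9 produced identical output"
else
    echo "outputs differ" >&2
fi
```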

Solution 5

There is also the option to specify the compression program using -I. This can include the compression level option.

tar -I 'gzip -9' -cvf file.tar.gz /path/to/directory

Note that in GNU tar, -I is shorthand for --use-compress-program=COMMAND. This matters if you are using BSD tar rather than GNU tar: BSD tar treats -I as shorthand for the --files-from filename option.

So, to make your command "cross-platform", you could write:

tar --use-compress-program='gzip -9' -cvf file.tar.gz /path/to/directory
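Some older tar builds reportedly treat the whole quoted string as a program name and fail. A defensive sketch, using made-up sample data, tries the long option first and falls back to an explicit pipe if that does not produce a valid archive:

```shell
#!/bin/sh
# Sketch only: try --use-compress-program, fall back to a pipe on failure.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Made-up sample input.
mkdir "$workdir/sample"
seq 1 50000 > "$workdir/sample/numbers.txt"
archive="$workdir/sample.tar.gz"

# If tar cannot run 'gzip -9' as given, it exits non-zero or leaves an
# invalid file behind; either case triggers the fallback branch.
if tar --use-compress-program='gzip -9' -cf "$archive" -C "$workdir" sample 2>/dev/null \
   && gzip -t "$archive" 2>/dev/null; then
    echo "used --use-compress-program"
else
    tar -cf - -C "$workdir" sample | gzip -9 > "$archive"
    echo "used pipe fallback"
fi
```

Either branch ends with the same kind of gzip-compressed archive, so later steps do not need to know which one ran.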

Author: user882903

Updated on September 18, 2022

Comments

  • user882903
    user882903 over 1 year

    I gzip directories very often at work. What I normally do is

    tar -zcvf file.tar.gz /path/to/directory
    

    Is there a way to specify the compression level here? I want to use the best compression possible even if it takes more time to compress.

  • Admin
    Admin almost 13 years
    Using pipes should be done with: tar cvf - /path/to/directory | gzip -9 - > file.tar.gz
  • User1
    User1 over 11 years
    +1 xz is far better than both bzip2 and gzip. Here's a comparison: tukaani.org/lzma/benchmarks.html
  • akostadinov
    akostadinov over 10 years
    why don't you skip f -? if there is no file, then it is stdin/out
  • Mikl
    Mikl over 10 years
    An addition to the previous comment. From "man tar", section Environment: TAPE Device or file to use for the archive if --file is not specified. If this environment variable is unset, use stdin or stdout instead.
  • Mikl
    Mikl over 10 years
    and we can reduce "gzip -9 -" -> "gzip -9". From "man gzip" section Description: If no files are specified, or if a file name is "-", the standard input is compressed to the standard output.
  • Sorceri
    Sorceri over 10 years
    pigz is "parallel gzip", which uses all your cores for gzip compression. You can watch top and see it using anywhere between 200% and 400% CPU.
  • PJ Brunet
    PJ Brunet over 10 years
    This works beautifully. Also if you run as root, permissions & owners are preserved too. Otherwise you must specify. Also if it wasn't obvious "-9" is best compression and "-1" is fastest compression. "-1" still takes a looong time if you have lots of files ;-)
  • joelostblom
    joelostblom about 9 years
    This works with xz and pixz too. It is a great way to control the number of threads used for parallel compressing without having to create an intermediate .tar file. Like so tar -cv /path/to/dir | pixz -p4 > output.tpxz
  • vricot
    vricot about 7 years
    FYI, for .bz2 format, use: BZIP2=-9 tar cvjf file.tar.bz2 /path/to/directory
  • Bell
    Bell about 7 years
    The trade-off is speed. XZ is significantly slower.
  • Seer
    Seer over 6 years
    The environment variable seems to now be GZIP_OPT, the usage should be the same.
  • Cheetah
    Cheetah over 6 years
    Older versions of tar such as that provided in CentOS 6 & 7 do not support providing arguments in the -I arg, they will try to treat the whole thing as a program name to exec, and thus fail. At least as of tar 1.29 in Debian Stretch, this does work.
  • Ponyboy47
    Ponyboy47 almost 6 years
    From the man page on Ubuntu 16.04 for gzip: "On Vax/VMS, the name of the environment variable is GZIP_OPT, to avoid a conflict with the symbol set for invocation of the program." For sh, csh, and MSDOS it should still just be GZIP
  • patryk.beza
    patryk.beza over 4 years
    This is what I get when I try to set GZIP environment variable to -9: gzip: warning: GZIP environment variable is deprecated; use an alias or script
  • OrigamiEye
    OrigamiEye about 4 years
    @Cheetah which version of tar?
  • ivan_pozdeev
    ivan_pozdeev almost 4 years
    @patryk.beza see stackoverflow.com/questions/46167772/…. It says that -I 'gzip <args>' is now the recommended way.
  • YumeYao
    YumeYao over 3 years
    no. xz -1 significantly beats bz2 -1~9 in terms of both compression ratio, compression/decompression speed. bz2 is the most awful format among popular formats. In short, if you ever use bz2, try xz -1, that's done.
  • Saurabh P Bhandari
    Saurabh P Bhandari over 3 years
    @Cheetah I can confirm that it doesn't work for tar v1.26 in CentOS 7.x, returns this error : tar (child): gzip -9: Cannot exec: No such file or directory
  • RonJohn
    RonJohn about 3 years
    @SaurabhPBhandari "Cannot exec: No such file or directory" that probably means you need to specify the full gzip file name tar -I '/bin/gzip -9' -cvf file.tar.gz /path/to/directory.
  • Erwan
    Erwan about 3 years
    @YumeYao FYI this is not always true, I just tried to compress some data with both bzip2 and xz at -9 compression level, xz gives me a 38M compressed size whereas bzip2 gives me 36M.
  • mwfearnley
    mwfearnley almost 3 years
    @RonJohn no, sadly it's due to the old version of tar. 'gzip' works, '/bin/gzip' works, '/bin/gzip -9' doesn't.
  • Kresimir Pendic
    Kresimir Pendic over 2 years
    It all boils down to the type of data that you need to compress, I think - not reversed :)
  • Admin
    Admin almost 2 years
    however --use-compress-program works there too, so I'd use that for portability