How to specify level of compression when using tar -zcvf?
Solution 1
GZIP=-9 tar cvzf file.tar.gz /path/to/directory
This assumes a Bourne-style shell such as bash. In general, set the GZIP environment variable to "-9" and run tar normally. Note that recent gzip releases print a warning that the GZIP environment variable is deprecated.
Also, if you really want the best compression, don't use gzip. Use lzma or 7z.
And when you do use gzip (which is a good idea for various reasons anyway), consider using the pigz program instead of gzip itself.
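As a minimal sketch of the pigz suggestion, the snippet below pipes tar into pigz when it is installed and falls back to gzip otherwise. The scratch directory and sample file are hypothetical and exist only for illustration; substitute your own paths.

```shell
# Hypothetical sample data in a scratch directory (illustration only).
workdir=$(mktemp -d)
mkdir "$workdir/data"
seq 1 20000 > "$workdir/data/numbers.txt"

# Prefer pigz (parallel gzip) when available, otherwise use plain gzip.
if command -v pigz >/dev/null 2>&1; then
    compressor=pigz
else
    compressor=gzip
fi

# Both programs accept -9 and write gzip-compatible output.
tar -cf - -C "$workdir" data | "$compressor" -9 > "$workdir/data.tar.gz"

# Check the integrity of the compressed stream.
gzip -t "$workdir/data.tar.gz" && echo "archive OK"
```

Because the level is passed to the compressor directly, this does not depend on the GZIP environment variable.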
Solution 2
Instead of using tar's gzip flag, gzip the archive manually after tar has finished. That way you can specify the compression level for the gzip program:
tar -cvf files.tar /path/to/file0 /path/to/file1 ; gzip -9 files.tar
Or you could use:
tar cvf - /path/to/file0 /path/to/file1 | gzip -9 - > files.tar.gz
The -9 in the gzip command line tells gzip to use the maximum possible compression level (default is -6).
Edit: Fixed pipe command line based on @depesz comment.
Solution 3
Modern versions of tar support the xz archive format (GNU tar, since 1.22 in 2009, Busybox since 1.17.0 in 2010).
It's based on LZMA2, kind of like a 7-Zip version of gz. This gives better compression, provided you are OK with requiring xz support.
tar -Jcvf file.tar.xz /path/to/directory
There is also an XZ_OPT=-9 environment variable to control the xz compression level, similar to the GZIP one in the other answer (found via a near-duplicate of this question on the Unix Stack Exchange).
XZ_OPT=-9 tar -Jcvf file.tar.xz /path/to/directory
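If you would rather not rely on an environment variable, the pipe approach from Solution 2 works with xz as well. This is a sketch under the assumption that xz is installed (it is skipped otherwise); the scratch directory and sample file are hypothetical. Keep in mind that xz -9 needs considerably more memory than the default level.

```shell
# Hypothetical sample data in a scratch directory (illustration only).
workdir=$(mktemp -d)
mkdir "$workdir/data"
seq 1 20000 > "$workdir/data/numbers.txt"

if command -v xz >/dev/null 2>&1; then
    # Equivalent in effect to "XZ_OPT=-9 tar -Jcf ...", with the level given to xz directly.
    tar -cf - -C "$workdir" data | xz -9 > "$workdir/data.tar.xz"
    # Verify the compressed stream, then list the archive contents.
    xz -t "$workdir/data.tar.xz"
    xz -dc "$workdir/data.tar.xz" | tar -tf -
else
    echo "xz is not installed; skipping" >&2
fi
```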
Solution 4
tar cv /path/to/directory | gzip --best > file.tar.gz
This is Matrix Mole's second solution, slightly shortened.
When calling tar, option f states that the output is a file. Setting it to - (stdout) makes tar write its output to stdout, which is also the default when f is omitted, provided the TAPE environment variable is unset and your tar build defaults to stdout.
As stated by the gzip man page, if no files are specified gzip compresses standard input, so there is no need for - in the gzip call.
Option --best (equivalent to -9) sets the highest compression level.
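If you are unsure whether your tar writes to stdout by default, a slightly longer variant with an explicit "-f -" behaves the same everywhere. The sketch below uses a hypothetical scratch directory and file, and lists the archive afterwards to confirm it can be read back.

```shell
# Hypothetical sample data in a scratch directory (illustration only).
workdir=$(mktemp -d)
mkdir "$workdir/data"
printf 'hello\n' > "$workdir/data/hello.txt"

# "-f -" sends the archive to stdout regardless of TAPE or the built-in default.
tar -cf - -C "$workdir" data | gzip --best > "$workdir/data.tar.gz"

# Decompress and list the contents to confirm the archive is readable.
listing=$(gzip -dc "$workdir/data.tar.gz" | tar -tf -)
echo "$listing"
```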
Solution 5
There is also the option to specify the compression program using -I. This can include the compression level option.
tar -I 'gzip -9' -cvf file.tar.gz /path/to/directory
Note that the -I option is shorthand for --use-compress-program=COMMAND. This is important if you're using BSD tar rather than GNU tar: BSD tar uses -I as shorthand for the --files-from filename option.
So to make your command "cross-platform" you could write:
tar --use-compress-program='gzip -9' -cvf file.tar.gz /path/to/directory
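Some older tar releases reportedly treat the whole quoted string as a program name and fail. One way to cope, sketched here with a hypothetical scratch directory and sample file, is to try the long option first and fall back to a plain pipe if tar exits with an error.

```shell
# Hypothetical sample data in a scratch directory (illustration only).
workdir=$(mktemp -d)
mkdir "$workdir/data"
seq 1 20000 > "$workdir/data/numbers.txt"
archive="$workdir/data.tar.gz"

# Try the long option with an argument first; a tar that cannot run
# 'gzip -9' as a command exits with a non-zero status.
if ! tar --use-compress-program='gzip -9' -cf "$archive" -C "$workdir" data 2>/dev/null; then
    # Fallback: pipe the uncompressed archive through gzip ourselves.
    tar -cf - -C "$workdir" data | gzip -9 > "$archive"
fi

# Check the integrity of the result either way.
gzip -t "$archive" && echo "archive OK"
```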
user882903
Updated on September 18, 2022
Comments
-
user882903 over 1 year
I gzip directories very often at work. What I normally do is
tar -zcvf file.tar.gz /path/to/directory
Is there a way to specify the compression level here? I want to use the best compression possible even if it takes more time to compress.
-
Admin almost 13 years
Using pipes should be done with:
tar cvf - /path/to/directory | gzip -9 - > file.tar.gz
-
User1 over 11 years
+1 xz is far better than both bzip2 and gzip. Here's a comparison: tukaani.org/lzma/benchmarks.html
-
akostadinov over 10 years
Why don't you skip f -? If no file is given, tar uses stdin/stdout.
-
Mikl over 10 years
In addition to the previous comment, from the "Environment" section of "man tar": TAPE: Device or file to use for the archive if --file is not specified. If this environment variable is unset, use stdin or stdout instead.
-
Mikl over 10 years
And we can reduce "gzip -9 -" to "gzip -9". From the "Description" section of "man gzip": If no files are specified, or if a file name is "-", the standard input is compressed to the standard output.
-
Sorceri over 10 years
pigz is "parallel gzip", which uses all your cores for gzip compression. You can watch top and see it using anywhere between 200% and 400% CPU.
-
PJ Brunet over 10 years
This works beautifully. Also, if you run as root, permissions and owners are preserved too. Otherwise you must specify. Also, if it wasn't obvious, "-9" is best compression and "-1" is fastest compression. "-1" still takes a looong time if you have lots of files ;-)
-
joelostblom about 9 years
This works with xz and pixz too. It is a great way to control the number of threads used for parallel compressing without having to create an intermediate .tar file. Like so:
tar -cv /path/to/dir | pixz -p4 > output.tpxz
-
vricot about 7 years
FYI, for the .bz2 format, use:
BZIP2=-9 tar cvjf file.tar.bz2 /path/to/directory
-
Bell about 7 years
The trade-off is speed. XZ is significantly slower.
-
Seer over 6 years
The environment variable seems to now be GZIP_OPT; the usage should be the same.
-
Cheetah over 6 years
Older versions of tar, such as those provided in CentOS 6 & 7, do not support providing arguments in the -I arg. They will try to treat the whole thing as a program name to exec, and thus fail. At least as of tar 1.29 in Debian Stretch, this does work.
-
Ponyboy47 almost 6 years
From the man page on Ubuntu 16.04 for gzip: "On Vax/VMS, the name of the environment variable is GZIP_OPT, to avoid a conflict with the symbol set for invocation of the program." For sh, csh, and MSDOS it should still just be GZIP.
-
patryk.beza over 4 years
This is what I get when I try to set the GZIP environment variable to -9:
gzip: warning: GZIP environment variable is deprecated; use an alias or script
-
OrigamiEye about 4 years
@Cheetah which version of tar?
-
ivan_pozdeev almost 4 years
@patryk.beza see stackoverflow.com/questions/46167772/…. It says that -I 'gzip <args>' is now the recommended way.
-
YumeYao over 3 years
No. xz -1 significantly beats bz2 -1 through -9 in compression ratio as well as compression and decompression speed. bz2 is the most awful of the popular formats. In short, if you ever use bz2, try xz -1 instead.
-
Saurabh P Bhandari over 3 years
@Cheetah I can confirm that it doesn't work for tar v1.26 in CentOS 7.x; it returns this error:
tar (child): gzip -9: Cannot exec: No such file or directory
-
RonJohn about 3 years
@SaurabhPBhandari "Cannot exec: No such file or directory" probably means you need to specify the full gzip file name:
tar -I '/bin/gzip -9' -cvf file.tar.gz /path/to/directory
-
Erwan about 3 years
@YumeYao FYI this is not always true. I just tried to compress some data with both bzip2 and xz at the -9 compression level: xz gives me a 38M compressed size whereas bzip2 gives me 36M.
-
mwfearnley almost 3 years
@RonJohn no, sadly it's due to the old version of tar. 'gzip' works, '/bin/gzip' works, '/bin/gzip -9' doesn't.
-
Kresimir Pendic over 2 years
It all boils down to the type of data that you need to compress, I think, not the reverse :)
-
Admin almost 2 years
However, --use-compress-program works there too, so I'd use that for portability.