How to XZ a directory with TAR using maximum compression?
Solution 1
Assuming xz honors the standard set of command-line flags, including compression level flags, you could try:
tar -cf - foo/ | xz -9 -c - > foo.tar.xz
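As a hedged illustration of the pipe approach, the sketch below builds a small throwaway directory, compresses it, and checks the result. The names demo and demo.tar.xz are made up for the example, and it assumes tar and xz are on the PATH.

```shell
#!/bin/sh
# Sketch only: "demo" and "demo.tar.xz" are hypothetical names.
set -eu

workdir=$(mktemp -d)          # scratch area so nothing real is touched
cd "$workdir"
mkdir demo
printf 'hello\n' > demo/a.txt

# tar writes the archive to stdout (-f -); xz compresses stdin at level 9.
tar -cf - demo/ | xz -9 -c - > demo.tar.xz

# Integrity check, then list the contents without extracting.
xz -t demo.tar.xz
xz -dc demo.tar.xz | tar -tf -
```

Note that in a plain sh pipeline only the exit status of xz is reported, so a tar failure on the left-hand side can go unnoticed unless you check for it separately.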
Solution 2
With a recent GNU tar, on bash or a derived shell:
XZ_OPT=-9 tar cJf tarfile.tar.xz directory
tar's lowercase j switch uses bzip2; the uppercase J switch uses xz.
The XZ_OPT environment variable lets you set xz options that cannot be passed via calling applications such as tar.
This is now maximal.
See man xz for other options you can set (-e/--extreme might give you some additional compression benefit for some datasets).
XZ_OPT=-e9 tar cJf tarfile.tar.xz directory
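For shells that do not accept the VAR=value prefix in front of a command, the env utility does the same job. A minimal sketch, assuming a tar that supports -J and an xz on the PATH; demo and the archive names are placeholders invented for the example:

```shell
#!/bin/sh
# Sketch only: directory and archive names are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
mkdir demo

# Generate some compressible sample data.
i=0
while [ "$i" -lt 20000 ]; do
    echo "sample line $i"
    i=$((i + 1))
done > demo/data.txt

# Per-invocation variable, as accepted by Bourne-style shells:
XZ_OPT=-9e tar -cJf demo-9e.tar.xz demo

# Same idea through env, which can be called from any shell:
env XZ_OPT=-0 tar -cJf demo-0.tar.xz demo

# Compare the archive sizes in bytes.
wc -c demo-9e.tar.xz demo-0.tar.xz
```

In both forms the variable exists only for that one tar invocation, so nothing needs to be exported or unset afterwards.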
Solution 3
XZ_OPT=-9e tar cJf tarfile.tar.xz directory
usually compresses a little better (at the cost of noticeably longer compression time) than
XZ_OPT=-9 tar cJf tarfile.tar.xz directory
Solution 4
If you have 16 GiB of RAM (and nothing else running), you can try:
tar -cf - foo/ | xz --lzma2=dict=1536Mi,nice=273 -c - > foo.tar.xz
This will need 1.5 GiB for decompression, and about 11x that for compression. Adjust accordingly for lesser amounts of memory.
This will only help if the data is actually that big, and in any case it won't help THAT much, but still...
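To adjust for less memory, one rough approach is to work backwards from the RAM you can spare. The sketch below takes the 11x figure above at face value and caps the result at 1536 MiB, the largest dictionary the xz encoder accepts. The function name dict_for_ram_mib is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: suggest an LZMA2 dictionary size (in MiB) for a
# given amount of RAM (in MiB), assuming compression needs about 11x the
# dictionary size. This is a rule of thumb, not an exact figure.
dict_for_ram_mib() {
    ram_mib=$1
    dict=$((ram_mib / 11))
    # The xz encoder does not accept dictionaries above 1536 MiB.
    if [ "$dict" -gt 1536 ]; then
        dict=1536
    fi
    echo "$dict"
}

ram=8192                                  # e.g. 8 GiB to spare
dict=$(dict_for_ram_mib "$ram")
echo "xz --lzma2=dict=${dict}Mi,nice=273"
# prints: xz --lzma2=dict=744Mi,nice=273
```

Treat the number as a starting point and watch actual memory use on a test run before committing to a long job.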
If you're compressing binaries, add --x86 as the first xz option. If you're playing with "multimedia" files (uncompressed audio or bitmaps), you can try with --delta=dist=2 (experiment with value, good values to try are 1..4).
If you're feeling very adventurous, you can try playing with more LZMA options, like
--lzma2=dict=1536Mi,nice=273,lc=3,lp=0,pb=2
(these are the default settings, you can try values between 0 and 4, and lc+lp must not exceed 4)
In order to see how the default presets map to these values, you can check the source file src/liblzma/lzma/lzma_encoder_presets.c. Nothing of much interest there though (-e sets the nice length to 273 and also adjusts the depth).
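The limits on lc, lp and pb mentioned above are easy to check before launching a long compression run. A small sketch follows; check_lzma_params is a hypothetical helper written for this example, not part of xz:

```shell
#!/bin/sh
# Hypothetical helper: validate lc/lp/pb using the limits described
# above (each between 0 and 4, and lc + lp at most 4), then print the
# matching --lzma2 option string.
check_lzma_params() {
    lc=$1 lp=$2 pb=$3
    for v in "$lc" "$lp" "$pb"; do
        if [ "$v" -lt 0 ] || [ "$v" -gt 4 ]; then
            echo "value out of range 0..4: $v" >&2
            return 1
        fi
    done
    if [ $((lc + lp)) -gt 4 ]; then
        echo "lc + lp must not exceed 4" >&2
        return 1
    fi
    echo "--lzma2=dict=1536Mi,nice=273,lc=$lc,lp=$lp,pb=$pb"
}

check_lzma_params 3 0 2    # the default values
# prints: --lzma2=dict=1536Mi,nice=273,lc=3,lp=0,pb=2
```

A combination such as lc=3,lp=2 is rejected by the helper because the sum exceeds 4.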
Solution 5
tar --help: -I, --use-compress-program=PROG
tar -I 'xz -9' -cvf foo.tar.xz foo/
tar -I 'gzip -9' -cvf foo.tar.gz foo/
You can also compress with other external compressors:
tar -I 'lz4 -9' -cvf foo.tar.lz4 foo/
tar -I 'zstd -19' -cvf foo.tar.zst foo/
Decompress with an external compressor:
tar -I lz4 -xvf foo.tar.lz4
tar -I zstd -xvf foo.tar.zst
List an archive made with an external compressor:
tar -I lz4 -tvf foo.tar.lz4
tar -I zstd -tvf foo.tar.zst
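The -I (--use-compress-program) option tells GNU tar to filter the archive through the named program. When extracting or listing, tar runs that program with -d, so the program must accept that flag. A hedged round-trip sketch, assuming a GNU tar recent enough to accept arguments inside the -I string; the demo and restored names are placeholders:

```shell
#!/bin/sh
# Sketch only: "demo" and "restored" are hypothetical names.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
mkdir demo restored
printf 'hello\n' > demo/a.txt

# Create: tar pipes the archive through "xz -9".
tar -I 'xz -9' -cf demo.tar.xz demo/

# List and extract: tar adds -d to the program itself.
tar -I xz -tf demo.tar.xz
tar -I xz -xf demo.tar.xz -C restored

# The extracted copy should match the original.
cmp demo/a.txt restored/demo/a.txt
```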
LanceBaynes
Updated on September 18, 2022
Comments
-
LanceBaynes over 1 year
So I need to compress a directory with max compression.
How can I do it with xz? I mean I will need tar too because I can't compress a directory with only xz. Is there a one-liner to produce e.g. foo.tar.xz?
-
Admin over 9 years
FWIW, man 1 xz says it's not a good idea to blindly use -9 for everything like it often is with gzip(1) and bzip2(1).
-7 ... -9 [...] These are useful only when compressing files bigger than 8 MiB, 16 MiB, and 32 MiB, respectively.
RTFM for more info.
-
LanceBaynes over 12 years
and this uses maximum compression level with XZ?
-
bsd over 12 years
adding -9 to xz will make it max
-
bsd over 12 years
It does now, see edited answer and XZ_OPT env var ;)
-
Admin about 11 years
Just a note: you have to export XZ_OPT.
-
bsd about 11 years
No, you don't. That's the whole point. You can set the environment var for just that invocation. You can export it if you want to, but you don't have to.
-
anddam about 11 years
You're assuming bash-like shell for that.
-
Anthon over 10 years
The J was already mentioned in bdowning's answer
-
psusi over 9 years
This really doesn't answer the question. This is just an observation that for your particular small data set, -4e already gets the best compression and so the higher levels don't get any more benefit (and even an ever so slight penalty).
-
terdon over 9 years
Are you the same user as Szymon Roziewski? If so, please don't post multiple answers. Instead, edit your original answer. If you can't access your first account, please see here for how to merge your accounts. In the meantime, I am deleting your previous answer and including it here.
-
Szymon Roziewski over 9 years
Ok, I have done a more comprehensive study on that. What I got is here. I chose some files from my hard drive and compressed them with options -4e and -9e. So, it's better to find your best solution by yourself. You were right, for some cases -9e is better whereas for others it's not:
no difference = 660
4e better than 9e = 74
9e better than 4e = 17
total files = 751
Files by type: tar 2, html 2, csv 2, xml 2, gz 2, ppt 2, eps 2, docx 2, gif 2, rpm 3, png 3, asv 3, xlsx 3, exe 3, rar 4, nc 4, txt 5, odt 6, xls 7, zip 7, doc 9, m 12, dat 17, other 109, pdf 133, (no label) 135, jpg 270
-
Szymon Roziewski over 9 years
(comments may be edited only for 5 minutes)
txt 109 txt/pdf 135
-
Stéphane Chazelas over 9 years
@anddam, that's supported by all shells of the Bourne family (Bourne, ksh, mksh, pdksh, ash, dash, bash, yash, zsh) and rc and akanga. fish, csh, tcsh and es being the major shells that don't support it. There, you'd use the env command.
-
anddam over 9 years
Actually on fish I'd use 'set' command. The point is that if you're using a syntax specific to one shell you'd warn the reader about that.
-
cychoi over 9 years
+1. This does help the OP find a way to determine maximum compression for taring files using xz.
-
Amedee Van Gasse almost 9 years
The question was about xz, not about 7z, even though they both use LZMA compression.
-
cxdf almost 9 years
How is this better? What does the e flag do?
-
Evandro Jr about 8 years
option -e, --extreme
Modify the compression preset (-0 ... -9) so that a little bit better compression ratio can be achieved without increasing memory usage of the compressor or decompressor (exception: compressor memory usage may increase a little with presets -0 ... -2). The downside is that the compression time will increase dramatically (it can easily double).
-
Krzysztof Krasoń almost 8 years
-9e is the best level, but it will take very long
-
Rahly about 7 years
It's good to note, the use of XZ_OPT or XZ_DEFAULTS depends on the version of XZ and not TAR. man xz
-
nyxee over 6 years
So, if I'm compressing about 80GB of software on my machine (when I want all the computer's resources to go to the compression process for speed) I should use -9 not -9e, yeah?
-
dhag over 6 years
This seems like a working answer, but, as it is, it would be greatly improved by having its formatting fixed and an explanation of option -I added.
-
twistylittlepassages over 6 years
Just for the record: XZ_OPT is not a feature implemented in tar. It's a feature of xz. When tar calls xz, the env-variable is simply passed on.
-
EkriirkE over 5 years
xz by default uses 1 core/thread, you can max that out (speed it all up) by adding -T0, e.g. XZ_OPT="-9e -T0" tar -cJf ...
-
Dzenly almost 5 years
Bad variable name choosing, because T0 is option to enable multi-threaded archivation.
-
Jimmy almost 5 years
@Dzenly You're right! Thank you! Changed it.
-
KolonUK almost 5 years
-9e will not always give you the best result - see point 8 here rootusers.com/13-simple-xz-examples
-
KolonUK almost 5 years
Also, you might see significant improvement if you add --threads=0 to xz
-
user3439968 almost 5 years
XZ_OPT=-e9T0 tar cJf tarfile.tar.xz directory. T0 - Specify the number of worker threads to use. Setting threads to a special value 0 makes xz use as many threads as there are CPU cores on the system.
-
holzkohlengrill over 4 years
Is the pipe variant (tar .... | xz ...) (significantly) slower than using -J/-j/...?
-
Vlastimil Burián over 4 years
Please, be aware this thread has been read 104k times to date. Be sure to add something distinctive. So far, I don't see any way this post actually contributes to the overall thread. How is it different from writing a one-liner: xz -k -8e -M 7000MB -T 8 -v whatever.img? It has been already posted here for instance not exactly the same, but better with the XZ_OPT syntax pointed out. Cheers.
-
Adam Wądołkowski over 4 years
I am sharing my experience in this matter with technical aspects. The example is based on the syntax of xz (XZ Utils) 5.2.2 (with man xz) as I wrote above. I think the test gives a broader picture of the use of xz and an example for further tests optimizing the compression rate vs performance vs equipment load. Regards.
-
staticfloat about 4 years
@KolonUK reading that article, it shows that -e (extreme mode) always improves compression ratio; the comparison is between -0e and -6; while -e always improves compression ratio within the same compression level, a higher compression level may be more effective than "extreme mode". There is no evidence that -9e can yield a worse compression ratio than -9.
-
cronburg about 3 years
Double reminder to readers to check man xz | grep XZ_OPT before using this method.
-
midnite about 2 years
@user3439968 - I wonder if using T0 without a space means compression level 0. I think using XZ_OPT="-e9 -T 0" tar cJf tarfile.tar.xz directory is what you mean.
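On whether bundling T0 changes the preset: xz parses short options getopt-style and -T takes an argument, so the 0 after T should be read as the thread count, not as preset 0. One way to check this on your own system is to compress the same input with both spellings and compare the bytes. The sketch assumes an xz new enough to support -T (5.2 or later), uses preset 6 instead of 9 only to keep memory use low, and all file names are placeholders:

```shell
#!/bin/sh
# Sketch only: file names are hypothetical; preset 6 stands in for 9.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
mkdir demo
i=0
while [ "$i" -lt 20000 ]; do
    echo "sample line $i"
    i=$((i + 1))
done > demo/data.txt
tar -cf demo.tar demo

# Bundled short options versus separately written options.
xz -e6T0 -c demo.tar > bundled.xz
xz -6e -T0 -c demo.tar > separate.xz

if cmp -s bundled.xz separate.xz; then
    echo "both spellings produced identical output"
else
    echo "outputs differ; the options were parsed differently"
fi
```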