Fast compression in 7z format (like zip or gzip)
Solution 1
You could try the 7-Zip Zstandard version. This fork supports additional codecs which are very fast for compression and decompression.
Here is a short summary of the codecs used:
LZ4 - fastest compression / decompression, but a low compression ratio
Lizard / LZ5 - better ratio than LZ4 and often faster at decompression than LZ4, but compression is a bit slower
Brotli and Zstandard - zstd is often a bit faster than Brotli, but for text content Brotli may compress a bit better ;)
Threading is supported by all 5 codecs, up to 256 threads currently.
Run it like:
7z a archiv.7z -m0=lz5 -mx1 -mmt=4
7z a archiv.7z -m0=zstd -mx1 -mmt=4
7z a archiv.7z -m0=brotli -mx1 -mmt=1
.. 7z a archiv.7z -m0=brotli -mx1 -mmt=256
And so on...
Solution 2
"A very fast compressor is lbzip2 with multithreading, but it cannot be run from tar itself."
In GNU tar you can specify the compressor with a flag, for example tar -I "zstd -T0" or tar --use-compress-program=pigz
If you want a fast single-threaded compressor you can use lz4.
You don't have to use tar's compressor flag, though; you can also pipe the output through a compressor of your choosing.
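# create (-f - makes tar write the archive to stdout)
tar -cf - /inputdir | pigz --fast > output.tar.gz
# decompress (-c makes pigz write to stdout instead of replacing the file)
pigz -dc input.tar.gz | tar -xf -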
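The pipe approach works with any compressor that reads stdin and writes stdout. Below is a minimal sketch of that idea; pack_dir and unpack_dir are hypothetical helper names made up for illustration, and gzip stands in for pigz, lz4, zstd or lbzip2 only because it is installed almost everywhere.

```shell
#!/bin/sh
# Sketch only: pack_dir / unpack_dir are illustrative names, not part of tar.

# pack_dir SRC_DIR OUT_FILE COMPRESSOR [ARGS...]
pack_dir() {
    src=$1; out=$2; shift 2
    # -C stores paths relative to SRC_DIR; "-f -" sends the archive to stdout.
    tar -C "$src" -cf - . | "$@" > "$out"
}

# unpack_dir ARCHIVE DEST_DIR DECOMPRESSOR [ARGS...]
unpack_dir() {
    archive=$1; dest=$2; shift 2
    mkdir -p "$dest"
    # The decompressor must write to stdout, e.g. "gzip -dc" or "pigz -dc".
    "$@" < "$archive" | tar -C "$dest" -xf -
}
```

For example, pack_dir /inputdir output.tar.gz pigz --fast followed by unpack_dir output.tar.gz /restore pigz -dc would mirror the two commands above. Note that a plain sh pipeline only reports the exit status of its last command, so a failure in the first stage can go unnoticed.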
"My source disks typically read at 20 MiB/s, sometimes 100"
This sounds like you are actually bottlenecked on random access reads and not compression. If you have large files, you should defragment them. If you have many small files, you should make sure the disk is mounted with relatime, and you could also try fastar, which I have optimized for the case of many small files.
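One rough way to check whether reading or compressing is the slow part is to time the same tar stream twice, once through cat and once through the compressor. This is a sketch under assumptions: stream_seconds is a hypothetical helper name, and timing uses whole seconds from date, so it is only meaningful for runs that take a while.

```shell
#!/bin/sh
# Sketch only: stream_seconds is an illustrative name, not an existing tool.

# stream_seconds DIR COMMAND [ARGS...]
# Prints the whole seconds needed to read DIR with tar and push the stream
# through COMMAND. Use "cat" as COMMAND to measure reading alone.
stream_seconds() {
    dir=$1; shift
    start=$(date +%s)
    # tar writes into a pipe rather than straight to /dev/null, because
    # GNU tar is known to skip reading file data when the archive itself
    # is /dev/null.
    tar -C "$dir" -cf - . | "$@" > /dev/null
    end=$(date +%s)
    echo $((end - start))
}
```

If stream_seconds /inputdir cat and stream_seconds /inputdir pigz --fast print similar numbers, the disk is the limit and a faster compressor will not help. Files read in the first run may be served from the page cache in the second, so compare on data larger than RAM or on different directories.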
Nemo
Updated on September 18, 2022
Comments
-
Nemo over 1 year
In short: can the Deflate compression be used only with the zip format (-tzip) in 7zip?
I want to archive a big directory (hundreds of GiB) from one disk to another, while keeping I/O speed the same or better than without compression.
I like the 7z format for a variety of reasons, but LZMA and Bzip2 compressions are too slow even with -mx=1. I've tried 7z a -mm=Zip -mx=1 -mmt=4 (and -mm=GZip, which uses Deflate too), but I get an argument error after the file scanning phase. http://7zip.bugaco.com/7zip/MANUAL/switches/method.htm
My typical solution would be tar with .tar.lzo (LZOP), which easily reaches 100 MiB/s single-threaded at the default compression level; or .tar.gz with GZIP=-1. A very fast compressor is lbzip2 with multithreading, but it cannot be run from tar itself.
My source disks typically read at 20 MiB/s, sometimes 100 (with files several MiB big); the target writes at up to 80 MiB/s. So this is the speed the compressor should have, ideally even when single-threaded. Up to 8 cores and 16 GB RAM are available.
-
Nemo almost 7 years
No, my bottleneck is usually not I/O (except with lzop, which is much faster than I/O). When I say "20, sometimes 100", I mean that some disks consistently read at just 20, while a few others I have are faster. Currently I'm piping to lbzip2, or lzop when I/O is too fast for lbzip2. Thanks for mentioning fastar, I'll try it with some disk.
-
Nemo almost 7 years
Since your READMEs mention ext4, perhaps I should have said that my disks are mostly external NTFS disks...