Fast compression in 7z format (like zip or gzip)

Solution 1

You could try the 7-Zip Zstandard version. This fork supports additional codecs which are very fast for compression and decompression.

Here is a short summary of the codecs used:

  1. LZ4 - fastest compression / decompression, but a low compression ratio

  2. Lizard / LZ5 - better ratio than LZ4 and often faster than LZ4 on decompression, but compression is a bit slower

  3. Brotli and Zstandard - zstd is often a bit faster than Brotli, but for text content Brotli may be a bit better ;)

Threading is supported by all 5 codecs, up to 256 threads currently.

Run it like:

7z a archiv.7z -m0=lz5 -mx1 -mmt=4

7z a archiv.7z -m0=zstd -mx1 -mmt=4

7z a archiv.7z -m0=brotli -mx1 -mmt=1

through

7z a archiv.7z -m0=brotli -mx1 -mmt=256

And so on...
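To compare the codecs on your own data, the commands above can be wrapped in a small loop. This is only a sketch: it assumes the 7z on your PATH is the 7-Zip Zstandard fork, and the function name try_codecs, the SEVENZIP variable and the test-CODEC.7z file names are made up for illustration.

```shell
# Assumption: "7z" on PATH is the 7-Zip Zstandard fork; a stock 7-Zip
# build may reject -m0=lz5, -m0=zstd and -m0=brotli.
SEVENZIP=${SEVENZIP:-7z}

# try_codecs SRC_DIR OUT_DIR: build one archive per codec at level 1 with
# 4 threads, printing the codec name before each run.
try_codecs() {
    for codec in lz5 zstd brotli; do
        echo "== $codec"
        "$SEVENZIP" a "$2/test-$codec.7z" "$1" -m0="$codec" -mx1 -mmt=4 || return 1
    done
}
```

Run it as try_codecs /inputdir /some/scratch/dir, prefixed with time if your shell supports it, and compare both the elapsed time and the resulting archive sizes.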

Solution 2

"A very fast compressor is lbzip2 with multithreading, but it cannot be run from tar itself."

In GNU tar you can specify which compressor to use with a flag. Examples: tar -I "zstd -T0" or tar --use-compress-program=pigz

If you want a fast single-threaded compressor you can use lz4.

But you don't have to use this flag; you can also pipe the output through a compressor of your choosing.

# create
tar -c /inputdir | pigz --fast > output.tar.gz
# decompress (-c writes to stdout so tar can read it; without it pigz unpacks the file in place)
pigz -dc input.tar.gz | tar -x
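If the same script has to run on machines where pigz may not be installed, the pipe approach can fall back to plain gzip, since both write the same format. A minimal sketch; the function names pick_gzip, pack and unpack are hypothetical helpers, not part of tar or pigz.

```shell
# Pick a gzip-compatible compressor: pigz (multithreaded) if installed,
# otherwise plain gzip.
pick_gzip() {
    if command -v pigz >/dev/null 2>&1; then
        echo pigz
    else
        echo gzip
    fi
}

# pack SRC_DIR OUT_FILE: archive the contents of SRC_DIR into OUT_FILE
# at the fastest compression level (-1).
pack() {
    tar -C "$1" -cf - . | "$(pick_gzip)" -1 > "$2"
}

# unpack IN_FILE DEST_DIR: -d decompresses, -c writes to stdout for tar.
unpack() {
    mkdir -p "$2" && "$(pick_gzip)" -dc "$1" | tar -C "$2" -xf -
}
```

For example, pack /inputdir /target/output.tar.gz followed later by unpack /target/output.tar.gz /restore. Keep OUT_FILE outside SRC_DIR so tar does not try to archive its own output.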

"My source disks typically read at 20 MiB/s, sometimes 100"

This sounds like you are actually bottlenecked on random-access reads and not on compression. If you have large files, you should defragment them. If you have many small files, you should make sure the disk is mounted with relatime, and you could also try fastar, which I have optimized for the case of many small files.
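One rough way to check where the bottleneck is: time how long tar alone takes to read the directory, then time the same read piped through a compressor. A sketch only; measure is a hypothetical helper, and date +%s gives whole seconds, so it is only meaningful on inputs that take a while to read.

```shell
# measure DIR [FILTER...]: stream DIR through tar and an optional filter
# command, then print "<bytes out> <elapsed seconds>".
# With no filter, cat is used, so the result is the raw read speed.
measure() {
    dir=$1; shift
    [ "$#" -gt 0 ] || set -- cat
    start=$(date +%s)
    bytes=$(tar -C "$dir" -cf - . | "$@" | wc -c)
    end=$(date +%s)
    echo "$((bytes)) $((end - start))"
}
```

Compare measure /inputdir with measure /inputdir gzip -1 (or any other compressor that writes to stdout). If both take about the same time, the disk is the limit and a faster compressor will not help. Note that a second run over the same files may be served from the page cache and look much faster than the disk really is.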

Author by

Nemo

Updated on September 18, 2022

Comments

  • Nemo
    Nemo over 1 year

    In short: can the Deflate compression be used only with the zip format (-tzip) in 7zip?


    I want to archive a big directory (hundreds of GiB) from one disk to another, while keeping I/O speed the same as or better than without compression.

    I like the 7z format for a variety of reasons, but LZMA and Bzip2 compressions are too slow even with -mx=1. I've tried 7z a -mm=Zip -mx=1 -mmt=4 (and -mm=GZip which uses Deflate too), but I get an argument error after the file scanning phase. http://7zip.bugaco.com/7zip/MANUAL/switches/method.htm

    My typical solution would be tar with .tar.lzo (LZOP), which easily reaches 100 MiB/s single-threaded at the default compression level; or .tar.gz with GZIP=-1. A very fast compressor is lbzip2 with multithreading, but it cannot be run from tar itself.

    My source disks typically read at 20 MiB/s, sometimes 100 (with files several MiB big); the target writes at up to 80 MiB/s. So this is the speed the compressor should have, ideally even when single-threaded. Up to 8 cores and 16 GB RAM are available.

  • Nemo
    Nemo almost 7 years
    No, my bottleneck is usually not I/O (except with lzop, which is much faster than I/O). When I say "20, sometimes 100", I mean that some disks consistently read at just 20, while a few others I have are faster. Currently I'm piping to lbzip2, or lzop when I/O is too fast for lbzip2. Thanks for mentioning fastar, I'll try it with some disk.
  • Nemo
    Nemo almost 7 years
    Since your READMEs mention ext4, perhaps I should have said that my disks are mostly external NTFS disks...