Which archiving method is better for compressing text files on Linux?

Solution 1

Normally, bz2 has a better compression ratio than gz, combined with better recoverability features.

OTOH, gz is faster.

xz is said to be even better than bz2, but I don't know the timing behaviour.
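
If you want to check this on your own logs rather than rely on rules of thumb, a rough comparison can be sketched like this. It is only a minimal sketch: compare_compressors is a made-up helper name, it assumes a POSIX shell with date +%s available, it uses each tool's default level, and it skips any tool that is not installed. Timing is in whole seconds, so use a reasonably large file.

```shell
#!/bin/sh
# Compare gzip, bzip2 and xz on one file: compressed size and wall-clock time.
# compare_compressors is a hypothetical helper name, not part of any tool.
compare_compressors() {
    file=$1
    printf '%-8s %12s %8s\n' tool bytes seconds
    # Uncompressed size, as the baseline row.
    printf '%-8s %12s %8s\n' none "$(( $(wc -c < "$file") ))" -
    for tool in gzip bzip2 xz; do
        # Skip tools that are not installed on this machine.
        command -v "$tool" >/dev/null 2>&1 || continue
        start=$(date +%s)
        # -c writes to stdout, so the input file is left untouched.
        size=$(( $("$tool" -c "$file" | wc -c) ))
        end=$(date +%s)
        printf '%-8s %12s %8s\n' "$tool" "$size" "$((end - start))"
    done
}
```

Call it as compare_compressors followed by the path of a representative log file. The numbers depend heavily on the input, so treat them as a guide for your own data only.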

Solution 2

maximumcompression.com was last updated in June 2011 (this answer was updated in October 2015).
That website therefore does not mention
the current world-champion text compressor:

      cmix

Competitions/Benchmarks:

Details:
Byron Knoll has been actively developing cmix as libre software (GPL) since 2013, based on the book Data Compression Explained by Matt Mahoney. Matt Mahoney also maintains some of the above benchmarks and offers ZPAQ (WP), a command-line incremental archiver.


If you prefer a more standard tool (requiring less RAM) I recommend:

      lrzip

lrzip is an evolution of rzip by Con Kolivas.
lrzip stands for two names: Long Range ZIP and Lzma RZIP.
lrzip is often better than xz (another popular compression tool).
Alexander Riccio also recommends lrzip.


My favorite is:

      zpaq

The "archiver expert", Matt Mahoney, has worked intensively on PAQ algorithms for ten years, and zpaq provides the best compromise between CPU/memory resources and compression level.

However, the latest zpaq version is rarely packaged for recent distros :-(
I always compile it from source when I have a new machine and need a very good compressor: https://github.com/zpaq/zpaq

git clone https://github.com/zpaq/zpaq
cd zpaq
g++ -O3 -march=native -Dunix zpaq.cpp libzpaq.cpp -pthread -o zpaq

Solution 3

Maybe you could have a look at these benchmarks, especially the part that tests log file compression.

Solution 4

I ran a benchmark compressing the following:
a 204 MB folder (with 1,600 HTML files)
Results:

7zip =>     2.38 MB
winrar =>   49.5 MB
zip =>      50.8 MB
gzip =>     51.9 MB

So 7zip is the best among them. You can get it from here:
http://www.7-zip.org/
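
To make the gap easier to read, the sizes above can be expressed as a percentage of the 204 MB input. This is just arithmetic on the figures already listed, sketched with awk; percent_of_original is a made-up helper name.

```shell
#!/bin/sh
# Print each benchmark result as a percentage of the original size.
# percent_of_original is a hypothetical helper name.
# Input lines: "<tool> <compressed size>", in the same unit as the argument.
percent_of_original() {
    awk -v original="$1" '{ printf "%-8s %6.2f%%\n", $1, 100 * $2 / original }'
}

# The figures from the benchmark above, in MB.
percent_of_original 204 <<'EOF'
7zip   2.38
winrar 49.5
zip    50.8
gzip   51.9
EOF
```

With these figures 7zip ends up at a little over 1% of the original size, while the other three land at roughly 24 to 25%.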

Author: user710818

Updated on September 18, 2022

Comments

  • user710818 (over 1 year)

    In my application I need to compress logs, which are text files.

    It seems that bzip2 and gzip have the same compression ratio.

    Is that correct?

    • Admin (over 12 years)
      xz (from xz-utils, or 7z from p7zip; it is very similar to lzma) is the best. bzip2 is better than gzip.
  • osgx (over 12 years)
    xz is slower than bzip2.
  • Tebe (over 7 years)
    xz is not just slower, but much slower. A 300 MB file took about 30 seconds for bzip2 to compress. I killed xz after it had been compressing for longer than 5 minutes.
  • glglgl (over 7 years)
    @Копать_Шо_я_нашел I think it depends heavily on the compression level you choose. With -1, it is not so very slow, but with the default settings, it tends to be quite slow.
  • Rumplin (about 6 years)
    Link does not work.
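
The point in the comments about the compression level can be checked with a small sketch like the one below. time_levels is a made-up helper name; it assumes a POSIX shell with date +%s, and relies on the -1 (fastest) to -9 (smallest) level flags that gzip, bzip2 and xz all accept.

```shell
#!/bin/sh
# Time one compressor at several levels on the same file.
# time_levels is a hypothetical helper name.
# Usage: time_levels TOOL FILE LEVEL...
time_levels() {
    tool=$1
    file=$2
    shift 2
    # Bail out early if the requested tool is not installed.
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool not installed" >&2; return 1; }
    for level in "$@"; do
        start=$(date +%s)
        # -c writes to stdout, so the input file is left untouched.
        size=$(( $("$tool" "-$level" -c "$file" | wc -c) ))
        end=$(date +%s)
        printf '%s -%s: %s bytes, %s s\n' "$tool" "$level" "$size" "$((end - start))"
    done
}
```

For example, time_levels xz big.log 1 6 9 (big.log being a placeholder file name) compares the fastest level, the xz default of -6, and the maximum. Timing is in whole seconds, so it only tells you something on large files.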