Pros and cons of bzip vs gzip?

Solution 1

Gzip and bzip2, as well as xz and lzop, are functionally equivalent: each compresses a single file or stream. (There once was a bzip, but it seems to have completely vanished off the face of the world.) Other common compression formats are zip, rar and 7z; these three do both compression and archiving (packing multiple files into one). Here are some typical ratings in terms of speed, availability and compression ratio (these ratings are somewhat subjective, so don't take them as gospel):

decompression speed (fast > slow): lzop > gzip, zip > xz > 7z > rar > bzip2
compression speed (fast > slow): lzop > gzip, zip > xz > bzip2 > 7z > rar
compression ratio (better > worse): xz > 7z > rar, bzip2 > gzip > zip > lzop
availability (unix): gzip > bzip2 > xz > lzop > zip > 7z > rar
availability (windows): zip > rar > 7z > gzip > bzip2, lzop, xz

As you can see, there isn't a clear winner. If you want to rely on programs that are likely to be installed already, use zip on Windows (or if possible, self-extracting archives, as Windows doesn't ship with any of these) and gzip on unix. If you want maximum compression, use 7z or xz.

Non-Unix native formats (zip, rar, 7z) don't preserve all Unix metadata (ownership, permissions). If you need that, use compressed tar.
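
As a rough sketch of the compressed-tar route, Python's standard tarfile module can write a .tar.gz, .tar.bz2 or .tar.xz and records each member's permission bits. The helper names pack and list_modes, and the paths in the demo, are made up for illustration.

```python
import os
import tarfile
import tempfile


def pack(source_dir: str, archive_path: str, mode: str = "w:gz") -> None:
    """Write source_dir into a compressed tar archive.

    mode "w:gz" uses gzip, "w:bz2" uses bzip2, "w:xz" uses xz.
    """
    with tarfile.open(archive_path, mode) as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))


def list_modes(archive_path: str) -> dict:
    """Return {member name: permission bits} as recorded in the archive."""
    # "r:*" lets tarfile detect the compression on its own.
    with tarfile.open(archive_path, "r:*") as tar:
        return {member.name: member.mode & 0o777 for member in tar.getmembers()}


if __name__ == "__main__":
    # Hypothetical demo tree: one script with mode 750 inside a temp directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "proj")
        os.mkdir(src)
        script = os.path.join(src, "run.sh")
        with open(script, "w") as fh:
            fh.write("echo hi\n")
        os.chmod(script, 0o750)
        archive = os.path.join(tmp, "proj.tar.bz2")
        pack(src, archive, "w:bz2")
        print({name: oct(mode) for name, mode in list_modes(archive).items()})
```

Ownership is recorded in the archive as well, but restoring it on extraction normally requires root.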

Rar also has the downside that, as far as I know, there is no open source software that creates rar archives or that can unpack all rar archives. The other formats have free implementations and no (serious) patent claims.

Solution 2

As far as I can tell, gzip is faster overall, while bzip2 generally produces better (smaller) output.
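
One way to check this on your own data is a small timing script. The sketch below uses the gzip, bz2 and lzma modules from Python's standard library at their default settings; the function name compare_codecs and the sample input are assumptions made for illustration, and results will vary with the data and the machine.

```python
import bz2
import gzip
import lzma
import time


def compare_codecs(data: bytes) -> dict:
    """Return {name: (compressed_size, seconds)} for three stdlib codecs."""
    codecs = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        "xz": lzma.compress,  # lzma.compress writes the .xz format by default
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        compressed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (len(compressed), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical sample input: highly repetitive text. Real files behave differently.
    sample = b"The quick brown fox jumps over the lazy dog.\n" * 20000
    for name, (size, seconds) in compare_codecs(sample).items():
        print(f"{name:6s} {size:8d} bytes  {seconds:.4f} s")
```

Replace the sample with bytes read from a file you actually care about before drawing conclusions.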

Solution 3

The algorithms have different time, memory, and space tradeoffs. Bear in mind these algorithms were written quite a while back, and your smartphone has many times more CPU power than desktops of those days.

Your pick is between universality (.gz) and a bit more compression (.bz2). Only you can say which you care about more.

One advantage of .gz is that it can compress a stream, that is, a sequence where you can't look back at earlier data. This makes it the official compressor of HTTP streams. I needed to use gzip once because of that, but it's unlikely you'll need to think about it.
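
To make the streaming idea concrete, here is a minimal sketch using Python's zlib module, which can emit gzip-format output one chunk at a time without ever seeing the whole input. The function name gzip_stream is invented for this example. The last lines of the demo also show the related property raised in the comments: two gzip outputs placed back to back decompress to the two inputs joined together.

```python
import gzip
import zlib


def gzip_stream(chunks):
    """Yield gzip-compressed bytes for an iterable of byte chunks, in one pass."""
    # wbits=31 asks zlib to wrap the deflate data in a gzip header and trailer.
    compressor = zlib.compressobj(wbits=31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer input and return nothing yet
            yield out
    yield compressor.flush()


if __name__ == "__main__":
    # Chunks arrive one at a time, as they would from a socket or pipe.
    compressed = b"".join(gzip_stream([b"first chunk, ", b"second chunk"]))
    print(gzip.decompress(compressed))

    # Concatenated gzip members decompress to the concatenated inputs.
    joined = gzip.compress(b"hello ") + gzip.compress(b"world")
    print(gzip.decompress(joined))
```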

Solution 4

Here is a list of sites that test compression algorithms. To find just bzip and gzip you will have to do some digging, but most sites list the characteristics of each algorithm, so you can compare what is important to you: size (compression ratio), time, memory, CPU.
http://www.maximumcompression.com/benchmarks/benchmarks.php

Solution 5

In my experience, bzip2 has consistently offered better compression ratios than gzip. Also, when 7-Zip is used as the archive manager with the bzip2 algorithm, it can make use of multi-core processors.
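
The reason bzip2 lends itself to multiple cores is that it works on independent blocks. The sketch below is not how 7-Zip does it internally; it is an illustration, under that assumption, that compresses fixed-size chunks in a thread pool and joins the resulting bzip2 streams, which Python's bz2.decompress reads back as one multi-stream file. The function name parallel_bz2 and its parameters are made up for the example.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_bz2(data: bytes, chunk_size: int = 900_000, workers: int = 4) -> bytes:
    """Compress data as a series of independent bzip2 streams, joined together."""
    # Split the input; an empty input still yields one (empty) chunk so the
    # result is always a valid bzip2 file.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Assumption: threads are enough here; switch to ProcessPoolExecutor if
    # compression does not run in parallel on your Python build.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(bz2.compress, chunks))  # map keeps input order
    return b"".join(parts)


if __name__ == "__main__":
    payload = bytes(range(256)) * 10000  # hypothetical 2.5 MB test input
    packed = parallel_bz2(payload)
    print(len(payload), "->", len(packed), "bytes")
    print("round trip ok:", bz2.decompress(packed) == payload)
```

Compressing chunks separately can cost a little compression ratio compared with a single stream, so whether it pays off depends on the data and the number of cores.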

Author: ripper234

Updated on September 17, 2022

Comments

  • ripper234
    ripper234 over 1 year

    I've known gzip for years, recently I saw bzip being used at work. Are they basically equivalent, or are there significant pros and cons to one of them over the other?

    • Angry 84
      Angry 84 over 8 years
      While this is an old question with a valid and correct answer, I would like to point people to this google result: tukaani.org/lzma/benchmarks.html as it does break it down further
    • Joseph
      Joseph over 7 years
      Isn't bzip for compression and gzip for archival?
    • ripper234
      ripper234 over 7 years
      @juniorRubyist source?
    • Joseph
      Joseph over 7 years
      I just heard that. I forgot where.
    • neverMind9
      neverMind9 about 5 years
      No mention of random access? stackoverflow.com/questions/14225751/…
  • Dentrasi
    Dentrasi over 13 years
    Also, gzip seems to be slightly better supported, especially on Windows.
  • whitequark
    whitequark over 13 years
    @Dentrasi: winrar/7zip support both, what's the problem?
  • Lie Ryan
    Lie Ryan over 13 years
    As far as I can tell, all versions of Windows since XP can open zip files natively using the file explorer.
  • new123456
    new123456 almost 13 years
    bzip2 is less available than gzip? What UNIX systems don't come with bzip2?
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' almost 13 years
    @new123456 On OpenBSD, gzip is in the base system but bzip2 has to be installed from a package. Many *WRT routers include gzip but not bzip2.
  • Matthew
    Matthew over 11 years
    @whitequark: being widely supported is mostly important for unix since users may not have root access and must work with what is already installed. Also applies to Windows environments where the user does not have admin access (schools/libraries/etc).
  • whitequark
    whitequark over 11 years
    @Matthew, you don't need admin rights to use a lot of ported free software, including 7zip.
  • shgnInc
    shgnInc over 9 years
    @Gilles, And What about pbzip?
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 9 years
    @shgnInc Less commonly available than bzip2. As for speed, it depends how many processors you have. Hmm, I should add xz.
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 8 years
    @mlainz Original research. This isn't Wikipedia.
  • IQAndreas
    IQAndreas about 8 years
    Do you have any statistics or sources to back that up?
  • Lie Ryan
    Lie Ryan about 8 years
    @IQAndreas: some benchmarks: 1, 2, 3
  • stommestack
    stommestack over 7 years
    unrar is the open source rar unpacking utility.
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 7 years
    @JopV. Last I looked, there were some options of the rar format that the open-source unrar didn't support. I don't remember what options these are but I have had rar archives in my hand that only worked with the closed-source version.
  • forest
    forest over 5 years
    "it seems to have completely vanished": Plain old bzip vanished because it was using the patented arithmetic coding. Because of the patent, it was re-designed to use Huffman coding instead. During this re-design, new features and improvements were added. The fundamental thing that makes it a unique compression algorithm though, the Burrows–Wheeler transform, stayed the same in both versions.
  • forest
    forest over 5 years
    Although bzip2 is often better, gzip usually pulls ahead for text compression.
  • Nick Chammas
    Nick Chammas over 4 years
    This is a major difference between gzip and bzip2 for those working with data processing tools like Apache Spark: bzip2 is splittable and gzip is not. This means that Spark can read a single bzip2 file using multiple concurrent tasks, whereas a gzipped file can only be read with a single task.
  • BallpointBen
    BallpointBen about 4 years
    Another way to phrase "gz can compress a stream" is that gz behaves well under concatenation: concat(gz(x), gz(y)) decompresses to concat(x, y), although it is not byte-for-byte equal to gz(concat(x, y)). IMO this is one of gz's most useful features.
  • BallpointBen
    BallpointBen about 4 years
    Unless I'm mistaken, 7z is an archive format, and LZMA is the compression algorithm commonly used to create it.
  • Admin
    Admin almost 2 years
    The link you referenced is dead.
  • Admin
    Admin almost 2 years
    @BallpointBen you hit the nail on its head. Couldn't have explained it any better.