Pros and cons of bzip vs gzip?
Solution 1
Gzip and bzip2, as well as xz and lzop, are functionally equivalent. (There once was a bzip, but it seems to have completely vanished off the face of the world.) Other common compression formats are zip, rar and 7z; these three do both compression and archiving (packing multiple files into one). Here are some typical ratings in terms of speed, availability and typical compression ratio (note that these ratings are somewhat subjective, don't take them as gospel):
decompression speed (fast > slow): lzop > gzip, zip > xz > 7z > rar > bzip2
compression speed (fast > slow): lzop > gzip, zip > xz > bzip2 > 7z > rar
compression ratio (better > worse): xz > 7z > rar, bzip2 > gzip > zip > lzop
availability (unix): gzip > bzip2 > xz > lzop > zip > 7z > rar
availability (windows): zip > rar > 7z > gzip > bzip2, lzop, xz
As you can see, there isn't a clear winner. If you want to rely on programs that are likely to be installed already, use zip on Windows (or if possible, self-extracting archives, as Windows doesn't ship with any of these) and gzip on unix. If you want maximum compression, use 7z or xz.
Non-Unix native formats (zip, rar, 7z) don't preserve all Unix metadata (ownership, permissions). If you need that, use compressed tar.
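As a rough illustration of the compressed-tar point, here is a minimal Python sketch that packs one file into a .tar.gz and reads back the permission bits stored in the archive. Python, the function name and the file names are assumptions made for this example, not part of the answer. It assumes a POSIX system.

```python
import os
import tarfile
import tempfile


def roundtrip_mode(mode: int = 0o750) -> int:
    """Pack one file into a .tar.gz and return the mode recorded in the archive."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "script.sh")  # hypothetical file name
        with open(src, "w") as fh:
            fh.write("#!/bin/sh\necho hello\n")
        os.chmod(src, mode)

        archive = os.path.join(tmp, "backup.tar.gz")  # hypothetical archive name
        # "w:bz2" or "w:xz" select bzip2 or xz compression instead of gzip.
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(src, arcname="script.sh")

        with tarfile.open(archive, "r:gz") as tar:
            member = tar.getmember("script.sh")
            # TarInfo also carries uid, gid, uname, gname and mtime.
            return member.mode & 0o777
```

Calling roundtrip_mode(0o750) should return the same bits that were set on the file, since the tar header stores them independently of the compression layer.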
Rar also has the downside that, as far as I know, there is no open source software that creates rar archives or that can unpack all rar archives. The other formats have free implementations and no (serious) patent claims.
Solution 2
As far as I can tell, gzip is faster overall, while bzip produces smaller output overall.
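One way to check the speed-versus-size claim on your own data is a small timing script. The sketch below is only an illustration: it uses Python's standard gzip, bz2 and lzma modules at their default settings, and the compare function and the sample input are made up for this example. Results depend heavily on the data and the compression level.

```python
import bz2
import gzip
import lzma
import time


def compare(data: bytes) -> dict:
    """Return {name: (compressed_size, seconds)} for three stdlib codecs."""
    codecs = {
        "gzip": gzip.compress,   # library default level
        "bzip2": bz2.compress,   # library default level
        "xz": lzma.compress,     # library default preset
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical sample input; replace with bytes read from a real file.
    sample = b"The quick brown fox jumps over the lazy dog.\n" * 20000
    for name, (size, seconds) in compare(sample).items():
        print(f"{name:6s} {size:8d} bytes  {seconds * 1000:8.2f} ms")
```

Highly repetitive input like the sample above flatters every compressor, so run it on files that resemble what you actually plan to compress.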
Solution 3
The algorithms have different time, memory, space tradeoffs. Bear in mind these algorithms were written quite a while back and your smartphone has many times more CPU than desktops of those days.
Your pick is between universality (.gz) and a bit more compression (.bz2). Only you can say which you care about more.
One advantage of .gz is that it can compress a stream, a sequence where you can't look behind. This makes it the official compressor of HTTP streams. I needed to use gzip once because of that, but it's unlikely you'll need to think about it.
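To make the streaming point concrete, here is a minimal Python sketch that produces gzip-format output one chunk at a time, without ever seeking backwards. The gzip_stream name is made up for this example. Note that Python's bz2 module also offers an incremental BZ2Compressor, so this illustrates the gzip case and is not evidence that only gzip can work this way.

```python
import zlib


def gzip_stream(chunks):
    """Yield gzip-format output for an iterable of byte chunks in one pass."""
    # wbits=31 selects a gzip header and trailer instead of zlib framing.
    compressor = zlib.compressobj(wbits=31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer input and return nothing yet
            yield out
    # flush() emits any buffered data plus the gzip trailer.
    yield compressor.flush()
```

Joining the yielded pieces gives a valid .gz stream that gzip.decompress (or the gzip command-line tool) can read back.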
Solution 4
Here is a list of sites that test compression algorithms. To find just bzip and gzip you will have to do some digging, but most sites list the characteristics of each algorithm, so you can compare what is important to you: size (compression ratio), time, memory, CPU.
http://www.maximumcompression.com/benchmarks/benchmarks.php
Solution 5
In my experience, bzip has offered consistently better compression ratios than gzip. Plus, with 7zip as the archive manager and bzip as the algorithm, 7zip can make use of multi-core processors.
ripper234
Updated on September 17, 2022
Comments
-
ripper234 over 1 year
I've known gzip for years, recently I saw bzip being used at work. Are they basically equivalent, or are there significant pros and cons to one of them over the other?
-
Angry 84 over 8 years
While this is an old question with a valid and correct answer, I would like to point people to this Google result, as it does break it down further: tukaani.org/lzma/benchmarks.html
-
Joseph over 7 years
Isn't bzip for compression and gzip for archival?
-
ripper234 over 7 years
@juniorRubyist source?
-
Joseph over 7 years
I just heard that. I forgot where.
-
neverMind9 about 5 years
No mention of random access? stackoverflow.com/questions/14225751/…
-
-
Dentrasi over 13 years
Also, gzip seems to be slightly better supported, especially on Windows.
-
whitequark over 13 years
@Dentrasi: winrar/7zip support both, what's the problem?
-
Lie Ryan over 13 years
As far as I can tell, all versions of Windows since XP can open zip files natively using the file explorer.
-
new123456 almost 13 years
bzip2 is less available than gzip? What UNIX systems don't come with bzip2?
-
Gilles 'SO- stop being evil' almost 13 years
@new123456 On OpenBSD, gzip is in the base system but bzip2 has to be installed from a package. Many *WRT routers include gzip but not bzip2.
-
Matthew over 11 years
@whitequark: being widely supported is mostly important for unix since users may not have root access and must work with what is already installed. Also applies to Windows environments where the user does not have admin access (schools/libraries/etc).
-
whitequark over 11 years
@Matthew, you don't need admin rights to use a lot of ported free software, including 7zip.
-
shgnInc over 9 years
@Gilles, and what about pbzip?
-
Gilles 'SO- stop being evil' over 9 years
@shgnInc Less commonly available than bzip2. As for speed, it depends how many processors you have. Hmm, I should add xz.
-
Gilles 'SO- stop being evil' over 8 years
@mlainz Original research. This isn't Wikipedia.
-
IQAndreas about 8 years
Do you have any statistics or sources to back that up?
-
Lie Ryan about 8 years
-
stommestack over 7 years
unrar is the open source rar unpacking utility.
-
Gilles 'SO- stop being evil' over 7 years
@JopV. Last I looked, there were some options of the rar format that the open-source unrar didn't support. I don't remember what options these are but I have had rar archives in my hand that only worked with the closed-source version.
-
forest over 5 years
"it seems to have completely vanished": Plain old bzip vanished because it was using the patented arithmetic coding. Because of the patent, it was re-designed to use Huffman coding instead. During this re-design, new features and improvements were added. The fundamental thing that makes it a unique compression algorithm though, the Burrows–Wheeler transform, stayed the same in both versions.
-
forest over 5 years
Although bzip2 is often better, gzip usually pulls ahead for text compression.
-
Nick Chammas over 4 years
This is a major difference between gzip and bzip2 for those working with data processing tools like Apache Spark: bzip2 is splittable and gzip is not. This means that Spark can read a single bzip2 file using multiple concurrent tasks, whereas a gzipped file can only be read with a single task.
-
BallpointBen about 4 years
Another way to phrase "gz can compress a stream" is that gz is homomorphic under concatenation: gz(concat(x, y)) == concat(gz(x), gz(y)). IMO this is one of gz's most useful features.
-
BallpointBen about 4 years
Unless I'm mistaken, 7z is an archive format, and LZMA is the compression algorithm commonly used to create it.
-
Admin almost 2 years
The link you referenced is dead.
-
Admin almost 2 years
@BallpointBen you hit the nail on its head. Couldn't have explained it any better.
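The concatenation property mentioned in the comments can be tried out with a few lines of Python. This is only a sketch, and concat_members is a name made up for the example. Strictly speaking, the compressed bytes of the two sides are generally not identical; what holds is that decompressing the concatenated gzip members yields the concatenated input.

```python
import gzip


def concat_members(x: bytes, y: bytes) -> bytes:
    """Compress x and y separately and join the two gzip members."""
    return gzip.compress(x) + gzip.compress(y)


x, y = b"first part\n", b"second part\n"
joined = concat_members(x, y)

# Decompressing the joined members gives back the concatenated input.
assert gzip.decompress(joined) == x + y

# Compare against compressing x + y in one go; the byte streams
# themselves are not expected to match in general.
print(joined == gzip.compress(x + y))
```

This is why log files can be rotated, gzipped separately and later joined with a plain cat without recompressing anything.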