The fastest GZIP decompression library in .NET

Solution 1

I've had good performance with SevenZipLib for very large files, but I was using the native 7zip format and highly compressible content. If your content won't compress well, your throughput will differ greatly from the benchmarks you can find for these libraries.
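
To see how much the content itself drives these numbers, here is a minimal sketch (not from the original answer; the 10 MB buffer sizes and the fixed Random seed are arbitrary choices) that round-trips a highly compressible buffer and a random one through the built-in System.IO.Compression.GZipStream:

using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

class CompressionRatioDemo
{
    static void Main()
    {
        var zeros = new byte[10 * 1024 * 1024];   // compresses extremely well
        var random = new byte[10 * 1024 * 1024];  // barely compresses at all
        new Random(42).NextBytes(random);

        Measure("zeros ", zeros);
        Measure("random", random);
    }

    static void Measure(string label, byte[] data)
    {
        var compressed = new MemoryStream();
        var timer = Stopwatch.StartNew();
        using (var gz = new GZipStream(compressed, CompressionMode.Compress, true))
            gz.Write(data, 0, data.Length);
        var compressTime = timer.Elapsed;

        compressed.Position = 0; // Rewind before decompressing
        var decompressed = new MemoryStream(data.Length);
        timer.Restart();
        using (var gz = new GZipStream(compressed, CompressionMode.Decompress))
            gz.CopyTo(decompressed);
        timer.Stop();

        Console.WriteLine(label + ": " + data.Length.ToString("#,0")
            + " -> " + compressed.Length.ToString("#,0") + " bytes, compress "
            + compressTime + ", decompress " + timer.Elapsed);
    }
}

The point is simply that the same library can show very different throughput depending on how compressible the input is.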

Solution 2

I ran into problems with Microsoft's GZipStream implementation being unable to read certain gzip files, so I have been testing a few other libraries.

This is a basic test I adapted so you can run it, tweak it, and decide for yourself:

using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using NUnit.Framework;
using Ionic.Zlib;
using ICSharpCode.SharpZipLib.GZip;

namespace ZipTests
{
    [TestFixture]
    public class ZipTests
    {
        MemoryStream input, compressed, decompressed;
        Stream compressor;
        int inputSize;
        Stopwatch timer;

        public ZipTests()
        {
            string testFile = "TestFile.pdf";
            using(var file = File.OpenRead(testFile))
            {
                inputSize = (int)file.Length;
                Console.WriteLine("Reading " + inputSize + " from " + testFile);
                var ms = new MemoryStream(inputSize);
                file.CopyTo(ms); // Read() can return fewer bytes than requested, so copy the whole file
                ms.Position = 0;
                input = ms;
            }
            compressed = new MemoryStream();
        }

        void StartCompression()
        {
            Console.WriteLine("Using " + compressor.GetType() + ":");
            GC.Collect(2, GCCollectionMode.Forced); // Start fresh
            timer = Stopwatch.StartNew();
        }

        public void EndCompression()
        {
            timer.Stop();
            Console.WriteLine("  took " + timer.Elapsed
                + " to compress " + inputSize.ToString("#,0") + " bytes into "
                + compressed.Length.ToString("#,0"));
            decompressed = new MemoryStream(inputSize);
            compressed.Position = 0; // Rewind!
            timer.Restart();
        }

        public void AfterDecompression()
        {
            timer.Stop();
            Console.WriteLine("  then " + timer.Elapsed + " to decompress.");
            Assert.AreEqual(inputSize, decompressed.Length);
            Assert.AreEqual(input.GetBuffer(), decompressed.GetBuffer());
            input.Dispose();
            compressed.Dispose();
            decompressed.Dispose();
        }

        [Test]
        public void TestGZipStream()
        {
            compressor = new System.IO.Compression.GZipStream(compressed, System.IO.Compression.CompressionMode.Compress, true);
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();

            EndCompression();

            var decompressor = new System.IO.Compression.GZipStream(compressed, System.IO.Compression.CompressionMode.Decompress, true);
            decompressor.CopyTo(decompressed);

            AfterDecompression();
        }

        [Test]
        public void TestDotNetZip()
        {
            compressor = new Ionic.Zlib.GZipStream(compressed, Ionic.Zlib.CompressionMode.Compress, true);
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();

            EndCompression();

            var decompressor = new Ionic.Zlib.GZipStream(compressed,
                                    Ionic.Zlib.CompressionMode.Decompress, true);
            decompressor.CopyTo(decompressed);

            AfterDecompression();
        }

        [Test]
        public void TestSharpZlib()
        {
            compressor = new ICSharpCode.SharpZipLib.GZip.GZipOutputStream(compressed)
            { IsStreamOwner = false };
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();

            EndCompression();

            var decompressor = new ICSharpCode.SharpZipLib.GZip.GZipInputStream(compressed);
            decompressor.CopyTo(decompressed);

            AfterDecompression();
        }

        static void Main()
        {
            Console.WriteLine("Running CLR version " + Environment.Version +
                " on " + Environment.OSVersion);
            Assert.AreEqual(1,1); // Preload NUnit
            new ZipTests().TestGZipStream();
            new ZipTests().TestDotNetZip();
            new ZipTests().TestSharpZlib();
        }
    }
}

And the result on the system I am currently running (Mono on Linux) is as follows:

Running Mono CLR version 4.0.30319.1 on Unix 3.2.0.29
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using System.IO.Compression.GZipStream:
  took 00:00:03.3058572 to compress 37,711,561 bytes into 33,438,894
  then 00:00:00.5331546 to decompress.
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using Ionic.Zlib.GZipStream:
  took 00:00:08.9531478 to compress 37,711,561 bytes into 33,437,891
  then 00:00:01.8047543 to decompress.
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using ICSharpCode.SharpZipLib.GZip.GZipOutputStream:
  took 00:00:07.4982231 to compress 37,711,561 bytes into 33,431,962
  then 00:00:02.4157496 to decompress.

Be warned that this is Mono's GZIP implementation; Microsoft's version will give its own results (and, as I mentioned, it can't handle every gzip file you give it).

This is what I got on a Windows system:

Running CLR version 4.0.30319.1 on Microsoft Windows NT 5.1.2600 Service Pack 3
Reading 37711561 from TestFile.pdf
Using System.IO.Compression.GZipStream:
  took 00:00:03.3557061 to compress 37.711.561 bytes into 36.228.969
  then 00:00:00.7079438 to decompress.
Reading 37711561 from TestFile.pdf
Using Ionic.Zlib.GZipStream:
  took 00:00:23.4180958 to compress 37.711.561 bytes into 33.437.891
  then 00:00:03.5955664 to decompress.
Reading 37711561 from TestFile.pdf
Using ICSharpCode.SharpZipLib.GZip.GZipOutputStream:
  took 00:00:09.9157130 to compress 37.711.561 bytes into 33.431.962
  then 00:00:03.0983499 to decompress.

It is easy enough to add more tests...
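
For example, a test for the BCL's raw DeflateStream could be added to the fixture above (a sketch, not part of the original benchmark; DeflateStream skips the gzip header and CRC, so it is a useful lower-bound baseline rather than a producer of .gz files). Remember to also call it from Main:

[Test]
public void TestDeflateStream()
{
    // Same pattern as the other tests, but using raw deflate (no gzip framing)
    compressor = new System.IO.Compression.DeflateStream(compressed,
                        System.IO.Compression.CompressionMode.Compress, true);
    StartCompression();
    compressor.Write(input.GetBuffer(), 0, inputSize);
    compressor.Close();

    EndCompression();

    var decompressor = new System.IO.Compression.DeflateStream(compressed,
                            System.IO.Compression.CompressionMode.Decompress, true);
    decompressor.CopyTo(decompressed);

    AfterDecompression();
}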

Solution 3

Compression benchmarks vary based on the size of the streams being compressed and on their precise content. If this is a particularly important performance bottleneck for you, it is worth your time to write a sample app using each library and to run tests with your real files.
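
A minimal starting point for such a sample app could look like the sketch below (not from the original answer; "sample.gz" is a placeholder path, and only the built-in GZipStream is shown; swap in the other libraries' stream types to compare them):

using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

class DecompressBenchmark
{
    static void Main(string[] args)
    {
        // Placeholder path; point this at one of your real files
        string path = args.Length > 0 ? args[0] : "sample.gz";
        byte[] compressed = File.ReadAllBytes(path);

        var output = new MemoryStream();
        var timer = Stopwatch.StartNew();
        using (var gz = new GZipStream(new MemoryStream(compressed), CompressionMode.Decompress))
            gz.CopyTo(output);
        timer.Stop();

        double mb = output.Length / (1024.0 * 1024.0);
        Console.WriteLine("Decompressed " + compressed.Length.ToString("#,0")
            + " bytes into " + output.Length.ToString("#,0")
            + " in " + timer.Elapsed
            + " (" + (mb / timer.Elapsed.TotalSeconds).ToString("0.0") + " MB/s)");
    }
}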


Comments

  • Rudiger (almost 2 years ago):

    Which .NET library has the fastest decompression performance (in terms of throughput)?

    There are quite a few libraries out there...

    ...and I expect there are more I haven't listed.

    Has anyone seen a benchmark of the throughput performance of these GZIP libraries? I'm interested in decompression throughput, but I'd like to see the results for compression too.

  • Nate (almost 14 years ago):
    Exactly; compression performance varies based on the type of data in question.
  • Clinton Ward (over 11 years ago):
    Can you try SevenZipSharp? sevenzipsharp.codeplex.com
  • The Dag (almost 10 years ago):
    System.IO.Compression looks pretty good if we just look at the times, but less so if we look at the produced sizes. Then again, both measurements could be atypical and a result of your particular input. This doesn't really tell us much...