Preferred way to use Java ZipOutputStream and BufferedOutputStream

82,144

Solution 1

You should always wrap the BufferedOutputStream with the ZipOutputStream, never the other way around. See the code below:

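import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

FileOutputStream fos = new FileOutputStream("hello-world.zip");
BufferedOutputStream bos = new BufferedOutputStream(fos);
ZipOutputStream zos = new ZipOutputStream(bos);

try {
    for (int i = 0; i < 10; i++) {
        // putNextEntry() is not available on BufferedOutputStream
        zos.putNextEntry(new ZipEntry("hello-world." + i + ".txt"));
        // an explicit charset avoids depending on the platform default
        zos.write("Hello World!".getBytes(StandardCharsets.UTF_8));
        // closeEntry() is not available on BufferedOutputStream
        zos.closeEntry();
    }
}
finally {
    // closing the outermost stream also flushes and closes bos and fos
    zos.close();
}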

As the comments say, the putNextEntry() and closeEntry() methods are not available on BufferedOutputStream. If you call write() on a ZipOutputStream without first calling putNextEntry(), it throws java.util.zip.ZipException: no current ZIP entry.
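The exception described above can be reproduced with a small sketch. The class name NoEntryDemo and the in-memory ByteArrayOutputStream target are assumptions made for illustration; they are not part of the original answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: writes to a ZipOutputStream without opening an entry.
class NoEntryDemo {

    /** Returns the message of the ZipException thrown by write(), or null if none was thrown. */
    static String writeWithoutEntry() {
        try (ZipOutputStream zos = new ZipOutputStream(new ByteArrayOutputStream())) {
            // No putNextEntry() call before this write, so there is no current entry.
            zos.write("Hello World!".getBytes(StandardCharsets.UTF_8));
            return null;
        } catch (ZipException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeWithoutEntry());
    }
}
```

Writing to memory keeps the sketch free of file-system side effects; the behavior of write() does not depend on the kind of stream underneath.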

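For the sake of completeness, it is worth noting that the finally clause only calls close() on the ZipOutputStream. This is sufficient because, by convention, the built-in Java output stream wrappers propagate closing: closing the outermost stream flushes it and then closes the stream it wraps, all the way down to the FileOutputStream.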
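A minimal sketch of that close propagation follows. The TrackingSink class is a hypothetical stand-in for the FileOutputStream that records whether close() reached it; it is introduced here only for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: checks that closing the outer stream closes the inner ones.
class ClosePropagationDemo {

    /** Hypothetical in-memory sink that remembers whether close() was called on it. */
    static class TrackingSink extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Writes one entry, closes only the ZipOutputStream, and returns the sink for inspection. */
    static TrackingSink writeAndCloseOuterOnly() {
        TrackingSink sink = new TrackingSink();
        try {
            ZipOutputStream zos = new ZipOutputStream(new BufferedOutputStream(sink));
            zos.putNextEntry(new ZipEntry("hello.txt"));
            zos.write(new byte[] {72, 105});
            zos.closeEntry();
            zos.close(); // the only explicit close() call
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink;
    }

    public static void main(String[] args) {
        TrackingSink sink = writeAndCloseOuterOnly();
        System.out.println("sink closed: " + sink.closed + ", bytes received: " + sink.size());
    }
}
```

The sink also receives the buffered bytes, since close() on the outer stream flushes the BufferedOutputStream before closing it.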

EDIT

I just tested it the other way around. It turns out that wrapping a ZipOutputStream in a BufferedOutputStream and then calling only write() on it (without creating or closing entries) does not throw a ZipException. Instead, the resulting ZIP file is corrupt and contains no entries.
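One way to check a result like this is to read the archive back and count its entries. The following is a sketch under assumptions: the class ZipEntryCountDemo and its helper methods are hypothetical, and the archive is built in memory in the correct wrapping order so that the count can be compared against a known value.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: builds an archive in memory and counts the entries it contains.
class ZipEntryCountDemo {

    /** Builds an archive with the given number of small text entries. */
    static byte[] buildArchive(int entries) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(new BufferedOutputStream(sink))) {
            for (int i = 0; i < entries; i++) {
                zos.putNextEntry(new ZipEntry("hello-world." + i + ".txt"));
                zos.write("Hello World!".getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Counts the entries that ZipInputStream can find in the given bytes. */
    static int countEntries(byte[] zipBytes) {
        int count = 0;
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            while (zis.getNextEntry() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countEntries(buildArchive(10)));
    }
}
```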

Solution 2

You should use:

ZipOutputStream out = new ZipOutputStream(new BufferedOutputStream(dest));

because you want to buffer the writes to the disk. Writing a few large blocks is much more efficient than writing many small ones.


This

new BufferedOutputStream(new ZipOutputStream(dest));

would buffer the data before it is compressed. But compression happens entirely in memory and gains little from buffering, because many small memory accesses take about the same time as a few large ones. In memory, the time needed is generally proportional to the number of bytes read or written.
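The effect of buffering between the ZipOutputStream and its target can be made visible by counting how many write calls reach the target. This is only a sketch: CountingSink is a hypothetical stand-in for the file stream, and the exact counts depend on the JDK version and buffer size, so the sketch prints them rather than assuming particular values.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: compares write calls reaching the sink with and without buffering.
class WriteCountDemo {

    /** Hypothetical sink that discards data and counts the write calls it receives. */
    static class CountingSink extends OutputStream {
        int calls = 0;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    /** Writes ten small entries and returns the number of write calls that reached the sink. */
    static int countWrites(boolean buffered) {
        CountingSink sink = new CountingSink();
        OutputStream target = buffered ? new BufferedOutputStream(sink) : sink;
        try (ZipOutputStream zos = new ZipOutputStream(target)) {
            for (int i = 0; i < 10; i++) {
                zos.putNextEntry(new ZipEntry("hello-world." + i + ".txt"));
                zos.write("Hello World!".getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }

    public static void main(String[] args) {
        System.out.println("with BufferedOutputStream:    " + countWrites(true));
        System.out.println("without BufferedOutputStream: " + countWrites(false));
    }
}
```

With a real FileOutputStream as the target, each of those calls would be a separate write to the operating system, which is where the buffer pays off.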

As mentioned in the comments:

The methods that ZipOutputStream adds on top of a plain output stream, such as putNextEntry and closeEntry, would also not be available on the BufferedOutputStream.

Author by

jjathman

Software developer from Minnesota. @jjathman

Updated on May 14, 2020

Comments

  • jjathman
    jjathman almost 4 years

    In Java does it matter whether I instantiate a ZipOutputStream first, or the BufferedOutputStream first? Example:

    FileOutputStream dest = new FileOutputStream(file);
    ZipOutputStream zip = new ZipOutputStream(new BufferedOutputStream(dest));
    
    // use zip output stream to write to
    

    Or:

    FileOutputStream dest = new FileOutputStream(file);
    BufferedOutputStream out = new BufferedOutputStream(new ZipOutputStream(dest));
    
    // use buffered stream to write to
    

    In my non-scientific timings I can't seem to tell much of a difference here. I can't see anything in the Java API that says if one of these ways is necessary or preferred. Any advice? It seems like compressing the output first and then buffering it for writes would be more efficient.