Appending files to a zip file with Java

92,418

Solution 1

Java 7 introduced the Zip File System provider, which allows adding and changing files inside a zip (jar, war) archive without manual repackaging.

We can write directly to files inside a zip file, as in the following example:

Map<String, String> env = new HashMap<>(); 
env.put("create", "true");
Path path = Paths.get("test.zip");
URI uri = URI.create("jar:" + path.toUri());
try (FileSystem fs = FileSystems.newFileSystem(uri, env))
{
    Path nf = fs.getPath("new.txt");
    try (Writer writer = Files.newBufferedWriter(nf, StandardCharsets.UTF_8, StandardOpenOption.CREATE)) {
        writer.write("hello");
    }
}
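To add files that already exist on disk, the same file system can be combined with Files.copy, as one of the comments suggests. The following is a minimal sketch of that idea, assuming Java 7 or later; the class name ZipFsAppend and the method addFiles are made up for illustration and are not part of any library.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Map;

public class ZipFsAppend {

    /**
     * Copies each file into the root of the zip archive, replacing any
     * entry that already has the same name. (Hypothetical helper.)
     */
    public static void addFiles(Path zip, Path... files) throws IOException {
        Map<String, String> env = new HashMap<>();
        env.put("create", "true"); // create the archive if it does not exist yet
        URI uri = URI.create("jar:" + zip.toUri());
        try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
            for (Path file : files) {
                Path target = fs.getPath("/" + file.getFileName());
                Files.copy(file, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } // closing the file system writes the changes back to the archive
    }
}
```

A call such as ZipFsAppend.addFiles(Paths.get("test.zip"), Paths.get("new.txt")) would then add new.txt to test.zip, creating the archive first if needed.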

Solution 2

As others mentioned, it's not possible to append content to an existing zip (or war). However, it's possible to create a new zip on the fly without temporarily writing extracted content to disk. It's hard to guess how much faster this will be, but it's the fastest you can get (at least as far as I know) with standard Java. As mentioned by Carlos Tasada, SevenZipJBindings might squeeze out some extra seconds for you, but porting this approach to SevenZipJBindings will still be faster than using temporary files with the same library.

Here's some code that writes the contents of an existing zip (war.zip) and appends an extra file (answer.txt) to a new zip (append.zip). All it takes is Java 5 or later, no extra libraries needed.

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class Main {

    // 4MB buffer
    private static final byte[] BUFFER = new byte[4096 * 1024];

    /**
     * copy input to output stream - available in several StreamUtils or Streams classes 
     */    
    public static void copy(InputStream input, OutputStream output) throws IOException {
        int bytesRead;
        while ((bytesRead = input.read(BUFFER))!= -1) {
            output.write(BUFFER, 0, bytesRead);
        }
    }

    public static void main(String[] args) throws Exception {
        // read war.zip and write to append.zip
        ZipFile war = new ZipFile("war.zip");
        ZipOutputStream append = new ZipOutputStream(new FileOutputStream("append.zip"));

        // first, copy contents from existing war
        Enumeration<? extends ZipEntry> entries = war.entries();
        while (entries.hasMoreElements()) {
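            ZipEntry e = entries.nextElement();
            System.out.println("copy: " + e.getName());
            // use a fresh entry: reusing e carries over its compressed size, which
            // can fail with "ZipException: invalid entry compressed size"
            ZipEntry entryCopy = new ZipEntry(e.getName());
            entryCopy.setTime(e.getTime());
            append.putNextEntry(entryCopy);
            if (!e.isDirectory()) {
                InputStream in = war.getInputStream(e);
                try {
                    copy(in, append);
                } finally {
                    in.close();
                }
            }
            append.closeEntry();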
        }

        // now append some extra content
        ZipEntry e = new ZipEntry("answer.txt");
        System.out.println("append: " + e.getName());
        append.putNextEntry(e);
        append.write("42\n".getBytes("UTF-8")); // explicit charset instead of the platform default
        append.closeEntry();

        // close
        war.close();
        append.close();
    }
}

Solution 3

I had a similar requirement some time back - but it was for reading and writing zip archives (.war format should be similar). I tried doing it with the existing Java Zip streams but found the writing part cumbersome - especially when directories were involved.

I recommend trying out the TrueZIP library (open source, Apache-style license), which exposes any archive as a virtual file system that you can read from and write to like a normal filesystem. It worked like a charm for me and greatly simplified my development.

Solution 4

You could use this bit of code I wrote

public static void addFilesToZip(File source, File[] files)
{
    try
    {

        File tmpZip = File.createTempFile(source.getName(), null);
        tmpZip.delete();
        if(!source.renameTo(tmpZip))
        {
            throw new Exception("Could not make temp file (" + source.getName() + ")");
        }
        byte[] buffer = new byte[1024];
        ZipInputStream zin = new ZipInputStream(new FileInputStream(tmpZip));
        ZipOutputStream out = new ZipOutputStream(new FileOutputStream(source));

        for(int i = 0; i < files.length; i++)
        {
            InputStream in = new FileInputStream(files[i]);
            out.putNextEntry(new ZipEntry(files[i].getName()));
            for(int read = in.read(buffer); read > -1; read = in.read(buffer))
            {
                out.write(buffer, 0, read);
            }
            out.closeEntry();
            in.close();
        }

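        for(ZipEntry ze = zin.getNextEntry(); ze != null; ze = zin.getNextEntry())
        {
            // use a fresh entry: reusing ze can fail with
            // "ZipException: invalid entry compressed size"
            ZipEntry entryCopy = new ZipEntry(ze.getName());
            entryCopy.setTime(ze.getTime());
            out.putNextEntry(entryCopy);
            for(int read = zin.read(buffer); read > -1; read = zin.read(buffer))
            {
                out.write(buffer, 0, read);
            }
            out.closeEntry();
        }

        zin.close(); // release the temp file before deleting it
        out.close();
        tmpZip.delete();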
    }
    catch(Exception e)
    {
        e.printStackTrace();
    }
}
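Note that ZipOutputStream rejects a second entry with a name it has already written, so the method above fails if one of the new files has the same name as an entry already in the archive. Below is a sketch of a variant, assuming Java 7 or later, that skips old entries whose names collide with a new file and closes all streams via try-with-resources. The class name ZipUpdater and the helper copyStream are hypothetical, and entry metadata other than the timestamp is not carried over.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipUpdater {

    /** Adds the files to the root of the zip; new files win over old entries of the same name. */
    public static void addFilesToZip(Path zip, List<Path> files) throws IOException {
        // Build the updated archive next to the original, then swap it in.
        Path tmp = Files.createTempFile(zip.toAbsolutePath().getParent(), "zip", ".tmp");
        Set<String> written = new HashSet<>();
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(tmp))) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                if (!written.add(name)) continue; // same name listed twice
                out.putNextEntry(new ZipEntry(name));
                Files.copy(file, out);
                out.closeEntry();
            }
            try (ZipInputStream in = new ZipInputStream(Files.newInputStream(zip))) {
                for (ZipEntry ze = in.getNextEntry(); ze != null; ze = in.getNextEntry()) {
                    if (!written.add(ze.getName())) continue; // replaced by a new file
                    ZipEntry entryCopy = new ZipEntry(ze.getName());
                    entryCopy.setTime(ze.getTime());
                    out.putNextEntry(entryCopy);
                    copyStream(in, out);
                    out.closeEntry();
                }
            }
        } catch (IOException | RuntimeException e) {
            Files.deleteIfExists(tmp); // do not leave a half-written archive behind
            throw e;
        }
        Files.move(tmp, zip, StandardCopyOption.REPLACE_EXISTING);
    }

    private static void copyStream(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        for (int read = in.read(buffer); read > -1; read = in.read(buffer)) {
            out.write(buffer, 0, read);
        }
    }
}
```

Writing to a temporary file first means the original archive is only replaced once the new one has been written completely.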

Solution 5

I don't know of a Java library that does what you describe. But what you described is practical. You can do it in .NET, using DotNetZip.

Michael Krauklis is correct that you cannot simply "append" data to a war file or zip file, but it is not because there is an "end of file" indication, strictly speaking, in a war file. It is because the war (zip) format includes a directory, which is normally present at the end of the file, that contains metadata for the various entries in the war file. Naively appending to a war file results in no update to the directory, and so you just have a war file with junk appended to it.

What's necessary is an intelligent class that understands the format, and can read+update a war file or zip file, including the directory as appropriate. DotNetZip does this, without uncompressing/recompressing the unchanged entries, just as you described or desired.
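To make the "directory at the end of the file" concrete, here is a small Java sketch that locates the end-of-central-directory record of a zip and reads the entry count and the position of the directory. The signature value and field offsets are taken from the published ZIP format specification (PKWARE APPNOTE); the class and method names are made up for illustration, and Zip64 archives are not handled.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ZipDirectoryInfo {

    private static final int EOCD_SIGNATURE = 0x06054b50; // "PK\5\6", stored little-endian
    private static final int EOCD_MIN_SIZE = 22;          // fixed part of the record
    private static final int MAX_COMMENT = 0xFFFF;        // comment length is a 16-bit field

    /** Returns {entry count, central directory offset, central directory size}. */
    public static long[] readEndOfCentralDirectory(String file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long length = raf.length();
            // The record sits at the very end, followed only by an optional comment.
            int tailSize = (int) Math.min(length, EOCD_MIN_SIZE + MAX_COMMENT);
            byte[] tail = new byte[tailSize];
            raf.seek(length - tailSize);
            raf.readFully(tail);
            ByteBuffer buf = ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN);
            // Scan backwards for the signature.
            for (int pos = tailSize - EOCD_MIN_SIZE; pos >= 0; pos--) {
                if (buf.getInt(pos) == EOCD_SIGNATURE) {
                    long entries = buf.getShort(pos + 10) & 0xFFFF;
                    long size = buf.getInt(pos + 12) & 0xFFFFFFFFL;
                    long offset = buf.getInt(pos + 16) & 0xFFFFFFFFL;
                    return new long[] {entries, offset, size};
                }
            }
            throw new IOException("no end-of-central-directory record found");
        }
    }
}
```

Bytes appended after this record are not described by the directory, which is why naive appending does not add entries; an updater has to write the new entry data and then a new directory and end record.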

Author by

Konstantin

Mobile developer (J2ME, iPhone & Android), Java programmer and PHP hacker.

Updated on July 14, 2022

Comments

  • Konstantin
    Konstantin almost 2 years

    I am currently extracting the contents of a war file and then adding some new files to the directory structure and then creating a new war file.

    This is all done programmatically from Java - but I am wondering if it wouldn't be more efficient to copy the war file and then just append the files - then I wouldn't have to wait so long as the war expands and then has to be compressed again.

    I can't seem to find a way to do this in the documentation though or any online examples.

    Can anyone give some tips or pointers?

    UPDATE:

    TrueZip, as mentioned in one of the answers, seems to be a very good Java library for appending to a zip file (despite other answers that say it is not possible to do this).

    Does anyone have experience or feedback on TrueZip, or can anyone recommend other similar libraries?

  • Konstantin
    Konstantin over 14 years
    I am not sure how your proposed solution differs from what I am doing already - how is this more automated?
  • sfussenegger
    sfussenegger over 14 years
    @carlos regarding your blog post: which Java version did you use? I just tested getting the size of a 148M ZIP archive with the standard API (new ZipFile(file).size()) and the latest 7Zip bindings with Java 1.6.0_17 on an amd64 Linux system (4 cores). The standard API outperformed 7Zip by far (at least for the task you present on your blog: getting the number of entries). Java took an avg of 1.5ms while 7Zip needed an avg of 350ms for 100 runs (excluding warmup). So from my perspective, there is no need to throw native libraries at this kind of problem.
  • Konstantin
    Konstantin over 14 years
    My war file is 30 MB compressed - not sure this approach will be the best way as it will require a lot of memory - I am already caching a lot of database queries in memory and this might make the memory footprint too big.
  • Konstantin
    Konstantin over 14 years
    I am still keen to understand your solution - you say that instead of un-war then re-war I should read the file and then write to a new war - is this not the same thing? Please can you explain?
  • Konstantin
    Konstantin over 14 years
    This looks very good - would like to know if there are any performance issues to know about?
  • Konstantin
    Konstantin over 14 years
    Didn't realise that this was going to use a native library, thanks for pointing that out - will not investigate further.
  • sfussenegger
    sfussenegger over 14 years
    @Grouchal Actually you won't ever need more memory than BUFFER (I've chosen 4MB, but you're free to tailor it to your needs - it shouldn't hurt to reduce it to a few KB only). The file is never stored entirely in memory.
  • sfussenegger
    sfussenegger over 14 years
    the idea is to decompress contents of the existing war into BUFFER and compress it into a new archive - entry after entry. After that, you end up with the same archive that's ready to take some more entries. I've chosen to write "42" into answer.txt. That's where you should place your code to append more entries.
  • Konstantin
    Konstantin over 14 years
    How would this approach compare to using TrueZip - mentioned by gnlogic? TrueZip seems to really append to the file
  • sfussenegger
    sfussenegger over 14 years
    Sorry, I didn't know this library. After digging through the code, I still don't know what it's doing, but it's doing it pretty fast :) So yes, it seems to really append content to a file.
  • sfussenegger
    sfussenegger over 14 years
    However, as you said you want "to copy the war file and then just append the files" I assume you don't want to modify the source. Using TrueZip, you'll have to copy the file which isn't necessary with the code above. Therefore, both approaches should finally be quite similar in performance.
  • gnlogic
    gnlogic over 14 years
    So far I've been able to use it effectively with moderately sized files (3 MB etc). Haven't run into any performance problems.
  • gnlogic
    gnlogic over 14 years
    Truezip uses the concept of treating the zip file like a virtual file system. If you wanna copy and append - I bet that should be pretty easy too.
  • dma_k
    dma_k about 14 years
    @Carlos: If you have some free time, can you compare extraction to Apache common compress (commons.apache.org/compress)?
  • Carlos Tasada
    Carlos Tasada about 14 years
    @dma_k: I could do the test but the documentation says 'gzip support is provided by the java.util.zip package of the Java class library.' So I don't expect any difference
  • dma_k
    dma_k about 14 years
    I confirm that (after checking commons-compress sources): it utilizes available algorithms where possible. They have created their own ZipFile implementation, but it is based on java.util.zip.Inflater et al. I don't expect any tremendous speed boost either, but a comparison of extraction from a .zip file might be interesting for you just for completeness.
  • Adam Schmideg
    Adam Schmideg over 13 years
    If you get a ZipException - invalid entry compressed size with this approach, see coderanch.com/t/275390/Streams/java/…
  • Liam Haworth
    Liam Haworth over 12 years
    And with this code the new files have top priority over the old ones
  • Liam Haworth
    Liam Haworth over 12 years
    You can also change the buffer size as needed; the one in the code right now is only suited to small files.
  • user577732
    user577732 over 12 years
    Really liked this code, but I needed to add files into folders inside the zip and not just its root. I posted my edited method here: stackoverflow.com/questions/9300115/… Hope it helps others. Thanks a ton, Liam, for the great base code; I didn't change much, but I think it's a great method now :)
  • Nirmal Raghavan
    Nirmal Raghavan over 10 years
    How can we use this over SMB? I want to add files to a zip file that is on a Windows machine from an OS X/Linux machine.
  • Grzegorz Żur
    Grzegorz Żur over 10 years
    @NirmalRaghavan This is out of scope of this question. For SMB/CIFS, see how to mount Windows network drive in Linux.
  • Tobias Kremer
    Tobias Kremer over 10 years
    Thanks for the example. Turns out I was too stupid to use ZIP-FileSystems till now.
  • Ned Twigg
    Ned Twigg over 10 years
    This code will only work for a zip file that was created by the java encoder. You can't reuse the ZipEntry because it stores the compressed size, which will depend on the compression settings.
  • Heiko Haller
    Heiko Haller about 10 years
    Wow, that was it. Much easier than the current top answer, which seems somewhat outdated as best answer since Java 7. @Grouchal: can / would you revoke or move your +150 in order to boost this answer? (We just spent some hours trying to get TrueVFS to work in vain...)
  • DJphilomath
    DJphilomath over 9 years
    There's a new option in Java 7, a ZipFileSystem
  • subrunner
    subrunner almost 9 years
    Zip File System can't really deal with whitespaces in the folder structure. For a workaround, encode all whitespaces with "%2520" (see also stackoverflow.com/questions/9873845/… )
  • Vadzim
    Vadzim almost 8 years
    It must be noted that TrueVFS, the successor of TrueZIP, uses Java 7 NIO 2 features under the hood when appropriate but offers much more features like thread-safe async parallel compression.
  • Vadzim
    Vadzim almost 8 years
    Beware that ZipFileSystem by default is vulnerable to OutOfMemoryError on huge inputs.
  • Vadzim
    Vadzim over 7 years
    Added TrueVFS code samples in separate answer.
  • Vikas Sharma
    Vikas Sharma over 7 years
    I am working with Java 6 and I have a similar requirement. Moreover, I have directories in the zip file and I have to merge two zip files. Any help?
  • Sebastian Götz
    Sebastian Götz about 7 years
    Thank you very much. That led me in the right direction. I was using Java 8 and binary content, and with the Files.newByteChannel method I had to provide the OpenOptions CREATE and WRITE, otherwise an exception was thrown.
  • Rajesh Goel
    Rajesh Goel about 7 years
    Use Files.copy instead:

    try (FileSystem jarFs = FileSystems.newFileSystem(uri, env, null)) {
        for (final Path newFilePath : newFilePathList) {
            final Path pathInZipFile = jarFs.getPath("/" + newFilePath.getFileName());
            Files.copy(newFilePath, pathInZipFile, StandardCopyOption.REPLACE_EXISTING);
        }
    }
  • NateS
    NateS about 4 years
    This answer shows how to do it but how does it work under the covers? Is updating a file in a zip efficient or is it equivalent to unzipping and building a new zip?
  • Grzegorz Żur
    Grzegorz Żur about 4 years
    @NateS Feel free to check the code or experiment and add comment with results.
  • Dave The Dane
    Dave The Dane over 3 years
    The Zip File System (docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/…) mentioned in this answer may be easy to use but, as others have queried, it's not entirely clear how it actually copies an entry from one zip to another (byte-for-byte or decompress/recompress?).