How to Compress/Decompress tar.gz files in Java

91,019

Solution 1

My favorite is plexus-archiver - see sources on GitHub.

Another option is Apache commons-compress - (see mvnrepository).

With plexus-archiver, the code for unarchiving looks like this:

final TarGZipUnArchiver ua = new TarGZipUnArchiver();
// Logging: as @Akom noted, logging is mandatory in newer versions, so you can configure it like this:
ConsoleLoggerManager manager = new ConsoleLoggerManager();
manager.initialize();
ua.enableLogging(manager.getLoggerForComponent("bla"));
// -- end of logging part
ua.setSourceFile(sourceFile);
destDir.mkdirs();
ua.setDestDirectory(destDir);
ua.extract();

Matching *Archiver classes exist for creating archives.

With Maven, you can use this dependency:

<dependency>
  <groupId>org.codehaus.plexus</groupId>
  <artifactId>plexus-archiver</artifactId>
  <version>2.2</version>
</dependency>

Solution 2

I've written a wrapper for commons-compress called jarchivelib that makes it easy to extract or compress from and into File objects.

Example code would look like this:

File archive = new File("/home/thrau/archive.tar.gz");
File destination = new File("/home/thrau/archive/");

Archiver archiver = ArchiverFactory.createArchiver("tar", "gz");
archiver.extract(archive, destination);

Solution 3

To extract the contents of a .tar.gz archive, I have successfully used Apache commons-compress ('org.apache.commons:commons-compress:1.12'). Take a look at this example method:

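import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

public void extractTarGZ(InputStream in) throws IOException {
    // The original snippet left BUFFER_SIZE undefined; 8 KiB is an arbitrary choice.
    final int BUFFER_SIZE = 8192;
    GzipCompressorInputStream gzipIn = new GzipCompressorInputStream(in);
    try (TarArchiveInputStream tarIn = new TarArchiveInputStream(gzipIn)) {
        TarArchiveEntry entry;

        while ((entry = (TarArchiveEntry) tarIn.getNextEntry()) != null) {
            // Entry names are used as-is and resolved against the working directory,
            // so a malicious archive can write outside it (Zip Slip).
            // Validate the names before using this in production.
            if (entry.isDirectory()) {
                // If the entry is a directory, create it along with any missing parents.
                File f = new File(entry.getName());
                if (!f.mkdirs() && !f.isDirectory()) {
                    System.out.printf("Unable to create directory '%s', during extraction of archive contents.%n",
                            f.getAbsolutePath());
                }
            } else {
                // Otherwise copy the entry's bytes to a file of the same name.
                int count;
                byte[] data = new byte[BUFFER_SIZE];
                FileOutputStream fos = new FileOutputStream(entry.getName(), false);
                try (BufferedOutputStream dest = new BufferedOutputStream(fos, BUFFER_SIZE)) {
                    while ((count = tarIn.read(data, 0, BUFFER_SIZE)) != -1) {
                        dest.write(data, 0, count);
                    }
                }
            }
        }

        System.out.println("Untar completed successfully!");
    }
}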
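
Because the entry names come from the archive itself, a hostile archive can contain names such as ../../evil.txt and make an extraction loop like the one above write outside the intended directory (the Zip Slip problem). Below is a minimal sketch of a guard that uses only the JDK. The SafeExtract class, the resolveEntry method, and the destDir parameter are hypothetical names introduced for illustration; they are not part of commons-compress.

```java
import java.io.IOException;
import java.nio.file.Path;

final class SafeExtract {
    private SafeExtract() {}

    /**
     * Resolves an archive entry name against the destination directory and
     * rejects any result that would land outside of that directory.
     */
    static Path resolveEntry(Path destDir, String entryName) throws IOException {
        Path base = destDir.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." segments before the containment check.
        Path target = base.resolve(entryName).normalize();
        // Path.startsWith compares whole path elements, not raw string prefixes.
        if (!target.startsWith(base)) {
            throw new IOException("Entry escapes destination directory: " + entryName);
        }
        return target;
    }
}
```

In the loop, new File(entry.getName()) would then become something like SafeExtract.resolveEntry(destDir, entry.getName()).toFile(), and the same resolved path would be passed to FileOutputStream.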

Solution 4

In my experience, Apache Commons Compress is much more mature than Plexus Archiver, specifically because of issues like http://jira.codehaus.org/browse/PLXCOMP-131.

I believe Apache Commons Compress is more actively developed as well.

Solution 5

If you are planning to compress/decompress on Linux, you can invoke the tar command-line tool to do it for you:

Files.createDirectories(Paths.get(target));
ProcessBuilder builder = new ProcessBuilder();
builder.command("sh", "-c", String.format("tar xfz %s -C %s", tarGzPathLocation, target));
builder.directory(new File("/tmp"));
Process process = builder.start();
int exitCode = process.waitFor();
assert exitCode == 0;
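
One caveat with the snippet above: the paths are spliced into a string that sh -c then parses, so a path containing spaces or shell metacharacters will break the command, and assert is a no-op unless the JVM runs with -ea. A possible variant, sketched under the assumption that tar is on the PATH, passes each argument separately and checks the exit code explicitly. The TarCommand class and its method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

final class TarCommand {
    private TarCommand() {}

    /** Builds the argument list; each path is its own argument, so no shell quoting is involved. */
    static List<String> extractCommand(Path archive, Path target) {
        return Arrays.asList("tar", "-xzf", archive.toString(), "-C", target.toString());
    }

    /** Runs tar directly (no intermediate shell) and throws if it exits with a non-zero status. */
    static void extract(Path archive, Path target) throws IOException, InterruptedException {
        Files.createDirectories(target);
        Process process = new ProcessBuilder(extractCommand(archive, target))
                .inheritIO() // forward tar's output and errors to this process's console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tar exited with status " + exitCode);
        }
    }
}
```

Calling TarCommand.extract(Paths.get(tarGzPathLocation), Paths.get(target)) would stand in for the seven lines of the original snippet.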
Author: kdgwill

Updated on July 09, 2022

Comments

  • kdgwill
    kdgwill almost 2 years

    Can anyone show me the correct way to compress and decompress tar.gz files in Java? I've been searching, but the most I can find is either zip or gzip (alone).

  • Michael Plautz
    Michael Plautz over 8 years
    This is great - this works the same way a command line utility would - unzip <archive> <destination>, abstracting all boilerplate details from me (perhaps if I needed to worry about performance I'd use the commons-compress library, but I nearly never have to).
  • Ilavarasan Jayaraman
    Ilavarasan Jayaraman about 8 years
    @thrau: It gives me a compile-time error: The method createArchiver(String, String) is undefined for the type String
  • Ilavarasan Jayaraman
    Ilavarasan Jayaraman about 8 years
    and add cast to archiver
  • thrau
    thrau about 8 years
    have a look at the other examples on the web page rauschig.org/jarchivelib/examples.html
  • didil
    didil over 7 years
    Apache Compress cannot extract some tar.gz archives because of a lack of support. This bug has never been resolved: jfrog.com/jira/browse/HAP-651
  • Gili
    Gili over 7 years
    @didile how do you expect this to get fixed if the bug was reported to jfrog instead of apache compress?
  • didil
    didil over 7 years
    It has also been reported to the Apache issue tracker.
  • Gili
    Gili over 7 years
    @didile please provide a link.
  • Stefan Bodewig
    Stefan Bodewig over 7 years
    @didile I don't see any bug reported to issues.apache.org/jira/browse/COMPRESS that would match HAP-651. It would be great if you could open one and attach a tar where Compress fails.
  • FGreg
    FGreg about 7 years
    Since you are using the try-with-resources syntax, you shouldn't need dest.close(); and tarIn.close();
  • JohnC
    JohnC about 6 years
    Am I missing something? Won't this decompress the gzip file and leave you with the tar file?
  • madhuri H R
    madhuri H R almost 6 years
    @thrau Is there any callback for file extraction? I want to know once the extraction is done.
  • thrau
    thrau almost 6 years
    @madhuriHR extract is blocking and will return once extraction is complete, no callback required
  • toolforger
    toolforger over 4 years
    @JohnC This is usually what happens, and likely the reason for the downvote since the answer does not solve the problem.
  • sichinumi
    sichinumi about 2 years
    Warning: This is unsafe due to ZipSlip, do not use this code in production software. Specifically f.mkdir() is not safe to call in the blind: snyk.io/research/zip-slip-vulnerability