Make tar file by Java

Solution 1

You can use the jtar - Java Tar library.

Taken from their site:

JTar is a simple Java Tar library, that provides an easy way to create and read tar files using IO streams. The API is very simple to use and similar to the java.util.zip package.

An example, also from their site:

   // Output file stream
   FileOutputStream dest = new FileOutputStream( "c:/test/test.tar" );

   // Create a TarOutputStream
   TarOutputStream out = new TarOutputStream( new BufferedOutputStream( dest ) );

   // Files to tar
   File[] filesToTar=new File[2];
   filesToTar[0]=new File("c:/test/myfile1.txt");
   filesToTar[1]=new File("c:/test/myfile2.txt");

   for(File f:filesToTar){
      out.putNextEntry(new TarEntry(f, f.getName()));
      BufferedInputStream origin = new BufferedInputStream(new FileInputStream( f ));

      int count;
      byte data[] = new byte[2048];
      while((count = origin.read(data)) != -1) {
         out.write(data, 0, count);
      }

      out.flush();
      origin.close();
   }

   out.close();

Solution 2

I would look at Apache Commons Compress.

The project's examples page includes a tar example partway down:

TarArchiveEntry entry = new TarArchiveEntry(name);
entry.setSize(size);
tarOutput.putArchiveEntry(entry);
tarOutput.write(contentOfEntry);
tarOutput.closeArchiveEntry();
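
The call sequence above (an entry with a known size, then the content, then closing the entry) follows the layout of the tar format itself: a 512-byte header that records the size, the content, and zero padding up to the next 512-byte boundary. The following is only a rough JDK-only sketch of that layout, not the Commons Compress API. The class and method names (`MinimalTar`, `writeEntry`, `finish`) are made up for illustration, and it assumes ASCII names of at most 100 bytes, regular files only, and content that fits in memory.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class: illustrates the ustar layout, nothing more.
class MinimalTar {
    private static final int BLOCK = 512;

    /** Writes one regular-file entry: header block, content, zero padding. */
    static void writeEntry(OutputStream out, String name, byte[] content) throws IOException {
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        if (nameBytes.length > 100) {
            throw new IllegalArgumentException("name longer than 100 bytes");
        }
        byte[] header = new byte[BLOCK];
        System.arraycopy(nameBytes, 0, header, 0, nameBytes.length);
        putOctal(header, 100, 8, 0644);                                // mode
        putOctal(header, 108, 8, 0);                                   // uid
        putOctal(header, 116, 8, 0);                                   // gid
        putOctal(header, 124, 12, content.length);                     // size
        putOctal(header, 136, 12, System.currentTimeMillis() / 1000);  // mtime
        Arrays.fill(header, 148, 156, (byte) ' ');  // checksum field counts as spaces
        header[156] = '0';                          // type flag: regular file
        byte[] magic = "ustar".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(magic, 0, header, 257, magic.length);  // byte 262 stays NUL
        header[263] = '0';                          // version "00"
        header[264] = '0';

        // Checksum: unsigned sum of all header bytes, stored as 6 octal digits, NUL, space.
        long sum = 0;
        for (byte b : header) {
            sum += b & 0xff;
        }
        putOctal(header, 148, 7, sum);
        header[155] = ' ';

        out.write(header);
        out.write(content);
        int padding = (BLOCK - content.length % BLOCK) % BLOCK;
        out.write(new byte[padding]);
    }

    /** Ends the archive with two all-zero blocks. */
    static void finish(OutputStream out) throws IOException {
        out.write(new byte[2 * BLOCK]);
    }

    // Writes value as zero-padded octal digits in len - 1 bytes, followed by a NUL byte.
    private static void putOctal(byte[] buf, int offset, int len, long value) {
        String digits = Long.toOctalString(value);
        int width = len - 1;
        if (digits.length() > width) {
            throw new IllegalArgumentException("value too large for field");
        }
        int pad = width - digits.length();
        for (int i = 0; i < width; i++) {
            buf[offset + i] = (byte) (i < pad ? '0' : digits.charAt(i - pad));
        }
        buf[offset + width] = 0;
    }
}
```

Calling `writeEntry` once per file and then `finish` on a `FileOutputStream` should give an archive that ordinary tar tools can list. For anything beyond a toy (long names, directories, large files, permissions) a library such as Commons Compress is the safer choice.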

Solution 3

.tar archive files are not compressed. To get compression, run the archive through a compressor such as gzip, which produces something like a .tar.gz file.
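
As a minimal sketch of that second step, the JDK's own `java.util.zip.GZIPOutputStream` can compress an existing .tar file. The class name `TarGzip` and the method `gzip` are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration.
class TarGzip {
    /** Copies tarFile into tarGzFile through a gzip stream. */
    static void gzip(Path tarFile, Path tarGzFile) throws IOException {
        try (InputStream in = Files.newInputStream(tarFile);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(tarGzFile))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } // closing the GZIPOutputStream writes the gzip trailer
    }
}
```

For example, `TarGzip.gzip(Paths.get("test.tar"), Paths.get("test.tar.gz"))`. To skip the intermediate .tar file, the tar output stream of whichever library you use can usually be wrapped around a `GZIPOutputStream` directly.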

If you just want to archive a directory, take a look at:

Solution 4

I wrote the following code to solve this problem. For each file to be added, it checks whether an entry with the same name already exists in the tar file and, if so, replaces that entry. Files that have no existing entry are appended to the end of the archive.

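import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;

import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;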

public class TarUpdater {

        private static final int buffersize = 8048;

        public static void updateFile(File tarFile, File[] flist) throws IOException {
            // get a temp file
            File tempFile = File.createTempFile(tarFile.getName(), null);
            // delete it, otherwise you cannot rename your existing tar to it.
            if (tempFile.exists()) {
                tempFile.delete();
            }

            if (!tarFile.exists()) {
                tarFile.createNewFile();
            }

            boolean renameOk = tarFile.renameTo(tempFile);
            if (!renameOk) {
                throw new RuntimeException(
                        "could not rename the file " + tarFile.getAbsolutePath() + " to " + tempFile.getAbsolutePath());
            }
            byte[] buf = new byte[buffersize];

            TarArchiveInputStream tin = new TarArchiveInputStream(new FileInputStream(tempFile));

            OutputStream outputStream = new BufferedOutputStream(Files.newOutputStream(tarFile.toPath()));
            TarArchiveOutputStream tos = new TarArchiveOutputStream(outputStream);
            tos.setLongFileMode(TarArchiveOutputStream.LONGFILE_POSIX);

            // Read from the previous version of the tar file
            ArchiveEntry entry = tin.getNextEntry();
            while (entry != null) { // the previous file has entries
                String name = entry.getName();
                boolean notInFiles = true;
                for (File f : flist) {
                    if (f.getName().equals(name)) {
                        notInFiles = false;
                        break;
                    }
                }
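                if (notInFiles) {
                    // Copy the existing entry to the output stream unchanged.
                    // Reusing the original entry keeps its size header; a fresh
                    // TarArchiveEntry(name) has size 0, so writing to it would fail.
                    if (!entry.isDirectory()) {
                        tos.putArchiveEntry((TarArchiveEntry) entry);
                        // Transfer bytes from the old TAR file to the new one
                        int len;
                        while ((len = tin.read(buf)) != -1) {
                            tos.write(buf, 0, len);
                        }
                        tos.closeArchiveEntry();
                    }
                }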
                entry = tin.getNextEntry();
            }
            // Close the input stream: finished reading the existing entries
            tin.close();
            // Add the new files

            for (int i = 0; i < flist.length; i++) {
                if (flist[i].isDirectory()) {
                    continue;
                }
                InputStream fis = new FileInputStream(flist[i]);
                TarArchiveEntry te = new TarArchiveEntry(flist[i],flist[i].getName());
                //te.setSize(flist[i].length());
                tos.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
                tos.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_POSIX);
                tos.putArchiveEntry(te); // Add TAR entry to output stream.

                // Transfer bytes from the file to the TAR file
                int count = 0;
                while ((count = fis.read(buf, 0, buffersize)) != -1) {
                    tos.write(buf, 0, count);
                }
                tos.closeArchiveEntry();
                fis.close();
            }
            // Complete the TAR file
            tos.close();
            tempFile.delete();
        }
    }
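
`updateFile` takes a flat `File[]`, so to add the contents of a folder the array has to be collected first. One possible sketch using `Files.walk` from the JDK follows. The class `FolderFiles` and the method `listRegularFiles` are hypothetical names, not part of the code above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper class for illustration.
class FolderFiles {
    /** Collects every regular file under dir, recursively, in sorted path order. */
    static File[] listRegularFiles(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .map(Path::toFile)
                        .toArray(File[]::new);
        }
    }
}
```

The result could then be passed on, for example `TarUpdater.updateFile(new File("backup.tar"), FolderFiles.listRegularFiles(Paths.get("myfolder")))`. Note that `updateFile` names entries with `getName()`, so the folder structure is flattened and files with the same name in different subfolders would collide.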

If you use Gradle, add the following dependency:

compile group: 'org.apache.commons', name: 'commons-compress', version: '1.+'

I also tried org.xeustechnologies:jtar:1.1, but it was far slower than org.apache.commons:commons-compress:1.12.

Notes about performance using different implementations:

Zipping 10 times using Java 1.8 zip:
- java.util.zip.ZipEntry;
- java.util.zip.ZipInputStream;
- java.util.zip.ZipOutputStream;

[2016-07-19 19:13:11] Before
[2016-07-19 19:13:18] After
7 seconds

Tar-ing 10 times using jtar:
- org.xeustechnologies.jtar.TarEntry;
- org.xeustechnologies.jtar.TarInputStream;
- org.xeustechnologies.jtar.TarOutputStream;

[2016-07-19 19:21:23] Before
[2016-07-19 19:25:18] After
3 minutes 55 seconds

shell call to Cygwin /usr/bin/tar - 10 times
[2016-07-19 19:33:04] Before
[2016-07-19 19:33:14] After
10 seconds

Tar-ing 100 times using org.apache.commons.compress:
- org.apache.commons.compress.archivers.ArchiveEntry;
- org.apache.commons.compress.archivers.tar.TarArchiveEntry;
- org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
- org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;

[2016-07-19 23:04:45] Before
[2016-07-19 23:04:48] After
3 seconds

Tar-ing 1000 times using org.apache.commons.compress:
[2016-07-19 23:10:28] Before
[2016-07-19 23:10:48] After
20 seconds
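
The timings above come from "Before"/"After" log timestamps. For anyone who wants to repeat the comparison, a very small harness along these lines might do. It is only a sketch: `TimedRuns` and `timeMillis` are made-up names, and the task passed in would be whatever archiving call is being measured.

```java
// Hypothetical helper class for illustration.
class TimedRuns {
    /** Runs task the given number of times and returns the total wall-clock time in milliseconds. */
    static long timeMillis(int runs, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

This measures wall-clock time only, with no warm-up runs, so JIT compilation and disk caching will influence the numbers.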

Author: 卢声远 Shengyuan Lu

Updated on August 17, 2022

Comments

  • 卢声远 Shengyuan Lu
    卢声远 Shengyuan Lu over 1 year

    I want to use Java to compress a folder into a tar file programmatically. I think there must be an open-source library that does this, but I cannot find one.

    Alternatively, could I make a zip file and rename its extension to .tar?

    Could anyone suggest a library to do it? Thanks!

    • Joachim Sauer
      Joachim Sauer over 12 years
      Renaming a zip file to .tar doesn't do anything productive. It just "lies" about the content of the file.
    • Marcelo
      Marcelo over 12 years
      It is not a duplicate, one asks of extracting a tar file, this asks of making one.
    • user unknown
      user unknown over 12 years
      Tar is not compressed. What do you want - compression or tar?
    • njzk2
      njzk2 over 10 years
      you do realize that changing the extension of a file does not actually changes its contents, right ?
    • Lassi
      Lassi almost 5 years
      Most likely the asker wants both compression AND tar. A tar archive is often compressed with an external compressor (gzip, bzip2, xz, etc.) - this is so common that people often think of tar+gzip as one operation.
  • Nikita Shah
    Nikita Shah over 9 years
    This works fine for .txt file but can you please help when i want to create .tar of a folder.
  • Beatrice Lin
    Beatrice Lin over 6 years
    I face the runtime error: java.lang.NoClassDefFoundError: Failed resolution of: Ljava/nio/file/attribute/PosixFilePermission; please help me, thanks!
  • user573215
    user573215 over 6 years
    Is it actually precisely true that tar archives are not compressed? I can see that compression is not the direct aim of tar, but let's look at this scenario: the filesystem works with blocks of 4KB each. Now you've got arbitrary files, for each of which 'size mod 4KB != 0' is true. So each file wastes < 4KB of storage, because it does not fill up its last block. The more files, the bigger the overall waste, and that may become significant. To be continued in the next comment...
  • user573215
    user573215 over 6 years
    When putting all the files together into one Tar the waste of unused block size is < 4KB for the whole Tar. So there would be something like implicit compression or at least a more efficient use of storage. This thesis is based on the assumption that the waste part of each single file's final block will be dropped and not be copied into the Tar. Is the assumption and the afterwards thoughts correct?
  • aprodan
    aprodan over 6 years
    nikita-shah: please see my solution. It can help with folders, provided you have a file list. If needed, iterate over the files in the folder with DirectoryStream<Path> stream = Files.newDirectoryStream(dirpath);
  • Germano Rizzo
    Germano Rizzo over 6 years
    Wikipedia defines compression as: In signal processing, data compression [...] involves encoding information using fewer bits than the original representation So, you're not compressing the files in the archive, you're removing the padding that the filesystem applies to them in order to store them to the disk. In other words, the "original representation" has an inherent length, if a filesystem applies a padding to 4Kb, it doesn't change the length. Actually, many filesystems don't even do that (see tail packing).
  • Guo
    Guo about 6 years
    I don't know how to tar two or more folders to one .tar? can you help me ?
  • aprodan
    aprodan about 6 years
    Basically, a tar file can be written sequentially, entry by entry. To answer your question, we need to know whether by "tar two folders in one tar" you mean preserving the folder structure inside the tar or not. In any case you can use tos.putArchiveEntry(new TarArchiveEntry(name)); where name is populated based on your requirements.