Is there a Linux command to perform parallel decompression of a tar.bz2 file?

Solution 1

lbzip2 and pbzip2 are two alternative tools for parallel compression and decompression; you only need one of them.

Usage:

lbzip2 -d <file.tar.bz2> 
pbzip2 -d <file.tar.bz2> 

The -d option selects decompression. Each command produces a .tar file, which still needs to be extracted with tar.

To install these packages:

To install lbzip2, type:

sudo apt-get install lbzip2

To install pbzip2, type:

sudo apt-get install pbzip2

Solution 2

You can uncompress your archive with a single command using the tar -I option. It gives you the ability to use any compression utility that supports the -d option.

tar -I lbzip2 -xvf <file.tar.bz2>

This is very useful when dealing with big archives, as you don't need twice the uncompressed size available on the target filesystem (for the intermediate tar file and for the extracted output). It is also faster, as it needs far less disk I/O.

Of course, this works when compressing too:

tar -I lbzip2 -cvpf <file.tar.bz2> <file>

Check tar --help for more options.
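
As a minimal sketch of the approach above, the snippet below picks whichever bzip2-compatible decompressor is installed and hands it to tar. The helper names pick_bz2_tool and extract_tbz2 are hypothetical, and the long option --use-compress-program is used in place of -I (the two are equivalent in GNU tar). If neither parallel tool is found, it falls back to plain, single-threaded bzip2.

```shell
#!/bin/sh
# Print the first bzip2-compatible tool found on PATH.
# The preference order (lbzip2, pbzip2, plain bzip2) is an assumption of this sketch.
pick_bz2_tool() {
    for tool in lbzip2 pbzip2 bzip2; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# extract_tbz2 ARCHIVE [DESTDIR]
# Extract ARCHIVE into DESTDIR (default: current directory) in a single pass.
extract_tbz2() {
    archive=$1
    dest=${2:-.}
    tool=$(pick_bz2_tool) || {
        echo "no bzip2-compatible tool found" >&2
        return 1
    }
    # tar runs "$tool -d" itself and reads the decompressed stream from it.
    mkdir -p "$dest" &&
        tar --use-compress-program="$tool" -xf "$archive" -C "$dest"
}
```

For example, extract_tbz2 file.tar.bz2 /tmp/out would extract the archive into /tmp/out, creating that directory if needed.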

Solution 3

You can use pbzip2 with the -d flag to decompress the file.

From the man page:

  pbzip2 -d myfile.tar.bz2

This example will decompress the file "myfile.tar.bz2" into the decompressed file "myfile.tar". It will use the autodetected # of processors (or 2 processors if autodetect not supported).

After decompressing, you need to untar the file with:

 tar xf myfile.tar

A tar file is just a container to which different compression algorithms can be applied; for example, a ".tar.gz" and a ".tar.bz2" are the same kind of container compressed with different algorithms. pbzip2 therefore only decompresses the archive; it does not extract the files, so use tar for that. tar shouldn't take long, since the archive is already decompressed and only the files need to be extracted. (Note that the tar command above uses neither the 'z' flag nor the 'j' flag, which tell tar to decompress the archive as well.)
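
The two steps above can also be joined with a pipe, so that the intermediate myfile.tar is never written to disk. The following is only a sketch: the function name stream_extract and its optional arguments are hypothetical, and it assumes the chosen tool accepts -d and -c (decompress to standard output), as pbzip2, lbzip2 and bzip2 do.

```shell
#!/bin/sh
# stream_extract ARCHIVE [TOOL] [DESTDIR]
# Decompress ARCHIVE with TOOL (default: pbzip2) and extract it into
# DESTDIR (default: current directory; it must already exist).
stream_extract() {
    archive=$1
    tool=${2:-pbzip2}
    dest=${3:-.}
    # -d decompresses, -c writes to stdout; "tar -xf -" reads the tar stream from stdin.
    # The function's exit status is that of tar, the last command in the pipe.
    "$tool" -dc "$archive" | tar -xf - -C "$dest"
}
```

For example, stream_extract myfile.tar.bz2 would extract into the current directory using pbzip2.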

Solution 4

lbzip2 seems a lot better than pbzip2 in your case, as it can speed up decompression of standard .bz2 files (those created by plain bzip2), while pbzip2 cannot. (I just tested it: 17 seconds for lbzip2 vs. 56 seconds for pbzip2 on a partially loaded quad-core machine.)
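
A comparison like the one above could be reproduced roughly as follows. This is a sketch, not the procedure actually used for the numbers quoted: the function name time_decompress is hypothetical, timing is in whole seconds from date +%s (assumed to be supported, as it is in GNU and BSD date), and the output is discarded so that only decompression is measured. Results will vary with file caching and system load.

```shell
#!/bin/sh
# time_decompress ARCHIVE TOOL...
# Decompress ARCHIVE to /dev/null with each TOOL and print the elapsed seconds.
time_decompress() {
    archive=$1
    shift
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: not installed"
            continue
        fi
        start=$(date +%s)
        "$tool" -dc "$archive" >/dev/null
        end=$(date +%s)
        echo "$tool: $((end - start))s"
    done
}
```

For example, time_decompress file.tar.bz2 lbzip2 pbzip2 would print one line per tool.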

Author: user784637

Updated on September 18, 2022

Comments

  • user784637
    user784637 over 1 year

    I have a rather large file (~50GB) and it takes some time to run

    tar xvf file.tar.bz2

    on it. I'm aware of programs that can do parallel compression for bzip2 files but unaware of programs that can do parallel decompression for bzip2 files.

    Are there any programs that can achieve this? What is the exact syntax of the command to use to extract from the file?

    I'm using Ubuntu 12.04.

  • macrobook
    macrobook over 11 years
    The manual page has some useful examples: manpages.ubuntu.com/pbzip2
  • user784637
    user784637 over 11 years
    So if I understand correctly, I need to decompress and then untar? Like 2 commands as opposed to tar xvf?
  • user784637
    user784637 over 11 years
    @Sam Thanks for the answer. Would you be able to answer the comment I left on the other answer?
  • devav2
    devav2 over 11 years
    Yes, when you run lbzip2 -d -n 2 file.tar.bz2 it produces a tar file, which then needs to be untarred.
  • Tapio
    Tapio over 11 years
    From the man page of pbzip2 (lbzip2 tells a similar story): "Files that are compressed with pbzip2 will also gain considerable speedup when decompressed using pbzip2. Files that were compressed using bzip2 will not see speedup since bzip2 packages the data into a single chunk that cannot be split between processors."
  • devav2
    devav2 over 11 years
    @Tapio Here is the description for lbzip2: "Compress or decompress FILE operands or standard input to regular files or standard output, by calling Julian Seward's libbz2 from multiple threads. The lbzip2 utility employs multiple threads and an input-bound splitter even when decompressing .bz2 files created by standard bzip2 (but see BUGS below)."
  • Wodin
    Wodin over 9 years
    Another option (e.g. if your version of "tar" doesn't understand the -I option) is lbzip2 -dc file.tar.bz2 | tar xvf -
  • Volker Siegel
    Volker Siegel about 9 years
    From the answer alone, I would understand both programs need to be used together somehow - but they seem to be alternatives, actually? (It says "lbzip2 and pbzip2 are the tools...", "Usage: lbzip2... pbzip2...", "to install these...")