Fastest way to extract tar.gz

68,783

Solution 1

pigz is a parallel implementation of gzip. Although it uses only a single thread for decompression itself, it starts three additional threads for reading, writing, and check calculation. Results vary, but we have seen a significant improvement in decompression speed on some of our datasets. Once pigz is installed, the tar file can be extracted with:

pigz -dc target.tar.gz | tar xf -
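
A minimal sketch of the same pipeline wrapped in a shell function. The function name extract_tgz and the fallback to plain gzip when pigz is not installed are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# extract_tgz ARCHIVE [DEST_DIR]
# Hypothetical helper: decompress with pigz when it is installed,
# otherwise fall back to gzip, and stream the result into tar.
extract_tgz() {
    archive=$1
    dest=${2:-.}

    # Assumption: pigz may be missing, so pick whichever tool exists.
    if command -v pigz >/dev/null 2>&1; then
        decompressor=pigz
    else
        decompressor=gzip
    fi

    mkdir -p "$dest" || return 1

    # -d decompresses, -c writes to stdout; tar reads the stream from
    # stdin (-f -) and unpacks it into DEST_DIR (-C).
    "$decompressor" -dc "$archive" | tar -xf - -C "$dest"
}
```

Usage would look like extract_tgz target.tar.gz /path/to/output. Both tools accept -dc, so the rest of the pipeline does not change when the fallback is taken.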

Solution 2

If the tarball contains a very large number of small files, drop the 'v' (verbose) option and try again. Printing every file name to the terminal can slow extraction noticeably.

Solution 3

If you want to see progress, use something like pv. Here is an example:

pigz -dc mysql-binary-backup.tar.gz | pv | tar xf -
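
With pv placed after pigz, as above, pv cannot know the total size of the stream, so it reports throughput but no percentage. One possible variation, sketched below, lets pv read the archive file itself so that it can show a percentage and an ETA based on the compressed size. The function name extract_with_progress and the fallbacks for a missing pv or pigz are assumptions for illustration.

```shell
#!/bin/sh
# extract_with_progress ARCHIVE [DEST_DIR]
# Hypothetical helper: pv reads the compressed file, so progress is
# measured in compressed bytes consumed.
extract_with_progress() {
    archive=$1
    dest=${2:-.}

    # Assumption: fall back to gzip when pigz is not installed.
    if command -v pigz >/dev/null 2>&1; then
        decompressor=pigz
    else
        decompressor=gzip
    fi

    mkdir -p "$dest" || return 1

    if command -v pv >/dev/null 2>&1; then
        # pv prints its progress bar on stderr and passes the data through.
        pv "$archive" | "$decompressor" -dc | tar -xf - -C "$dest"
    else
        # Without pv, extract silently.
        "$decompressor" -dc "$archive" | tar -xf - -C "$dest"
    fi
}
```

Usage would look like extract_with_progress mysql-binary-backup.tar.gz /path/to/output.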

Justin
Author by

Justin

Updated on September 18, 2022

Comments

  • Justin
    Justin over 1 year

    Is there any way to extract a tar.gz file faster than tar -zxvf filenamehere?

    We have large files and are trying to optimize the operation.

    • EEAA
      EEAA almost 13 years
      Are you finding that the $ tar -zxvf method is IO or CPU bound?
    • Justin
      Justin almost 13 years
      I believe CPU. How can I check, though?
    • icecbr
      icecbr almost 13 years
      Not directly related, but 'z' hasn't been required for extraction since 2004 (GNU tar 1.15): gnu.org/software/tar/#TOCreleases :)
    • j0h
      j0h over 3 years
      @Justin You might have to install it, but vmstat will tell you whether the load is IO or CPU. vmstat reports information about processes, memory, paging, block IO, traps, disks, and CPU activity. You can even run it continuously: vmstat 1 100 prints a report every second, 100 times. pigz was really helpful: I decompressed a 108GB gz file in minutes that had previously taken over an hour.
  • Eimantas
    Eimantas almost 13 years
    I never use -v param. Don't know why people need that much noise in console.
  • ruakh
    ruakh over 11 years
    +1. FWIW, you can also write that as tar --use-compress-program=pigz -xvf filenamehere. (The archive name must directly follow -f; -z amounts to --use-compress-program=gzip.) Alternatively, you can even make gzip a symlink to pigz and keep using -zxvf.
  • Michael Hampton
    Michael Hampton almost 11 years
    @Eimantas When you untar something that contains many multi-gigabyte files, you will want some indication of progress. :)
  • alfC
    alfC over 8 years
    For bzip2 there is pbzip2 (p for parallel). tar --use-compress-program=pbzip2 -xvf file.tar.bz2.
  • smci
    smci over 6 years
    @TimHughes: that's really great to know, please post as a separate answer!
  • Luciano Andress Martini
    Luciano Andress Martini about 6 years
    @Michael Hampton If you have multi-gigabyte files mixed with a long list of small files, you have a good reason not to use -v. In my local tests it makes tar very slow, especially when tar is running on a remote server via a terminal. What I do instead is watch du -s directory, so I can see the directory growing.
  • Stefan Lasiewski
    Stefan Lasiewski almost 6 years
    It might be worth using --checkpoint=NUMBER (display progress messages every NUMBERth record) instead of -v.
  • Stefan Lasiewski
    Stefan Lasiewski almost 6 years
    Is there a way to use the pv command, or an equivalent, to show progress while also using the --use-compress-program=pigz flag? During compression, I can do gnutar --use-compress-program="pigz | pv" -cf target.tar.gz YourData, but I'm not sure how to do this during untar/uncompression.
  • Mikko Ohtamaa
    Mikko Ohtamaa over 2 years
    Nice! You just made my day much more joyful when uncompressing tar archives of hundreds of gigabytes.
  • m_a_s
    m_a_s about 2 years
    @StefanLasiewski You can use pigz with pv and tar in this way: "tar cf - /your/files | pv | pigz > compressed.tgz"
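
The --use-compress-program and --checkpoint suggestions from the comments can be combined roughly as follows. This is a sketch under assumptions: the function name extract_with_checkpoints is hypothetical, the checkpoint options are only passed when tar identifies itself as GNU tar, and the interval of 10000 records (about 100 MB with GNU tar's default 10 KiB record size) is an arbitrary choice.

```shell
#!/bin/sh
# extract_with_checkpoints ARCHIVE [DEST_DIR]
# Hypothetical helper: quiet extraction that prints a dot at intervals
# instead of listing every file name with -v.
extract_with_checkpoints() {
    archive=$1
    dest=${2:-.}

    mkdir -p "$dest" || return 1

    # Assumption: use pigz through tar when available, else plain -z.
    if command -v pigz >/dev/null 2>&1; then
        compress_opt=--use-compress-program=pigz
    else
        compress_opt=-z
    fi

    if tar --version 2>/dev/null | grep -q 'GNU tar'; then
        # --checkpoint and --checkpoint-action are GNU tar options.
        tar "$compress_opt" -xf "$archive" -C "$dest" \
            --checkpoint=10000 --checkpoint-action=dot
    else
        # Other tar implementations: extract without progress output.
        tar "$compress_opt" -xf "$archive" -C "$dest"
    fi
}
```

Usage would look like extract_with_checkpoints target.tar.gz /path/to/output. Small archives finish before the first checkpoint, so no dots appear for them.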