Let tar auto-detect the compression type when extracting from stdin
Solution 1
In the end, I realized that tar could not detect and decompress the archive from stdin because I was using GNU tar. BSD tar does this automatically without any problem, so I now use bsdtar instead of tar in my script.
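For illustration, here is a minimal sketch of that switch. It assumes bsdtar (from libarchive) is installed; the function name extract_stdin is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical wrapper: extract an archive of unknown compression from stdin.
# Assumes bsdtar (libarchive) is installed; it detects the compression
# format on its own, so no -z/-j/-J flag is passed.
extract_stdin() {
    # Extra arguments (for example -C DIR) are passed through to bsdtar.
    bsdtar -x -f - "$@"
}
```

Usage would look like `cat archive.tar.gz | extract_stdin -C /some/dir`, where the archive name and target directory are placeholders.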
Solution 2
This is not the answer you want to hear, but this is not supported by GNU tar according to its manual:
The only case when you have to specify a decompression option while reading the archive is when reading from a pipe or from a tape drive that does not support random access. However, in this case GNU tar will indicate which option you should use. For example:
$ cat archive.tar.gz | tar tf -
tar: Archive is compressed. Use -z option
tar: Error is not recoverable: exiting now
If you see such diagnostics, just add the suggested option to the invocation of GNU tar:
$ cat archive.tar.gz | tar tzf -
-- 8.1.1 Creating and Reading Compressed Archives
antelk
Updated on September 18, 2022
Comments
-
antelk over 1 year
I use GNU tar. It can auto-detect the compression type when compressing or decompressing files. But I need to decompress an archive from stdin, and the compression type is unknown. I noticed that tar can give me the correct suggestion, like:
tar: Archive is compressed. Use -z option
But I want tar to apply that compression option automatically, without asking me to pass it as an argument. How can I do that? Why doesn't tar just decompress, since it already knows the compression type?
Thank you!
-
peterh over 10 years
It wouldn't be hard to implement, but it has not been done so far.
-
DrColossos over 10 years
The reason it cannot do this is that tar does not know which type of data it has on stdin until after it reads it, and by then it is too late to call the uncompress program. It supports gzip, bzip2, and others. Working around this problem is not easy (it would have to buffer the data), so it just tells you to try again.
-
peterh over 10 years
@KevinPanko Why would it be too late? It is not too late. It could read, for example, the first 4K into a buffer, test its compression type (if there is one), then call the program. This buffering would take 10-20 lines of additional C code.
-
DrColossos over 10 years
At the point when it knows what data it has, it can no longer use a simple fork()/exec() method to pipe the data through an uncompress utility. The utility would read from the stdin pipe and the first 4K would now be missing. There is no way to put the data back into the pipe after reading it.
-
Ilmari Karonen over 10 years
@KevinPanko: It could be done by forking two processes, though, basically doing the equivalent of cat buffer - | gunzip. (Alternatively, non-blocking I/O could be used to avoid the need for the extra process.)
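The "cat buffer -" idea can be tried from a shell script without touching tar itself. The following is only a sketch under stated assumptions: untar_stdin is a hypothetical helper name, only the gzip, bzip2 and xz magic numbers are checked, and anything else is treated as an uncompressed tar stream.

```shell
#!/bin/sh
# Hypothetical helper: peek at the first bytes of stdin, pick the matching
# tar option, then replay the peeked bytes in front of the rest of the stream.
untar_stdin() {
    buf=$(mktemp) || return 1
    # bs=1 makes dd read exactly 6 bytes, so nothing extra leaves the pipe.
    dd bs=1 count=6 of="$buf" 2>/dev/null
    # Hex dump of the buffered bytes with all whitespace removed.
    magic=$(od -An -tx1 "$buf" | tr -d ' \n')
    case $magic in
        1f8b*)        flag=-z ;;  # gzip
        425a68*)      flag=-j ;;  # bzip2 ("BZh")
        fd377a585a00) flag=-J ;;  # xz
        *)            flag= ;;    # assumption: plain, uncompressed tar
    esac
    # Equivalent of "cat buffer - | gunzip", with tar doing the decompression.
    # $flag is left unquoted on purpose so an empty value disappears.
    cat "$buf" - | tar -x $flag -f - "$@"
    status=$?
    rm -f "$buf"
    return $status
}
```

It would be used as `cat archive.tar.gz | untar_stdin -C /some/dir`, with placeholder names. The temporary file stands in for the in-memory buffer discussed above, because shell variables cannot safely hold binary data.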
-
Sean Perry over 10 years
Note that this works with the tar command on OS X but not on Linux using GNU tar. So this could be fixed.