Unzip (zip, tar, tar.gz) files with Ruby
Solution 1
To extract files from a .tar.gz file you can use the following methods from packages distributed with Ruby:
require 'rubygems/package'
require 'zlib'
tar_extract = Gem::Package::TarReader.new(Zlib::GzipReader.open('Path/To/myfile.tar.gz'))
tar_extract.rewind # The extract has to be rewound after every iteration
tar_extract.each do |entry|
  puts entry.full_name
  puts entry.directory?
  puts entry.file?
  # puts entry.read
end
tar_extract.close
Each entry of type Gem::Package::TarReader::Entry points to a file or directory within the .tar.gz file.
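To illustrate working with a single entry, here is a minimal sketch that pulls the contents of one file out of the archive without extracting anything to disk. The helper name `read_tar_gz_entry` and its arguments are made up for this example; only `full_name`, `file?` and `read` come from the snippet above.

```ruby
require 'rubygems/package'
require 'zlib'

# Hypothetical helper (not part of RubyGems): returns the contents of one
# regular file inside a .tar.gz archive, or nil if no such entry exists.
def read_tar_gz_entry(archive_path, wanted_name)
  Zlib::GzipReader.open(archive_path) do |gz|
    Gem::Package::TarReader.new(gz) do |tar|
      tar.each do |entry|
        # to_s turns the nil that read can return for an empty file into ''
        return entry.read.to_s if entry.file? && entry.full_name == wanted_name
      end
    end
  end
  nil
end
```

Because the gzip stream is not seekable, the reader still has to walk the archive from the start until it reaches the wanted entry.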
Similar code can be used (replace Reader with Writer) to write files to a .tar.gz file.
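A rough sketch of the writing direction, assuming the data is already in memory as a hash of name => contents. The helper name `write_tar_gz` is hypothetical. It uses `add_file_simple` rather than `add_file` because, as far as I can tell, `add_file` needs to seek back in the output stream, which a `Zlib::GzipWriter` cannot do.

```ruby
require 'rubygems/package'
require 'zlib'

# Hypothetical helper (not part of RubyGems): packs a { name => contents }
# hash into a gzip-compressed tar archive at the given path.
def write_tar_gz(path, files)
  Zlib::GzipWriter.open(path) do |gz|
    Gem::Package::TarWriter.new(gz) do |tar|
      files.each do |name, contents|
        # add_file_simple takes the byte size up front, so it never has to
        # seek backwards in the (non-seekable) gzip stream.
        tar.add_file_simple(name, 0o644, contents.bytesize) do |io|
          io.write(contents)
        end
      end
    end
  end
end
```

Calling `write_tar_gz('out.tar.gz', 'a.txt' => 'alpha')` should produce an archive that the reading code above can list.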
Solution 2
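Although Florian's answer is right, it does not take tar LongLinks into account (try extracting jdk-7u40-linux-i586.tar.gz from Oracle :P). GNU tar stores a path that does not fit in the regular header in a separate ././@LongLink entry placed just before the real one. The following code should be able to handle this: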
require 'fileutils' # needed for FileUtils.mkdir_p, rm_rf and chmod below
require 'rubygems/package'
require 'zlib'
TAR_LONGLINK = '././@LongLink'
tar_gz_archive = '/path/to/archive.tar.gz'
destination = '/where/extract/to'
Gem::Package::TarReader.new( Zlib::GzipReader.open tar_gz_archive ) do |tar|
  dest = nil
  tar.each do |entry|
    # A LongLink entry carries the real path of the entry that follows it
    if entry.full_name == TAR_LONGLINK
      dest = File.join destination, entry.read.strip
      next
    end
    dest ||= File.join destination, entry.full_name
    if entry.directory?
      File.delete dest if File.file? dest
      FileUtils.mkdir_p dest, :mode => entry.header.mode, :verbose => false
    elsif entry.file?
      FileUtils.rm_rf dest if File.directory? dest
      File.open dest, "wb" do |f|
        f.print entry.read
      end
      FileUtils.chmod entry.header.mode, dest, :verbose => false
    elsif entry.header.typeflag == '2' #Symlink!
      File.symlink entry.header.linkname, dest
    end
    dest = nil
  end
end
Solution 3
Draco, thanks for your snippet. Some tar archives encode directories as paths ending with '/' (see this Wiki). An example is the Oracle Server JRE 7u80 archive for Windows. This will work for them:
require 'fileutils'
require 'rubygems/package'
require 'zlib'
TAR_LONGLINK = '././@LongLink'
tar_gz_archive = '/path/to/archive.tar.gz'
destination = '/where/extract/to'
Gem::Package::TarReader.new( Zlib::GzipReader.open tar_gz_archive ) do |tar|
  dest = nil
  tar.each do |entry|
    if entry.full_name == TAR_LONGLINK
      dest = File.join destination, entry.read.strip
      next
    end
    dest ||= File.join destination, entry.full_name
    # An empty typeflag marks an old-style entry; a trailing '/' then means directory
    if entry.directory? || (entry.header.typeflag == '' && entry.full_name.end_with?('/'))
      File.delete dest if File.file? dest
      FileUtils.mkdir_p dest, :mode => entry.header.mode, :verbose => false
    elsif entry.file? || (entry.header.typeflag == '' && !entry.full_name.end_with?('/'))
      FileUtils.rm_rf dest if File.directory? dest
      File.open dest, "wb" do |f|
        f.print entry.read
      end
      FileUtils.chmod entry.header.mode, dest, :verbose => false
    elsif entry.header.typeflag == '2' #Symlink!
      File.symlink entry.header.linkname, dest
    else
      puts "Unknown tar entry: #{entry.full_name} type: #{entry.header.typeflag}."
    end
    dest = nil
  end
end
gustavgans
Updated on July 09, 2020
Comments
-
gustavgans almost 4 years
I want to unzip a lot of zip files. Is there a module or script that checks which format the zip file is and decompresses it? This should work on Linux, I don't care about other OSs.
-
Jeff Dickey almost 8 years
Is :verbose => false necessary?
-
Draco Ater almost 8 years
No, I just didn't want anything to be printed in my use case.
-
Andrew Kane about 4 years
To expand on this great answer, you can extract files to disk with Gem::Package.new("").extract_tar_gz(io, destination), where io is an object like File.open("file.tar.gz", "rb").
-
lilkunien over 2 years
@AndrewKane worked perfectly, though passing '' to the package is kinda weird. How is it supposed to work?
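On the '' argument: as far as I can tell, `extract_tar_gz` only works with the `io` and destination it is handed, so the gem path passed to `Gem::Package.new` is just a placeholder that never gets opened. Here is a minimal sketch of Andrew Kane's suggestion wrapped in a helper. The helper name `extract_tar_gz_file` is hypothetical, and the sketch assumes the destination directory already exists. `extract_tar_gz` looks like an internal RubyGems method, so treat it as subject to change between versions.

```ruby
require 'rubygems/package'

# Hypothetical helper (not part of RubyGems): extracts a .tar.gz file into
# an existing directory using RubyGems' own extraction routine.
def extract_tar_gz_file(archive_path, destination)
  File.open(archive_path, 'rb') do |io|
    # The empty string only stands in for a gem path; this sketch assumes
    # extract_tar_gz never reads it.
    Gem::Package.new('').extract_tar_gz(io, destination)
  end
end
```

Usage would be something like `extract_tar_gz_file('file.tar.gz', '/where/extract/to')`.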