unzip specific extension only
Solution 1
Something along the lines of:
#!/bin/bash
cd ~/basedir/files
for file in *.zip ; do
newfile=$(echo "${file}" | sed -e 's/^files.//' -e 's/\.zip$//')
echo ":${newfile}:"
mkdir tmp
rm -rf "${newfile}"
mkdir "${newfile}"
cp "${newfile}.zip" tmp
cd tmp
unzip "${newfile}.zip"
find . -name '*.jpg' -exec cp {} "../${newfile}" ';'
find . -name '*.png' -exec cp {} "../${newfile}" ';'
find . -name '*.gif' -exec cp {} "../${newfile}" ';'
cd ..
rm -rf tmp
done
This is tested and will handle spaces in filenames (both the zip files and the extracted files). You may have collisions if the zip file has the same file name in different directories (you can't avoid this if you're going to flatten the directory structure).
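One way to soften the collision caveat is to keep each file's internal path as part of its flattened name. The following is a minimal sketch, not part of the answer above: flatten_images is a hypothetical helper name, and it assumes bash plus a find that supports -print0.

```shell
#!/bin/bash
# flatten_images SRC DEST  (hypothetical helper, for illustration only)
# Copies every .jpg/.png/.gif under SRC into DEST, turning the relative
# path into the file name so that a/pic.jpg and b/pic.jpg do not collide.
flatten_images() {
    local src=${1%/} dest=$2 path rel
    mkdir -p "$dest"
    # -print0 with read -d '' keeps names containing spaces intact.
    find "$src" -type f \( -name '*.jpg' -o -name '*.png' -o -name '*.gif' \) -print0 |
    while IFS= read -r -d '' path; do
        rel=${path#"$src"/}                 # path relative to SRC
        cp "$path" "$dest/${rel//\//_}"     # a/b/c.jpg -> a_b_c.jpg
    done
}
```

In the script above this would replace the find lines, for example flatten_images . "../${newfile}". Collisions are still possible in principle (a_b.jpg versus a/b.jpg), just far less likely.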
Solution 2
In Short
You can do this with a one-liner find + unzip.
find . -name "*.zip" -type f -exec unzip -jd "images/{}" "{}" "*.jpg" "*.png" "*.gif" \;
In Detail
unzip allows you to specify the files you want:
unzip archive.zip "*.jpg" "*.png" "*.gif"
And -d specifies a target directory:
unzip -d images/ archive.zip "*.jpg" "*.png" "*.gif"
Combine that with a find, and you can extract all the images in all zips:
find . -name "*.zip" -type f -exec unzip -d images/ {} "*.jpg" "*.png" "*.gif" \;
Using unzip -j to junk the zip's internal directory structure, we can do it all in one command. This gives you the flat list of images, separated by zip name, that you asked for, as a one-liner.
find . -name "*.zip" -type f -exec unzip -jd "images/{}" "{}" "*.jpg" "*.png" "*.gif" \;
A limitation is that unzip -d won't create more than one new level of directories, so just mkdir images first. Enjoy.
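If the target folders should be named after the archive without the .zip suffix (files/archive1/ rather than a folder carrying the full zip file name), a plain loop is one way to get there. This is a sketch under assumptions, not the answer's own method: extract_images is a hypothetical name, and -o (overwrite without prompting) is added so that a batch run is not interrupted by duplicate names.

```shell
#!/bin/bash
# extract_images BASEDIR  (hypothetical helper, for illustration only)
# For each BASEDIR/name.zip, extract only the image members into BASEDIR/name/.
extract_images() {
    local basedir=${1%/} zip target
    for zip in "$basedir"/*.zip; do
        [ -f "$zip" ] || continue      # skip if the glob matched nothing
        target=${zip%.zip}             # files/archive1.zip -> files/archive1
        mkdir -p "$target"
        # -j drops the internal directory structure, -o overwrites silently,
        # -d sets the output directory; quoted patterns select the members.
        unzip -j -o "$zip" '*.jpg' '*.png' '*.gif' -d "$target"
    done
}
```

Called as extract_images files, this handles spaces in archive names because every expansion is quoted. unzip reports a non-zero status when one of the patterns matches nothing in an archive, so the loop deliberately does not stop on errors. The patterns are case-sensitive; unzip's -C option makes the matching case-insensitive if some archives contain .JPG files.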
Solution 3
7zip can do this, and has a Linux version.
mkdir files/archive1
7z e -ofiles/archive1/ files/archive1.zip '*.jpg' '*.png' '*.gif'
(Just tested it, it works.)
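To run this over many archives rather than one, the same command can be wrapped in a loop. The sketch below is an assumption-laden illustration rather than the answer's tested command: extract_images_7z is a hypothetical name, -r is added so the patterns also match files inside subfolders of the archive, and -y answers any overwrite prompt automatically.

```shell
#!/bin/bash
# extract_images_7z BASEDIR  (hypothetical helper, for illustration only)
# For each BASEDIR/name.zip, extract only the image members into BASEDIR/name/.
extract_images_7z() {
    local basedir=${1%/} zip target
    for zip in "$basedir"/*.zip; do
        [ -f "$zip" ] || continue      # skip if the glob matched nothing
        target=${zip%.zip}             # files/archive1.zip -> files/archive1
        # "e" extracts without directory structure; -o<dir> sets the output
        # directory (no space after -o); patterns are quoted so the shell
        # does not expand them against the current directory.
        7z e "$zip" -o"$target" '*.jpg' '*.png' '*.gif' -r -y
    done
}
```

Usage would be extract_images_7z files. Depending on the distribution, the binary may be installed as 7za or 7zr instead of 7z.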
Solution 4
You can write a program using a zip library. If you do Mono, you can use DotNetZip.
The code would look like this:
foreach (var archive in listOfZips)
{
using (var zip = ZipFile.Read(archive))
{
foreach (ZipEntry e in zip)
{
if (IsImageFile(e.FileName))
{
e.FileName = System.IO.Path.Combine(archive.Replace(".zip",""),
System.IO.Path.GetFileName(e.FileName));
e.Extract("files");
}
}
}
}
Solution 5
Perl's Archive-Zip is a good library for zipping/unzipping.
vache
Updated on July 27, 2022
Comments
-
vache almost 2 years
I have a directory with zip archives containing .jpg, .png, and .gif images. I want to unzip each archive, taking only the images and putting them in a folder with the name of the archive.
So:
files/archive1.zip files/archive2.zip files/archive3.zip files/archive4.zip
Open archive1.zip and take sunflower.jpg and rose_sun.gif. Make a folder files/archive1/ and add the images to it, giving files/archive1/sunflower.jpg and files/archive1/rose_sun.gif. Do this for each archive.
I really don't know how this can be done; all suggestions are welcome. I have over 600 archives, and an automatic solution would be a lifesaver, preferably a Linux solution.
-
MiffTheFox almost 15 years
This would be an excellent solution, except that the temp directory ends up wasting I/O and system resources. You should add wildcards to the unzip call. (Add '*.jpg' '*.png' '*.gif' to the end.) Also, you should avoid copying the zip file, and instead use "unzip ../${newfile}.zip".
-
paxdiablo almost 15 years
I don't believe that level of efficiency is a real concern here; this looks like a one-shot operation to me (or one that wouldn't be done often enough to warrant over-engineering). The end result is what the OP wanted: the graphic files in a specific directory based on the archive name.
-
vache almost 15 years
Well, I want to keep it on Linux, so preferably not .NET, but then I should be able to do this same thing using, say, a Java zip library, no?
-
vache almost 15 years
But then I would have to run this for each zip. Can I add a while loop or something?
-
vache almost 15 years
Yeah, this would be a one-time thing :). I am testing all these solutions locally right now, then will try them on a larger number of zips on the server.
-
vache almost 15 years
OK, so the big problem with this is that the archives have names with spaces in them, and the code above creates a bunch of folders, one for each space-separated word in the zip name.
-
vache almost 15 years
Simply, this one does something similar too: for file in zip/*.zip ; do newfile=$(echo ${file}) ; unzip ${newfile} '*.jpg' '*.png' '*.gif' ; done
-
vache almost 15 years
But it too breaks when there are spaces in the zip name.
-
paxdiablo almost 15 years
It's been fixed to handle spaces now in both the zip files and the zipped files within them.
-
vache almost 15 years
This will work, since the zips all have unique names and they are all in one folder.
-
vache almost 15 years
Adding the '*.jpg' '*.png' '*.gif' to the end of the unzip call speeds this up very much, and if you add that, then there is no need for the extension in the copy line, no?
-
paxdiablo almost 15 years
That's right, but you will need a "-type f" on the find to only get files rather than directories (I should have done that already in case you had a directory called dir.jpg). Replace the finds with a single "find . -type f -exec cp {} "../${newfile}" ';'"
-
Cheeso almost 15 years
Sorry, I don't know if the Java zip libraries have a similar function. I mean, I'm sure you could do it; it's a simple matter of programming. But the question is how much programming. When you say "I want to keep it on Linux, so preferably not .NET", are you aware that Mono runs on Linux? In other words, you can use C# and .NET on Linux.