Is it possible in Unix to search inside zip files?

Solution 1

I once needed something similar to find class files in a bunch of zip files. Here it is:

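#!/bin/bash

# Read `unzip -l` listings on stdin and print "archive: member" for every
# member whose name ends in abc.jpg.
process() {
    local line ar
    while IFS= read -r line; do
        # [[:space:]] is used instead of \s, which is not portable in [[ =~ ]]
        if [[ "$line" =~ ^Archive:[[:space:]]*(.*) ]]; then
            ar="${BASH_REMATCH[1]}"
        elif [[ "$line" =~ [[:space:]]*([^ ]*abc\.jpg)$ ]]; then
            echo "${ar}: ${BASH_REMATCH[1]}"
        fi
    done
}

find . -iname '*.zip' -exec unzip -l '{}' \; | process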

Now you only need to add one line to extract the files and maybe move them. I'm not sure exactly what you want to do, so I'll leave that to you.
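One way that missing extraction step might look, as a sketch rather than the original author's code: the hypothetical helpers dest_name and extract_matches below read the "archive: member" lines printed by process and write each member into a destination directory with unzip -p. Since every image is called abc.jpg, the sketch assumes the archive path should be folded into the output name to avoid clashes.

```shell
#!/bin/bash

# dest_name ARCHIVE MEMBER: build an output file name from the archive
# path and the member's base name (hypothetical naming scheme).
dest_name() {
    local ar=${1#./} member=$2
    ar=${ar%.[Zz][Ii][Pp]}                      # drop the .zip suffix
    printf '%s_%s\n' "${ar//\//_}" "${member##*/}"
}

# extract_matches DEST: read "archive: member" lines on stdin and write
# each member into DEST. Assumes archive paths contain no ": " sequence.
# Note that unzip treats *, ? and [ in the member name as wildcards.
extract_matches() {
    local dest=$1 line ar member
    while IFS= read -r line; do
        ar=${line%%: *}
        member=${line#*: }
        unzip -p "$ar" "$member" > "$dest/$(dest_name "$ar" "$member")"
    done
}
```

It would be appended to the pipeline above, for example: find . -iname '*.zip' -exec unzip -l '{}' \; | process | extract_matches /destination/directory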

Solution 2

If your unix variant supports FUSE (Linux, *BSD, OSX, Solaris all do), mount AVFS to access archives transparently. The command mountavfs creates a view of the whole filesystem, rooted at ~/.avfs, in which archive files have an associated directory that contains the directories and files in the archive. For example, if you have foo.zip in the current directory, then the following command is roughly equivalent to unzip -l foo.zip:

mountavfs    # needs to be done once and for all
find ~/.avfs"$PWD"/foo.zip\# -ls
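The naming rule used here (the AVFS root, then the absolute path of the archive, then a trailing #) can be wrapped in a small helper. This is only a sketch: avfs_view is a hypothetical name, and it assumes AVFS is mounted at ~/.avfs as described above.

```shell
#!/bin/bash

# avfs_view PATH: print the AVFS directory that exposes the contents of
# the archive at PATH. It only builds the name and does not check that
# AVFS is actually mounted at ~/.avfs (an assumption of this sketch).
avfs_view() {
    local abs
    case $1 in
        /*) abs=$1 ;;        # already absolute
        *)  abs=$PWD/$1 ;;   # make a relative path absolute
    esac
    printf '%s/.avfs%s#\n' "$HOME" "$abs"
}
```

With that in place, find "$(avfs_view foo.zip)" -ls lists the archive in the same way as the command above.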

So, to loop over all images contained in any zip file under the current directory and copy them to /destination/directory (with a prompt in case of a name clash):

find ~/.avfs"$PWD" -name '*.zip' -exec sh -c '
    find "${0}#" -name "*.jpg" -exec cp -ip {} "$1" \;
' {} /destination/directory \;

In zsh:

cp -ip ~/.avfs$PWD/**/*.zip(e\''reply=($REPLY\#/**/*.jpg(N))'\') /destination/directory

Deconstruction: ~/.avfs$PWD/**/*.zip expands to the AVFS view of the zip files under the current directory. The glob qualifier e runs a piece of code on each match and can modify the output of the glob: …/*.zip(e\''REPLY=$REPLY\#'\') would just append a # to each match. Setting the array reply instead, as in reply=($REPLY\#/**/*.jpg(N)), replaces each match with the list of .jpg files in the corresponding .zip# directory; the (N) qualifier makes that list empty when an archive contains no .jpg files.
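For readers without zsh, a rough bash equivalent of the same idea might look like the sketch below. It assumes Bash 4.0 or later (for globstar); collect_from_avfs is a hypothetical name, and its first argument is expected to be the AVFS view of the tree to search, such as ~/.avfs"$PWD".

```shell
#!/bin/bash

# collect_from_avfs ROOT DEST: for every *.zip below ROOT (an AVFS view),
# copy the .jpg files found in its "#" directory into DEST.
collect_from_avfs() {
    local root=$1 dest=$2 zip img
    (
        # The subshell keeps these options from leaking into the caller.
        shopt -s globstar nullglob
        for zip in "$root"/**/*.zip; do
            for img in "$zip#"/**/*.jpg; do
                cp -ip "$img" "$dest"    # -i prompts on a name clash
            done
        done
    )
}
```

A call such as collect_from_avfs ~/.avfs"$PWD" /destination/directory would then play the role of the zsh one-liner.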

Solution 3

Assuming you have Bash 4.0 or later, which added the globstar option, you should be able to use this:

shopt -s globstar
for path in topdir/**/*.zip
do
    unzip "$path" '*abc.jpg'    # unzip takes shell-style wildcards, not regular expressions
done

Solution 4

Similar to Kim's answer, but slightly modified. Just use sed:

find . -name '*.zip' -exec unzip -l '{}' \; | sed -n -e '/^Archive/ {h;}' -e '/abc\.jpg$/ {x;p;x;}'
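The sed script keeps the most recent Archive line in the hold space and prints it whenever a member name ends in abc.jpg. As a variation (a sketch, not part of the original answer), the hold space can also be used to print the archive and the member together on one line; label_matches is a hypothetical name, and the script assumes member names contain no whitespace.

```shell
#!/bin/bash

# label_matches: read `unzip -l` output on stdin and print
# "archive: member" for each member ending in abc.jpg.
label_matches() {
    sed -n \
        -e '/^Archive:/{s/^Archive:[[:space:]]*//;h;}' \
        -e '/abc\.jpg$/{s/.*[[:space:]]//;G;s/\(.*\)\n\(.*\)/\2: \1/;p;}'
    # 1st script: strip the "Archive:" label and hold the archive path.
    # 2nd script: keep only the member name, append the held path with G,
    #             then swap the two parts around a ": " separator.
}
```

It would be used as find . -name '*.zip' -exec unzip -l '{}' \; | label_matches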

Author by

Mirage

Updated on September 18, 2022

Comments

  • Mirage
    Mirage almost 2 years

    I have hundreds of directories, and within those I have a few zip files. There are images named abc.jpg in those zip files. The zip files may be in any folder or subfolder, so it's difficult to extract them all in one place.

    I just want to collect those image files. Is this possible?

    • Michael Mrozek
      Michael Mrozek about 13 years
      You can use zip -sf foo.zip | grep abc.jpg to determine whether an archive contains abc.jpg; that should help. I don't have time to figure out the complete command now, but I'll try later if nobody else has answered.
  • Kim
    Kim about 13 years
    nice solution, but FUSE? that's a little bit overkill, isn't it?
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' about 13 years
    @Kim: Why should I cripple myself not to use FUSE if it's available?
  • Kim
    Kim about 13 years
    some reasons not to use FUSE if you don't have to: portability (some OSs don't have FUSE), maintainability (not everybody knows FUSE)