Check if a directory exists in a zip file with Python

Solution 1

Check whether any entry name in the archive starts with the directory name followed by "/".

import zipfile

def isdir(z, name):
    return any(x.startswith("%s/" % name.rstrip("/")) for x in z.namelist())

# The context manager closes the archive when the block ends.
with zipfile.ZipFile("sample.zip", "r") as f:
    print(isdir(f, "a"))
    print(isdir(f, "a/b"))
    print(isdir(f, "a/X"))

You use this line

any(x.startswith("%s/" % name.rstrip("/")) for x in z.namelist())

because the archive may contain no explicit entry for the directory itself, only file paths that include the directory name (for example word/webSettings.xml with no separate word/ entry).

Execution result:

$ mkdir -p a/b/c/d
$ touch a/X
$ zip -r sample.zip a
adding: a/ (stored 0%)
adding: a/X (stored 0%)
adding: a/b/ (stored 0%)
adding: a/b/c/ (stored 0%)
adding: a/b/c/d/ (stored 0%)

$ python z.py
True
True
False
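
The same prefix check can be tried without a file on disk. The following is only a sketch, assuming Python 3: it builds a small archive in memory with io.BytesIO, and the entry names are made up for illustration. The archive deliberately stores files only, with no explicit directory entries.

```python
import io
import zipfile


def isdir(z, name):
    # A directory "exists" if any entry name lies under "<name>/".
    prefix = name.rstrip("/") + "/"
    return any(x.startswith(prefix) for x in z.namelist())


# Illustrative archive: files only, no explicit "word/" entry.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("word/document.xml", "<doc/>")
    z.writestr("word/_rels/document.xml.rels", "<rels/>")

with zipfile.ZipFile(buf) as z:
    found_word = isdir(z, "word")               # implicit directory
    found_rels = isdir(z, "word/_rels")         # nested implicit directory
    found_file = isdir(z, "word/document.xml")  # a file, not a directory
    found_exact = "word/" in z.namelist()       # exact-match lookup

print(found_word, found_rels, found_file, found_exact)
# prints: True True False False
```

The last value shows why an exact lookup in namelist() is not enough for archives like this one.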

Solution 2

You can check for the directory with ZipFile.namelist(). This only finds directories that are stored as explicit entries, whose names end with "/".

import zipfile

# Directory entries in a zip archive end with "/".
dirname = "some/directory/"

with zipfile.ZipFile("myfile.zip") as z:
    if dirname in z.namelist():
        print("Found %s!" % dirname)
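
If the archive stores only files, the exact lookup misses the directory. One possible workaround, sketched here under the assumption of Python 3, is to derive every parent directory from the entry names first. The helper name all_dirs and the sample entry are hypothetical, not part of zipfile.

```python
import io
import posixpath
import zipfile


def all_dirs(z):
    """Hypothetical helper: every directory implied by the entry names."""
    dirs = set()
    for name in z.namelist():
        # Explicit directory entries end with "/"; for files, start at the parent.
        path = name.rstrip("/") if name.endswith("/") else posixpath.dirname(name)
        # Stop at the top level (also guards against names with a leading "/").
        while path.strip("/"):
            dirs.add(path + "/")
            path = posixpath.dirname(path)
    return dirs


# Illustrative archive with one file and no explicit directory entries.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("some/directory/file.txt", "data")

with zipfile.ZipFile(buf) as z:
    in_namelist = "some/directory/" in z.namelist()
    dirs = all_dirs(z)

print(in_namelist)   # prints: False
print(sorted(dirs))  # prints: ['some/', 'some/directory/']
```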

Solution 3

For Python >= 3.6:

This is how ZipInfo.is_dir() is implemented in the Python source code:

def is_dir(self):
    """Return True if this archive member is a directory."""
    return self.filename[-1] == '/'

It simply checks whether the filename ends with a slash (/). I can't tell whether this works correctly in all circumstances, so in my opinion it is poorly implemented.
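
A minimal usage sketch, assuming Python 3.6 or later: is_dir() is called on the ZipInfo objects returned by infolist(). The archive is built in memory and its entry names are illustrative. Note that is_dir() can only report on entries that actually exist in the archive; a directory that is merely implied by a file path has no ZipInfo of its own.

```python
import io
import zipfile

# Illustrative archive with one explicit directory entry and one file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("a/", "")          # explicit directory entry
    z.writestr("a/X", "content")  # regular file

with zipfile.ZipFile(buf) as z:
    # Map each entry name to the result of ZipInfo.is_dir().
    flags = {info.filename: info.is_dir() for info in z.infolist()}

print(flags)
# prints: {'a/': True, 'a/X': False}
```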

For Python < 3.6:

print(zipinfo) shows a filemode, but no corresponding property or field is provided, so I dug into the zipfile module source code to find out how it is implemented (see def __repr__(self): in https://github.com/python/cpython/blob/3.6/Lib/zipfile.py).

This is possibly a bad idea, but if you want something simple and easy, it will work in most cases. It may fail because in some cases the filemode field is not printed (__repr__ omits it when the upper 16 bits of external_attr are zero):

def is_dir(zipinfo):
    # Relies on the repr text, e.g. filemode='drwxr-xr-x' for a directory.
    return "filemode='d" in repr(zipinfo)

Finally:

My solution is to check the file mode manually and decide whether the referenced entry is actually a directory, inspired by https://github.com/python/cpython/blob/3.6/Lib/zipfile.py line 391.

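def is_dir(fileinfo):
    # The upper 16 bits of external_attr hold the Unix file mode.
    hi = fileinfo.external_attr >> 16
    # 0x4000 is the directory bit (stat.S_IFDIR).
    return (hi & 0x4000) > 0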
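
To see the mode check in action, here is a sketch assuming Python 3. It swaps the raw 0x4000 mask for stat.S_ISDIR, which is my substitution rather than the code above, and it sets external_attr by hand on two illustrative entries. Archives written by tools that do not record Unix modes may leave these bits at zero, in which case this check returns False even for a directory.

```python
import io
import stat
import zipfile


def is_dir(fileinfo):
    # The upper 16 bits of external_attr hold the Unix mode, if one was recorded.
    return stat.S_ISDIR(fileinfo.external_attr >> 16)


# Illustrative archive: the modes are set explicitly for the demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    d = zipfile.ZipInfo("a/")
    d.external_attr = (stat.S_IFDIR | 0o755) << 16  # drwxr-xr-x
    z.writestr(d, "")
    f = zipfile.ZipInfo("a/X")
    f.external_attr = (stat.S_IFREG | 0o644) << 16  # -rw-r--r--
    z.writestr(f, "content")

with zipfile.ZipFile(buf) as z:
    results = {info.filename: is_dir(info) for info in z.infolist()}

print(results)
# prints: {'a/': True, 'a/X': False}
```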

Author by

Stupid.Fat.Cat

I like to sleep, a lot....

Updated on September 15, 2022

Comments

  • Stupid.Fat.Cat
    Stupid.Fat.Cat over 1 year

    Initially I was thinking of using os.path.isdir but I don't think this works for zip files. Is there a way to peek into the zip file and verify that this directory exists? I would like to prevent using unzip -l "$@" as much as possible, but if that's the only solution then I guess I have no choice.

  • Stupid.Fat.Cat
    Stupid.Fat.Cat almost 12 years
    This works for files but not directories :( at least not for me.
  • enderskill
    enderskill almost 12 years
    Try printing the namelist() of your .zip file to make sure your directory is formatted correctly.
  • Stupid.Fat.Cat
    Stupid.Fat.Cat almost 12 years
    Thanks! Well this worked with the sample you provided. But I'm trying to do this for docx files. Essentially I'm checking if the zip file contains the directory "word", but it's giving me false responses :(
  • Stupid.Fat.Cat
    Stupid.Fat.Cat almost 12 years
    Yea, I made sure the directory is there. I'm trying to do it for docx files, which are zip files anyways so that shouldn't matter right?
  • Igor Chubin
    Igor Chubin almost 12 years
    Just try to print the list of files in your docx and see what is strange with it: print zipfile.ZipFile("sample.docx", "r").namelist()
  • Igor Chubin
    Igor Chubin almost 12 years
    I suppose that you have some prefix before word. Please check it.
  • Stupid.Fat.Cat
    Stupid.Fat.Cat almost 12 years
    word/_rels/document.xml.rels This is a file contained in it, I printed it straight out of z.namelist()
  • Lanaru
    Lanaru almost 12 years
    You are trying to use a docx file instead of a zip? Rename the extension to .zip and try it again, it should work.
  • Igor Chubin
    Igor Chubin almost 12 years
    I fixed the function according to your needs. Please try it.
  • Stupid.Fat.Cat
    Stupid.Fat.Cat almost 12 years
    I'm trying to find the folder "word"; I don't care about the contents. I noticed it works if I give it a file, but not when I just give it the directory word
  • Stupid.Fat.Cat
    Stupid.Fat.Cat almost 12 years
    It works fine unzipping, and I can get it to print all the files. But the directory "word" is not in namelist(), rather individual files, such as word/webSettings.xml so it's not getting a match.
  • Stupid.Fat.Cat
    Stupid.Fat.Cat almost 12 years
    Oh I found the issue, the list doesn't contain the directory "word" by itself, rather it contains all the files.
  • Igor Chubin
    Igor Chubin almost 12 years
    Have you checked my function? I'm sure it must work now. Please check it
  • Stupid.Fat.Cat
    Stupid.Fat.Cat almost 12 years
    This worked great, thanks! I tweaked it a little and it works perfectly