Download Returned Zip file from URL
Solution 1
Most people recommend using requests if it is available, and the requests documentation recommends this for downloading and saving raw data from a url:
import requests

def download_url(url, save_path, chunk_size=128):
    r = requests.get(url, stream=True)
    with open(save_path, 'wb') as fd:
        for chunk in r.iter_content(chunk_size=chunk_size):
            fd.write(chunk)
Since the question asks about downloading and saving the zip file, I haven't gone into details regarding reading the zip file. See one of the many answers below for possibilities.
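For the follow-up step of reading the archive once you have the bytes, here is a minimal stdlib sketch. It builds a small zip in memory to stand in for downloaded content (in practice the bytes would come from the saved file or from r.content); the member name hello.txt is made up for illustration:

```python
import io
import zipfile

# Build a small zip in memory to stand in for downloaded bytes.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('hello.txt', 'hello from the archive')

# Reading it back, as you would after downloading:
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    names = zf.namelist()                          # list the members
    text = zf.read('hello.txt').decode('utf-8')    # read one member's bytes

print(names)  # ['hello.txt']
print(text)   # hello from the archive
```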
If for some reason you don't have access to requests, you can use urllib.request instead. It may not be quite as robust as the above.
import urllib.request

def download_url(url, save_path):
    with urllib.request.urlopen(url) as dl_file:
        with open(save_path, 'wb') as out_file:
            out_file.write(dl_file.read())
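One caveat with the version above: dl_file.read() loads the entire response into memory. If that matters, a sketch of the same function using shutil.copyfileobj to stream the response to disk in fixed-size chunks (the function name simply mirrors the one above):

```python
import shutil
import urllib.request

def download_url(url, save_path):
    # copyfileobj streams in chunks (16 KB by default), so the whole
    # response never needs to fit in memory at once.
    with urllib.request.urlopen(url) as dl_file, open(save_path, 'wb') as out_file:
        shutil.copyfileobj(dl_file, out_file)
```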
Finally, if you are still using Python 2, you can use urllib2.urlopen.
import urllib2
from contextlib import closing

def download_url(url, save_path):
    with closing(urllib2.urlopen(url)) as dl_file:
        with open(save_path, 'wb') as out_file:
            out_file.write(dl_file.read())
Solution 2
As far as I can tell, the proper way to do this is:
import requests, zipfile, StringIO
r = requests.get(zip_file_url, stream=True)
z = zipfile.ZipFile(StringIO.StringIO(r.content))
z.extractall()
Of course you'd want to check that the GET was successful with r.ok.
For Python 3+, substitute the io module for the StringIO module and use BytesIO instead of StringIO: here are release notes that mention this change.
import requests, zipfile, io
r = requests.get(zip_file_url)
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall("/path/to/destination_directory")
Solution 3
With the help of this blog post, I've got it working with just requests. The point of the weird stream thing is that we don't need to call content on large requests, which would require the whole response to be processed at once, clogging the memory. With stream=True, the data is iterated one chunk at a time.
import requests

url = 'https://www2.census.gov/geo/tiger/GENZ2017/shp/cb_2017_02_tract_500k.zip'
response = requests.get(url, stream=True)
with open('alaska.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=512):
        if chunk:  # filter out keep-alive new chunks
            f.write(chunk)
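One gotcha worth hedging against with any of these approaches: if the server returns an error page, you end up saving HTML bytes under a .zip name. The stdlib zipfile.is_zipfile check (it accepts a path or a file-like object) can catch that before you try to extract; a small self-contained sketch with an in-memory zip standing in for a real download:

```python
import io
import zipfile

# A valid zip built in memory, standing in for a successful download.
good = io.BytesIO()
with zipfile.ZipFile(good, 'w') as zf:
    zf.writestr('data.csv', 'a,b\n1,2\n')

ok = zipfile.is_zipfile(io.BytesIO(good.getvalue()))
bad = zipfile.is_zipfile(io.BytesIO(b'<html>404 Not Found</html>'))
print(ok, bad)  # True False
```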
Solution 4
Here's what I got to work in Python 3:
import zipfile, urllib.request, shutil

url = 'http://www....myzipfile.zip'
file_name = 'myzip.zip'

with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)

with zipfile.ZipFile(file_name) as zf:
    zf.extractall()
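If you want progress reporting while extracting (asked in the comments), extractall() can be replaced by a loop over infolist(). A sketch using an in-memory archive so it runs standalone; with a real download you would open the ZipFile from file_name instead, and the member names here are made up:

```python
import io
import zipfile

# Build a two-member archive in memory to stand in for a downloaded zip.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('a.txt', 'first')
    zf.writestr('b.txt', 'second')

extracted = []
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    members = zf.infolist()
    for i, member in enumerate(members, 1):
        data = zf.read(member)  # zf.extract(member, dest_dir) would write to disk
        extracted.append(member.filename)
        print(f'{i}/{len(members)} extracted: {member.filename} ({len(data)} bytes)')
```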
Solution 5
Super lightweight solution to save a .zip file to a location on disk (using Python 3.9):
import requests

url = r'https://linktofile'
output = r'C:\pathtofolder\downloaded_file.zip'

r = requests.get(url)
with open(output, 'wb') as f:
    f.write(r.content)
user1229108
Updated on July 12, 2022

Comments
- user1229108 almost 2 years: If I have a URL that, when submitted in a web browser, pops up a dialog box to save a zip file, how would I go about catching and downloading this zip file in Python?
- Zeinab Abbasimazar over 5 years: I tried the section "Downloading a binary file and writing it to disk" of this page, which worked like a charm.
- 0atman about 12 years: But how do you parse results.content into a zip?
- aravenel about 12 years: Use the zipfile module: zip = zipfile.ZipFile(results.content). Then just parse through the files using ZipFile.namelist(), ZipFile.open(), or ZipFile.extractall()
- gr1zzly be4r about 8 years: Thanks for this answer. I used it to solve my issue getting a zip file with requests.
- newGIS almost 8 years: yoavram, in your code, where do I enter the url of the webpage?
- user799188 over 7 years: If you'd like to save the downloaded file in a different location, replace z.extractall() with z.extractall("/path/to/destination_directory")
- yoavram about 7 years: @newGIS I hope you figured it out by now, but the url of the zip you want to download is zip_file_url.
- AppleGate0 over 6 years: This is awesome.
- Anirban Nag 'tintinmj' over 6 years: @yoavram I was desperately looking for this answer. Can you tell me how to save the content as a ".zip" file? If I do extractall() it extracts the content. I don't want that.
- yoavram over 6 years: If you just want to save the file from the url you can do: urllib.request.urlretrieve(url, filename).
- Frikster about 6 years: To help others connect the dots (it took me 60 minutes too long): you can then use pd.read_table(z.open('filename')) with the above. Useful if you have a zip url link that contains multiple files and you're only interested in loading one.
- mypetlion almost 6 years: Answers should not rely on links for the bulk of their content. Links can go dead, or the content on the other side can be changed to no longer answer the question. Please edit your answer to include a summary or explanation of the information your link points to.
- Varadaraju G about 5 years: How to print the status of extracting?
- Victor M Herasme Perez almost 5 years: Hello. How can I avoid this error: urllib.error.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.?
- Webucator almost 5 years: @VictorHerasmePerez, an HTTP 302 response status code means that the page has been moved. I think the issue you're facing is addressed here: stackoverflow.com/questions/32569934/…
- Adil Blanco over 4 years: @yoavram How can I test these 3 lines if I put them in a function using Mock?
- karthik r over 4 years: not the right pattern according to 2.python-requests.org/en/master/user/quickstart/…
- Yossarian42 about 4 years: what if the .zip file is over 10GB, won't the get() mess up the memory?
- Sarvagya Dubey about 4 years: Can you please add the sample snippet as well? It would be so kind of you to do so.
- ayush thakur about 3 years: What is chunk_size here? And can this parameter affect the speed of downloading?
- Jeremiah England about 3 years: @ayushthakur Here are some links that may help: requests.Response.iter_content and wikipedia: Chunked Transfer Encoding. Someone else could probably give a better answer, but I wouldn't expect chunk_size to make much of a difference for download speed if it's set large enough (reducing the pings/content ratio). 512 bytes seems super small in retrospect.
- Mujeebur Rahman about 3 years: @Webucator What if the zipped folder contains several files? Then all those files will get extracted and stored in the system. I want to extract and get just one file from the zipped folder. Any way to achieve this?
- Theo F almost 3 years: @AtomStore yes? Is there an issue with my answer?
- Atom Store almost 3 years: how to bypass the alert, it downloads the html file rather than zip
- Theo F almost 3 years: My answer works for the link I tested with. Try using my code, but replacing the url with: api.os.uk/downloads/v1/products/CodePointOpen/… (open data from Ordnance Survey)
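Regarding the comment about pulling just one file out of a multi-file archive: ZipFile.extract() takes a single member name (and ZipFile.read() returns that member's bytes without touching disk). A hedged sketch with an in-memory archive standing in for the download; the member names keep.csv and ignore.txt are made up for illustration:

```python
import io
import zipfile

# An archive with several members, standing in for a downloaded zip.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('keep.csv', 'x,y\n1,2\n')
    zf.writestr('ignore.txt', 'not needed')

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    # read() pulls only the member you name; zf.extract('keep.csv', 'dest_dir')
    # would instead write just that one file to disk, leaving the rest alone.
    wanted = zf.read('keep.csv').decode('utf-8')

print(wanted)
```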