How do I download a file using urllib.request in Python 3?
Solution 1
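Change
f.write(g)
to
f.write(g.read())
urlopen() returns an HTTPResponse object, while f.write() expects bytes; g.read() returns the response body as bytes.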
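For completeness, here is a minimal sketch of the whole corrected flow. The function name download and its parameters are my own for illustration, not something from the question:

```python
import urllib.request


def download(url, dest_path):
    """Fetch url and write the response body to dest_path.

    Returns the number of bytes written. (Hypothetical helper, for illustration.)
    """
    # urlopen() returns a response object, not bytes, so read() it first.
    with urllib.request.urlopen(url) as response:
        data = response.read()
    # Binary write mode, because read() returns bytes.
    with open(dest_path, "wb") as f:
        return f.write(data)
```

Calling download('http://media-mcw.cursecdn.com/3/3f/Beta.png', 'test.png') should then do what the question was after. Note that read() loads the whole body into memory, which is fine for a small image.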
Solution 2
An easier way, I think (it also takes only two lines), is to use urlretrieve():
import urllib.request
urllib.request.urlretrieve('http://media-mcw.cursecdn.com/3/3f/Beta.png', 'test.png')
As for the method you used: when you call g = urllib.request.urlopen('http://media-mcw.cursecdn.com/3/3f/Beta.png') you are just opening the URL and getting back a response object. You must use g.read(), g.readlines() or g.readline() to read its contents.
It's just like reading a normal file (except for the syntax) and can be treated in a very similar way.
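Since the response behaves like a file object, one option for larger downloads is to copy it to disk in chunks instead of calling read() once. This is only a sketch: download_streaming and the chunk size are names and values I picked, not anything from the original code.

```python
import shutil
import urllib.request


def download_streaming(url, dest_path, chunk_size=64 * 1024):
    """Copy the response body to dest_path in chunks of chunk_size bytes.

    Hypothetical helper; avoids holding the whole file in memory.
    """
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as f:
        # copyfileobj() repeatedly calls response.read(chunk_size)
        # and writes each chunk to f until the body is exhausted.
        shutil.copyfileobj(response, f, chunk_size)
    return dest_path
```

This may also be worth knowing because the Python documentation lists urlretrieve() as a legacy interface that might become deprecated at some point.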
Nathan2055
Updated on June 05, 2022
Comments
-
Nathan2055 almost 2 years
So, I'm messing around with urllib.request in Python 3 and am wondering how to write the result of getting an internet file to a file on the local machine. I tried this:
g = urllib.request.urlopen('http://media-mcw.cursecdn.com/3/3f/Beta.png')
with open('test.png', 'b+w') as f:
    f.write(g)
But I got this error:
TypeError: 'HTTPResponse' does not support the buffer interface
What am I doing wrong?
NOTE: I have seen this question, but it's related to Python 2's urllib2 which was overhauled in Python 3.
-
Debug255 over 6 years
The PEP20 would have you use Request from urllib.request but yours would have a line less of code. Information about PEP20 for Request. You can use open() chained to file.write(url.read()) like you mentioned.
-
Xantium over 6 years
@Debug255 Are you sure? The link mentioned "Open the URL url, which can be either a string or a Request object." Here I specified a string, so I don't think Request is required in this case.
-
Debug255 about 6 years
That worked on debian9 using python3.5. I don't use 2.7 too much.
-
Robert Johnstone about 4 years
This doesn't work if you have to get round the 403: Forbidden issue using stackoverflow.com/a/16187955/563247
-
Xantium about 4 years
@Sevenearths That's true. However, that's a different issue. Out of all the files I have used Python to download/read, only a handful have ever given me a 403 error. I don't think this is a big enough reason to avoid urlretrieve(). Obviously, if that issue is encountered, then what you have linked is the way forward.
-
Robert Johnstone about 4 years
Interesting how experiences differ. While writing my app, the first URL I tried was https://medium.com/@tomaspueyo/coronavirus-the-hammer-and-the-dance-be9337092b56 and it gave me the 403: Forbidden. I wonder if it's just a Medium-related issue.
-
Xantium about 4 years
@Sevenearths 403 is a Forbidden error. This usually happens when a website (server) attempts to block a bot, or when you try to access a webpage with incorrect login/cert information (usually cookie-related in my experience, like passing outdated information, or similar). Seeing as the solution you listed uses a user agent, it strongly looks like that site attempts to block bots (which makes sense since it's a news site); a user agent tricks the server into thinking it's a legitimate browser.
-
Xantium about 4 years
@Sevenearths Personally, I usually use dedicated APIs (and this sort of thing never comes up, as they expect bots), which is probably why I don't encounter the problem much.
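
For anyone hitting the 403: Forbidden case discussed in the comments above, the user agent workaround can be sketched roughly like this. It also shows urlopen() accepting a Request object instead of a string. The helper name and the "Mozilla/5.0" value are placeholders I chose; whether a given site accepts it is not guaranteed.

```python
import urllib.request


def download_with_user_agent(url, dest_path, user_agent="Mozilla/5.0"):
    """Download url to dest_path, sending a custom User-Agent header.

    Hypothetical helper; the default user_agent is a placeholder value.
    """
    # Wrap the URL in a Request so extra headers can be attached.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # urlopen() accepts either a URL string or a Request object.
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as f:
        f.write(response.read())
    return request
```

urlretrieve() has no parameter for request headers, so this kind of case is where falling back to urlopen() with a Request makes sense.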