How do I download a file using urllib.request in Python 3?

python http python-3.x urllib

17,333

Solution 1

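Change

f.write(g)

to

f.write(g.read())

urlopen() returns an HTTPResponse object rather than bytes, so you have to call read() on it before writing.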
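Put together, the fix might look something like the sketch below. The `download` function name is just an illustration and not part of urllib; the `with` block around `urlopen()` is an addition so the connection gets closed.

```python
import urllib.request


def download(url, dest):
    """Fetch url and write the raw response bytes to dest (hypothetical helper)."""
    # urlopen() returns a response object, not bytes, so read() is needed
    with urllib.request.urlopen(url) as response:
        data = response.read()
    # 'wb' opens the target file for writing in binary mode
    with open(dest, 'wb') as f:
        f.write(data)
    return len(data)


# Example call with the URL from the question (not executed here):
# download('http://media-mcw.cursecdn.com/3/3f/Beta.png', 'test.png')
```

Note that this reads the whole file into memory before writing it, which is fine for a small image.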

Solution 2

An easier way, I think (and it only takes two lines), is to use:

import urllib.request
urllib.request.urlretrieve('http://media-mcw.cursecdn.com/3/3f/Beta.png', 'test.png')

As for the method you used: g = urllib.request.urlopen('http://media-mcw.cursecdn.com/3/3f/Beta.png') only opens the URL and returns a response object; it does not give you the file's contents. You must call g.read(), g.readlines() or g.readline() to read it.

It's just like reading a normal file (except for the syntax) and can be treated in a very similar way.
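Because the response behaves like a file, it can also be copied to disk in chunks instead of being read all at once. A minimal sketch, assuming a helper name (`download_streaming`) and a chunk size that are my own choices rather than anything from the answer:

```python
import shutil
import urllib.request


def download_streaming(url, dest, chunk_size=64 * 1024):
    """Copy the response body to dest in chunks (hypothetical helper)."""
    # The response is file-like, so copyfileobj() can read from it directly
    with urllib.request.urlopen(url) as response, open(dest, 'wb') as out:
        shutil.copyfileobj(response, out, chunk_size)


# Example call with the URL from the question (not executed here):
# download_streaming('http://media-mcw.cursecdn.com/3/3f/Beta.png', 'test.png')
```

This is probably only worth it for larger downloads; for a small PNG, read() or urlretrieve() is simpler.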

Author by

Nathan2055

Software engineer and graduate of Georgia State University located in Georgia.

Updated on June 05, 2022

Comments

Nathan2055 almost 2 years
So, I'm messing around with urllib.request in Python 3 and am wondering how to write the result of getting an internet file to a file on the local machine. I tried this:
```
g = urllib.request.urlopen('http://media-mcw.cursecdn.com/3/3f/Beta.png')
with open('test.png', 'b+w') as f:
    f.write(g)
```
But I got this error:
```
TypeError: 'HTTPResponse' does not support the buffer interface
```
What am I doing wrong?

NOTE: I have seen this question, but it's related to Python 2's urllib2 which was overhauled in Python 3.
Debug255 over 6 years

PEP 20 would have you use Request from urllib.request, but your version has one line less code. (Information about PEP 20 for Request.) You can use open() chained to file.write(url.read()) like you mentioned.
Xantium over 6 years

@Debug255 Are you sure? The link says "Open the URL url, which can be either a string or a Request object." Here I specified a string, so I don't think Request is required in this case.
Debug255 about 6 years

That worked on debian9 using python3.5. I don't use 2.7 too much.
Robert Johnstone about 4 years

This doesn't work if you have to get round the 403: Forbidden issue using stackoverflow.com/a/16187955/563247
Xantium about 4 years

@Sevenearths That's true. However, that's a different issue. Out of all the files I have used Python to download/read, only a handful have ever given me a 403 error. I don't think this is a big enough reason to avoid urlretrieve(). Obviously, if that issue is encountered, then what you have linked is the way forward.
Robert Johnstone about 4 years

Interesting how experiences differ. While writing my app, the first URL I tried was https://medium.com/@tomaspueyo/coronavirus-the-hammer-and-the-dance-be9337092b56 and it gave me the 403: Forbidden. I wonder if it's just a Medium-related issue.
Xantium about 4 years

@Sevenearths 403 is a Forbidden error. This usually happens when a website (server) attempts to block a bot, or when you try to access a webpage with incorrect login/cert information (usually cookie related in my experience, like passing outdated information or similar). Seeing as the solution you linked sets a user agent, it strongly looks like that site attempts to block bots (which makes sense since it's a news site); a user agent tricks the server into thinking it's a legitimate browser.
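For reference, sending a user agent with urllib.request could look roughly like this. The helper names and the 'Mozilla/5.0' value are assumptions for illustration, and whether a given site accepts it is not guaranteed:

```python
import urllib.request


def build_request(url, user_agent='Mozilla/5.0'):
    """Return a Request carrying a User-Agent header (hypothetical helper)."""
    # urlopen() accepts either a URL string or a Request object;
    # a Request is needed here so that a header can be attached
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def download_with_user_agent(url, dest, user_agent='Mozilla/5.0'):
    """Download url to dest, sending the given User-Agent (hypothetical helper)."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request) as response, open(dest, 'wb') as out:
        out.write(response.read())
```

urlretrieve() takes no headers argument, which is why this falls back to urlopen() with a Request.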
Xantium about 4 years

@Sevenearths Personally I usually use dedicated APIs (and this sort of thing never comes up, as they expect bots), which is probably why I don't encounter the problem much.