What are the differences between the urllib, urllib2, urllib3 and requests module?


Solution 1

I know it's been said already, but I'd highly recommend the requests Python package.

If you've used languages other than Python, you're probably thinking urllib and urllib2 are easy to use, need little code, and are highly capable; that's how I used to think. But the requests package is so unbelievably useful and concise that everyone should be using it.

First, it covers the HTTP verbs a RESTful API needs, and is as easy as:

import requests

resp = requests.get('http://www.mywebsite.com/user')
resp = requests.post('http://www.mywebsite.com/user')
resp = requests.put('http://www.mywebsite.com/user/put')
resp = requests.delete('http://www.mywebsite.com/user/delete')

Whether you use GET or POST, you never have to encode parameters yourself again; requests simply takes a dictionary as an argument (data= for a request body, params= for a query string) and is good to go:

userdata = {"firstname": "John", "lastname": "Doe", "password": "jdoe123"}
resp = requests.post('http://www.mywebsite.com/user', data=userdata)
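
To see what that automatic encoding produces, here is a small sketch that prepares the requests without sending them. It assumes the requests package is installed; the URL and the userdata dictionary are the placeholder values from the example above, and nothing goes over the network.

```python
import requests

userdata = {"firstname": "John", "lastname": "Doe", "password": "jdoe123"}

# Request(...).prepare() builds the request without sending it,
# so the encoded form can be inspected offline.
post = requests.Request(
    "POST", "http://www.mywebsite.com/user", data=userdata
).prepare()
get = requests.Request(
    "GET", "http://www.mywebsite.com/user", params=userdata
).prepare()

print(post.headers["Content-Type"])  # form encoding chosen by requests
print(post.body)                     # dictionary encoded as the request body
print(get.url)                       # dictionary encoded as the query string
```

With data= the dictionary ends up form-encoded in the body; with params= it ends up in the URL's query string.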

Plus it even has a built-in JSON decoder (again, I know json.loads() isn't a lot more to write, but this sure is convenient):

resp.json()

Or if your response data is just text, use:

resp.text

This is just the tip of the iceberg. This is the list of features from the requests site:

  • International Domains and URLs
  • Keep-Alive & Connection Pooling
  • Sessions with Cookie Persistence
  • Browser-style SSL Verification
  • Basic/Digest Authentication
  • Elegant Key/Value Cookies
  • Automatic Decompression
  • Unicode Response Bodies
  • Multipart File Uploads
  • Connection Timeouts
  • .netrc support
  • Python 2.7, 3.6—3.9
  • Thread-safe.

Solution 2

urllib2 provides some extra functionality: its urlopen() function lets you specify headers (in the past you'd normally have had to use httplib, which is far more verbose). More importantly, though, urllib2 provides the Request class, which allows for a more declarative approach to making a request:

import urllib
from urllib2 import Request, urlopen

r = Request(url='http://www.mysite.com')
r.add_header('User-Agent', 'awesome fetcher')
r.add_data(urllib.urlencode({'foo': 'bar'}))  # attaching data makes this a POST
response = urlopen(r)

Note that urlencode() is only in urllib, not urllib2.

There are also handlers for implementing more advanced URL support in urllib2. The short answer is, unless you're working with legacy code, you probably want to use the URL opener from urllib2, but you still need to import urllib for some of the utility functions.
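
For readers on Python 3, where urllib2 no longer exists, the same Request pattern might look roughly like the sketch below. It uses only the standard library (urllib.request and urllib.parse), reuses the placeholder URL and header from the example above, and builds the requests without sending them. The /user/put path is borrowed from Solution 1 purely for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Python 3 spelling of the Request example; nothing is sent over the network.
data = urlencode({"foo": "bar"}).encode("ascii")  # the body must be bytes
req = Request("http://www.mysite.com", data=data)
req.add_header("User-Agent", "awesome fetcher")

# Supplying data switches the default method from GET to POST.
method = req.get_method()

# Other verbs can be requested explicitly with the method argument.
put_req = Request("http://www.mysite.com/user/put", data=data, method="PUT")

# To actually send either request: urllib.request.urlopen(req)
```

Note that urlencode() lives in urllib.parse here, so both helpers come from the one urllib package.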

Bonus answer: With Google App Engine, you can use any of httplib, urllib or urllib2, but all of them are just wrappers for Google's URL Fetch API. That is, you are still subject to the same limitations such as ports, protocols, and the length of the response allowed. You can use the core of the libraries as you would expect for retrieving HTTP URLs, though.

Solution 3

This is my understanding of what the relations are between the various "urllibs":

In the Python 2 standard library there exist two HTTP libraries side-by-side. Despite the similar name, they are unrelated: they have a different design and a different implementation.

  • urllib was the original Python HTTP client, added to the standard library in Python 1.2. Earlier documentation for urllib can be found in Python 1.4.

  • urllib2 was a more capable HTTP client, added in Python 1.6, intended as a replacement for urllib:

    urllib2 - new and improved but incompatible version of urllib (still experimental).

    Earlier documentation for urllib2 can be found in Python 2.1.

The Python 3 standard library has a new urllib which is a merged/refactored/rewritten version of the older modules.

urllib3 is a third-party package (i.e., not in CPython's standard library). Despite the name, it is unrelated to the standard library packages, and there is no intention to include it in the standard library in the future.

Finally, requests internally uses urllib3, but it aims for an easier-to-use API.
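
As a rough illustration of the merged Python 3 layout, the sketch below shows where the commonly used Python 2 names ended up. It relies only on the standard library; the search URL and query values are made up for the example.

```python
# Python 2 name               -> Python 3 location
# urllib.urlencode, quote     -> urllib.parse.urlencode, quote
# urllib2.urlopen, Request    -> urllib.request.urlopen, Request
# urllib2.URLError, HTTPError -> urllib.error.URLError, HTTPError
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode, urlsplit
from urllib.request import Request, urlopen

# Build and take apart a URL without touching the network.
query = urlencode({"q": "urllib3 vs requests", "page": 2})
url = "http://www.mysite.com/search?" + query
parts = urlsplit(url)

print(query)
print(parts.netloc, parts.path)

# HTTPError is still a subclass of URLError, as it was in urllib2.
print(issubclass(HTTPError, URLError))
```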

Solution 4

urllib and urllib2 are both Python modules that do URL request related stuff but offer different functionalities.

1) urllib2 can accept a Request object to set the headers for a URL request; urllib accepts only a URL.

2) urllib provides the urlencode method, which is used to generate GET query strings; urllib2 doesn't have such a function. This is one of the reasons why urllib is often used along with urllib2.

Requests - Requests is a simple, easy-to-use HTTP library written in Python.

1) Python Requests encodes the parameters automatically, so you just pass them as simple arguments, unlike in the case of urllib, where you need to use the method urllib.urlencode() to encode the parameters before passing them.

2) It automatically decodes the response into Unicode.

3) Requests also has far more convenient error handling. If your authentication failed, urllib2 would raise a urllib2.HTTPError (a subclass of urllib2.URLError), while Requests would return a normal response object, as expected. All you have to do to see whether the request was successful is check the boolean response.ok.
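
To make point 3 concrete, here is a minimal sketch of the exception-based behaviour, written for Python 3, where urllib2's errors live in urllib.error. It starts a throwaway local server that always answers 401, so no external site is involved. The handler class and the helper names (AlwaysUnauthorized, fetch_status, demo) are hypothetical and exist only for this example.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysUnauthorized(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with 401."""

    def do_GET(self):
        self.send_response(401)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_status(url):
    """Return the HTTP status code, whether or not urllib raised."""
    # An empty ProxyHandler keeps proxy environment variables out of the demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx answer arrives as an exception, not a response object.
        return err.code


def demo():
    """Start the local server, fetch once, and shut it down again."""
    server = HTTPServer(("127.0.0.1", 0), AlwaysUnauthorized)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_status("http://127.0.0.1:%d/" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

With Requests, the equivalent call would hand back a response object whose ok attribute is False, so no try/except is needed just to read the status.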

Solution 5

Just to add to the existing answers, I don't see anyone mentioning that Python requests is not part of the standard library. If you are OK with adding dependencies, then requests is fine. However, if you are trying to avoid adding dependencies, urllib is a standard-library module that is already available to you.

Author by

Paul Biggar

I founded and ran CircleCI, now working on Dark, a startup to make coding 100x easier. In past lives, I've worked on the Javascript engine at Mozilla, done a PhD in compiler optimizations, static analyses and dynamic languages, and started a YCombinator startup.

Updated on July 08, 2022

Comments

  • Paul Biggar
    Paul Biggar almost 2 years

    In Python, what are the differences between the urllib, urllib2, urllib3 and requests modules? Why are there three? They seem to do the same thing...

  • Crast
    Crast over 14 years
    What you said about appengine is not entirely true. You can actually use httplib, urllib, and urllib2 in App Engine now (they are wrappers for url fetch, done so that more code would be compatible with appengine.)
  • Gattster
    Gattster over 14 years
    How does somebody create a url with an encoded query string using urllib2? It's the only reason I'm using urllib and I'd like to make sure I'm doing everything the latest/greatest way.
  • Chinmay Kanchi
    Chinmay Kanchi over 14 years
    Ah, must be new. My code failed last I tried and had to be rewritten to work with fetch...
  • Crast
    Crast over 14 years
    Like in my above example, you use urlopen() and Request from urllib2, and you use urlencode() from urllib. No real harm in using both libraries, as long as you make sure you use the correct urlopen. The [urllib docs][1] are clear that this is accepted usage. [1]: docs.python.org/library/urllib2.html#urllib2.urlopen
  • Admin
    Admin almost 13 years
    Just a note, be careful with urlencode as it can't handle <unicode> objects directly -- you have to encode them before sending them to urlencode (u'blá'.encode('utf-8'), or whatever).
  • allyourcode
    allyourcode about 12 years
  • Janus Troelsen
    Janus Troelsen over 11 years
    @user18015: I do not think this applies to Python 3, can you clarify?
  • Andrei-Niculae Petre
    Andrei-Niculae Petre almost 10 years
    I used this gist for urllib2.urlopen ; contains other variations too.
  • fkl
    fkl about 9 years
    urllib2 does not support put or delete which is a pain
  • nealmcb
    nealmcb over 6 years
    As I noted above, this question and the various answers should be updated to clarify that urllib in Python 3 is yet another option, cleaned up in various ways. But thankfully, the official documentation also notes that "The Requests package is recommended for a higher-level HTTP client interface." at 21.6. urllib.request — Extensible library for opening URLs — Python 3.6.3 documentation
  • nealmcb
    nealmcb over 6 years
    It would help to note that the Python 3 documentation has yet another distinct library urllib and that its documentation also officially notes that "The Requests package is recommended for a higher-level HTTP client interface." at 21.6. urllib.request — Extensible library for opening URLs — Python 3.6.3 documentation, and that urllib3 is a great library used by requests.
  • PirateApp
    PirateApp about 6 years
    what about urllib3?
  • Bob Stein
    Bob Stein almost 6 years
    Ok except I have the impression request has no replacement for urllib.parse()
  • Omer Dagan
    Omer Dagan over 5 years
    requests also allow custom headers: docs.python-requests.org/en/master/user/quickstart/…
  • Boris Verkhovskiy
    Boris Verkhovskiy over 4 years
    @PirateApp requests is built on top of urllib3. I think code using urllib3 directly can be more efficient, because it lets you reuse the session, whereas requests (at least requests 2, the one everyone uses) creates one for every request, but don't quote me on that. Neither are part of the standard library (yet)
  • Boris Verkhovskiy
    Boris Verkhovskiy over 4 years
    urllib2 doesn't exist at all in Python 3
  • Alan
    Alan about 4 years
    @Boris It migrated to urllib.request and urllib.error.
  • hlongmore
    hlongmore almost 4 years
    True, if you want to avoid adding any dependencies, urllib is available. But note that even the Python official documentation recommends the requests library: "The Requests package is recommended for a higher-level HTTP client interface."
  • Zeitgeist
    Zeitgeist almost 4 years
    @hlongmore Of course, most people wouldn't want to deal with low level urllib, and Requests library provides a nice level of abstraction. It's like using a pancake mix in a box versus making it from scratch. Pros and cons.
  • causaSui
    causaSui over 3 years
    Thank you for this answer. I came here because I had seen urllib3 and didn't know if I should use it or requests. Now I feel informed about how to make that decision going forward. The accepted answer gives a nice breakdown of requests but does not differentiate it from the alternatives.
  • Rich Lysakowski PhD
    Rich Lysakowski PhD about 3 years
    Yes, I too came here looking for the differences between Requests, urllib, urllib2, and urllib3 and felt dissatisfied with the accepted answer. This clarification should be added or at least linked to the accepted answer. Thank you.
  • chrisinmtown
    chrisinmtown about 3 years
    If you are afflicted by a corporate proxy, know that the requests module cheerfully honors environment variables http_proxy, https_proxy, no_proxy. The urllib3 module ignores environment variables; to send your queries via a proxy you must create an instance of ProxyManager instead of PoolManager.
  • Martijn Pieters
    Martijn Pieters almost 3 years
    It moved to urllib.parse.urlencode in Python 3.
  • VimNing
    VimNing almost 3 years
    @Andriy: What did you mean PS?
  • Tyler Crompton
    Tyler Crompton over 2 years
    I don't understand why this is the accepted answer. It didn't answer OP's question.