Is there an easy way to request a URL in Python and NOT follow redirects?

Solution 1

Here is the Requests way:

import requests
r = requests.get('http://github.com', allow_redirects=False)
print(r.status_code, r.headers['Location'])
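As one of the comments below points out, the Location header is only present on redirect responses, so indexing r.headers['Location'] directly can raise a KeyError. A minimal sketch of a defensive lookup follows; the helper name redirect_target and the set of status codes are assumptions made for illustration and are not part of Requests itself.

```python
# Status codes treated as redirects in this sketch (an assumption, adjust as needed).
REDIRECT_STATUSES = frozenset({301, 302, 303, 307, 308})


def redirect_target(status_code, headers):
    """Return the Location value for a redirect response, else None.

    `headers` can be any mapping with a .get() method, such as the
    headers attribute of a Requests response or a plain dict.
    """
    if status_code not in REDIRECT_STATUSES:
        return None
    # .get() avoids a KeyError when the server omitted the header.
    return headers.get("Location")
```

With Requests this would be called as redirect_target(r.status_code, r.headers) after r = requests.get(url, allow_redirects=False).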

Solution 2

Dive Into Python has a good chapter on handling redirects with urllib2. Another solution is httplib.

>>> import httplib
>>> conn = httplib.HTTPConnection("www.bogosoft.com")
>>> conn.request("GET", "")
>>> r1 = conn.getresponse()
>>> print r1.status, r1.reason
301 Moved Permanently
>>> print r1.getheader('Location')
http://www.bogosoft.com/new/location
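The session above is Python 2. In Python 3 the module is named http.client, and it likewise never follows redirects on its own. A rough equivalent might look like this; the function name probe_redirect and its parameters are made up for this sketch.

```python
import http.client


def probe_redirect(host, path="/", port=80, timeout=10):
    """Send a single GET and return (status, reason, Location or None).

    http.client issues exactly one request, so a 3xx response is
    returned as-is instead of being followed.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # drain the body before closing the connection
        return resp.status, resp.reason, resp.getheader("Location")
    finally:
        conn.close()
```

For an https URL, http.client.HTTPSConnection (default port 443) would take the place of HTTPConnection.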

Solution 3

This is a urllib2 handler that will not follow redirects:

import urllib
import urllib2

class NoRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        infourl = urllib.addinfourl(fp, headers, req.get_full_url())
        infourl.status = code
        infourl.code = code
        return infourl
    http_error_300 = http_error_302
    http_error_301 = http_error_302
    http_error_303 = http_error_302
    http_error_307 = http_error_302

opener = urllib2.build_opener(NoRedirectHandler())
urllib2.install_opener(opener)
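The handler above targets Python 2. In Python 3, urllib2 became urllib.request, and a port of the same idea could look roughly like the following. This is a sketch, not a drop-in replacement: it returns the response object passed to the handler directly rather than wrapping it in addinfourl, and the helper open_without_redirects is a name invented here.

```python
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        # Hand back the 3xx response itself instead of building
        # a follow-up request for the Location URL.
        return fp

    # Treat the other redirect codes the same way.
    http_error_300 = http_error_302
    http_error_301 = http_error_302
    http_error_303 = http_error_302
    http_error_307 = http_error_302
    http_error_308 = http_error_302


def open_without_redirects(url, timeout=10):
    """Open `url` and return the first response, even if it is a redirect."""
    # Passing a subclass instance makes build_opener skip the default
    # HTTPRedirectHandler, so only this handler sees 3xx responses.
    opener = urllib.request.build_opener(NoRedirectHandler())
    return opener.open(url, timeout=timeout)
```

The returned response exposes status and getheader('Location'), so the caller can decide whether to follow the redirect.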

Solution 4

The redirections keyword in the httplib2 request method is a red herring. Rather than returning the first response, it raises a RedirectLimit exception if it receives a redirection status code. To get the initial response, set follow_redirects to False on the Http object:

import httplib2
h = httplib2.Http()
h.follow_redirects = False
(response, body) = h.request("http://example.com")

Solution 5

I suppose this would help:

from httplib2 import Http

def get_html(uri, num_redirections=0):  # set to 0 to not follow redirects
    conn = Http()
    return conn.request(uri, redirections=num_redirections)
Author: John

Updated on April 21, 2020

Comments

  • John
    John about 4 years

    Looking at the source of urllib2 it looks like the easiest way to do it would be to subclass HTTPRedirectHandler and then use build_opener to override the default HTTPRedirectHandler, but this seems like a lot of (relatively complicated) work to do what seems like it should be pretty simple.

  • Carles Barrobés
    Carles Barrobés about 13 years
    Looks wrong... This code does actually follow the redirects (by calling the original handler, thus issuing an HTTP request), and then raises an exception.
  • Marian
    Marian almost 11 years
    How is this the shortest way? It doesn't even contain the import or the actual request.
  • Tim Wilder
    Tim Wilder about 10 years
    I'm unit testing an API and dealing with a login method that redirects to a page I don't care about, but doesn't send the desired session cookie with the response to the redirect. This is exactly what I needed for that.
  • mit
    mit about 10 years
    Everybody who comes here from Google, please note that the up-to-date way to go is this one: stackoverflow.com/a/14678220/362951 The requests library will save you a lot of headaches.
  • user
    user over 9 years
    I already was going to post this solution and was quite surprised to find this answer at the bottom. It is very concise and should be the top answer in my opinion.
  • user
    user over 9 years
    Moreover, it gives you more freedom, this way it's possible to control which URLs to follow.
  • patricksurry
    patricksurry over 7 years
    Then look at r.headers['Location'] to see where it would have sent you
  • Hamish
    Hamish almost 7 years
    Note that it seems that Requests will normalize Location to location.
  • Marian
    Marian almost 7 years
    @Hamish requests allows you to access headers both in the canonical form and in lowercase. See docs.python-requests.org/en/master/user/quickstart/…
  • guettli
    guettli about 5 years
    The link to "Dive Into Python" is dead.
  • Max von Hippel
    Max von Hippel over 4 years
    As of 2019 in Python 3, this no longer appears to work for me. (I get a key dict error.)
  • StashOfCode
    StashOfCode over 4 years
    I confirm, this is the easiest way. A short remark for those who want to debug: do not forget that you can set multiple handlers when building the opener, like: opener = urllib.request.build_opener(debugHandler, NoRedirect()) where debugHandler=urllib.request.HTTPHandler() and debugHandler.set_http_debuglevel(1). In the end: urllib.request.install_opener(opener)
  • user3504575
    user3504575 over 3 years
    Check r.status_code if it is not 301 there might have been another error. The Location header is only available for redirects. Use dict.get if you want to avoid KeyError on optional keys.
  • CS QGB
    CS QGB almost 3 years
    TypeError: request() got an unexpected keyword argument 'max_redirects'