Again urllib.error.HTTPError: HTTP Error 400: Bad Request


Solution 1

This URL seems to be doing user agent string checking. If I adjust my user agent string in Firefox to Python-urllib/2.7, it fails with the Bad Request you are seeing.

As you are using urllib, you can set the User-Agent string by subclassing FancyURLopener:

from urllib.request import FancyURLopener

class MyOpener(FancyURLopener):
    version = 'My new User-Agent'   # Set this to a string you want for your user agent

myopener = MyOpener()
page = myopener.open('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire')
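FancyURLopener has been deprecated since Python 3.3, so the same idea can also be sketched with urllib.request.Request, which accepts a headers dictionary. The helper names build_request and fetch and the User-Agent value are illustrative assumptions, not part of the original answer, and there is no guarantee that the server accepts any particular string.

```python
from urllib.request import Request, urlopen

URL = 'http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire'

# Hypothetical User-Agent value; substitute whatever string you need.
USER_AGENT = 'My new User-Agent'


def build_request(url, user_agent):
    """Build a Request carrying a custom User-Agent header (no network access)."""
    return Request(url, headers={'User-Agent': user_agent})


def fetch(url, user_agent, timeout=10):
    """Send the request and return the raw response body as bytes."""
    with urlopen(build_request(url, user_agent), timeout=timeout) as response:
        return response.read()
```

Calling fetch(URL, USER_AGENT) performs the actual request; build_request on its own only prepares it, which makes the header easy to check before anything is sent.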

Solution 2

They are probably rejecting the request because it does not appear to come from a browser. You most likely need to send a valid User-Agent header.

Using requests, this works:

import requests

# Present a browser-like User-Agent instead of the library default
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2049.0 Safari/537.36'
}

r = requests.get('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire', headers=headers)
print(r)          # the Response object, including its status code
print(r.headers)  # the response headers
Author by

Admin

Updated on July 05, 2022

Comments

  • Admin
    Admin almost 2 years

    Hi! I am trying to open a web page that opens normally in a browser, but Python raises an error and refuses to load it.

    import urllib.request, urllib.error
    f = urllib.request.urlopen('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire')
    

    And another way:

    import urllib.request, urllib.error
    opener = urllib.request.build_opener()
    f = opener.open('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire')
    

    Both options give the same error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python34\lib\urllib\request.py", line 461, in open
        response = meth(req, response)
      File "C:\Python34\lib\urllib\request.py", line 571, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python34\lib\urllib\request.py", line 493, in error
        result = self._call_chain(*args)
      File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
        result = func(*args)
      File "C:\Python34\lib\urllib\request.py", line 676, in http_error_302
        return self.parent.open(new, timeout=req.timeout)
      File "C:\Python34\lib\urllib\request.py", line 461, in open
        response = meth(req, response)
      File "C:\Python34\lib\urllib\request.py", line 571, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python34\lib\urllib\request.py", line 499, in error
        return self._call_chain(*args)
      File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
        result = func(*args)
      File "C:\Python34\lib\urllib\request.py", line 579, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 400: Bad Request
    

    Any ideas?
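
To see what the server actually sends back, the HTTPError can be caught and inspected, since the exception carries the status code, the reason phrase and the response headers. This is a minimal diagnostic sketch; describe_error and try_open are hypothetical helper names, not part of urllib.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def describe_error(err):
    """Summarise an HTTPError as '<code> <reason>'."""
    return '{} {}'.format(err.code, err.reason)


def try_open(url, timeout=10):
    """Return the response body, or None after printing details of an HTTP error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        print(describe_error(err))  # status code and reason phrase
        print(err.headers)          # headers of the error response
        return None
```

Calling try_open with the URL above would print the error details instead of a traceback, which helps when comparing the result with what a browser receives.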