Downloading HTTPS pages with urllib, error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error


Solution 1

I'm using the newest Kubuntu with Python 2.7.6

The latest Kubuntu (15.10) uses Python 2.7.10 as far as I know. But assuming you use the 2.7.6 that ships with 14.04 LTS:

Works with facebook for me too, so it's probably the page issue. What now?

Then it depends on the site. A typical problem with this version of Python is missing support for Server Name Indication (SNI), which was only added in Python 2.7.9. Since lots of sites require SNI today (like everything using Cloudflare Free SSL), I guess this is the problem.
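You can check for this directly: a minimal sketch that tests whether the interpreter's ssl module supports SNI at all (`ssl.HAS_SNI` only exists on Python 2.7.9+ / 3.x, so its absence is the tell-tale sign):

```python
import ssl

# ssl.HAS_SNI was added in Python 2.7.9 / 3.2; on older builds the
# attribute is missing entirely, so getattr with a default is used.
print(getattr(ssl, "HAS_SNI", False))
```

On an affected 2.7.6 installation this prints False; on any modern Python it prints True.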

But there are also other possibilities, like multiple trust paths, which are only handled properly as of OpenSSL 1.0.2, or simply missing intermediate certificates. More information and possible workarounds can only be given if you provide the URL, or if you analyze the situation yourself based on this information and the analysis from SSLLabs.
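To rule the OpenSSL version in or out, you can inspect the build your Python is linked against; a small diagnostic sketch:

```python
import ssl

# Versions before OpenSSL 1.0.2 can fail on servers that offer
# multiple trust paths. OPENSSL_VERSION_INFO is a comparable tuple.
print(ssl.OPENSSL_VERSION)        # e.g. "OpenSSL 1.0.1f 6 Jan 2014"
print(ssl.OPENSSL_VERSION_INFO >= (1, 0, 2))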

Solution 2

An old version of Python (2.7.3) using

requests.get(download_url, headers=headers, timeout=10, stream=True)

gets the following warning and exception:

You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
SSLError(SSLError(1, '_ssl.c:504: error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error'))


Just follow that advice; see Certificate verification in Python 2.

Run

pip install urllib3[secure]

and the problem is solved.
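What the extra packages fix can be sketched as follows: on Python builds older than 2.7.9 the ssl module lacks SNI, and urllib3 can monkey-patch itself to use pyOpenSSL instead (the import only succeeds once `pip install urllib3[secure]` has pulled in the dependencies):

```python
import ssl
import sys

# Decide whether the pyOpenSSL workaround is needed at all:
# old interpreter, or an ssl module without SNI support.
needs_pyopenssl = sys.version_info < (2, 7, 9) or not getattr(ssl, "HAS_SNI", False)
if needs_pyopenssl:
    # Provided by the packages from `pip install urllib3[secure]`.
    import urllib3.contrib.pyopenssl
    urllib3.contrib.pyopenssl.inject_into_urllib3()
print(needs_pyopenssl)
```

On a fixed or modern installation this prints False and the injection is skipped.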



Author: yak
Updated on June 04, 2022

Comments

  • yak (almost 2 years ago)

    I'm using the newest Kubuntu with Python 2.7.6. I am trying to download an HTTPS page using the code below:

    import urllib2
    
    hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
           'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
           'Accept-Encoding': 'none',
           'Accept-Language': 'pl-PL,pl;q=0.8',
           'Connection': 'keep-alive'}
    
    req = urllib2.Request(main_page_url, headers=hdr)
    
    try:
        page = urllib2.urlopen(req)
    except urllib2.HTTPError as e:
        print e.fp.read()
    
    content = page.read()
    print content
    

    However, I get such an error:

    Traceback (most recent call last):
      File "test.py", line 33, in <module>
        page = urllib2.urlopen(req)
      File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python2.7/urllib2.py", line 404, in open
        response = self._open(req, data)
      File "/usr/lib/python2.7/urllib2.py", line 422, in _open
        '_open', req)
      File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.7/urllib2.py", line 1222, in https_open
        return self.do_open(httplib.HTTPSConnection, req)
      File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
        raise URLError(err)
    urllib2.URLError: <urlopen error [Errno 1] _ssl.c:510: error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error>
    

    How can I solve this?

    SOLVED!

    I used the URL https://www.ssllabs.com given by @SteffenUllrich. It turned out that the server uses TLS 1.2, so I updated Python to 2.7.10 and modified my code to:

    import ssl
    import urllib2
    
    context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
    
    hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
           'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
           'Accept-Encoding': 'none',
           'Accept-Language': 'pl-PL,pl;q=0.8',
           'Connection': 'keep-alive'}
    
    req = urllib2.Request(main_page_url, headers=hdr)
    
    try:
        page = urllib2.urlopen(req,context=context)
    except urllib2.HTTPError as e:
        print e.fp.read()
    
    content = page.read()
    print content
    

    Now it downloads the page.

  • yak (over 8 years ago)
    Yes, it's Kubuntu 14.04, and my OpenSSL is OpenSSL 1.0.1f 6 Jan 2014.
  • yak (over 8 years ago)
    Thank you so much. I used the SSLLabs page you posted and checked the version of TLS used by the page. It turned out it's TLS 1.2. I modified the code, and will edit my first post to add the modified code and an explanation. Thank you!
  • Steffen Ullrich (over 8 years ago)
    @yak: since TLS 1.2 is also supported by Python 2.7.6 in (K)ubuntu 14.04, my guess is that the upgrade to Python 2.7.10 simply fixed the SNI issue, and that's why it worked. What counts, though, is that it works.
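For reference, a Python 3 sketch of the same download: `ssl.create_default_context()` sends SNI and negotiates the best TLS version both sides support, so no protocol version needs to be pinned the way `PROTOCOL_TLSv1` was above (https://www.example.com stands in for the original main_page_url):

```python
import ssl
import urllib.request

# A default context enables SNI, picks the highest shared TLS version,
# and turns on certificate and hostname verification.
context = ssl.create_default_context()
print(context.check_hostname)  # hostname verification is on by default

hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)'}
req = urllib.request.Request("https://www.example.com", headers=hdr)
# page = urllib.request.urlopen(req, context=context)  # actual network call
# content = page.read()
```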