HTTPS POST request Python

Solution 1

The BadStatusLine: '' (in httplib.py) suggests that the problem lies somewhere other than in your request data. The empty string means httplib read no status line at all, which typically happens when the server sends no reply and just closes the connection.
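
To see that behaviour in isolation, here is a minimal sketch that starts a throwaway local server which hangs up without answering, then sends it a POST. It is written for Python 3, where httplib is named http.client; the function names and the local server are illustrative assumptions, not part of the original question.

```python
import http.client
import socket
import threading


def hang_up_without_reply(server_sock):
    """Accept one connection and end it without sending any HTTP response."""
    conn, _ = server_sock.accept()
    conn.shutdown(socket.SHUT_WR)  # signal end-of-stream: no status line follows
    while conn.recv(4096):         # drain the request until the client closes
        pass
    conn.close()


def request_against_silent_server():
    """Return the exception http.client raises when the server says nothing."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=hang_up_without_reply, args=(server_sock,))
    worker.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("POST", "/servlet/datadownload", body="val1=123")
        client.getresponse()
    except http.client.BadStatusLine as exc:
        return exc
    finally:
        client.close()
        worker.join()
        server_sock.close()
    return None


if __name__ == "__main__":
    print(repr(request_against_silent_server()))
```

On Python 3.5 and later the exception raised here is http.client.RemoteDisconnected, which is a subclass of BadStatusLine, so the same except clause catches both.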

As you mentioned that you're using an SSL connection, this might be particularly interesting to debug (with curl -v URL if you want). If you find that curl -2 URL (which forces the use of SSLv2) works while curl -3 URL (SSLv3) doesn't, you may want to take a look at issue #13636 and possibly #11220 on the Python bug tracker. Depending on your Python version and a possibly misconfigured web server, this might be causing the problem: the SSL defaults changed in v2.7.3.
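
The same kind of protocol probing can be sketched from Python itself. The example below assumes Python 3.7 or later (where httplib is http.client and ssl.TLSVersion exists). It pins TLS versions rather than SSLv2/SSLv3, because those older protocols are typically disabled in current OpenSSL builds. The helper names pinned_tls_context and probe are hypothetical.

```python
import http.client
import ssl


def pinned_tls_context(version):
    """Build a client context that negotiates exactly one protocol version."""
    context = ssl.create_default_context()
    context.minimum_version = version
    context.maximum_version = version
    return context


def probe(host, version, path="/", timeout=10):
    """Send one HEAD request pinned to `version`.

    Returns the HTTP status code on success, or the exception that was raised
    (handshake failure, empty reply, network error) so the caller can compare
    results across protocol versions.
    """
    conn = http.client.HTTPSConnection(
        host, 443, context=pinned_tls_context(version), timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (ssl.SSLError, http.client.HTTPException, OSError) as exc:
        return exc
    finally:
        conn.close()


if __name__ == "__main__":
    # www.site.com is the placeholder host used in the question
    for version in (ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3):
        print(version.name, repr(probe("www.site.com", version)))
```

If one version returns a status code while another returns an SSLError or a BadStatusLine, that points at a protocol negotiation problem on the server rather than at the request itself.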

Solution 2

Is there a reason you've got to use urllib? Requests is simpler, better in almost every way, and abstracts away some of the cruft that makes urllib hard to work with.

As an example, I'd rework your example as something like:

import requests
resp = requests.post(url, data=values, allow_redirects=True)

At this point, the response from the server is available in resp.text, and you can do what you'd like with it. If requests wasn't able to POST properly (because you need a custom SSL certificate, for example), it should give you a nice error message that tells you why.

Even if you can't do this in your production environment, do this in a local shell to see what error messages you get from requests, and use that to debug urllib.

Solution 3

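    import httplib
    import urllib

    # _certfile is assumed to be defined elsewhere: path to a PEM client certificate
    conn = httplib.HTTPSConnection(host='www.site.com', port=443, cert_file=_certfile)
    params = urllib.urlencode({'cmd': 'token', 'device_id_st': 'AAAA-BBBB-CCCC',
                               'token_id_st': 'DDDD-EEEE_FFFF', 'product_id': 'Unit Test',
                               'product_ver': '1.6.3'})
    # httplib does not add a Content-Type header for the form body, so set it explicitly
    headers = {'Content-Type': 'application/x-www-form-urlencoded'}
    conn.request("POST", "/servlet/datadownload", params, headers)
    response = conn.getresponse()
    #print response.status, response.reason
    content = response.read()
    conn.close()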
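
Note that httplib does not follow redirects by itself, which matters for the 302 reply shown in the question. Below is a rough sketch of following the Location header by hand. It is written for Python 3 (http.client and urllib.parse in place of httplib and urllib), and post_following_redirects is a hypothetical helper name. It re-sends the POST body to each redirect target, which is a deliberate assumption: browsers and Requests switch to GET after a 302.

```python
import http.client
import urllib.parse


def post_following_redirects(url, fields, max_redirects=5):
    """POST form fields to `url`, re-POSTing to each Location on a 3xx reply.

    Returns (status, body_bytes) of the first non-redirect response.
    """
    body = urllib.parse.urlencode(fields)
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    for _ in range(max_redirects + 1):
        parts = urllib.parse.urlsplit(url)
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc, timeout=30)
        else:
            conn = http.client.HTTPConnection(parts.netloc, timeout=30)
        try:
            target = parts.path or "/"
            if parts.query:
                target += "?" + parts.query
            conn.request("POST", target, body, headers)
            response = conn.getresponse()
            payload = response.read()
            location = response.getheader("Location")
        finally:
            conn.close()
        if response.status in (301, 302, 303, 307, 308) and location:
            # Location may be relative; resolve it against the current URL
            url = urllib.parse.urljoin(url, location)
            continue
        return response.status, payload
    raise RuntimeError("too many redirects")
```

With the values from the question, a call might look like post_following_redirects('http://www.site.com/nps/servlet/exportdatadownload', {'val1': '123', 'val2': 'abc', 'val3': '1b3'}), with the returned bytes then written to file.csv opened in 'wb' mode.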
Author by

francisMi

Updated on July 12, 2022

Comments

  • francisMi
    francisMi almost 2 years

    I want to make a POST request to an HTTPS site that should respond with a .csv file. I have this Python code:

    url = 'https://www.site.com/servlet/datadownload'
    values = {
      'val1' : '123',
      'val2' : 'abc',
      'val3' : '1b3',
    }
    
    data = urllib.urlencode(values)
    req = urllib2.Request(url,data)
    response = urllib2.urlopen(req)
    myfile = open('file.csv', 'wb')
    shutil.copyfileobj(response.fp, myfile)
    myfile.close()
    

    But I'm getting the error:

    BadStatusLine: ''    (in httplib.py)
    

    I've tried the POST request with the Chrome extension Advanced REST client (screenshot) and that works fine.

    What could be the problem and how could I solve it? (Is it because of the HTTPS?)


    EDIT, refactored code:

    try:
        #conn = httplib.HTTPSConnection(host="www.site.com", port=443)
        # => gives a BadStatusLine: '' error
        conn = httplib.HTTPConnection("www.site.com")
        params = urllib.urlencode({'val1': '123', 'val2': 'abc', 'val3': '1b3'})
        conn.request("POST", "/nps/servlet/exportdatadownload", params)
        content = conn.getresponse()
        print content.reason, content.status
        print content.read()
        conn.close()
    except:
        import sys
        print sys.exc_info()[:2]
    

    Output:

    Found 302
    
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>302 Found</TITLE>
    </HEAD><BODY>
    <H1>Found</H1>
    The document has moved <A HREF="https://www.site.com/nps/servlet/exportdatadownload">here</A>.<P>
    <HR>
    <ADDRESS>Oracle-Application-Server-10g/10.1.3.5.0 Oracle-HTTP-Server Server at mp-www1.mrco.be Port 7778</ADDRESS>
    </BODY></HTML>
    

    What am I doing wrong?

  • francisMi
    francisMi about 11 years
    I've tried your code, but adapted the first line to just httplib.HTTPSConnection('www.site.com'). When I print content.status I get Found 302. And printing the content itself, I get HTML code with The document has moved <A HREF="https://www.site.com/servlet/exportdatadownload">here</A>.<P> But how do I get the file it found?
  • francisMi
    francisMi about 11 years
    I've edited my question with more information and with your code.
  • bioffe
    bioffe about 11 years
    Try the URL https://google.com; it feels like you have some sort of server/destination issue.
  • francisMi
    francisMi about 11 years
    httplib.HTTPSConnection(host="www.google.com", port=443) gives a Not Found 404 output and httplib.HTTPConnection("www.google.com") gives Service Unavailable 503
  • bioffe
    bioffe about 11 years
    That's good. There isn't a /servlet/datadownload URL on Google's website, hence the error. Now I am confident your server is the issue. Try to read something simple, like a static HTML page (one that you can access via a browser).
  • francisMi
    francisMi about 11 years
    Normally, when I try the request in my browser, a .csv file is downloaded automatically. So couldn't it be that the server redirects the response, and I need to 'follow' it with Python code?
  • francisMi
    francisMi about 11 years
    The same error: BadStatusLine: ConnectionError: HTTPSConnectionPool(host='www.site.com', port=443): Max retries exceeded with url: /nps/servlet/exportdatadownload/ (Caused by <class 'httplib.BadStatusLine'>: '') When I browse to https://www.site.com/nps/servlet/exportdatadownload?val1=123&val2=abc&val3=1b3, the Excel file is downloaded automatically, but still no success with a Python script...
  • Dan
    Dan about 11 years
    BadStatusLine means that the server sent back an HTTP status that Python doesn't understand (and it understands all the "normal" ones). From the command line, can you run curl -I https://site.com (with whatever the real URL is there) and paste the results? If you don't have curl, you can also use hurl.it (in which case I'm just interested in the first paragraph of the response).