Sending an ASP.net POST with Python's Requests


You are setting too much by hand. Do not set the Content-Type, Content-Length, Host, Origin, or Connection headers yourself; leave those for requests to set.
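As a rough illustration of what requests fills in by itself, the sketch below prepares a POST without sending it and inspects the result. The shortened form body is a placeholder for the full one, and the example.com URL is the stand-in already used in the question.

```python
import requests

# Placeholder URL and a shortened form body; nothing is sent over the network.
url = "http://www.example.com/EN/items/Pages/yourrates.aspx"
form = {
    "__CALLBACKID": "ctl00$SPWebPartManager1$g_dbb9e9c7_fe1d_46df_8789_99a6c9db4b22",
    "__CALLBACKPARAM": "startvr",
}

prepared = requests.Request("POST", url, data=form).prepare()

# Both headers are derived from the form body during prepare().
print(prepared.headers["Content-Type"])
print(prepared.headers["Content-Length"])

# The body is already form-encoded; note how "$" was percent-encoded.
print(prepared.body)
```

The printed body also shows that requests percent-encodes the `$` characters in the callback ID, so the dictionary should hold the plain, decoded value.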

You are also doubling up the URL parameters: either add the vr parameter to the URL manually or use params, but not both.
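To see the doubling concretely, here is a small sketch that only prepares the requests (nothing is sent) and prints the resulting URLs. The item number is the sample value from the Chrome trace in the question.

```python
import requests

base = "http://www.example.com/EN/items/Pages/yourrates.aspx"

# Query string written into the URL *and* passed via params: vr appears twice.
doubled = requests.Request(
    "GET", base + "?vr=123456789", params={"vr": 123456789}
).prepare()
print(doubled.url)

# Passing it once, via params, is enough.
single = requests.Request("GET", base, params={"vr": 123456789}).prepare()
print(single.url)
```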

Some of the parameters in the POST body may well be generated by the ASP.NET application and tied to a session. I'd send a GET request for valuation_url through a Session object, then parse the form in that page to extract the __CALLBACKID parameter. The requests Session will store any cookies the server sets and reuse them:

import requests
from bs4 import BeautifulSoup

item_request_headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36",
    "Accept": "*/*",
    "Accept-Encoding": "gzip,deflate,sdch",
    "Accept-Language": "en-US,en;q=0.8"
}
payload = {"vr": int(item_number[0])}

# Session() takes no arguments; set the shared headers on the instance.
session = requests.Session()
session.headers.update(item_request_headers)

# Get the form page (valuation_url is the page URL from the question)
form_response = session.get(valuation_url, params=payload)

# Parse the form page; BeautifulSoup is one way to do this
soup = BeautifulSoup(form_response.content, "html.parser")
callbackid = soup.select('input[name="__CALLBACKID"]')[0]['value']

item_request_body = {
    "__SPSCEditMenu": "true",
    "MSOWebPartPage_PostbackSource": "",
    "MSOTlPn_SelectedWpId": "",
    "MSOTlPn_View": 0,
    "MSOTlPn_ShowSettings": "False",
    "MSOGallery_SelectedLibrary": "",
    "MSOGallery_FilterString": "",
    "MSOTlPn_Button": "none",
    "__EVENTTARGET": "",
    "__EVENTARGUMENT": "",
    "MSOAuthoringConsole_FormContext": "",
    "MSOAC_EditDuringWorkflow": "",
    "MSOSPWebPartManager_DisplayModeName": "Browse",
    "MSOWebPartPage_Shared": "",
    "MSOLayout_LayoutChanges": "",
    "MSOLayout_InDesignMode": "",
    "MSOSPWebPartManager_OldDisplayModeName": "Browse",
    "MSOSPWebPartManager_StartWebPartEditingName": "false",
    "__VIEWSTATE": viewstate,
    "keywords": "Search our site",
    "__CALLBACKID": callbackid,
    "__CALLBACKPARAM": "startvr"
}

item_url = 'http://www.example.com/EN/items/Pages/yourrates.aspx'

response = session.post(url=item_url, params=payload, data=item_request_body,
                        headers={'Referer': form_response.url})

The session handles the shared headers (the user agent and the Accept headers); only on the POST do we add a Referer header as well.
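As a sketch of how that merging works, the snippet below builds a request and lets the session prepare it without sending anything. The shortened user agent string and one-field form body are placeholders for illustration only.

```python
import requests

session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (example)",  # placeholder user agent
    "Accept": "*/*",
})

# Hypothetical POST; prepare_request() merges the session headers with the
# per-request headers but does not touch the network.
request = requests.Request(
    "POST",
    "http://www.example.com/EN/items/Pages/yourrates.aspx",
    data={"__CALLBACKPARAM": "startvr"},
    headers={"Referer": "http://www.example.com/EN/items/Pages/yourrates.aspx?vr=123456789"},
)
prepared = session.prepare_request(request)

for name in ("User-Agent", "Accept", "Referer"):
    print(name, "=", prepared.headers[name])
```

Headers passed on an individual call apply to that request only, which is why the Referer does not need to live on the session.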

Author: David K.


Updated on June 04, 2022

Comments

  • David K. almost 2 years

    I'm scraping an old ASP.net website using Python's requests module.

    I've spent 5+ hours trying to figure out how to simulate this POST request to no avail. Doing it the way I do it below, I essentially get a message saying "No item matches this item reference."

    Any help would be deeply appreciated. Here are the request and my code; a few things are modified for brevity and/or privacy:

    My own code:

    import requests
    
    # Scraping the item number from the website, I have confirmed this is working.
    
    #Then use the newly acquired item number to request the data.
    item_url = "http://www.example.com/EN/items/Pages/yourrates.aspx?vr=" + item_number[0]
    viewstate = r'/wEPD...' # Truncated for brevity.
    
    # Create the appropriate request and payload.
    payload = {"vr": int(item_number[0])}
    
    item_request_body = {
            "__SPSCEditMenu": "true",
            "MSOWebPartPage_PostbackSource": "",
            "MSOTlPn_SelectedWpId": "",
            "MSOTlPn_View": 0,
            "MSOTlPn_ShowSettings": "False",
            "MSOGallery_SelectedLibrary": "",
            "MSOGallery_FilterString": "",
            "MSOTlPn_Button": "none",
            "__EVENTTARGET": "",
            "__EVENTARGUMENT": "",
            "MSOAuthoringConsole_FormContext": "",
            "MSOAC_EditDuringWorkflow": "",
            "MSOSPWebPartManager_DisplayModeName": "Browse",
            "MSOWebPartPage_Shared": "",
            "MSOLayout_LayoutChanges": "",
            "MSOLayout_InDesignMode": "",
            "MSOSPWebPartManager_OldDisplayModeName": "Browse",
            "MSOSPWebPartManager_StartWebPartEditingName": "false",
            "__VIEWSTATE": viewstate,
            "keywords": "Search our site",
            "__CALLBACKID": "ctl00$SPWebPartManager1$g_dbb9e9c7_fe1d_46df_8789_99a6c9db4b22",
            "__CALLBACKPARAM": "startvr"
        }
    
    # Write the appropriate headers for the property information.
    item_request_headers = {
        "Host": home_site,
        "Connection": "keep-alive",
        "Content-Length": len(encoded_valuation_request),
        "Cache-Control": "max-age=0",
        "Origin": home_site,
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "Cookie": "__utma=48409910.1174413745.1405662151.1406402487.1406407024.17; __utmb=48409910.7.10.1406407024; __utmc=48409910; __utmz=48409910.1406178827.13.3.utmcsr=ratesandvallandingpage|utmccn=landingpages|utmcmd=button",
        "Accept": "*/*",
        "Referer": valuation_url,
        "Accept-Encoding": "gzip,deflate,sdch",
        "Accept-Language": "en-US,en;q=0.8"
    }
    
    response = requests.post(url=item_url, params=payload, data=item_request_body, headers=item_request_headers)
    print(response.text)
    

    What Chrome is telling me the request looks like:

    Remote Address:202.55.96.131:80
    Request URL:http://www.example.com/EN/items/Pages/yourrates.aspx?vr=123456789
    Request Method:POST
    Status Code:200 OK
    
    Request Headers
    Accept:*/*
    Accept-Encoding:gzip,deflate,sdch
    Accept-Language:en-US,en;q=0.8
    Cache-Control:max-age=0
    Connection:keep-alive
    Content-Length:21501
    Content-Type:application/x-www-form-urlencoded; charset=UTF-8
    Cookie:__utma=48409910.1174413745.1405662151.1406402487.1406407024.17; __utmb=48409910.7.10.1406407024; __utmc=48409910; __utmz=48409910.1406178827.13.3.utmcsr=ratesandvallandingpage|utmccn=landingpages|utmcmd=button
    Host:www.site.com
    Origin:www.site.com
    Referer:http://www.example.com/EN/items/Pages/yourrates.aspx?vr=123456789
    User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36
    
    Query String Parameters
    vr:123456789
    
    Form Data
    __SPSCEditMenu:true
    MSOWebPartPage_PostbackSource:
    MSOTlPn_SelectedWpId:
    MSOTlPn_View:0
    MSOTlPn_ShowSettings:False
    MSOGallery_SelectedLibrary:
    MSOGallery_FilterString:
    MSOTlPn_Button:none
    __EVENTTARGET:
    __EVENTARGUMENT:
    MSOAuthoringConsole_FormContext:
    MSOAC_EditDuringWorkflow:
    MSOSPWebPartManager_DisplayModeName:Browse
    MSOWebPartPage_Shared:
    MSOLayout_LayoutChanges:
    MSOLayout_InDesignMode:
    MSOSPWebPartManager_OldDisplayModeName:Browse
    MSOSPWebPartManager_StartWebPartEditingName:false
    __VIEWSTATE:/wEPD...(Omitted for length)
    keywords:Search our site
    __CALLBACKID:ctl00$SPWebPartManager1$g_dbb9e9c7_fe1d_46df_8789_99a6c9db4b22
    __CALLBACKPARAM:startvr
    
  • David K. almost 10 years
    Super helpful, Martijn, thank you! I'm still working through things, but as soon as I finish implementing and testing the solution I will be sure to confirm :)
  • David K. almost 10 years
    Also, do you know the way in which I would encode this type of thing? __CALLBACKID=ctl00%24SPWebPartManager1%24g_dbb9e9c7_fe1d_46df_8789_99a6c9db4b22 It gave me an error in the callback, presumably due to the unusual percent signs.
  • Martijn Pieters almost 10 years
    Include the decoded value; leave encoding to requests. %24 is an encoded $ for example.
  • David K. almost 10 years
    You are so helpful Martijn! Thanks again! Works well, now I just need to automate retrieving a few of the bits of information via BeautifulSoup.
  • Martijn Pieters almost 10 years
    @DavidK.: you can combine requests and BeautifulSoup with robobrowser; it'll help you fill in the forms too.
  • David K. almost 10 years
    Good idea, but I'm trying to do it purely with HTTP requests, no browser layer. I need maximal speed for a sizable dataset, and none of the browsers I've tried are fast enough. However, if you've had good experiences with robobrowser I may have to try it!
  • Martijn Pieters almost 10 years
    @DavidK.: robobrowser is not a browser layer. It is requests plus BeautifulSoup plus a little glue to handle forms.
  • David K. almost 10 years
    Ah, well that sounds even better then! I'm just using bs4 and requests myself at the moment, but that may be a convenient package.
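
For the remaining step mentioned in the comments, pulling the form values out of the page with BeautifulSoup, a minimal sketch might look like the following. The HTML snippet is a made-up stand-in for the real form page, and hidden_fields is a hypothetical helper name, not part of requests or bs4.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for form_response.text; the real page has many more fields.
html = """
<form method="post" action="yourrates.aspx?vr=123456789">
  <input type="hidden" name="__VIEWSTATE" value="/wEPD..." />
  <input type="hidden" name="__EVENTTARGET" value="" />
  <input type="hidden" name="MSOSPWebPartManager_DisplayModeName" value="Browse" />
  <input type="text" name="keywords" value="Search our site" />
</form>
"""


def hidden_fields(page_html):
    """Return {name: value} for every named hidden input in the page."""
    soup = BeautifulSoup(page_html, "html.parser")
    return {
        tag["name"]: tag.get("value", "")
        for tag in soup.find_all("input", attrs={"type": "hidden"})
        if tag.get("name")
    }


fields = hidden_fields(html)
print(fields)
```

The resulting dictionary can serve as the starting point for the POST body, with values such as __CALLBACKPARAM added or overridden afterwards, so that __VIEWSTATE always comes from the same session as the cookies.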