Using requests to log in to a website that has a JavaScript login form

10,049

The last bit of JavaScript you posted gives a clue as to why your login POST request isn't working.

According to the JavaScript, your login POST should send a dictionary that looks like the following:

{
    'ZACTION': 'AJAX',
    'ZMETHOD': 'LOGIN',
    'func': 'LOGIN',
    'USERNAME': '<enter username>',
    'USERPASS': '<enter password>'
}
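
As a rough sketch of how that dictionary could be sent with requests: the field names and the `/index.cfm` path come from the logform.js snippet in the question, and the `isOk == "YES"` check mirrors its success callback. The helper names `build_login_payload` and `login`, and the `base_url` argument, are hypothetical and only here for illustration. The session is assumed to be a `requests.Session()`, with the TLSv1 adapter from the question mounted if the site needs it.

```python
def build_login_payload(username, password):
    """Return the form fields that logform.js posts when the login button is clicked."""
    return {
        'ZACTION': 'AJAX',
        'ZMETHOD': 'LOGIN',
        'func': 'LOGIN',
        'USERNAME': username,
        'USERPASS': password,
    }


def login(session, base_url, username, password):
    """POST the login fields to /index.cfm and report whether the site accepted them.

    `session` is expected to behave like requests.Session; `base_url` is an
    assumption standing in for the real site root (scheme and host only).
    """
    response = session.post(base_url + '/index.cfm',
                            data=build_login_payload(username, password))
    response.raise_for_status()
    # logform.js treats a JSON reply with isOk == "YES" as a successful login.
    return response.json().get('isOk') == 'YES'


# Possible usage (not run here, since it needs the real site and credentials):
#   import requests
#   s = requests.Session()
#   s.mount('https://xxxx.xxx', MyAdapter())  # adapter class from the question
#   if login(s, 'https://xxxx.xxx', 'my_user', 'my_pass'):
#       result = s.get(item_url)  # the session now carries the login cookies
```

Reusing the same session for the later `get` calls matters, because the login cookies are stored on the session object rather than on individual requests.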
Author: Gustavo Costa

Updated on June 23, 2022

Comments

  • Gustavo Costa, almost 2 years ago

    Let me preface by saying I have very little programming experience. I've learned a bunch in the last few days trying to write this program. I am running Python 2.7 on Windows 7 using PyCharm, requests, Beautiful Soup, and lxml.

    I am trying to scrape data from a website that relies heavily on JavaScript. I have two options:

    1) The data I need is populated through JavaScript and does not necessarily require a login. However, I have not been able to figure out how to get at this data. I've monitored the headers live with the Live HTTP Headers Chrome plugin, and I think I've found the JavaScript that does it, but it's beyond my means to figure out. It's a long bit of code; I'll post it if anyone is interested in taking a look.

    or

    2) On one of the main pages I found a series of ID numbers that I can use to generate URLs for each of the individual items I am analyzing. The problem is that I have to be logged in to see these individual item pages. My code is as follows:

    import ssl

    import requests
    from requests.adapters import HTTPAdapter
    from requests.packages.urllib3.poolmanager import PoolManager
    from BeautifulSoup import BeautifulSoup

    # Request a date from user
    UDate = "06/22/2015"  # raw_input('Enter a date mm/dd/yyyy\n')

    # Open TLSv1 adapter (whatever that means)
    class MyAdapter(HTTPAdapter):
        def init_poolmanager(self, connections, maxsize, block=False):
            self.poolmanager = PoolManager(num_pools=connections,
                                           maxsize=maxsize,
                                           block=block,
                                           ssl_version=ssl.PROTOCOL_TLSv1)

    # Begin a requests session. Every get from here on out will use the TLSv1 protocol
    
    payload = {
        'LogName': 'xxxxxxxx',
        'LogPass': 'xxxxxxxx'
    }
    
    s = requests.Session()
    s.mount('https://xxxx.xxx', MyAdapter())
    
    # Login with post and Request source code from main page.
    log = s.post('LoginURL', data=payload)
    print log.text
    
    result = s.get(url)
    soup = BeautifulSoup(result.content)
    print soup
    

    Neither the POST nor the GET shows me a logged-in page. The login form IDs in the HTML source code look like this:

    <div id="DivLogForm">
            <label for="BadText"><div id="BadText" class="BadText" style="display:none" tabindex="-2">User Name or Password is Invalid</div></label>
    
            <div class="LogLabel">
                <label for="LogName" > User Name&nbsp;&nbsp;</label><input tabindex="0" id="LogName" class="LogInput" value="" />
            </div>
            <div  class="LogLabel">
                <label for="LogPass" >User Password&nbsp;&nbsp;</label><input  tabindex="0" id="LogPass" type="password" class="LogInput" value="" />
            </div>
    

    So I'm passing LogName and LogPass with the post.

    There is also a logform.js with this bit of code:

    $("#LogButton").click(function()
            {   //$('#divLogForm').hide();
                //$('#divLoading').show();  
    
               var uName = $("#LogName").val();
               var uPass = $("#LogPass").val();
               var url = "/index.cfm";
               $.post(url, {ZACTION:'AJAX',ZMETHOD:'LOGIN',func:'LOGIN',USERNAME:uName, USERPASS:uPass}, 
                      function(data){if (data.isOk =="YES"){location.href="/index.cfm";}
                                      else {$('.BadText').show(); $('#BadText').focus();};
                                     },"json");
            });
    

    The LoginURL in my code is taken from the var url in this script. I have tried using USERNAME and USERPASS in my POST, and I have also tried uName and uPass, but these didn't work either.

    I'm not sure how to move forward here. Any help is greatly appreciated.