How to simulate HTTP post request using Python Requests module?


Solution 1

Some example code:

import requests

URL = 'https://www.yourlibrary.ca/account/index.cfm'
payload = {
    'barcode': 'your user name/login',
    'telephone_primary': 'your password',
    'persistent': '1'  # remember me
}

session = requests.Session()
# Post through the session (not the module-level requests.post),
# so that any cookies the server sets are kept for later requests.
r = session.post(URL, data=payload)
print(r.cookies)

The first step is to look at your source page and identify the form element that is being submitted (use the developer tools in Firebug/Chrome/IE, or just read the page source). Then find the input elements and note their name attributes, which become the keys of the payload (see above).

The URL you provided happens to have a "Remember Me" checkbox, which, although I haven't tried it (because I can't), implies the server will issue a cookie for a period of time to avoid further logins -- that cookie is kept in the session's cookie jar.

Then just use session.get(someurl, ...) to retrieve further pages, and so on.
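For example, a minimal sketch of reusing the session after login (the account URL is the one above; the cookie name here is a made-up illustration, not something the site is known to use):

```python
import requests

session = requests.Session()

# After session.post(...) logs you in, any cookies the server set are
# stored in session.cookies and replayed automatically, e.g.:
# profile_page = session.get('https://www.yourlibrary.ca/account/index.cfm')

# The jar behaves like a dict; a cookie set once travels with every
# subsequent request made through this same session:
session.cookies.set('CFID', '12345')  # hypothetical cookie name
print(session.cookies.get('CFID'))
```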

Solution 2

In order to use authentication within a requests get or post call, you just supply the auth argument, like this:

response = requests.get(url, auth=('username', 'password'))

Refer to the Requests Authentication documentation for more detailed info.

Using Chrome's developer tools you can inspect the elements of the HTML page containing the form you would like to fill out and submit. There you can find the data needed to populate your post request's data argument. If you are not worried about verifying the security certificate of the site you are accessing, you can also pass verify=False in the argument list.

If your html page has these elements to use for your web form posting:

<textarea id="text" class="wikitext" name="text" cols="80" rows="20">
This is where your edited text will go
</textarea>
<input type="submit" id="save" name="save" value="Submit changes">

Then the python code to post to this form is as follows:

import requests
from bs4 import BeautifulSoup

url = "http://www.someurl.com"

username = "your_username"
password = "your_password"

response = requests.get(url, auth=(username, password), verify=False)

# Parsing the page returned in the response (specifying a parser
# avoids BeautifulSoup's "no parser was explicitly specified" warning)
page = BeautifulSoup(response.text, 'html.parser')

# Finding the text contained in a specific element, for instance, the 
# textarea element that contains the area where you would write a forum post
txt = page.find('textarea', id="text").string

# Finding the value of a specific attribute with name = "version" and 
# extracting the contents of the value attribute
tag = page.find('input', attrs = {'name':'version'})
ver = tag['value']

# Changing the text to whatever you want
txt = "Your text here, this will be what is written to the textarea for the post"

# Construct the POST request (note the commas between dict entries)
form_data = {
    'save': 'Submit changes',
    'text': txt,
    'version': ver  # send back the hidden version field found above
}

post = requests.post(url, auth=(username, password), data=form_data, verify=False)
Updated on April 27, 2020

Comments

  • Display Name, about 4 years ago

    This is the module that I'm trying to use and there is a form I'm trying to fill automatically. The reason I'd like to use Requests over Mechanize is because with Mechanize, I have to load the login page first before I can fill it out and submit, whereas with Requests, I can skip the loading stage and go straight to POSTing the message (hopefully). Basically, I'm trying to make the login process consume as little bandwidth as possible.

    My second question is, after the login process and the redirection, is it possible to not fully download the whole page, but to only retrieve the page title? Basically, the title alone will tell me if the login succeeded or not, so I want to minimize bandwidth usage.

    I'm kind of a noob when it comes to HTTP requests and whatnot, so any help would be appreciated. FYI, this is for a school project.

    Edit: the first part of the question has been answered. My question now is about the second part.
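On the second part of the comment's question (fetching only the page title to check whether login succeeded), one possible sketch uses requests' stream=True so the body is not downloaded up front, reading chunks only until the closing </title> tag appears. The helper names and the crude byte-level title parsing here are my own illustration, not part of requests:

```python
import requests

def extract_title(html_bytes):
    """Pull the <title> text out of a (possibly partial) HTML byte string."""
    end = html_bytes.find(b'</title>')
    if end == -1:
        return None  # the title has not fully arrived yet
    start = html_bytes.find(b'<title')
    start = html_bytes.find(b'>', start) + 1  # skip past the opening tag
    return html_bytes[start:end].decode(errors='replace').strip()

def fetch_title(url, session=None):
    # stream=True defers the body download; iter_content then yields it
    # in chunks, so we can stop as soon as the title is complete.
    r = (session or requests).get(url, stream=True)
    buf = b''
    for chunk in r.iter_content(chunk_size=1024):
        buf += chunk
        title = extract_title(buf)
        if title is not None:
            r.close()  # abandon the rest of the transfer
            return title
    return extract_title(buf)

print(extract_title(b'<html><head><title>Login OK</title></head>'))
```

This saves bandwidth only when the server actually streams the response; small pages may arrive in a single chunk regardless.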