FormRequest Scrapy


Solution 1

Try using the FormRequest.from_response method, as described in the Scrapy documentation:

https://doc.scrapy.org/en/latest/topics/request-response.html#using-formrequest-from-response-to-simulate-a-user-login

import scrapy

class LoginSpider(scrapy.Spider):
    name = 'example.com'
    start_urls = ['http://www.example.com/users/login.php']

    def parse(self, response):
        return scrapy.FormRequest.from_response(
            response,
            formdata={'username': 'john', 'password': 'secret'},
            callback=self.after_login
        )

    def after_login(self, response):
        # check that the login succeeded before going on
        # (response.body is bytes on Python 3, so compare against response.text)
        if "authentication failed" in response.text:
            self.logger.error("Login failed")
            return
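
The reason from_response is preferred over building the POST by hand is that it pre-populates the request with the fields already present in the page's form (hidden inputs such as tokens included) and only overrides what you pass in formdata. The sketch below illustrates that idea with the standard library only. It is not Scrapy's implementation; FormFieldCollector, prefilled_formdata and the csrf_token field are hypothetical names made up for the illustration.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> elements in the first <form>."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
        elif tag == "input" and self._in_form and "name" in attrs:
            # Inputs without a value attribute are kept as empty strings.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True  # ignore any later forms


def prefilled_formdata(html, overrides):
    """Merge the form's existing fields with caller-supplied overrides."""
    collector = FormFieldCollector()
    collector.feed(html)
    return {**collector.fields, **overrides}


# Hypothetical login page with a hidden token field.
LOGIN_PAGE = """
<form action="/users/login.php" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
"""

data = prefilled_formdata(LOGIN_PAGE, {"username": "john", "password": "secret"})
print(data)
```

The hidden csrf_token value survives the merge even though the caller only supplied username and password, which is the behaviour that makes from_response convenient for login forms.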

Solution 2

In addition, to answer @Uday's question: if you have multiple forms on a page, use formid or formname to select the right one:

def parse(self, response):
    return scrapy.FormRequest.from_response(
        response,
        formid='form_id_of_the_form',
        formdata={'username': 'john', 'password': 'secret'},
        callback=self.after_login
    )

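Without either argument, FormRequest.from_response uses the first form on the page by default. If the forms have no id or name attribute, the formnumber argument (a zero-based index) or formxpath can be used to pick one instead.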
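
Regarding the nested payload in the question quoted in the comments (the "Air" list of objects): form encoding is a flat list of key/value pairs, so formdata is not a good fit for nested dicts and lists. One possible workaround, assuming the endpoint accepts a JSON body (the original post does not confirm this), is to serialize the payload yourself and send it with a plain scrapy.Request. The helper name json_post_kwargs is hypothetical, and the payload is a trimmed-down copy of the one in the question.

```python
import json


def json_post_kwargs(url, payload):
    """Build keyword arguments for scrapy.Request that POST `payload` as JSON."""
    return {
        "url": url,
        "method": "POST",
        "body": json.dumps(payload),
        "headers": {"Content-Type": "application/json"},
    }


# Trimmed-down version of the payload from the question.
payload = {
    "Air": [{
        "Arrival": {"Iata": "MEX", "Date": "2017-03-15T15:00:00.000Z"},
        "Departure": {"Iata": "POA", "Date": "2017-03-01T15:00:00.000Z"},
        "CiaCodeList": "[]",
    }],
    "Pax": {"adt": "1", "chd": "0", "inf": "0"},
}

kwargs = json_post_kwargs(
    "http://www.viajanet.com.br/busca/resources/api/AvailabilityStatusAsync",
    payload,
)

# Inside a spider callback this would be used as:
#     yield scrapy.Request(callback=self.parse_airfare, **kwargs)
```

Recent Scrapy versions also provide scrapy.http.JsonRequest, which takes the payload through its data argument and sets the Content-Type header itself.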

Author by

Admin

Updated on June 04, 2022

Comments

  • Admin
    Admin almost 2 years

    I'm new to Scrapy and Python. I'm trying to use FormRequest as in the Scrapy example, but it seems that the formdata parameter is not parsing the '[]' from "Air". Any ideas on a workaround for this? Here is the code:

    import scrapy
    import re
    import json
    from scrapy.http import FormRequest
    
    class AirfareSpider(scrapy.Spider):
        name = 'airfare'
        start_urls = [
            'http://www.viajanet.com.br/busca/voos-resultados#/POA/MEX/RT/01-03-2017/15-03-2017/-/-/-/1/0/0/-/-/-/-'
        ]
    
        def parse(self, response):
            return [FormRequest(
                url='http://www.viajanet.com.br/busca/resources/api/AvailabilityStatusAsync',
                formdata={
                    "Partner": {
                        "Token": "p0C6ezcSU8rS54+24+zypDumW+ZrLkekJQw76JKJVzWUSUeGHzltXDhUfEntPPLFLR3vJpP7u5CZZYauiwhshw==",
                        "Key": "OsHQtrHdMZPme4ynIP4lcsMEhv0=",
                        "Id": "52",
                        "ConsolidatorSystemAccountId": "80",
                        "TravelAgencySystemAccountId": "80",
                        "Name": "B2C"
                    },
                    "Air": [{
                        "Arrival": {
                            "Iata": "MEX",
                            "Date": "2017-03-15T15:00:00.000Z"
                        },
                        "Departure": {
                            "Iata": "POA",
                            "Date": "2017-03-01T15:00:00.000Z"
                        },
                        "InBoundTime": "0",
                        "OutBoundTime": "0",
                        "CiaCodeList": "[]",
                        "BookingClass": "-1",
                        "IsRoundTrip": "true",
                        "Stops": "-1",
                        "FareType": "-"
                    }],
                    "Pax": {
                        "adt": "1",
                        "chd": "0",
                        "inf": "0"
                    },
                    "DisplayTotalAmount": "false",
                    "GetDeepLink": "false",
                    "GetPriceMatrixOnly": "false",
                    "PageLength": "10",
                    "PageNumber": "2"
                },
                callback=self.parse_airfare)]
    
        def parse_airfare(self, response):
            data = json.loads(response.body)
    
  • Zeugma
    Zeugma over 7 years
    While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review
  • Noah.Kim
    Noah.Kim over 7 years
    Thank you for your kindness.
  • Lhassan Baazzi
    Lhassan Baazzi over 6 years
    When scraping with the Scrapy framework and the web page contains a form, always use FormRequest.from_response to submit the form, and use FormRequest to send AJAX request data.
  • Uday Posia
    Uday Posia about 3 years
    What should I do if there are multiple forms on the page and none of them has an id or name attribute? How would I select a particular form for FormRequest?
  • Leonardo Maffei
    Leonardo Maffei over 2 years
    This answer surely deserves more upvotes. It is especially useful when you have multiple login forms. In my case, it helped me with GitLab.