Get page generated with Javascript in Python

You could use Selenium Webdriver:

#!/usr/bin/env python
from contextlib import closing
from selenium.webdriver import Firefox  # pip install selenium
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

url = 'http://example.com/'  # placeholder: the page that contains the button

# use firefox to get page with javascript generated content
with closing(Firefox()) as browser:
    browser.get(url)
    # the find_element_by_* helpers were removed in Selenium 4; use By locators
    button = browser.find_element(By.NAME, 'button')
    button.click()
    # wait (up to 10 seconds) for the page to load
    WebDriverWait(browser, timeout=10).until(
        lambda x: x.find_element(By.ID, 'someId_that_must_be_on_new_page'))
    # store it to string variable
    page_source = browser.page_source
print(page_source)
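The "wait for the page to load" step polls for a condition rather than pausing for a fixed time. As a rough illustration of that idea in plain Python (this is not Selenium's own implementation; `wait_until`, its parameters, and the `page_is_ready` stand-in are hypothetical names made up for this sketch):

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    Hypothetical helper for illustration; WebDriverWait plays this role in Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll_interval)


# Stand-in for "the new page is ready": becomes true on the third check.
checks = {"count": 0}


def page_is_ready():
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(page_is_ready, timeout=2.0, poll_interval=0.01)
```

Unlike a fixed `time.sleep(10)`, a polling wait returns as soon as the condition holds and fails loudly if it never does.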

Author: xralf

Updated on January 23, 2020

Comments

  • xralf
    xralf over 4 years

    I'd like to download a web page generated by JavaScript and store it in a string variable in Python code. The page is generated when you click a button.

    If I knew the resulting URL I would use urllib2, but this is not the case.

    Thank you.

    • e-satis
      e-satis over 12 years
      Is this generated completely in JS, or just built from an AJAX call?
    • xralf
      xralf over 12 years
      @e-satis I think that it's completely in js
    • e-satis
      e-satis over 12 years
      Then I'd go with J.F.'s solution, or with Python WebKit. Just keep in mind that both require a display server to be running, so if you plan to run this on a headless server, you'll need to hack a little bit.
  • xralf
    xralf over 12 years
    Is the WebDriverWait with someId_that_must_be_on_new_page necessary? Could it be done with just a sleep or delay function? And is it possible to set the user-agent string?
  • xralf
    xralf over 12 years
    There is one more problem. The web page has a select element, and something has to be selected; if nothing is selected, the button won't work. Also, is it necessary to open and close Firefox? Won't this work without quit?
  • jfs
    jfs over 12 years
    You could use any condition you like, e.g., x.title == 'New Title'. You could probably modify the user-agent by using an appropriate Firefox profile.
  • jfs
    jfs over 12 years
    Here's an example of how to select an option. .quit() is not necessary.
  • xralf
    xralf over 12 years
    The method select_option(self, selector, value) takes a selector parameter. I'm not sure what this parameter should be. Let's say I want to click on the option with value = 100 of the select with id = 'sel_id' and name = 'sel_name'. Could this be expressed in CSS?
  • jfs
    jfs over 12 years
    @xralf: select_option('select#sel_id', '100'). You could pass an element instead: select_option(browser.find_element_by_id('sel_id'), '100').
  • xralf
    xralf over 12 years
    Thanks. I had already used options = browser.find_elements_by_tag_name('option') for option in options: if option.get_attribute('value') == "100": option.click() and it worked too.
  • alper
    alper over 3 years
    Can this be done by opening the Firefox window in the background?
  • jfs
    jfs over 2 years
    @alper yes, there are headless options.
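The follow-up questions in the comments (clicking an option by value, setting the user-agent, running without a visible window) could be sketched roughly as below. This assumes Selenium 4 with Firefox and geckodriver installed. The function names, the `sel_id` / `100` values, and the user-agent string are illustrative assumptions, and `general.useragent.override` is the Firefox preference commonly used to override the user-agent. Selenium is imported inside the function so that the option helper can be tried without a browser.

```python
def click_option_with_value(options, value):
    """Click the first <option> element whose value attribute equals `value`.

    `options` is any iterable of element-like objects with get_attribute()
    and click(), e.g. browser.find_elements(By.TAG_NAME, 'option').
    Returns True if an option was clicked, False otherwise.
    """
    for option in options:
        if option.get_attribute("value") == value:
            option.click()
            return True
    return False


def make_headless_firefox(user_agent=None):
    """Start Firefox without a visible window, optionally overriding the user-agent."""
    # Imported here so this module loads even where selenium is not installed.
    from selenium.webdriver import Firefox
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # run Firefox with no display
    if user_agent is not None:
        options.set_preference("general.useragent.override", user_agent)
    return Firefox(options=options)


# Usage sketch (needs Firefox and geckodriver, so it is left commented out):
# from selenium.webdriver.common.by import By
# browser = make_headless_firefox(user_agent="my-scraper/0.1")
# select = browser.find_element(By.ID, "sel_id")
# click_option_with_value(select.find_elements(By.TAG_NAME, "option"), "100")
```

Searching for options inside the located select element, rather than across the whole page, avoids clicking an option with the same value in a different select.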