Selenium pdf automatic download not working

Solution 1

Disable Firefox's built-in pdf.js viewer and navigate to the URL; the PDF file should then be downloaded automatically. The code:

from selenium import webdriver

fp = webdriver.FirefoxProfile()

fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting",False)
fp.set_preference("browser.download.dir", "/home/jill/Downloads/Dinamalar")
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf,application/x-pdf")

fp.set_preference("pdfjs.disabled", True)  # < KEY PART HERE (a boolean, not the string "true")

browser = webdriver.Firefox(firefox_profile=fp)
browser.get("http://epaper.dinamalar.com/PUBLICATIONS/DM/MADHURAI/2015/05/26/PagePrint//26_05_2015_001_b2b69fda315301809dda359a6d3d9689.pdf")
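
The question also asks about a catch-all such as "application/all". As far as I know, "browser.helperApps.neverAsk.saveToDisk" only takes an explicit comma-separated list of MIME types, so one possible workaround is to generate a long list. The sketch below is an assumption on my part, not something from the original answer: the helper name all_known_mime_types is hypothetical, and it only covers the types that Python's standard mimetypes module knows about.

```python
import mimetypes


def all_known_mime_types():
    """Return a comma-separated string of the MIME types known to Python.

    Hypothetical helper: the result can be passed as the value of
    "browser.helperApps.neverAsk.saveToDisk" instead of a hand-written list.
    """
    mimetypes.init()  # load the default (and any system) type maps
    known = set(mimetypes.types_map.values())  # de-duplicate the values
    return ",".join(sorted(known))
```

The result would be used as fp.set_preference("browser.helperApps.neverAsk.saveToDisk", all_known_mime_types()). A server that reports an unusual Content-Type would still trigger the dialog, so listing the exact type the site sends is probably more reliable.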

Update (the complete code that worked for me):

from selenium import webdriver

mime_types = "application/pdf,application/vnd.adobe.xfdf,application/vnd.fdf,application/vnd.adobe.xdp+xml"

fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", "/home/aafanasiev/Downloads")
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", mime_types)
fp.set_preference("plugin.disable_full_page_plugin_for_types", mime_types)
fp.set_preference("pdfjs.disabled", True)

browser = webdriver.Firefox(firefox_profile=fp)
browser.get("http://epaper.dinamalar.com/")

webobj_get_link = browser.find_element_by_id("liSavePdf")
webobj_get_object = webobj_get_link.find_element_by_tag_name("a")
webobj_get_object.click()
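
The code above targets Selenium 2.x. In Selenium 4 the firefox_profile argument and the find_element_by_* helpers are no longer available, so here is a rough sketch of the same setup using the Options class. The function names download_preferences and make_firefox are my own (hypothetical), and I have not verified this against the current site or every Firefox version.

```python
def download_preferences(download_dir, mime_types):
    """Return the Firefox preferences used above as a plain dict."""
    return {
        "browser.download.folderList": 2,  # 2 = use the custom directory below
        "browser.download.manager.showWhenStarting": False,
        "browser.download.dir": download_dir,
        "browser.helperApps.neverAsk.saveToDisk": mime_types,
        "plugin.disable_full_page_plugin_for_types": mime_types,
        "pdfjs.disabled": True,  # a real boolean, not the string "true"
    }


def make_firefox(download_dir, mime_types="application/pdf"):
    """Start Firefox with the download preferences applied (Selenium 4 style)."""
    # Imported here so the dict helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for name, value in download_preferences(download_dir, mime_types).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

With this, driver = make_firefox("/home/aafanasiev/Downloads") followed by driver.get("http://epaper.dinamalar.com/") and driver.find_element(By.ID, "liSavePdf") (with By imported from selenium.webdriver.common.by) should mirror the flow above.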

Solution 2

I tested the following code and successfully downloaded your PDF on Windows 7:

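from selenium import webdriver

# download_location must be set to the path of an existing directory before this runs
fp = webdriver.FirefoxProfile()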
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", download_location)
fp.set_preference("plugin.disable_full_page_plugin_for_types", "application/pdf")
fp.set_preference("pdfjs.disabled", True)
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")

driver = webdriver.Firefox(fp)
driver.implicitly_wait(10)
driver.maximize_window()
driver.get("http://epaper.dinamalar.com/")
element = driver.find_element_by_css_selector("li#liSavePdf>a>img")
element.click()
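
After the click, the script gets no signal that the file has finished saving. One way to check is to poll the download directory, which is sketched below. This is a sketch under assumptions, not part of the tested code above: it assumes the directory starts out empty and that Firefox keeps a temporary ".part" file next to a download while it is in progress. The name wait_for_download is hypothetical.

```python
import os
import time


def wait_for_download(directory, timeout=30.0, poll_interval=0.5):
    """Wait until `directory` holds at least one file and no ".part" files.

    Returns the sorted list of file names, or raises TimeoutError.
    Assumes the directory was empty before the download started.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        # Firefox writes to "<name>.part" while a download is in progress.
        in_progress = [n for n in names if n.endswith(".part")]
        if names and not in_progress:
            return sorted(names)
        if time.monotonic() >= deadline:
            raise TimeoutError("download did not finish in %s" % directory)
        time.sleep(poll_interval)
```

Calling wait_for_download(download_location) right after element.click() would block until the PDF is complete or the timeout expires.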

Author: Gaara

Updated on June 05, 2022

Comments

  • Gaara
    Gaara almost 2 years

    I am new to selenium and I am writing a scraper to download pdf files automatically from a given site.

    Below is my code:

    from selenium import webdriver
    
    fp = webdriver.FirefoxProfile()
    
    fp.set_preference("browser.download.folderList",2);
    fp.set_preference("browser.download.manager.showWhenStarting",False)
    fp.set_preference("browser.download.dir", "/home/jill/Downloads/Dinamalar")
    fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")
    
    browser = webdriver.Firefox(firefox_profile=fp)
    browser.get("http://epaper.dinamalar.com/PUBLICATIONS/DM/MADHURAI/2015/05/26/PagePrint//26_05_2015_001_b2b69fda315301809dda359a6d3d9689.pdf");
    webobj = browser.find_element_by_id("download").click();
    

    I followed the steps mentioned in the Selenium documentation and in this link. I am not sure why the download dialog box is shown every time.

    Is there any way to fix it? Otherwise, is there a way to specify something like "application/all" so that all files are downloaded (as a workaround)?

  • Gaara
    Gaara almost 9 years
    I still face the issue even with the code mentioned. Could the OS play a part in this? I use Ubuntu 14.04.
  • alecxe
    alecxe almost 9 years
    @Gaara interesting, it works for me: selenium 2.45 + firefox 35.0.1 on Mac.
  • Gaara
    Gaara almost 9 years
    Mine is Selenium 2.45.0, Ubuntu 14.04, Firefox 38.0. I am trying every possibility. The download pop-up window does not show up as a window handle, and it is not an alert either. Any ideas on what more can be done? I can post a link to my script if you want.
  • alecxe
    alecxe almost 9 years
    @Gaara yes, please share the current code you are executing. Thanks.
  • Gaara
    Gaara almost 9 years
    Thanks a lot. Here is the link: codeskulptor.org/#user40_loV03Asao9_0.py. The function "download_page_from_child_link()" is responsible for clicking the "download" button and invoking the download dialog box. Please let me know if you need any more information.
  • alecxe
    alecxe almost 9 years
    @Gaara good, thanks for the update - please see the updated answer.
  • UltraBob
    UltraBob over 5 years
    Coming to this late, but it seems Firefox may have added new options, and this no longer works. In the Firefox preferences, under Applications, Portable Document Format is set to "Preview in Firefox". I have confirmed that if it is set to "Save File", the download works properly, but I'm not sure how to find out which profile preference I can set in code to do that.