Python Selenium download images (jpeg, png) or PDF using ChromeDriver
Solution 1
Instead of relying on specific browser or driver options, I would implement a more generic solution that uses the image URL to perform the download.
You can get the image URL using similar code:
driver.find_element_by_id("your-image-id").get_attribute("src")
And then I would download the image using, for example, urllib.
Here's some pseudo-code for Python2:
import urllib
url = driver.find_element_by_id("your-image-id").get_attribute("src")
urllib.urlretrieve(url, "local-filename.jpg")
Here's the same for Python3:
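import urllib.request
from selenium.webdriver.common.by import By
# Newer Selenium 4 releases removed find_element_by_id; use find_element(By.ID, ...) instead
url = driver.find_element(By.ID, "your-image-id").get_attribute("src")
urllib.request.urlretrieve(url, "local-filename.jpg")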
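The Python documentation lists urlretrieve under the legacy interface of urllib.request, so here is a minimal sketch of the same download built on urlopen and shutil.copyfileobj instead. The helper name download_file is an assumption made for illustration and is not part of the original answer.

```python
import shutil
import urllib.request


def download_file(url, local_filename):
    """Stream the resource at `url` into `local_filename` and return that path."""
    # urlopen returns a file-like response; copyfileobj copies it in chunks,
    # so the whole file never has to be held in memory at once.
    with urllib.request.urlopen(url) as response, open(local_filename, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_filename
```

With a live driver, the url argument would be the src attribute obtained above and local_filename something like "local-filename.jpg".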
Edit, following the comment: here is another example of how to download a file once you know its URL:
import requests
from PIL import Image
from io import BytesIO
image_name = 'image.jpg'
url = 'http://example.com/image.jpg'
r = requests.get(url)
# r.content is bytes, so wrap it in BytesIO (StringIO only accepts text)
i = Image.open(BytesIO(r.content))
i.save(image_name)
Solution 2
Here is another simple way, but @Pitto's answer above is slightly more succinct.
import requests
from selenium.webdriver.common.by import By
# ff is assumed to be an already-running WebDriver instance
webelement_img = ff.find_element(By.XPATH, '//img')
url = webelement_img.get_attribute('src') or 'https://someimages.com/path-to-image.jpg'
data = requests.get(url).content
local_filename = 'filename_on_your_computer.jpg'
with open(local_filename, 'wb') as f:
    f.write(data)
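If the image sits behind a login that the browser session has already passed (the asker raises this in the comments), a plain HTTP request will be rejected. One approach that is sometimes suggested is to copy the cookies out of the WebDriver session and send them with the download request. The sketch below uses only the standard library; the helper names are hypothetical. It only helps when the site tracks authentication through cookies, which may not hold for Kerberos-style setups, and get_cookies() returns only cookies visible to the page currently loaded.

```python
import urllib.request


def cookie_header(selenium_cookies):
    """Join Selenium cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def build_authenticated_request(url, selenium_cookies):
    """Build a urllib Request that carries the browser session's cookies."""
    return urllib.request.Request(url, headers={"Cookie": cookie_header(selenium_cookies)})


def download_with_browser_cookies(driver, url, local_filename):
    """Download `url` using the cookies held by a logged-in Selenium driver."""
    # driver.get_cookies() returns a list of dicts with 'name' and 'value' keys.
    request = build_authenticated_request(url, driver.get_cookies())
    with urllib.request.urlopen(request) as response, open(local_filename, "wb") as f:
        f.write(response.read())
    return local_filename
```

Call download_with_browser_cookies(driver, url, 'filename_on_your_computer.jpg') after the driver has loaded a page on the same site as the image.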
animesharma
Updated on June 28, 2022
Comments
-
animesharma almost 2 years
I have a Selenium script in Python (using ChromeDriver on Windows) that fetches the download links of various attachments (of different file types) from a page and then opens these links to download the attachments. This works fine for file types that ChromeDriver can't preview, since they are downloaded by default. But images (JPEG, PNG) and PDFs are previewed by default and hence aren't downloaded automatically.
The ChromeDriver options I am currently using (which work for non-previewable files):
chrome_options = webdriver.ChromeOptions()
prefs = {'download.default_directory' : 'custom_download_dir'}
chrome_options.add_experimental_option('prefs', prefs)
driver = webdriver.Chrome("./chromedriver.exe", chrome_options=chrome_options)
This downloads the files to 'custom_download_dir', no issues. But the preview-able files are just previewed in the ChromeDriver instance and not downloaded.
Are there any ChromeDriver settings that disable this preview behavior and download all files directly, irrespective of extension?
If not, can this be done using Firefox for instance?
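For the PDF half of this question, one Chrome preference that is commonly suggested is plugins.always_open_pdf_externally, which asks Chrome to download PDFs rather than open them in the built-in viewer. The sketch below is offered under that assumption. The helper names are hypothetical, the preference may behave differently across Chrome versions, and I am not aware of an equivalent preference for images.

```python
def build_download_prefs(download_dir):
    """Return Chrome preferences that favor downloading over in-browser preview."""
    return {
        # An absolute path is generally the safer choice here.
        "download.default_directory": download_dir,
        # Skip the "Save as" dialog.
        "download.prompt_for_download": False,
        # Download PDFs instead of opening them in Chrome's PDF viewer.
        "plugins.always_open_pdf_externally": True,
    }


def make_driver(download_dir):
    """Start Chrome with the preferences above (needs Selenium and ChromeDriver)."""
    # Imported lazily so build_download_prefs has no third-party dependency.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", build_download_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

Images would still need one of the URL-based downloads from the answers, since this preference only concerns the PDF viewer.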
-
animesharma about 6 years
The problem is that viewing the image requires authentication. I tried the Python Requests library, but the site requires Kerberos authentication; I tried supplying credentials and using the Python Kerberos library, but it just doesn't work. I can view the image in Selenium WebDriver, so I am looking for a way to download it via the WebDriver instance itself.
-
Pitto about 6 years
What about disabling the auto-open for images on Google Chrome? That could trigger the automatic download... presentermedia.com/blog/2013/10/…
-
animesharma about 6 years
Is there an option to disable auto-open using the Chrome WebDriver settings in Python?
-
oldboy almost 3 years
@halfer apparently, urlretrieve is legacy. Is there a newer, better way to do this?
-
halfer almost 3 years
No probs @oldboy. Pitto, thanks for making an edit - don't forget to draw people's attention to them. People will only see the change if they subscribe to your answer, so it might have been missed in this case.