Selenium find all elements by XPath
To grab all prices on that page, use this XPath:
Header = driver.find_elements_by_xpath("//span[contains(concat(' ', normalize-space(@class), ' '), ' price-amount ')]")
which means: find every span element whose class attribute contains price-amount as a whole token. The concat/normalize-space wrapping is needed because @class may hold several space-separated classes, so a plain @class='price-amount' comparison would miss elements like class="product price-amount"; the surrounding spaces in ' price-amount ' also prevent false matches on classes such as price-amount-large.
The same elements can be found more simply with a CSS selector:
.price-amount
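A small pure-Python sketch of the check that XPath expression performs: treat the class attribute as a space-separated token list and test for a whole token, so price-amount matches but price-amount-large does not. The function name and sample class strings are made up for illustration.

```python
def has_class(class_attr, name):
    # Mirrors contains(concat(' ', normalize-space(@class), ' '), ' name '):
    # collapse whitespace, pad with spaces, then look for the padded token.
    padded = " " + " ".join(class_attr.split()) + " "
    return (" " + name + " ") in padded

print(has_class("product price-amount", "price-amount"))        # True
print(has_class("product price-amount-large", "price-amount"))  # False
```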
Serious Ruffy
Updated on December 13, 2020
Comments
-
Serious Ruffy over 3 years
I used Selenium to scrape a scrolling website and wrote the code below:
import csv
import re
import time
import unittest

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

output_file = open("Kijubi.csv", "w", newline='')

class Crawling(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()
        self.driver.set_window_size(1024, 768)
        self.base_url = "http://www.viatorcom.de/"
        self.accept_next_alert = True

    def test_sel(self):
        driver = self.driver
        delay = 3
        driver.get(self.base_url + "de/7132/Seoul/d973-allthingstodo")
        for i in range(1, 1):
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(2)
        html_source = driver.page_source
        data = html_source.encode("utf-8")
My next step was to extract specific information from the website, such as the price. Hence, I added the following code:
all_spans = driver.find_elements_by_xpath("/html/body/div[5]/div/div[3]/div[2]/div[2]/div[1]/div[1]/div")
print(all_spans)
for price in all_spans:
    Header = driver.find_elements_by_xpath("/html/body/div[5]/div/div[3]/div[2]/div[2]/div[1]/div[1]/div/div[2]/div[2]/span[2]")
    for span in Header:
        print(span.text)
But I get just one price instead of all of them. Could you give me feedback on how I could improve my code? Thanks :)
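A minimal sketch of why the loop above prints the same price repeatedly: the inner find_elements call uses an absolute path starting at /html, so every iteration searches the whole page from the top instead of inside the current container. Searching relative to each container element yields one price per product. The example below uses the standard-library xml.etree as a stand-in for Selenium's DOM, and the tag structure and class names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Toy page: three product containers, each with its own price span.
page = ET.fromstring(
    "<body>"
    "<div class='product'><span class='price-amount'>10</span></div>"
    "<div class='product'><span class='price-amount'>20</span></div>"
    "<div class='product'><span class='price-amount'>30</span></div>"
    "</body>"
)

prices = []
for product in page.findall(".//div[@class='product']"):
    # Relative lookup: the leading "." scopes the search to this product only.
    span = product.find(".//span[@class='price-amount']")
    prices.append(span.text)

print(prices)  # ['10', '20', '30']
```

In Selenium the same idea is element.find_elements(...) called on each container element with a relative locator, rather than driver.find_elements(...) with an absolute one.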
EDIT
Thanks to you guys, I managed to get it running. Here is the additional code:
elements = driver.find_elements_by_xpath("//div[@id='productList']/div/div")
innerElements = 15
outerElements = len(elements)/innerElements
print(innerElements, "\t", outerElements, "\t", len(elements))
for j in range(1, int(outerElements)):
    for i in range(1, int(innerElements)):
        headline = driver.find_element_by_xpath("//div[@id='productList']/div["+str(j)+"]/div["+str(i)+"]/div/div[2]/h2/a").text
        price = driver.find_element_by_xpath("//div[@id='productList']/div["+str(j)+"]/div["+str(i)+"]/div/div[2]/div[2]/span[2]").text
        deeplink = driver.find_element_by_xpath("//div[@id='productList']/div["+str(j)+"]/div["+str(i)+"]/div/div[2]/h2/a").get_attribute("href")
        print("Header: " + headline + " | " + "Price: " + price + " | " + "Deeplink: " + deeplink)
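One detail worth checking in the snippet above before blaming the page content: Python's range(1, n) stops at n - 1, so both loops skip their last index, which silently drops the final group of items. A minimal illustration (the element counts are made up):

```python
# Pretend the page has 90 product elements in groups of 15.
elements_found = 90
inner = 15
outer = elements_found // inner   # 6 groups

visited = [j for j in range(1, outer)]          # range excludes the stop value
visited_all = [j for j in range(1, outer + 1)]  # +1 includes the last group

print(visited)      # [1, 2, 3, 4, 5]  -- group 6 is never visited
print(visited_all)  # [1, 2, 3, 4, 5, 6]
```

The same off-by-one applies to the inner loop: range(1, 15) visits indices 1 through 14 only.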
Now my last issue is that I still do not get back the last 20 prices, which have an English description. I only get the prices with a German description; the English ones are not fetched, although they share the same HTML structure.
E.g., the HTML structure for the English items:
headline = driver.find_element_by_xpath("//div[@id='productList']/div[6]/div[1]/div/div[2]/h2/a")
Do you guys know what I have to modify? Any feedback is appreciated:)
-
Serious Ruffy over 8 years: Thanks for your feedback