Getting meta tags from a page source using Selenium Python
If I look at the page source, for example in Chrome (view-source:https://play.google.com/store/apps/details?id=com.teslacoilsw.launcher&hl=en), I also don't find a <div> element with attribute @itemprop and value price.
So your XPath is completely wrong. Also, browser.find_element_by_xpath() returns an element, whereas you want to extract the value of its @content attribute. You should use the following instead:
# Select the <meta> element itself, then read its content attribute
priceValue = browser.find_element_by_xpath("//meta[@itemprop='price']")
print(priceValue.get_attribute("content"))
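In recent Selenium 4 releases the find_element_by_* helpers are no longer available, so the same idea can be sketched as below. This is only a minimal sketch under assumptions: meta_content is a hypothetical helper name, and driver is assumed to be a Selenium 4 style WebDriver whose find_element(by, value) accepts the locator strategy string "xpath" (the value of By.XPATH).

```python
def meta_content(driver, itemprop):
    """Return the content attribute of <meta itemprop="...">.

    `driver` is assumed to be a Selenium 4 style WebDriver; "xpath" is
    the string value of selenium.webdriver.common.by.By.XPATH.
    """
    # Select the <meta> element itself, not its @content attribute node:
    # find_element must resolve to an element.
    element = driver.find_element("xpath", "//meta[@itemprop='%s']" % itemprop)
    # <meta> carries no text, so the value is read from the attribute.
    return element.get_attribute("content")
```

With a live session this would be called as meta_content(browser, "price").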
Comments
-
Siddharthan Asokan almost 2 years
I'm trying to fetch the data below from the URL https://play.google.com/store/apps/details?id=com.teslacoilsw.launcher&hl=en
<meta content="3.99" itemprop="price">
I used the following Python code to fetch it, but it failed.
browser = webdriver.Firefox()  # Get local session of firefox
browser.get(sampleURL)  # Load page
assert "Google Play" in browser.title
priceValue = browser.find_element_by_xpath("//div[@itemprop='price']")
print priceValue.text
But it says it can't find the xpath of value price. Any idea why?
EDIT
priceValue = browser.find_element_by_xpath("//meta[@itemprop='price']")
print priceValue.text
I get an empty string.
-
Siddharthan Asokan over 10 years
Using Firefox I get the following error with your suggestion: The given selector //meta[@itemprop='price']/@content is either invalid or does not result in a WebElement. The following error occurred: InvalidSelectorError: The result of the xpath expression "//meta[@itemprop='price']/@content" is: [object XrayWrapper [object Attr]]. It should be an element.
-
Mark Veenstra over 10 years
Well, the XPath to the value 3.99 would be //meta[@itemprop='price']/@content. If you need to get the element returned and not the value, you can use //meta[@itemprop='price']. That would return the <meta> element.
-
Mark Veenstra over 10 years
@SiddharthanAsokan you should not use print priceValue.text, since what you want is the attribute value, right? In that case you need print priceValue.get_attribute("content"). See my corrected answer.
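
The difference between the element and its attribute discussed in these comments can be reproduced without a browser. The sketch below is only an illustration under stated assumptions: it uses Python's standard xml.etree.ElementTree instead of Selenium, and the <html><head> wrapper and the self-closing <meta/> are added here so that the tag from the question parses as XML.

```python
import xml.etree.ElementTree as ET

# The <meta> tag from the question, wrapped in a minimal well-formed document
# (the wrapper and the self-closing slash are assumptions for this sketch).
SAMPLE = '<html><head><meta content="3.99" itemprop="price"/></head></html>'

root = ET.fromstring(SAMPLE)

# Select the element, as //meta[@itemprop='price'] does in the answer.
meta = root.find(".//meta[@itemprop='price']")

# The element has no text node, so asking for its text yields nothing useful.
text = meta.text

# The price lives in the content attribute.
price = meta.get("content")

print(repr(text), price)
```

ElementTree reports the missing text as None, where Selenium returned an empty string for .text; in both cases the value has to come from the content attribute.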