Get text content of an HTML element using XPath?

html xml xpath html-parsing

68,853

You want to select all descendant text, not just child text:

//div[a[contains(., "Add to cart")]]/p//text()

Note the double slash between p and text() there.

This may also pick up a lot of inter-tag whitespace, though, so you'll need to clean that up. Example using lxml:

>>> import lxml.etree as ET
>>> tree = ET.fromstring('''<div>
... <div>
...     <p>
...     <span class="abc">Monitor</span> <b>$300</b>
...     </p>
...     <a href="/add">Add to cart</a>
... </div>
... <div>
...     <p>
...     <span class="abc">Keyboard</span> $20 
...     </p>
...     <a href="/add">Add to cart</a>
... </div>
... </div>''')
>>> tree.xpath('//div[a[contains(., "Add to cart")]]/p//text()')
['\n    ', 'Monitor', ' ', '$300', '\n    ', '\n    ', 'Keyboard', ' $20 \n    ']
>>> res = _
>>> [txt for txt in (txt.strip() for txt in res) if txt]
['Monitor', '$300', 'Keyboard', '$20']
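
If what you want is one string per product ("Monitor $300", "Keyboard $20") rather than a flat list of fragments, one option is to select the p elements themselves and ask XPath for their whitespace-normalized string value. The following is a minimal sketch under the assumption that the markup matches the sample above; the variable names `html`, `paragraphs` and `items` are only illustrative.

```python
import lxml.etree as ET

# Same sample markup as in the example above, wrapped in a single root <div>.
html = '''<div>
<div>
    <p>
    <span class="abc">Monitor</span> <b>$300</b>
    </p>
    <a href="/add">Add to cart</a>
</div>
<div>
    <p>
    <span class="abc">Keyboard</span> $20 
    </p>
    <a href="/add">Add to cart</a>
</div>
</div>'''

tree = ET.fromstring(html)

# Select the <p> elements, not their text nodes.
paragraphs = tree.xpath('//div[a[contains(., "Add to cart")]]/p')

# normalize-space() with no argument uses the string value of the context
# node (all descendant text concatenated), strips leading/trailing
# whitespace and collapses internal runs of whitespace to a single space.
items = [p.xpath('normalize-space()') for p in paragraphs]

print(items)  # ['Monitor $300', 'Keyboard $20']
```

This keeps the text of each p together, so you don't have to work out afterwards which fragments belong to which product.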


Author by

Genghis Khan

/^[[:space:]]+$/

Updated on August 17, 2020

Comments

Genghis Khan almost 4 years

See this html

<div>
    <p>
    <span class="abc">Monitor</span> <b>$300</b>
    </p>
    <a href="/add">Add to cart</a>
</div>
<div>
    <p>
    <span class="abc">Keyboard</span> $20 
    </p>
    <a href="/add">Add to cart</a>
</div>

Using XPath, I want to extract Monitor $300 and Keyboard $20. I use this XPath:

 //div[a[contains(., "Add to cart")]]/p/text()

But it selects <span class="abc">Monitor</span> <b>$300</b>. I don't want the tags. How do I get only the text?


