Select deeply nested element
Assuming indentation denotes containment in your example, the following XPath will select the span element for you:
//div[@id='...']/div[3]/div[2]/div/div/span
Of course, if there are no other span elements beneath the id'ed div, you could jump right to it:
//div[@id='...']//span
Or, if there are no other span elements in the entire document:
//span
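The three expressions above can be checked against a small sample document. The sketch below is an illustration only: it uses Python's standard-library xml.etree.ElementTree (which supports a limited XPath subset) rather than Scrapy's selectors, and the sample markup and the id "main" are hypothetical stand-ins for the elided id.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; id="main" stands in for the elided id.
DOC = """
<root>
  <div id="main">
    <div/>
    <div/>
    <div>
      <div/>
      <div>
        <div>
          <div>
            <span>target</span>
          </div>
        </div>
      </div>
    </div>
  </div>
</root>
"""

root = ET.fromstring(DOC)

# ElementTree paths are relative to an element, hence the leading "."
# where a full XPath engine would accept a bare "//".
explicit = root.findall(".//div[@id='main']/div[3]/div[2]/div/div/span")
under_id = root.findall(".//div[@id='main']//span")
anywhere = root.findall(".//span")

# Each list holds the same single span element.
print([s.text for s in explicit])
print([s.text for s in under_id])
print([s.text for s in anywhere])
```

In this sample all three paths find the same element. The shorter forms only stay equivalent while no other span elements exist in the scope they search.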
Comments
-
grigy almost 2 years
I'm reading Scrapy/XPath tutorials but this does not seem trivial and I can't find an example that would explain it.
Given markup like this, how would you select the <span> element?
<div id="...">
  <div>
  <div>
  <div>
    <div>
    <div>
      <div>
        <div>
          <span>
If we generalize the problem it would be:
- skip n divs in the div with id="..."
- skip m divs in the div
- ...
- select the span element in the div
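The generalized steps above can be sketched as a small path builder. The function name nested_div_xpath and its parameters are hypothetical, introduced only for illustration. It assumes that "skip n divs" means "take the (n + 1)-th div child", since XPath positions are 1-based.

```python
def nested_div_xpath(root_id, skips, leaf="span"):
    """Build an XPath that skips skips[i] sibling divs at nesting level i."""
    steps = [f"//div[@id='{root_id}']"]
    for skip in skips:
        # XPath positions are 1-based, so skipping n siblings means div[n + 1].
        steps.append(f"div[{skip + 1}]")
    steps.append(leaf)
    return "/".join(steps)


# Skip 2 divs, then 1 div, then none at the next two levels.
print(nested_div_xpath("...", [2, 1, 0, 0]))
```

Note that div[1] selects only the first div child, whereas a bare div step selects every div child; the two agree whenever there is a single div at that level.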
-
grigy almost 9 years
Very definitive answer. Thanks!
-
grigy almost 9 years
It helped, but for some reason I can't extract the selector. If I log the content, it prints a "square" symbol.
-
kjhughes almost 9 years
Hard to tell without seeing a complete example, but perhaps it's not showing the string value of the selected element as you're expecting. Does it help to explicitly select the text nodes of the span (by appending /text() to your XPath)?
-
grigy almost 9 years
No. It returns an empty list. It returns a non-empty value only for //div[@id='...']; for all other nodes under it, the selector returns an empty list.
-
kjhughes almost 9 years
If you provide a Minimal, Complete, and Verifiable Example (MCVE) that exhibits the problem, it should be easy to see what's going on and help. Thanks.
-
kjhughes almost 9 years
From what I'm seeing, the referenced page does not have any children under <div id="developer_blog_index" data-referrer="developer_blog_index"></div>, so of course you won't be able to select anything beneath there.
-
grigy almost 9 years
Strange, I can see the children (actually the whole content) under the div in Chrome's element inspector. Maybe it loads it dynamically...
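The behavior discussed in these comments can be reproduced with a minimal sketch. It assumes the fetched markup contains only the empty container quoted above, and it uses Python's standard-library xml.etree.ElementTree instead of Scrapy's selectors. The surrounding body element is a hypothetical wrapper added so the snippet parses.

```python
import xml.etree.ElementTree as ET

# The div as quoted in the comments: present in the raw markup, but empty.
# The <body> wrapper is a hypothetical addition for this sketch.
RAW = """
<body>
  <div id="developer_blog_index" data-referrer="developer_blog_index"></div>
</body>
"""

root = ET.fromstring(RAW)

# The container itself is found.
container = root.findall(".//div[@id='developer_blog_index']")
# Any element beneath it: nothing exists in the raw markup.
descendants = root.findall(".//div[@id='developer_blog_index']//*")

print(len(container))
print(len(descendants))
```

If the content seen in the browser's inspector is inserted by scripts after the page loads, as the last comment suspects, a selector run against the raw response would behave like this: the container matches, but every path beneath it returns an empty list.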