XPath select all elements between two specific elements
Solution 1
Same concept as bytebuster's answer, but a different XPath:
/*/p[count(preceding-sibling::divider)=1]
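To make the counting idea concrete, here is a minimal Python sketch that mimics what the expression selects. It uses only the standard library's xml.etree.ElementTree, whose limited XPath subset has no preceding-sibling axis or count(), so the siblings are walked by hand. The function name paragraphs_after_kth_divider and the parameter k are made up for this illustration; the XML is the sample from the question.

```python
import xml.etree.ElementTree as ET

# Sample document from the question.
XML = (
    "<doc> <divider /> <p>text</p> <p>text</p> <p>text</p> <p>text</p> "
    "<p>text</p> <divider /> <p>text</p> <p>text</p> <divider /> "
    "<p>text</p> <divider /> </doc>"
)


def paragraphs_after_kth_divider(root, k):
    """Return the p children of root with exactly k preceding-sibling dividers.

    Hand-written equivalent of /*/p[count(preceding-sibling::divider)=k].
    """
    selected = []
    dividers_seen = 0
    for child in root:
        if child.tag == "divider":
            dividers_seen += 1
        elif child.tag == "p" and dividers_seen == k:
            selected.append(child)
    return selected


root = ET.fromstring(XML)
counts = [len(paragraphs_after_kth_divider(root, k)) for k in (1, 2, 3)]
print(counts)  # [5, 2, 1]
```

With k set to 1 this returns the five p elements between the first and second divider, which is what the question asks for.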
Solution 2
Here is a general XPath expression:
/*/divider[$k]
/following-sibling::p
[count(.|/*/divider[$k+1]/preceding-sibling::p)
=
count(/*/divider[$k+1]/preceding-sibling::p)
]
If you substitute $k with 1, then exactly the wanted p nodes are selected. If you substitute $k with 2, then all p elements between the 2nd and 3rd divider are selected, and so on.
Explanation:
This is a simple application of the Kayessian XPath 1.0 formula for node-set intersection:
$ns1[count(.|$ns2) = count($ns2)]
selects all the nodes that belong to both node-sets $ns1 and $ns2: forming the union of a node with $ns2 leaves the count unchanged only if that node is already in $ns2.
In this specific case we substitute $ns1 with:
/*/divider[$k]/following-sibling::p
and we substitute $ns2 with:
/*/divider[$k+1]/preceding-sibling::p
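The intersection trick can be sketched in plain Python to show why it works. This is only an illustration under stated assumptions: the two node-sets are built by hand from the child list (the standard library's ElementTree does not evaluate these axes), and all function names are hypothetical. The XML is the sample from the question.

```python
import xml.etree.ElementTree as ET

XML = (
    "<doc> <divider /> <p>text</p> <p>text</p> <p>text</p> <p>text</p> "
    "<p>text</p> <divider /> <p>text</p> <p>text</p> <divider /> "
    "<p>text</p> <divider /> </doc>"
)


def following_sibling_ps(children, i):
    """p elements after position i (stands in for following-sibling::p)."""
    return [c for c in children[i + 1:] if c.tag == "p"]


def preceding_sibling_ps(children, i):
    """p elements before position i (stands in for preceding-sibling::p)."""
    return [c for c in children[:i] if c.tag == "p"]


def kayessian_intersection(ns1, ns2):
    """Keep each node n of ns1 for which count(n | ns2) == count(ns2)."""
    ns2_ids = {id(n) for n in ns2}
    return [n for n in ns1 if len(ns2_ids | {id(n)}) == len(ns2_ids)]


def between_dividers(root, k):
    """p elements between the k-th and (k+1)-th divider, for k >= 1."""
    children = list(root)
    pos = [i for i, c in enumerate(children) if c.tag == "divider"]
    # A missing divider yields an empty node-set, as it would in XPath.
    ns1 = following_sibling_ps(children, pos[k - 1]) if k <= len(pos) else []
    ns2 = preceding_sibling_ps(children, pos[k]) if k < len(pos) else []
    return kayessian_intersection(ns1, ns2)


root = ET.fromstring(XML)
sizes = [len(between_dividers(root, k)) for k in (1, 2, 3, 4)]
print(sizes)  # [5, 2, 1, 0]
```

For k = 4 there is no fifth divider, so the second node-set is empty and nothing is selected, which matches how the XPath expression behaves.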
Solution 3
I think there's a much simpler and probably faster solution: you want all preceding siblings of the second divider that have at least one preceding sibling divider:
/doc/divider[2]/preceding-sibling::p[preceding-sibling::divider]
It gets a bit more complex, of course, if you want to find the paras between the second and third dividers: then you want something more like Daniel Haley's solution.
Solution 4
What about selecting all p elements that have exactly one divider element as a preceding sibling?
//doc/p[preceding-sibling::divider[1] and not (preceding-sibling::divider[2])]
Mirko
Updated on October 06, 2020
Comments
-
Mirko over 3 years
I have the following XML:
<doc> <divider /> <p>text</p> <p>text</p> <p>text</p> <p>text</p> <p>text</p> <divider /> <p>text</p> <p>text</p> <divider /> <p>text</p> <divider /> </doc>
I want to select all p nodes after the first divider element up to the next occurrence of a divider element. I tried the following XPath:
//divider[1]/following-sibling::p[following::divider]
but the problem is that it selects all p elements before the last divider element. I'm not sure how to do this using XPath 1.0.
-
Be Brave Be Like Ukraine about 12 years
Great idea! count is more idiomatic as the 1 is used only once.
-
vhs about 4 years
Allows start and end divider tags to vary so, for instance, one may select items between an H1 and the first TABLE. Simple and flexible.