XPath - All following siblings except first specific elements

18,203

Solution 1

Use:

 //table[@id='target']/following-sibling::*[not(self::table) and not(self::ol)]
|
 //table[@id='target']/following-sibling::table[position() > 1]
|
 //table[@id='target']/following-sibling::ol[position() > 1]

This selects the union of three sets: all following siblings of the target table that are neither table nor ol elements; all following table siblings at position 2 or greater; and all following ol siblings at position 2 or greater.

That is exactly what you want: all following siblings except the first following table sibling and the first following ol sibling.

This is pure XPath 1.0 and uses no XSLT functions.
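To make the three branches of the union concrete, here is a minimal Python sketch that rebuilds them by hand. It is an illustration under assumptions, not a run of the XPath itself: the sample document, the ids, and the helper names following_siblings and union_of_three are all hypothetical, and because the standard library's ElementTree has no following-sibling axis, the sibling step is emulated by slicing the parent's child list.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the ids exist only to make the output readable.
DOC = """<body>
  <ol id="before"/>
  <table id="target"/>
  <p id="p1"/>
  <table id="t1"/>
  <ol id="o1"/>
  <table id="t2"/>
  <ol id="o2"/>
  <div id="d1"/>
</body>"""

def following_siblings(root, target):
    """Return the element siblings that come after `target`, in document order."""
    for parent in root.iter():
        children = list(parent)
        if target in children:
            return children[children.index(target) + 1:]
    return []

def union_of_three(root):
    target = root.find(".//table[@id='target']")
    sibs = following_siblings(root, target)
    # Branch 1: following siblings that are neither table nor ol.
    others = [e for e in sibs if e.tag not in ("table", "ol")]
    # Branch 2: following table siblings at position 2 or greater.
    tables = [e for e in sibs if e.tag == "table"][1:]
    # Branch 3: following ol siblings at position 2 or greater.
    ols = [e for e in sibs if e.tag == "ol"][1:]
    # An XPath union yields nodes in document order, so restore that order here.
    chosen = {id(e) for e in others + tables + ols}
    return [e.get("id") for e in sibs if id(e) in chosen]

print(union_of_three(ET.fromstring(DOC)))  # ['p1', 't2', 'o2', 'd1']
```

The ol with id "before" sits ahead of the target table, so it never enters any branch; only t1 and o1, the first table and first ol after the target, are dropped.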

Solution 2

Answering the second question first: what your expression is doing is selecting all following siblings that are neither table nor ol elements.

Here's why: self::table[1] selects the context node's self (iff it passes the table element name test) and filters to select only the first node along the self:: axis. There is at most one node on the self:: axis passing the element name test, so the [1] is redundant. self::table[1] selects the context node whenever it is a table element, regardless of its position among its siblings. So not(self::table[1]) returns false whenever the context node is a table element, regardless of its position among siblings.

Similarly for self::ol[1].

How to do what you're trying to do: @John Kugelman's answer is almost correct, but misses the fact that we must ignore sibling elements before and including table[@id='target']. I don't think it's possible to do correctly in pure XPath 1.0. Do you have the possibility to use XPath 2.0? If you're working in a browser, the answer is generally no.

Some workarounds would be:

  • Skip the first following table sibling and the first following ol sibling by filtering on some other basis, such as their attributes;
  • Select //table[@id='target'] as a nodeset, return it to the host environment (i.e. outside of XPath, e.g. in JavaScript), loop through that nodeset; inside the loop: select following-sibling::* via XPath, iterate through that outside of XPath, and test each result (outside of XPath) to see if it is the first table or ol.
  • Select //table[@id='target'] as a nodeset, return it to the host environment (i.e. outside of XPath, e.g. in JavaScript), loop through that nodeset; inside the loop: select generate-id(following-sibling::table[1]) and generate-id(following-sibling::ol[1]) via XPath, receive those values into JS variables t1id and o1id, and construct a string for the XPath expression using the form "following-sibling::*[generate-id() != '" + t1id + "' and generate-id() != '" + o1id + "']" (the ids must be quoted so that they are compared as string literals). Evaluate that string as XPath and you have your answer! :-p Note that generate-id() is an XSLT function, so this only works where the XPath engine makes XSLT functions available.
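The second workaround (do the "is this the first table or ol?" test in the host environment) could look roughly like the Python sketch below. The answer mentions JavaScript as the host; Python's standard-library ElementTree is swapped in here purely for illustration. The sample document and the function name siblings_after_target are hypothetical, and since ElementTree does not implement the following-sibling axis, that step is approximated by slicing the parent's child list.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the ids exist only to make the output readable.
SAMPLE = """<body>
  <ol id="before"/>
  <table id="target"/>
  <p id="p1"/>
  <table id="t1"/>
  <ol id="o1"/>
  <table id="t2"/>
  <ol id="o2"/>
  <div id="d1"/>
</body>"""

def siblings_after_target(root, target_id="target"):
    """Following siblings of the target table, minus the first table and the first ol."""
    kept = []
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag != "table" or child.get("id") != target_id:
                continue
            skipped = set()  # tag names whose first occurrence was already dropped
            for sib in children[i + 1:]:
                if sib.tag in ("table", "ol") and sib.tag not in skipped:
                    skipped.add(sib.tag)  # first table / first ol after the target
                    continue
                kept.append(sib)
    return kept

print([e.get("id") for e in siblings_after_target(ET.fromstring(SAMPLE))])
# ['p1', 't2', 'o2', 'd1']
```

Counting starts only after the target table, which is the detail that is awkward to express in a single predicate.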

Update: A solution is possible in pure XPath 1.0 after all - see @Dimitre's answer (Solution 1).

Solution 3

There's only going to be one node when you use the self:: axis, so I believe self::*[1] will always be true. Every node is going to be the first (and only) node on its own self:: axis. This means your bracketed expression is equivalent to [not(self::table) and not(self::ol)], meaning all the tables and lists get filtered out.

I don't have a test environment set up, but off the top of my head this might do better:

//table[@id='target']/following-sibling::*
    [not(self::table and not(preceding-sibling::table)) and
     not(self::ol    and not(preceding-sibling::ol))]

It'll need some tweaking, but the idea is to filter out tables that do not have preceding-sibling tables, and ols that do not have preceding-sibling ols.
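As a rough check of where the tweaking is needed, the sketch below emulates that predicate in Python on a small sample. Everything in it is an assumption made for illustration: the sample document, the ids, and the function name preceding_sibling_filter are hypothetical, and the preceding-sibling test is emulated by looking at every earlier child of the parent, which is what the preceding-sibling axis covers.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one ol before the target table, two tables and two ols after it.
DOC = """<body>
  <ol id="before"/>
  <table id="target"/>
  <p id="p1"/>
  <table id="t1"/>
  <ol id="o1"/>
  <table id="t2"/>
  <ol id="o2"/>
  <div id="d1"/>
</body>"""

def preceding_sibling_filter(root):
    """Emulate [not(self::table and not(preceding-sibling::table)) and
    not(self::ol and not(preceding-sibling::ol))] on the target's following siblings."""
    children = list(root)  # in this sample the target's parent is the root element
    target = next(c for c in children if c.get("id") == "target")
    start = children.index(target) + 1
    kept = []
    for i in range(start, len(children)):
        elem = children[i]
        # preceding-sibling:: reaches back to the parent's first child,
        # so it also sees the target table and anything before it.
        preceding_tags = {c.tag for c in children[:i]}
        no_table_before = elem.tag == "table" and "table" not in preceding_tags
        no_ol_before = elem.tag == "ol" and "ol" not in preceding_tags
        if not no_table_before and not no_ol_before:
            kept.append(elem.get("id"))
    return kept

print(preceding_sibling_filter(ET.fromstring(DOC)))
# ['p1', 't1', 'o1', 't2', 'o2', 'd1']
```

On this sample nothing is filtered out: t1 has the target table itself as a preceding table sibling, and o1 has the earlier ol, so neither is recognised as "first".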

Author: Shaun

Updated on July 24, 2022

Comments

  • Shaun
    Shaun almost 2 years

    Let's say I'm querying an xhtml document, and I want to query all of the siblings following a table with id='target'. Also, I neither want the first <table> sibling nor the first <ol> sibling of this particular element. Here's the best I could come up with:

    //table[@id='target']/following-sibling::*[not(self::table[1]) and not(self::ol[1])]
    

    However, this isn't returning any results when it should. Obviously I'm not understanding some of the syntax for this (I couldn't find a good source of information). I would certainly appreciate it if someone more experienced with XPath syntax could give me a hand. Also, for purely academic purposes, I'd be curious what the above is actually doing.

    UPDATE:
    See LarsH's answer for the explanation of why my XPath wasn't working, and see Dimitre's answer for the accepted solution.

  • LarsH
    LarsH over 13 years
    As I noted in my answer, this solution misses the fact that the target table should not count as a preceding-sibling when you're testing following siblings to see which is [1]. Also, any table or ol elements before the target table have to be ignored by the preceding-sibling test.
  • Dimitre Novatchev
    Dimitre Novatchev over 13 years
    @John Kugelman: Take a look at my answer for a pure and not-too-complex XPath 1.0 solution.
  • Shaun
    Shaun over 13 years
    This is exactly the elegant solution I was looking for, and it's actually quite simple. It never occurred to me to get the union of the three mutually exclusive sets containing everything except those two elements. Thank you!
  • Shaun
    Shaun over 13 years
    Thanks for the great explanation as to why my attempted query was failing. I ultimately went with Dimitre's XPath to solve the problem.
  • LarsH
    LarsH over 13 years
    Ah, should have thought about it harder. :-) I was trying a union solution but couldn't get it. Typo though: according to the original question, // rather than / is needed before table.
  • John Kugelman
    John Kugelman over 13 years
    Heh, for some reason I decided a union solution would be bad and specifically avoided using it. +1 for not being boneheaded like me. :-)