XPath for choosing next sibling

Solution 1

EDIT: As noted by @Gaim, my original version failed to capture a terminal dt (a dt that is the last sibling).

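using System;
using System.Xml;

// Sample document: name2, name4 and name6 are not immediately followed by a dd
string xml = @"
    <root>
    <dt>name</dt>
    <dd>value</dd>
    <dt>name2</dt>
    <dt>name3</dt>
    <dd>value3</dd>
    <dt>name4</dt>
    <dt>name5</dt>
    <dd>value5</dd>
    <dt>name6</dt>
    </root>
    ";

XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

// Select every dt whose first following sibling element is not a dd
XmlNodeList nodes =
    doc.SelectNodes("//dt[not(following-sibling::*[1][self::dd])]");

foreach (XmlNode node in nodes)
{
    Console.WriteLine(node.OuterXml);
}

// Keep the console window open until Enter is pressed
Console.ReadLine();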

The output is the dt nodes that do not have a dd immediately following them:

<dt>name2</dt>
<dt>name4</dt>
<dt>name6</dt>

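What we are doing here is saying:

//dt

All dt nodes, anywhere...

[not(following-sibling::*[1]

...such that it's not the case that their first following sibling element (whatever it is called)...

[self::dd])]

...is called dd.

If a dt has no following sibling at all, the inner expression selects nothing, so not(...) is true and the terminal dt is matched as well.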
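
The breakdown above can be checked one dt at a time. The following is a minimal sketch, assuming a shortened copy of the sample document and C# top-level statements (C# 9 or later); the variable names next, nextDd and flagged are illustrative only and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Shortened copy of the sample document (an assumption made for brevity).
string xml = "<root><dt>name</dt><dd>value</dd><dt>name2</dt>"
           + "<dt>name3</dt><dd>value3</dd><dt>name4</dt></root>";

XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

// Text of every dt that the full expression would select.
List<string> flagged = new List<string>();

foreach (XmlNode dt in doc.SelectNodes("//dt"))
{
    // following-sibling::*[1] : the nearest following element sibling,
    // or null when the dt is the last element.
    XmlNode next = dt.SelectSingleNode("following-sibling::*[1]");

    // [self::dd] : keep that sibling only if it is a dd.
    XmlNode nextDd = dt.SelectSingleNode("following-sibling::*[1][self::dd]");

    // not(...) : the dt is selected when no such dd exists.
    bool isFlagged = nextDd == null;
    if (isFlagged)
    {
        flagged.Add(dt.InnerText);
    }

    Console.WriteLine("{0}: next = {1}, flagged = {2}",
        dt.InnerText, next == null ? "(none)" : next.Name, isFlagged);
}
```

With this input, name2 (followed by another dt) and name4 (no following sibling) are the two flagged entries.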

Solution 2

I am not sure I understand you, but here is my solution. This XPath matches all <dt> elements that are not directly followed by a <dd>. Here is a test structure:

<xml>
  <dt>name</dt> <!-- match -->

  <dt>name2</dt>
  <dd>value2</dd>

  <dt>name</dt>
  <dd>value</dd>

  <dt>name2</dt>  <!-- match -->
</xml>

Here is the XPath:

//dt[ name( following-sibling::*[1] ) != 'dd' ]

or

//dt[  not( following-sibling::*[1]/self::dd ) ]

They do the same thing in a document like this one, which uses no namespaces.
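
To compare the two expressions, both can be run against the test structure. This is a rough sketch, assuming C# top-level statements (C# 9 or later); the dt texts are relabelled first, second, third and last so the matches can be told apart, and SelectTexts is a hypothetical helper, not a library method.

```csharp
using System;
using System.Linq;
using System.Xml;

// Test structure from the answer, with the dt texts relabelled (assumption).
string xml = @"<xml>
  <dt>first</dt>
  <dt>second</dt>
  <dd>value2</dd>
  <dt>third</dt>
  <dd>value</dd>
  <dt>last</dt>
</xml>";

XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

// Variant 1: compare the name of the first following sibling element.
string[] byName = SelectTexts(doc, "//dt[name(following-sibling::*[1]) != 'dd']");

// Variant 2: test whether the first following sibling element is a dd.
string[] bySelf = SelectTexts(doc, "//dt[not(following-sibling::*[1]/self::dd)]");

Console.WriteLine(string.Join(", ", byName));
Console.WriteLine(string.Join(", ", bySelf));
Console.WriteLine(byName.SequenceEqual(bySelf));

// Hypothetical helper: returns the text of every node the expression selects.
static string[] SelectTexts(XmlDocument d, string xpath)
{
    return d.SelectNodes(xpath).Cast<XmlNode>().Select(n => n.InnerText).ToArray();
}
```

Both variants select the first and the last dt here. For the last dt, name() of an empty node-set is the empty string, which differs from 'dd', so the first variant also covers the terminal case.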

Author by

Ula Krukar

Software developer, Java and C#.Net, always keen to find out something more...

Updated on July 12, 2022

Comments

  • Ula Krukar
    Ula Krukar almost 2 years

    I have a piece of HTML like this:

    <dt>name</dt>
    <dd>value</dd>
    <dt>name2</dt>
    <dd>value2</dd>
    

    I want to find all places where the structure is incorrect, meaning there is no dd tag directly after a dt tag.

    I tried this:

    //dt/following-sibling::dt
    

    but this doesn't work. Any suggestions?

  • Tomalak
    Tomalak over 14 years
    +1 -- The XPath expression can be boiled down to //dt[following-sibling::*[1][self::dt]]
  • Gaim
    Gaim over 14 years
    @Tomalak Your XPath doesn't match all cases. Look at my answer: you match only the first.
  • AakashM
    AakashM over 14 years
    +1 better than my original, which failed to capture a terminal dd-lacking dt
  • Tomalak
    Tomalak over 14 years
    @Gaim: You are right. The not() approach is the correct one; I did not think about the case where a <dt> is the last sibling.
  • Frederic Bazin
    Frederic Bazin about 11 years
    Any idea how to deal with the case where <dd> is missing, e.g. <b>label1</b> value1 <br> <b>label2</b>value2 <br>...... ?
  • Frederic Bazin
    Frederic Bazin about 11 years