XPath for choosing next sibling
Solution 1
EDIT: as noted by @Gaim, my original version failed to capture a terminal dt.
string xml = @"
<root>
<dt>name</dt>
<dd>value</dd>
<dt>name2</dt>
<dt>name3</dt>
<dd>value3</dd>
<dt>name4</dt>
<dt>name5</dt>
<dd>value5</dd>
<dt>name6</dt>
</root>
";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
// Select every dt whose first following sibling element is not a dd
XmlNodeList nodes =
    doc.SelectNodes("//dt[not(following-sibling::*[1][self::dd])]");
foreach (XmlNode node in nodes)
{
    Console.WriteLine(node.OuterXml);
}
Console.ReadLine();
Output is those dt nodes that do not have a dd immediately following them:
<dt>name2</dt>
<dt>name4</dt>
<dt>name6</dt>
What we are doing here is saying:
//dt
All dt nodes, anywhere...
[not(following-sibling::*[1]
...such that it is not the case that their first following sibling element (whatever it is called)...
[self::dd])]
...is called dd.
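The same check can be wrapped in a small reusable helper. This is only a minimal sketch, assuming the input is well-formed XML; the names DtChecker and FindUnpairedDt are made up for illustration and do not come from the original answer.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper (not part of the original answer).
public static class DtChecker
{
    // Returns the text of every <dt> whose first following sibling
    // element is not a <dd>, including a <dt> that is the last sibling.
    public static List<string> FindUnpairedDt(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var result = new List<string>();
        XmlNodeList nodes =
            doc.SelectNodes("//dt[not(following-sibling::*[1][self::dd])]");
        foreach (XmlNode node in nodes)
        {
            result.Add(node.InnerText);
        }
        return result;
    }
}
```

Called with the sample document from this answer, FindUnpairedDt(xml) should return the texts of the same three dt nodes listed in the output above.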
Solution 2
I am not sure I understand you, but here is my solution. This XPath matches all <dt> elements that are not directly followed by a <dd>. Here is a test structure:
<xml>
<dt>name</dt> <!-- match -->
<dt>name2</dt>
<dd>value2</dd>
<dt>name</dt>
<dd>value</dd>
<dt>name2</dt> <!-- match -->
</xml>
Here is the XPath:
//dt[ name( following-sibling::*[1] ) != 'dd' ]
or
//dt[ not( following-sibling::*[1]/self::dd ) ]
Both expressions do the same thing.
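One way to check that the two expressions agree on a given document is to run both and compare the results. The sketch below assumes an XML document without namespaces (name() returns the qualified name, so prefixed elements could make the first expression behave differently). XPathComparison, Select and SameMatches are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;

// Hypothetical helper for comparing the two expressions from this answer.
public static class XPathComparison
{
    // Returns the outer XML of each node selected by the given XPath.
    public static List<string> Select(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var matches = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            matches.Add(node.OuterXml);
        }
        return matches;
    }

    // True when both expressions select the same nodes, in the same order.
    public static bool SameMatches(string xml)
    {
        var byName = Select(xml, "//dt[ name( following-sibling::*[1] ) != 'dd' ]");
        var bySelf = Select(xml, "//dt[ not( following-sibling::*[1]/self::dd ) ]");
        return byName.SequenceEqual(bySelf);
    }
}
```

For a dt that is the last sibling, following-sibling::*[1] is empty, so name() yields an empty string and not() sees an empty node-set; both predicates are therefore true, which is why the terminal dt is matched by either form.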
Ula Krukar
Software developer, Java and C#.Net, always keen to find out something more...
Updated on July 12, 2022
Comments
-
Ula Krukar almost 2 years
I have a piece of HTML like this:
<dt>name</dt> <dd>value</dd> <dt>name2</dt> <dd>value2</dd>
I want to find all places where the structure is incorrect, meaning there is no dd tag after a dt tag. I tried this:
//dt/following-sibling::dt
but this doesn't work. Any suggestions?
-
Tomalak over 14 years
+1 -- The XPath expression can be boiled down to
//dt[following-sibling::*[1][self::dt]]
-
Gaim over 14 years
@Tomalak Your XPath doesn't match all cases; look at my answer. You match only the first.
-
AakashM over 14 years
+1 better than my original, which failed to capture a terminal dd-lacking dt
-
Tomalak over 14 years
@Gaim: You are right. The not() approach is the correct one; I did not think about the case where a <dt> is the last sibling.
-
Frederic Bazin about 11 years
Any idea how to deal with the case where <dd> is missing, e.g. <b>label1</b> value1 <br> <b>label2</b>value2 <br>...... ?
-
Frederic Bazin about 11 years
See my new question at stackoverflow.com/questions/16745209/…