HtmlAgilityPack selecting childNodes not as expected
Solution 1
Remove the leading forward slash from "/img[@alt]". A leading slash makes the XPath expression absolute, so the search starts at the root of the document instead of at linkNode.
HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
Solution 2
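In an XPath query you can also use "." to indicate that the search should start at the current node. Combined with "//", it matches img elements at any depth below linkNode, not only its direct children.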
HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@alt]");
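To see how the three forms differ, here is a minimal sketch that evaluates each expression from the same link node. The sample markup, the class name XPathScopeSketch, and the helper FindAlt are made up for illustration; only SelectNodes, SelectSingleNode, and GetAttributeValue come from HtmlAgilityPack.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathScopeSketch
{
    // Hypothetical sample markup: one image is a direct child of its link,
    // the other is wrapped in a span.
    public const string SampleHtml =
        "<div>" +
        "<a href=\"/direct\"><img alt=\"direct child\" src=\"a.png\"></a>" +
        "<a href=\"/nested\"><span><img alt=\"nested\" src=\"b.png\"></span></a>" +
        "</div>";

    // Hypothetical helper: returns the alt text of the first image matched by
    // xpath, evaluated from linkNode, or null when nothing matches.
    public static string FindAlt(HtmlNode linkNode, string xpath)
    {
        HtmlNode imageNode = linkNode.SelectSingleNode(xpath);
        return imageNode == null ? null : imageNode.GetAttributeValue("alt", string.Empty);
    }

    // Prints the result of each XPath form for every link in the sample.
    public static void PrintComparison()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);

        // The sample is known to contain links, so the null check that
        // SelectNodes normally needs is skipped here.
        foreach (HtmlNode linkNode in doc.DocumentNode.SelectNodes("//a[@href]"))
        {
            Console.WriteLine(linkNode.GetAttributeValue("href", string.Empty));
            foreach (string xpath in new[] { "/img[@alt]", "img[@alt]", ".//img[@alt]" })
            {
                string alt = FindAlt(linkNode, xpath) ?? "(no match)";
                Console.WriteLine("  " + xpath + " -> " + alt);
            }
        }
    }
}
```

With this markup, "/img[@alt]" should match nothing for either link because no img sits directly under the document root, "img[@alt]" should match only the image that is a direct child of its link, and ".//img[@alt]" should match both.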
Solution 3
Also, watch out for null. SelectNodes returns null, not an empty collection, when nothing matches, so check the result before iterating over it.
HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
if (linkNodes != null)
{
    foreach (HtmlNode linkNode in linkNodes)
    {
        string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
        if (linkTitle == string.Empty)
        {
            HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
        }
    }
}
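One way to keep the null check out of the calling code is to wrap SelectNodes in a small helper that substitutes an empty sequence. The sketch below assumes that approach; the class NullSafeSelect and both of its methods are hypothetical names, not part of HtmlAgilityPack, and the ".//img[@alt]" form is used so nested images are found too.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class NullSafeSelect
{
    // Hypothetical helper: same query as SelectNodes, but yields an empty
    // sequence instead of null when nothing matches.
    public static IEnumerable<HtmlNode> SelectNodesOrEmpty(HtmlNode node, string xpath)
    {
        HtmlNodeCollection found = node.SelectNodes(xpath);
        if (found == null)
        {
            return Enumerable.Empty<HtmlNode>();
        }
        return found;
    }

    // Collects the alt text of the first image inside each link that has no
    // title attribute (or an empty one).
    public static List<string> AltTextsOfUntitledLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var alts = new List<string>();
        foreach (HtmlNode linkNode in SelectNodesOrEmpty(doc.DocumentNode, "//a[@href]"))
        {
            string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
            if (linkTitle != string.Empty)
            {
                continue;
            }

            // SelectSingleNode also returns null when there is no match.
            HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@alt]");
            if (imageNode != null)
            {
                alts.Add(imageNode.GetAttributeValue("alt", string.Empty));
            }
        }
        return alts;
    }
}
```

Calling AltTextsOfUntitledLinks on a page with no links then simply returns an empty list rather than throwing a NullReferenceException in the foreach.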
Sheff
Updated on November 18, 2020
Comments
-
Sheff over 3 years
I am attempting to use the HtmlAgilityPack library to parse some links in a page, but I am not seeing the results I would expect from the methods. In the following code I have an HtmlNodeCollection of links. For each link I want to check whether there is an image node and then parse its attributes, but the SelectNodes and SelectSingleNode methods of linkNode seem to be searching the parent document, not the child nodes of linkNode. What gives?
HtmlDocument htmldoc = new HtmlDocument();
htmldoc.LoadHtml(content);
HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
foreach(HtmlNode linkNode in linkNodes)
{
    string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
    if (linkTitle == string.Empty)
    {
        HtmlNode imageNode = linkNode.SelectSingleNode("/img[@alt]");
    }
}
Is there any other way I could get the alt attribute of the image child node of linkNode, if it exists?
-
Sheff almost 15 years
Errrm OK. That was pretty daft of me. I thought I was missing something. Sorry for wasting question space. Thanks.
-
binarydreams almost 15 years
There's always plenty of space :)
-
mpen over 13 years
Which was a really stupid design decision IMO. There's no reason it shouldn't return an empty collection.
-
binarydreams over 12 years
The default axis is children, so a prefix is actually not required at all.
-
Matt about 12 years
-
Andranik Hovhannisyan almost 12 years
You the man! A sec ago I was cursing at the HtmlAgility project, but it turns out they just implemented XPath the right way :)
-
wal about 9 years
This didn't work for me (HtmlAgilityPack 1.4.9). I had to use the .// notation (answer below).
-
binarydreams about 9 years
@wal The syntax above assumes the target img is a direct child of linkNode. If you had to use .//, I'm guessing the img was a descendant but not a direct child.