HtmlAgilityPack selecting childNodes not as expected

Solution 1

You should remove the forward-slash prefix from "/img[@alt]", because a leading slash tells XPath to start at the root of the document rather than at the current node.

HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");

Solution 2

With an XPath query you can also use "." to indicate that the search should start at the current node; the "//" that follows then matches descendants at any depth, not just direct children.

HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@alt]");
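
To make the difference between these path forms concrete, here is a minimal sketch that runs three queries from the same link node. The sample HTML, the class name XPathContextDemo, and the helpers AltOf and SecondLink are made up for illustration; only LoadHtml, SelectNodes, SelectSingleNode and GetAttributeValue come from HtmlAgilityPack.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathContextDemo
{
    // Hypothetical helper: alt text of the first match, or null when nothing matches.
    public static string AltOf(HtmlNode context, string xpath)
    {
        HtmlNode match = context.SelectSingleNode(xpath);
        return match?.GetAttributeValue("alt", string.Empty);
    }

    // Builds a small sample document (an assumption for this sketch) and
    // returns its second link, whose image is wrapped in a span.
    public static HtmlNode SecondLink()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<div>" +
            "<a href='/one'><img alt='direct' src='a.png'></a>" +
            "<a href='/two'><span><img alt='nested' src='b.png'></span></a>" +
            "</div>");
        return doc.DocumentNode.SelectNodes("//a[@href]")[1];
    }

    public static void Main()
    {
        HtmlNode link = SecondLink();

        // Relative path: only direct children of the link are considered.
        Console.WriteLine(AltOf(link, "img[@alt]") ?? "(no match)");

        // "." anchors at the link, "//" then looks at any depth below it.
        Console.WriteLine(AltOf(link, ".//img[@alt]") ?? "(no match)");

        // A leading slash ignores the link and searches the whole document.
        Console.WriteLine(AltOf(link, "//img[@alt]") ?? "(no match)");
    }
}
```

With this sample markup, the first query should find nothing because the image is not a direct child, the second should find the nested image, and the third should return the first image in the document even though it belongs to a different link.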

Solution 3

Also, watch out for the null check: SelectNodes returns null, rather than an empty collection, when nothing matches.

HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");

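// SelectNodes returns null when nothing matches, so check before iterating.
if (linkNodes != null)
{
    foreach (HtmlNode linkNode in linkNodes)
    {
        string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
        if (linkTitle == string.Empty)
        {
            // No leading slash: search the children of linkNode, not the document root.
            HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
        }
    }
}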
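
If the null check has to be repeated in many places, one option is to wrap it once. The following is only a sketch: SelectNodesOrEmpty, HtmlNodeQueries and NullSafeDemo are hypothetical names and are not part of HtmlAgilityPack. It relies solely on the behaviour described above, namely that SelectNodes may return null.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class HtmlNodeQueries
{
    // Hypothetical extension method: same query as SelectNodes,
    // but a missing result becomes an empty sequence instead of null.
    public static IEnumerable<HtmlNode> SelectNodesOrEmpty(this HtmlNode node, string xpath)
    {
        HtmlNodeCollection found = node.SelectNodes(xpath);
        if (found == null)
        {
            return Enumerable.Empty<HtmlNode>();
        }
        return found;
    }
}

public static class NullSafeDemo
{
    // Counts links with an href attribute; safe even when the page has none.
    public static int CountLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        int count = 0;
        foreach (HtmlNode linkNode in doc.DocumentNode.SelectNodesOrEmpty("//a[@href]"))
        {
            count++;
        }
        return count;
    }
}
```

The foreach loop can then run directly over the result, and a page without matching links simply produces zero iterations.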
Author by

Sheff

Updated on November 18, 2020

Comments

  • Sheff
    Sheff over 3 years

    I am attempting to use the HtmlAgilityPack library to parse some links in a page, but I am not seeing the results I would expect from the methods. In the following I have an HtmlNodeCollection of links. For each link I want to check whether there is an image node and then parse its attributes, but the SelectNodes and SelectSingleNode methods of linkNode seem to be searching the parent document, not the child nodes of linkNode. What gives?

    HtmlDocument htmldoc = new HtmlDocument();
    htmldoc.LoadHtml(content);
    HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
    
    foreach(HtmlNode linkNode in linkNodes)
    {
        string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
        if (linkTitle == string.Empty)
        {
            HtmlNode imageNode = linkNode.SelectSingleNode("/img[@alt]");     
        }
    }
    

    Is there any other way I could get the alt attribute of the image child node of linkNode, if it exists?

  • Sheff
    Sheff almost 15 years
    Errrm OK. That was pretty daft of me. I thought I was missing something. Sorry for wasting question space. Thanks.
  • binarydreams
    binarydreams almost 15 years
    There's always plenty of space :)
  • mpen
    mpen over 13 years
    Which was a really stupid design decision IMO. There's no reason it shouldn't return an empty collection.
  • binarydreams
    binarydreams over 12 years
    The default axis is child, so a prefix is actually not required at all.
  • Matt
    Matt about 12 years
  • Andranik Hovhannisyan
    Andranik Hovhannisyan almost 12 years
    You the man! A sec ago I was cursing at the HtmlAgility project, but it turns out they just implemented XPath the right way :)
  • wal
    wal about 9 years
    This didn't work for me (HtmlAgilityPack 1.4.9) - I had to use the .// notation (answer below)
  • binarydreams
    binarydreams about 9 years
    @wal The syntax above assumes the target img is a direct child of linkNode. If you had to use .//, I'm guessing the img was a descendant but not a direct child.