HtmlAgilityPack, using XPath contains method and predicates

Solution 1

Perhaps the issue is simply that you're missing the closing parenthesis on the contains() function:

//div[contains(@class, 'yom-art-content']//p
                                        v
//div[contains(@class, 'yom-art-content')]//p


List<string> paragraphs = 
        doc2.DocumentNode.SelectNodes("//div[contains(@class, 'yom-art-content')]//p")
            .Select(paragraphNode => paragraphNode.InnerHtml).ToList();

As a general suggestion, please explain what you mean when you say things like "it didn't work". I suspect you're getting an error message that might help track down the issue?
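Two further details may be worth guarding against once the parenthesis is fixed. As far as I know, HtmlAgilityPack's SelectNodes returns null rather than an empty collection when nothing matches, so chaining .Select directly onto it can throw a NullReferenceException. Also, contains(@class, 'yom-art-content') is a substring test, so it would match a class such as yom-art-content-extra as well. A minimal sketch that handles both is below. The class name ParagraphExtractor and the method ExtractParagraphs are hypothetical names used only for illustration, and the concat/normalize-space expression is the usual XPath 1.0 idiom for matching a whole class token:

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class ParagraphExtractor
{
    // Pads the class attribute with spaces so that 'yom-art-content' is
    // matched as a whole token rather than as a substring.
    private const string ParagraphXPath =
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' yom-art-content ')]//p";

    public static List<string> ExtractParagraphs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes yields null when no node matches, so check before using LINQ.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(ParagraphXPath);
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(paragraphNode => paragraphNode.InnerHtml).ToList();
    }
}
```

With this shape, a page that has no matching div gives you an empty list instead of an exception, which also makes it easier to tell "no match" apart from "invalid XPath".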

Solution 2

Instead of using HAP for this, look into CsQuery, which provides jQuery-style selectors.

It looks particularly well suited to what you are trying to do.

CsQuery is a jQuery port for .NET 4. It implements all CSS2 & CSS3 selectors, all the DOM manipulation methods of jQuery, and some of the utility methods. The majority of the jQuery test suite (as of 1.6.2) has been ported to C#.
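For the markup in the question, a CSS selector version might look roughly like the sketch below. It assumes the CQ.Create factory, the string indexer for selectors, and the InnerHTML property on IDomObject, as I understand them from the CsQuery documentation; check them against the version you install. The class name CsQueryParagraphExtractor is hypothetical.

```csharp
using System.Collections.Generic;
using CsQuery;

// Hypothetical helper, not part of CsQuery.
public static class CsQueryParagraphExtractor
{
    public static List<string> ExtractParagraphs(string html)
    {
        // Parse the markup into a CsQuery document (assumed API: CQ.Create).
        CQ dom = CQ.Create(html);

        var paragraphs = new List<string>();

        // ".yom-art-content" matches the class as a whole token, so no
        // contains() workaround is needed; an empty match simply loops zero times.
        foreach (IDomObject paragraph in dom["div.yom-art-content p"])
        {
            paragraphs.Add(paragraph.InnerHTML);
        }

        return paragraphs;
    }
}
```

The selector does the class-token matching for you, which is the main reason this reads more simply than the XPath form.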

Author by

Oscar Acevedo

Updated on June 12, 2022

Comments

  • Oscar Acevedo almost 2 years

    HtmlAgilityPack, using XPath contains method

    I'm using HtmlAgilityPack and I need to know whether a class attribute contains a specific word. I have this page:

    <div class="yom-mod yom-art-content "><div class="bd">
    <p class="first"> ....................
      </p>
    </div>
    </div>
    

    I'm doing this:

    HtmlDocument doc2 = ...;
    List<string> paragraphs = doc2.DocumentNode.SelectNodes("//div[@class = 'yom-mod yom-art-content ']//p").Select(paragraphNode => paragraphNode.InnerHtml).ToList();
    

    But that's too specific. What I need is something like this:

    List<string> paragraphs = doc2.DocumentNode.SelectNodes("//div[contains(@class, 'yom-art-content']//p").Select(paragraphNode => paragraphNode.InnerHtml).ToList();
    

    But it doesn't work. Please help me.