How to use ScrapySharp to parse elements in an HTML document?


Add

using ScrapySharp.Extensions;

It looks like you're missing that namespace. CssSelect is an extension method defined there, so adding the using directive should make it available on HtmlNode.

In case an example helps, here's a method I use in a project:

private string GetPdfUrl(HtmlDocument document, string baseUrl)
{
    // Find the single PDF link, then resolve its (possibly relative) href against baseUrl.
    var link = document.DocumentNode
        .CssSelect(".table-of-content .head-row td.download a.text-pdf").Single();
    return new Uri(new Uri(baseUrl), link.Attributes["href"].Value).ToString();
}
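
For a more self-contained picture, here is a minimal sketch of how the pieces might fit together once the using directive is in place. It assumes the HtmlAgilityPack and ScrapySharp packages are referenced; the inline HTML, the selector, and the `SelectHrefs` helper are made up for illustration and are not part of either library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using ScrapySharp.Extensions; // brings the CssSelect extension method into scope

static class CssSelectDemo
{
    // Hypothetical helper: returns the href of every node matching the CSS selector.
    public static List<string> SelectHrefs(string html, string selector)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string, so no network access is needed

        // CssSelect is called on an HtmlNode and yields the matching HtmlNode items.
        return doc.DocumentNode
            .CssSelect(selector)
            .Select(node => node.GetAttributeValue("href", ""))
            .ToList();
    }

    static void Main()
    {
        // Illustrative markup only; a real page would come from HtmlWeb.Load(url).
        var html = "<html><body><div class='nav'><a href='/questions'>Questions</a></div></body></html>";

        foreach (var href in SelectHrefs(html, "div.nav a"))
        {
            Console.WriteLine(href);
        }
    }
}
```

The same call should work on any HtmlNode, for example the body node returned by SelectSingleNode, as long as ScrapySharp.Extensions is imported in that file.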
Author: sergserg

Updated on June 13, 2022

Comments

  • sergserg
    sergserg almost 2 years

    Here's the project's official "documentation":

    https://bitbucket.org/rflechner/scrapysharp/wiki/Home


    No matter what I try, I can't find the CssSelect() method that the library is supposed to add to make querying things easier. Here's what I've tried:

    using ScrapySharp.Core;
    using ScrapySharp.Html.Parsing;
    using HtmlAgilityPack;
    
    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = web.Load("http://www.stackoverflow.com");
    
    var page = doc.DocumentNode.SelectSingleNode("//body");
    page.CssSel???
    

    Exactly how do I use this library? In the documentation it isn't clear what type html is.