Is there a jQuery-like CSS/HTML selector that can be used in C#?

26,226

Solution 1

You should definitely see @jamietre's CsQuery. Check out his answer to this question!

Fizzler and Sharp-Query provide similar functionality, but the projects seem to be abandoned.

Solution 2

Update 10/18/2012

CsQuery is now in release 1.3. The latest release incorporates a C# port of the validator.nu HTML5 parser. As a result CsQuery will now produce a DOM that uses the HTML5 spec for invalid markup handling and is completely standards compliant.

Original Answer

Old question, but new answer. I've recently released version 1.1 of CsQuery, a jQuery port for .NET 4 written in C# that I've been working on for about a year. It is also available on NuGet as "CsQuery".

The current release implements all CSS2 and CSS3 selectors, all jQuery extensions, and all jQuery DOM manipulation methods. It has extensive test coverage, including all the tests from jQuery and Sizzle (the jQuery CSS selection engine). I've also included some performance tests for direct comparison with Fizzler; for the most part CsQuery dramatically outperforms it. The exception is the initial load of the HTML, where Fizzler is faster; I assume this is because Fizzler doesn't build an index. You get that time back after your first selection, though.

There's documentation on the GitHub site, but at a basic level it works like this:

Create from a string of HTML

CQ dom = CQ.Create(htmlString);

Load synchronously from the web

CQ dom = CQ.CreateFromUrl("http://www.jquery.com");

Load asynchronously (non-blocking)

CQ.CreateFromUrlAsync("http://www.jquery.com", responseSuccess => {
    // The callback parameter carries the parsed document.
    Dom = responseSuccess.Dom;
}, responseFail => {
    // Handle the failed request here.
});

Run selectors & do jQuery stuff

var childSpans = dom["div > span"];
childSpans.AddClass("myclass");

The CQ object is like the jQuery object. The property indexer used above is the default method (like $(...)).

Output:

string html = dom.Render();
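
Put together, the steps above might look like the following console program. This is only a sketch: it assumes the CsQuery NuGet package is referenced, the sample markup is made up for illustration, and it uses only the calls shown above plus the Length property.

```csharp
using System;
using CsQuery;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup, not taken from the original answer.
        string htmlString = "<div><span>one</span><p><span>two</span></p></div>";

        // Parse the string into a queryable DOM.
        CQ dom = CQ.Create(htmlString);

        // "div > span" is a child combinator: only spans that are
        // direct children of a div are selected.
        CQ childSpans = dom["div > span"];
        childSpans.AddClass("myclass");

        // Number of matched elements, then the modified document.
        Console.WriteLine(childSpans.Length);
        Console.WriteLine(dom.Render());
    }
}
```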

Solution 3

Not quite jQuery like, but this may help: http://www.codeplex.com/htmlagilitypack

Solution 4

For XML you might use XPath...
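
As a rough illustration of that suggestion, the sketch below uses only the .NET base class library (System.Xml.Linq and System.Xml.XPath). The sample XML is invented for the example, and the query is only an approximation of the CSS selector div.foo: it matches elements whose class attribute is exactly "foo".

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

class Program
{
    static void Main()
    {
        // Hypothetical, well-formed sample input.
        string xml = "<root><div class=\"foo\">a</div>" +
                     "<div class=\"bar\">b</div>" +
                     "<p class=\"foo\">c</p></root>";

        XDocument doc = XDocument.Parse(xml);

        // Select every div whose class attribute equals "foo".
        foreach (XElement el in doc.XPathSelectElements("//div[@class='foo']"))
        {
            Console.WriteLine(el.Value); // prints "a"
        }
    }
}
```

This only works when the input parses as XML, which real-world HTML often does not.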

Solution 5

I'm not entirely clear as to what you're trying to achieve, but if you have an HTML document that you're trying to extract data from, I'd recommend loading it with a parser; it then becomes fairly trivial to query the resulting object and pull out the desired elements.

The parser I linked above allows for use of XPath queries, which sounds like what you are looking for.
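
For the HTML Agility Pack mentioned in the comments, a query along those lines could look like the sketch below. It assumes the HtmlAgilityPack NuGet package is referenced; the sample markup is made up. The XPath expression is a common idiom for emulating the CSS selector div.foo when an element carries several classes.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup, deliberately not well-formed.
        string html = "<div class='foo big'>first</div>" +
                      "<div class='bar'>second</div><p>unclosed";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Pad the normalized class list with spaces so that " foo "
        // matches whole class names only, wherever they appear.
        string xpath =
            "//div[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]";

        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);

        // SelectNodes returns null, not an empty collection, on no match.
        if (nodes != null)
        {
            foreach (HtmlNode node in nodes)
            {
                Console.WriteLine(node.InnerText);
            }
        }
    }
}
```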

Let me know if I've misunderstood.

Author by

Dave

Updated on July 05, 2022

Comments

  • Dave
    Dave almost 2 years

    I'm wondering if there's a jQuery-like css selector that can be used in C#.

    Currently, I'm parsing some html strings using regex and thought it would be much nicer to have something like the css selector in jQuery to match my desired elements.

  • Dave
    Dave over 14 years
    Oh yeah, I forgot to mention that. I wanted something like the CSS selector for its simplicity and clarity.
  • Dave
    Dave over 14 years
    May I know what parser you are referring to? I just want something like Doc.select("div.foo") to return all the elements that are divs and have the class foo.
  • patjbs
    patjbs over 14 years
    I added a link to the text, which points to a SO question about parsing HTML. In particular, the HTML Agility pack parser I've used in the past to load HTML docs and query against them with great success.
  • Dave
    Dave over 14 years
    Yes, I looked over the HTML Agility Pack a few days ago, but it still uses XPath for matching. It's not that I don't like XPath, but the CSS selector syntax is much cleaner, in my opinion.
  • Daniel
    Daniel over 14 years
    LINQ-to-Objects is probably what I'd use. But right - not as clean as selectors.
  • nakhli
    nakhli over 12 years
    Just a note: Sharp-Query is GPL. Fizzler is LGPL, which is more business friendly.
  • Travis P
    Travis P about 12 years
    Check out HTML Agility Pack if you want to use XPath with potentially-non-well-formed HTML. htmlagilitypack.codeplex.com
  • casperOne
    casperOne almost 12 years
    Do you handle cases where there are new-lines, line-breaks, and tabs as whitespace separating the class names?
  • Jamie Treworgy
    Jamie Treworgy almost 12 years
    Just added a test for this, it already correctly interprets any whitespace in classes as a separator. So the answer is yes.
  • casperOne
    casperOne almost 12 years
    Thanks for the info. The question is unfortunately NC, but I've run into this specific issue a number of times.
  • Jamie Treworgy
    Jamie Treworgy almost 12 years
    By the way, is there some reason why you are closing all the old questions that ask "is there a jquery port for c#" because I've answered it, nearly three years later, now that there is? Whether or not you agree that the question is a good one for SO, it's been here for years, and appears high in google searches for the question. I would like people to be able to find this. To close it now seems, well, a bit vindictive. The only consequence will be that this project, which is free, useful, and MIT licensed, and didn't exist in a complete form until recently, will have less exposure.
  • casperOne
    casperOne almost 12 years
    It's actually the duplicate answer flags that the system is bringing up. These questions are list questions and they are specifically not allowed on Stack Overflow. There was a time when these questions were ok for Stack Overflow, but that time is no longer, and we close these when we see them. That said, if you're going to repeat the same answer, and the question is really a duplicate, you should flag it for moderator attention to be closed as such. Otherwise, we will probably delete the answers if we find they are on all duplicate questions.
  • Jamie Treworgy
    Jamie Treworgy almost 12 years
    Well, I guess it's your call, I think it's too bad that you are using the "letter of the law" to hinder my efforts to let people know about this project. I answered this less than a day ago and have gotten two upvotes already, so I guess people are finding it useful even as you are not. Too bad it will be gone from SO tomorrow.
  • casperOne
    casperOne almost 12 years
    I gave you one of the upvotes, so don't be so quick to judge. That said, Stack Overflow is not a place for promotion. It's explicitly forbidden in the FAQ. This has been hashed out many, many times on Meta Stack Overflow. There is no leeway on this.
  • Jamie Treworgy
    Jamie Treworgy almost 12 years
    It says "post good, relevant answers, and if some (but not all) happen to be about your product or website, so be it". I have answered hundreds of questions having nothing to do with my projects (not products or websites, anyway) over the years. I think my answer qualifies as "good and relevant" including example code and details, and I certainly disclosed my affiliation. This directive clearly seems oriented towards commercial interests. This is not one, and I am definitely not a spammer, which I think you can verify with a review of my history on SO.
  • casperOne
    casperOne almost 12 years
    The directive is not towards commercial interests, it's towards content that is repeated. It might be a relevant answer to the question, but you should judge whether or not it's a good question (which this clearly is not). That said, there are plenty of issues on Meta Stack Overflow which reference promotion as well as the NC closing of list questions. If you want to follow up there and get the community's take on it, I recommend that.
  • Jeroen K
    Jeroen K almost 12 years
    Looks like Fizzler has been in beta for 2 years with no activity. Sharp-Query is not much better; its status is unclear.
  • Andy S
    Andy S almost 12 years
    Please consider upvoting @jamietre's answer instead of mine. He has a fantastic solution!
  • Frank Schwieterman
    Frank Schwieterman about 9 years
    Later I started using CsQuery and now prefer it.
  • Jeroen K
    Jeroen K almost 7 years
    CsQuery is no longer maintained. The author suggests considering AngleSharp instead: github.com/AngleSharp/AngleSharp
  • Toskan
    Toskan over 3 years
    CsQuery is no longer maintained.
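
Since the last comments point to AngleSharp as the suggested replacement, here is a minimal sketch of the Doc.select("div.foo") style of query asked for in the comments. It assumes the AngleSharp NuGet package is referenced and a compiler that supports an async Main (C# 7.1 or later); the sample markup is made up for illustration.

```csharp
using System;
using System.Threading.Tasks;
using AngleSharp;
using AngleSharp.Dom;

class Program
{
    static async Task Main()
    {
        // Hypothetical sample markup.
        string html = "<div class=\"foo\">first</div>" +
                      "<div class=\"bar\">second</div>";

        // Parse the string through a default browsing context.
        IBrowsingContext context = BrowsingContext.New(Configuration.Default);
        IDocument document = await context.OpenAsync(req => req.Content(html));

        // Standard CSS selector syntax, as with document.querySelectorAll
        // in a browser.
        foreach (IElement element in document.QuerySelectorAll("div.foo"))
        {
            Console.WriteLine(element.TextContent);
        }
    }
}
```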