Is there a jQuery-like CSS/HTML selector that can be used in C#?
Solution 1
You should definitely see @jamietre's CsQuery. Check out his answer to this question!
Fizzler and Sharp-Query provide similar functionality, but the projects seem to be abandoned.
Solution 2
Update 10/18/2012
CsQuery is now in release 1.3. The latest release incorporates a C# port of the validator.nu HTML5 parser. As a result CsQuery will now produce a DOM that uses the HTML5 spec for invalid markup handling and is completely standards compliant.
Original Answer
Old question but new answer. I've recently released version 1.1 of CsQuery, a jQuery port for .NET 4 written in C# that I've been working on for about a year. It's also on NuGet as "CsQuery".
The current release implements all CSS2 & CSS3 selectors, all jQuery extensions, and all jQuery DOM manipulation methods. It's got extensive test coverage including all the tests from jQuery and Sizzle (the jQuery CSS selection engine). I've also included some performance tests for direct comparisons with Fizzler; for the most part CsQuery dramatically outperforms it. The exception is actually loading the HTML in the first place, where Fizzler is faster; I assume this is because Fizzler doesn't build an index. You get that time back after your first selection, though.
There's documentation on the github site, but at a basic level it works like this:
Create from a string of HTML
CQ dom = CQ.Create(htmlString);
Load synchronously from the web
CQ dom = CQ.CreateFromUrl("http://www.jquery.com");
Load asynchronously (non-blocking)
CQ.CreateFromUrlAsync("http://www.jquery.com", responseSuccess => {
    Dom = responseSuccess.Dom;
}, responseFail => {
    // ..
});
Run selectors & do jQuery stuff
var childSpans = dom["div > span"];
childSpans.AddClass("myclass");
The CQ object is like the jQuery object. The property indexer used above is the default method (like $(...)).
Output:
string html = dom.Render();
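Putting the pieces above together, here is a minimal sketch of a helper that parses a string, selects, modifies, and renders. It assumes the CsQuery NuGet package is referenced; the class name CsQuerySketch and the method TagChildSpans are made up for illustration and are not part of CsQuery. Only the calls already shown above are used.

```csharp
using CsQuery;

// Hypothetical helper for illustration; not part of the CsQuery API.
public static class CsQuerySketch
{
    // Adds a CSS class to every <span> that is a direct child of a <div>
    // and returns the re-rendered markup.
    public static string TagChildSpans(string html, string className)
    {
        CQ dom = CQ.Create(html);           // parse the HTML string
        CQ childSpans = dom["div > span"];  // indexer works like $("div > span")
        childSpans.AddClass(className);     // jQuery-style manipulation
        return dom.Render();                // serialize the DOM back to HTML
    }
}
```

Calling CsQuerySketch.TagChildSpans(htmlString, "myclass") should give back the same markup with the class added to the matching spans.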
Solution 3
Not quite jQuery-like, but this may help: http://www.codeplex.com/htmlagilitypack
Solution 4
For XML you might use XPath...
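As a rough illustration of the XPath route for well-formed XML, the sketch below uses only the .NET base class library (System.Xml.Linq plus the XPath extension methods). The class name XmlXPathSketch, the method BookTitles, and the book/title element names are invented for the example. Note that XDocument.Parse throws on malformed input, so this does not carry over to typical real-world HTML.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helper for illustration only.
public static class XmlXPathSketch
{
    // Returns the text of every <title> element directly under a <book> element.
    public static List<string> BookTitles(string xml)
    {
        XDocument doc = XDocument.Parse(xml);  // requires well-formed XML
        return doc.XPathSelectElements("//book/title")
                  .Select(e => e.Value)
                  .ToList();
    }
}
```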
Solution 5
I'm not entirely clear as to what you're trying to achieve, but if you have a HTML document that you're trying to extract data from, I'd recommend loading it with a parser, and then it becomes fairly trivial to query the object to pull desired elements.
The parser I linked above (the HTML Agility Pack) allows for use of XPath queries, which sounds like what you are looking for.
Let me know if I've misunderstood.
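For what the parse-then-query approach might look like, here is a sketch that approximates the CSS selector div.foo with an XPath query. It assumes the HtmlAgilityPack NuGet package is referenced; the class name AgilityPackSketch and the method DivsWithClass are hypothetical, and the class name passed in is assumed to be a plain token without quotes.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of the HTML Agility Pack.
public static class AgilityPackSketch
{
    // Approximates the CSS selector "div.<className>" with XPath.
    public static List<HtmlNode> DivsWithClass(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);  // tolerant of markup that is not well formed

        // Padding the normalized @class value with spaces matches whole
        // class names only, so "foo" does not match class="foobar".
        string xpath = "//div[contains(concat(' ', normalize-space(@class), ' '), ' "
                       + className + " ')]";

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes == null ? new List<HtmlNode>() : nodes.ToList();
    }
}
```

The length of the XPath expression compared with div.foo is a fair picture of why a selector syntax is attractive for this kind of query.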
Dave
Updated on July 05, 2022
Comments
-
Dave almost 2 years
I'm wondering if there's a jQuery-like css selector that can be used in C#.
Currently, I'm parsing some html strings using regex and thought it would be much nicer to have something like the css selector in jQuery to match my desired elements.
-
Dave over 14 years
Oh yeah, I forgot to mention that. I wanted something like the css selector for its simplicity and clarity.
-
Dave over 14 years
May I know which parser you are referring to? I just want something like Doc.select("div.foo") to return all the elements that are divs and have class foo.
-
patjbs over 14 years
I added a link to the text, which points to a SO question about parsing HTML. In particular, I've used the HTML Agility Pack parser in the past to load HTML docs and query against them with great success.
-
Dave over 14 years
Yes, I looked over the HTML Agility Pack a few days ago, but it still uses XPath for matching. It's not that I don't like XPath, but the css selector syntax is much cleaner, imo.
-
Daniel over 14 years
LINQ-to-Objects is probably what I'd use. But right - not as clean as selectors.
-
nakhli over 12 years
Just a note: Sharp-Query is GPL. Fizzler is LGPL, which is more business friendly.
-
Travis P about 12 years
Check out HTML Agility Pack if you want to use XPath with potentially-non-well-formed HTML. htmlagilitypack.codeplex.com
-
casperOne almost 12 years
Do you handle cases where there are new-lines, line-breaks, and tabs as whitespace separating the class names?
-
Jamie Treworgy almost 12 years
Just added a test for this; it already correctly interprets any whitespace in classes as a separator. So the answer is yes.
-
casperOne almost 12 years
Thanks for the info. The question is unfortunately NC, but I've run into this specific issue a number of times.
-
Jamie Treworgy almost 12 years
By the way, is there some reason why you are closing all the old questions that ask "is there a jquery port for c#" because I've answered it, nearly three years later, now that there is? Whether or not you agree that the question is a good one for SO, it's been here for years, and appears high in google searches for the question. I would like people to be able to find this. To close it now seems, well, a bit vindictive. The only consequence will be that this project, which is free, useful, and MIT licensed, and didn't exist in a complete form until recently, will have less exposure.
-
casperOne almost 12 years
It's actually the duplicate answer flags that the system is bringing up. These questions are list questions and they are specifically not allowed on Stack Overflow. There was a time when these questions were ok for Stack Overflow, but that time is no longer, and we close these when we see them. That said, if you're going to repeat the same answer, and the question is really a duplicate, you should flag it for moderator attention to be closed as such. Otherwise, we will probably delete the answers if we find they are on all duplicate questions.
-
Jamie Treworgy almost 12 years
Well, I guess it's your call. I think it's too bad that you are using the "letter of the law" to hinder my efforts to let people know about this project. I answered this less than a day ago and have gotten two upvotes already, so I guess people are finding it useful even as you are not. Too bad it will be gone from SO tomorrow.
-
casperOne almost 12 years
I gave you one of the upvotes, so don't be so quick to judge. That said, Stack Overflow is not a place for promotion. It's explicitly forbidden in the FAQ. This has been hashed out many, many times on Meta Stack Overflow. There is no leeway on this.
-
Jamie Treworgy almost 12 years
It says "post good, relevant answers, and if some (but not all) happen to be about your product or website, so be it". I have answered hundreds of questions having nothing to do with my projects (not products or websites, anyway) over the years. I think my answer qualifies as "good and relevant" including example code and details, and I certainly disclosed my affiliation. This directive clearly seems oriented towards commercial interests. This is not one, and I am definitely not a spammer, which I think you can verify with a review of my history on SO.
-
casperOne almost 12 years
The directive is not towards commercial interests, it's towards content that is repeated. It might be a relevant answer to the question, but you should judge whether or not it's a good question (which this clearly is not). That said, there are plenty of issues on Meta Stack Overflow which reference promotion as well as the NC closing of list questions. If you want to follow up there and get the community's take on it, I recommend that.
-
Jeroen K almost 12 years
Looks like Fizzler has been in beta for 2 years, with no activity. Sharp-Query is not much better; its status is unclear.
-
Andy S almost 12 years
Please consider upvoting @jamietre's answer instead of mine. He has a fantastic solution!
-
Frank Schwieterman about 9 years
Later I started using CsQuery and now prefer it.
-
Jeroen K almost 7 years
CsQuery is no longer maintained. The author suggests considering AngleSharp: github.com/AngleSharp/AngleSharp
-
Toskan over 3 years
CsQuery is no longer maintained.