DataView.RowFilter vs DataTable.Select() vs DataTable.Rows.Find()
Solution 1
You are looking for the "best approach on finding rows in a datatable", so I first have to ask: "best" for what? I think every technique has scenarios where it fits better than the others.
First, let's look at DataView.RowFilter: A DataView has some advantages in data binding. It's very view-oriented, so it has powerful sorting, filtering, and searching features, but it creates some overhead and is not optimized for performance. I would choose DataView.RowFilter for smaller recordsets and/or where you take advantage of the other features (like direct data binding to the view).
Most facts about the DataView, which you can read in older posts, still apply.
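To make the view-oriented style concrete, here is a minimal sketch of filtering and sorting through a DataView. The table, its column names, and the helper names (`BuildTable`, `BerlinCustomers`) are hypothetical and only serve the illustration; the resulting view is the kind of object you would hand to a data-bound control.

```csharp
using System;
using System.Data;

public static class RowFilterSketch
{
    // Hypothetical sample data; column names are made up for this example.
    public static DataTable BuildTable()
    {
        var table = new DataTable("Customers");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("City", typeof(string));
        table.Rows.Add(1, "Alice", "Berlin");
        table.Rows.Add(2, "Bob", "Paris");
        table.Rows.Add(3, "Carol", "Berlin");
        return table;
    }

    // Builds one view over the table, then filters and sorts it.
    public static DataView BerlinCustomers(DataTable table)
    {
        var view = new DataView(table);
        view.RowFilter = "City = 'Berlin'";
        view.Sort = "Name DESC";
        return view;
    }

    public static void Main()
    {
        DataView view = BerlinCustomers(BuildTable());
        foreach (DataRowView row in view)
        {
            Console.WriteLine(row["Name"]);
        }
    }
}
```

The view stays attached to the table, so it suits cases where the filtered result is displayed rather than computed once and thrown away.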
Second, you should prefer DataTable.Rows.Find over DataTable.Select if you want just a single hit. Why? DataTable.Rows.Find returns only a single row. Essentially, when you specify the primary key, a binary tree is created. This has some overhead associated with it, but tremendously speeds up the retrieval.
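A primary-key lookup could look roughly like this. The table layout and the helper names (`BuildTable`, `NameFor`) are assumptions for the sake of the example; the relevant calls are setting `PrimaryKey` and calling `Rows.Find`, which returns null when no row matches.

```csharp
using System;
using System.Data;

public static class FindSketch
{
    // Hypothetical table with "Id" declared as the primary key.
    public static DataTable BuildTable()
    {
        var table = new DataTable("Customers");
        DataColumn id = table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        // Find requires a primary key; without one it throws MissingPrimaryKeyException.
        table.PrimaryKey = new[] { id };
        table.Rows.Add(1, "Alice");
        table.Rows.Add(2, "Bob");
        table.Rows.Add(3, "Carol");
        return table;
    }

    // Returns the name for the given key, or null if the key is absent.
    public static string NameFor(DataTable table, int id)
    {
        DataRow row = table.Rows.Find(id);
        return row == null ? null : (string)row["Name"];
    }

    public static void Main()
    {
        DataTable table = BuildTable();
        Console.WriteLine(NameFor(table, 2));
        Console.WriteLine(NameFor(table, 99) ?? "(not found)");
    }
}
```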
DataTable.Select is slower, but can come in very handy if you have multiple criteria and don't care about indexed or unindexed rows: it can find basically everything but is not optimized for performance. Essentially, DataTable.Select has to walk the entire table and compare every record to the criteria that you passed in.
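As a sketch of the multi-criteria case, the following uses the `Select(filterExpression, sort)` overload. The sample data and the helper names (`BuildTable`, `BerlinAfterFirst`) are hypothetical.

```csharp
using System;
using System.Data;

public static class SelectSketch
{
    // Hypothetical sample data; no primary key is needed for Select.
    public static DataTable BuildTable()
    {
        var table = new DataTable("Customers");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("City", typeof(string));
        table.Rows.Add(1, "Alice", "Berlin");
        table.Rows.Add(2, "Bob", "Paris");
        table.Rows.Add(3, "Carol", "Berlin");
        table.Rows.Add(4, "Dave", "Berlin");
        return table;
    }

    // Combines two criteria and asks for the matches in a given order.
    public static DataRow[] BerlinAfterFirst(DataTable table)
    {
        return table.Select("City = 'Berlin' AND Id > 1", "Name DESC");
    }

    public static void Main()
    {
        foreach (DataRow row in BerlinAfterFirst(BuildTable()))
        {
            Console.WriteLine(row["Name"]);
        }
    }
}
```

Unlike the DataView, the result is a plain array of rows, a snapshot that does not track later changes to which rows match.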
I hope you find this little overview helpful.
I'd suggest taking a look at this article; it was helpful for me regarding performance questions. This post contains some quotes from it.
A little UPDATE: By the way, this might seem a little out of scope for your question, but it's nearly always fastest to do the filtering and searching on the backend. If you want simplicity and have SQL Server as the backend and .NET 3+ on the client, go for LINQ-to-SQL. Searching LINQ objects is very comfortable and creates queries which are performed on the server side. LINQ-to-Objects is also very comfortable, but it is a slower technique. In case you didn't know already....
Solution 2
Thomashaid's post sums it up nicely:
- DataView.RowFilter is for binding.
- DataTable.Rows.Find is for searching by primary key only.
- DataTable.Select is for searching by multiple columns and also for specifying an order.
Avoid creating many DataViews in a loop and using their RowFilters to search for records. This will drastically reduce performance.
I wanted to add that DataTable.Select can take advantage of indexes. You can create an index on a DataTable by creating a DataView and specifying a sort order:
DataView dv = new DataView(dt);
dv.Sort = "Col1, Col2";
Then, when you call DataTable.Select(), it can use this index when running the query. We have used this technique to seriously improve performance in places where we use the same query many, many times. (Note that this was before LINQ existed.)
The trick is to define the sort order correctly for the Select statement. So if your query is "Col1 = 1 and Col2 = 4", then you'll want "Col1, Col2" like in the example above.
Note that the index creation may depend on the actual calls to create the DataView. We had to use the new DataView(DataTable dt) constructor, and then specify the Sort property in a separate step. The behavior may change slightly with different .NET versions.
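Put together, the sequence described above might be sketched like this. The table contents and the helper names (`BuildTable`, `BuildIndex`, `Lookup`) are hypothetical, and whether Select actually picks up the index is internal behavior that this code does not verify; it only shows the order of calls, with the view created once and kept alive across repeated queries.

```csharp
using System;
using System.Data;

public static class IndexedSelectSketch
{
    // Hypothetical 10 x 10 grid of (Col1, Col2) pairs.
    public static DataTable BuildTable()
    {
        var table = new DataTable("Grid");
        table.Columns.Add("Col1", typeof(int));
        table.Columns.Add("Col2", typeof(int));
        for (int i = 0; i < 10; i++)
        {
            for (int j = 0; j < 10; j++)
            {
                table.Rows.Add(i, j);
            }
        }
        return table;
    }

    // Table-only constructor first, Sort set in a separate step, as the answer describes.
    public static DataView BuildIndex(DataTable table)
    {
        DataView view = new DataView(table);
        view.Sort = "Col1, Col2";
        return view;
    }

    // The filter columns appear in the same order as the sort columns.
    public static DataRow[] Lookup(DataTable table, int col1, int col2)
    {
        return table.Select("Col1 = " + col1 + " AND Col2 = " + col2);
    }

    public static void Main()
    {
        DataTable table = BuildTable();
        DataView index = BuildIndex(table); // create once, outside any loop

        for (int k = 0; k < 3; k++)
        {
            DataRow[] hits = Lookup(table, 1, 4);
            Console.WriteLine(hits.Length);
        }

        GC.KeepAlive(index); // keep the view referenced while the queries run
    }
}
```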
Updated on November 04, 2020
Comments
-
A G over 3 years
Considering the code below:
DataView someView = new DataView(sometable);
someView.RowFilter = someFilter;
if (someView.Count > 0) { …. }
Quite a number of articles say DataTable.Select() is better than using DataViews, but these predate VS2008.
Solved: The Mystery of DataView's Poor Performance with Large Recordsets
Array of DataRecord vs. DataView: A Dramatic Difference in Performance
Googling on this topic, I found some articles/forum topics which mention that DataTable.Select() itself is quite buggy (not sure about this) and underperforms in various scenarios.
In this (Best Practices ADO.NET) topic on MSDN, it is suggested that if there is a primary key defined on a DataTable, the FindRows() or Find() methods should be used instead of DataTable.Select().
This article here (.NET 1.1) benchmarks all three approaches plus a couple more. But it is for version 1.1, so I am not sure whether the results are still valid. According to it, DataRowCollection.Find() outperforms all approaches and DataTable.Select() outperforms DataView.RowFilter.
So I am quite confused about what might be the best approach for finding rows in a DataTable. Or is there no single good way to do this, with multiple solutions depending on the scenario?
-
James over 11 years
I just found a case where the results were different between the .Select and RowFilter techniques. In my case Select returned 532 rows and RowFilter was returning 540. I found the difference to be related to extra spaces in the table data, and resolved it by using Trim in the select statement: TRIM(VendorNumber) = '500'
-
Chris Smith over 10 years
Whoa, this is super handy. I can't believe this isn't documented on MSDN. With like 1 line of code, I drastically improved the performance of my DataTable.Select() calls without doing all the silly FindRows() and Dictionary workarounds. THANKS
-
JohanLarsson over 8 years
Super, that made my day. Now queries are 300% faster!
-
LMK almost 8 years
If you step through the underlying .NET source, you will see that .Select() often creates an index itself if the conditions are right, such as when a simple expression like "col1 = 3 and col2 = 4" is used. You can see this by examining the private [indexes] field of the table after the select. In those cases there is no need to create a DataView. The answer above also doesn't work for me as written: I need to create a DataView with just the table constructor, and then set the [Sort] property separately. Not sure why...
-
Paul Williams almost 8 years
Our code has the same pattern: 1) create DataView with DataTable-only constructor 2) set Sort property. I will note that in the answer.
-
مسعود almost 8 years
@Paul Suppose there is a lookup function in the same file which takes a DataTable as an argument. How can I use this concept in that function? Will it automatically use the view I created above, or do I have to pass the view into the function instead of the DataTable?
-
Paul Williams almost 8 years
@Masood This sounds like a new question that you could ask on StackOverflow. The answer depends on your implementation. Just don't create the same index repeatedly in a loop. Note that creating the index may be as expensive as querying the table only one time.