efficient way to implement paging

Solution 1

To give you a brief answer: if you call Skip(n).Take(m) in LINQ (with SQL Server 2005 / 2008 as the database server), your query will use the SELECT ROW_NUMBER() OVER ... statement, which is effectively paging done directly in the SQL engine.

As an example, I have a database table called mtcity and I wrote the following query (it works with LINQ to Entities as well):

using (DataClasses1DataContext c = new DataClasses1DataContext())
{
    var query = (from MtCity2 c1 in c.MtCity2s
                select c1).Skip(3).Take(3);
    //Doing something with the query.
}

The resulting query will be:

SELECT [t1].[CodCity], 
    [t1].[CodCountry], 
    [t1].[CodRegion], 
    [t1].[Name],  
    [t1].[Code]
FROM (
    SELECT ROW_NUMBER() OVER (
        ORDER BY [t0].[CodCity], 
        [t0].[CodCountry], 
        [t0].[CodRegion], 
        [t0].[Name],
        [t0].[Code]) AS [ROW_NUMBER], 
        [t0].[CodCity], 
        [t0].[CodCountry], 
        [t0].[CodRegion], 
        [t0].[Name],
        [t0].[Code]
    FROM [dbo].[MtCity] AS [t0]
    ) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
ORDER BY [t1].[ROW_NUMBER]

This is windowed data access (pretty cool, by the way, because it starts returning data from the very beginning and keeps reading the table only as long as the conditions are met). It is very similar to:

With CityEntities As 
(
    Select ROW_NUMBER() Over (Order By CodCity) As Row,
        CodCity -- only the index is accessed here, since CodCity is the primary key
    From dbo.mtcity
)
Select [t0].[CodCity], 
        [t0].[CodCountry], 
        [t0].[CodRegion], 
        [t0].[Name],
        [t0].[Code]
From CityEntities c
Inner Join dbo.MtCity t0 on c.CodCity = t0.CodCity
Where c.Row Between @p0 + 1 AND @p0 + @p1
Order By c.Row Asc

The difference is that this second query will execute faster than the LINQ result, because it uses only the index to build the data access window. This means that if you need any filtering, it should (or rather must) go in the entity listing (the CTE where the row number is created), and suitable indexes should be created as well to keep performance good.
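Both queries above select their window with the filter BETWEEN @p0 + 1 AND @p0 + @p1. As a minimal sketch of that arithmetic, assuming @p0 carries the Skip count and @p1 the Take count (the function name row_number_window is made up for illustration), the following Python snippet computes the bounds for Skip(3).Take(3):

```python
def row_number_window(skip, take):
    """Return the inclusive (first, last) ROW_NUMBER values for Skip/Take."""
    first = skip + 1    # corresponds to @p0 + 1
    last = skip + take  # corresponds to @p0 + @p1
    return first, last


# Stand-in for the ROW_NUMBER values 1..10 produced by the inner query.
rows = list(range(1, 11))

first, last = row_number_window(skip=3, take=3)
window = [r for r in rows if first <= r <= last]
print(first, last, window)  # 4 6 [4, 5, 6]
```

So Skip(3).Take(3) keeps the rows numbered 4 through 6, which is the same as skipping three rows and taking the next three.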

Now, which is better?

If your logic involves a fairly elaborate workflow, implementing it the proper SQL way will be complicated. In that case LINQ will be the solution.

If you can move that part of the logic directly into SQL (in a stored procedure), it will be even better, because you can implement the second query I showed you (using indexes) and let SQL Server generate and store the execution plan for the query (improving performance).

Solution 2

Try using

SELECT *
FROM [TableX]
ORDER BY [FieldX]
OFFSET 500 ROWS
FETCH NEXT 100 ROWS ONLY

to get rows 501 to 600 on the SQL Server side, without loading the other rows into memory. Note that this syntax is available only in SQL Server 2012 and later.
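To make the row arithmetic concrete, here is a minimal sketch in Python that uses SQLite as a stand-in for SQL Server (SQLite spells the same idea as LIMIT ... OFFSET rather than OFFSET ... FETCH). The table table_x, the column field_x, and the helper fetch_page are made up for illustration.

```python
import sqlite3


def fetch_page(conn, page_number, page_size):
    """Return one page of field_x values; page_number starts at 1."""
    offset = (page_number - 1) * page_size  # rows to skip before the page
    rows = conn.execute(
        "SELECT field_x FROM table_x ORDER BY field_x LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_x (field_x INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO table_x (field_x) VALUES (?)",
    [(n,) for n in range(1, 1001)],
)

# Page 6 with 100 rows per page skips 500 rows and returns the next 100.
page = fetch_page(conn, page_number=6, page_size=100)
print(page[0], page[-1])  # 501 600
```

Skipping 500 rows means the first row returned is row 501, so an offset of 500 with a page size of 100 covers rows 501 to 600.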

Solution 3

While LINQ-to-SQL will generate an OFFSET clause (possibly emulated using ROW_NUMBER() OVER() as others have mentioned), there is an entirely different, much faster way to perform paging in SQL. This is often called the "seek method" as described in this blog post here.

SELECT TOP 10 first_name, last_name, score
FROM players
WHERE (score < @previousScore)
   OR (score = @previousScore AND player_id < @previousPlayerId)
ORDER BY score DESC, player_id DESC

The @previousScore and @previousPlayerId values are the respective values of the last record from the previous page. This allows you to fetch the "next" page. If the ORDER BY direction is ASC, simply use > instead.

With the above method, you cannot immediately jump to page 4 without having first fetched the previous 40 records. But often, you do not want to jump that far anyway. Instead, you get a much faster query that might be able to fetch data in constant time, depending on your indexing. Plus, your pages remain "stable", no matter if the underlying data changes (e.g. on page 1, while you're on page 4).

This is the best way to implement paging when lazy loading more data in web applications, for instance.

Note, the "seek method" is also called keyset paging.
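As a rough, runnable illustration of the seek method, the sketch below uses Python with SQLite standing in for SQL Server, so LIMIT replaces TOP. The helper next_page, the index name, and the sample rows are assumptions made for the example; the WHERE clause is the same predicate as in the query above.

```python
import sqlite3


def next_page(conn, page_size, previous=None):
    """Fetch the next page ordered by score DESC, player_id DESC.

    previous is the (score, player_id) pair of the last row already shown,
    or None for the first page.
    """
    if previous is None:
        sql = (
            "SELECT player_id, score FROM players "
            "ORDER BY score DESC, player_id DESC LIMIT ?"
        )
        params = (page_size,)
    else:
        prev_score, prev_id = previous
        sql = (
            "SELECT player_id, score FROM players "
            "WHERE score < ? OR (score = ? AND player_id < ?) "
            "ORDER BY score DESC, player_id DESC LIMIT ?"
        )
        params = (prev_score, prev_score, prev_id, page_size)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (player_id INTEGER PRIMARY KEY, score INTEGER)")
# An index on the sort keys is what lets the database seek instead of scan.
conn.execute("CREATE INDEX ix_players_score ON players (score, player_id)")
conn.executemany(
    "INSERT INTO players (player_id, score) VALUES (?, ?)",
    [(1, 50), (2, 70), (3, 70), (4, 90), (5, 50), (6, 10)],
)

page1 = next_page(conn, 2)
last_id, last_score = page1[-1]
page2 = next_page(conn, 2, previous=(last_score, last_id))
print(page1)  # [(4, 90), (3, 70)]
print(page2)  # [(2, 70), (5, 50)]
```

The tie on score 70 is broken by player_id, which is why the second page starts with player 2 and no row is repeated or skipped.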

Solution 4

LinqToSql will automatically convert a .Skip(N1).Take(N2) into T-SQL syntax for you. In fact, every "query" you write in LinqToSql just creates a SQL query for you in the background. To verify this, run SQL Profiler while your application is running.

The skip/take methodology has worked very well for me, and for others, from what I have read.

Out of curiosity, what type of self-paging query do you have, that you believe is more efficient than Linq's skip/take?

Solution 5

We use a CTE wrapped in Dynamic SQL (because our application requires dynamic sorting of data server side) within a stored procedure. I can provide a basic example if you'd like.

I haven't had a chance to look at the T/SQL that LINQ produces. Can someone post a sample?

We don't use LINQ or straight access to the tables as we require the extra layer of security (granted the dynamic SQL breaks this somewhat).

Something like this should do the trick. You can add in parameterized values for parameters, etc.

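-- @SortingColumn is assumed to hold a validated window clause, e.g. (ORDER BY Col1)
DECLARE @sql nvarchar(max);
SET @sql = N'WITH MyCTE AS (
    SELECT TOP (20) ROW_NUMBER() OVER ' + @SortingColumn + N' AS RowID, Col1, Col2
    FROM MyTable
    WHERE Col4 = ''Something''
    ORDER BY RowID
)
SELECT *
FROM MyCTE
WHERE RowID BETWEEN 10 AND 20';
EXEC sp_executesql @sql;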
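
A sort column cannot be supplied as a bound parameter, which is why dynamic sorting ends up as string concatenation. One common way to keep that safe is to check the column name against a fixed list and bind everything else as parameters. The sketch below shows the idea in Python with SQLite as a stand-in for SQL Server; SORTABLE_COLUMNS, fetch_sorted_page, and the lower-case table and column names are hypothetical.

```python
import sqlite3

# Hypothetical whitelist: the only columns a caller may sort by.
SORTABLE_COLUMNS = {"col1", "col2"}


def fetch_sorted_page(conn, sort_column, col4_value, page_size, offset):
    """Build the ORDER BY from a whitelist; pass values as parameters."""
    if sort_column not in SORTABLE_COLUMNS:
        raise ValueError(f"cannot sort by {sort_column!r}")
    # Identifiers cannot be bound as parameters, so the validated name is
    # interpolated; the filter value and paging numbers are bound normally.
    sql = (
        "SELECT col1, col2 FROM my_table WHERE col4 = ? "
        f"ORDER BY {sort_column} LIMIT ? OFFSET ?"
    )
    return conn.execute(sql, (col4_value, page_size, offset)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 INTEGER, col2 TEXT, col4 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(3, "a", "Something"), (1, "c", "Something"), (2, "b", "Other")],
)

print(fetch_sorted_page(conn, "col1", "Something", 10, 0))  # [(1, 'c'), (3, 'a')]
```

Any sort column outside the whitelist raises ValueError before a query is built, so user input never reaches the SQL text unchecked.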

Author: StoneHeart

Updated on June 03, 2020

Comments

  • StoneHeart almost 4 years

    Should I use LINQ's Skip() and Take() method for paging, or implement my own paging with a SQL query?

    Which is most efficient? Why would I choose one over the other?

    I'm using SQL Server 2008, ASP.NET MVC and LINQ.

  • Jarrod Dixon almost 15 years
    Nice answer - common table expression is a good way to do paging.
  • Imre Pühvel about 12 years
    I doubt there will be any performance impact by using this advice. Can't find a reference for this but inner join order in query can differ from actual join order. The latter is decided by query optimizer using table's statistics and operation cost estimations.
  • Misi almost 12 years
    Could you check my question (stackoverflow.com/questions/11100929/…)? I made an SP that I added to my EDMX and used it in a linq-to-entities query.
  • Matt almost 12 years
    @mrdenny - One hint for the example you've provided: With sp_executesql you have the possibility to pass parameters in a secure way, e.g.: EXECUTE sp_executesql 'WITH myCTE AS ... WHERE Col4=@p1) ...', '@p1 nvarchar(max)', @ValueForCol4. Secure in this context means it is robust against SQL injection - you can pass every possible value inside the variable @ValueForCol4 - even '--', and the query will still work!
  • Ezequiel almost 12 years
    @mrdenny Hi, instead of concatenating the query we use something like this: SELECT ROW_NUMBER() OVER (ORDER BY CASE WHEN @CampoId = 1 THEN Id WHEN @CampoId = 2 THEN field2 END)
  • mrdenny almost 12 years
    That can produce some awful SQL Execution plans.
  • Cohen over 11 years
    +1, good answer, I appreciate you explain the performance benefits of the second example
  • Lukas Eder over 10 years
    @ImreP: This might actually somewhat correspond to the seek method, that I have described. Although, I'm not sure where @p0 and more specifically @p1 come from
  • Lukas Eder over 10 years
    @mrdenny: For large page numbers, the seek method can be much faster than ROW_NUMBER() OVER() offset emulation. See also: 4guysfromrolla.com/webtech/042606-1.shtml
  • Lukas Eder over 10 years
    @Johan: There is an alternative called the seek method that heavily outperforms offsets for large page numbers.
  • Smudge202 over 7 years
    I think this is incorrect. The SQL displayed shows rows from 502-601 (unless you're zero-indexing?)
  • Volkan Sen about 4 years
    No, it does get rows 501 to 600.