EF Core nested Linq select results in N + 1 SQL queries

13,374

Solution 1

The GitHub issue #4007 has been marked as closed-fixed for milestone 2.1.0-preview1. And now the 2.1 preview1 has been made available on NuGet as discussed in this .NET Blog post.

Version 2.1 proper is also released, install it with the following command:

Install-Package Microsoft.EntityFrameworkCore.SqlServer -Version 2.1.0

Then use .ToList() on the nested .Select(x => ...) to indicate the result should be fetched immediately. For my original question this looks like this:

var query = context.Top
    .Select(t => new {
        prop1 = t.C1,
        prop2 = t.Sub.Select(s => new {
            prop21 = s.C3
        })
        .ToList() // <-- Add this
    });
var res = query.ToArray(); // Execute the Linq query

This results in 2 SQL queries being run on the database (instead of N + 1); First a plain SELECT FROM the 'Top' table and then a SELECT FROM the 'Sub' table with an INNER JOIN FROM the 'Top' table, based on Key-ForeignKey relation [Sub].[TopId] = [Top].[Id]. The results of these queries are then combined in memory.

The result is exactly what you would expect and very similar to what EF6 would have returned: An array of anonymous type 'a which has properties prop1 and prop2 where prop2 is a List of anonymous type 'b which has a property prop21. Most importantly all of this is fully loaded after the .ToArray() call!

Solution 2

I faced the same problem.

The solution what you proposed doesn't work for relatively big tables. If you have a look on the generated query it would be an inner join without where condition.

var query2 = context.Top .Include(t => t.Sub) .ToArray() .Select(t => new //... select what is needed, fill anonymous types

I solved it with redesign of database though I would be happy to hear a better solution.

In my case, I have two tables A and B. Table A has one-to-many with B. When I tried to solve it directly with a list as you desribed I didn't manage to do it(running time for .NET LINQ was 0.5 second, whereas .NET Core LINQ failed after 30 seconds of running time).

As a result I had to create a foreign key for table B and start from the side of table B without an inner list.

context.A.Where(a => a.B.ID == 1).ToArray();

Afterwards you can simply manipulate the resulted .NET objects.

Share:
13,374
GWigWam
Author by

GWigWam

Curly brace enthusiast The Problem(s) with Nullable Reference Types

Updated on June 06, 2022

Comments

  • GWigWam
    GWigWam almost 2 years

    I have a data model where a 'Top' object has between 0 and N 'Sub' objects. In SQL this is achieved with a foreign key dbo.Sub.TopId.

    var query = context.Top
        //.Include(t => t.Sub) Doesn't seem to do anything
        .Select(t => new {
            prop1 = t.C1,
            prop2 = t.Sub.Select(s => new {
                prop21 = s.C3 //C3 is a column in the table 'Sub'
            })
            //.ToArray() results in N + 1 queries
        });
    var res = query.ToArray();
    

    In Entity Framework 6 (with lazy-loading off) this Linq query would be converted to a single SQL query. The result would be fully loaded, so res[0].prop2 would be an IEnumerable<SomeAnonymousType> which is already filled.

    When using EntityFrameworkCore (NuGet v1.1.0) however the sub-collection is not yet loaded and is of type:

    System.Linq.Enumerable.WhereSelectEnumerableIterator<Microsoft.EntityFrameworkCore.Storage.ValueBuffer, <>f__AnonymousType1<string>>.
    

    The data will not be loaded until you iterate over it, resulting in N + 1 queries. When i add .ToArray() to the query (as shown in comments) the data gets fully loaded into var res, using a SQL profiler however shows this isn't achieved in 1 SQL query anymore. For each 'Top' object a query on the 'Sub' table is executed.

    First specifying .Include(t => t.Sub) doesn't seem to change anything. The use of anonymous types doesn't seem to be the problem either, replacing the new { ... } blocks with new MyPocoClass { ... } doesn't change anything.

    My question is: Is there a way to get behavior similar to EF6, where all data is loaded immediately?


    Note: i realize that in this example the problem can be fixed by creating the anonymous objects in memory after executing the query like so:

    var query2 = context.Top
        .Include(t => t.Sub)
        .ToArray()
        .Select(t => new //... select what is needed, fill anonymous types
    

    However this is just an example, i do actually need the creation of objects to be part of the Linq query as AutoMapper uses this to fill DTOs in my project


    Update: Tested with the new EF Core 2.0, issue is stil present. (21-08-2017)

    Issue is tracked on aspnet/EntityFrameworkCore GitHub repo: Issue 4007

    Update: A year later, this issue has been fixed in version 2.1.0-preview1-final. (2018-03-01)

    Update: EF version 2.1 has been released, it includes a fix. see my answer below. (2018-05-31)