How to perform a group join in .NET Core 3.0 Entity Framework?

21,837

Solution 1

As discussed here, you're attempting a query that isn't supported by the database. EF Core 2 used client-side evaluation to make your code work, but EF Core 3 refuses, because the client-side convenience comes at the cost of hard-to-debug performance problems as the dataset increases.

You can use use DefaultIfEmpty to left join the patients' studies and then group manually with ToLookup.

var query =
    from p in db.Patients
    join s in db.Studies on p.Id equals s.PatientId into studies
    from s in studies.DefaultIfEmpty()
    select new { Patient = p, Study = s };

var grouping = query.ToLookup(e => e.Patient); // Grouping done client side

The above example grabs the full Patient and Study entities, but you can cherry pick columns instead. If the data you need from Patient is too big to repeat for each Study, in the joined query select only the Patient ID, querying the rest of the Patient data in a separate non-joined query.

Solution 2

Had exactly the same issue and a big struggle with it. It turns out that .net Core 3.0 does not support Join or Groupjoin in method syntax (yet?). The fun part is though, it does work in Query syntax.

Try this, it's query syntax with a bit of method syntax. This translates nicely to the correct SQL query with a nice left outer join and it is processed on the database. I haven't got your models so you need to check the syntax yourselves....

var queryResults1 = 
    (from p in _context.patients
    from s in _context.Studies.Where(st => st.PatientId == p.Id).DefaultIfEmpty()
    select new
    {
        p.DateOfBirth,
        p.Id,
        p.Name,
        p.Sex,
        Studies = studies.Select(s1 => s1)
    }).ToListAsync();
Share:
21,837

Related videos on Youtube

shelbypereira
Author by

shelbypereira

.NET Software Architect with 10 years experience. considerable experience with embedded C++ on the Linux platform and some Java development. Experience with all the major DBs (Oracle, SQL Server, MySQL) currently mostly working with .NET Core 3.0 on medical applications including Dicom imaging software.

Updated on July 09, 2022

Comments

  • shelbypereira
    shelbypereira almost 2 years

    With the changes to .NET Core 3.0 I am getting

    ... NavigationExpandingExpressionVisitor' failed. This may indicate either a bug or a limitation in EF Core. See https://go.microsoft.com/fwlink/?linkid=2101433 for more detailed information.) ---> System.InvalidOperationException: Processing of the LINQ expression 'GroupJoin, ...

    This is a really simple query so there must be a way to perform it in .NET CORE 3.0:

     var queryResults1 = await patients
                .GroupJoin(
                    _context.Studies,
                    p => p.Id,
                    s => s.Patient.Id,
                    (p, studies) => new 
                    {
                        p.DateOfBirth,
                        p.Id,
                        p.Name,
                        p.Sex,
                       Studies =studies.Select(s1=>s1)
                    }
                )
                .AsNoTracking().ToListAsync();
    

    I am basically looking for a Linq query (or method syntax as above) which will join Studies onto Patients, and set Studies to an empty list or null if there are no studies for the given patient.

    Any ideas? This was working in .NET Core 2.2. Also the MSFT link above mentions that the key breaking change is related to client side evaluation and avoiding that the generated query reads entire tables which must then be joined or filtered client side. However with this simple query, the join should be easily doable server side.

  • shelbypereira
    shelbypereira over 4 years
    Answer works! I guess there is still some work to do in the query translator. A simple query like this should be translatable. There should not be performance issues for a simple group join of 2 tables as the dataset increase assuming FK/indexes are correct. I suspect many people will have tis issue, a 2 table group join is quite a standard and often used query.
  • Edward Brey
    Edward Brey over 4 years
    @she72 I agree. It looks like the problem stems from the difference in how LINQ and SQL use the "group" keyword. EF Core should translate the LINQ groupby into left joins where doing so doesn't pull back more rows than expected. I posted a comment accordingly.
  • shelbypereira
    shelbypereira over 4 years
    I have a follow-up question, I am still trying to understand why the grouping for this type of query needs to be done client side, seems a limitation of the new LINQ framework. For the case above I don't see any risks that it slows down client side execution in unexpected ways. Can you clarify?
  • shelbypereira
    shelbypereira over 4 years
    And as a further follow-up the main concern is: in your reformulated query which groups client side if I have 1000 studies per patient, I will be loading each patient 1000 times from the DB? is there any alternative to force this work to be done in the DB and return the grouped results?
  • Edward Brey
    Edward Brey over 4 years
    @shev72 The only grouping the database understands involve aggregates, for a example a query of patients with a count of studies per patient. The database always returns a rectangular dataset. A hierarchical grouping has to be composed by the client. You could look at it as client-side evaluation or as part of the ORM. In a hierarchical grouping, parent entity data is repeated, though not re-queried.
  • hwmaat
    hwmaat over 4 years
    By the way, the Join and GroupJoin with Method syntac DO work with the non-core Framework and EF. And do translate to the right query that is pocessed server side
  • Ankur Arora
    Ankur Arora over 4 years
    what is studies in studies.Select(s1 => s1)
  • hwmaat
    hwmaat over 4 years
    The models were not included in the question so I do not know the studies model. My best guess is that this is a virtual collection in the model.
  • NetMage
    NetMage almost 4 years
    All query syntax is translated to fluent / lambda / "method" syntax so if it translates in query syntax, the equivalent fluent query will also translate.
  • thalacker
    thalacker about 3 years
    Why not do a .ToList() rather than the .ToLookup()?
  • Edward Brey
    Edward Brey about 3 years
    @thalacker ToList doesn't group the results. You would just get a flat list of studies (and their related patients) rather than a list of studies grouped by patients.