GroupBy with elementSelector and resultSelector


For IEnumerable:

petsList.GroupBy(
    pet => Math.Floor(pet.Age), // keySelector
    (age, pets) => new          // resultSelector
    {
        Key = age,
        Count = pets.Count(),
        Min = pets.Min(pet => pet.Age),
        Max = pets.Max(pet => pet.Age)
    });

is equivalent to:

var query = petsList.GroupBy(
    pet => Math.Floor(pet.Age), // keySelector
    pet => pet,                 // elementSelector (identity: elements stay Pet objects)
    (baseAge, pets) => new      // resultSelector
    {
        Key = baseAge,
        Count = pets.Count(),
        Min = pets.Min(pet => pet.Age),
        Max = pets.Max(pet => pet.Age)
    });

Using an elementSelector can simplify the expressions in the resultSelector (compare the next example with the previous one):

var query = petsList.GroupBy(
    pet => Math.Floor(pet.Age), // keySelector
    pet => pet.Age,             // elementSelector
    (baseAge, ages) => new      // resultSelector
    {
        Key = baseAge,
        Count = ages.Count(),
        Min = ages.Min(), // no lambda needed thanks to the elementSelector
        Max = ages.Max()  // no lambda needed thanks to the elementSelector
    });
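The equivalence claimed above can be checked with a small console program. This is only a sketch: the Pet class (a string Name and a double Age) and the sample data are assumed to match the MSDN sample from the question, and the method name OverloadsAgree is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the Pet class from the MSDN sample.
class Pet
{
    public string Name { get; set; } = "";
    public double Age { get; set; }
}

static class Program
{
    // Hypothetical helper: true when both overloads yield the same groups in the same order.
    public static bool OverloadsAgree(IEnumerable<Pet> petsList)
    {
        // keySelector + resultSelector
        var withoutElementSelector = petsList.GroupBy(
            pet => Math.Floor(pet.Age),
            (age, pets) => new
            {
                Key = age,
                Count = pets.Count(),
                Min = pets.Min(pet => pet.Age),
                Max = pets.Max(pet => pet.Age)
            });

        // keySelector + elementSelector + resultSelector
        var withElementSelector = petsList.GroupBy(
            pet => Math.Floor(pet.Age),
            pet => pet.Age,
            (baseAge, ages) => new
            {
                Key = baseAge,
                Count = ages.Count(),
                Min = ages.Min(),
                Max = ages.Max()
            });

        // Both anonymous types have the same property names, types and order,
        // so they are the same type and compare by value.
        return withoutElementSelector.SequenceEqual(withElementSelector);
    }

    static void Main()
    {
        var petsList = new List<Pet>
        {
            new Pet { Name = "Barley", Age = 8.3 },
            new Pet { Name = "Boots", Age = 4.9 },
            new Pet { Name = "Whiskers", Age = 1.5 },
            new Pet { Name = "Daisy", Age = 4.3 }
        };

        Console.WriteLine("Same results: " + OverloadsAgree(petsList)); // prints "Same results: True"
    }
}
```

The comparison relies on the value-based Equals that the compiler generates for anonymous types, which avoids having to declare a result class just for the check.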

For IQueryable, it is not so simple. You can look at the source of these methods:

public static IQueryable<TResult> GroupBy<TSource, TKey, TElement, TResult>(
    this IQueryable<TSource> source,
    Expression<Func<TSource, TKey>> keySelector,
    Expression<Func<TSource, TElement>> elementSelector,
    Expression<Func<TKey, IEnumerable<TElement>, TResult>> resultSelector)
{
    if (source == null)
        throw Error.ArgumentNull("source");
    if (keySelector == null)
        throw Error.ArgumentNull("keySelector");
    if (elementSelector == null)
        throw Error.ArgumentNull("elementSelector");
    if (resultSelector == null)
        throw Error.ArgumentNull("resultSelector");
    return source.Provider.CreateQuery<TResult>(
        Expression.Call(
            null,
            ((MethodInfo)MethodBase.GetCurrentMethod()).MakeGenericMethod(typeof(TSource), typeof(TKey), typeof(TElement), typeof(TResult)),
            new Expression[] { source.Expression, Expression.Quote(keySelector), Expression.Quote(elementSelector), Expression.Quote(resultSelector) }));
}

public static IQueryable<TResult> GroupBy<TSource, TKey, TResult>(
    this IQueryable<TSource> source,
    Expression<Func<TSource, TKey>> keySelector,
    Expression<Func<TKey, IEnumerable<TSource>, TResult>> resultSelector)
{
    if (source == null)
        throw Error.ArgumentNull("source");
    if (keySelector == null)
        throw Error.ArgumentNull("keySelector");
    if (resultSelector == null)
        throw Error.ArgumentNull("resultSelector");
    return source.Provider.CreateQuery<TResult>(
        Expression.Call(
            null,
            ((MethodInfo)MethodBase.GetCurrentMethod()).MakeGenericMethod(typeof(TSource), typeof(TKey), typeof(TResult)),
            new Expression[] { source.Expression, Expression.Quote(keySelector), Expression.Quote(resultSelector) }));
}

As you can see, they build different expressions, so I'm not sure that the resulting SQL query will be the same in all cases, but I suppose that the SQL query for the overload with elementSelector + resultSelector will not be slower than the one for the overload without elementSelector.
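To see the difference in the expression trees, one can inspect the Expression property of each query. The sketch below uses the in-memory provider obtained through AsQueryable(), not Entity Framework, so it shows only that the two trees differ and says nothing about the SQL a real provider would generate. The Pet class and the helper name RootCallArgumentCount are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Assumed shape of the Pet class from the MSDN sample.
class Pet
{
    public string Name { get; set; } = "";
    public double Age { get; set; }
}

static class Program
{
    // Hypothetical helper: number of arguments of the method call at the root of the query's tree.
    public static int RootCallArgumentCount(IQueryable query)
    {
        var call = (MethodCallExpression)query.Expression;
        return call.Arguments.Count;
    }

    static void Main()
    {
        IQueryable<Pet> pets = new List<Pet>
        {
            new Pet { Name = "Barley", Age = 8.3 },
            new Pet { Name = "Boots", Age = 4.9 },
            new Pet { Name = "Whiskers", Age = 1.5 },
            new Pet { Name = "Daisy", Age = 4.3 }
        }.AsQueryable();

        var withoutElementSelector = pets.GroupBy(
            pet => Math.Floor(pet.Age),
            (age, items) => new { Key = age, Min = items.Min(p => p.Age) });

        var withElementSelector = pets.GroupBy(
            pet => Math.Floor(pet.Age),
            pet => pet.Age,
            (age, ages) => new { Key = age, Min = ages.Min() });

        // source + keySelector + resultSelector
        Console.WriteLine(RootCallArgumentCount(withoutElementSelector)); // prints 3
        // source + keySelector + elementSelector + resultSelector
        Console.WriteLine(RootCallArgumentCount(withElementSelector));    // prints 4

        // Textual form of both trees, for comparison by eye.
        Console.WriteLine(withoutElementSelector.Expression);
        Console.WriteLine(withElementSelector.Expression);

        // The in-memory provider still produces the same results for both.
        Console.WriteLine(withoutElementSelector.ToList().SequenceEqual(withElementSelector.ToList())); // prints True
    }
}
```

Whether a database provider maps these two trees to identical SQL depends on the provider and its version, so logging the generated SQL is the only reliable check.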

Answer 1: No, for IEnumerable there is no query that you cannot express by using the resultSelector alone.
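The reason is that an elementSelector can always be moved into the resultSelector as a Select over the group. A minimal sketch of that rewrite follows; the extension method GroupByViaResultSelector is a hypothetical name invented here, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupByRewrite
{
    // Hypothetical helper: emulates the elementSelector overload
    // using only the keySelector + resultSelector overload.
    public static IEnumerable<TResult> GroupByViaResultSelector<TSource, TKey, TElement, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        Func<TSource, TElement> elementSelector,
        Func<TKey, IEnumerable<TElement>, TResult> resultSelector)
    {
        return source.GroupBy(
            keySelector,
            // Project the group's elements first, then apply the original resultSelector.
            (key, items) => resultSelector(key, items.Select(elementSelector)));
    }
}

static class Program
{
    static void Main()
    {
        string[] words = { "apple", "avocado", "banana", "blueberry", "cherry" };

        // Built-in overload with elementSelector: total word length per first letter.
        var builtIn = words.GroupBy(
            w => w[0],
            w => w.Length,
            (letter, lengths) => $"{letter}:{lengths.Sum()}");

        // Same query expressed through the resultSelector-only overload.
        var rewritten = words.GroupByViaResultSelector(
            w => w[0],
            w => w.Length,
            (letter, lengths) => $"{letter}:{lengths.Sum()}");

        Console.WriteLine(string.Join(", ", builtIn));      // prints "a:12, b:15, c:6"
        Console.WriteLine(builtIn.SequenceEqual(rewritten)); // prints "True"
    }
}
```

So for IEnumerable the elementSelector overload is a convenience: it keeps the projection in one place instead of repeating it inside every aggregate.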

Answer 2: No, there is no counterpart for the two different overloads when using LINQ query syntax. Extension methods offer more possibilities than LINQ query syntax.
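The closest query-syntax form is a group ... by ... into clause followed by select. As far as the C# language specification describes it, the group clause is translated to the GroupBy overload with keySelector and elementSelector, and the projection becomes a separate Select call, so neither resultSelector overload is produced directly. The sketch below compares that form with overload (b); the Pet class and the method name QuerySyntaxMatchesMethodSyntax are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the Pet class from the MSDN sample.
class Pet
{
    public string Name { get; set; } = "";
    public double Age { get; set; }
}

static class Program
{
    // Hypothetical helper: true when the query-syntax form and overload (b) give the same results.
    public static bool QuerySyntaxMatchesMethodSyntax(IEnumerable<Pet> petsList)
    {
        // Compiles to GroupBy(keySelector, elementSelector).Select(...)
        var querySyntax =
            from pet in petsList
            group pet.Age by Math.Floor(pet.Age) into g
            select new { Key = g.Key, Count = g.Count(), Min = g.Min(), Max = g.Max() };

        // Overload (b): keySelector + elementSelector + resultSelector
        var methodSyntax = petsList.GroupBy(
            pet => Math.Floor(pet.Age),
            pet => pet.Age,
            (baseAge, ages) => new { Key = baseAge, Count = ages.Count(), Min = ages.Min(), Max = ages.Max() });

        return querySyntax.SequenceEqual(methodSyntax);
    }

    static void Main()
    {
        var petsList = new List<Pet>
        {
            new Pet { Name = "Barley", Age = 8.3 },
            new Pet { Name = "Boots", Age = 4.9 },
            new Pet { Name = "Whiskers", Age = 1.5 },
            new Pet { Name = "Daisy", Age = 4.3 }
        };

        Console.WriteLine(QuerySyntaxMatchesMethodSyntax(petsList)); // prints "True"
    }
}
```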

Answer 3 (for the side question): It is not guaranteed that the SQL queries will be the same for these overloads.

Author: Slauma

Updated on September 14, 2022

Comments

  • Slauma

    The Enumerable.GroupBy and Queryable.GroupBy extensions have 8 overloads. Two of them (for Enumerable.GroupBy) are:

    // (a)
    IEnumerable<TResult> GroupBy<TSource, TKey, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        Func<TKey, IEnumerable<TSource>, TResult> resultSelector);
    
    // (b)
    IEnumerable<TResult> GroupBy<TSource, TKey, TElement, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        Func<TSource, TElement> elementSelector,
        Func<TKey, IEnumerable<TElement>, TResult> resultSelector);
    

    (for Queryable.GroupBy the same, just with Expression<Func<... instead of Func<...)

    (b) has an additional elementSelector as parameter.

    MSDN has an example for overload (a) and an example for overload (b). They both work with the same example source collection:

    List<Pet> petsList = new List<Pet>
    {
        new Pet { Name="Barley", Age=8.3 },
        new Pet { Name="Boots", Age=4.9 },
        new Pet { Name="Whiskers", Age=1.5 },
        new Pet { Name="Daisy", Age=4.3 }
    };
    

    Example (a) uses this query:

    var query = petsList.GroupBy(
        pet => Math.Floor(pet.Age), // keySelector
        (age, pets) => new          // resultSelector
        {
            Key = age,
            Count = pets.Count(),
            Min = pets.Min(pet => pet.Age),
            Max = pets.Max(pet => pet.Age)
        });
    

    And example (b) uses this query:

    var query = petsList.GroupBy(
        pet => Math.Floor(pet.Age), // keySelector
        pet => pet.Age,             // elementSelector
        (baseAge, ages) => new      // resultSelector
        {
            Key = baseAge,
            Count = ages.Count(),
            Min = ages.Min(),
            Max = ages.Max()
        });
    

    The result of both queries is exactly the same.

    Question 1: Is there any kind of query that I cannot express by using the resultSelector alone and where I really would need the elementSelector? Or are the capabilities of the two overloads always equivalent and it is just a matter of taste to use one or the other way?

    Question 2: Is there a counterpart for the two different overloads when using LINQ query syntax?

    (As a side question: When using Queryable.GroupBy with Entity Framework, will both overloads be translated into the exact same SQL?)