LINQ Partition List into Lists of 8 members

27,123

Solution 1

Use the following extension method to break the input into subsets

public static class IEnumerableExtensions
{
    public static IEnumerable<List<T>> InSetsOf<T>(this IEnumerable<T> source, int max)
    {
        List<T> toReturn = new List<T>(max);
        foreach(var item in source)
        {
                toReturn.Add(item);
                if (toReturn.Count == max)
                {
                        yield return toReturn;
                        toReturn = new List<T>(max);
                }
        }
        if (toReturn.Any())
        {
                yield return toReturn;
        }
    }
}

Solution 2

We have just such a method in MoreLINQ as the Batch method:

// As IEnumerable<IEnumerable<T>>
var items = list.Batch(8);

or

// As IEnumerable<List<T>>
var items = list.Batch(8, seq => seq.ToList());

Solution 3

You're better off using a library like MoreLinq, but if you really had to do this using "plain LINQ", you can use GroupBy:

var sequence = new[] {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};

var result = sequence.Select((x, i) => new {Group = i/8, Value = x})
                     .GroupBy(item => item.Group, g => g.Value)
                     .Select(g => g.Where(x => true));

// result is: { {1,2,3,4,5,6,7,8}, {9,10,11,12,13,14,15,16} }

Basically, we use the version of Select() that provides an index for the value being consumed, we divide the index by 8 to identify which group each value belongs to. Then we group the sequence by this grouping key. The last Select just reduces the IGrouping<> down to an IEnumerable<IEnumerable<T>> (and isn't strictly necessary since IGrouping is an IEnumerable).

It's easy enough to turn this into a reusable method by factoring our the constant 8 in the example, and replacing it with a specified parameter. It's not necessarily the most elegant solution, and it is not longer a lazy, streaming solution ... but it does work.

You could also write your own extension method using iterator blocks (yield return) which could give you better performance and use less memory than GroupBy. This is what the Batch() method of MoreLinq does IIRC.

Solution 4

It's not at all what the original Linq designers had in mind, but check out this misuse of GroupBy:

public static IEnumerable<IEnumerable<T>> BatchBy<T>(this IEnumerable<T> items, int batchSize)
{
    var count = 0;
    return items.GroupBy(x => (count++ / batchSize)).ToList();
}

[TestMethod]
public void BatchBy_breaks_a_list_into_chunks()
{
    var values = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    var batches = values.BatchBy(3);
    batches.Count().ShouldEqual(4);
    batches.First().Count().ShouldEqual(3);
    batches.Last().Count().ShouldEqual(1);
}

I think it wins the "golf" prize for this question. The ToList is very important since you want to make sure the grouping has actually been performed before you try doing anything with the output. If you remove the ToList, you will get some weird side effects.

Share:
27,123

Related videos on Youtube

jlemley
Author by

jlemley

Pretzel is a bit Twisted and sometimes a little Salty; He always goes great with beer! Quote of the Moment: Physicists stand on each others’ shoulders while computer scientists stand on each others' toes. (Sad, but often true.)

Updated on May 04, 2020

Comments

  • jlemley
    jlemley almost 4 years

    How would one take a List (using LINQ) and break it into a List of Lists partitioning the original list on every 8th entry?

    I imagine something like this would involve Skip and/or Take, but I'm still pretty new to LINQ.

    Edit: Using C# / .Net 3.5

    Edit2: This question is phrased differently than the other "duplicate" question. Although the problems are similar, the answers in this question are superior: Both the "accepted" answer is very solid (with the yield statement) as well as Jon Skeet's suggestion to use MoreLinq (which is not recommended in the "other" question.) Sometimes duplicates are good in that they force a re-examination of a problem.

    • frankgut
      frankgut over 13 years
      Are you using VB or C#? The presence of iterators makes a big difference.
    • Harald Coppoolse
      Harald Coppoolse about 8 years
      This is not a duplicate. The other question wanted a to break the list into sublists of every n-th element, so a list with elements 0, 8, 16, 24, etc and a list with elements 1, 9, 17, 25, etc. and a list with elements 2, 10, 18, etc. This user wants to break into a list with 0..7 and a list with 8..15 and a list with 16..24, similar to paging
  • Kirk Woll
    Kirk Woll over 13 years
    Cool, this is implemented very nicely, including a resultSelector (important for manipulating/ordering the inner list).
  • jlemley
    jlemley over 13 years
    Phew. I thought perhaps I was a bit daft in not being able to figure this out. Good to see that there are some things that are "missing" from regular LINQ-to-Objects. :)
  • jlemley
    jlemley over 13 years
    I'm going to try this now as this seems quite clever... The thought of "yield return" popped into my head while mulling this over, but I couldn't see a clear way to do it... I'll let you know how this works for me.
  • jlemley
    jlemley over 13 years
    Wow! That's really frickin' cool. I'm going with this! Thanks for the help! :-)
  • LBushkin
    LBushkin over 13 years
    @Pretzel: It's not that this is impossible using plain old LINQ ... it's just that it's neither terribly efficient or easy to understand. See my answer for a "plain LINQ" example.
  • jlemley
    jlemley over 13 years
    Thanks for your input. Yeah, it doesn't seem efficient and as you can imagine, I was struggling to understand how I could do it with regular LINQ. (I'm staring at your answer right now and I really don't understand it very well.) I'll have to fiddle with it more later. (Thanks again!)
  • LBushkin
    LBushkin over 13 years
    The approach using GroupBy() breaks down if the sequence you're planning on batching is going to be extremely large (or infinite). As far as how it works - it creates an anonymous object which associates each item with it's index, and then groups this into a series of sequences based on divisibility by 8 (or any other non-zero constant).
  • jlemley
    jlemley over 13 years
    +1, Thanks for the link to this library. I'll see if I can use it in future projects.
  • Mel
    Mel over 13 years
    For the record, Handcraftsman's "yield return"-based version performs much better, but I still like the "Hey, you're not supposed to be doing that" aspect of this code.
  • Raymond
    Raymond over 11 years
    +1 But note this groups every 8th item, i.e. you get {0,8,16}, {1,9,17}, ...
  • nawfal
    nawfal over 11 years
    -1 This answer is wrong. If you replace division by modulo operator you get batchSize number of partitions which is more suited for this thread stackoverflow.com/questions/438188/…
  • nawfal
    nawfal over 11 years
    This is more about splitting than partitioning - more suitable here
  • nawfal
    nawfal over 11 years
    Furthermore this might give empty inner IEnumerables if number 8 is more than the number of items in items which may or may not be desirable. I incorporated your answer in the other thread anyway..
  • Mel
    Mel over 11 years
    I disagree. If you use modulo, then you're assigning items to the groups by the remainder. You'd get item 0 in group 0, item 1 in group 1, etc. This would totally scramble their order. Using the integer division, as I have done here, means that items 0-2 go in group 0, 3-5 go in group 1, etc. I believe this is what was intended. If you agree, please remove the -1 from my answer.
  • nawfal
    nawfal over 11 years
    you're right. I missed the ToList part which actually executes the query before leaving. Why not use a lighter ToArray? I have to edit your answer to remove my -1. +1ed too! :)
  • Mel
    Mel over 11 years
    ToList is just what I used at the time, but ToArray ought to work just fine.
  • drzaus
    drzaus almost 10 years
    whew, that is much faster! although stackoverflow.com/a/4835541/1037948 started edging you out after a couple runs in linqpad... ;)
  • nawfal
    nawfal almost 10 years
    @drzaus keep in mind that that answer is a dangerous construct with side effects. Since the inner and outer runs on the same enumerator, you get results you do not expect. If you enumerate just the outer sequence, the inner sequences are no longer valid; or you can not iterate the inner sequences twice. Or simply try var x = returnedResult.ElementAt(n).ToList();, you get unexpected results.
  • drzaus
    drzaus over 9 years
    So I recently tried running some perf comparisons again (tests+results here) between just calling this method and doing something with the results, and it wasn't as fast vs some of the other suggestions. Am I missing something?
  • nawfal
    nawfal over 9 years
    @drzaus it was wonderful you did that benchmarking. I will see to your link in a week. I need some time now. Thanks :) will let you know.
  • drzaus
    drzaus almost 9 years
    Recently realized the 'straightforward' solution results in duplicate enumeration/evaluation -- see explanation gist.github.com/zaus/4cae1a5b42475a083287 which is probably also pointed out on the "already answered" question
  • nawfal
    nawfal almost 9 years
    I will see all that. Thanks. Gimme some time.
  • drzaus
    drzaus almost 9 years
    Minor question - I happened to copy the Batch code, and Resharper warns me that in return Batch(source, size, x => x); the x is a possible multiple enumeration issue. Correct/Ignore?