LINQ Partition List into Lists of 8 members
Solution 1
Use the following extension method to break the input into subsets
public static class IEnumerableExtensions
{
public static IEnumerable<List<T>> InSetsOf<T>(this IEnumerable<T> source, int max)
{
List<T> toReturn = new List<T>(max);
foreach(var item in source)
{
toReturn.Add(item);
if (toReturn.Count == max)
{
yield return toReturn;
toReturn = new List<T>(max);
}
}
if (toReturn.Any())
{
yield return toReturn;
}
}
}
Solution 2
We have just such a method in MoreLINQ as the Batch method:
// As IEnumerable<IEnumerable<T>>
var items = list.Batch(8);
or
// As IEnumerable<List<T>>
var items = list.Batch(8, seq => seq.ToList());
Solution 3
You're better off using a library like MoreLinq, but if you really had to do this using "plain LINQ", you can use GroupBy
:
var sequence = new[] {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
var result = sequence.Select((x, i) => new {Group = i/8, Value = x})
.GroupBy(item => item.Group, g => g.Value)
.Select(g => g.Where(x => true));
// result is: { {1,2,3,4,5,6,7,8}, {9,10,11,12,13,14,15,16} }
Basically, we use the version of Select()
that provides an index for the value being consumed, we divide the index by 8 to identify which group each value belongs to. Then we group the sequence by this grouping key. The last Select
just reduces the IGrouping<>
down to an IEnumerable<IEnumerable<T>>
(and isn't strictly necessary since IGrouping
is an IEnumerable
).
It's easy enough to turn this into a reusable method by factoring our the constant 8
in the example, and replacing it with a specified parameter.
It's not necessarily the most elegant solution, and it is not longer a lazy, streaming solution ... but it does work.
You could also write your own extension method using iterator blocks (yield return
) which could give you better performance and use less memory than GroupBy
. This is what the Batch()
method of MoreLinq does IIRC.
Solution 4
It's not at all what the original Linq designers had in mind, but check out this misuse of GroupBy:
public static IEnumerable<IEnumerable<T>> BatchBy<T>(this IEnumerable<T> items, int batchSize)
{
var count = 0;
return items.GroupBy(x => (count++ / batchSize)).ToList();
}
[TestMethod]
public void BatchBy_breaks_a_list_into_chunks()
{
var values = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var batches = values.BatchBy(3);
batches.Count().ShouldEqual(4);
batches.First().Count().ShouldEqual(3);
batches.Last().Count().ShouldEqual(1);
}
I think it wins the "golf" prize for this question. The ToList
is very important since you want to make sure the grouping has actually been performed before you try doing anything with the output. If you remove the ToList
, you will get some weird side effects.
Related videos on Youtube
jlemley
Pretzel is a bit Twisted and sometimes a little Salty; He always goes great with beer! Quote of the Moment: Physicists stand on each others’ shoulders while computer scientists stand on each others' toes. (Sad, but often true.)
Updated on May 04, 2020Comments
-
jlemley almost 4 years
How would one take a List (using LINQ) and break it into a List of Lists partitioning the original list on every 8th entry?
I imagine something like this would involve Skip and/or Take, but I'm still pretty new to LINQ.
Edit: Using C# / .Net 3.5
Edit2: This question is phrased differently than the other "duplicate" question. Although the problems are similar, the answers in this question are superior: Both the "accepted" answer is very solid (with the
yield
statement) as well as Jon Skeet's suggestion to use MoreLinq (which is not recommended in the "other" question.) Sometimes duplicates are good in that they force a re-examination of a problem.-
frankgut over 13 yearsAre you using VB or C#? The presence of iterators makes a big difference.
-
Harald Coppoolse about 8 yearsThis is not a duplicate. The other question wanted a to break the list into sublists of every n-th element, so a list with elements 0, 8, 16, 24, etc and a list with elements 1, 9, 17, 25, etc. and a list with elements 2, 10, 18, etc. This user wants to break into a list with 0..7 and a list with 8..15 and a list with 16..24, similar to paging
-
-
Kirk Woll over 13 yearsCool, this is implemented very nicely, including a resultSelector (important for manipulating/ordering the inner list).
-
jlemley over 13 yearsPhew. I thought perhaps I was a bit daft in not being able to figure this out. Good to see that there are some things that are "missing" from regular LINQ-to-Objects. :)
-
jlemley over 13 yearsI'm going to try this now as this seems quite clever... The thought of "yield return" popped into my head while mulling this over, but I couldn't see a clear way to do it... I'll let you know how this works for me.
-
jlemley over 13 yearsWow! That's really frickin' cool. I'm going with this! Thanks for the help! :-)
-
LBushkin over 13 years@Pretzel: It's not that this is impossible using plain old LINQ ... it's just that it's neither terribly efficient or easy to understand. See my answer for a "plain LINQ" example.
-
jlemley over 13 yearsThanks for your input. Yeah, it doesn't seem efficient and as you can imagine, I was struggling to understand how I could do it with regular LINQ. (I'm staring at your answer right now and I really don't understand it very well.) I'll have to fiddle with it more later. (Thanks again!)
-
LBushkin over 13 yearsThe approach using
GroupBy()
breaks down if the sequence you're planning on batching is going to be extremely large (or infinite). As far as how it works - it creates an anonymous object which associates each item with it's index, and then groups this into a series of sequences based on divisibility by8
(or any other non-zero constant). -
jlemley over 13 years+1, Thanks for the link to this library. I'll see if I can use it in future projects.
-
Mel over 13 yearsFor the record, Handcraftsman's "yield return"-based version performs much better, but I still like the "Hey, you're not supposed to be doing that" aspect of this code.
-
Raymond over 11 years+1 But note this groups every 8th item, i.e. you get {0,8,16}, {1,9,17}, ...
-
nawfal over 11 years-1 This answer is wrong. If you replace division by modulo operator you get batchSize number of partitions which is more suited for this thread stackoverflow.com/questions/438188/…
-
nawfal over 11 yearsThis is more about splitting than partitioning - more suitable here
-
nawfal over 11 yearsFurthermore this might give empty inner IEnumerables if number 8 is more than the number of items in
items
which may or may not be desirable. I incorporated your answer in the other thread anyway.. -
Mel over 11 yearsI disagree. If you use modulo, then you're assigning items to the groups by the remainder. You'd get item 0 in group 0, item 1 in group 1, etc. This would totally scramble their order. Using the integer division, as I have done here, means that items 0-2 go in group 0, 3-5 go in group 1, etc. I believe this is what was intended. If you agree, please remove the -1 from my answer.
-
nawfal over 11 yearsyou're right. I missed the ToList part which actually executes the query before leaving. Why not use a lighter ToArray? I have to edit your answer to remove my -1. +1ed too! :)
-
Mel over 11 yearsToList is just what I used at the time, but ToArray ought to work just fine.
-
drzaus almost 10 yearswhew, that is much faster! although stackoverflow.com/a/4835541/1037948 started edging you out after a couple runs in linqpad... ;)
-
nawfal almost 10 years@drzaus keep in mind that that answer is a dangerous construct with side effects. Since the inner and outer runs on the same enumerator, you get results you do not expect. If you enumerate just the outer sequence, the inner sequences are no longer valid; or you can not iterate the inner sequences twice. Or simply try
var x = returnedResult.ElementAt(n).ToList();
, you get unexpected results. -
drzaus over 9 yearsSo I recently tried running some perf comparisons again (tests+results here) between just calling this method and doing something with the results, and it wasn't as fast vs some of the other suggestions. Am I missing something?
-
nawfal over 9 years@drzaus it was wonderful you did that benchmarking. I will see to your link in a week. I need some time now. Thanks :) will let you know.
-
drzaus almost 9 yearsRecently realized the 'straightforward' solution results in duplicate enumeration/evaluation -- see explanation gist.github.com/zaus/4cae1a5b42475a083287 which is probably also pointed out on the "already answered" question
-
nawfal almost 9 yearsI will see all that. Thanks. Gimme some time.
-
drzaus almost 9 yearsMinor question - I happened to copy the
Batch
code, and Resharper warns me that inreturn Batch(source, size, x => x);
thex
is a possible multiple enumeration issue. Correct/Ignore?