LINQ to find series of consecutive numbers

10,468

Solution 1

A linqish way can be writing an extension method GroupWhile like below (All checks omitted. not optimized to understand easily.)

int[] list = new int[] { 1, 2, 3, 5, 7, 8 };
var result = list.GroupWhile((x, y) => y - x == 1)
                 .Select(x => new {i = x.First(), len = x.Count()  })
                 .ToList();

public static IEnumerable<IEnumerable<T>> GroupWhile<T>(this IEnumerable<T> seq, Func<T,T,bool> condition)
{
    T prev = seq.First();
    List<T> list = new List<T>() { prev };

    foreach(T item in seq.Skip(1))
    {
        if(condition(prev,item)==false)
        {
            yield return list;
            list = new List<T>();
        }
        list.Add(item);
        prev = item;
    }

    yield return list;
}

TODO: use IGrouping :)

Solution 2

This seems like a reasonable approach:

  1. Zip the original list with a Range, so each element is tupled with its index
  2. Select those elements whose list predecessor is not their natural predecessor
  3. Convert to array and save to temporary variable (to facilitate the last step).
  4. Deduce the length of the subarrays from the indices. For the last item it is the difference with the original list length. For the other items it is the difference with the next index.

var list = new int[] { 1, 2, 3, 5, 7, 8 };
var filtered = list.Zip(Enumerable.Range(0, list.Length), Tuple.Create)
            .Where((x, i) => i == 0 || list[i - 1] != x.Item1 - 1).ToArray();

var result = filtered.Select((x, i) => i == filtered.Length - 1 
                ? Tuple.Create(x.Item1, list.Length - x.Item2) 
                : Tuple.Create(x.Item1, filtered[i + 1].Item2 - x.Item2));

foreach (var t in result)
{
    Console.WriteLine(t);
}

This results in

(1, 3)
(5, 1)
(7, 2)

Solution 3

Did someone ask to shoehorn a solution into a LINQ query of dubious readability?

var serieses = input.Aggregate(
    new List<Tuple<int, int>>(),
    (l, i) =>
        {
            var last = l.LastOrDefault();
            if (last == null ||
                last.Item1 + last.Item2 != i)
            {
                l.Add(new Tuple<int, int>(i, 1));
            }
            else if (last.Item1 + last.Item2 == i)
            {
                l.RemoveAt(l.Count - 1);
                l.Add(new Tuple<int, int>(last.Item1, last.Item2 + 1));
            }

            return l;
        },
    l => l);

Solution 4

Try with this:

// Static class to contain the extension methods.
public static class MyExtensions
{
    public static IEnumerable<IGrouping<TKey, TSource>> ChunkBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        return source.ChunkBy(keySelector, EqualityComparer<TKey>.Default);
    }

    public static IEnumerable<IGrouping<TKey, TSource>> ChunkBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
    {
        // Flag to signal end of source sequence.
        const bool noMoreSourceElements = true;

        // Auto-generated iterator for the source array.       
        var enumerator = source.GetEnumerator();

        // Move to the first element in the source sequence.
        if (!enumerator.MoveNext()) yield break;

        // Iterate through source sequence and create a copy of each Chunk.
        // On each pass, the iterator advances to the first element of the next "Chunk"
        // in the source sequence. This loop corresponds to the outer foreach loop that
        // executes the query.
        Chunk<TKey, TSource> current = null;
        while (true)
        {
            // Get the key for the current Chunk. The source iterator will churn through
            // the source sequence until it finds an element with a key that doesn't match.
            var key = keySelector(enumerator.Current);

            // Make a new Chunk (group) object that initially has one GroupItem, which is a copy of the current source element.
            current = new Chunk<TKey, TSource>(key, enumerator, value => comparer.Equals(key, keySelector(value)));

            // Return the Chunk. A Chunk is an IGrouping<TKey,TSource>, which is the return value of the ChunkBy method.
            // At this point the Chunk only has the first element in its source sequence. The remaining elements will be
            // returned only when the client code foreach's over this chunk. See Chunk.GetEnumerator for more info.
            yield return current;

            // Check to see whether (a) the chunk has made a copy of all its source elements or 
            // (b) the iterator has reached the end of the source sequence. If the caller uses an inner
            // foreach loop to iterate the chunk items, and that loop ran to completion,
            // then the Chunk.GetEnumerator method will already have made
            // copies of all chunk items before we get here. If the Chunk.GetEnumerator loop did not
            // enumerate all elements in the chunk, we need to do it here to avoid corrupting the iterator
            // for clients that may be calling us on a separate thread.
            if (current.CopyAllChunkElements() == noMoreSourceElements)
            {
                yield break;
            }
        }
    }

    // A Chunk is a contiguous group of one or more source elements that have the same key. A Chunk 
    // has a key and a list of ChunkItem objects, which are copies of the elements in the source sequence.
    class Chunk<TKey, TSource> : IGrouping<TKey, TSource>
    {
        // INVARIANT: DoneCopyingChunk == true || 
        //   (predicate != null && predicate(enumerator.Current) && current.Value == enumerator.Current)

        // A Chunk has a linked list of ChunkItems, which represent the elements in the current chunk. Each ChunkItem
        // has a reference to the next ChunkItem in the list.
        class ChunkItem
        {
            public ChunkItem(TSource value)
            {
                Value = value;
            }
            public readonly TSource Value;
            public ChunkItem Next = null;
        }

        // The value that is used to determine matching elements
        private readonly TKey key;

        // Stores a reference to the enumerator for the source sequence
        private IEnumerator<TSource> enumerator;

        // A reference to the predicate that is used to compare keys.
        private Func<TSource, bool> predicate;

        // Stores the contents of the first source element that
        // belongs with this chunk.
        private readonly ChunkItem head;

        // End of the list. It is repositioned each time a new
        // ChunkItem is added.
        private ChunkItem tail;

        // Flag to indicate the source iterator has reached the end of the source sequence.
        internal bool isLastSourceElement = false;

        // Private object for thread syncronization
        private object m_Lock;

        public Chunk(TKey key, IEnumerator<TSource> enumerator, Func<TSource, bool> predicate)
        {
            this.key = key;
            this.enumerator = enumerator ?? throw new NullReferenceException(nameof(enumerator));
            this.predicate = predicate ?? throw new NullReferenceException(nameof(predicate));

            // A Chunk always contains at least one element.
            head = new ChunkItem(enumerator.Current);

            // The end and beginning are the same until the list contains > 1 elements.
            tail = head;

            m_Lock = new object();
        }

        // Indicates that all chunk elements have been copied to the list of ChunkItems, 
        // and the source enumerator is either at the end, or else on an element with a new key.
        // the tail of the linked list is set to null in the CopyNextChunkElement method if the
        // key of the next element does not match the current chunk's key, or there are no more elements in the source.
        private bool DoneCopyingChunk => tail == null;

        // Adds one ChunkItem to the current group
        // REQUIRES: !DoneCopyingChunk && lock(this)
        private void CopyNextChunkElement()
        {
            // Try to advance the iterator on the source sequence.
            // If MoveNext returns false we are at the end, and isLastSourceElement is set to true
            isLastSourceElement = !enumerator.MoveNext();

            // If we are (a) at the end of the source, or (b) at the end of the current chunk
            // then null out the enumerator and predicate for reuse with the next chunk.
            if (isLastSourceElement || !predicate(enumerator.Current))
            {
                enumerator = null;
                predicate = null;
            }
            else
            {
                tail.Next = new ChunkItem(enumerator.Current);
            }

            // tail will be null if we are at the end of the chunk elements
            // This check is made in DoneCopyingChunk.
            tail = tail.Next;
        }

        // Called after the end of the last chunk was reached. It first checks whether
        // there are more elements in the source sequence. If there are, it 
        // Returns true if enumerator for this chunk was exhausted.
        internal bool CopyAllChunkElements()
        {
            while (true)
            {
                lock (m_Lock)
                {
                    if (DoneCopyingChunk)
                    {
                        // If isLastSourceElement is false,
                        // it signals to the outer iterator
                        // to continue iterating.
                        return isLastSourceElement;
                    }
                    else
                    {
                        CopyNextChunkElement();
                    }
                }
            }
        }

        public TKey Key => key;

        // Invoked by the inner foreach loop. This method stays just one step ahead
        // of the client requests. It adds the next element of the chunk only after
        // the clients requests the last element in the list so far.
        public IEnumerator<TSource> GetEnumerator()
        {
            //Specify the initial element to enumerate.
            ChunkItem current = head;

            // There should always be at least one ChunkItem in a Chunk.
            while (current != null)
            {
                // Yield the current item in the list.
                yield return current.Value;

                // Copy the next item from the source sequence, 
                // if we are at the end of our local list.
                lock (m_Lock)
                {
                    if (current == tail)
                    {
                        CopyNextChunkElement();
                    }
                }

                // Move to the next ChunkItem in the list.
                current = current.Next;
            }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() => GetEnumerator();
    }
}

// A simple named type is used for easier viewing in the debugger. Anonymous types
// work just as well with the ChunkBy operator.
public class KeyValPair
{
    public string Key { get; set; }
    public string Value { get; set; }
}

class Program
{
    // The source sequence.
    public static IEnumerable<KeyValPair> list;

    // Query variable declared as class member to be available
    // on different threads.
    static IEnumerable<IGrouping<string, KeyValPair>> query;

    static void Main(string[] args)
    {
        // Initialize the source sequence with an array initializer.
        list = new[]
        {
            new KeyValPair{ Key = "A", Value = "We" },
            new KeyValPair{ Key = "A", Value = "think" },
            new KeyValPair{ Key = "A", Value = "that" },
            new KeyValPair{ Key = "B", Value = "Linq" },
            new KeyValPair{ Key = "C", Value = "is" },
            new KeyValPair{ Key = "A", Value = "really" },
            new KeyValPair{ Key = "B", Value = "cool" },
            new KeyValPair{ Key = "B", Value = "!" }
        };

        // Create the query by using our user-defined query operator.
        query = list.ChunkBy(p => p.Key);

        // ChunkBy returns IGrouping objects, therefore a nested
        // foreach loop is required to access the elements in each "chunk".
        foreach (var item in query)
        {
            Console.WriteLine($"Group key = {item.Key}");
            foreach (var inner in item)
            {
                Console.WriteLine($"\t{inner.Value}");
            }
        }

        Console.WriteLine("Press any key to exit");
        Console.ReadKey();
    }
}

The extension method implements the native IGrouping interface (same as GroupBy). The storage is done with a linked element list instead of a List. The execution of the extension method is thread safe and lazy.

Solution 5

There is no such out of box extension method, but you can create you own:

public static class LinqUtils{
    public class ConsecutiveGroup<T>
    {
        public T Left { get; internal set; }
        public T Right { get; internal set; }
        public long Count { get; internal set; }
    }

    public static IEnumerable<ConsecutiveGroup<T>> ConsecutiveCounts<T>(this IEnumerable<T> src, Func<T, T, bool> consecutive)
    {
        ConsecutiveGroup<T> current = null;
        foreach (var s in src)
        {
            if (current==null)
            {
                current = new ConsecutiveGroup<T>
                    {
                        Left = s,
                        Right = s,
                        Count = 1
                    };
                continue;
            }

            if(consecutive(current.Right, s))
            {
                current.Right = s;
                current.Count += 1;
                continue;
            }

            yield return current;

            current = new ConsecutiveGroup<T>
            {
                Left = s,
                Right = s,
                Count = 1
            };
        }

        if (current!=null)
        {
            yield return current;
        }
    }
}

[TestFixture]
public static class LinqUtilsTests
{
    [Test]
    public void TestConsecutiveCounts()
    {
        var src = new[] {1,2,3,5,7,8};

        var expected = new[]
            {
                Tuple.Create<int,long>(1, 3),
                Tuple.Create<int,long>(5, 1),
                Tuple.Create<int,long>(7, 2)
            };

        var result = src
            .ConsecutiveCounts((prev, current) => current == (prev + 1))
            .Select(c=>Tuple.Create(c.Left, c.Count));

        Assert.IsTrue(expected.SequenceEqual(result));
    }
}
Share:
10,468

Related videos on Youtube

John NoCookies
Author by

John NoCookies

Updated on June 30, 2022

Comments

  • John NoCookies
    John NoCookies almost 2 years

    I have a list of integers. I want to find all runs of consecutive numbers on that list, defined by the start index and length. So for example, for input list of [1,2,3,5,7,8], the output would be [{1,3}, {5,1}, {7,2}]. This is easy enough to do using a loop, something like this (untested pseudocode):

    for(i=1, i < maxNum; i++)
    {
      number = list[i];
      previousNumber = list[i-1];
      if(number - previousNumber == 1)
      {
        runLength++;
      }
      else
      {
        result.Add(startingNumber, runLength);
        runLength = 1;
        startingNumber = number;
      }
    }
    

    But I thought it would be possible to do using LINQ. Any ideas how to do that?

    • L.B
      L.B over 10 years
      Why does it have to be in Linq? (HINT Take, Skip )
    • John NoCookies
      John NoCookies over 10 years
      Because I'm curious if it can be done using a readable one-liner and Linq seems like the right tool for that.
    • Sergey Kalinichenko
      Sergey Kalinichenko over 10 years
      If it's easy enough to do using a loop, do it using a loop. It is possible that one could shoehorn a solution into a LINQ query of dubious readability, but why bother, if you have a perfectly readable solution with a simple loop?
    • Vitaliy Kalinin
      Vitaliy Kalinin over 10 years
      Care to choose between 5 solutions?
  • Servy
    Servy over 10 years
    1) This is iterating the source sequence multiple times 2) since the groupings are all references to the same exact list, it will cause major problems if each group isn't processed before the next group is generated. A simple GroupWhile(...).ToList() will result in a bunch of references to the same group. You should create a new list for each group. 3) At the end of the foreach the group will always have at least one item, never zero, so there's no need to check the count.
  • Vitaliy Kalinin
    Vitaliy Kalinin over 10 years
    OMG condition(prev,item)==false