How to use LINQ with a 2 dimensional array

Solution 1

Multidimensional arrays are awkward to work with in LINQ, but here is one way to do it:

var arr = new [,] { { 0, 0, 0, 0, 1 }, { 1, 1, 1, 1, 0 }, { 0, 0, 1, 1, 1 }, { 1, 0, 1, 0, 1 } };

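// Index of the first row that contains the most 1s.
var data =
    Enumerable.Range(0, arr.GetLength(0)) // one entry per row
        .Select(
            row =>
                new
                {
                    index = row,
                    // count the 1s in this row by walking its columns
                    count = Enumerable.Range(0, arr.GetLength(1)).Count(col => arr[row, col] == 1)
                })
        .OrderByDescending(x => x.count)
        .Select(x => x.index)
        .First();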
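
Since the awkward part is that a rectangular array does not expose its rows as sequences, one option is to wrap the row walk in a small helper. The following is only a sketch: `Rows` and `IndexOfRowWithMostOnes` are hypothetical names made up for illustration, not part of .NET or of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the .NET library.
public static class Array2DExtensions
{
    // Yields each row of a rectangular array as its own lazy sequence.
    public static IEnumerable<IEnumerable<T>> Rows<T>(this T[,] source)
    {
        for (int r = 0; r < source.GetLength(0); r++)
        {
            int row = r; // copy, so the lazy inner query keeps this row's index
            yield return Enumerable.Range(0, source.GetLength(1))
                                   .Select(c => source[row, c]);
        }
    }
}

public static class RowsDemo
{
    // Returns the index of the first row that contains the most 1s.
    public static int IndexOfRowWithMostOnes(int[,] arr)
    {
        return arr.Rows()
                  .Select((row, index) => new { index, count = row.Count(x => x == 1) })
                  .OrderByDescending(x => x.count)
                  .First()
                  .index;
    }
}
```

With the `arr` defined above, `RowsDemo.IndexOfRowWithMostOnes(arr)` should give the same row index as the query in this answer.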

Solution 2

1) You can do it with LINQ this way...

private static int GetMaxIndex(byte[,] TwoDArray) {
    return Enumerable.Range(0, TwoDArray.GetLength(0))
                     .Select(
                         x => new {
                             Index = x,
                             Count = Enumerable.Range(0, TwoDArray.GetLength(1)).Count(y => TwoDArray[x, y] == 1)
                         })
                     .OrderByDescending(x => x.Count)
                     .First()
                     .Index;
}

You'd have to benchmark it to see whether LINQ is faster or slower than the plain loop.
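
One rough way to run that test is with `System.Diagnostics.Stopwatch`. The sketch below is an assumption about how you might set it up, not a rigorous benchmark: `MaxIndexTiming`, `TimeMs` and `RandomBits` are hypothetical names, `WithLinq` repeats the logic of `GetMaxIndex` above, and `WithLoop` mirrors the non-LINQ loop from the question.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class MaxIndexTiming
{
    // Loop version, equivalent to the non-LINQ code in the question.
    public static int WithLoop(byte[,] a)
    {
        int maxCount = 0, maxIndex = 0;
        for (int r = 0; r < a.GetLength(0); r++)
        {
            int count = 0;
            for (int c = 0; c < a.GetLength(1); c++)
                if (a[r, c] != 0) count++;
            if (count > maxCount) { maxCount = count; maxIndex = r; }
        }
        return maxIndex;
    }

    // LINQ version, same logic as GetMaxIndex above.
    public static int WithLinq(byte[,] a)
    {
        return Enumerable.Range(0, a.GetLength(0))
                         .Select(x => new
                         {
                             Index = x,
                             Count = Enumerable.Range(0, a.GetLength(1)).Count(y => a[x, y] == 1)
                         })
                         .OrderByDescending(x => x.Count)
                         .First()
                         .Index;
    }

    // Hypothetical helper: runs func `iterations` times and returns elapsed milliseconds.
    public static long TimeMs(Func<byte[,], int> func, byte[,] data, int iterations)
    {
        func(data); // warm-up call so JIT compilation is not part of the timing
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) func(data);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Builds a rows x cols array of random 0/1 values (fixed seed for repeatable data).
    public static byte[,] RandomBits(int rows, int cols, int seed)
    {
        var rng = new Random(seed);
        var data = new byte[rows, cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                data[r, c] = (byte)rng.Next(2);
        return data;
    }
}
```

You would call `TimeMs(MaxIndexTiming.WithLoop, data, n)` and `TimeMs(MaxIndexTiming.WithLinq, data, n)` on the same `data` and compare the two numbers. Run it in a Release build, and treat the results as indicative only.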

2) It is possible to use PLINQ. Just use ParallelEnumerable.Range as the row index generator:

private static int GetMaxIndex2(byte[,] TwoDArray) {
    return ParallelEnumerable.Range(0, TwoDArray.GetLength(0))
                             .Select(
                                 x => new {
                                     Index = x,
                                     Count = Enumerable.Range(0, TwoDArray.GetLength(1)).Count(y => TwoDArray[x, y] == 1)
                                 })
                             .OrderByDescending(x => x.Count)
                             .First()
                             .Index;
}

Solution 3

Here is how I would do it. It's more or less the same as the others, but without any Enumerable.Range. Not that there is anything wrong with Enumerable.Range (I use it all the time); it just makes the code more indented in this case.

This one also includes the PLINQ part. The async/await side of the TPL wouldn't be suitable here, because the work is CPU-bound and async/await is better suited to I/O-bound operations. Your code would end up executing sequentially if you used async/await rather than PLINQ: an async method runs synchronously until it reaches an await that actually has to wait, and purely synchronous functions (such as CPU-bound work) never await anything, so they just run all the way through. Basically, the first item in your list would finish before the next one even started, making the execution sequential. PLINQ explicitly starts parallel tasks and doesn't have this issue.

//arry is your 2d byte array (byte[,] arry)
var maxIndex = arry
    .Cast<byte>() //flatten the array into a row-by-row sequence of bytes
    .Select((b, i) => new //attach flat indexes before going parallel, so they follow the source order
        {
            value = b,
            index = i
        })
    .AsParallel() //make the transition to PLINQ (remove this to not use it)
    .GroupBy(g => g.index / arry.GetLength(1)) //group it by rows
    .Select(g => new
        {
            sum = g.Sum(g2 => (int)g2.value), //sum each row
            index = g.Key //the group key is the row index
        })
    .OrderByDescending(g => g.sum) //max by sum
    .Select(g => g.index) //grab the index
    .First(); //index of the row with the highest sum
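
To see why dividing the flat index by `arry.GetLength(1)` groups the elements by row: enumerating a rectangular array visits it row by row, so element `i` of the flattened sequence sits at row `i / columns` and column `i % columns`. The sketch below only illustrates that mapping; `FlatIndexDemo` and `Flatten` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlatIndexDemo
{
    // Returns (row, column, value) for every element, in the order Cast<byte>() yields them.
    public static List<(int Row, int Col, byte Value)> Flatten(byte[,] a)
    {
        int cols = a.GetLength(1);
        return a.Cast<byte>()
                .Select((b, i) => (Row: i / cols, Col: i % cols, Value: b))
                .ToList();
    }
}
```

For a 4 x 5 array, flat index 4 maps to row 0, column 4, and flat index 5 maps to row 1, column 0.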

In terms of efficiency, you would probably get better results with your for loop. The question I would ask is, which is more readable and clear?
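
If you want to keep the for loop but still spread the rows across cores, the TPL also offers `Parallel.For`, which is meant for CPU-bound loops (unlike async/await). The following is a minimal sketch under that assumption, not a tuned implementation; `ParallelRowCount` and `GetMaxIndexParallel` are hypothetical names, and whether it beats the sequential loop depends on the array size.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelRowCount
{
    // Counts the non-zero values of each row in parallel, then picks the first row with the most.
    public static int GetMaxIndexParallel(byte[,] a)
    {
        int rows = a.GetLength(0);
        int cols = a.GetLength(1);
        var counts = new int[rows];

        // Each iteration writes only to its own slot of counts, so no locking is needed.
        Parallel.For(0, rows, r =>
        {
            int count = 0;
            for (int c = 0; c < cols; c++)
                if (a[r, c] != 0) count++;
            counts[r] = count;
        });

        // Sequential pass over the per-row counts; ties go to the lowest row index.
        int best = 0;
        for (int r = 1; r < rows; r++)
            if (counts[r] > counts[best]) best = r;
        return best;
    }
}
```

The parallel part handles only the per-row counting; the final comparison stays sequential because it is cheap and keeps the tie-breaking the same as in the original loop.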

Author: Bob Bryan

Software developer for over 20 years. Interested in efficient software methodology, user requirements, design, implementation, and testing. Experienced with C#, VC++, Boost, ASIO, WPF, Blend, Sql Server, and office tools. MCSD.

Updated on June 04, 2022

Comments

  • Bob Bryan, almost 2 years ago

    I have a 2-dimensional byte array that looks something like this:

    0 0 0 0 1

    1 1 1 1 0

    0 0 1 1 1

    1 0 1 0 1

    Each value in the array can only be 0 or 1. The simplified example above shows 4 rows, each with 5 columns. I am trying to figure out how to use LINQ to return the index of the row that has the largest number of 1s set, which for the example above should be 1.

    The following non-LINQ C# code solves the problem:

    static int GetMaxIndex(byte[,] TwoDArray)
    {
       // This method finds the row with the greatest number of 1s set.
       //
       int NumRows = TwoDArray.GetLength(0);
       int NumCols = TwoDArray.GetLength(1);
       int RowCount, MaxRowCount = 0, MaxRowIndex = 0;
       //
       for (int LoopR = 0; LoopR < NumRows; LoopR++)
       {
          RowCount = 0;
          for (int LoopC = 0; LoopC < NumCols; LoopC++)
          {
             if (TwoDArray[LoopR, LoopC] != 0)
                RowCount++;
          }
          if (RowCount > MaxRowCount)
          {
             MaxRowCount = RowCount;
             MaxRowIndex = LoopR;
          }
       }
       return MaxRowIndex;
    }
    
    static void Main()
    {
       byte[,] Array2D = new byte[4, 5] { { 0, 0, 0, 0, 1 }, { 1, 1, 1, 1, 0 }, { 0, 0, 1, 1, 1 }, { 1, 0, 1, 0, 1 } };
       int MaxInd = GetMaxIndex(Array2D);
       Console.WriteLine("MaxInd = {0}", MaxInd);
    }
    

    So, my questions are:

    1. How can LINQ be used to solve this, and would using LINQ here be less efficient than using the non-LINQ code above?
    2. Is it possible to solve this problem with PLINQ? Or would it be more efficient to use the Task Parallel Library (TPL) directly for the above code and split out the count of the number of 1s in each row to a separate thread, assuming that each row has at least 1,000 columns?