What is the most efficient loop in c#

45,061

Solution 1

The answer the majority of the time is it does not matter. The number of items in the loop (even what one might consider a "large" number of items, say in the thousands) isn't going to have an impact on the code.

Of course, if you identify this as a bottleneck in your situation, by all means, address it, but you have to identify the bottleneck first.

That said, there are a number of things to take into consideration with each approach, which I'll outline here.

Let's define a few things first:

  • All of the tests were run on .NET 4.0 on a 32-bit processor.
  • TimeSpan.TicksPerSecond on my machine = 10,000,000
  • All tests were performed in separate unit test sessions, not in the same one (so as not to possibly interfere with garbage collections, etc.)

Here's some helpers that are needed for each test:

The MyObject class:

public class MyObject
{
    public int IntValue { get; set; }
    public double DoubleValue { get; set; }
}

A method to create a List<T> of any length of MyClass instances:

public static List<MyObject> CreateList(int items)
{
    // Validate parmaeters.
    if (items < 0) 
        throw new ArgumentOutOfRangeException("items", items, 
            "The items parameter must be a non-negative value.");

    // Return the items in a list.
    return Enumerable.Range(0, items).
        Select(i => new MyObject { IntValue = i, DoubleValue = i }).
        ToList();
}

An action to perform for each item in the list (needed because Method 2 uses a delegate, and a call needs to be made to something to measure impact):

public static void MyObjectAction(MyObject obj, TextWriter writer)
{
    // Validate parameters.
    Debug.Assert(obj != null);
    Debug.Assert(writer != null);

    // Write.
    writer.WriteLine("MyObject.IntValue: {0}, MyObject.DoubleValue: {1}", 
        obj.IntValue, obj.DoubleValue);
}

A method to create a TextWriter which writes to a null Stream (basically a data sink):

public static TextWriter CreateNullTextWriter()
{
    // Create a stream writer off a null stream.
    return new StreamWriter(Stream.Null);
}

And let's fix the number of items at one million (1,000,000, which should be sufficiently high to enforce that generally, these all have about the same performance impact):

// The number of items to test.
public const int ItemsToTest = 1000000;

Let's get into the methods:

Method 1: foreach

The following code:

foreach(var item in myList) 
{
   //Do stuff
}

Compiles down into the following:

using (var enumerable = myList.GetEnumerable())
while (enumerable.MoveNext())
{
    var item = enumerable.Current;

    // Do stuff.
}

There's quite a bit going on there. You have the method calls (and it may or may not be against the IEnumerator<T> or IEnumerator interfaces, as the compiler respects duck-typing in this case) and your // Do stuff is hoisted into that while structure.

Here's the test to measure the performance:

[TestMethod]
public void TestForEachKeyword()
{
    // Create the list.
    List<MyObject> list = CreateList(ItemsToTest);

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle through the items.
        foreach (var item in list)
        {
            // Write the values.
            MyObjectAction(item, writer);
        }

        // Write out the number of ticks.
        Debug.WriteLine("Foreach loop ticks: {0}", s.ElapsedTicks);
    }
}

The output:

Foreach loop ticks: 3210872841

Method 2: .ForEach method on List<T>

The code for the .ForEach method on List<T> looks something like this:

public void ForEach(Action<T> action)
{
    // Error handling omitted

    // Cycle through the items, perform action.
    for (int index = 0; index < Count; ++index)
    {
        // Perform action.
        action(this[index]);
    }
}

Note that this is functionally equivalent to Method 4, with one exception, the code that is hoisted into the for loop is passed as a delegate. This requires a dereference to get to the code that needs to be executed. While the performance of delegates has improved from .NET 3.0 on, that overhead is there.

However, it's negligible. The test to measure the performance:

[TestMethod]
public void TestForEachMethod()
{
    // Create the list.
    List<MyObject> list = CreateList(ItemsToTest);

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle through the items.
        list.ForEach(i => MyObjectAction(i, writer));

        // Write out the number of ticks.
        Debug.WriteLine("ForEach method ticks: {0}", s.ElapsedTicks);
    }
}

The output:

ForEach method ticks: 3135132204

That's actually ~7.5 seconds faster than using the foreach loop. Not completely surprising, given that it uses direct array access instead of using IEnumerable<T>.

Remember though, this translates to 0.0000075740637 seconds per item being saved. That's not worth it for small lists of items.

Method 3: while (myList.MoveNext())

As shown in Method 1, this is exactly what the compiler does (with the addition of the using statement, which is good practice). You're not gaining anything here by unwinding the code yourself that the compiler would otherwise generate.

For kicks, let's do it anyways:

[TestMethod]
public void TestEnumerator()
{
    // Create the list.
    List<MyObject> list = CreateList(ItemsToTest);

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    // Get the enumerator.
    using (IEnumerator<MyObject> enumerator = list.GetEnumerator())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle through the items.
        while (enumerator.MoveNext())
        {
            // Write.
            MyObjectAction(enumerator.Current, writer);
        }

        // Write out the number of ticks.
        Debug.WriteLine("Enumerator loop ticks: {0}", s.ElapsedTicks);
    }
}

The output:

Enumerator loop ticks: 3241289895

Method 4: for

In this particular case, you're going to gain some speed, as the list indexer is going directly to the underlying array to perform the lookup (that's an implementation detail, BTW, there's nothing to say that it can't be a tree structure backing the List<T> up).

[TestMethod]
public void TestListIndexer()
{
    // Create the list.
    List<MyObject> list = CreateList(ItemsToTest);

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle by index.
        for (int i = 0; i < list.Count; ++i)
        {
            // Get the item.
            MyObject item = list[i];

            // Perform the action.
            MyObjectAction(item, writer);
        }

        // Write out the number of ticks.
        Debug.WriteLine("List indexer loop ticks: {0}", s.ElapsedTicks);
    }
}

The output:

List indexer loop ticks: 3039649305

However the place where this can make a difference is arrays. Arrays can be unwound by the compiler to process multiple items at a time.

Instead of doing ten iterations of one item in a ten item loop, the compiler can unwind this into five iterations of two items in a ten item loop.

However, I'm not positive here that this is actually happening (I have to look at the IL and the output of the compiled IL).

Here's the test:

[TestMethod]
public void TestArray()
{
    // Create the list.
    MyObject[] array = CreateList(ItemsToTest).ToArray();

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle by index.
        for (int i = 0; i < array.Length; ++i)
        {
            // Get the item.
            MyObject item = array[i];

            // Perform the action.
            MyObjectAction(item, writer);
        }

        // Write out the number of ticks.
        Debug.WriteLine("Enumerator loop ticks: {0}", s.ElapsedTicks);
    }
}

The output:

Array loop ticks: 3102911316

It should be noted that out-of-the box, Resharper offers a suggestion with a refactoring to change the above for statements to foreach statements. That's not to say this is right, but the basis is to reduce the amount of technical debt in code.


TL;DR

You really shouldn't be concerned with the performance of these things, unless testing in your situation shows that you have a real bottleneck (and you'll have to have massive numbers of items to have an impact).

Generally, you should go for what's most maintainable, in which case, Method 1 (foreach) is the way to go.

Solution 2

In regards to the final bit of the question, "Did I miss any?" Yes, and I feel I would be remiss to not mention this even though the question is quite old. While those four ways of doing it will execute in relatively the same amount of time, there is a way not shown above that runs faster than all of them. Quite significantly, in fact, as the number of items in the iterated list increases. It would be the exact same way as the last method but instead of getting .Count in the condition check of the loop, you assign this value to a variable before setting up the loop and use that instead. Which leaves you with something like this:

var countVar = list.Count;
for(int i = 0; i < countVar; i++)
{
 //loop logic
}

By doing it this way, you're only looking up a variable value at each iteration, rather than resolving the Count or Length properties, which is considerably less efficient.

Share:
45,061
TheAlbear
Author by

TheAlbear

Updated on December 28, 2021

Comments

  • TheAlbear
    TheAlbear over 2 years

    There are a number of different way to accomplish the same simple loop though the items of an object in c#.

    This has made me wonder if there is any reason be it performance or ease of use, as to use on over the other. Or is it just down to personal preference.

    Take a simple object

    var myList = List<MyObject>; 
    

    Lets assume the object is filled and we want to iterate over the items.

    Method 1.

    foreach(var item in myList) 
    {
       //Do stuff
    }
    

    Method 2

    myList.Foreach(ml => 
    {
       //Do stuff
    });
    

    Method 3

    while (myList.MoveNext()) 
    {
      //Do stuff
    }
    

    Method 4

    for (int i = 0; i < myList.Count; i++)
    {
      //Do stuff   
    }
    

    What I was wondering is do each of these compiled down to the same thing? is there a clear performance advantage for using one over the others?

    or is this just down to personal preference when coding?

    Have I missed any?