Starting Tasks In foreach Loop Uses Value of Last Item

28,141

Solution 1

You're closing over the loop variable. Don't do that. Take a copy instead:

foreach (string path in paths)
{
    string pathCopy = path;
    var task = Task.Factory.StartNew(() =>
        {
            Boolean taskResult = ProcessPicture(pathCopy);
            return taskResult;
        });
    // See note at end of post
    task.ContinueWith(t => result &= t.Result);
    tasks.Add(task);
}

Your current code is capturing path - not the value of it when you create the task, but the variable itself. That variable changes value each time you go through the loop - so it can easily change by the time your delegate is called.

By taking a copy of the variable, you're introducing a new variable each time you go through the loop - when you capture that variable, it won't be changed in the next iteration of the loop.

Eric Lippert has a pair of blog posts which go into this in a lot more detail: part 1; part 2.

Don't feel bad - this catches almost everyone out :(


Note about this line:

task.ContinueWith(t => result &= t.Result);

As pointed out in comments, this isn't thread-safe. Multiple threads could execute it at the same time, potentially stamping on each other's results. I haven't added locking or anything similar as it would distract from the main issue that the question is interested, namely variable capture. However, it's worth being aware of.

Solution 2

The lambda that you're passing to StartNew is referencing the path variable, which changes on each iteration (i.e. your lambda is making use of the reference of path, rather than just its value). You can create a local copy of it so that you aren't pointing to a version that will change:

foreach (string path in paths)
{
    var lambdaPath = path;
    var task = Task.Factory.StartNew(() =>
        {
            Boolean taskResult = ProcessPicture(lambdaPath);
            return taskResult;
        });
    task.ContinueWith(t => result &= t.Result);
    tasks.Add(task);
}
Share:
28,141
Wonko the Sane
Author by

Wonko the Sane

Just another geek hoping to inherit the Earth. #SOreadytohelp

Updated on July 09, 2022

Comments

  • Wonko the Sane
    Wonko the Sane almost 2 years

    I am making a first attempt at playing with the new Tasks, but something is happening that I don't understand.

    First, the code, which is pretty straight-forward. I pass in a list of paths to some image files, and attempt to add a task to process each of them:

    public Boolean AddPictures(IList<string> paths)
    {
        Boolean result = (paths.Count > 0);
        List<Task> tasks = new List<Task>(paths.Count);
    
        foreach (string path in paths)
        {
            var task = Task.Factory.StartNew(() =>
                {
                    Boolean taskResult = ProcessPicture(path);
                    return taskResult;
                });
            task.ContinueWith(t => result &= t.Result);
            tasks.Add(task);
        }
    
        Task.WaitAll(tasks.ToArray());
    
        return result;
    }
    

    I've found that if I just let this run with, say, a list of 3 paths in a unit test, all three tasks use the last path in the provided list. If I step through (and slow down the processing of the loop), each path from the loop is used.

    Can somebody please explain what is happening, and why? Possible workarounds?

  • Wonko the Sane
    Wonko the Sane over 13 years
    But of course. Forest for the trees and all that. :)
  • BrokenGlass
    BrokenGlass over 13 years
    this closure problem and in-proper use of Random() must be in the top 5 frequency-wise at SO
  • Louis Kottmann
    Louis Kottmann about 12 years
    Please note that this "bug" (which was originally by design) is supposed to be fixed in C#5.0
  • Snixtor
    Snixtor about 11 years
    This behaviour has changed, not only in C#5.0 (as noted in the update on Eric Lipperts blog article), but also in VS2012 if you're targeting 4.0.
  • Jon Skeet
    Jon Skeet about 11 years
    @Snixtor: That's still the C# 5 compiler. It's important to distinguish between the language version you're using, and the framework version you're targeting.
  • Francesco Bonizzi
    Francesco Bonizzi about 8 years
    What happens if, for example, the iterating list is a List<DataRow> ?
  • Jon Skeet
    Jon Skeet about 8 years
    @FrancescoB.: The same thing - in C# 4, the variable would refer to "the last item referred to".
  • Servy
    Servy over 5 years
    This code has a race condition in that you're aggregating the results into the result variable from multiple threads without proper synchronization.
  • Jon Skeet
    Jon Skeet over 5 years
    @Servy: Thanks - I've added a note about this. Let me know if you think it's appropriate for the context.
  • Servy
    Servy over 5 years
    Yeah, the note is fine. I only bothered to mention it because someone posted a question today in which they were just copying the code here. I'm aware that the issue was in the original question, and that it's not related to what was being asked about.
  • Mehdi Dehghani
    Mehdi Dehghani about 5 years
    That copy note saved my life! it's so simple (and maybe stupid at first) but I could not figure that out by myself, even if I thought for 100 years!
  • Jon Skeet
    Jon Skeet about 5 years
    @MehdiDehghani: Note that this was back in C# 4 days. C# 5 fixed the way that foreach variables are handled in closures, so most of the time you don't need to think about this copying. (You do for a for variable though.)
  • Deepak
    Deepak over 3 years
    I did this cache thing before reading any answers and it worked, but still was afraid of how I did solve it myself. Thanks for the ans ;)