C# - Performance comparison of ConcurrentBag vs List

11,438

Solution 1

ConcurrentBag<T> will inevitably be less performant than List<T>. Although you will only be accessing it from a single thread, the structure still needs to have mechanisms in place to protect against the possibility of race hazards should concurrent access arise.

If you will be loading the collection from a single thread before starting your enumerations, you can avoid the performance overhead by using the ConcurrentBag(IEnumerable<T>) constructor, rather than adding each item individually through its Add method.

ConcurrentBag<T> provides “moment-in-time snapshot” semantics for enumerations; see the remarks for its GetEnumerator method. When you access ConcurrentBag<T> from a foreach loop, it will first copy its entire contents into a plain List<T>, then enumerate over that. This will incur a substantial performance overhead (both computation- and memory-wise) each time you use it in a loop.

If your scenario is that your list will be populated by multiple threads, but then only read by one thread, then you should convert it to a List<T> as soon as the writers are done.

Solution 2

Billions of items and List or Concurrent bag? That is a "no go".

As far as performance goes try this to test adding: (feel free to modify this to test other operations)

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace ConcurrentBagTest
{
    // You must compile this for x64 or you will get OutOfMemory exception
    class Program
    {
        static void Main(string[] args)
        {
            ListTest(10000000);
            ListTest(100000000);
            ListTest(1000000000);
            ConcurrentBagTest(10000000);
            ConcurrentBagTest(100000000);

            Console.ReadKey();

        }


        static void ConcurrentBagTest(long count)
        {
            try
            {
                var bag = new ConcurrentBag<long>();
                Console.WriteLine($"--- ConcurrentBagTest count = {count}");
                Console.WriteLine($"I will use {(count * sizeof(long)) / Math.Pow(1024, 2)} MiB of RAM");
                Stopwatch stopwatch = new Stopwatch();
                stopwatch.Start();
                for (long i = 0; i < count; i++)
                {
                    bag.Add(i);
                }
                stopwatch.Stop();
                Console.WriteLine($"Inserted {bag.LongCount()} items in {stopwatch.Elapsed.TotalSeconds} s");
                Console.WriteLine();
                Console.WriteLine();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }

        static void ListTest(long count)
        {
            try
            {
                var list = new List<long>();
                Console.WriteLine($"--- ListTest count = {count}");
                Console.WriteLine($"I will use {(count * sizeof(long)) / Math.Pow(1024, 2)} MiB of RAM");
                Stopwatch stopwatch = new Stopwatch();
                stopwatch.Start();
                for (long i = 0; i < count; i++)
                {
                    list.Add(i);
                }
                stopwatch.Stop();
                Console.WriteLine($"Inserted {list.LongCount()} items in {stopwatch.Elapsed.TotalSeconds} s");
                Console.WriteLine();
                Console.WriteLine();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
    }
}

My Output:

--- ListTest count = 10000000
I will use 76,2939453125 MiB of RAM
Inserted 10000000 items in 0,0807315 s


--- ListTest count = 100000000
I will use 762,939453125 MiB of RAM
Inserted 100000000 items in 0,7741546 s


--- ListTest count = 1000000000
I will use 7629,39453125 MiB of RAM
System.OutOfMemoryException: Array dimensions exceeded supported range.

--- ConcurrentBagTest count = 10000000
I will use 76,2939453125 MiB of RAM
Inserted 10000000 items in 1,0744069 s


--- ConcurrentBagTest count = 100000000
I will use 762,939453125 MiB of RAM
Inserted 100000000 items in 11,3976436 s

Using CPU: Intel Core i7-2600 @ 3.4 GHz,

Using RAM: 16 GB

Also take a look at this answer for limitations.

Solution 3

But, if you need to remove items, the ConcurrentBag is SIGNIFICANTLY faster than the List

void Main()
{
    ConcurrentBag<int> bag = new ConcurrentBag<int>();
    ConcurrentStack<int> stack = new ConcurrentStack<int>();
    ConcurrentQueue<int> q = new ConcurrentQueue<int>();
    List<int> list = new List<int>();

    Stopwatch sw = new Stopwatch();
    int count = 100000;
    sw.Start();
    for (int i = 0; i < count; i++)
    {
        bag.Add(i);
    }
    for (int i = 0; i< count; i++)
    {
        bag.TryTake(out _);
    }
    sw.Elapsed.Dump("BAG");
    sw.Start();
    for (int i = 0; i < count; i++)
    {
        stack.Push(i);
    }
    for (int i = 0; i < count; i++)
    {
        stack.TryPop(out _);
    }
    sw.Elapsed.Dump("Stack");
    sw.Start();
    for (int i = 0; i < count; i++)
    {
        q.Enqueue(i);
    }
    for (int i = 0; i < count; i++)
    {
        q.TryDequeue(out _);
    }
    sw.Elapsed.Dump("Q");
    sw.Start();
    for (int i = 0; i < count; i++)
    {
        list.Add(i);
    }
    for (int i = 0; i < count; i++)
    {
        list.RemoveAt(0);
    }
    sw.Elapsed.Dump("list remove at 0");
    sw.Start();
    for (int i = 0; i < count; i++)
    {
        list.Add(i);
    }
    for (int i = 0; i < count; i++)
    {
        list.RemoveAt(list.Count -1);
    }
    sw.Elapsed.Dump("list remove at end");
}

Results:

BAG 00:00:00.0144421

Stack 00:00:00.0341379

Q 00:00:00.0400114

list remove at 0 00:00:00.6188329

list remove at end 00:00:00.6202170

Share:
11,438
Leonardo
Author by

Leonardo

Developer, Architect, Tech Lover!

Updated on July 23, 2022

Comments

  • Leonardo
    Leonardo almost 2 years

    Preface: I'm only asking this because I don't have an environment (dataset large enough + computing power) to test it in a reliable fashion.

    Question: Given a Concurrent Bag, loaded with billions of items, being accessed/used by a single thread, does it perform similar to a List? Putting in another words, is the enumeration over a Concurrent Bag any more or less performatic than over a List<T>?