IEnumerable.Count() or ToList().Count


Solution 1

You asked:

I wonder, what would be faster.

Whenever you ask that, you should actually time it and find out.

I set out to test all of these variants of obtaining a count:

var enumerable = Enumerable.Range(0, 1000000);
var list = enumerable.ToList();

var methods = new Func<int>[]
{
    () => list.Count,
    () => enumerable.Count(),
    () => list.Count(),
    () => enumerable.ToList().Count(),
    () => list.ToList().Count(),
    () => enumerable.Select(x => x).Count(),
    () => list.Select(x => x).Count(),
    () => enumerable.Select(x => x).ToList().Count(),
    () => list.Select(x => x).ToList().Count(),
    () => enumerable.Where(x => x % 2 == 0).Count(),
    () => list.Where(x => x % 2 == 0).Count(),
    () => enumerable.Where(x => x % 2 == 0).ToList().Count(),
    () => list.Where(x => x % 2 == 0).ToList().Count(),
};

My testing code explicitly runs each method 1,000 times, measures each execution time with a Stopwatch, and ignores all results where garbage collection occurred. It then gets an average execution time per method.

var measurements =
    methods
        .Select((m, i) => i)
        .ToDictionary(i => i, i => new List<double>());

for (var run = 0; run < 1000; run++)
{
    for (var i = 0; i < methods.Length; i++)
    {
        var sw = Stopwatch.StartNew();
        var gccc0 = GC.CollectionCount(0);
        var r = methods[i]();
        var gccc1 = GC.CollectionCount(0);
        sw.Stop();
        if (gccc1 == gccc0)
        {
            measurements[i].Add(sw.Elapsed.TotalMilliseconds);
        }
    }
}

var results =
    measurements
        .Select(x => new
        {
            index = x.Key,
            count = x.Value.Count(),
            average = x.Value.Average().ToString("0.000")
        });

Here are the results, as the average time per call in milliseconds (ordered from slowest to fastest):

+---------+-----------------------------------------------------------+
| average |                          method                           |
+---------+-----------------------------------------------------------+
| 14.879  | () => enumerable.Select(x => x).ToList().Count(),         |
| 14.188  | () => list.Select(x => x).ToList().Count(),               |
| 10.849  | () => enumerable.Where(x => x % 2 == 0).ToList().Count(), |
| 10.080  | () => enumerable.ToList().Count(),                        |
| 9.562   | () => enumerable.Select(x => x).Count(),                  |
| 8.799   | () => list.Where(x => x % 2 == 0).ToList().Count(),       |
| 8.350   | () => enumerable.Where(x => x % 2 == 0).Count(),          |
| 8.046   | () => list.Select(x => x).Count(),                        |
| 5.910   | () => list.Where(x => x % 2 == 0).Count(),                |
| 4.085   | () => enumerable.Count(),                                 |
| 1.133   | () => list.ToList().Count(),                              |
| 0.000   | () => list.Count,                                         |
| 0.000   | () => list.Count(),                                       |
+---------+-----------------------------------------------------------+

Two significant things stand out here.

One, any method with a .ToList() inline is significantly slower than the equivalent without it.

Two, LINQ operators take advantage of the underlying type of the enumerable, where possible, to short-cut computations. The enumerable.Count() and list.Count() methods show this.
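To illustrate the kind of short-cut meant here, this is a simplified sketch of what a Count()-style method can do. It is not the framework source: CountSketched and Lazy are hypothetical names made up for the example, and the real implementation handles more cases.

```csharp
using System;
using System.Collections.Generic;

public static class CountSketch
{
    // Hypothetical helper: a simplified picture of a Count() that
    // avoids enumerating when the source already knows its size.
    public static int CountSketched<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // A List<T> (or any ICollection<T>) tracks its own size,
        // so the answer is available without iterating.
        if (source is ICollection<T> collection)
            return collection.Count;

        // Anything else has to be walked element by element.
        var count = 0;
        foreach (var item in source)
            count++;
        return count;
    }

    // Hypothetical lazy sequence that is not an ICollection<T>.
    public static IEnumerable<int> Lazy(int n)
    {
        for (var i = 0; i < n; i++)
            yield return i;
    }
}
```

Calling CountSketch.CountSketched(new List<int> { 1, 2, 3 }) takes the first branch, while CountSketch.CountSketched(CountSketch.Lazy(5)) has to loop.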

There is no difference between the list.Count and list.Count() calls. So the key comparison is between the enumerable.Where(x => x % 2 == 0).Count() and enumerable.Where(x => x % 2 == 0).ToList().Count() calls. Since the latter contains an extra operation we would expect it to take longer. It's almost 2.5 milliseconds longer.

I don't know why you say that you're going to call the counting code twice, but if you do, it is better to build the list. If not, just do the plain .Count() call after your query.
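Why building the list helps when the same query is counted twice can be seen by counting how often the predicate runs. This is only a small sketch; DeferredCountDemo and its variable names are made up for the illustration.

```csharp
using System;
using System.Linq;

public static class DeferredCountDemo
{
    // Returns how many times the predicate ran when the same query
    // is counted twice lazily, and when it is materialized first.
    public static (int lazyCalls, int listCalls) Run()
    {
        var source = Enumerable.Range(0, 10).ToList();
        var calls = 0;
        Func<int, bool> isEven = x => { calls++; return x % 2 == 0; };

        // Where is lazy: each Count() re-runs the filter over the source.
        var query = source.Where(isEven);
        _ = query.Count();
        _ = query.Count();
        var lazyCalls = calls;

        // ToList runs the filter once; list.Count just reads a property.
        calls = 0;
        var list = source.Where(isEven).ToList();
        _ = list.Count;
        _ = list.Count;
        var listCalls = calls;

        return (lazyCalls, listCalls);
    }
}
```

With ten source elements the lazy version evaluates the predicate 20 times and the materialized version 10 times. For a single count there is nothing to reuse, so the list only adds an allocation.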

Solution 2

Generally, materializing to a list will be less efficient.

Additionally, if you are using two conditions, there is no point in caching the result or materializing the query to a List.

You should just use the overload of Count which accepts a predicate:

collection.Count(somecondition);

As @CodeCaster mentions in the comments, it is equivalent to collection.Where(condition).Count(), but is more readable and concise.
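As a rough illustration of the two forms side by side, assuming a cut-down stand-in for the asker's class (Fund and PredicateCountDemo are hypothetical names, and only FundKey is modelled):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's class; only FundKey matters here.
public class Fund
{
    public string FundKey { get; set; }
}

public static class PredicateCountDemo
{
    public static (int viaWhere, int viaPredicate) CountKey(IEnumerable<Fund> funds, string key)
    {
        // Filter first, then count the filtered sequence.
        var viaWhere = funds.Where(f => f.FundKey == key).Count();

        // Same result in one call, using the predicate overload.
        var viaPredicate = funds.Count(f => f.FundKey == key);

        return (viaWhere, viaPredicate);
    }
}
```

Counting against two different keys is then just two calls with different arguments, for example CountKey(funds, "A") and CountKey(funds, "B"). Nothing from the first call can be reused by the second, which is why caching in a list buys nothing here.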

Solution 3

Using it exactly this way

var count = collection.Where(somecondition).ToList().Count;

doesn't make sense: it populates a list just to get the count, so using IEnumerable<T>.Count() is the appropriate way in this case.

Using ToList would make sense if you do something like this:

var list = collection.Where(somecondition).ToList();
var count = list.Count;
// do something else with the list
Author by

Paweł Mikołajczyk

Student of Computer Science and Junior Developer.

Updated on July 21, 2022

Comments

  • Paweł Mikołajczyk almost 2 years

    I have a List of objects of my own class, which looks like this:

    public class IFFundTypeFilter_ib
    {
        public string FundKey { get; set; }
        public string FundValue { get; set; }
        public bool IsDisabled { get; set; }
    }
    

    The property IsDisabled is set by running the query collection.Where(some condition) and counting the number of matching objects. The result is an IEnumerable<IFFundTypeFilter_ib>, which does not have a Count property. I wonder, what would be faster.

    This one:

    collection.Where(somecondition).Count();
    

    or this one:

    collection.Where(somecondition).ToList().Count;
    

    The collection could contain just a few objects, but it could also contain, for example, 700. I am going to make the counting call two times, with different conditions. In the first condition I check whether FundKey equals some key, and in the second I do the same but compare it with a different key value.

    • CodeCaster over 8 years
      That depends entirely on what collection actually is. You can also just not count twice, but store that number.
    • Paweł Mikołajczyk over 8 years
      I can't, because at the second call I will check a different condition.
    • Sriram Sakthivel over 8 years
      @CodeCaster The type of collection doesn't matter here. I guess you overlooked the Where condition in the question.
    • Bauss over 8 years
      It would be better just to loop through and have a counter that you increment every time the condition is met.
    • Dennis over 8 years
      Where produces a lazy enumerable, regardless of the type of the source collection. ToList in this case must iterate through that enumerable. Comparing these lines, the second one is slower. But it isn't clear to me what the OP means by " I am going to make counting call two times", and especially by "I can't because at second call I will check other condition". @PawełMikołajczyk: could you post what exactly you are going to do?
    • CodeCaster over 8 years
      @Bauss no, that's exactly what Count() does.
    • user1703401 over 8 years
      Well, what is faster when you count the dollar bills in your pocket? A: counting them off one by one, or B: buying a new wallet, making a photocopy of every bill, putting the copies in the new wallet and then counting them?
  • CodeCaster over 8 years
    Where(x).Count() does the same as Count(x).
  • Rotem over 8 years
    @CodeCaster What happens above the water also matters :)
  • Rotem over 8 years
    @CodeCaster I mean that while the statements are equivalent in performance, the latter is more readable and concise.
  • Ivan Stoev over 8 years
    I would disagree with @CodeCaster. They are not equivalent in performance either (otherwise why bother providing the overload) - Where(x).Count() involves creating and chaining 2 enumerators, while Count(x) needs only one.
  • CodeCaster over 8 years
    @Ivan if it is on a collection from an ORM like Entity Framework, it will be translated to the same SQL. The overhead of instantiating another enumerator is negligible. As always with premature micro-optimizations: don't bother until it proves to be a problem; go for the most readable code.
  • Ivan Stoev over 8 years
    @CodeCaster Fully agree for IQueryable, and partially for IEnumerable and micro-optimizations. Some people even complain that IEnumerator needs 2 virtual calls (MoveNext and Current) to achieve the goal :-) But seriously, I don't see any other reason for providing all these predicate overloads for Count, LongCount and Any.
  • ZarNi Myo Sett Win almost 5 years
    Are the average times in seconds or milliseconds? Thanks.
  • Enigmativity almost 5 years
    It's sw.Elapsed.TotalMilliseconds.