Empty HashSet - Count vs Any

Solution 1

On a HashSet you can use both, since HashSet internally manages the count.

However, if your data is in an IEnumerable<T> or IQueryable<T> object, using result.Any() is preferable over result.Count() (both are LINQ methods).

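LINQ's .Count() will iterate through the whole enumerable (unless the source implements ICollection<T>, as HashSet<T> does, in which case it simply returns the collection's Count property), whereas .Any() only checks whether at least one element exists.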

Update: Just a small addition: in your case with the HashSet, .Count may be preferable, as .Any() would require an IEnumerator to be created and returned, which is a small overhead if you are not going to use the enumerator anywhere else in your code (foreach, LINQ, etc.). But I think that would be considered a micro-optimization.
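To make the difference concrete, here is a minimal sketch that counts how many elements a lazy sequence actually produces under each call. The class EmptinessDemo and the helper LazyNumbers are hypothetical names made up for this illustration; the only library calls are the standard Enumerable.Count() and Enumerable.Any().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptinessDemo
{
    // Counts how many elements the lazy sequence below has actually produced.
    private static int _produced;

    // Hypothetical lazy source: it is not an ICollection<T>,
    // so LINQ has no stored count to fall back on.
    private static IEnumerable<int> LazyNumbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            _produced++;
            yield return i;
        }
    }

    // Returns how many elements were produced while evaluating Count().
    public static int ProducedByCount(int n)
    {
        _produced = 0;
        LazyNumbers(n).Count();   // has to walk the whole sequence
        return _produced;
    }

    // Returns how many elements were produced while evaluating Any().
    public static int ProducedByAny(int n)
    {
        _produced = 0;
        LazyNumbers(n).Any();     // stops after the first element
        return _produced;
    }

    public static void Main()
    {
        var hs = new HashSet<int> { 1, 2, 3 };

        // On a HashSet<T> both checks are cheap; Count is a plain property read.
        Console.WriteLine(hs.Count == 0);           // False
        Console.WriteLine(!hs.Any());               // False

        Console.WriteLine(ProducedByCount(1000));   // 1000
        Console.WriteLine(ProducedByAny(1000));     // 1
    }
}
```

So for a HashSet the choice is mostly about readability, while for a lazily evaluated sequence Any() avoids enumerating everything.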

Solution 2

HashSet<T> implements ICollection<T>, which has a Count property, so a call to Count() will just call HashSet<T>.Count, which I'm assuming is an O(1) operation (meaning it doesn't actually have to count - it just returns the current size of the HashSet).
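One way to check this yourself is sketched below. NoEnumerationSet<T> is a hypothetical wrapper written only for this demonstration (it is not part of .NET): it exposes a HashSet<T> through ICollection<T> but throws if anything tries to enumerate it. Since the documented behaviour of Enumerable.Count() is to use the ICollection<T> implementation when the source provides one, the call should succeed without ever requesting an enumerator.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper for illustration only: behaves like a set,
// but refuses to be enumerated.
public sealed class NoEnumerationSet<T> : ICollection<T>
{
    private readonly HashSet<T> _inner;

    public NoEnumerationSet(IEnumerable<T> items)
    {
        _inner = new HashSet<T>(items);
    }

    public int Count => _inner.Count;   // forwards to HashSet<T>.Count
    public bool IsReadOnly => false;
    public void Add(T item) => _inner.Add(item);
    public void Clear() => _inner.Clear();
    public bool Contains(T item) => _inner.Contains(item);
    public void CopyTo(T[] array, int arrayIndex) => _inner.CopyTo(array, arrayIndex);
    public bool Remove(T item) => _inner.Remove(item);

    // Any attempt to iterate fails loudly.
    public IEnumerator<T> GetEnumerator() =>
        throw new InvalidOperationException("enumeration was attempted");

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CountShortcutDemo
{
    // Calls the LINQ extension method, not the Count property directly.
    public static int LinqCount(IEnumerable<int> source) => source.Count();

    public static void Main()
    {
        var set = new NoEnumerationSet<int>(new[] { 1, 2, 3 });

        // No exception: Count() read the Count property instead of iterating.
        Console.WriteLine(LinqCount(set));   // 3
    }
}
```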

Any will iterate until it finds an item that matches the condition, then stop.

So in your case, it will just iterate one item, then stop, so the difference will probably be negligible.

If you had a filter that you wanted to apply (e.g. x => x.IsValid) then Any would definitely be faster, since Count(x => x.IsValid) would iterate over the entire collection, while Any would stop as soon as it finds a match.
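A rough sketch of the filtered case, counting how often the predicate runs. PredicateDemo and its method are hypothetical names for this example, and x > 0 stands in for something like x.IsValid; every element matches, so the result does not depend on the HashSet's enumeration order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredicateDemo
{
    // Returns how many times the predicate was invoked by Any(...) and by Count(...).
    public static (int anyCalls, int countCalls) CountPredicateCalls(
        HashSet<int> items, Func<int, bool> isValid)
    {
        int anyCalls = 0;
        int countCalls = 0;

        // Any(predicate) stops at the first element that matches.
        items.Any(x => { anyCalls++; return isValid(x); });

        // Count(predicate) must test every element to get the total.
        items.Count(x => { countCalls++; return isValid(x); });

        return (anyCalls, countCalls);
    }

    public static void Main()
    {
        var hs = new HashSet<int> { 1, 2, 3, 4, 5 };

        var (anyCalls, countCalls) = CountPredicateCalls(hs, x => x > 0);

        Console.WriteLine(anyCalls);     // 1
        Console.WriteLine(countCalls);   // 5
    }
}
```

If no element matches, Any has to test every element too, so the saving only shows up when a match exists.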

For those reasons I generally prefer to use Any() rather than Count()==0 since it's more direct and avoids any potential performance problems. I would only switch to Count()==0 if it provided a significant performance boost over Any().

Note that Any(x=>true) is logically the same as calling Any(). That doesn't change your question, but it looks cleaner without the lambda.

Author by

Andy

Updated on June 04, 2022

Comments

  • Andy
    Andy almost 2 years

    I am only interested to know whether a HashSet hs is empty or not. I am NOT interested to know exactly how many elements it contains.

    So I could use this:

    bool isEmpty = (hs.Count == 0);
    

    ...or this:

    bool isEmpty = hs.Any(x=>true);
    

    Which one provides better results, performance-wise (especially when the HashSet contains a large number of elements)?

  • Tseng
    Tseng over 10 years
    No. The parameterless .Any() won't iterate until it stops. .Any(x => x) will, because it has to evaluate the contents. All the parameterless overload has to do is get the enumerator, call enumerator.MoveNext() and return its result to find out if there are any results. .Any(x => x) will have to check the actual contents.
  • Servy
    Servy over 10 years
    @Tseng .Any(x=>true) will evaluate the first item, find that it resolves to true, and then evaluate no more. It's a negligible amount more work than just .Any(). The primary advantage of using the parameterless overload is just that it's faster to type and (arguably) a bit clearer to the reader.
  • Tseng
    Tseng over 10 years
    You're right. I overlooked the fact that it returns a constant.
  • mjwills
    mjwills over 5 years
    "Linq's .Count() will iterate through the whole Enumerable": it won't with a HashSet.