Empty HashSet - Count vs Any
Solution 1
On a HashSet you can use both, since HashSet internally manages the count.
However, if your data is in a general IEnumerable<T> or IQueryable<T> object, using result.Any() is preferable over result.Count() (both are LINQ methods).
For a sequence that does not already know its size, LINQ's .Count() will iterate through the whole enumerable, while .Any() only checks whether at least one element exists. (For sources that implement ICollection<T>, such as HashSet<T>, .Count() returns the stored count without iterating.)
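To make the difference on a lazy sequence concrete, here is a minimal sketch. The class LazyCountVsAny and its Numbers iterator are hypothetical names invented for this illustration; the iterator counts how many items it is asked to produce.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LazyCountVsAny
{
    // Number of items the iterator below has produced so far.
    public static int Produced;

    // Hypothetical lazy sequence: yields 0..n-1 and records each item it produces.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // How many items are pulled from the sequence when asking Any().
    public static int ItemsPulledByAny(int n)
    {
        Produced = 0;
        _ = Numbers(n).Any();
        return Produced;
    }

    // How many items are pulled from the sequence when asking Count() > 0.
    public static int ItemsPulledByCount(int n)
    {
        Produced = 0;
        _ = Numbers(n).Count() > 0;
        return Produced;
    }
}
```

With n = 1000, ItemsPulledByAny should report a single item, while ItemsPulledByCount should report all 1000, because the iterator does not implement ICollection<T> and so has no stored count to return.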
Update:
Just a small addition: in your case with the HashSet, .Count may be preferable, as .Any() would require an IEnumerator to be created and returned, which is a small overhead if you are not going to use the enumerator anywhere else in your code (foreach, LINQ, etc.). But I think that would be considered "micro-optimization".
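If you want one emptiness check that works for both cases, something like the following sketch may help. EmptinessCheck.IsEmpty is a hypothetical helper, not part of the framework: it uses the stored Count when the source is an ICollection<T> and falls back to Any() otherwise.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EmptinessCheck
{
    // Hypothetical helper: prefers the stored Count, falls back to Any().
    public static bool IsEmpty<T>(IEnumerable<T> source)
    {
        // HashSet<T>, List<T>, arrays, etc. already know their size.
        if (source is ICollection<T> collection)
        {
            return collection.Count == 0;
        }

        // Lazy sequences: Any() stops after checking for a first element.
        return !source.Any();
    }
}
```

For example, EmptinessCheck.IsEmpty(new HashSet<int>()) takes the Count branch, while a sequence produced by a yield-based iterator takes the Any() branch.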
Solution 2
HashSet<T> implements ICollection<T>, which has a Count property, so a call to Count() will just call HashSet<T>.Count, which I'm assuming is an O(1) operation (meaning it doesn't actually have to count; it just returns the current size of the HashSet).
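One way to check that Count() takes the ICollection<T> shortcut is a collection that refuses to be enumerated. The sketch below is only an illustration: NonEnumerableCollection and CountShortcutDemo are hypothetical types written for this purpose, not anything from the framework.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection for this demonstration only: it reports a size
// through ICollection<T>.Count but throws if anything enumerates it.
public sealed class NonEnumerableCollection : ICollection<int>
{
    private readonly int _size;

    public NonEnumerableCollection(int size) { _size = size; }

    public int Count => _size;
    public bool IsReadOnly => true;

    public IEnumerator<int> GetEnumerator() =>
        throw new InvalidOperationException("The collection was enumerated.");

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // The remaining members are not needed for the demonstration.
    public void Add(int item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Contains(int item) => throw new NotSupportedException();
    public void CopyTo(int[] array, int arrayIndex) => throw new NotSupportedException();
    public bool Remove(int item) => throw new NotSupportedException();
}

public static class CountShortcutDemo
{
    // Calls the LINQ Count() extension method, not the Count property.
    public static int LinqCount(int size)
    {
        IEnumerable<int> source = new NonEnumerableCollection(size);
        return source.Count();
    }
}
```

CountShortcutDemo.LinqCount(5) returns 5 without hitting the throwing enumerator, which is consistent with Count() reading the Count property for ICollection<T> sources.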
Any will iterate until it finds an item that matches the condition, then stop. So in your case, it will look at just one item and then stop, so the difference will probably be negligible.
If you had a filter that you wanted to apply (e.g. x => x.IsValid), then Any would definitely be faster, since Count(x => x.IsValid) would iterate over the entire collection, while Any would stop as soon as it finds a match.
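The effect of a filter can be sketched by counting how often the predicate runs. PredicateCountVsAny and its method names are hypothetical, and the isValid delegate stands in for a check such as x => x.IsValid.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredicateCountVsAny
{
    // Number of predicate calls made by Any(predicate).
    public static int EvaluationsForAny(IEnumerable<int> items, Func<int, bool> isValid)
    {
        int evaluations = 0;
        _ = items.Any(x => { evaluations++; return isValid(x); });
        return evaluations;
    }

    // Number of predicate calls made by Count(predicate) > 0.
    public static int EvaluationsForCount(IEnumerable<int> items, Func<int, bool> isValid)
    {
        int evaluations = 0;
        _ = items.Count(x => { evaluations++; return isValid(x); }) > 0;
        return evaluations;
    }
}
```

For Enumerable.Range(1, 100) with the predicate x => x % 2 == 0, Any stops at the second element (two predicate calls), while Count has to evaluate all 100 elements.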
For those reasons I generally prefer to use Any() rather than Count()==0, since it's more direct and avoids any potential performance problems. I would only switch to Count()==0 if it provided a significant performance boost over Any().
Note that Any(x=>true) is logically the same as calling Any(). That doesn't change your question, but it looks cleaner without the lambda.
Andy
Updated on June 04, 2022
Comments
-
Andy almost 2 years
I am only interested to know whether a HashSet hs is empty or not. I am NOT interested to know exactly how many elements it contains. So I could use this:
bool isEmpty = (hs.Count == 0);
...or this:
bool isEmpty = hs.Any(x=>true);
Which one provides better results, performance-wise (especially when the HashSet contains a large number of elements)?
-
Tseng over 10 years
No. Parameterless .Any() won't iterate until it stops. .Any(x => x) will, because it has to evaluate the contents. All the parameterless overload has to do is obtain the enumerator, call enumerator.MoveNext(), and return its result to find out if there are any results. .Any(x => x) will have to check the actual contents.
-
Servy over 10 years
@Tseng .Any(x=>true) will evaluate the first item, find that it resolves to true, and then evaluate no more. It's a negligible amount more work than just .Any(). The primary advantage of using the parameterless overload is just that it's faster to type and (arguably) a bit clearer to the reader.
-
Tseng over 10 years
You're right. I overlooked the fact that it returns a constant.
-
mjwills over 5 years
"Linq's .Count() will iterate through the whole Enumerable"
It won't with a HashSet.