Count the number of occurrences of a certain value in a dictionary in Python?
Solution 1
As mentioned in this answer, operator.countOf() is the way to go, but you can also use a generator expression inside sum(), as follows:
sum(value == 0 for value in D.values())
# Or the following, which is faster in the benchmark below
sum(1 for v in D.values() if v == 0)
Or, as a functional approach that is slightly faster than the first expression, you can use map, passing the __eq__ method of the integer 0 as the mapping function:
sum(map((0).__eq__, D.values()))
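A small sketch, using a hypothetical dictionary D and not the benchmark data, suggesting that the three expressions above agree when every value is an int. The first form works because bool is a subclass of int, so each True adds 1 to the sum. One caveat, which is an observation about int.__eq__ and not something the original answer states: (0).__eq__ returns NotImplemented for non-integer arguments such as floats, so the map variant is best kept to dictionaries whose values are all ints.

```python
# Hypothetical example data; all values are ints.
D = {'a': 97, 'c': 0, 'b': 0, 'e': 94, 'r': 97, 'g': 0}

by_bool = sum(value == 0 for value in D.values())   # sums True/False as 1/0
by_filter = sum(1 for v in D.values() if v == 0)    # sums a 1 per match
by_map = sum(map((0).__eq__, D.values()))           # calls int.__eq__ directly

print(by_bool, by_filter, by_map)  # 3 3 3

# bool is a subclass of int, which is why summing comparisons works.
print(isinstance(True, int), True + True)  # True 2

# Caveat: int.__eq__ does not handle floats and returns NotImplemented,
# whereas the == operator falls back to float.__eq__ and returns True.
print((0).__eq__(0.0))  # NotImplemented
print(0 == 0.0)         # True
```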
Benchmark:
In [15]: D = dict(zip(range(1000), range(1000)))
In [16]: %timeit sum(map((0).__eq__, D.values()))
49.6 µs ± 770 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [17]: %timeit sum(v==0 for v in D.values())
60.9 µs ± 669 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [18]: %timeit sum(1 for v in D.values() if v == 0)
30.2 µs ± 515 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [19]: %timeit countOf(D.values(), 0)
16.8 µs ± 74.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Note that although map may be faster than the first generator expression in this case, you should also run the benchmark on relatively large datasets to get a more comprehensive and general picture of the approaches. You can then choose the most suitable approach based on the structure and amount of data you have.
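One way to follow that advice is sketched below. The helper make_dict, the candidates table, and the chosen sizes are assumptions made for illustration and are not part of the original answer. Timings depend on the machine and Python version, so no numbers are shown.

```python
from operator import countOf
from timeit import timeit


def make_dict(size, zero_every):
    """Hypothetical helper: every `zero_every`-th value is 0, the rest are non-zero ints."""
    return {i: (0 if i % zero_every == 0 else i) for i in range(size)}


# The four expressions from the benchmark above, wrapped as functions.
candidates = {
    'map + __eq__': lambda d: sum(map((0).__eq__, d.values())),
    'sum of bools': lambda d: sum(v == 0 for v in d.values()),
    'sum of 1s': lambda d: sum(1 for v in d.values() if v == 0),
    'countOf': lambda d: countOf(d.values(), 0),
}


def compare(size, zero_every, number=5):
    """Time each candidate on one dictionary and print microseconds per call."""
    d = make_dict(size, zero_every)
    for name, func in candidates.items():
        seconds = timeit(lambda: func(d), number=number)
        print(f'{name:>14}: {seconds / number * 1e6:10.1f} us per call')


# Few matches versus many matches in a larger dictionary.
compare(size=100_000, zero_every=1000)
print()
compare(size=100_000, zero_every=2)
```

Varying both the size and how often the target value appears matters here, because expressions that only pass matches to sum() do less work when matches are rare.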
Solution 2
Alternatively, using collections.Counter:
from collections import Counter
D = {'a': 97, 'c': 0, 'b': 0, 'e': 94, 'r': 97, 'g': 0}
Counter(D.values())[0]
# 3
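Counter is the slowest option in the single-lookup benchmark further down, but it may pay off when several different values need counting, since the tally is built in one pass. A minimal sketch, assuming the values are hashable and the dictionary does not change after the tally is built:

```python
from collections import Counter

D = {'a': 97, 'c': 0, 'b': 0, 'e': 94, 'r': 97, 'g': 0}

# One pass over the values builds a tally of every distinct value.
counts = Counter(D.values())

print(counts[0])    # 3
print(counts[97])   # 2
print(counts[42])   # 0, a missing value counts as zero instead of raising KeyError
```

Each lookup after the first is a plain dictionary access, but the tally goes stale if D is modified and has to be rebuilt.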
Solution 3
You can count it by converting the values to a list, as follows:
D = {'a': 97, 'c': 0, 'b': 0, 'e': 94, 'r': 97, 'g': 0}
print(list(D.values()).count(0))
# 3
Or by iterating over the values:
print(sum([1 for i in D.values() if i == 0]))
# 3
Solution 4
That's a job for operator.countOf:
from operator import countOf
countOf(D.values(), 0)
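A short sketch of what countOf counts, using hypothetical mixed-type data that is not in the original question. It compares by equality, so values such as 0.0 and False are counted as 0 too. The type check at the end is one possible way to count only genuine int zeros, shown as an assumption about what a caller might want.

```python
from operator import countOf

# Hypothetical data mixing several values, some of which compare equal to 0.
D = {'a': 0, 'b': 0.0, 'c': False, 'd': '0', 'e': None, 'f': 97}

# countOf accepts any iterable, so the values view is passed directly.
print(countOf(D.values(), 0))  # 3: matches 0, 0.0 and False, but not '0' or None

# To count only genuine int zeros, an explicit type check is one option.
strict = sum(1 for v in D.values() if type(v) is int and v == 0)
print(strict)  # 1
```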
Benchmark with your example dictionary (the three fastest times for each expression):
1537 ns  1540 ns  1542 ns  Counter(D.values())[0]
 791 ns   800 ns   802 ns  sum(value == 0 for value in D.values())
 694 ns   697 ns   717 ns  sum(map((0).__eq__, D.values()))
 680 ns   682 ns   689 ns  sum(1 for value in D.values() if value == 0)
 599 ns   599 ns   600 ns  sum([1 for i in D.values() if i == 0])
 368 ns   369 ns   375 ns  list(D.values()).count(0)
 229 ns   231 ns   231 ns  countOf(D.values(), 0)
Code:
from timeit import repeat
setup = '''
from collections import Counter
from operator import countOf
D = {'a': 97, 'c': 0 , 'b':0,'e': 94, 'r': 97 , 'g':0}
'''
E = [
'Counter(D.values())[0]',
'sum(value == 0 for value in D.values())',
'sum(map((0).__eq__, D.values()))',
'sum(1 for value in D.values() if value == 0)',
'sum([1 for i in D.values() if i == 0])',
'list(D.values()).count(0)',
'countOf(D.values(), 0)',
]
for _ in range(3):
    for e in E:
        number = 10 ** 5
        ts = sorted(repeat(e, setup, number=number))[:3]
        print(*('%4d ns ' % (t / number * 1e9) for t in ts), e)
    print()
RowanX
Updated on December 16, 2021
Comments
-
RowanX over 2 years
If I have got something like this:
D = {'a': 97, 'c': 0 , 'b':0,'e': 94, 'r': 97 , 'g':0}
If I want, for example, to count the number of occurrences of 0 as a value without having to iterate over the whole list, is that even possible, and how?
-
Peter Wood over 6 years
sum(1 for value in D.values() if value == 0)
-
Mazdak over 6 years
@PeterWood Or even better: sum(value == 0 for value in D.values())
-
Peter Wood over 6 years
@Kasramvd that relies upon automatic conversion of a boolean to an integer, which isn't as clear, in my opinion
-
Jean-François Fabre over 6 years
@PeterWood on the contrary. Booleans are integers.
-
Peter Wood over 6 years
@Jean-FrançoisFabre type(True) != type(1)
-
Jean-François Fabre over 6 years
boolean is an integer subclass. 1 == True even if not the same class (numerical types are ok): isinstance(True, int) is True
-
Peter Wood over 6 years
1 == True, 2 != True, bool(2) == True
-
Terry Jan Reedy over 6 years
In part, boolean are integers so that they can summed as in @k
-
juanpa.arrivillaga over 6 years
Probably better because generators tend to be a bit slower.
-
juanpa.arrivillaga over 6 years
In fact, I'm getting %timeit sum([value == 0 for value in D.values()]) as faster than the generator expression version.
-
Mazdak over 6 years
@juanpa.arrivillaga Definitely but the interesting part is that (0).__eq__ is not a built-in function and yet it performs better with map. That means the drawback of generators (extra function calls, __next__, etc) is more influential than passing a non-built-in function to map.
-
juanpa.arrivillaga over 6 years
Also, the creation of the actual generator object might have more overhead than the creation of the map object, and with a dict this small, it would make a difference
-
Kelly Bundy over 2 years
sum(1 for value in D.values() if value == 0) is likely faster. But I'd say countOf is really the best way.
-
Mazdak over 2 years
@KellyBundy Yup, seems like some new under the hood optimizations. The countOf is also the best way tho.
-
Kelly Bundy over 2 years
With under the hood optimizations, are you talking about the sum(1 ... if)? I think that mostly benefits from fewer values transferred to and processed by sum, same reason as this. Especially when the value appears very rarely, like in your test data.