How to count the number of true elements in a NumPy bool array
Solution 1
You have multiple options. Two options are the following.
boolarr.sum()
numpy.count_nonzero(boolarr)
Here's an example:
>>> import numpy as np
>>> boolarr = np.array([[0, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=bool)
>>> boolarr
array([[False, False,  True],
       [ True, False,  True],
       [ True, False,  True]])
>>> boolarr.sum()
5
Of course, that is a bool-specific answer. More generally, you can use numpy.count_nonzero.
>>> np.count_nonzero(boolarr)
5
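Both approaches also take an axis argument when a per-row or per-column count is needed. The following is a minimal sketch using the same array as above; it assumes NumPy 1.12 or later, where count_nonzero accepts axis.

```python
import numpy as np

boolarr = np.array([[0, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=bool)

total = np.count_nonzero(boolarr)               # all True elements
per_column = np.count_nonzero(boolarr, axis=0)  # True count in each column
per_row = boolarr.sum(axis=1)                   # True count in each row

print(total)       # 5
print(per_column)  # [2 0 3]
print(per_row)     # [1 2 2]
```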
Solution 2
This question solved a very similar problem for me, so I thought I should share:
In plain Python you can use sum() to count True values in a list:
>>> sum([True,True,True,False,False])
3
But this won't work on a nested list:
>>> sum([[False, False, True], [True, False, True]])
TypeError...
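The nested case fails because sum() starts from the integer 0 and cannot add a list to it. A possible workaround in plain Python, sketched here with the same nested list, is to flatten one level first or to sum the per-row counts:

```python
from itertools import chain

nested = [[False, False, True], [True, False, True]]

# Flatten one level of nesting, then count as before.
count_flat = sum(chain.from_iterable(nested))

# Equivalent: add up the True count of each inner list.
count_rows = sum(sum(row) for row in nested)

print(count_flat, count_rows)  # 3 3
```

This only handles one level of nesting; for deeper or irregular nesting, converting to a NumPy array is likely simpler.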
Solution 3
In terms of comparing two numpy arrays and counting the number of matches (e.g. correct class prediction in machine learning), I found the below example for two dimensions useful:
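import numpy as np
result = np.random.randint(3, size=(5, 2))  # 5x2 random integer array
target = np.random.randint(3, size=(5, 2))  # 5x2 random integer array
res = np.equal(result, target)  # element-wise boolean array of matches
print("Prediction:")
print(result)
print("Target:")
print(target)
print("Count of correct prediction for D=1:", np.sum(res[:, 0]))
print("Count of correct prediction for D=2:", np.sum(res[:, 1]))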
which can be extended to D dimensions.
The results are:
Prediction:
[[1 2]
[2 0]
[2 0]
[1 2]
[1 2]]
Target:
[[0 1]
[1 0]
[2 0]
[0 0]
[2 1]]
Count of correct prediction for D=1: 1
Count of correct prediction for D=2: 2
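One way to extend this to D dimensions without indexing each column by hand is to count along axis 0. This is a sketch, not necessarily what the answer's author had in mind; the arrays are copied from the printed results above so the output is reproducible, and count_nonzero with axis assumes NumPy 1.12 or later.

```python
import numpy as np

# Fixed arrays copied from the printed prediction and target above.
result = np.array([[1, 2], [2, 0], [2, 0], [1, 2], [1, 2]])
target = np.array([[0, 1], [1, 0], [2, 0], [0, 0], [2, 1]])

res = np.equal(result, target)            # boolean array of matches
per_dim = np.count_nonzero(res, axis=0)   # one match count per column

print(per_dim)  # [1 2]
```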
norio
I'm a postdoc doing theoretical and computational research in atomic, molecular, and optical science.
Updated on July 08, 2022
Comments
-
norio almost 2 years
I have a NumPy array 'boolarr' of boolean type. I want to count the number of elements whose values are True. Is there a NumPy or Python routine dedicated to this task? Or do I need to iterate over the elements in my script?
-
Private about 7 years
For pandas: stackoverflow.com/questions/26053849/…
-
-
norio over 12 years
Thanks, David. They look neat. About the method with sum(..), is True always equal to 1 in Python (or at least in NumPy)? If it is not guaranteed, I will add a check, 'if True==1:' beforehand. About count_nonzero(..), unfortunately, it seems not to be implemented in my NumPy module at version 1.5.1, but I may have a chance to use it in the future.
-
David Alber over 12 years
@norio Regarding bool: boolean values are treated as 1 and 0 in arithmetic operations. See "Boolean Values" in the Python Standard Library documentation. Note that NumPy's bool and Python bool are not the same, but they are compatible (see here for more information).
-
David Alber over 12 years
@norio Regarding numpy.count_nonzero not being in NumPy v1.5.1: you are right. According to this release announcement, it was added in NumPy v1.6.0.
-
norio over 12 years
Thank you very much for the replies with the links!
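The point about True counting as 1 can be checked directly. A short sketch, with the array name flags chosen only for illustration:

```python
import numpy as np

# Python's bool is a subclass of int, so True behaves as 1 in arithmetic.
print(issubclass(bool, int))   # True
print(True == 1, True + True)  # True 2

# NumPy's boolean scalar is a separate type, but it is also summed as 1.
flags = np.array([True, False, True])
print(flags.sum())                 # 2
print(isinstance(flags[0], bool))  # False
```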
-
tommy chheng over 11 years
You should "flatten" the array of arrays first. Unfortunately, there's no builtin method; see stackoverflow.com/questions/2158395/…
-
chbrown over 10 years
FWIW, numpy.count_nonzero is about a thousand times faster, in my Python interpreter, at least.
python -m timeit -s "import numpy as np; bools = np.random.uniform(size=1000) >= 0.5" "np.count_nonzero(bools)"
vs.
python -m timeit -s "import numpy as np; bools = np.random.uniform(size=1000) >= 0.5" "sum(bools)"
-
mab over 8 years
@chbrown you are right. But you should compare to np.sum(bools) instead! However, np.count_nonzero(bools) is still ~12x faster.
-
Elliptica almost 8 years
If I try either of those, it works as long as my answer is non-zero. But if I get 0 and I'm doing it in a pivot table, my answer is always False.
-
JJFord3 over 7 years
Thanks Guillaume! Works with Pandas dataframes as well.
-
Zikoat over 3 years
If you only intend to check whether the count of True values is one or more, you can do this directly with np.any(bools)
-
discover over 2 years
@DavidAlber numpy.count_nonzero returns wrong results for a masked array. If the masked array has some masked values and all True values in the other cells, the result is different from np.sum(). Is it a bug or the expected result?
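If the goal is to ignore masked cells, one way to avoid depending on how np.count_nonzero treats a masked array is to count only the unmasked values explicitly. This is a sketch with hypothetical data; it does not settle whether the behavior described in the comment is a bug.

```python
import numpy as np

# Hypothetical data: the third element is masked out.
marr = np.ma.array([True, True, True, False],
                   mask=[False, False, True, False])

masked_sum = marr.sum()                               # mask-aware sum
unmasked_count = np.count_nonzero(marr.compressed())  # count unmasked data only

print(masked_sum, unmasked_count)  # 2 2
```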