Fast way of counting non-zero bits in positive integer
Solution 1
For arbitrary-length integers, bin(n).count("1") is the fastest pure-Python approach I could find. I tried adapting Óscar's and Adam's solutions to process the integer in 64-bit and 32-bit chunks, respectively. Both were at least ten times slower than bin(n).count("1") (the 32-bit version took about half again as long).
On the other hand, gmpy's popcount() took about 1/20th of the time of bin(n).count("1"). So if you can install gmpy, use that.
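As a sketch of how you might use it without making gmpy a hard dependency (the import-with-fallback wrapper here is my own; popcount is the real name in the modern gmpy2 package):

```python
# Sketch: prefer gmpy2's popcount() when the third-party gmpy2 package
# is installed, and fall back to pure Python otherwise.
try:
    from gmpy2 import popcount
except ImportError:
    def popcount(n):
        # Pure-Python fallback: count '1' digits in the binary string.
        return bin(n).count("1")

print(popcount(0b101101))  # 4
```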
To answer a question in the comments, for bytes I'd use a lookup table. You can generate it at runtime:
counts = bytes(bin(x).count("1") for x in range(256)) # py2: use bytearray
Or just define it literally:
counts = (b'\x00\x01\x01\x02\x01\x02\x02\x03\x01\x02\x02\x03\x02\x03\x03\x04'
b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
b'\x04\x05\x05\x06\x05\x06\x06\x07\x05\x06\x06\x07\x06\x07\x07\x08')
Then counts[x] gives the number of 1 bits in x, where 0 ≤ x ≤ 255.
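The same table can also be summed over a longer byte string; a minimal sketch (bit_count_bytes is my own name for the helper):

```python
# Lookup table: counts[b] is the number of set bits in byte value b.
counts = bytes(bin(x).count("1") for x in range(256))

def bit_count_bytes(data):
    # Sum the per-byte counts over an arbitrary bytes object.
    return sum(counts[b] for b in data)

n = 12345678901234567890
data = n.to_bytes((n.bit_length() + 7) // 8, "big")
print(bit_count_bytes(data) == bin(n).count("1"))  # True
```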
Solution 2
Python 3.10 introduces int.bit_count():
>>> n = 19
>>> bin(n)
'0b10011'
>>> n.bit_count()
3
>>> (-n).bit_count()
3
This is functionally equivalent to bin(n).count("1"), but should be about six times faster. See also Issue29882.
Solution 3
You can adapt the following algorithm:
def CountBits(n):
    n = (n & 0x5555555555555555) + ((n & 0xAAAAAAAAAAAAAAAA) >> 1)
    n = (n & 0x3333333333333333) + ((n & 0xCCCCCCCCCCCCCCCC) >> 2)
    n = (n & 0x0F0F0F0F0F0F0F0F) + ((n & 0xF0F0F0F0F0F0F0F0) >> 4)
    n = (n & 0x00FF00FF00FF00FF) + ((n & 0xFF00FF00FF00FF00) >> 8)
    n = (n & 0x0000FFFF0000FFFF) + ((n & 0xFFFF0000FFFF0000) >> 16)
    n = (n & 0x00000000FFFFFFFF) + ((n & 0xFFFFFFFF00000000) >> 32)  # This last & isn't strictly necessary.
    return n
This works for 64-bit non-negative numbers, but it's easily extendable, and the number of mask-and-add steps grows with the logarithm of the bit-size of the argument (six steps for 64 bits).
To understand how this works, imagine dividing the entire 64-bit string into 64 one-bit buckets. Each bucket's value equals the number of bits set in that bucket (0 or 1). The first transformation produces an analogous state, but with 32 buckets, each 2 bits wide. This is achieved by appropriately shifting the buckets and adding their values (a single addition handles all buckets at once, since no carry can occur across buckets: an n-bit bucket is always wide enough to hold the count n). Further transformations lead to states with exponentially fewer buckets of exponentially growing size, until we arrive at one 64-bit bucket. Its value is the number of bits set in the original argument.
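Following the note that the scheme is easily extendable, here is one hedged sketch of applying the 64-bit CountBits to arbitrarily long non-negative integers by chunking (count_bits_big is my own helper, not part of the original answer):

```python
def CountBits(n):
    # 64-bit SWAR popcount from the answer above.
    n = (n & 0x5555555555555555) + ((n & 0xAAAAAAAAAAAAAAAA) >> 1)
    n = (n & 0x3333333333333333) + ((n & 0xCCCCCCCCCCCCCCCC) >> 2)
    n = (n & 0x0F0F0F0F0F0F0F0F) + ((n & 0xF0F0F0F0F0F0F0F0) >> 4)
    n = (n & 0x00FF00FF00FF00FF) + ((n & 0xFF00FF00FF00FF00) >> 8)
    n = (n & 0x0000FFFF0000FFFF) + ((n & 0xFFFF0000FFFF0000) >> 16)
    n = (n & 0x00000000FFFFFFFF) + ((n & 0xFFFFFFFF00000000) >> 32)
    return n

def count_bits_big(n, chunk_bits=64):
    # Split an arbitrarily large non-negative int into fixed-size
    # chunks and popcount each chunk separately.
    mask = (1 << chunk_bits) - 1
    total = 0
    while n:
        total += CountBits(n & mask)
        n >>= chunk_bits
    return total

print(count_bits_big((1 << 10000) - 1))  # 10000
```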
Solution 4
Here's a Python implementation of the population count algorithm, as explained in this post:
def numberOfSetBits(i):
    i = i - ((i >> 1) & 0x55555555)
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
    return (((i + (i >> 4) & 0xF0F0F0F) * 0x1010101) & 0xFFFFFFFF) >> 24
It will work for 0 <= i < 0x100000000.
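A quick sanity check (my own test loop, not from the original answer) comparing it against the string-count method over a few 32-bit values:

```python
def numberOfSetBits(i):
    # 32-bit SWAR popcount from the answer above.
    i = i - ((i >> 1) & 0x55555555)
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
    return (((i + (i >> 4) & 0xF0F0F0F) * 0x1010101) & 0xFFFFFFFF) >> 24

# Spot-check against the straightforward string count.
for i in (0, 1, 0xFF, 0xDEADBEEF, 0xFFFFFFFF):
    assert numberOfSetBits(i) == bin(i).count("1")
print("ok")
```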
Solution 5
I really like this method. It's simple and pretty fast, and it's not limited in bit length, since Python has arbitrary-precision integers.
It's actually more cunning than it looks, because it avoids wasting time scanning the zeros. For example, it takes the same time to count the set bits in 1000000000000000000000010100000001 as in 1111.
def get_bit_count(value):
    n = 0
    while value:
        n += 1
        value &= value - 1
    return n
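The loop runs once per set bit, not once per bit position, which is why sparse inputs of very different widths take similar time; a small illustration:

```python
def get_bit_count(value):
    n = 0
    while value:
        n += 1
        value &= value - 1  # clears the lowest set bit each pass
    return n

# Both take four iterations, despite very different bit lengths.
print(get_bit_count(0b1000000000000000000000010100000001))  # 4
print(get_bit_count(0b1111))                                # 4
```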
Comments
-
zidarsk8 almost 2 years: I need a fast way to count the number of bits in an integer in Python. My current solution is bin(n).count("1"), but I am wondering if there is any faster way of doing this? PS: I am representing a big 2D binary array as a single list of numbers and doing bitwise operations, which brings the time down from hours to minutes, and now I would like to get rid of those extra minutes.
Edit: 1. It has to be in Python 2.7 or 2.6, and optimizing for small numbers does not matter much, since that would not be a clear bottleneck, but I do have numbers with 10,000+ bits in some places.
For example, this is a 2000-bit case:
12448057941136394342297748548545082997815840357634948550739612798732309975923280685245876950055614362283769710705811182976142803324242407017104841062064840113262840137625582646683068904149296501029754654149991842951570880471230098259905004533869130509989042199261339990315125973721454059973605358766253998615919997174542922163484086066438120268185904663422979603026066685824578356173882166747093246377302371176167843247359636030248569148734824287739046916641832890744168385253915508446422276378715722482359321205673933317512861336054835392844676749610712462818600179225635467147870208L
-
Marcin over 12 years: @StevenRumbalski How is it unhelpful?
-
MrGomez over 12 years: That's clever. Looking this up instead of shooting an answer from the hip is completely appropriate!
-
Steven Rumbalski over 12 years: When I read your answer it contained only your first sentence: "The number of bits in an integer is constant in python."
-
zidarsk8 over 12 years: I already have a bit count lookup table for all the counts that it's possible to store, but having a large list of numbers and operating on them with a[i] & a[j] makes your solution useless unless I have 10+ GB of RAM. An array for & ^ | for triples of 10000 numbers would be a 3*10000^3 lookup table. Since I don't know what I will need, it makes more sense to just count the few thousand when I need them.
-
zidarsk8 over 12 years: I seriously have no idea how this would work with 10,000-bit numbers, though, but I do like the solution. Can you give me a hint if and how I can apply that to bigger numbers?
-
David Weldon over 12 years: Did you benchmark this? On my machine using Python 2.7, I found this to actually be a bit slower than bin(n).count("1").
-
Marcin over 12 years: @zidarsk8 Or, you could use some kind of database or persistent key-value store.
-
zidarsk8 over 12 years: I have edited my question to make it clear I need this for large numbers (10k bits and more). Optimizing something for 32-bit integers would probably not make that much of a difference, since the number of counts would have to be really big, in which case that would cause the slow execution time.
-
Marcin over 12 years: @zidarsk8 10+ GB of RAM is not shockingly large. If you want to perform fast numerical computation, it's not unreasonable to use medium-large iron.
-
zidarsk8 over 12 years: @Marcin I am sorry, but 10000^3 would make this 1,000,000,000,000 numbers. If each one requires calculating 2000 bits, that would make this a 500+ hour job. Lookup tables are good when they can be used; in this case it's impossible and not the answer to my question. PS: for your method to work I would need an even faster way of counting these bits.
-
Adam Zalcman over 12 years: I didn't see the number of bits you're dealing with here. Have you considered writing your data handling code in a low-level language like C, perhaps as an extension to your Python code? You can certainly improve performance by using large arrays in C compared to large numerals in Python. That said, you can rewrite CountBits() to handle 10k-bit numbers by adding just 8 lines of code, but it'll become unwieldy due to the huge constants.
-
zidarsk8 over 12 years: Also, this algorithm takes 3 minutes on my laptop with 2 GB of RAM (and that 10 GB was an optimistic guess), but let's not argue about this, since it clearly has no connection to the original question. Thank you anyway for trying. Every answer is a good one, even if not useful in this case. (Maybe next time.)
-
Marcin over 12 years: @zidarsk8 You only need to generate a look-up table once; in fact, you can generate it both incrementally and piecemeal, so once you have a lookup table that is complete over a given number of bits, you can take your integer in chunks of that length and count the set bits in each chunk. You seem to be specifically resisting any practical solution proposed in any answer here. Why?
-
Óscar López over 12 years: @DavidWeldon No I didn't; could you please post your benchmarks?
-
Karl Knechtel over 12 years: You can write code to generate the sequence of constants, and set up a loop for the processing.
-
the wolf over 12 years: +1! The converse of this is not accurate, however; it should be stated that bin(n).count("0") is not accurate because of the '0b' prefix. It would need to be bin(n)[2:].count('0') for those counting naughts.
-
kindall over 12 years: You can't really count zero bits without knowing how many bytes you're filling, though, which is problematic with a Python long integer because it could be anything.
-
James Youngman over 12 years: But GMP is exactly for very large numbers, including numbers at and far beyond the sizes you mention.
-
gsnedders over 9 years: Memory usage will be better if you use array.array for POPCOUNT_TABLE16, as then it'll be stored as an array of integers instead of as a dynamically sized list of Python int objects.
-
gerrit over 8 years: %timeit numberOfSetBits(23544235423): 1000000 loops, best of 3: 818 ns per loop; %timeit bitCountStr(23544235423): 1000000 loops, best of 3: 577 ns per loop.
-
gerrit over 8 years: Although those are fast options for single integers, note that the algorithms presented in other answers may potentially be vectorised, and thus be much faster when run across many elements of a large numpy array.
-
gerrit over 8 years: This answer has the great advantage that it can be vectorised for cases dealing with large numpy arrays.
gerrit over 8 years: However, numberOfSetBits processes my 864×64 numpy.ndarray in 841 µs. With bitCountStr I have to loop explicitly, and it takes 40.7 ms, almost 50 times longer.
-
kindall over 8 years: For numpy arrays I'd look into something like this: gist.github.com/aldro61/f604a3fa79b3dec5436a
-
northtree over 7 years: I have used bin(n).count("1"). However, it only beats 60% of Python submissions on LeetCode.
-
hola over 5 years: "arbitrary-length integers" -- What if it is given that the length is, say, 8 bits?
-
kindall over 5 years: @pushpen.paul For 8-bit integers I'd just use a lookup table. Could use a bytearray for it for space efficiency.
-
zidarsk8 about 5 years: Looks great, but it's only good for very "sparse" integers; on average it's quite slow. Still, it looks really useful in certain use cases.
-
Robotbugs about 5 years: I'm not quite sure what you mean by "on average it's quite slow". Quite slow compared to what? Do you mean slow compared to some other Python code that you're not quoting? It's twice as fast as counting bit by bit for the average number. In fact, on my MacBook it counts 12.6 million bits per second, which is a lot faster than I can count them. If you have another generic Python algorithm that works for any length of integer and is faster than this, I'd like to hear about it.
-
Robotbugs about 5 years: This works fast. There's an error, at least on Python 3: the [1:] should be [2:], because oct() returns '0o' before the string. The code runs a lot faster, though, if you use hex() instead of oct() and make a 16-entry dictionary.
-
Robotbugs about 5 years: I do accept that it is actually slower than the answer by Manuel above.
-
zidarsk8 about 5 years: Quite slow on average means: counting bits for 10000 numbers with 10000 digits takes 0.15 s with bin(n).count("1"), but it took 3.8 s with your function. If the numbers had very few bits set it would be fast, but for a random number, on average, the function above will be orders of magnitude slower.
-
Robotbugs about 5 years: OK, I will accept that. I guess I was just being a dick because you're a little imprecise, but you're totally right. I just hadn't tested the method by Manuel above before my comment. It looks very clunky, but it is actually very fast. I'm now using a version like that but with 16 values in the dictionary, and that's even much faster than the one he quoted. But for the record, I was using mine in an application with only a few bits set to 1. For totally random bits, yeah, it's going to be about 50:50, with variance decreasing with length.
-
Robotbugs about 5 years: Also, thanks for taking the time to actually type in and profile the function I quoted. That's appreciated.
-
user202729 over 3 years: (I know this answer is old, but) the OP was using long (as it's called in Python 2) and seems not to have been aware of it. So the first sentence is wrong, at least in this question. By the way, Python 2's int type maximum value is 2^63-1, so that would definitely not fit in any kind of table; and if you have to use a disk, it's already slower than the slowest algorithm you would find reasonable, since in this case the task is so simple.
-
user202729 over 3 years: "the number of operations growth with the logarithm of the argument (i.e. linearly with the bit-size of the argument)" -- This is wrong: the number of (addition, bitwise) operations is asymptotically log(log n), i.e. log(number of bits). The time complexity is log(n) log(log n), i.e. (number of bits) × log(number of bits), because Python's integers are arbitrary-precision (technically the int data type in Python 2 is 64-bit, but the OP was using long without knowing it), so it would almost certainly be slower than gmpy for single numbers.
-
PM 2Ring over 3 years: On second thoughts, the speeds may vary a fair bit depending on the environment. I just tried your benchmark script in Python 3.7 on SageMathCell and f-strings were mostly faster. But I was running in Sage mode. (It throws an error in plain Python mode, but that mode isn't really pure Python, and it has flaky handling of string literals; e.g. you have to use "\\n" to get a newline.)
-
PM 2Ring over 3 years: It'd be interesting to see times without an extra layer of function calls, where applicable. Here's a short example of timing expressions rather than functions: stackoverflow.com/a/50212230 BTW, it's a good idea to do a few (3-5) repetitions of the timeit loops and use the minimum, as I discuss here: stackoverflow.com/a/43678107/4014959
-
Admin over 3 years: https://docs.python.org/3.10/library/stdtypes.html#int.bit_count counts set bits.
-
antonagestam about 3 years: This should be the accepted answer in a few months! 😉👍
-
Hunger almost 3 years: I'm using Python 3.7 and gmpy2; popcount uses 50% of the time of count('1').