Fast way of counting non-zero bits in positive integer


Solution 1

For arbitrary-length integers, bin(n).count("1") is the fastest I could find in pure Python.

I tried adapting Óscar's and Adam's solutions to process the integer in 64-bit and 32-bit chunks, respectively. Both were at least ten times slower than bin(n).count("1") (the 32-bit version took about half again as much time).

On the other hand, gmpy popcount() took about 1/20th of the time of bin(n).count("1"). So if you can install gmpy, use that.

To answer a question in the comments, for bytes I'd use a lookup table. You can generate it at runtime:

counts = bytes(bin(x).count("1") for x in range(256))  # py2: use bytearray

Or just define it literally:

counts = (b'\x00\x01\x01\x02\x01\x02\x02\x03\x01\x02\x02\x03\x02\x03\x03\x04'
          b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
          b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
          b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
          b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
          b'\x04\x05\x05\x06\x05\x06\x06\x07\x05\x06\x06\x07\x06\x07\x07\x08')

Then it's counts[x] to get the number of 1 bits in x where 0 ≤ x ≤ 255.
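For integers wider than a byte, the same table can be applied byte by byte; here's a small sketch (the `to_bytes` sizing logic is my addition, not part of the original answer):

```python
counts = bytes(bin(x).count("1") for x in range(256))

def popcount_bytes(n):
    # Split the non-negative int into bytes and sum the per-byte table lookups.
    nbytes = (n.bit_length() + 7) // 8 or 1
    return sum(counts[b] for b in n.to_bytes(nbytes, "big"))

popcount_bytes(19)      # 0b10011 -> 3
popcount_bytes(2**100)  # a single set bit -> 1
```

This is still slower than bin(n).count("1") in pure Python, but it shows how the byte table generalizes.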

Solution 2

Python 3.10 introduces int.bit_count():

>>> n = 19
>>> bin(n)
'0b10011'
>>> n.bit_count()
3
>>> (-n).bit_count()
3

This is functionally equivalent to bin(n).count("1") but should be ~6 times faster. See also Issue29882.

Solution 3

You can adapt the following algorithm:

def CountBits(n):
    n = (n & 0x5555555555555555) + ((n & 0xAAAAAAAAAAAAAAAA) >> 1)
    n = (n & 0x3333333333333333) + ((n & 0xCCCCCCCCCCCCCCCC) >> 2)
    n = (n & 0x0F0F0F0F0F0F0F0F) + ((n & 0xF0F0F0F0F0F0F0F0) >> 4)
    n = (n & 0x00FF00FF00FF00FF) + ((n & 0xFF00FF00FF00FF00) >> 8)
    n = (n & 0x0000FFFF0000FFFF) + ((n & 0xFFFF0000FFFF0000) >> 16)
    n = (n & 0x00000000FFFFFFFF) + ((n & 0xFFFFFFFF00000000) >> 32)  # this last & isn't strictly necessary
    return n

This works for 64-bit non-negative numbers, and it's easily extendable: the number of mask-and-add steps grows logarithmically with the bit-size of the argument (six steps for 64 bits, seven for 128, and so on).

To understand how this works, imagine dividing the entire 64-bit string into 64 one-bit buckets. Each bucket's value equals the number of bits set in it (0 or 1). The first transformation produces an analogous state, but with 32 buckets, each 2 bits wide. This is achieved by shifting the buckets appropriately and adding their values; a single addition handles all buckets at once, since no carry can occur across bucket boundaries (an n-bit bucket is always wide enough to hold the count n). Further transformations lead to states with exponentially fewer buckets of exponentially growing width, until we arrive at a single 64-bit bucket. It holds the number of bits set in the original argument.
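As a sanity check, the fold agrees with bin(n).count("1") across the 64-bit range (the function is restated so the snippet is self-contained; the test values are my own):

```python
def count_bits_64(n):
    # Merge adjacent buckets: 64 x 1-bit -> 32 x 2-bit -> ... -> 1 x 64-bit.
    n = (n & 0x5555555555555555) + ((n & 0xAAAAAAAAAAAAAAAA) >> 1)
    n = (n & 0x3333333333333333) + ((n & 0xCCCCCCCCCCCCCCCC) >> 2)
    n = (n & 0x0F0F0F0F0F0F0F0F) + ((n & 0xF0F0F0F0F0F0F0F0) >> 4)
    n = (n & 0x00FF00FF00FF00FF) + ((n & 0xFF00FF00FF00FF00) >> 8)
    n = (n & 0x0000FFFF0000FFFF) + ((n & 0xFFFF0000FFFF0000) >> 16)
    n = (n & 0x00000000FFFFFFFF) + ((n & 0xFFFFFFFF00000000) >> 32)
    return n

for x in (0, 1, 19, 2**63, 2**64 - 1):
    assert count_bits_64(x) == bin(x).count("1")
```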

Solution 4

Here's a Python implementation of the population count algorithm, as explained in this post:

def numberOfSetBits(i):
    i = i - ((i >> 1) & 0x55555555)
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
    return (((i + (i >> 4) & 0xF0F0F0F) * 0x1010101) & 0xffffffff) >> 24

It will work for 0 <= i < 0x100000000.
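A quick check of the routine at the edges of that range (function restated so the snippet is self-contained; the test values are my own):

```python
def numberOfSetBits(i):
    # 32-bit SWAR popcount: pairwise sums, then a multiply gathers the
    # per-byte counts into the top byte.
    i = i - ((i >> 1) & 0x55555555)
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
    return (((i + (i >> 4) & 0xF0F0F0F) * 0x1010101) & 0xFFFFFFFF) >> 24

for x in (0, 1, 0xDEADBEEF, 0xFFFFFFFF):
    assert numberOfSetBits(x) == bin(x).count("1")
```

As noted in the comments below, because this uses only fixed-width shifts and masks, the same expression can also be vectorised over NumPy integer arrays, where it is far faster than looping over bin(x).count("1") per element.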

Solution 5

I really like this method. It's simple and pretty fast, and it isn't limited by bit length, since Python's integers are arbitrary-precision.

It's actually more cunning than it looks, because it avoids wasting time scanning the zeros. For example it will take the same time to count the set bits in 1000000000000000000000010100000001 as in 1111.

def get_bit_count(value):
    n = 0
    while value:
        n += 1
        value &= value - 1  # clear the lowest set bit
    return n
zidarsk8

Updated on July 08, 2022

Comments

  • zidarsk8
    zidarsk8 almost 2 years

    I need a fast way to count the number of bits in an integer in python. My current solution is

    bin(n).count("1")
    

    but I am wondering if there is any faster way of doing this?

PS: I am representing a big 2D binary array as a single list of numbers and doing bitwise operations, which brings the time down from hours to minutes. Now I would like to get rid of those extra minutes.

Edit: It has to be Python 2.7 or 2.6.

Optimizing for small numbers does not matter that much, since those are not the bottleneck, but I do have numbers with 10,000+ bits in some places.

For example, this is a 2000-bit case:

    12448057941136394342297748548545082997815840357634948550739612798732309975923280685245876950055614362283769710705811182976142803324242407017104841062064840113262840137625582646683068904149296501029754654149991842951570880471230098259905004533869130509989042199261339990315125973721454059973605358766253998615919997174542922163484086066438120268185904663422979603026066685824578356173882166747093246377302371176167843247359636030248569148734824287739046916641832890744168385253915508446422276378715722482359321205673933317512861336054835392844676749610712462818600179225635467147870208L
    
  • Marcin
    Marcin over 12 years
    @StevenRumbalski How is it unhelpful?
  • MrGomez
    MrGomez over 12 years
    That's clever. Looking this up instead of shooting an answer from the hip is completely appropriate!
  • Steven Rumbalski
    Steven Rumbalski over 12 years
    When I read your answer it contained only your first sentence: "The number of bits in an integer is constant in python."
  • zidarsk8
    zidarsk8 over 12 years
I already have a bit-count lookup table for all the counts it's possible to store, but having a large list of numbers and operating on them with a[i] & a[j] makes your solution useless unless I have 10+ GB of RAM. A table covering & ^ | for triples of 10000 numbers would have 3*10000^3 entries. Since I don't know what I will need, it makes more sense to just count the few thousand when I need them.
  • zidarsk8
    zidarsk8 over 12 years
I seriously have no idea how this would work with 10,000-bit numbers, though, but I do like the solution. Can you give me a hint whether and how I can apply that to bigger numbers?
  • David Weldon
    David Weldon over 12 years
    Did you benchmark this? On my machine using python 2.7, I found this to actually be a bit slower than bin(n).count("1").
  • Marcin
    Marcin over 12 years
    @zidarsk8 Or, you could use some kind of database or persistent key-value store.
  • zidarsk8
    zidarsk8 over 12 years
I have edited my question to make it clear I need this for large numbers (10k bits and more). Optimizing something for 32-bit integers would probably not make that much of a difference, since the number of counts would have to be really big, and in that case the counting itself would dominate the execution time.
  • Marcin
    Marcin over 12 years
    @zidarsk8 10+GB of ram is not shockingly large. If you want to perform fast numerical computation, it's not unreasonable to use medium-large iron.
  • zidarsk8
    zidarsk8 over 12 years
@Marcin I am sorry, but 10000^3 would make this 1,000,000,000,000 numbers. If each one needs 2000 bits calculated, that would make this a 500+ hour job. Lookup tables are good when they can be used; in this case it's impossible, and not the answer to my question. PS: for your method to work I would need an even faster way of counting these bits.
  • Adam Zalcman
    Adam Zalcman over 12 years
    I didn't see the number of bits you're dealing with here. Have you considered writing your data handling code in a low-level language like C? Perhaps as an extension to your python code? You can certainly improve performance by using large arrays in C compared to large numerals in python. That said, you can rewrite the CountBits() to handle 10k-bits numbers by adding just 8 lines of code. But it'll become unwieldy due to huge constants.
  • zidarsk8
    zidarsk8 over 12 years
Also, this algorithm takes 3 minutes on my laptop with 2 GB of RAM (and that 10 GB was an optimistic guess), but let's not argue about this, since it clearly has no connection to the original question. Thank you anyway for trying. Every answer is a good one, even if not useful in this case. (Maybe next time.)
  • Marcin
    Marcin over 12 years
    @zidarsk8 You only need to generate a look-up table once; in fact, you can generate it both incrementally, and piece-meal, so once you have a lookup table that is complete over a given number of bits, you can take your integer in chunks of that length of bits, and count the set bits in each chunk. You seem to be specifically resisting any practical solution proposed in any answer here. Why?
  • Óscar López
    Óscar López over 12 years
    @DavidWeldon No I didn't, could you please post your benchmarks?
  • Karl Knechtel
    Karl Knechtel over 12 years
    You can write code to generate the sequence of constants, and set up a loop for the processing.
  • the wolf
    the wolf over 12 years
    +1! The converse of this is not accurate, however, it should be stated: bin(n).count("0") is not accurate because of the '0b' prefix. Would need to be bin(n)[2:].count('0') for those counting naughts....
  • kindall
    kindall over 12 years
    You can't really count zero bits without knowing how many bytes you're filling, though, which is problematic with a Python long integer because it could be anything.
  • James Youngman
    James Youngman over 12 years
    But GMP is exactly for very large numbers, including numbers at and far beyond the sizes you mention.
  • gsnedders
    gsnedders over 9 years
    Memory usage will be better if you use array.array for POPCOUNT_TABLE16, as then it'll be stored as an array of integers, instead of as a dynamically sized list of Python int objects.
  • gerrit
    gerrit over 8 years
    %timeit numberOfSetBits(23544235423): 1000000 loops, best of 3: 818 ns per loop; %timeit bitCountStr(23544235423): 1000000 loops, best of 3: 577 ns per loop.
  • gerrit
    gerrit over 8 years
    Although those are fast options for single integers, note that algorithms presented in other answers may be potentially vectorised, thus much faster if run across many elements of a large numpy array.
  • gerrit
    gerrit over 8 years
    This answer has the great advantage that it can be vectorised for cases dealing with large numpy arrays.
  • gerrit
    gerrit over 8 years
    However, numberOfSetBits processes my 864×64 numpy.ndarray in 841 µs. With bitCountStr I have to loop explicitly, and it takes 40.7 ms, or almost 50 times longer.
  • kindall
    kindall over 8 years
    For numpy arrays I'd look into something like this: gist.github.com/aldro61/f604a3fa79b3dec5436a
  • northtree
    northtree over 7 years
I have used bin(n).count("1"). However, it only beats 60% of Python submissions on LeetCode.
  • hola
    hola over 5 years
    "arbitrary-length integers" -- What if it is given that the length is, say, 8 bits?
  • kindall
    kindall over 5 years
    @pushpen.paul For 8-bit integers I'd just use a lookup table. Could use a bytearray for it for space efficiency.
  • zidarsk8
    zidarsk8 about 5 years
    looks great, but it's only good for very "sparse" integers. on average it's quite slow. Still, it looks really useful in certain use cases.
  • Robotbugs
    Robotbugs about 5 years
    I'm not quite sure what you mean by "on average it's quite slow". Quite slow compared to what? Do you mean slow compared to some other python code that you're not quoting? It's twice as fast as counting bit by bit for the average number. In fact on my macbook it counts 12.6 million bits per second which is a lot faster than I can count them. If you have another generic python algorithm that works for any length of integer and is faster than this I'd like to hear about it.
  • Robotbugs
    Robotbugs about 5 years
This works fast. There's an error, at least on Python 3: the [1:] should be [2:], because oct() puts '0o' before the string. The code runs a lot faster, though, if you use hex() instead of oct() and make a 16-entry dictionary.
  • Robotbugs
    Robotbugs about 5 years
    I do accept that it is actually slower than the answer by Manuel above.
  • zidarsk8
    zidarsk8 about 5 years
    Quite slow on average means, counting bits for 10000 numbers with 10000 digits takes 0.15s with bin(n).count("1") but it took 3.8s for your function. If the numbers had very few bits set it would work fast, but if you take any random number, on average the function above will be orders of magnitude slower.
  • Robotbugs
    Robotbugs about 5 years
    OK I will accept that. I guess I was just being a dick cos you're a little imprecise but you're totally right. I just hadn't tested the method using the method by Manuel above before my comment. It looks very clunky but it is actually very fast. I'm now using a version like that but with 16 values in the dictionary and that's even much faster than the one he quoted. But for the record I was using mine in an application with only a few bits that were set to 1. But for totally random bits yeah it's going to about 50:50 with a little variance decreasing with length.
  • Robotbugs
    Robotbugs about 5 years
    Also thanks for taking your time to actually type in and profile the function i quoted. That's appreciated.
  • user202729
    user202729 over 3 years
    (I know this answer is old but) the OP was using long (as it's called in Python 2) and (seems to) not aware that it's the case. So the first sentence is wrong -- at least in this question. -- -- By the way, Python 2's int type maximum value is 2^63-1, so that would definitely not fit in any kind of table; and if you have to use a disk, it's already slower than the slowest algorithm you would find reasonable, since in this case the task is so simple.
  • user202729
    user202729 over 3 years
    "the number of operations growth with the logarithm of the argument (i.e. linearly with the bit-size of the argument). " -- This is wrong, the number of (addition, bitwise) operation is (asymptotically) log(log n) or log(number of bits). The time complexity is log(n) log(log n) or (number of bits * log(number of bits)) -- because Python's integer is arbitrary-precision (technically int data type in Python 2 is 64 bit, but the OP was using long without knowing it) -- so it would almost-likely be slower than gmpy for single numbers.
  • PM 2Ring
    PM 2Ring over 3 years
    On second thoughts, the speeds may vary a fair bit, depending on the environment. I just tried your benchmark script in Python 3.7 on SageMathCell and f-strings were mostly faster. But I was running in Sage mode. (It throws an error in plain Python mode, but that mode isn't really pure Python and it has flaky handling of string literals, eg you have to use "\\n" to get a newline).
  • PM 2Ring
    PM 2Ring over 3 years
    It'd be interesting to see times without an extra layer of function calls, where applicable. Here's a short example of timing expressions rather than functions: stackoverflow.com/a/50212230 BTW, it's a good idea to do a few (3-5) repetitions of timeit loops, and use the minimum one, as I discuss here stackoverflow.com/a/43678107/4014959
  • Admin
    Admin over 3 years
https://docs.python.org/3.10/library/stdtypes.html#int.bit_count counts set bits.
  • antonagestam
    antonagestam about 3 years
    This should be the accepted answer in a few months! 😉👍
  • Hunger
    Hunger almost 3 years
I'm using Python 3.7 and gmpy2; popcount takes 50% of the time of count('1').