Python: Find longest binary gap in binary representation of an integer number

python python-3.x binary

16,579

Solution 1

Your implementation converts the integer to a base two string then visits each character in the string. Instead, you could just visit each bit in the integer using << and &. Doing so will avoid visiting each bit twice (first to convert it to a string, then to check if if it's a "1" or not in the resulting string). It will also avoid allocating memory for the string and then for each substring you inspect.

You can inspect each bit of the integer by visiting 1 << 0, 1 << 1, ..., 1 << (x.bit_length).

For example:

def max_gap(x):
    max_gap_length = 0
    current_gap_length = 0
    for i in range(x.bit_length()):
        if x & (1 << i):
            # Set, any gap is over.
            if current_gap_length > max_gap_length:
                max_gap_length = current_gap_length
            current_gap_length = 0
         else:
            # Not set, the gap widens.
            current_gap_length += 1
    # Gap might end at the end.
    if current_gap_length > max_gap_length:
        max_gap_length = current_gap_length
    return max_gap_length

Solution 2

I do realize that brevity does not mean readability nor efficiency.

However, ability to spell out solution in spoken language and implement it in Python in no time constitutes efficient use of my time.

For binary gap: hey, lets convert int into binary, strip trailing zeros, split at '1' to list, then find longest element in list and get this element lenght.

def binary_gap(N):
    return len(max(format(N, 'b').strip('0').split('1')))

Solution 3

def max_gap(N):
    xs = bin(N)[2:].strip('0').split('1')
    return max([len(x) for x in xs])

Explanation:

Both leading and trailing zeros are redundant with binary gap finding as they are not bounded by two 1's (left and right respectively)
So step 1 striping zeros left and right
Then splitting by 1's yields all sequences of 0'z
Solution: The maximum length of 0's sub-strings

Solution 4

As suggested in the comments, itertools.groupby is efficient in grouping elements of an iterable like a string. You could approach it like this:

from itertools import groupby

def count_gap(x):
    b = "{0:b}".format(x)
    return max(len(list(v)) for k, v in groupby(b.strip("0")) if k == "0")

number = 123456
print(count_gap(number))

First we strip all zeroes from the ends, because a gap has to have on both ends a one. Then itertools.groupby groups ones and zeros and we extract the key (i.e. "0" or "1") together with a group (i.e. if we convert it into a list, it looks like "0000" or "11"). Next we collect the length for every group v, if k is zero. And from this we determine the largest number, i.e. the longest gap of zeroes amidst the ones.

Solution 5

I think the accepted answer dose not work when the input number is 32 (100000). Here is my solution:

def solution(N):
    res = 0
    st = -1
    for i in range(N.bit_length()):
        if N & (1 << i):
            if st != -1:
                res = max(res, i - st - 1)
            st = i

    return res

View more solutions

16,579

Author by

Admin

Updated on June 07, 2022

Comments

Admin almost 2 years

I would like to know if my implementation is efficient. I have tried to find the simplest and low complex solution to that problem using python.

def count_gap(x):
    """
        Perform Find the longest sequence of zeros between ones "gap" in binary representation of an integer

        Parameters
        ----------
        x : int
            input integer value

        Returns
        ----------
        max_gap : int
            the maximum gap length

    """
    try:
        # Convert int to binary
        b = "{0:b}".format(x)
        # Iterate from right to lift 
        # Start detecting gaps after fist "one"
        for i,j in enumerate(b[::-1]):
            if int(j) == 1:
                max_gap = max([len(i) for i in b[::-1][i:].split('1') if i])
                break
    except ValueError:
        print("Oops! no gap found")
        max_gap = 0
    return max_gap

let me know your opinion.

Mark Dickinson about 6 years

I'd suggest using the int.bit_length method instead of the ceil(log2(...)) computation, to avoid corner-case errors due to rounding.
Admin about 6 years

You have said Your implementation converts the integer to a base two string then visits each character in the string but that is not completely correct as I break after detecting one then I split. Could you plz highlight why your implementation should be better in terms of time and memory complexity, please.
Jean-Paul Calderone about 6 years

You still have to visit each character. How else can split find the right place to split? It visits each character until it finds the value you supplied.
Admin about 6 years

Hi Jean, Your code is much slower than the one I provided. I will add the time complexity in the answer (as running time test code).
Jean-Paul Calderone about 6 years

Note that "complexity" is not the same as "wallclock runtime".
Jean-Paul Calderone about 6 years

Complexity wise, I think I've convinced myself that your solution is O(N) (so is mine) and I doubt it's possible to do better than that. So all that's left is constant factors and in Python those are often minimized by making sure you use built-in functionality (since the C implementation is typically much faster than anything you can do with Python code). On PyPy the story may be different.
Admin about 6 years

Thanks, Jean. In the beginning, i was afraid to publish my codes in StackOverflow (negative comments), but you really encouraged me to keep on going (publish and optimize). Good luck
Daniel McLaury about 6 years

@Jean-PaulCalderone: It looks to me like your algorithm is actually O((log x)^2) where the OP's is O(log x), which leads to dramatic differences in runtime as the OP's test case has log(x) around half a million. The problem appears to be the calculation of 1 << i, which requires zeroing i/8 bytes of memory and consequently runs in O(i) time. The .format OP uses, on the other hand, presumably just loops through the bytes of x, converts each to an 8-byte string, and writes the result of the conversion into a buffer, which is an O(log x) operation that's done once.
Jean-Paul Calderone about 6 years

Nice catch. I thought about taking the 1 << i out and replacing it with a loop variable and loop_var << 1. Even with this change, my version is still slower than the OPs (except for very very small integers). Often, complexity doesn't matter in Python (even if you can figure out - what about the range, what about the b[::-1], etc) because finding some way to push a "more" complex operation off into C (format) still beats a "less" complex operation in byte code. My version goes through a whole byte code interpretation loop for each bit! Seriously big k being applied there.
Kushal Kadam over 5 years

This works fine with any "Binary Gap " Lesson 1 codility
Nic3500 over 5 years

While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.
hellow over 5 years

Your code does not even run, because of the missing indention. Please fix that!
Kushal Kadam over 5 years

@hellow, I had problems while posting . The code works absolutely fine .
Kushal Kadam over 5 years

@Nic3500 I acknowledge, this is my first time :)
hellow over 5 years

Then please read stackoverflow.com/editing-help and stackoverflow.com/help/how-to-answer . There is a ton of help available how to format your answer properly, so please don't use that as an excuse ;) (I don't drive a car, before I took some driving lessons. That's (nearly) the same. Read the instructions first and post afterwards)
sepehr over 5 years

Welcome to Stack Overflow! Thank you for the code snippet, which might provide some limited, immediate help. A proper explanation would greatly improve its long-term value by describing why this is a good solution to the problem, and would make it more useful to future readers with other similar questions. Please edit your answer to add some explanation, including the assumptions you've made.
Harsha Biyani over 5 years

This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
Brian Wylie over 5 years

Super nice :thumbsup:
Lutaaya Huzaifah Idris over 5 years

Add explanation.
Encipher about 5 years

This one liner really very efficient. But I got stuck in one point, what is the functionality of strip? I ran that code without using strip and it had same output. Thank you in advance.
Aivar Paalberg about 5 years

By definition binary gap is 'zeros between ones', therefore one should not consider trailing zeros as binary gap. Try to run both versions with int 32 (100000 in binary). With strip you get 0 (there is no binary gap) without strip you get 5 (which is incorrect as there is no 5 zeros between ones).
Mebin Joe over 4 years

You should edit your answer to add a minimal reproducible example demonstrating the problem.
pzaenger over 4 years

Can you explain your code? Dropping code without doing so might not be that helpful.
Brian Tompsett - 汤莱恩 about 4 years

Please explain why your answer is better than the others. This will help people learn from your answer.
quassy almost 4 years

If you are looking for performance: bin(), which @Orenico uses in his answers, is quite a bit faster than format(N, 'b').
Aivar Paalberg almost 4 years

@quassy - please define 'quite a bit faster'. Casual check with timeit & Python 3.8 shows that format() is little bit faster than bin()[2:]. Same casual check also indicates that one can claim f"{N:b}" to be "quite a bit faster".
quassy over 3 years

Testing my implementation I see small but probably not significant differences for testing all numbers from 1 to 21474836 (.format() took 26.1s; format() took 25.6s; bin() took 24.2s; also for bigger ranges). This seems to confirm other observations.
Admin over 2 years

Please provide additional details in your answer. As it's currently written, it's hard to understand your solution.
General Grievance over 2 years

Does this offer any benefit over other solutions offered here? Please edit to explain.
Henke over 2 years

If you want to run the corresponding Codility test, then just rename binary_gap as solution. This solution scores as 100% correct and it passes 15 of 15 tests.