Python: Find longest binary gap in binary representation of an integer number

16,579

Solution 1

Your implementation converts the integer to a base two string then visits each character in the string. Instead, you could just visit each bit in the integer using << and &. Doing so will avoid visiting each bit twice (first to convert it to a string, then to check if if it's a "1" or not in the resulting string). It will also avoid allocating memory for the string and then for each substring you inspect.

You can inspect each bit of the integer by visiting 1 << 0, 1 << 1, ..., 1 << (x.bit_length).

For example:

def max_gap(x):
    max_gap_length = 0
    current_gap_length = 0
    for i in range(x.bit_length()):
        if x & (1 << i):
            # Set, any gap is over.
            if current_gap_length > max_gap_length:
                max_gap_length = current_gap_length
            current_gap_length = 0
         else:
            # Not set, the gap widens.
            current_gap_length += 1
    # Gap might end at the end.
    if current_gap_length > max_gap_length:
        max_gap_length = current_gap_length
    return max_gap_length

Solution 2

I do realize that brevity does not mean readability nor efficiency.

However, ability to spell out solution in spoken language and implement it in Python in no time constitutes efficient use of my time.

For binary gap: hey, lets convert int into binary, strip trailing zeros, split at '1' to list, then find longest element in list and get this element lenght.

def binary_gap(N):
    return len(max(format(N, 'b').strip('0').split('1')))  

Solution 3

def max_gap(N):
    xs = bin(N)[2:].strip('0').split('1')
    return max([len(x) for x in xs])

Explanation:

  1. Both leading and trailing zeros are redundant with binary gap finding as they are not bounded by two 1's (left and right respectively)
  2. So step 1 striping zeros left and right
  3. Then splitting by 1's yields all sequences of 0'z
  4. Solution: The maximum length of 0's sub-strings

Solution 4

As suggested in the comments, itertools.groupby is efficient in grouping elements of an iterable like a string. You could approach it like this:

from itertools import groupby

def count_gap(x):
    b = "{0:b}".format(x)
    return max(len(list(v)) for k, v in groupby(b.strip("0")) if k == "0")

number = 123456
print(count_gap(number))

First we strip all zeroes from the ends, because a gap has to have on both ends a one. Then itertools.groupby groups ones and zeros and we extract the key (i.e. "0" or "1") together with a group (i.e. if we convert it into a list, it looks like "0000" or "11"). Next we collect the length for every group v, if k is zero. And from this we determine the largest number, i.e. the longest gap of zeroes amidst the ones.

Solution 5

I think the accepted answer dose not work when the input number is 32 (100000). Here is my solution:

def solution(N):
    res = 0
    st = -1
    for i in range(N.bit_length()):
        if N & (1 << i):
            if st != -1:
                res = max(res, i - st - 1)
            st = i

    return res
Share:
16,579
Admin
Author by

Admin

Updated on June 07, 2022

Comments

  • Admin
    Admin almost 2 years

    I would like to know if my implementation is efficient. I have tried to find the simplest and low complex solution to that problem using python.

    def count_gap(x):
        """
            Perform Find the longest sequence of zeros between ones "gap" in binary representation of an integer
    
            Parameters
            ----------
            x : int
                input integer value
    
            Returns
            ----------
            max_gap : int
                the maximum gap length
    
        """
        try:
            # Convert int to binary
            b = "{0:b}".format(x)
            # Iterate from right to lift 
            # Start detecting gaps after fist "one"
            for i,j in enumerate(b[::-1]):
                if int(j) == 1:
                    max_gap = max([len(i) for i in b[::-1][i:].split('1') if i])
                    break
        except ValueError:
            print("Oops! no gap found")
            max_gap = 0
        return max_gap
    

    let me know your opinion.

  • Mark Dickinson
    Mark Dickinson about 6 years
    I'd suggest using the int.bit_length method instead of the ceil(log2(...)) computation, to avoid corner-case errors due to rounding.
  • Admin
    Admin about 6 years
    You have said Your implementation converts the integer to a base two string then visits each character in the string but that is not completely correct as I break after detecting one then I split. Could you plz highlight why your implementation should be better in terms of time and memory complexity, please.
  • Jean-Paul Calderone
    Jean-Paul Calderone about 6 years
    You still have to visit each character. How else can split find the right place to split? It visits each character until it finds the value you supplied.
  • Admin
    Admin about 6 years
    Hi Jean, Your code is much slower than the one I provided. I will add the time complexity in the answer (as running time test code).
  • Jean-Paul Calderone
    Jean-Paul Calderone about 6 years
    Note that "complexity" is not the same as "wallclock runtime".
  • Jean-Paul Calderone
    Jean-Paul Calderone about 6 years
    Complexity wise, I think I've convinced myself that your solution is O(N) (so is mine) and I doubt it's possible to do better than that. So all that's left is constant factors and in Python those are often minimized by making sure you use built-in functionality (since the C implementation is typically much faster than anything you can do with Python code). On PyPy the story may be different.
  • Admin
    Admin about 6 years
    Thanks, Jean. In the beginning, i was afraid to publish my codes in StackOverflow (negative comments), but you really encouraged me to keep on going (publish and optimize). Good luck
  • Daniel McLaury
    Daniel McLaury about 6 years
    @Jean-PaulCalderone: It looks to me like your algorithm is actually O((log x)^2) where the OP's is O(log x), which leads to dramatic differences in runtime as the OP's test case has log(x) around half a million. The problem appears to be the calculation of 1 << i, which requires zeroing i/8 bytes of memory and consequently runs in O(i) time. The .format OP uses, on the other hand, presumably just loops through the bytes of x, converts each to an 8-byte string, and writes the result of the conversion into a buffer, which is an O(log x) operation that's done once.
  • Jean-Paul Calderone
    Jean-Paul Calderone about 6 years
    Nice catch. I thought about taking the 1 << i out and replacing it with a loop variable and loop_var << 1. Even with this change, my version is still slower than the OPs (except for very very small integers). Often, complexity doesn't matter in Python (even if you can figure out - what about the range, what about the b[::-1], etc) because finding some way to push a "more" complex operation off into C (format) still beats a "less" complex operation in byte code. My version goes through a whole byte code interpretation loop for each bit! Seriously big k being applied there.
  • Kushal Kadam
    Kushal Kadam over 5 years
    This works fine with any "Binary Gap " Lesson 1 codility
  • Nic3500
    Nic3500 over 5 years
    While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.
  • hellow
    hellow over 5 years
    Your code does not even run, because of the missing indention. Please fix that!
  • Kushal Kadam
    Kushal Kadam over 5 years
    @hellow, I had problems while posting . The code works absolutely fine .
  • Kushal Kadam
    Kushal Kadam over 5 years
    @Nic3500 I acknowledge, this is my first time :)
  • hellow
    hellow over 5 years
    Then please read stackoverflow.com/editing-help and stackoverflow.com/help/how-to-answer . There is a ton of help available how to format your answer properly, so please don't use that as an excuse ;) (I don't drive a car, before I took some driving lessons. That's (nearly) the same. Read the instructions first and post afterwards)
  • sepehr
    sepehr over 5 years
    Welcome to Stack Overflow! Thank you for the code snippet, which might provide some limited, immediate help. A proper explanation would greatly improve its long-term value by describing why this is a good solution to the problem, and would make it more useful to future readers with other similar questions. Please edit your answer to add some explanation, including the assumptions you've made.
  • Harsha Biyani
    Harsha Biyani over 5 years
    This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
  • Brian Wylie
    Brian Wylie over 5 years
    Super nice :thumbsup:
  • Lutaaya Huzaifah Idris
    Lutaaya Huzaifah Idris over 5 years
    Add explanation.
  • Encipher
    Encipher about 5 years
    This one liner really very efficient. But I got stuck in one point, what is the functionality of strip? I ran that code without using strip and it had same output. Thank you in advance.
  • Aivar Paalberg
    Aivar Paalberg about 5 years
    By definition binary gap is 'zeros between ones', therefore one should not consider trailing zeros as binary gap. Try to run both versions with int 32 (100000 in binary). With strip you get 0 (there is no binary gap) without strip you get 5 (which is incorrect as there is no 5 zeros between ones).
  • Mebin Joe
    Mebin Joe over 4 years
    You should edit your answer to add a minimal reproducible example demonstrating the problem.
  • pzaenger
    pzaenger over 4 years
    Can you explain your code? Dropping code without doing so might not be that helpful.
  • Brian Tompsett - 汤莱恩
    Brian Tompsett - 汤莱恩 about 4 years
    Please explain why your answer is better than the others. This will help people learn from your answer.
  • quassy
    quassy almost 4 years
    If you are looking for performance: bin(), which @Orenico uses in his answers, is quite a bit faster than format(N, 'b').
  • Aivar Paalberg
    Aivar Paalberg almost 4 years
    @quassy - please define 'quite a bit faster'. Casual check with timeit & Python 3.8 shows that format() is little bit faster than bin()[2:]. Same casual check also indicates that one can claim f"{N:b}" to be "quite a bit faster".
  • quassy
    quassy over 3 years
    Testing my implementation I see small but probably not significant differences for testing all numbers from 1 to 21474836 (.format() took 26.1s; format() took 25.6s; bin() took 24.2s; also for bigger ranges). This seems to confirm other observations.
  • Admin
    Admin over 2 years
    Please provide additional details in your answer. As it's currently written, it's hard to understand your solution.
  • General Grievance
    General Grievance over 2 years
    Does this offer any benefit over other solutions offered here? Please edit to explain.
  • Henke
    Henke over 2 years
    If you want to run the corresponding Codility test, then just rename binary_gap as solution. This solution scores as 100% correct and it passes 15 of 15 tests.