Convert a very large number from decimal string to binary representation?

12,248

If this is a genuine problem, there are plenty of BigNum libraries out there to assist, such as the MPIR library.

If it's something where you can't use a third-party library, it's still relatively easy. You don't actually need a complex BigNum library for this, you only need one operation: divide by two.

Here's how you do it. Start with an empty stack of binary digits. Then loop until the number is "0" (yes, that's still a string). If the last digit of the number is odd, push 1 on to the stack, otherwise push 0. Then divide the number by two and restart the loop.

Once the loop is finished (number is "0"), pop the digits off the stack one at a time and print them. There you go.


Oh, yeah, the divide-by-two, that is a rather important piece of the puzzle :-)

Let's start with "12345". Here's the process you follow, in pseudo-code.

Set next_additive to 0.
For every digit in number (starting at the left):
    Set additive to next_additive.
    If the digit is odd, set next_additive to 5, else set it to 0.
    Divide the digit by two (truncating) then add additive.
Remove leading zero if necessary (if it starts with 0 but is not just 0).

This can be done by processing the actual string one character at a time.

  • Starting with 1 (from 12345), additive is 0, number is odd, so next_additive is 5. Divide 1 by 2 and add additive of 0, you get 0: 02345.

  • Next digit 2, additive is 5, number is even, so next_additive is 0. Divide 2 by 2 and add additive of 5, you get 6: 06345.

  • Next digit 3, additive is 0, number is odd, so next_additive is 5. Divide 3 by 2 and add additive of 0, you get 1: 06145.

  • Next digit 4, additive is 5, number is even, so next_additive is 0. Divide 4 by 2 and add additive of 5, you get 7: 06175.

  • Next digit 5, additive is 0, number is odd, so next_additive is 5. Divide 5 by 2 and add additive of 0, you get 2: 06172.

  • Strip off leading zeros: 6172. Ignore the next additive since you're truncating the result.

And there you have it: 12345 / 2 = 6172.


By way of example, here's a Python approach to implementing this algorithm as follows. First the support routine for checking if a string-number is odd (keep in mind this isn't meant to be Pythonic code, it's just to show how it could be done - there's almost certainly better ways to do this in Python but that won't necessarily map well to another language):

def oddsToOne(s):
    if s.endswith('1'): return 1
    if s.endswith('3'): return 1
    if s.endswith('5'): return 1
    if s.endswith('7'): return 1
    if s.endswith('9'): return 1
    return 0

Then another support routine for dividing a string-number by two:

def divByTwo(s):
    new_s = ''
    add = 0

    for ch in s:
        new_dgt = (ord(ch) - ord('0')) // 2 + add
        new_s = '%s%d' % (new_s, new_dgt)
        add = oddsToOne(ch) * 5
    
    if new_s != '0' and new_s.startswith('0'):
        new_s = new_s[1:]
    
    return new_s

And, finally, some actual code to make a binary string from the decimal string:

num = '12345'

if num == '0':
    stack = '0'
else:
    stack = ''
    while num != '0':
        stack = '%d%s'%(oddsToOne(num), stack)
        num = divByTwo (num)

print(stack)

Note that if you wanted to actually use this to populate real bits (rather than make a string of bits), it's a simple matter to change what happens in the if and else clauses.

As stated, it's probably not the most efficient or beautiful Python code you could come up with but it's simply meant to show the process, not be some well-engineered production-ready piece of code. The output is (with some added stuff below to show what's going on):

12345
11000000111001
||      |||  |
||      |||  +-     1
||      ||+----     8
||      |+-----    16
||      +------    32
|+-------------  4096
+--------------  8192
                =====
                12345

Because this works on the string representation of the numbers, there is no arbitrary numeric limit such as the size of a 64-bit integer. Some example values are (slightly reformatted into 32-digit chunks for readability):

123456781234567812345678
=> 11010001001001001101100000011011
   01110110001110110010110110101111
   0111101001110

99999999999999999999999999999999
99999999999999999999999999999999
99999999999999999999999999999999
9999
=> 10010010010011010110100100101100
   10100110000110111110011101011000
   01011001001111000010011000100110
   01110000010111111001110001010110
   01110010000001000111000100001000
   11010011111001010101010110010010
   00011000010001010100000101110100
   01111000011111111111111111111111
   11111111111111111111111111111111
   11111111111111111111111111111111
   1111111111111
Share:
12,248
frodo
Author by

frodo

Updated on July 26, 2022

Comments

  • frodo
    frodo almost 2 years

    I have a very big number, on the order of a thousand decimal digits, and I have to convert this to its binary representation. The numbers are stored as strings.

    Since few languages have a basic data type to handle numbers this big, I see no easy way to turn this into an integral value for which I could convert it.

    Could someone please help me out here? What would be a viable approach for doing this?

  • ivenxu
    ivenxu over 10 years
    Do you have any idea to convert the binary back to decimal string?
  • Denis
    Denis about 10 years
    @paxdiablo, this is a great answer. Thank you.
  • derek
    derek over 8 years
    this wont work if the input decimal string is very large. Because if the decimal string is very large, it cannot be represented as a number and therefore it cannot be divided by 2. try this: "123456781234567812345678".
  • paxdiablo
    paxdiablo over 8 years
    @derek, all operations are performed on strings in this answer (see my text yes, that's still a string) so there is no arbitrary numeric limit placed on the values. The only limit is storage space. In fact, I'll show the output for your exact case (and a much bigger one).
  • derek
    derek over 8 years
    You are right. I did not notice you are operating on strings. Good solution!
  • Jan Jůna
    Jan Jůna over 7 years
    There is missing one small thing: if num == '0': stack = '0'
  • paxdiablo
    paxdiablo over 7 years
    Good point, @JanJůna, added that in.
  • Kerrek SB
    Kerrek SB about 7 years
    I think your pseudocode is missing the use of additive.
  • paxdiablo
    paxdiablo about 7 years
    @KerrekSB, good catch, I appear to have left it out of the pseudo-code yet still use it in the steps - have added that in.
  • Doot
    Doot about 4 years
    Is there full documentation to this algorithm? I can't seem to find one online...
  • poby
    poby about 4 years
    This is brilliant! I needed to convert a long decimal string to a bunch of int64s and this method worked beautifully!