How to convert a SHA-256 hash object to an integer and pack it into a bytearray in Python?

15,365

Solution 1

The simplest way in Python 2 to get the integer value of the SHA-256 digest is via the hexdigest. Alternatively, you can loop over the bytearray constructed from the binary digest. Both methods are illustrated below.

import hashlib

hashobj = hashlib.sha256('something')
val_hex = hashobj.hexdigest()
print val_hex

# Build bytearray from binary digest
val_bytes = bytearray(hashobj.digest())
print ''.join(['%02x' % byte for byte in val_bytes])

# Get integer value of digest from the hexdigest
val_int = int(val_hex, 16)
print '%064x' % val_int

# Get integer value of digest from the bytearray
n = 0
for byte in val_bytes:
    n = n<<8 | byte
print '%064x' % n

output

3fc9b689459d738f8c88a3a48aa9e33542016b7a4052e001aaa536fca74813cb
3fc9b689459d738f8c88a3a48aa9e33542016b7a4052e001aaa536fca74813cb
3fc9b689459d738f8c88a3a48aa9e33542016b7a4052e001aaa536fca74813cb
3fc9b689459d738f8c88a3a48aa9e33542016b7a4052e001aaa536fca74813cb
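For readers on Python 3, the same two approaches can be sketched as follows. This is a port under the assumption that the input is supplied as bytes (b'something'); the variable names mirror the Python 2 listing above.

```python
import hashlib

# Python 3: the input must be bytes, not str
hashobj = hashlib.sha256(b'something')
val_hex = hashobj.hexdigest()

# Integer value of the digest from the hexdigest
val_int = int(val_hex, 16)

# Integer value of the digest from the bytearray, one byte at a time.
# Shifting left by 8 before OR-ing in each byte treats the digest as big-endian.
val_bytes = bytearray(hashobj.digest())
n = 0
for byte in val_bytes:
    n = n << 8 | byte

print('%064x' % val_int)
print('%064x' % n)
```

Both print calls should show the same 64-digit hex string as val_hex.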

In Python 3, we can't pass a plain text string to the hashlib hash function; we must pass a bytes string or a bytearray, e.g.

b'something' 

or

'something'.encode('utf-8')

or

bytearray('something', 'utf-8')

We can simplify the second version to

'something'.encode()

since UTF-8 is the default encoding for str.encode (and bytes.decode()).
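As a quick check, the sketch below hashes each of the input forms listed above and confirms they give the same digest, and that a plain str is rejected. The names candidates and digests are only illustrative.

```python
import hashlib

# The equivalent ways of supplying the same UTF-8 encoded input
candidates = [
    b'something',
    'something'.encode('utf-8'),
    'something'.encode(),
    bytearray('something', 'utf-8'),
]

# A set collapses identical hex digests into one entry
digests = {hashlib.sha256(data).hexdigest() for data in candidates}
print(len(digests))  # 1

# Passing a str directly raises TypeError in Python 3
try:
    hashlib.sha256('something')
except TypeError as exc:
    print('str input rejected:', type(exc).__name__)
```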

To perform the conversion to int, any of the above techniques can be used, but we also have an additional option: the int.from_bytes method. To get the correct integer we need to tell it to interpret the bytes as a big-endian number:

import hashlib

hashobj = hashlib.sha256(b'something')
val = int.from_bytes(hashobj.digest(), 'big')
print('%064x' % val)

output

3fc9b689459d738f8c88a3a48aa9e33542016b7a4052e001aaa536fca74813cb
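To go the other way and pack the integer back into a bytearray, as the question asks, one option in Python 3 is int.to_bytes. The following is a minimal sketch; the name packed is an assumption made for illustration.

```python
import hashlib

digest = hashlib.sha256(b'something').digest()
val = int.from_bytes(digest, 'big')

# Pack the integer into exactly 32 bytes, big-endian.
# to_bytes raises OverflowError if the value does not fit in 32 bytes.
packed = bytearray(val.to_bytes(32, 'big'))

print(len(packed))              # 32
print(bytes(packed) == digest)  # True
```

Using a fixed length of 32 keeps any leading zero bytes, so the round trip reproduces the original digest.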

Solution 2

A bytearray is not meant to hold the whole value in a single cell; each cell holds exactly one byte.

.digest() already returns a byte string, so you can pass it to bytearray directly:

>>> import hashlib
>>> hashobj = hashlib.sha256('something')
>>> val = hashobj.digest()
>>> print bytearray(val)
?ɶ�E�s������5Bkz@R���6��H�
>>> print repr(bytearray(val))
bytearray(b'?\xc9\xb6\x89E\x9ds\x8f\x8c\x88\xa3\xa4\x8a\xa9\xe35B\x01kz@R\xe0\x01\xaa\xa56\xfc\xa7H\x13\xcb')
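The session above is Python 2. A rough Python 3 equivalent, assuming the input is given as bytes, might look like this; the name cells is only illustrative.

```python
import hashlib

val = hashlib.sha256(b'something').digest()  # bytes object of length 32
cells = bytearray(val)

print(len(cells))                         # 32
print(all(0 <= c <= 255 for c in cells))  # True: each cell is one byte
print(cells.hex() == val.hex())           # True: same content, mutable container
```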

Solution 3

I did it this way (Python 3):

import hashlib

x = 'input'
my_hash = int.from_bytes(hashlib.sha256(x.encode('utf-8')).digest(), 'big')
print(my_hash)

# 91106456816457796232999629894661022820411437165637657988648530670402435361824

Let's check the size of the hash in bits:

print(len("{0:b}".format(my_hash)))

# 256

Perfect!
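One caveat on the length check: the unpadded binary string is 256 characters long only when the top bit of the digest is set. A digest with leading zero bits produces a shorter string, so a fixed-width format is the safer check. A small sketch, using int.bit_length and a zero-padded format specifier:

```python
import hashlib

my_hash = int.from_bytes(hashlib.sha256(b'input').digest(), 'big')

# bit_length() is at most 256 and is smaller when leading bits are zero
print(my_hash.bit_length() <= 256)    # True

# Zero-padding to 256 binary digits gives a fixed width for any digest
print(len(format(my_hash, '0256b')))  # 256

# A tiny value shows the difference between the two formats
print(len(format(1, 'b')))            # 1
print(len(format(1, '0256b')))        # 256
```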

Author: Luke

Updated on July 25, 2022

Comments

  • Luke
    Luke over 1 year

    I want to convert a hash256 object to a 32-byte integer first, and then pack it into a bytearray.

    >>> import hashlib
    >>> hashobj = hashlib.sha256('something')
    >>> val_hex = hashobj.hexdigest()
    >>> print val_hex
    3fc9b689459d738f8c88a3a48aa9e33542016b7a4052e001aaa536fca74813cb
    >>> print len(val_hex)
    64
    

    The hex string is 64 characters long instead of 32 bytes, which is not what I want.

    >>> val = hashobj.digest()
    >>> print val
    ?ɶ?E?s????????5Bkz@R???6??H?
    >>> print len(val)
    32
    

    This is a 32-byte string and I want to convert it to a 32-byte integer.

    I get an error message when I try:

    >>> val_int = int(val, 10)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    ValueError: invalid literal for int() with base 10: '?\xc9\xb6\x89E\x9ds\x8f\x8c\x88\xa3\xa4\x8a\xa9\xe35B\x01kz@R\xe0\x01\xaa\xa56\xfc\xa7H\x13\xcb'
    

    What should I do to get my int_val?

    And how can I use struct to pack it (32 bytes) into a bytearray? The longest format I found in the Python struct documentation is 'Q', which is only 8 bytes.

    Thank you very much.

  • Luke
    Luke almost 8 years
    Great, now I know how to put it into a bytearray. My other question is: how can I get a 32-byte (256-bit) number from hashobj? I suppose the sha256 output is a 256-bit number hash_num_in_int (from 0 to 2^32) and I can do some operations on it, such as new_num = hash_num_in_int / 100. Thanks for your help.
  • Valentin Lorentz
    Valentin Lorentz almost 8 years
    I don't see an idiomatic way to do this, other than using a for loop to multiply and add items one by one. But I don't see any use case of doing arithmetic on a hash.
  • Luke
    Luke almost 8 years
    Oh, I am making a cumulative crypto puzzle game. For example, take some data, hash it, and perform some math operations on it; call the result v1. Then take some new data, concatenate it with v1, hash it, and perform some math operations; call the result v2. And so on.
  • Valentin Lorentz
    Valentin Lorentz almost 8 years
    Actually, I found an idiomatic way: int(hashobj.hexdigest(), 16)
  • Valentin Lorentz
    Valentin Lorentz almost 8 years
    And a more efficient / semantically correct one: int.from_bytes(hashobj.digest(), byteorder='little') (but only for Python 3)
  • Luke
    Luke almost 8 years
    I tried but this gives a 64-byte very very large number instead of 32-byte integer.
  • Valentin Lorentz
    Valentin Lorentz almost 8 years
    Use a modulo or a division
  • Luke
    Luke almost 8 years
    val_bytes is a 32-byte bytearray. So the conversion should be an integer number between 0 and 2**32(4294967296). Am I right? But n in your code is a very large number when I print it out.
  • PM 2Ring
    PM 2Ring almost 8 years
    @LuqinWang: 32 bytes is 256 bits, so the integer value of a SHA-256 checksum is in range(0, 2**256) == range(0, (2**8)**32).
  • Luke
    Luke almost 8 years
    This works fine to get a 32-byte integer if I just want a fixed-length number and don't care what the number is. However, I highly doubt that int(hashobj.hexdigest(), 16) % (2 ** 32) equals the original 32_byte_val from hashobj.digest().
  • PM 2Ring
    PM 2Ring almost 8 years
    @LuqinWang: My code prints the calculated integers in hexadecimal notation. As you can see, both calculation methods result in the same hex string as given by .hexdigest
  • Luke
    Luke almost 8 years
    Thanks for your clarification!
  • Valentin Lorentz
    Valentin Lorentz almost 8 years
    Why? You could also use slicing if you prefer: int(hashobj.hexdigest()[0:32*2], 16) (*2 is because there are 2 digits for a byte in hexadecimal notation)
  • Luke
    Luke almost 8 years
    I got it answered by PM 2Ring. I mistakenly thought the number was 32-bit. Instead it is 256-bit, which is exactly what int(hashobj.hexdigest(), 16) gives. Thank you, Valentin Lorentz.
  • PM 2Ring
    PM 2Ring almost 8 years
    BTW, that Python 3 version should be int.from_bytes(hashobj.digest(), byteorder='big')
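On the struct question raised in the thread: 'Q' is indeed limited to 8 bytes, but a 256-bit integer can be split into four 64-bit words and packed with a repeat count. The sketch below assumes Python 3 and big-endian order; the names words and packed are illustrative, and this is one possible approach rather than the method used by any of the answers.

```python
import hashlib
import struct

digest = hashlib.sha256(b'something').digest()
n = int.from_bytes(digest, 'big')

# Split the 256-bit integer into four 64-bit words, most significant first
mask = (1 << 64) - 1
words = [(n >> shift) & mask for shift in (192, 128, 64, 0)]

# '>4Q' packs four unsigned 64-bit integers in big-endian order (32 bytes)
packed = bytearray(struct.pack('>4Q', *words))
print(bytes(packed) == digest)  # True

# If only the raw bytes are needed, the '32s' format packs the digest as-is
print(struct.pack('32s', digest) == digest)  # True
```

In practice int.to_bytes(32, 'big') gives the same 32 bytes with less code; struct is mainly useful when the digest is one field in a larger binary record.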