Positive integer from Python hash() function

python hash cross-platform

29,667

Solution 1

Using sys.maxsize:

>>> import sys
>>> sys.maxsize
9223372036854775807L
>>> hash('asdf')
-618826466
>>> hash('asdf') % ((sys.maxsize + 1) * 2)
18446744073090725150L

Alternative using ctypes.c_size_t:

>>> import ctypes
>>> ctypes.c_size_t(hash('asdf')).value
18446744073090725150L

Solution 2

Just using sys.maxsize is wrong, because it is 2**n - 1 and not 2**n. The fix is easy enough:

import sys

h = hash(obj)
h += sys.maxsize + 1

For performance reasons you may want to split the sys.maxsize + 1 into two separate additions, to avoid temporarily creating a long integer for most negative numbers, although I doubt this is going to matter much.
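To check that adding sys.maxsize + 1 covers the whole unsigned range, the endpoints can be tested directly. This is only a sketch; shift_hash and OFFSET are hypothetical names, not part of the answer:

```python
import sys

OFFSET = sys.maxsize + 1  # 2**63 on 64-bit, 2**31 on 32-bit

def shift_hash(h):
    """Shift a signed hash into 0 .. 2 * sys.maxsize + 1 (hypothetical helper)."""
    return h + OFFSET

lowest, highest = -sys.maxsize - 1, sys.maxsize
assert shift_hash(lowest) == 0
assert shift_hash(highest) == 2 * sys.maxsize + 1
# The shift preserves ordering: -1 lands directly below 0.
assert shift_hash(-1) + 1 == shift_hash(0)
```

Note that the shift changes every value, including non-negative ones (0 becomes sys.maxsize + 1), so it does not produce the same numbers as a modulus or a cast to an unsigned type.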

Solution 3

(Edit: at first I thought you always wanted a 32-bit value)

Simply AND it with a mask of the desired size. Generally sys.maxsize will already be such a mask, since it's a power of 2 minus 1.

import sys
assert (sys.maxsize & (sys.maxsize+1)) == 0 # checks that maxsize+1 is a power of 2 

new_hash = hash(obj) & sys.maxsize
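The width of the mask decides how many bits survive: sys.maxsize keeps one bit fewer than the platform word (63 or 31 bits). A small sketch of the difference, with mask_hash and FULL_MASK as hypothetical names introduced only for this example:

```python
import sys

FULL_MASK = 2 * sys.maxsize + 1  # all word bits set: 2**64 - 1 on 64-bit

def mask_hash(h, mask=FULL_MASK):
    """AND a signed hash with a bit mask (hypothetical helper)."""
    return h & mask

assert (FULL_MASK & (FULL_MASK + 1)) == 0          # FULL_MASK + 1 is a power of 2
assert mask_hash(-1) == FULL_MASK                  # full-width result
assert mask_hash(-1, sys.maxsize) == sys.maxsize   # one bit fewer
assert mask_hash(-1, 0xFFFFFFFF) == 2**32 - 1      # fixed 32-bit result
```

Python treats negative integers as two's complement with unlimited sign extension for bitwise operations, which is why the AND yields a non-negative result.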

Solution 4

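How about:

import sys

h = hash(o)
if h < 0:
    h += sys.maxsize + 1

This uses sys.maxsize to be portable between 32- and 64-bit systems. The result lies in the range 0 to sys.maxsize, so it has 63 (or 31) bits, and a negative hash can map to the same value as a non-negative one.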
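Adding an offset only to negative values folds them into the range that non-negative hashes already occupy. The following sketch compares that with an unconditional shift; both function names are hypothetical and exist only for the comparison:

```python
import sys

def conditional_shift(h, offset):
    """Add offset only to negative values (hypothetical helper)."""
    return h + offset if h < 0 else h

def unconditional_shift(h):
    """Add sys.maxsize + 1 to every value, negative or not."""
    return h + sys.maxsize + 1

big = sys.maxsize + 1
# Two different hashes collide under the conditional form.
assert conditional_shift(-1, big) == conditional_shift(sys.maxsize, big)
# With an offset of sys.maxsize alone, the lowest hash stays negative.
assert conditional_shift(-sys.maxsize - 1, sys.maxsize) == -1
# The unconditional form maps distinct inputs to distinct outputs.
assert unconditional_shift(-1) != unconditional_shift(sys.maxsize)
```

Whether such collisions matter depends on the use; for seeding a generator they only reduce the number of distinct seeds.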

Author by

Exectron

Updated on May 10, 2020

Comments

Exectron about 4 years

I want to use the Python hash() function to get integer hashes from objects. But built-in hash() can give negative values, and I want only positive. And I want it to work sensibly on both 32-bit and 64-bit platforms.

I.e. on 32-bit Python, hash() can return an integer in the range -2**31 to 2**31 - 1. On 64-bit systems, hash() can return an integer in the range -2**63 to 2**63 - 1.

But I want a hash in the range 0 to 2**32-1 on 32-bit systems, and 0 to 2**64-1 on 64-bit systems.

What is the best way to convert the hash value to its equivalent positive value within the range of the 32- or 64-bit target platform?

(Context: I'm trying to make a new random.Random style class. According to the random.Random.seed() docs, the seed "optional argument x can be any hashable object." So I'd like to duplicate that functionality, except that my seed algorithm can't handle negative integer values, only positive.)
Voo almost 11 years

There's no reason whatsoever to use a modulus here. I mean sure it works but it's less efficient and harder to read.
falsetru almost 11 years

Your code could produce duplicate values. Try -0x7fffffffffffffff + sys.maxsize + 1 on a 64-bit system.
Voo almost 11 years

Ah yes true enough, shouldn't be conditional. Where's my head today?
falsetru almost 11 years

Why /2 instead of *2?
falsetru almost 11 years

@CraigMcQueen, I added an alternative way. Check it out.
Voo almost 11 years

Actually it's neither / nor * 2. We have a range of [-2**31, 2**31-1], but we want [-2**31+2**31, 2**31-1+2**31] (the example is for a 32-bit system). So it's just an addition of the magnitude of the lower boundary (2**31). Really confused today I am.
Exectron almost 11 years

That's a good idea, although I want a 64-bit (or 32-bit) value, not a 63-bit (or 31-bit) value.
Mark Ransom almost 11 years

@CraigMcQueen, sorry I thought that maxsize was already the right size.