Python - class __hash__ method and set
Solution 1
Your reading is incorrect. The __eq__
method is used for equality checks. The documents just state that the __hash__
value must also be the same for 2 objects a
and b
for which a == b
(i.e. a.__eq__(b)
) is true.
This is a common logic mistake: a == b
being true implies that hash(a) == hash(b)
is also true. However, an implication does not necessarily mean equivalence, that in addition to the prior, hash(a) == hash(b)
would mean that a == b
.
To make all instances of MyClass
compare equal to each other, you need to provide an __eq__
method for them; otherwise Python will compare their identities instead. This might do:
class MyClass(object):
def __hash__(self):
return 0
def __eq__(self, other):
# another object is equal to self, iff
# it is an instance of MyClass
return isinstance(other, MyClass)
Now:
>>> result = set()
>>> result.add(MyClass())
>>> result.add(MyClass())
1
In reality you'd base the __hash__
on those properties of your object that are used for __eq__
comparison, for example:
class Person
def __init__(self, name, ssn):
self.name = name
self.ssn = ssn
def __eq__(self, other):
return isinstance(other, Person) and self.ssn == other.ssn
def __hash__(self):
# use the hashcode of self.ssn since that is used
# for equality checks as well
return hash(self.ssn)
p = Person('Foo Bar', 123456789)
q = Person('Fake Name', 123456789)
print(len({p, q}) # 1
Solution 2
Sets need two methods to make an object hashable: __hash__
and __eq__
. Two instances must return the same hash value when they are considered equal. An instance is considered already present in a set if both the hash is present in the set and the instance is considered equal to one of the instances with that same hash in the set.
Your class doesn't implement __eq__
, so the default object.__eq__
is used instead, which only returns true if obj1 is obj2
is also true. In other words, two instances are only considered equal if they are the exact same instance.
Just because their hashes match, doesn't make them unique as far as a set is concerned; even objects with different hashes can end up in the same hash table slot, as the modulus of the hash against the table size is used.
Add your a custom __eq__
method that returns True
when two instances are supposed to be equal:
def __eq__(self, other):
if not isinstance(other, type(self)):
return False
# all instances of this class are considered equal to one another
return True
![Aida.Mirabadi](https://i.stack.imgur.com/BidMx.jpg?s=256&g=1)
Comments
-
Aida.Mirabadi about 2 years
I'm using
set()
and__hash__
method ofpython
class to prevent adding same hash object in set. According to python data-model document,set()
consider same hash object as same object and just add them once.But it behaves different as below:
class MyClass(object): def __hash__(self): return 0 result = set() result.add(MyClass()) result.add(MyClass()) print(len(result)) # len = 2
While in case of string value, it works correctly.
result.add('aida') result.add('aida') print(len(result)) # len = 1
My question is: why the same hash objects are not same in set?