Elegant ways to support equivalence ("equality") in Python classes
Solution 1
Consider this simple problem:
class Number:
    def __init__(self, number):
        self.number = number

n1 = Number(1)
n2 = Number(1)
n1 == n2 # False -- oops
So, Python by default uses the object identifiers for comparison operations:
id(n1) # 140400634555856
id(n2) # 140400634555920
Overriding the __eq__ function seems to solve the problem:
def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return False
n1 == n2 # True
n1 != n2 # True in Python 2 -- oops, False in Python 3
In Python 2, always remember to override the __ne__ function as well, as the documentation states:
"There are no implied relationships among the comparison operators. The truth of x==y does not imply that x!=y is false. Accordingly, when defining __eq__(), one should also define __ne__() so that the operators will behave as expected."
def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    return not self.__eq__(other)
n1 == n2 # True
n1 != n2 # False
In Python 3, this is no longer necessary, as the documentation states:
"By default, __ne__() delegates to __eq__() and inverts the result unless it is NotImplemented. There are no other implied relationships among the comparison operators, for example, the truth of (x<y or x==y) does not imply x<=y."
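As a quick check of the Python 3 behavior quoted above, here is a minimal sketch (Python 3 only). The class mirrors the Number example and deliberately defines no __ne__:

```python
class Number:
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented


same_a = Number(1)
same_b = Number(1)
different = Number(2)

# No __ne__ is defined; Python 3 derives != by inverting __eq__.
print(same_a == same_b)     # True
print(same_a != same_b)     # False
print(same_a != different)  # True
```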
But that does not solve all our problems. Let’s add a subclass:
class SubNumber(Number):
    pass

n3 = SubNumber(1)
n1 == n3 # False for classic-style classes -- oops, True for new-style classes
n3 == n1 # True
n1 != n3 # True for classic-style classes -- oops, False for new-style classes
n3 != n1 # False
Note: Python 2 has two kinds of classes:
- classic-style (or old-style) classes, which do not inherit from object and are declared as class A:, class A(): or class A(B): where B is a classic-style class;
- new-style classes, which do inherit from object and are declared as class A(object): or class A(B): where B is a new-style class.
Python 3 has only new-style classes, which are declared as class A:, class A(object): or class A(B):.
For classic-style classes, a comparison operation always calls the method of the first operand, while for new-style classes, it always calls the method of the subclass operand, regardless of the order of the operands.
So here, if Number is a classic-style class:
- n1 == n3 calls n1.__eq__;
- n3 == n1 calls n3.__eq__;
- n1 != n3 calls n1.__ne__;
- n3 != n1 calls n3.__ne__.
And if Number is a new-style class:
- both n1 == n3 and n3 == n1 call n3.__eq__;
- both n1 != n3 and n3 != n1 call n3.__ne__.
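The new-style dispatch order can be observed directly in Python 3. This is only a sketch: the calls list is a hypothetical helper added here to record which operand's __eq__ runs, and it is not part of the original classes.

```python
calls = []  # hypothetical helper: records the type whose __eq__ was invoked


class Number:
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        calls.append(type(self).__name__)
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented


class SubNumber(Number):
    pass


n1 = Number(1)
n3 = SubNumber(1)

n1 == n3  # the right operand is an instance of a subclass, so it is tried first
n3 == n1  # the left operand is the subclass instance
print(calls)  # ['SubNumber', 'SubNumber']
```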
To fix the non-commutativity issue of the == and != operators for Python 2 classic-style classes, the __eq__ and __ne__ methods should return the NotImplemented value when an operand type is not supported. The documentation defines the NotImplemented value as:
"Numeric methods and rich comparison methods may return this value if they do not implement the operation for the operands provided. (The interpreter will then try the reflected operation, or some other fallback, depending on the operator.) Its truth value is true."
In this case the operator delegates the comparison operation to the reflected method of the other operand. The documentation defines reflected methods as:
"There are no swapped-argument versions of these methods (to be used when the left argument does not support the operation but the right argument does); rather, __lt__() and __gt__() are each other’s reflection, __le__() and __ge__() are each other’s reflection, and __eq__() and __ne__() are their own reflection."
The result looks like this:
def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return NotImplemented

def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    x = self.__eq__(other)
    if x is NotImplemented:
        return NotImplemented
    return not x
Returning the NotImplemented value instead of False is the right thing to do even for new-style classes if commutativity of the == and != operators is desired when the operands are of unrelated types (no inheritance).
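To illustrate the unrelated-types case, here is a small Python 3 sketch. Celsius is a hypothetical class invented for this example; it knows how to compare itself to a Number, while Number knows nothing about Celsius and returns NotImplemented, which lets Python fall back to the reflected method.

```python
class Number:
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented  # let the other operand have a try


class Celsius:  # hypothetical class, unrelated to Number by inheritance
    def __init__(self, degrees):
        self.degrees = degrees

    def __eq__(self, other):
        if isinstance(other, Celsius):
            return self.degrees == other.degrees
        if isinstance(other, Number):
            return self.degrees == other.number
        return NotImplemented


n = Number(1)
c = Celsius(1)

print(c == n)         # True: Celsius.__eq__ handles Number directly
print(n == c)         # True: Number.__eq__ declines, then Celsius.__eq__ is tried
print(n == object())  # False: both sides decline, Python falls back to identity
```

Had Number.__eq__ returned False instead, n == c would be False while c == n would be True.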
Are we there yet? Not quite. How many unique numbers do we have?
len(set([n1, n2, n3])) # 3 -- oops
Sets use the hashes of objects, and by default Python returns the hash of the identifier of the object. Let’s try to override it:
def __hash__(self):
    """Overrides the default implementation"""
    return hash(tuple(sorted(self.__dict__.items())))
len(set([n1, n2, n3])) # 1
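One possible alternative, sketched under the assumption that equality depends only on the number attribute, is to hash exactly what __eq__ compares. Unlike hashing the whole __dict__, this keeps working when another attribute holds an unhashable value such as a list. The tags attribute below is hypothetical and exists only to show that.

```python
class Number:
    def __init__(self, number, tags=None):
        self.number = number
        # Hypothetical extra attribute. A list is unhashable, so hashing
        # tuple(sorted(self.__dict__.items())) would raise TypeError here.
        self.tags = tags or []

    def __eq__(self, other):
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented

    def __hash__(self):
        # Hash only the field that __eq__ compares.
        return hash(self.number)


class SubNumber(Number):
    pass


n1 = Number(1, tags=["a"])
n2 = Number(1)
n3 = SubNumber(1)
print(len({n1, n2, n3}))  # 1
```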
The end result looks like this (I added some assertions at the end for validation):
class Number:

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        """Overrides the default implementation"""
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented

    def __ne__(self, other):
        """Overrides the default implementation (unnecessary in Python 3)"""
        x = self.__eq__(other)
        if x is not NotImplemented:
            return not x
        return NotImplemented

    def __hash__(self):
        """Overrides the default implementation"""
        return hash(tuple(sorted(self.__dict__.items())))

class SubNumber(Number):
    pass

n1 = Number(1)
n2 = Number(1)
n3 = SubNumber(1)
n4 = SubNumber(4)
assert n1 == n2
assert n2 == n1
assert not n1 != n2
assert not n2 != n1
assert n1 == n3
assert n3 == n1
assert not n1 != n3
assert not n3 != n1
assert not n1 == n4
assert not n4 == n1
assert n1 != n4
assert n4 != n1
assert len(set([n1, n2, n3, ])) == 1
assert len(set([n1, n2, n3, n4])) == 2
Solution 2
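You need to be careful with inheritance. The session below is from Python 2, where Foo is a classic-style class; in Python 3 the subclass operand's __eq__ is tried first, so both comparisons return False: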
>>> class Foo:
...     def __eq__(self, other):
...         if isinstance(other, self.__class__):
...             return self.__dict__ == other.__dict__
...         else:
...             return False
...
>>> class Bar(Foo): pass
...
>>> b = Bar()
>>> f = Foo()
>>> f == b
True
>>> b == f
False
Check types more strictly, like this:
def __eq__(self, other):
    if type(other) is type(self):
        return self.__dict__ == other.__dict__
    return False
Besides that, your approach will work fine; that's what special methods are there for.
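A runnable sketch of the stricter check, for illustration only. The item attribute is borrowed from the question's Foo class; with an exact type match the result no longer depends on operand order.

```python
class Foo:
    def __init__(self, item):
        self.item = item

    def __eq__(self, other):
        # Exact type match: a subclass instance never equals a base instance.
        if type(other) is type(self):
            return self.__dict__ == other.__dict__
        return False


class Bar(Foo):
    pass


f = Foo(1)
b = Bar(1)
print(f == b, b == f)    # False False
print(Foo(1) == Foo(1))  # True
```

Whether a Bar should ever equal a Foo is a design decision; this version simply answers "no" in both directions.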
Solution 3
The way you describe is the way I've always done it. Since it's totally generic, you can always break that functionality out into a mixin class and inherit it in classes where you want that functionality.
class CommonEqualityMixin(object):

    def __eq__(self, other):
        return (isinstance(other, self.__class__)
            and self.__dict__ == other.__dict__)

    def __ne__(self, other):
        return not self.__eq__(other)

class Foo(CommonEqualityMixin):

    def __init__(self, item):
        self.item = item
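A short usage sketch of the mixin (shown under Python 3; the mixin is repeated so the snippet runs on its own). One thing worth knowing: in Python 3, a class that defines __eq__ without __hash__ gets __hash__ set to None, so instances of Foo are unhashable unless __hash__ is defined too.

```python
class CommonEqualityMixin(object):
    def __eq__(self, other):
        return (isinstance(other, self.__class__)
                and self.__dict__ == other.__dict__)

    def __ne__(self, other):
        return not self.__eq__(other)


class Foo(CommonEqualityMixin):
    def __init__(self, item):
        self.item = item


print(Foo(1) == Foo(1))  # True
print(Foo(1) != Foo(2))  # True
print(Foo(1) != 1)       # True: an int is not a Foo

try:
    hash(Foo(1))
    hashable = True
except TypeError:
    hashable = False
print(hashable)  # False in Python 3
```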
Solution 4
Not a direct answer but seemed relevant enough to be tacked on as it saves a bit of verbose tedium on occasion. Cut straight from the docs...
Given a class defining one or more rich comparison ordering methods, this class decorator supplies the rest. This simplifies the effort involved in specifying all of the possible rich comparison operations:
The class must define one of __lt__(), __le__(), __gt__(), or __ge__(). In addition, the class should supply an __eq__() method.
New in version 2.7
from functools import total_ordering

@total_ordering
class Student:
    def __eq__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) ==
                (other.lastname.lower(), other.firstname.lower()))
    def __lt__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) <
                (other.lastname.lower(), other.firstname.lower()))
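For a version that runs end to end, here is a sketch that fills in what the docs excerpt leaves out. The __init__ signature and the _key helper are assumptions made for this example, and the methods return NotImplemented for foreign types, which the docs snippet does not do.

```python
from functools import total_ordering


@total_ordering
class Student:
    def __init__(self, firstname, lastname):  # assumed constructor
        self.firstname = firstname
        self.lastname = lastname

    def _key(self):
        # Hypothetical helper: case-insensitive (lastname, firstname) sort key.
        return (self.lastname.lower(), self.firstname.lower())

    def __eq__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return self._key() < other._key()


a = Student("Ada", "Lovelace")
b = Student("Alan", "Turing")

# __le__, __gt__ and __ge__ are supplied by total_ordering.
print(a < b, a <= b, b > a, b >= a)  # True True True True
```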
Solution 5
You don't have to override both __eq__ and __ne__; you can override only __cmp__, but this will affect the result of ==, !=, <, > and so on.
is tests for object identity. This means a is b will be True when a and b both hold a reference to the same object. In Python a variable always holds a reference to an object, not the object itself, so for a is b to be true both names must refer to the object at the same memory location. How and, most importantly, why would you go about overriding this behaviour?
Edit: I didn't know __cmp__ was removed from Python 3, so avoid it.
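The difference between identity and equality can be seen with plain lists; this is just an illustrative sketch with made-up variable names:

```python
a = [1, 2]
b = [1, 2]  # equal contents, but a separate object
c = a       # a second name for the same object

print(a == b)  # True: same value
print(a is b)  # False: different objects
print(a is c)  # True: both names refer to one object
```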
gotgenes
Updated on July 03, 2021
Comments
-
gotgenes almost 3 years
When writing custom classes it is often important to allow equivalence by means of the == and != operators. In Python, this is made possible by implementing the __eq__ and __ne__ special methods, respectively. The easiest way I've found to do this is the following method:
class Foo:
    def __init__(self, item):
        self.item = item

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.__dict__ == other.__dict__
        else:
            return False

    def __ne__(self, other):
        return not self.__eq__(other)

Do you know of more elegant means of doing this? Do you know of any particular disadvantages to using the above method of comparing __dict__s?
Note: A bit of clarification--when __eq__ and __ne__ are undefined, you'll find this behavior:
>>> a = Foo(1)
>>> b = Foo(1)
>>> a is b
False
>>> a == b
False
That is, a == b evaluates to False because it really runs a is b, a test of identity (i.e., "Is a the same object as b?").
When __eq__ and __ne__ are defined, you'll find this behavior (which is the one we're after):
>>> a = Foo(1)
>>> b = Foo(1)
>>> a is b
False
>>> a == b
True
-
Ed S. over 15 years
Because sometimes you have a different definition of equality for your objects.
-
ILIA BROUDNO over 15 years
The is operator gives you the interpreter's answer to object identity, but you are still free to express your view on equality by overriding __cmp__.
-
gotgenes over 15 years
In Python 3, "The cmp() function is gone, and the __cmp__() special method is no longer supported." is.gd/aeGv
-
gotgenes over 15 years
Maybe, except that one can create a class that only compares the first two items in two lists, and if those items are equal, it evaluates to True. This is equivalence, I think, not equality. Perfectly valid in __eq__, still.
-
gotgenes over 15 years
I do agree, however, that "is" is a test of identity.
-
user1066101 over 15 years
+1: Strategy pattern to allow easy replacement in subclasses.
-
nosklo over 15 years
isinstance sucks. Why check it? Why not just self.__dict__ == other.__dict__?
-
gotgenes over 14 years
This is a good point. I suppose it's worth noting that sub-classing built-in types still allows for equality in either direction, and so checking that it's the same type may even be undesirable.
-
spenthil over 13 years
Bit nitpicky, but 'is' tests using id() only if you haven't defined your own is_() member function (2.3+). [docs.python.org/library/operator.html]
-
mcrute over 13 years
I assume by "override" you actually mean monkey-patching the operator module. In this case your statement is not entirely accurate. The operator module is provided for convenience and overriding those methods does not affect the behavior of the "is" operator. A comparison using "is" always uses the id() of an object for the comparison; this behavior cannot be overridden. Also, an is_ member function has no effect on the comparison.
-
spenthil over 13 years
mcrute - I spoke too soon (and incorrectly), you are absolutely right.
-
max over 13 years
@nosklo: I don't understand.. what if two objects from completely unrelated classes happen to have the same attributes?
-
gotgenes over 13 years
@max nosklo makes a good point. Consider the default behavior when sub-classing the built-in objects. The == operator does not care if you compare a built-in to a sub-class of the built-in.
-
max over 13 years
I thought nosklo suggested skipping isinstance. In that case you no longer know if other is of a subclass of self.__class__.
-
nosklo over 13 years
@max: yeah, but it doesn't matter if it is a subclass or not. It is irrelevant information. Consider what will happen if it is not a subclass.
-
max over 13 years
@nosklo: if it's not a subclass, but it just happens by accident to have the same attributes as self (both keys and values), __eq__ might evaluate to True, even though it's meaningless. Do I miss anything?
-
cdleary over 13 years
@nosklo: Yeah, maybe hasattr(other, '__dict__') and self.__dict__ == other.__dict__ would be better in the general case. I guess I just prefer a stricter notion of equality, given the option.
-
Adam Parkin about 12 years
Another issue with the __dict__ comparison is what if you have an attribute that you don't want to consider in your definition of equality (say for example a unique object id, or metadata like a time created stamp).
-
max over 11 years
@cdleary: I can see that hasattr would prevent an exception when other doesn't have __dict__ (i.e., implemented with slots). But how is it related to @nosklo's question?
-
max over 11 years
I'd suggest returning NotImplemented if the types are different, delegating the comparison to the rhs.
-
gotgenes almost 11 years
@max comparison isn't necessarily done left hand side (LHS) to right hand side (RHS), then RHS to LHS; see stackoverflow.com/a/12984987/38140. Still, returning NotImplemented as you suggest will always cause superclass.__eq__(subclass), which is the desired behavior.
-
Wookie88 almost 11 years
This is a very nice solution, especially when the __eq__ will be declared in CommonEqualityMixin (see the other answer). I found this particularly useful when comparing instances of classes derived from Base in SQLAlchemy. To not compare _sa_instance_state I changed key.startswith("__")): to key.startswith("_")):. I also had some backreferences in them and the answer from Algorias generated endless recursion. So I named all backreferences starting with '_' so that they're also skipped during comparison. NOTE: in Python 3.x change iteritems() to items().
-
Lars over 10 years
This is faster too, because isinstance can be a bit slow.
-
Lars over 10 years
I think implementing this in all your classes could lead to infinite recursion with circular references (e.g. set a.b = b, b.a = a, a==a).
-
Dane White over 10 years
If you have a ton of members, and not many object copies sitting around, then it's usually good to add an initial identity test, if other is self. This avoids the more lengthy dictionary comparison, and can be a huge savings when objects are used as dictionary keys.
-
Dane White over 10 years
And don't forget to implement __hash__()
-
Robin about 10 years
Note that this has some issues with inheritance, be sure to check this solution as well!
-
max about 9 years
@mcrute Usually, __dict__ of an instance doesn't have anything that starts with __ unless it was defined by the user. Things like __class__, __init__, etc. are not in the instance's __dict__, but rather in its class' __dict__. OTOH, the private attributes can easily start with __ and probably should be used for __eq__. Can you clarify what exactly were you trying to avoid when skipping __-prefixed attributes?
-
max about 9 years
hash(tuple(sorted(self.__dict__.items()))) won't work if there are any non-hashable objects among the values of the self.__dict__ (i.e., if any of the attributes of the object is set to, say, a list).
-
Tal Weiss about 9 years
True, but then if you have such mutable objects in your vars() the two objects are not really equal...
-
Sandy Chapman almost 9 years
Also, not adding the check to hasattr(self, '__dict__') will cause an exception when comparing to None. A pretty glaring hole in the implementation.
-
Florian Brucker almost 9 years
Great summary, but you should implement __ne__ using == instead of __eq__.
-
soulmachine over 8 years
TypeError: unhashable type: 'dict'
-
Mr_and_Mrs_D almost 8 years
However total_ordering has subtle pitfalls: regebro.wordpress.com/2010/12/13/…. Be aware!
-
Maggyero over 6 years
Three remarks: 1. In Python 3, no need to implement __ne__ anymore: "By default, __ne__() delegates to __eq__() and inverts the result unless it is NotImplemented". 2. If one still wants to implement __ne__, a more generic implementation (the one used by Python 3 I think) is: x = self.__eq__(other); if x is NotImplemented: return x; else: return not x. 3. The given __eq__ and __ne__ implementations are suboptimal: if isinstance(other, type(self)): gives 22 __eq__ and 10 __ne__ calls, while if isinstance(self, type(other)): would give 16 __eq__ and 6 __ne__ calls.
-
user2357112 over 6 years
While this answer eventually reaches a correct implementation, it gets there in a very confusing way. It repeatedly shows wrong code in a way that invites readers to think their problems are solved and stop reading, and it says the other answers don't work and then launches into a description of problems other answers do handle before getting to the parts they don't. Also, it doesn't mention the cases where you should set __hash__ to None instead of implementing it.
-
brownmagik352 over 5 years
The isinstance(other, ...) check was super helpful for dealing with checks against None. Thanks!
-
GregNash about 5 years
He asked about elegance, but he got robust.
-
Bin almost 5 years
n1 == n3 should also be True even for classic class? Because in this case other should be n3 and isinstance(n3, Number) is True?
-
mrexodia about 4 years
I think this should be the accepted answer, because it actually answers the specific question about __dict__.
-
mrexodia about 4 years
This does not answer the question.
-
Wouter Lievens over 3 years
It's important to note that for equality to be true, objects don't necessarily need to be the same type. For instance, one can argue that a linked list with the same elements as an array-backed list is in fact equal to it.