Elegant ways to support equivalence ("equality") in Python classes

265,315

Solution 1

Consider this simple problem:

class Number:

    def __init__(self, number):
        self.number = number


n1 = Number(1)
n2 = Number(1)

n1 == n2 # False -- oops

So, Python by default uses the object identifiers for comparison operations:

id(n1) # 140400634555856
id(n2) # 140400634555920

Overriding the __eq__ function seems to solve the problem:

def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return False


n1 == n2 # True
n1 != n2 # True in Python 2 -- oops, False in Python 3

In Python 2, always remember to override the __ne__ function as well, as the documentation states:

There are no implied relationships among the comparison operators. The truth of x==y does not imply that x!=y is false. Accordingly, when defining __eq__(), one should also define __ne__() so that the operators will behave as expected.

def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    return not self.__eq__(other)


n1 == n2 # True
n1 != n2 # False

In Python 3, this is no longer necessary, as the documentation states:

By default, __ne__() delegates to __eq__() and inverts the result unless it is NotImplemented. There are no other implied relationships among the comparison operators, for example, the truth of (x<y or x==y) does not imply x<=y.
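
A quick way to check the Python 3 behaviour quoted above. This is a minimal sketch that reuses the Number class from this answer and deliberately defines no __ne__; it assumes Python 3:

```python
class Number:

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        """Overrides the default implementation"""
        if isinstance(other, Number):
            return self.number == other.number
        return False


n1 = Number(1)
n2 = Number(1)

# No __ne__ is defined: Python 3 derives != by inverting __eq__.
eq_result = (n1 == n2)
ne_result = (n1 != n2)
print(eq_result, ne_result)
```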

But that does not solve all our problems. Let’s add a subclass:

class SubNumber(Number):
    pass


n3 = SubNumber(1)

n1 == n3 # False for classic-style classes -- oops, True for new-style classes
n3 == n1 # True
n1 != n3 # True for classic-style classes -- oops, False for new-style classes
n3 != n1 # False

Note: Python 2 has two kinds of classes:

  • classic-style (or old-style) classes, that do not inherit from object and that are declared as class A:, class A(): or class A(B): where B is a classic-style class;

  • new-style classes, that do inherit from object and that are declared as class A(object): or class A(B): where B is a new-style class.

Python 3 has only new-style classes, which are declared as class A:, class A(object): or class A(B):.

For classic-style classes, a comparison operation always calls the method of the first operand, while for new-style classes, when one operand is an instance of a subclass of the other operand's class, it calls the method of the subclass operand first, regardless of the order of the operands.

So here, if Number is a classic-style class:

  • n1 == n3 calls n1.__eq__;
  • n3 == n1 calls n3.__eq__;
  • n1 != n3 calls n1.__ne__;
  • n3 != n1 calls n3.__ne__.

And if Number is a new-style class:

  • both n1 == n3 and n3 == n1 call n3.__eq__;
  • both n1 != n3 and n3 != n1 call n3.__ne__.
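
The new-style dispatch described above can be observed by recording which class handles each comparison. This is only a sketch for Python 3 (where every class is new-style); the calls list is a helper I added for tracing and is not part of the original classes:

```python
calls = []  # hypothetical tracing helper: records type(self) for every __eq__ call


class Number:

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        calls.append(type(self).__name__)
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented


class SubNumber(Number):
    pass


n1 = Number(1)
n3 = SubNumber(1)

n1 == n3  # the subclass operand is on the right, yet it is asked first
n3 == n1  # the subclass operand is on the left
print(calls)
```

In both cases the instance of SubNumber is the one whose __eq__ runs, so the list should contain 'SubNumber' twice.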

To fix the non-commutativity issue of the == and != operators for Python 2 classic-style classes, the __eq__ and __ne__ methods should return the NotImplemented value when an operand type is not supported. The documentation defines the NotImplemented value as:

Numeric methods and rich comparison methods may return this value if they do not implement the operation for the operands provided. (The interpreter will then try the reflected operation, or some other fallback, depending on the operator.) Its truth value is true.

In this case the operator delegates the comparison operation to the reflected method of the other operand. The documentation defines reflected methods as:

There are no swapped-argument versions of these methods (to be used when the left argument does not support the operation but the right argument does); rather, __lt__() and __gt__() are each other’s reflection, __le__() and __ge__() are each other’s reflection, and __eq__() and __ne__() are their own reflection.

The result looks like this:

def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return NotImplemented

def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    x = self.__eq__(other)
    if x is NotImplemented:
        return NotImplemented
    return not x

Returning the NotImplemented value instead of False is the right thing to do even for new-style classes if commutativity of the == and != operators is desired when the operands are of unrelated types (no inheritance).
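
To illustrate the unrelated-types case, here is a minimal Python 3 sketch. Wrapper is a hypothetical class I made up for the example; it is unrelated to Number by inheritance but knows how to compare itself with one:

```python
class Number:

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented  # let the other operand have a say


class Wrapper:  # hypothetical class, not a subclass of Number

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Wrapper):
            return self.value == other.value
        if isinstance(other, Number):
            return self.value == other.number
        return NotImplemented


n = Number(1)
w = Wrapper(1)

# Number.__eq__ returns NotImplemented, so Python tries Wrapper.__eq__(w, n).
left = (n == w)
right = (w == n)
print(left, right)
```

If Number.__eq__ returned False here instead, n == w would stop at False while w == n would still be True.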

Are we there yet? Not quite. How many unique numbers do we have?

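len(set([n1, n2, n3])) # 3 in Python 2 -- oops (TypeError: unhashable type in Python 3)

Sets use the hashes of objects, and by default Python returns the hash of the identifier of the object. In Python 3, a class that defines __eq__ without defining __hash__ gets its __hash__ set to None, so its instances cannot be put in a set at all. Let’s try to override it: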

def __hash__(self):
    """Overrides the default implementation"""
    return hash(tuple(sorted(self.__dict__.items())))

len(set([n1, n2, n3])) # 1
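
As a variation (my assumption, not what the answer uses), the hash can be built from exactly the attribute that __eq__ compares, which avoids sorting __dict__ and keeps equal objects hashing equal. The EqOnly class is a hypothetical extra that shows what Python 3 does when __hash__ is left out:

```python
class Number:

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented

    def __hash__(self):
        # Hash the same data that __eq__ compares.
        return hash(self.number)


class SubNumber(Number):
    pass


class EqOnly:  # hypothetical: defines __eq__ but no __hash__

    def __eq__(self, other):
        return NotImplemented


unique = set([Number(1), Number(1), SubNumber(1)])
print(len(unique))

# In Python 3 this attribute is None, which makes EqOnly instances unhashable.
eq_only_hash = EqOnly.__hash__
```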

The end result looks like this (I added some assertions at the end for validation):

class Number:

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        """Overrides the default implementation"""
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented

    def __ne__(self, other):
        """Overrides the default implementation (unnecessary in Python 3)"""
        x = self.__eq__(other)
        if x is not NotImplemented:
            return not x
        return NotImplemented

    def __hash__(self):
        """Overrides the default implementation"""
        return hash(tuple(sorted(self.__dict__.items())))


class SubNumber(Number):
    pass


n1 = Number(1)
n2 = Number(1)
n3 = SubNumber(1)
n4 = SubNumber(4)

assert n1 == n2
assert n2 == n1
assert not n1 != n2
assert not n2 != n1

assert n1 == n3
assert n3 == n1
assert not n1 != n3
assert not n3 != n1

assert not n1 == n4
assert not n4 == n1
assert n1 != n4
assert n4 != n1

assert len(set([n1, n2, n3])) == 1
assert len(set([n1, n2, n3, n4])) == 2

Solution 2

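You need to be careful with inheritance. In the Python 2 session below, where Foo is a classic-style class, the comparison is not symmetric: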

>>> class Foo:
...     def __eq__(self, other):
...         if isinstance(other, self.__class__):
...             return self.__dict__ == other.__dict__
...         else:
...             return False
...
>>> class Bar(Foo): pass
...

>>> b = Bar()
>>> f = Foo()
>>> f == b
True
>>> b == f
False

Check types more strictly, like this:

def __eq__(self, other):
    if type(other) is type(self):
        return self.__dict__ == other.__dict__
    return False
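
Here is how the stricter check behaves under Python 3. This is a sketch only, and it differs from the snippet above in one respect that is my own choice: it returns NotImplemented rather than False for a foreign type, as recommended in Solution 1:

```python
class Foo:

    def __eq__(self, other):
        if type(other) is type(self):
            return self.__dict__ == other.__dict__
        return NotImplemented  # assumption: defer instead of returning False


class Bar(Foo):
    pass


f = Foo()
b = Bar()

# Mixed Foo/Bar comparisons are now False in both directions,
# while same-type comparisons still compare the instance dictionaries.
results = (f == b, b == f, f == Foo(), b == Bar())
print(results)
```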

Besides that, your approach will work fine; that's what special methods are there for.

Solution 3

The way you describe is the way I've always done it. Since it's totally generic, you can always break that functionality out into a mixin class and inherit it in classes where you want that functionality.

class CommonEqualityMixin(object):

    def __eq__(self, other):
        return (isinstance(other, self.__class__)
            and self.__dict__ == other.__dict__)

    def __ne__(self, other):
        return not self.__eq__(other)

class Foo(CommonEqualityMixin):

    def __init__(self, item):
        self.item = item
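
A short usage sketch of the mixin, repeated here so the example runs on its own. Point is a hypothetical second class added only to show that unrelated users of the mixin do not compare equal to each other:

```python
class CommonEqualityMixin(object):

    def __eq__(self, other):
        return (isinstance(other, self.__class__)
                and self.__dict__ == other.__dict__)

    def __ne__(self, other):
        return not self.__eq__(other)


class Foo(CommonEqualityMixin):

    def __init__(self, item):
        self.item = item


class Point(CommonEqualityMixin):  # hypothetical second user of the mixin

    def __init__(self, x, y):
        self.x = x
        self.y = y


same = (Foo(1) == Foo(1))
different = (Foo(1) != Foo(2))
cross = (Foo(1) == Point(1, 2))
points = (Point(1, 2) == Point(1, 2))
print(same, different, cross, points)
```

Keep in mind that in Python 3 a class defining __eq__ this way becomes unhashable unless it also defines __hash__.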

Solution 4

Not a direct answer but seemed relevant enough to be tacked on as it saves a bit of verbose tedium on occasion. Cut straight from the docs...


functools.total_ordering(cls)

Given a class defining one or more rich comparison ordering methods, this class decorator supplies the rest. This simplifies the effort involved in specifying all of the possible rich comparison operations:

The class must define one of __lt__(), __le__(), __gt__(), or __ge__(). In addition, the class should supply an __eq__() method.

New in version 2.7

from functools import total_ordering

@total_ordering
class Student:
    def __eq__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) ==
                (other.lastname.lower(), other.firstname.lower()))

    def __lt__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) <
                (other.lastname.lower(), other.firstname.lower()))
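
The docs snippet has no constructor, so here is a runnable sketch of the same idea. The __init__ signature, the _key helper and the isinstance guards are my additions, not part of the documentation example:

```python
from functools import total_ordering


@total_ordering
class Student:

    def __init__(self, firstname, lastname):  # assumed constructor
        self.firstname = firstname
        self.lastname = lastname

    def _key(self):
        # Hypothetical helper: the case-insensitive sort key used by both methods.
        return (self.lastname.lower(), self.firstname.lower())

    def __eq__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return self._key() < other._key()


ada = Student("Ada", "Lovelace")
alan = Student("Alan", "Turing")

# __le__, __gt__ and __ge__ are supplied by the decorator.
checks = (ada < alan, ada <= alan, alan > ada, alan >= ada,
          ada == Student("ADA", "lovelace"))
print(checks)
```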

Solution 5

You don't have to override both __eq__ and __ne__; you can override only __cmp__, but this will also affect the result of ==, !=, <, > and so on.

is tests for object identity. This means a is b will be True when a and b both hold a reference to the same object. In Python, a variable always holds a reference to an object, not the object itself, so for a is b to be true, both names must refer to the same object in memory. How, and more importantly why, would you go about overriding this behaviour?

Edit: I didn't know __cmp__ was removed from Python 3, so avoid it.

Author: gotgenes

My name is Chris Lasher. I use computers to help other people study biology and improve human health. I enjoy TDD, pair programming and mobbing, and improving design and code organization one decision at a time.

Updated on July 03, 2021

Comments

  • gotgenes
    gotgenes almost 3 years

    When writing custom classes it is often important to allow equivalence by means of the == and != operators. In Python, this is made possible by implementing the __eq__ and __ne__ special methods, respectively. The easiest way I've found to do this is the following method:

    class Foo:
        def __init__(self, item):
            self.item = item
    
        def __eq__(self, other):
            if isinstance(other, self.__class__):
                return self.__dict__ == other.__dict__
            else:
                return False
    
        def __ne__(self, other):
            return not self.__eq__(other)
    

    Do you know of more elegant means of doing this? Do you know of any particular disadvantages to using the above method of comparing __dict__s?

    Note: A bit of clarification--when __eq__ and __ne__ are undefined, you'll find this behavior:

    >>> a = Foo(1)
    >>> b = Foo(1)
    >>> a is b
    False
    >>> a == b
    False
    

    That is, a == b evaluates to False because it really runs a is b, a test of identity (i.e., "Is a the same object as b?").

    When __eq__ and __ne__ are defined, you'll find this behavior (which is the one we're after):

    >>> a = Foo(1)
    >>> b = Foo(1)
    >>> a is b
    False
    >>> a == b
    True
    
  • Ed S.
    Ed S. over 15 years
    Because sometimes you have a different definition of equality for your objects.
  • ILIA BROUDNO
    ILIA BROUDNO over 15 years
    The is operator gives you the interpreter's answer to object identity, but you are still free to express your view on equality by overriding cmp
  • gotgenes
    gotgenes over 15 years
    In Python 3, "The cmp() function is gone, and the __cmp__() special method is no longer supported." is.gd/aeGv
  • gotgenes
    gotgenes over 15 years
    Maybe, except that one can create a class that only compares the first two items in two lists, and if those items are equal, it evaluates to True. This is equivalence, I think, not equality. Perfectly valid in eq, still.
  • gotgenes
    gotgenes over 15 years
    I do agree, however, that "is" is a test of identity.
  • user1066101
    user1066101 over 15 years
    +1: Strategy pattern to allow easy replacement in subclasses.
  • nosklo
    nosklo over 15 years
    isinstance sucks. Why check it? Why not just self.__dict__ == other.__dict__?
  • gotgenes
    gotgenes over 14 years
    This is a good point. I suppose it's worth noting that sub-classing built in types still allows for equality either direction, and so checking that it's the same type may even be undesirable.
  • spenthil
    spenthil over 13 years
    Bit nitpicky, but 'is' tests using id() only if you haven't defined your own is_() member function (2.3+). [docs.python.org/library/operator.html]
  • mcrute
    mcrute over 13 years
    I assume by "override" you actually mean monkey-patching the operator module. In this case your statement is not entirely accurate. The operators module is provided for convenience and overriding those methods does not affect the behavior of the "is" operator. A comparison using "is" always uses the id() of an object for the comparison, this behavior can not be overridden. Also an is_ member function has no effect on the comparison.
  • spenthil
    spenthil over 13 years
    mcrute - I spoke too soon (and incorrectly), you are absolutely right.
  • max
    max over 13 years
    @nosklo: I don't understand.. what if two objects from completely unrelated classes happen to have the same attributes?
  • gotgenes
    gotgenes over 13 years
    @max nosklo makes a good point. Consider the default behavior when sub-classing the built in objects. The == operator does not care if you compare a built-in to a sub-class of the built-in.
  • max
    max over 13 years
    I thought nosklo suggested skipping isinstance. In that case you no longer know if other is of a subclass of self.__class__.
  • nosklo
    nosklo over 13 years
    @max: yeah, but it doesn't matter if it is a subclass or not. It is irrelevant information. Consider what will happen if it is not a subclass.
  • max
    max over 13 years
    @nosklo: if it's not subclass, but it just happens by accident to have same attributes as self (both keys and values), __eq__ might evaluate to True, even though it's meaningless. Do I miss anything?
  • cdleary
    cdleary over 13 years
    @nosklo: Yeah, maybe hasattr(other, '__dict__') and self.__dict__ == other.__dict__ would be better in the general case. I guess I just prefer a stricter notion of equality, given the option.
  • Adam Parkin
    Adam Parkin about 12 years
    Another issue with the __dict__ comparison is what if you have an attribute that you don't want to consider in your definition of equality (say for example a unique object id, or metadata like a time created stamp).
  • max
    max over 11 years
    @cdleary: I can see that hasattr would prevent an exception when other doesn't have __dict__ (i.e., implemented with slots). But how is it related to @nosklo question?
  • max
    max over 11 years
    I'd suggest to return NotImplemented if the types are different, delegating the comparison to the rhs.
  • gotgenes
    gotgenes almost 11 years
    @max comparison isn't necessarily done left hand side (LHS) to right hand side (RHS), then RHS to LHS; see stackoverflow.com/a/12984987/38140. Still, returning NotImplemented as you suggest will always cause superclass.__eq__(subclass), which is the desired behavior.
  • Wookie88
    Wookie88 almost 11 years
    This is a very nice solution, especially when the __eq__ will be declared in CommonEqualityMixin (see the other answer). I found this particularly useful when comparing instances of classes derived from Base in SQLAlchemy. To not compare _sa_instance_state I changed key.startswith("__")): to key.startswith("_")):. I had also some backreferences in them and the answer from Algorias generated endless recursion. So I named all backreferences starting with '_' so that they're also skipped during comparison. NOTE: in Python 3.x change iteritems() to items().
  • Lars
    Lars over 10 years
    this is faster too, because isinstance can be a bit slow
  • Lars
    Lars over 10 years
    I think implementing this in all your classes could lead to infinite recursion with circular references (e.g. set a.b = b, b.a = a, a==a)
  • Dane White
    Dane White over 10 years
    If you have a ton of members, and not many object copies sitting around, then it's usually good to add an initial identity test: if other is self. This avoids the more lengthy dictionary comparison, and can be a huge savings when objects are used as dictionary keys.
  • Dane White
    Dane White over 10 years
    And don't forget to implement __hash__()
  • Robin
    Robin about 10 years
    Note that this has some issues with inheritance, be sure to check this solution as well!
  • max
    max about 9 years
    @mcrute Usually, __dict__ of an instance doesn't have anything that starts with __ unless it was defined by the user. Things like __class__, __init__, etc. are not in the instance's __dict__, but rather in its class' __dict__. OTOH, the private attributes can easily start with __ and probably should be used for __eq__. Can you clarify what exactly were you trying to avoid when skipping __-prefixed attributes?
  • max
    max about 9 years
    hash(tuple(sorted(self.__dict__.items()))) won't work if there are any non-hashable objects among the values of the self.__dict__ (i.e., if any of the attributes of the object is set to, say, a list).
  • Tal Weiss
    Tal Weiss about 9 years
    True, but then if you have such mutable objects in your vars() the two objects are not really equal...
  • Sandy Chapman
    Sandy Chapman almost 9 years
    Also, not adding the check to hasattr(self, '__dict__') will cause an exception when comparing to None. A pretty glaring hole in the implementation.
  • Florian Brucker
    Florian Brucker almost 9 years
  • soulmachine
    soulmachine over 8 years
    TypeError: unhashable type: 'dict'
  • Mr_and_Mrs_D
    Mr_and_Mrs_D almost 8 years
    However total_ordering has subtle pitfalls: regebro.wordpress.com/2010/12/13/…. Be aware !
  • Maggyero
    Maggyero over 6 years
    Three remarks: 1. In Python 3, no need to implement __ne__ anymore: "By default, __ne__() delegates to __eq__() and inverts the result unless it is NotImplemented". 2. If one still wants to implement __ne__, a more generic implementation (the one used by Python 3 I think) is: x = self.__eq__(other); if x is NotImplemented: return x; else: return not x. 3. The given __eq__ and __ne__ implementations are suboptimal: if isinstance(other, type(self)): gives 22 __eq__ and 10 __ne__ calls, while if isinstance(self, type(other)): would give 16 __eq__ and 6 __ne__ calls.
  • user2357112
    user2357112 over 6 years
    While this answer eventually reaches a correct implementation, it gets there in a very confusing way. It repeatedly shows wrong code in a way that invites readers to think their problems are solved and stop reading, and it says the other answers don't work and then launches into a description of problems other answers do handle before getting to the parts they don't. Also, it doesn't mention the cases where you should set __hash__ to None instead of implementing it.
  • brownmagik352
    brownmagik352 over 5 years
    The isinstance(other, ...) check was super helpful for dealing with checks against None. Thanks!
  • GregNash
    GregNash about 5 years
    He asked about elegance, but he got robust.
  • Bin
    Bin almost 5 years
    n1 == n3 should also be True even for classic class? Because this case other should be n3 and isinstance(n3, Number) is True?
  • mrexodia
    mrexodia about 4 years
    I think this should be the accepted answer, because it actually answers the specific question about __dict__.
  • mrexodia
    mrexodia about 4 years
    This does not answer the question.
  • Wouter Lievens
    Wouter Lievens over 3 years
    It's important to note that for equality to be true, objects don't necessarily need to be the same type. For instance, one can argue that a linked list and an array-backed list with the same elements are in fact equal.
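
Several comments above suggest refinements to the __dict__ comparison: an early identity test, leaving some attributes out of the comparison, and setting __hash__ to None when a meaningful hash is not possible. A minimal sketch that combines them follows; Record, _eq_ignored and _eq_state are hypothetical names I chose for illustration:

```python
import time


class Record:  # hypothetical class for illustration
    # Attributes that should not take part in equality (e.g. metadata).
    _eq_ignored = ("created_at",)

    def __init__(self, item):
        self.item = item
        self.created_at = time.time()

    def _eq_state(self):
        # The part of __dict__ that defines equality.
        return {k: v for k, v in self.__dict__.items()
                if k not in self._eq_ignored}

    def __eq__(self, other):
        if other is self:  # cheap identity shortcut
            return True
        if type(other) is not type(self):
            return NotImplemented
        return self._eq_state() == other._eq_state()

    # Instances are mutable, so they are made explicitly unhashable.
    __hash__ = None


a = Record(1)
b = Record(1)
b.created_at = a.created_at + 1  # differs, but is ignored by __eq__
c = Record(2)

print(a == b, a != c, a == None)
```

Comparing against None works without an exception because __eq__ returns NotImplemented for foreign types and Python then falls back to an identity comparison.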