Python: Is this an ok way of overriding __eq__ and __hash__?

29,023

The first one is fine. The second one is problematic for two reasons:

  1. there might be duplicates in .courses
  2. two entities with identical .courses but different .forwardLinks would compare equal but have different hashes

I would fix the second one by making equality depend on both courses and forward links, but both changes to sets (hence no duplicates), and the same for hashing. I.e.:

def __eq__(self, other):
    if not isinstance(other, DependencyArcTail):
        return False

    return (set(self.courses) == set(other.courses) and
            set(self.forwardLinks) == set(other.forwardLinks))

def __hash__(self):
    return hash((frozenset(self.courses), frozenset(self.forwardLinks)))

This of course is assuming that the forward links are crucial to an object's "real value", otherwise they should be omitted from both __eq__ and __hash__.

Edit: removed from __hash__ calls to tuple which were at best redundant (and possibly damaging, as suggested by a comment by @Mark [[tx!!!]]); changed set to frozenset in the hashing, as suggested by a comment by @Phillips [[tx!!!]].

Share:
29,023

Related videos on Youtube

Nick Heiner
Author by

Nick Heiner

JS enthusiast by day, horse mask enthusiast by night. Talks I've Done

Updated on July 09, 2022

Comments

  • Nick Heiner
    Nick Heiner almost 2 years

    I'm new to Python, and I wanted to make sure that I overrode __eq__ and __hash__ correctly, so as not to cause painful errors later:

    (I'm using Google App Engine.)

    class Course(db.Model):
        dept_code = db.StringProperty()
        number = db.IntegerProperty()
        title = db.StringProperty()
        raw_pre_reqs = db.StringProperty(multiline=True)
        original_description = db.StringProperty()
    
        def getPreReqs(self):
            return pickle.loads(str(self.raw_pre_reqs))
    
        def __repr__(self):
            title_msg = self.title if self.title else "Untitled"
            return "%s %s: %s" % (self.dept_code, self.number, title_msg)
    
        def __attrs(self):
            return (self.dept_code, self.number, self.title, self.raw_pre_reqs, self.original_description)
    
        def __eq__(self, other):
            return isinstance(other, Course) and self.__attrs() == other.__attrs()
    
        def __hash__(self):
            return hash(self.__attrs())
    

    A slightly more complicated type:

    class DependencyArcTail(db.Model):
        ''' A list of courses that is a pre-req for something else '''
        courses = db.ListProperty(db.Key)
    
        ''' a list of heads that reference this one '''
        forwardLinks = db.ListProperty(db.Key)
    
        def __repr__(self):
            return "DepArcTail %d: courses='%s' forwardLinks='%s'" % (id(self), getReprOfKeys(self.courses), getIdOfKeys(self.forwardLinks))
    
        def __eq__(self, other):
            if not isinstance(other, DependencyArcTail):
                return False
    
            for this_course in self.courses:
                if not (this_course in other.courses):
                    return False
    
            for other_course in other.courses:
                if not (other_course in self.courses):
                    return False
    
            return True
    
        def __hash__(self):
            return hash((tuple(self.courses), tuple(self.forwardLinks)))
    

    Everything look good?

    Updated to reflect @Alex's comments

    class DependencyArcTail(db.Model):
        ''' A list of courses that is a pre-req for something else '''
        courses = db.ListProperty(db.Key)
    
        ''' a list of heads that reference this one '''
        forwardLinks = db.ListProperty(db.Key)
    
        def __repr__(self):
            return "DepArcTail %d: courses='%s' forwardLinks='%s'" % (id(self), getReprOfKeys(self.courses), getIdOfKeys(self.forwardLinks))
    
        def __eq__(self, other):
            return isinstance(other, DependencyArcTail) and set(self.courses) == set(other.courses) and set(self.forwardLinks) == set(other.forwardLinks)
    
        def __hash__(self):
            return hash((tuple(self.courses), tuple(self.forwardLinks)))
    
  • Mark Dickinson
    Mark Dickinson about 14 years
    @Alex: doesn't that hash depend on the order of the elements in tuple(set(self.courses)), which might be somewhat arbitrary?
  • Alex Martelli
    Alex Martelli about 14 years
    @Mark, it's arbitrary but not capricious -- though I'm not 100% sure that arbitrarily shuffled lists with the same items would produce the same-ordered set (it may vary with versions of Python), so it's best to avoid the tuple calls I lazily left in -- let me edit accordingly, thanks.
  • Philipp
    Philipp about 14 years
    I think you should use frozenset instead of set, the latter being unhashable.
  • Alex Punnen
    Alex Punnen over 8 years
    when you override equal to should not you recommend the not equal to part be also overridden for consistent behavior stackoverflow.com/questions/390250/… ?
  • Guy
    Guy almost 8 years
    This is an incomplete answer. Please check here. Few points: 1. You should implement __ne__ (and probably use total_ordering. 2. When you don't know how to equal, you should return NotImplemented