Weird behavior of java.util.Set.contains(Object o)

11,249

Solution 1

When you override equals(Object) you also need to override hashcode().

Specifically, the methods must be implemented so that if a.equals(b) is true, then a.hashcode() == b.hashcode() is all true. If this invariant is not respected, then HashMap, HashSet and Hashtable will not work properly.

The technical details of how hashcode() and equals(Object) should behave are specified in the Object API.


So why do the hash-based data structures break if you get this wrong? Well basically because a hash table works by using the value of the hash function to narrow down the set of values to be compared with the "candidate". If the hashcode for the candidate is different to the hashcode for some object in the table, then the chances are that the lookup algorithm won't compare with the object in the table ... even if the objects are equal.

Solution 2

HashSet will only use equals() if the elements share the same hashCode(), thus you need to override both. Here's the relevant part of the code that is used by HashSet#contains() (note that HashSet is backed by a HashMap):

  355       /**
  356        * Returns the entry associated with the specified key in the
  357        * HashMap.  Returns null if the HashMap contains no mapping
  358        * for the key.
  359        */
  360       final Entry<K,V> getEntry(Object key) {
  361           int hash = (key == null) ? 0 : hash(key.hashCode());
  362           for (Entry<K,V> e = table[indexFor(hash, table.length)];
  363                e != null;
  364                e = e.next) {
  365               Object k;
  366               if (e.hash == hash &&
  367                   ((k = e.key) == key || (key != null && key.equals(k))))
  368                   return e;
  369           }
  370           return null;
  371       }

Not doing so, violates the contract of Object#hashCode(), which states that:

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

Share:
11,249
sp00m
Author by

sp00m

Updated on June 05, 2022

Comments

  • sp00m
    sp00m almost 2 years

    The doc about java.util.Set.contains(Object o) says:

    Returns true if and only if this set contains an element e such that (o==null ? e==null : o.equals(e)).

    That said, here is a POJO (as you can see, I overwrote its equals method):

    public class MonthAndDay {
    
        private int month;
        private int day;
    
        public MonthAndDay(int month, int day) {
            this.month = month;
            this.day = day;
        }
    
        @Override
        public boolean equals(Object obj) {
            MonthAndDay monthAndDay = (MonthAndDay) obj;
            return monthAndDay.month == month && monthAndDay.day == day;
        }
    
    }
    

    So please, why does the following code prints false instead of true?

    Set<MonthAndDay> set = new HashSet<MonthAndDay>();
    set.add(new MonthAndDay(5, 1));
    System.out.println(set.contains(new MonthAndDay(5, 1)));
    // prints false
    

    A solution is to rewrite the contains(Object o) method, but the original one should be (almost) exactly the same, am I wrong?

    Set<MonthAndDay> set = new HashSet<MonthAndDay>() {
    
        private static final long serialVersionUID = 1L;
    
        @Override
        public boolean contains(Object obj) {
            MonthAndDay monthAndDay = (MonthAndDay) obj;
            for (MonthAndDay mad : this) {
                if (mad.equals(monthAndDay)) {
                    return true;
                }
            }
            return false;
        }
    
    };
    set.add(new MonthAndDay(5, 1));
    System.out.println(set.contains(new MonthAndDay(5, 1)));
    // prints true