HashSet contains problem with custom objects

23,685

Solution 1

Because p1.hashCode() changes when you modify p1, so it can't be found at its original index in the hash table anymore. Never let a hash value depend on a mutable field.

(You're quite lucky that it fails during testing; it might just as well have succeeded, only to fail in production.)

Solution 2

HashSet implements Set. The ApiDoc specifies:

Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set.

In your example this is the case, because changing name or age on p1 affects equal-comparison. So according the ApiDoc, the behavior of Set in your case is unspecified.

Solution 3

Hashes are simple pairing of key and values. Here's how the state of your code would look before and after the renaming in pseudo-code:

Before:

personSet => {
    SOME_NUM1 => Person(name=>"raghu", 12),
    SOME_NUM2 => Person(name=>"rimmu", 21)
}

p1.setName("raghus"); #p1.hashcode() = SOME_NEW_NUM
p1.setAge(13);#p1.hashcode() = SOME_OTHER_NEW_NUM

After:

personSet => {
    SOME_NUM1 => Person(name=>"raghu", 13),
    SOME_NUM2 => Person(name=>"rimmu", 21)
}

Since you have direct access to the p1 the object within the HashSet is updated correctly, but HashSet does not pay attention to contained objects hashcodes being updated. When a call to personSet.contains(p1) is called, the HashSet looks for an entry with the new value of p1.hashcode().

The p1 object is associated with its previous hashcode at the time when it was added to the HashSet.

Solution 4

I think you need to have hashCode depends on mutable fields quite often indeed: when you override equals that depends on mutable fields.

From hashCode's contract: "If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result."

So, if you create two objects such that A.equals(B) is true, and then modify A such a way that you get A.equals(B) became false, you need to have hashCodes change too.

It's true that in hashCode's documentation is stated that "It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.", but I don't know how this can help.

Share:
23,685
sridhar
Author by

sridhar

Updated on September 04, 2020

Comments

  • sridhar
    sridhar over 3 years

    My Custom class that will be contained by HashSet

    public class Person {
        String name;
        int age;
    
        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    
        @Override
        public String toString() {
            return "Person{" +
                    "hashcode='" + this.hashCode() + '\'' +
                    "name='" + name + '\'' +
                    ", age=" + age +
                    '}';
        }
    
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
    
            Person person = (Person) o;
    
            if (age != person.age) return false;
            if (!name.equals(person.name)) return false;
    
            return true;
        }
    
        @Override
        public int hashCode() {
            int result = name.hashCode();
            result = 31 * result + age;
            return result;
        }
    
        public String getName() {
            return name;
        }
    
        public void setName(String name) {
            this.name = name;
        }
    
        public int getAge() {
            return age;
        }
    
        public void setAge(int age) {
            this.age = age;
        }
    }
    

    My HashSet test that fails

       public void hashSetTest() {
            Set<Person>  personSet = new HashSet<Person>();
            Person p1 = new Person("raghu", 12);
            Person p2 = new Person("rimmu", 21);
    
            personSet.add(p1);
            personSet.add(p2);
    
    
           p1.setName("raghus");
           p1.setAge(13);
    
           int i2 =p1.hashCode();
           System.out.println(personSet.size() + ": "+ p1.hashCode()+" : "+personSet.contains(p1)+ " : "+i2);
        }
    

    Iam expecting personSet.contains(p1) to pass. Why is it returning false? Thanks sri

  • Voo
    Voo about 13 years
    Well "never" is a bit harsh because there are situations you'll need it. Just remember that you've got to remove such an object, modify it and add anew to the set. But obviously it's preferred to just use immutable fields if possible.
  • Fred Foo
    Fred Foo about 13 years
    @Voo: this may be a matter of style and preference, but I'd still say never. Hash functions should be defined on the basis of the immutable parts of an object. (I'd make a lightweight object such as this completely immutable, but then I don't like setters at all.)
  • Voo
    Voo about 13 years
    It surely is a pitfall you've got to be aware of, but if you need efficient access (~O(1)) to mutable objects there's not much you can do. I mean you could make the objects immuteable and create a new one and add that one and delete the old one but then is that any better than the other approach?
  • Voo
    Voo about 13 years
    You should mark his answer as correct (the green flag besides the up/downvote).
  • sridhar
    sridhar about 13 years
    Thanks for quick responses. Are you guys saying that oringinal p1 object with HashCode xxxx at location say '1' in HashTable is still indexed at the same location '1' in HashTable with a different HashCode yyyy after going through a modification? That is why contains method is actually not finding the object p1 in its set because now HashCode yyyy is showing a different index? --sri
  • Fred Foo
    Fred Foo about 13 years
    @Voo: There's often an immutable part in an otherwise mutable object that can serve as the base of its hashCode(). If not, it's not hashable. Harsh, maybe, but I prefer to call it "preventing hard-to-find bugs" :)
  • Fred Foo
    Fred Foo about 13 years
    @sri: How would the hashtable know that the hashCode() value has changed? Who would notify it?
  • Voo
    Voo about 13 years
    @larsmans: I agree that avoiding it is usually an extremely good idea, but then that's the usual tradeof between performance and maintainability/readability. I'd still use it in cases where the extra performance is necessary (after benchmarking other solutions obviously) but surely not without a good reason (and limiting the complexity with some clever abstraction so I call setters of those methods only in one specific place).