Why are immutable objects in hashmaps so effective?

15,027

Solution 1

String#hashCode:

private int hash;

...

public int hashCode() {
    int h = hash;
    if (h == 0 && count > 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

Since the contents of a String never change, the makers of the class chose to cache the hash after it had been calculated once. This way, time is not wasted recalculating the same value.

Solution 2

Quoting the linked blog entry:

final object with proper equals () and hashcode () implementation would act as perfect Java HashMap keys and improve performance of Java hashMap by reducing collision.

I fail to see how both final and equals() have anything to do with hash collisions. This sentence raises my suspicion about the credibility of the article. It seems to be a collection of dogmatic Java "wisdoms".

Immutability also allows caching there hashcode of different keys which makes overall retrieval process very fast and suggest that String and various wrapper classes e.g Integer provided by Java Collection API are very good HashMap keys.

I see two possible interpretations of this sentence, both of which are wrong:

  • HashMap caches hash codes of immutable objects. This is not correct. The map doesn't have the possibility to find out if an object is "immutable".
  • Immutability is required for an object to cache its own hash code. Ideally, an object's hash value should always just rely on non-mutating state of the object, otherwise the object couldn't be sensibly used as a key. So in this case, too, the author fails to make a point: If we assume that our object is not changing its state, we also don't have to recompute the hash value every time, even if our object is mutable!

Example

So if we are really crazy and actually decide to use a List as a key for a HashMap and make the hash value dependent on the contents, rather than the identity of the list, we could just decide to invalidate the cached hash value on every modification, thus limiting the number of hash computations to the number of modifications to the list.

Solution 3

It's very simple. Since an immutable object doesn't change over time, it only needs to perform the calculation of the hash code once. Calculating it again will yield the same value. Therefore it is common to calculate the hash code in the constructor (or lazily) and store it in a field. The hashcode function then returns just the value of the field, which is indeed very fast.

Solution 4

Think of the hashmap as a big array of numbered boxes. The number is the hashcode, and the boxes are ordered by number.

Now if the object can't change, the hash function will always reproduce the same value. Therefore the object will always stay in it's box.

Now suppose a changeable object. It is changed after adding it to the hash, so now it is sitting in the wrong box, like a Mrs. Jones which happened to marry Mister Doe, and which is now named Doe too, but in many registers still named Jones.

Solution 5

Immutable classes are unmodifiable, that's why those are used as keys in a Map.

For an example -

    StringBuilder key1=new StringBuilder("K1");
    StringBuilder key2=new StringBuilder("K2");
    
    Map<StringBuilder, String> map = new HashMap<>();
    map.put(key1, "Hello");
    map.put(key2, "World");
    
    key1.append("00");
    System.out.println(map); // This line prints - {K100=Hello, K2=World}

You see the key K1 (which is an object of mutable class StringBuilder) inserted in the map is lost due to an inadvertent change to it. This won't happen if you use immutable classes as keys for the Map family members.

Share:
15,027
Toskan
Author by

Toskan

Updated on June 14, 2022

Comments

  • Toskan
    Toskan almost 2 years

    So I read about HashMap. At one point it was noted:

    "Immutability also allows caching the hashcode of different keys which makes the overall retrieval process very fast and suggest that String and various wrapper classes (e.g., Integer) provided by Java Collection API are very good HashMap keys."

    I don't quite understand... why?

  • Jeffrey
    Jeffrey about 12 years
    You can use mutable keys for a HashMap as long as you ensure they are never modified. However, mutable objects cannot reliably cache their hash value, so time will be wasted upon every access to the map.
  • Niklas B.
    Niklas B. about 12 years
    @Jeffrey: If the data structure is never modified, we can easily design it so that we never need to compute the hash more than once. Check my edit.
  • Jeffrey
    Jeffrey about 12 years
    If you design a data structure so that it is never modified, you are designing an immutable data structure. If you have to recalculate the hash of a mutable object every time you modify it, the overhead could become worse than just recalculating it every time you called hashCode.
  • Niklas B.
    Niklas B. about 12 years
    @Jeffrey: Every sane implementation would just compute hashCode on demand and set a flag to signal that the hash code has been computed. Next time the structure is modified, the flag is reset, so that hashCode knows that it has to recompute next time. I don't know if this is the way Java does it, but lots of other languages work this way. It does not introduce overhead and still doesn't require the hash value to be recomputed every single time. The assumption that immutability is required for this kind of caching is just plain wrong.
  • Jeffrey
    Jeffrey about 12 years
    Sorry, must have misread what you wrote. I could've sworn you were talking about recalculating the hash on every modification. Your idea is sound, +1
  • Louis Wasserman
    Louis Wasserman about 12 years
    Just for the record: if the string were mutable, and you used it in a HashMap, then after you modified the string, the hash table would become corrupted and more or less unusable. The hash table doesn't know to move the entry to its new position based on the new hash code.
  • Andreas
    Andreas almost 9 years
    Should the hash field have the transient modifier?
  • Jeffrey
    Jeffrey almost 9 years
    @Andreas Nope. I was quoting from the JDK's String class, and it doesn't have a transient modifier.