Performance considerations for keySet() and entrySet() of Map

59,775

Solution 1

First of all, this depends entirely on which type of Map you're using. But since the JavaRanch thread talks about HashMap, I'll assume that that's the implementation you're referring to. And lets assume also that you're talking about the standard API implementation from Sun/Oracle.

Secondly, if you're concerned about performance when iterating through your hash map, I suggest you have a look at LinkedHashMap. From the docs:

Iteration over the collection-views of a LinkedHashMap requires time proportional to the size of the map, regardless of its capacity. Iteration over a HashMap is likely to be more expensive, requiring time proportional to its capacity.

HashMap.entrySet()

The source-code for this implementation is available. The implementation basically just returns a new HashMap.EntrySet. A class which looks like this:

private final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
    public Iterator<Map.Entry<K,V>> iterator() {
        return newEntryIterator(); // returns a HashIterator...
    }
    // ...
}

and a HashIterator looks like

private abstract class HashIterator<E> implements Iterator<E> {
    Entry<K,V> next;    // next entry to return
    int expectedModCount;   // For fast-fail
    int index;      // current slot
    Entry<K,V> current; // current entry

    HashIterator() {
        expectedModCount = modCount;
        if (size > 0) { // advance to first entry
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
    }

    final Entry<K,V> nextEntry() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        Entry<K,V> e = next;
        if (e == null)
            throw new NoSuchElementException();

        if ((next = e.next) == null) {
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
    current = e;
        return e;
    }

    // ...
}

So there you have it... Thats the code dictating what will happen when you iterate through an entrySet. It walks through the entire array which is as long as the maps capacity.

HashMap.keySet() and .get()

Here you first need to get hold of the set of keys. This takes time proportional to the capacity of the map (as opposed to size for the LinkedHashMap). After this is done, you call get() once for each key. Sure, in the average case, with a good hashCode-implementation this takes constant time. However, it will inevitably require lots of .hashCode and .equals calls, which will obviously take more time than just doing a entry.value() call.

Solution 2

The most common case where using entrySet is preferable over keySet is when you are iterating through all of the key/value pairs in a Map.

This is more efficient:

for (Map.Entry entry : map.entrySet()) {
    Object key = entry.getKey();
    Object value = entry.getValue();
}

than:

for (Object key : map.keySet()) {
    Object value = map.get(key);
}

Because in the second case, for every key in the keySet the map.get() method is called, which - in the case of a HashMap - requires that the hashCode() and equals() methods of the key object be evaluated in order to find the associated value*. In the first case that extra work is eliminated.

Edit: This is even worse if you consider a TreeMap, where a call to get is O(log2(n)), i.e. the comparator for will may need to run log2(n) times (n = size of the Map) before finding the associated value.

*Some Map implementations have internal optimisations that check the objects' identity before the hashCode() and equals() are called.

Solution 3

Here is the link to an article comparing the performance of entrySet(), keySet() and values(), and advice regarding when to use each approach.

Apparently the use of keySet() is faster (besides being more convenient) than entrySet() as long as you don't need to Map.get() the values.

Share:
59,775

Related videos on Youtube

name_masked
Author by

name_masked

Primary programming languages: Java, C Inclined on learning: Python, Android app Development

Updated on July 05, 2022

Comments

  • name_masked
    name_masked almost 2 years

    All,

    Can anyone please let me know exactly what are the performance issues between the 2? The site : CodeRanch provides a brief overview of the internal calls that would be needed when using keySet() and get(). But it would be great if anyone can provide exact details about the flow when keySet() and get() methods are used. This would help me understand the performance issues better.

  • ILMTitan
    ILMTitan over 13 years
    Additionally, if the map is a TreeMap rather than a HashMap, get() is a O(log(n)) operation.
  • name_masked
    name_masked over 13 years
    @ILMIian and Michael: Why the difference between TreeMap and HashMap?
  • Michael Barker
    Michael Barker over 13 years
    TreeMap and HashMap are different data structures, TreeMap is based on a Red/Black tree. A HashMap is a bucket and list hashtable. In both cases the call to get() is not free and its cost depends on the type of data structure.
  • metdos
    metdos almost 12 years
    +1 "Iteration over the collection-views of a LinkedHashMap requires time proportional to the size of the map, regardless of its capacity. Iteration over a HashMap is likely to be more expensive, requiring time proportional to its capacity."
  • sactiw
    sactiw over 10 years
    But if you just need to access the keys or just need to access the values of the Map then prefer iterating over Set returned by keySet() and Collection returned values(). One more point, the Set returned by keySet() and Collection returned by values() are both backed by the original Map. That is, if you make any modification in them they will be reflected back in the Map, however, both of them don't support add() and addAll() methods i.e. you can't add new key to the Set or new value in the Collection.
  • dantuch
    dantuch over 9 years
    You said in that article "This method (using keySet or values instead of entrySet) gives slight performance advantage over entrySet iteration (about 10% faster) and is more clean." May I know how did you get that value of "10%"? You showed no measurement nor any external data that contains such value.
  • Stefan L
    Stefan L over 9 years
    @dantuch I'm not Sergiy, but the article makes sense IMHO. The article is old though, from 2008. You could always create a microbenchmark using Google's Caliper e.g. for the latest JDK if you're curious, please post the results.
  • Sumit Kumar Saha
    Sumit Kumar Saha over 8 years
    @aioobe AS you have written "Thats the code dictating what will happen when you iterate through an entrySet. It walks through the entire array which is as long as the map's capacity." Shouldn't it be ".....which is as long as the map's size" ?
  • ACV
    ACV over 6 years
    Good good answer. I always prefer to refer to the source code as it is the ultimate source of truth
  • Novice User
    Novice User over 5 years
    Java 8 ( & above ) has the value in HashMap implemented as binary search tree instead of LinkedList. See openjdk.java.net/jeps/180