FindBugs warning: Inefficient use of keySet iterator instead of entrySet iterator

38,080

Solution 1

You are retrieving all the keys (accessing the whole map) and then for some keys, you access the map again to get the value.

You can iterate over the map to get map entries (Map.Entry) (couples of keys and values) and access the map only once.

Map.entrySet() delivers a set of Map.Entrys each one with the key and corresponding value.

for ( Map.Entry< String, LIMSGridCell > entry : cellsMap.entrySet() ) {
    if ( entry.getKey().startsWith( columnIndex ) ) {
        cells.add( entry.getValue() );
    }
}

Note: I doubt that this will be much of an improvement since if you use map entries you will instantiate an object for each entry. I don't know if this is really faster than calling get() and retrieving the needed reference directly.

Solution 2

If somebody is still interested in a detailed and number-backed answer: yes, you should use entrySet() vs. keySet() in case you are iterating over the whole Map. See this Gist for detailed numbers. I run a benchmark with JMH for the default implementations of Map with Oracle JDK8.

The main finding is: it is always a bit slower to iterate over the keySet and re-query for every key. As soon as you have bigger maps, the multiplier can become quite big (e.g., for a ConcurrentSkipListMap it is always 5-10x; while for HashMaps it is not bigger than 2x for up to a million entries).

However, these are still very small numbers. The slowest way to iterate over 1 million entries is with a ConcurrentSkipListMap.keySet(), which is around 500-700 milliseconds; while iterating over over IdentityHashMap.entrySet() is just 25-30 milliseconds with LinkedHashMap.entrySet() just behind with 40-50 milliseconds (not surprising, as it has a LinkedList inside it, which helps with iteration). As an overview from the above linked Gist:

Map type              | Access Type | Δ for 1M entries
----------------------+-------------+-----------------
HashMap               | .entrySet() |     69-72  ms
HashMap               |   .keySet() |     86-94  ms
ConcurrentHashMap     | .entrySet() |     72-76  ms
ConcurrentHashMap     |   .keySet() |     87-95  ms
TreeMap               | .entrySet() |    101-105 ms
TreeMap               |   .keySet() |    257-279 ms
LinkedHashMap         | .entrySet() |     37-49  ms
LinkedHashMap         |   .keySet() |     89-120 ms
ConcurrentSkipListMap | .entrySet() |     94-108 ms
ConcurrentSkipListMap |   .keySet() |    494-696 ms
IdentityHashMap       | .entrySet() |     26-29  ms
IdentityHashMap       |   .keySet() |     69-77  ms

So the bottom line is: it depends on your use-case. While it is definitely faster to iterate over the entrySet() the numbers are not huge, especially for reasonably small Maps. However, if you are iterating over a Map with 1 million entries quite regularly, better use the faster way ;)

The numbers of course are just to compare with each other, not absolutes.

Solution 3

You're getting the set of keys in the map, then using each key to get the value out of the map.

Instead, you can simply iterate through the Map.Entry key/value pairs returned to you via entrySet(). That way you avoid the relatively expensive get() lookup (note the use of the word relatively here)

e.g.

for (Map.Entry<String,LIMSGridCell> e : map.entrySet()) {
   // do something with...
   e.getKey();
   e.getValue();
}

Solution 4

This is the suggestion; not really an answer for your question. When you are working with ConcurrentHashMap; below is the iterator behavior mentioned in javadoc

The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.

So if you use EntrySet iterator; this may contain stale key/value pair; so it would be better; get the key from keySet iterator(); and check with collection for value. this will ensure you are getting the recent change from the collection.

If you are OK with fail-safe iterator; then check this link ; it states using entrySet; little improving the performance.

Share:
38,080
Geek
Author by

Geek

Updated on May 11, 2020

Comments

  • Geek
    Geek almost 4 years

    Please refer to the following method :

    public Set<LIMSGridCell> getCellsInColumn(String columnIndex){
        Map<String,LIMSGridCell> cellsMap = getCellsMap();
        Set<LIMSGridCell> cells = new HashSet<LIMSGridCell>();
        Set<String> keySet = cellsMap.keySet();
        for(String key: keySet){
          if(key.startsWith(columnIndex)){
            cells.add(cellsMap.get(key));
          }
        }
        return cells;
      }
    

    FindBugs give this waring message :

    "Inefficient use of keySet iterator instead of entrySet iterator This method accesses the value of a Map entry, using a key that was retrieved from a keySet iterator. It is more efficient to use an iterator on the entrySet of the map, to avoid the Map.get(key) lookup."

  • Geek
    Geek over 11 years
    In this case the map implementation is HashMap . Isn't get() for a HashMap O(1) ?
  • Geek
    Geek over 11 years
    but isn't get() on hashMap O(1) ?
  • Matteo
    Matteo over 11 years
    @Geek: yes. See my added note. I doubt that FindBugs suggestion really makes sense. Both instantiation and get() are O(1)
  • a_horse_with_no_name
    a_horse_with_no_name over 11 years
    @Geek: it is, but with using entrySet() you completely remove the call to get()
  • Brian Agnew
    Brian Agnew over 11 years
    O(1) doesn't specify how long it will take, merely that it's constant
  • Heiko Schmitz
    Heiko Schmitz over 11 years
    But he doesn't access each value by get(). Only those, whose keys match the condition are used. I think there's no general rule which way to prefer. It depends on the fraction of keys matching the condition. Obviously, FindBugs can't check that.
  • TimK
    TimK over 11 years
    The map may store Entry's (e.g. Sun's HashMap implementation) so no instantiation is required. And get() may be more than O(1), e.g. a TreeMap or a HashMap with a bad hash function. But you're right that in most cases it won't make a noticeable difference.
  • Kanagavelu Sugumar
    Kanagavelu Sugumar almost 10 years
    @Matteo Could you please review my answer? Please let me know if any comments.
  • Holger
    Holger almost 3 years
    if you use map entries you will instantiate an object for each entry”—Surely not. Most map implementations are already a map of entries. Most notably when iterating over a HashMap, the entry instances are identical to the internally stored entry objects. So calling getValue (and likewise setValue) on an Entry is accessing the value directly, whereas calling get on the map implies calling hashCode on the key, calculating an array index, and calling equals at least once on the key, to get to the same entry object you’d have already in the first place when using entrySet().
  • Gen
    Gen almost 3 years
    @Matteo can you please look into my question stackoverflow.com/questions/67880418/…