How does ConcurrentHashMap work internally?

47,796

Solution 1

I would read the source of ConcurrentHashMap as it is rather complicated in the detail. In short it has

  • Multiple partitions which can be locked independently. (16 by default)
  • Using concurrent Locks operations for thread safety instead of synchronized.
  • Has thread safe Iterators. synchronizedCollection's iterators are not thread safe.
  • Does not expose the internal locks. synchronizedCollection does.

Solution 2

The ConcurrentHashMap is very similar to the java.util.HashTable class, except that ConcurrentHashMap offers better concurrency than HashTable or synchronizedMap does. ConcurrentHashMap does not lock the Map while you are reading from it. Additionally,ConcurrentHashMap does not lock the entire Mapwhen writing to it. It only locks the part of the Map that is being written to, internally.

Another difference is that ConcurrentHashMap does not throw ConcurrentModificationException if the ConcurrentHashMap is changed while being iterated. The Iterator is not designed to be used by more than one thread though whereas synchronizedMap may throw ConcurrentModificationException

Solution 3

This is the article that helped me understand it Why ConcurrentHashMap is better than Hashtable and just as good as a HashMap

Hashtable’s offer concurrent access to their entries, with a small caveat, the entire map is locked to perform any sort of operation. While this overhead is ignorable in a web application under normal load, under heavy load it can lead to delayed response times and overtaxing of your server for no good reason.

This is where ConcurrentHashMap’s step in. They offer all the features of Hashtable with a performance almost as good as a HashMap. ConcurrentHashMap’s accomplish this by a very simple mechanism. Instead of a map wide lock, the collection maintains a list of 16 locks by default, each of which is used to guard (or lock on) a single bucket of the map. This effectively means that 16 threads can modify the collection at a single time (as long as they’re all working on different buckets). Infact there is no operation performed by this collection that locks the entire map. The concurrency level of the collection, the number of threads that can modify it at the same time without blocking, can be increased. However a higher number means more overhead of maintaining this list of locks.

Solution 4

The "scalability issues" for Hashtable are present in exactly the same way in Collections.synchronizedMap(Map) - they use very simple synchronization, which means that only one thread can access the map at the same time.

This is not much of an issue when you have simple inserts and lookups (unless you do it extremely intensively), but becomes a big problem when you need to iterate over the entire Map, which can take a long time for a large Map - while one thread does that, all others have to wait if they want to insert or lookup anything.

The ConcurrentHashMap uses very sophisticated techniques to reduce the need for synchronization and allow parallel read access by multiple threads without synchronization and, more importantly, provides an Iterator that requires no synchronization and even allows the Map to be modified during interation (though it makes no guarantees whether or not elements that were inserted during iteration will be returned).

Solution 5

Returned by synchronizedCollection() is an object all methods of which are synchronized on this, so all concurrent operations on such wrapper are serialized. ConcurrentHashMap is a truly concurrent container with fine grained locking optimized to keep contention as low as possible. Have a look at the source code and you will see what it is inside.

Share:
47,796
Adam Arold
Author by

Adam Arold

I do programming for a living. I like tinkering with stuff from bare metal to designing architectures. I also contribute to the Open Source community. Currently I'm based on the JVM (Kotlin/Clojure/Java) but I'm still looking for an Elixir (pun intended). Currently I'm hacking simulation-based games and world/flora/fauna/civilization generation (Dwarf Fortress style). If you think we can work together (programming/pixel art) drop me a mail. Check out my Hexagonal Grid library if interested: Hexameter If you need a multiplatform messaging bus: Riptide or a multiplatform terminal emulator for games: Zircon You can find my LinkedIn profile here I also created the unicorns tag on meta and #SOreadytohelp.

Updated on November 20, 2020

Comments

  • Adam Arold
    Adam Arold over 3 years

    I was reading the official Oracle documentation about Concurrency in Java and I was wondering what could be the difference between a Collection returned by

    public static <T> Collection<T> synchronizedCollection(Collection<T> c);
    

    and using for example a

    ConcurrentHashMap. I'm assuming that I use synchronizedCollection(Collection<T> c) on a HashMap. I know that in general a synchronized collection is essentially just a decorator for my HashMap so it is obvious that a ConcurrentHashMap has something different in its internals. Do you have some information about those implementation details?

    Edit: I realized that the source code is publicly available: ConcurrentHashMap.java