Java HashMap race condition

16,490

Solution 1

The answer is yes, there are potential race conditions:

  • when resizing an HashMap by two threads at the same time
  • when collisions happens. Collision can happen when two elements map to the same cell even if they have a different hashcode. During the conflict resolution, there can be a race condition and one added key/value pair could be overwritten by another pair inserted by another thread.

To explain better what I mean on the second point, I was looking at the source code of HashMap in OpenJdk 7

389        int hash = hash(key.hashCode());
390        int i = indexFor(hash, table.length);

First it calculates an Hash of your key (combining two hash functions), then it maps to a cell with indexFor, then it checks if that cell contains the same key or is already occupied by another one. If it's the same key, it just overwrite the value and there is no problem here.

If it's occupied it looks at the next cell and then the next until it finds an empty position and call addEntry(), which could even decide to resize the array if the array is more loaded than a certain loadFactor.

Our table containing the entries is just a vector of Entry which holds key and value.

146    /**
147     * The table, resized as necessary. Length MUST Always be a power of two.
148     */
149    transient Entry[] table;

In a concurrent environment, all sort of evil things can happen, for instance one thread gets a collision for cell number 5 and looks for the next cell (6) and finds it empty.

Meanwhile another thread gets an index of 6 as a result of indexFor and both decide to use that cell at the same time, one of the two overwriting the other.

Solution 2

Without getting into specific details of the Hashmap implementations, I would say that there is still the possibility of an error, given the fact that the Hashmap class is not safe for concurrent access.

While I agree that there should be only 1 modification to a single key at a time, because you are using currentThread(), there is still the possibility that multiple threads will be modifying the Hashmap concurrently. Unless you look at the specific implementation, you should not assume that only concurrent access to the same key would cause a problem on the Hashmap, and that concurrent modification to different keys would not.

Imagine a case where two different keys generate to the same hash value, and its easy to see that there can still be errors with concurrent modification.

Solution 3

Yes, that is not a safe thing to do (as the other answers already point out). A better solution entirely might be to use a ThreadLocal which is a more natural way to keep thread local data than using a Map. It's got a couple of nice features including default values and that the values are removed when a thread terminates.

Share:
16,490
JavaLearner
Author by

JavaLearner

Updated on June 04, 2022

Comments

  • JavaLearner
    JavaLearner almost 2 years

    I am trying to find out if there is going to be any race condition in this piece of code. If the key weren't 'Thread.currentThread' then I would think yes. But since the thread itself is key, how is it possible to have a race condition? No other thread can possibly update the same key in the HashMap!

    public class SessionTracker {
    
         static private final Map<Thread,Session>  threadSessionMap = new HashMap<Thread,Session>();
    
         static public Session get() {
             return threadSessionMap.get(Thread.currentThread());
         }
    
         static public void set(Session s) {
             threadSessionMap.put(Thread.currentThread(),s);
         }
    
         static public void reset() {
             threadSessionMap.remove(Thread.currentThread());
         }
    }
    
  • JavaLearner
    JavaLearner over 12 years
    Awesome, hadn't thought of hash collisions! I guess using 'ConcurrentHashMap' should fix this.
  • JavaLearner
    JavaLearner over 12 years
    Thanks. So I am thinking of using 'ConcurrentHashMap' from 'java.util.concurrent' instead.
  • stivlo
    stivlo over 12 years
    better, but read the Javadoc. The allowed concurrency among update operations is guided by the optional concurrencyLevel constructor argument (default 16), which is used as a hint for internal sizing. The table is internally partitioned to try to permit the indicated number of concurrent updates without contention. Because placement in hash tables is essentially random, the actual concurrency will vary.
  • JavaLearner
    JavaLearner over 12 years
    How is that going to help me? Can you elaborate?
  • JavaLearner
    JavaLearner over 12 years
    Hmm, how about if I make the 'HashMap' variable 'threadSessionMap' to be a 'volatile' variable. Will that lock the entire Hashtable?
  • Jason
    Jason over 12 years
    It looks like you are trying to associate a session with a particular thread. ThreadLocal provides this kind of storage to a thread. So you could use that to keep the session attached to the thread that owns it.
  • Steven Schlansker
    Steven Schlansker over 12 years
    @JavaLearner: No, not by a long shot. When dealing with multiple threads you must lock the data structure, or use one that is designed for concurrency. Otherwise you risk subtle bugs which will be nearly impossible to track down and haunt you forever...
  • Bruno Reis
    Bruno Reis over 12 years
    @Steven, actually, you can achieve thread-safety without locking through the use of volatile fields and atomic compare-and-set loops (usually implemented with AtomicInteger & friends).
  • Bruno Reis
    Bruno Reis over 12 years
    @JavaLearner: you can use the ConcurrentHashMap. It is safe to be used by multiple threads. The citation above should not discourage you: it just says that ConcurrentHashMap provides a configuration parameter to improve on its concurrency -- that is, with the default configuration you can have 100, 1000, as many threads as you want accessing it concurrently. The thing is that there will be some contention (some threads will block).If you keep under concurrencyLevel threads, there's a possibility that no thread will ever block, improving performance.It works in any case
  • Steven Schlansker
    Steven Schlansker over 12 years
    @BrunoReis: I would say that that falls under a "data structure designed for concurrency", and additionally that if you are asking a question like this, you should under no circumstances attempt to design such a thing. At least not yet :-) It's devilishly hard to make something that is both correct and faster than the naïve locking solution...