Per-key blocking Map in Java

14,312

Solution 1

Creating a lock per key sounds tempting, but it may not be what you want, especially when the number of keys is large.

Because you would probably need a dedicated (read-write) lock for each key, memory usage grows with the number of keys. Also, such fine granularity may hit a point of diminishing returns given a finite number of cores when concurrency is truly high.

ConcurrentHashMap is often a good enough solution in a situation like this. It normally provides full reader concurrency (readers normally do not block), and updates can proceed concurrently up to the configured concurrency level. This gives you pretty good scalability. The code from the question may be expressed with ConcurrentHashMap as follows:

ConcurrentMap<Key,Foo> cache = new ConcurrentHashMap<>();
...
Foo result = cache.get(key);
if (result == null) {
  result = createFooExpensively(key);
  Foo old = cache.putIfAbsent(key, result);
  if (old != null) {
    result = old;
  }
}

The straightforward use of ConcurrentHashMap does have one drawback, which is that multiple threads may find that the key is not cached, and each may invoke createFooExpensively(). As a result, some threads may do throw-away work. To avoid this, you would want to use the memoizer pattern that's mentioned in "Java Concurrency in Practice".
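
A minimal sketch of that memoizer idea, assuming Java 8 or later. The class name `Memoizer` and its `loader` parameter are illustrative and not taken from any library, and the error handling is one possible choice rather than the book's exact listing. The map stores a `Future` per key, and the thread that wins the `putIfAbsent` race runs the computation itself, so no separate thread pool is involved:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

/** Memoizer sketch: at most one computation is started per key. */
class Memoizer<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical, e.g. key -> createFooExpensively(key)

    Memoizer(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        Future<V> future = cache.get(key);
        if (future == null) {
            FutureTask<V> task = new FutureTask<V>(() -> loader.apply(key));
            future = cache.putIfAbsent(key, task);
            if (future == null) {
                // This thread won the race, so it runs the computation itself.
                future = task;
                task.run();
            }
        }
        try {
            // Callers for the same key wait here; other keys are unaffected.
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for " + key, e);
        } catch (ExecutionException e) {
            // Drop the failed entry so that a later call can retry.
            cache.remove(key, future);
            throw new IllegalStateException("loading failed for " + key, e.getCause());
        }
    }
}
```

With something like this in place, the lookup would reduce to `Foo result = memoizer.get(key);`, where `memoizer` is built with `key -> createFooExpensively(key)`. Once a key is populated, `Future.get()` returns immediately.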

But then again, the nice folks at Google already solved these problems for you in the form of CacheBuilder:

LoadingCache<Key,Foo> cache = CacheBuilder.newBuilder().
  concurrencyLevel(32).
  build(new CacheLoader<Key,Foo>() {
    public Foo load(Key key) {
      return createFooExpensively(key);
    }
  });

...
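// LoadingCache.get(key) throws the checked ExecutionException;
// getUnchecked(key) fits here because load() declares no checked exceptions.
Foo result = cache.getUnchecked(key);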

Solution 2

You can use PerKeySynchronizedExecutor from funtom-java-utils.

It creates a lock for each key and removes it for you as soon as it is no longer in use.

It also guarantees memory visibility between invocations with the same key, and is designed to be very fast and to minimize contention between invocations with different keys.

Declare it in your class:

final PerKeySynchronizedExecutor<KEY_CLASS> executor = new PerKeySynchronizedExecutor<>();

Use it:

Foo foo = executor.execute(key, () -> createFooExpensively(key));

Solution 3

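import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class Cache {

    // Keys currently being loaded; the set also serves as the monitor for wait/notify.
    private static final Set<String> lockedKeys = new HashSet<>();

    // Must be thread-safe, because different keys are accessed in parallel.
    private final Map<String, Foo> cache = new ConcurrentHashMap<>();

    private void lock(String key) {
        synchronized (lockedKeys) {
            // add() returns false while another thread holds this key.
            while (!lockedKeys.add(key)) {
                try {
                    lockedKeys.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                    throw new RuntimeException(e);
                }
            }
        }
    }

    private void unlock(String key) {
        synchronized (lockedKeys) {
            lockedKeys.remove(key);
            lockedKeys.notifyAll();
        }
    }

    public Foo getFromCache(String key) {
        // Acquire before the try block: if lock() throws, finally must not
        // release a key that is still held by another thread.
        lock(key);
        try {
            Foo result = cache.get(key);
            if (result == null) {
                result = createFooExpensively(key);
                cache.put(key, result);
            }
            return result;
            // For different keys it is executed in parallel.
            // For the same key it is executed synchronously.
        } finally {
            unlock(key);
        }
    }

}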

The key does not have to be a 'String'; it can be any class that correctly overrides 'equals' and 'hashCode'.
The try-finally is very important: you must guarantee that waiting threads are released after your operation, even if the operation throws an exception.
This will not work if your back-end is distributed across multiple servers/JVMs.
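
To illustrate the point about keys, here is a sketch of a composite key that could replace 'String' in the lock set. The class `TenantKey` and its fields are hypothetical and serve only as an example. Because the locked keys live in a HashSet, two key objects that mean the same thing must be equal and have the same hash code, otherwise they would be locked independently:

```java
import java.util.Objects;

/** Hypothetical composite cache key; immutable so its hash code never changes. */
final class TenantKey {
    private final String tenant;
    private final long id;

    TenantKey(String tenant, long id) {
        this.tenant = Objects.requireNonNull(tenant);
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TenantKey)) return false;
        TenantKey other = (TenantKey) o;
        return id == other.id && tenant.equals(other.tenant);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals(): equal keys produce equal hash codes.
        return Objects.hash(tenant, id);
    }
}
```

Keeping the key immutable matters here as well: if a field changed while the key was in the set, 'remove' in unlock could fail to find it and waiting threads would never be released.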


Author by

prashant

David Moles writes code for money and sometimes for fun.

Updated on June 04, 2022

Comments

prashant almost 2 years
I'm dealing with some third-party library code that involves creating expensive objects and caching them in a Map. The existing implementation is something like
```
lock.lock();
try {
    Foo result = cache.get(key);
    if (result == null) {
        result = createFooExpensively(key);
        cache.put(key, result);
    }
    return result;
} finally {
    lock.unlock();
}
```
Obviously this is not the best design when Foos for different keys can be created independently.

My current hack is to use a Map of Futures:
```
lock.lock();
Future<Foo> future;
try {
    future = allFutures.get(key);
    if (future == null) {
        future = executorService.submit(new Callable<Foo>() {
            public Foo call() {
                return createFooExpensively(key);
            }
        });
        allFutures.put(key, future);
    }
} finally {
    lock.unlock();
}

try {
    return future.get();
} catch (InterruptedException e) {
    throw new MyRuntimeException(e);
} catch (ExecutionException e) {
    throw new MyRuntimeException(e);
}
```
But this seems... a little hacky, for two reasons:
1. The work is done on an arbitrary pooled thread. I'd be happy to have the work done on the first thread that tries to get that particular key, especially since it's going to be blocked anyway.
2. Even when the Map is fully populated, we still go through Future.get() to get the results. I expect this is pretty cheap, but it's ugly.
What I'd like is to replace cache with a Map that will block gets for a given key until that key has a value, but allow other gets meanwhile. Does any such thing exist? Or does someone have a cleaner alternative to the Map of Futures?