Which Java blocking queue is most efficient for single-producer single-consumer scenarios

35,530

Solution 1

Well, there really aren't too many options. Let me go through the listed subclasses:

DelayQueue, LinkedBlockingDeque, PriorityBlockingQueue, and SynchronousQueue are all made for special cases requiring extra functionality; they don't make sense in this scenario.

That leaves only ArrayBlockingQueue and LinkedBlockingQueue. If you know how to tell whether you need an ArrayList or a LinkedList, you can probably answer this one yourself.

Note that in LinkedBlockingQueue, "linked nodes are dynamically created upon each insertion"; this might tend to push you toward ArrayBlockingQueue.

Solution 2

You can write a latency sensitive system in Java but you have to be careful of memory allocations, information passing between threads and locking. You should write some simple tests to see how much time these things cost on your server. (They vary bases on the OS and the memory architecture)

If you do this you can get stable timings for operations up to 99.99% of the time. This is not proper real-time, but it may be close enough. On a trading system, using Java instead of C costs might cost £100/d, however the cost of developing in C/C++ rather than Java is likely to be much higher than this. e.g. in terms of the flexibility it gives us and the number of bugs it saves.

You can get fairly close the same amount of jitter you would see in a C program doing the same thing. Some call this "C-like" Java.

Unfortunately it takes about 8 micro-second to pass an object between threads in Java via a ArrayBlockingQueue on the servers I work on and I suggest you test this on your servers. A simple test is to pass the System.nanoTime() between threads and see how long it takes.

ArrayBlockingQueue has a "feature" where it creates an object on each element added (though there is not much of a good reason to do so) So if you find a standard implementation which doesn't do this, let me know.

Solution 3

LinkedBlockingQueue will have O(1) insertion cost unless there is a memory allocation delay. A very large ArrayBlockingQueue will have O(1) insertion cost unless the system is blocked due to a garbage collection; it will, however, block on insert when at capacity.

Even with concurrent garbage collection I'm not sure if you should be writing a real-time system in a managed language.

Solution 4

I agree with Andrew Duffy's remark that implementing such a time-constrained system in java may be a mistake. Regardless, assuming you're married to the JVM, and you don't expect this queue to be running full (ie, the consumer can handle the load), I think you might be best served by a custom implementation, similar to an ArrayBlockingQueue but trimmed down/optimized for the single producer/consumer scenario. In particular, I'm liking the notion of spinning on the producer-side to wait for space, rather than blocking.

I referred to the java.concurrent sources for guidance, and drafted up this algorithm...

It looks pretty good to me, but it's entirely untested and may not be any faster in practice (probably not revolutionary, but I did crank it out myself :-). I had fun with it, anyway... Can you find a flaw?

Pseudocode:

private final ReentrantLock takeLock = new ReentrantLock();
private final Condition notEmpty = takeLock.newCondition();
private final E[] buffer;
private int head = 0
private int tail = 0;

public void put(E e) {
  if (e == null) throw NullPointerException();

  while (buffer[tail] != null) {
    // spin, wait for space...  hurry up, consumer!
    // open Q: would a tight/empty loop be superior here?
    Thread.sleep(1);
  }

  buffer[tail] = e;
  tail = (tail + 1) % buffer.length;

  if (count.getAndIncrement() == 0)  {
    sync(takeLock) { // this is pseudocode -- should be lock/try/finally/unlock
      notEmpty.signal();
    }
  }
}

public E take() {
  sync(takeLock) {
    while (count.get() == 0) notEmpty.await();
  }
  E item = buffer[head];
  buffer[head] = null;
  count.decrement();
  head = (head + 1) % buffer.length;

  return item;
}

Solution 5

If the timing requirements are that tight, you'll probably need to do extensive benchmarking on exactly the same hardware to determine the best fit.

If I were to guess, I'd go with ArrayBlockingQueue because array-based collections tend to have nice locality of reference.

Share:
35,530
Uri
Author by

Uri

I am a software engineer at Google's Cloud Logging group in Pittsburgh (https://cloud.google.com/logging/docs/) I received a Ph.D. in Software Engineering from Carnegie Mellon University. My dissertation focused on the usability of API documentation and on memory and knowledge sharing in collaborative development. My studies demonstrated that developers often fail to learn the most important details about methods they invoke even if these details appear in the JavaDoc. As part of my work, I developed an Eclipse plugin named eMoose that decorates calls with important associated information to attract the reader's attention. I hold an M.S. and B.S. in Computer Science from the Technion in Israel, and have previously worked for IBM research, Intel, and Thomson Reuters.

Updated on July 09, 2022

Comments

  • Uri
    Uri almost 2 years

    I'm working on a standard Java system with critical timing requirements for my producers (1/100s of ms matters).

    I have a producer placing stuff in a blocking queue, and a single consumer later picking up that stuff and dumping it to a file. The consumer blocks when data is not available.

    Obviously, blocking queue is the appropriate interface, but which actual implementation should I choose if I want to minimize the cost to the producer? I wan to play as little as possible on things like locking and allocating when I am putting stuff in the queue, and I don't mind if the consumer has to wait a lot longer or work a lot harder.

    Is there an implementation that can be faster because I only have a single consumer and single producer ?

    • james raygan
      james raygan almost 15 years
      Did you considered buying Real-Time JVM? It has special extensions helping to achieve such critical time frames. java.sun.com/javase/technologies/realtime/index.jsp
    • Uri
      Uri almost 15 years
      @Rastislv: I agree that RT JVM would be better. However, this is an existing system and most clients use standard JVMs; I need to do benchmarking with minimal impact, and I'm measuring sections that take 1/100 ms
    • Matt Ball
      Matt Ball over 6 years
      Consider using LMAX disruptor instead of a conventional queue.
  • Andrew Duffy
    Andrew Duffy almost 15 years
    The documentation says that nodes are dynamically allocated: "Linked nodes are dynamically created upon each insertion unless this would bring the queue above capacity"
  • Hank Gay
    Hank Gay almost 15 years
    Here's the source for LinkedBlockingQueue. fuseyism.com/classpath/doc/java/util/concurrent/… It doesn't appear to be doing any sort prealloction.
  • Uri
    Uri almost 15 years
    @mmyers: Wouln't ArrayBlockingQueue lock the whole array while LinkedBlockingQueue may be able to settle for locking the head and the tail?
  • palantus
    palantus almost 15 years
    That is a good point. They both use ReentrantLocks, but ArrayBlockingQueue has only one while LinkedBlockingQueue has one for each end.
  • palantus
    palantus almost 15 years
    The Javadocs say, "Linked queues typically have higher throughput than array-based queues but less predictable performance in most concurrent applications." So it kind of depends.
  • Matt
    Matt over 14 years
    IME it is worth determining what the resolution of the timer underlying nanoTime is on your hardware if you're timing like this, things can get very jittery on some OS/CPU combinations.
  • http8086
    http8086 over 4 years
    What if there are multiple consumers VS one producer, is this conclusion still stands?
  • Christoph John
    Christoph John over 4 years
    @PeterLawrey: sorry for digging up this old comment, but is the part about the "feature" of the ArrayBlockingQueue still true? At first sight I did not find anything in the current JDK code but maybe I was not looking hard enough. Thanks
  • Merijn Vogel
    Merijn Vogel about 2 years
    Measuring in Java 17, LinkedBlockingQueue is 3x faster than ArrayBlockingQueue, in a scenario with one producer and multiple consumers. Processing 46M elements in 16 vs 16 seconds respectively. Your mileage may vary, when optimizing, always measure.