What is Mapped Buffer Pool / Direct Buffer Pool and how to increase their size?

java performance scala


Direct Buffer

A direct buffer is a chunk of memory typically used to interface Java to the OS I/O subsystems, for example as a place where the OS writes data as it receives it from a socket or disk, and from which Java can read directly.

Sharing the buffer with the OS is much more efficient than the original approach of copying data from the OS into Java's memory model, which then makes the data subject to Garbage Collection and inefficiencies such as the re-copying of data as it migrates from eden -> survivor -> tenured.
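As a rough illustration of the idea, the sketch below reads a file through a single direct buffer obtained from ByteBuffer.allocateDirect. The class name DirectBufferDemo, the helper names and the 8-byte buffer size are made up for this example, and the per-chunk decoding assumes plain ASCII content.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the original answer.
final class DirectBufferDemo {

    // Reads the whole file through one reusable direct buffer and returns its text.
    // Assumes ASCII content and bufferSize > 0.
    static String readWithDirectBuffer(Path file, int bufferSize) {
        // allocateDirect reserves memory outside the garbage-collected heap.
        ByteBuffer buffer = ByteBuffer.allocateDirect(bufferSize);
        StringBuilder text = new StringBuilder();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (channel.read(buffer) != -1) {
                buffer.flip();   // switch from filling to draining
                text.append(StandardCharsets.US_ASCII.decode(buffer));
                buffer.clear();  // make the buffer ready for the next read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    // Writes a small temporary file, reads it back through a direct buffer.
    static String demo() {
        try {
            Path file = Files.createTempFile("direct-buffer-demo", ".txt");
            try {
                Files.write(file, "hello direct buffer".getBytes(StandardCharsets.US_ASCII));
                return readWithDirectBuffer(file, 8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The same buffer is reused for every read, so the amount of direct memory stays fixed at bufferSize no matter how large the file is.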

In the screenshot you have a single 16KB direct buffer. Java grows this pool as required, so the blue area reaching the top of the chart merely means that all the buffer memory allocated so far is in use. I don't see this as an issue.
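The figures in such a chart can also be read from inside the JVM. The sketch below uses the standard java.lang.management.BufferPoolMXBean interface; the class name BufferPoolReport and its helper methods are invented for this example, and the pool names "direct" and "mapped" are the ones HotSpot reports, so other JVMs may differ.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper class, not part of the original answer.
final class BufferPoolReport {

    // Returns the bytes used by the named pool, or -1 if no such pool exists.
    static long memoryUsed(String poolName) {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if (pool.getName().equals(poolName)) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    // Builds one line per buffer pool: name, buffer count, used bytes, total capacity.
    static String report() {
        StringBuilder out = new StringBuilder();
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            out.append(String.format("%s: count=%d, used=%d bytes, capacity=%d bytes%n",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed(), pool.getTotalCapacity()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Allocate one 16KB direct buffer so the "direct" pool is not empty.
        ByteBuffer keepAlive = ByteBuffer.allocateDirect(16 * 1024);
        System.out.print(report());
        // Touch the buffer afterwards so it stays reachable while reporting.
        System.out.println("held " + keepAlive.capacity() + " bytes");
    }
}
```

Because the pool only counts buffers that are currently allocated, "used" and "capacity" tend to track each other closely, which is why the chart looks full.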

Mapped buffer pool

The mapped buffer pool is all the memory used by Java for the buffers mapped from its FileChannel instances (the MappedByteBuffer objects returned by FileChannel.map).

Each FileChannel instance has a buffer shared with the OS (similar to the direct buffer with all the efficiency benefits). The memory is essentially an in-RAM window onto a portion of the file. Depending on the mode (read, write or both), Java can either read and/or modify the file's contents directly and the OS can directly supply data to or flush modified data to disk.

Additional advantages of this approach are that the OS can flush this buffer directly to the disk as it sees fit, such as when the OS is shutting down, and that the OS can lock that portion of the file from other processes on the computer.
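A minimal sketch of such a window, assuming a temporary file and a 16-byte READ_WRITE mapping (the class name MappedFileDemo and the sizes are illustrative only):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the original answer.
final class MappedFileDemo {

    // Writes through a mapped window, then re-reads the file with ordinary I/O.
    static String demo() {
        try {
            Path file = Files.createTempFile("mapped-demo", ".bin");
            // Clean up at JVM exit; some platforms refuse to delete a file while it is mapped.
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Map the first 16 bytes; a READ_WRITE mapping grows the empty file to that size.
                MappedByteBuffer window =
                        channel.map(FileChannel.MapMode.READ_WRITE, 0, 16);
                // Modify the file contents directly in memory.
                window.put("mapped".getBytes(StandardCharsets.US_ASCII));
                // Ask the OS to flush the modified region to disk now.
                window.force();
            }
            byte[] onDisk = Files.readAllBytes(file);
            return new String(onDisk, 0, 6, StandardCharsets.US_ASCII) + ":" + onDisk.length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the mapping stays valid after the channel is closed; the memory is only released once the buffer itself is garbage collected.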

The screenshot indicates you have about 680MB in use by 12 FileChannel objects. Again, Java will grow this if Scala needs more (and the JVM can get additional memory from the OS), so the fact that all 680MB is in use is not important. Given their size, it certainly seems to me that the program has already been optimized to use these buffers effectively.

Increasing the size of the mapped buffer pool

Java allocates memory outside the Garbage Collection space for the FileChannel buffers. This means the normal heap size parameters such as -Xmx are not important here.

The size of the buffer in a FileChannel is set with the map method. Changing this would entail changing your Scala program.
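To make the role of the map call concrete, here is a sketch in Java (the equivalent Scala calls are the same) that walks a file with a caller-chosen window size. WindowedMapping, sumBytes and the 100-byte test file are hypothetical names and values for this example; a single mapping cannot exceed Integer.MAX_VALUE bytes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the original answer.
final class WindowedMapping {

    // Sums every byte of the file, mapping at most windowSize bytes at a time.
    static long sumBytes(Path file, long windowSize) {
        if (windowSize <= 0 || windowSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("windowSize must be in 1..Integer.MAX_VALUE");
        }
        long sum = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            for (long pos = 0; pos < fileSize; pos += windowSize) {
                // The third argument of map is what fixes the size of each buffer.
                long length = Math.min(windowSize, fileSize - pos);
                MappedByteBuffer window =
                        channel.map(FileChannel.MapMode.READ_ONLY, pos, length);
                while (window.hasRemaining()) {
                    sum += window.get() & 0xFF;  // treat each byte as unsigned
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }

    // Creates a 100-byte file holding the values 1..100 and sums it window by window.
    static long demo(long windowSize) {
        try {
            Path file = Files.createTempFile("windowed-demo", ".bin");
            // Clean up at JVM exit; some platforms refuse to delete a file while it is mapped.
            file.toFile().deleteOnExit();
            byte[] data = new byte[100];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) (i + 1);
            }
            Files.write(file, data);
            return sumBytes(file, windowSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(7) + " " + demo(4096));
    }
}
```

The result is the same for any window size; only the number of map calls and the amount of memory mapped at once change.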

Once the buffer has reached a threshold size, on the order of tens to hundreds of KB, increasing the FileChannel buffer size may or may not increase performance. It depends on how the program uses the buffer:

No: If the file is read precisely once from end to end, almost all the time is spent either waiting for the disk or in the processing algorithm.
Maybe: If, however, the algorithm frequently scans the file revisiting portions many times, increasing the size might improve performance:
- If modifying or writing the file, a larger buffer can consolidate more writes into a single flush.
- If reading the file, the operating system will likely have already cached the file (the disk cache) and so any gains are likely marginal. Perversely, increasing the memory used by the JVM might decrease performance by shrinking the effective disk cache size.
- In any case the application would have to be specifically coded to get any benefits, for example by implementing its own logical record pointer onto the cache.

Try profiling the application and look for I/O waits (JProfiler and YourKit are good at this). It may be that file I/O is not actually a problem, so don't be a victim of premature optimization. If I/O waits are a significant portion of the total elapsed time, then it might be worth trying out a larger buffer size.

Further information

https://blogs.oracle.com/alanb/entry/monitoring_direct_buffers

Also be aware that there is a bug reported against the JVM saying that FileChannel is not good at releasing memory. It's detailed in "Prevent OutOfMemory when using java.nio.MappedByteBuffer".


Andrew Alcock


Updated on June 27, 2022

Comments

Andrew Alcock almost 2 years

The screenshot of VisualVM was taken when I ran an I/O-intensive JVM program (written in Scala). The heap size was 4 GB and only 2 GB were in use. The JVM program uses memory-mapped files.

What do "mapped buffer pool" and "direct buffer pool" mean?

Those pools seem to be very full. Since the JVM program uses memory-mapped files, will I see increased performance if the pools were larger? If so, how do I increase their size?

The mapped files total about 1.1GB in size.
Admin about 11 years

This is the most awesome answer I've received so far on stackoverflow. Thank you very much indeed!