Tomcat process killed by the Linux kernel after running out of swap space; no JVM OutOfMemoryError is thrown
Solution 1
Why would this issue happen? When the JVM runs out of memory, why is there no OutOfMemoryError thrown?
It is not the JVM that has run out of memory. It is the Host Operating System that has run out of memory-related resources, and is taking drastic action. The OS has no way of knowing that the process (in this case the JVM) is capable of shutting down in an orderly fashion when told "No" in response to a request for more memory. It HAS to hard-kill something or else there is a serious risk of the entire OS hanging.
Anyway, the reason you are not seeing OOMEs is that this is not an OOME situation. In reality, the JVM has already been given too much memory by the OS, and there is no way to take it back. That's the problem the OS has to deal with by hard-killing processes.
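To confirm it really was the kernel's OOM killer (and not the JVM) that ended the process, look for its signature in the kernel log. A minimal sketch, using the exact log line from the question as sample input; the live-system commands are shown as comments because their paths vary by distribution:

```shell
#!/bin/sh
# The OOM killer logs every kill to the kernel log. This is the line the
# questioner found in /var/log/messages, used here as sample input:
line="kernel: Out of memory: Kill process 2259 (java) score 634 or sacrifice child"

# Extract the victim PID from such a line:
pid=$(echo "$line" | sed -n 's/.*Kill process \([0-9]*\).*/\1/p')
echo "OOM killer victim PID: $pid"

# On a live system, search the kernel log instead (paths vary):
#   grep -i 'out of memory' /var/log/messages
#   dmesg | grep -i 'killed process'
# The overcommit policy that lets the kernel hand out more virtual memory
# than it can back is visible here (0 = heuristic overcommit, the default):
#   cat /proc/sys/vm/overcommit_memory
```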
And why does it go straight to using swap?
It uses swap because the total virtual memory demand of the entire system won't fit in physical memory. This is NORMAL behaviour for a UNIX / Linux operating system.
Why does top's RES show that java is using only 5.3G when much more memory is consumed?
The RES numbers can be a little misleading. What they refer to is the amount of physical memory that the process is currently using ... excluding stuff that is shared or shareable with other processes. The VIRT number is more relevant to your problem. It says your JVM is using 10.4g of virtual memory, which is more than the physical memory available on your system.
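The VIRT/RES distinction can be observed on any Linux box. A quick sketch, run against the current shell rather than a JVM: virtual size counts every mapped page (including an untouched `-Xmx` reservation), while resident size counts only pages actually in physical RAM.

```shell
#!/bin/sh
# VIRT (vsz) counts all mapped pages, including reservations never touched;
# RES (rss) counts only pages currently resident in physical RAM.
# Compare them for the current shell process:
vsz=$(ps -o vsz= -p $$)
rss=$(ps -o rss= -p $$)
echo "virtual: ${vsz} kB, resident: ${rss} kB"
```

A JVM started with a large `-Xmx` shows the gap dramatically: the whole heap is mapped up front, but pages only become resident once the application (or the garbage collector) touches them.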
As the other answer says, the fact that you don't get an OOME shouldn't worry you, and even if you did get one, it would be unwise to do anything with it. An OOME is liable to do collateral damage to your application / container that is hard to detect and harder to recover from. That's why OOME is an Error, not an Exception.
Recommendations:

- Don't try to use significantly more virtual memory than you have physical memory, especially with Java. When a JVM runs a full garbage collection, it touches most of its VM pages, multiple times, in random order. If you have significantly over-allocated your memory, this is liable to cause thrashing, which kills performance for the entire system.
- Do increase your system's swap space. (But that might not help ...)
- Don't try to recover from OOMEs.
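On the second recommendation, a sketch of adding a swap file, with the privileged commands left as comments since they require root; the 4G size is illustrative, not from the original. The active part just sanity-checks the arithmetic from the question's numbers:

```shell
#!/bin/sh
# Standard recipe for adding a 4G swap file (requires root):
#   fallocate -l 4G /swapfile   # or: dd if=/dev/zero of=/swapfile bs=1M count=4096
#   chmod 600 /swapfile
#   mkswap /swapfile
#   swapon /swapfile
#   swapon -s                   # verify; or: free -m

# Sanity check with the question's numbers (all in kB): 10G RAM + 2G swap
# versus a JVM whose VIRT is ~10.4g, leaving little for everything else.
phys=10129972
swap=2097144
jvm_virt=$((10 * 1024 * 1024 + 409600))   # ~10.4g, as top reported
echo "total backing store: $((phys + swap)) kB"
echo "JVM alone maps:      $jvm_virt kB"
```

With roughly 1.3G left over for the OS and every other process, the kill was inevitable once the JVM touched most of its pages; this is why the answer pairs "add swap" with "don't over-allocate in the first place".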
Solution 2
You probably have other processes on the same computer that also use memory. It looks like your java process reaches around 5.3GB before the machine is desperately out of RAM and swap. (Other processes are then probably using 12GB-5.3GB = 6.7GB) So your linux kernel sacrifices your java process to keep other processes running. The java memory limit is never reached so you're not getting an OutOfMemoryException.
Consider all the processes you need running on the entire machine, and adjust your Xmx setting accordingly (enough to leave room for all the other processes). Perhaps 5gb?
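One common place to set such a cap is a `setenv.sh` script, which Tomcat's `catalina.sh` sources on startup if it exists in `$CATALINA_BASE/bin`. A minimal sketch; the `/tmp` path is used here only so the example is self-contained, and the 5g figure follows this answer's suggestion:

```shell
#!/bin/sh
# Sketch: write JVM flags into a setenv.sh rather than editing catalina.sh.
# (In a real installation this file lives at $CATALINA_BASE/bin/setenv.sh.)
cat > /tmp/setenv.sh <<'EOF'
# Cap the heap, leaving headroom for the OS, other processes,
# and the JVM's own native (off-heap) overhead.
CATALINA_OPTS="-Xms5g -Xmx5g -XX:+HeapDumpOnOutOfMemoryError"
export CATALINA_OPTS
EOF
cat /tmp/setenv.sh
```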
In any case, counting on OutOfMemoryExceptions being delivered is a pretty bad code smell. If I recall correctly, getting even a single OutOfMemoryException can leave the JVM in an "all-bets-are-off" state, and it should probably be restarted so it doesn't become unstable.
baggiowen
Updated on June 25, 2022

Comments
-
baggiowen almost 2 years
I was performing load testing against a tomcat server. The server has 10G of physical memory and 2G of swap space. The heap size (xms and xmx) was set to 3G before, and the server worked fine. Since I still saw a lot of free memory left and the performance was not good, I increased the heap size to 7G and ran the load testing again. This time I observed that physical memory was eaten up very quickly, and the system started consuming swap space. Later, tomcat crashed after running out of swap space. I included
-XX:+HeapDumpOnOutOfMemoryError
when starting tomcat, but I didn't get any heap dump. When I checked /var/log/messages, I saw
kernel: Out of memory: Kill process 2259 (java) score 634 or sacrifice child
To provide more info, here's what I saw from the Linux top command when the heap size was set to 3G and to 7G.
xms&xmx = 3G (which worked fine):
Before starting tomcat:
Mem: 10129972k total, 1135388k used, 8994584k free, 19832k buffers
Swap: 2097144k total, 0k used, 2097144k free, 56008k cached
After starting tomcat:
Mem: 10129972k total, 3468208k used, 6661764k free, 21528k buffers
Swap: 2097144k total, 0k used, 2097144k free, 143428k cached

PID   USER    PR  NI  VIRT   RES   SHR  S  %CPU   %MEM  TIME+     COMMAND
2257  tomcat  20  0   5991m  1.9g  19m  S  352.9  19.2  3:09.64   java
After starting load for 10 min:
Mem: 10129972k total, 6354756k used, 3775216k free, 21960k buffers
Swap: 2097144k total, 0k used, 2097144k free, 144016k cached

PID   USER    PR  NI  VIRT   RES   SHR  S  %CPU   %MEM  TIME+     COMMAND
2257  tomcat  20  0   6549m  3.3g  10m  S  332.1  34.6  16:46.87  java
xms&xmx = 7G (which caused tomcat crash):
Before starting tomcat:
Mem: 10129972k total, 1270348k used, 8859624k free, 98504k buffers
Swap: 2097144k total, 0k used, 2097144k free, 74656k cached
After starting tomcat:
Mem: 10129972k total, 6415932k used, 3714040k free, 98816k buffers
Swap: 2097144k total, 0k used, 2097144k free, 144008k cached

PID   USER    PR  NI  VIRT  RES   SHR  S  %CPU  %MEM  TIME+    COMMAND
2310  tomcat  20  0   9.9g  3.5g  10m  S  0.3   36.1  3:01.66  java
After starting load for 10 min (right before tomcat was killed):
Mem: 10129972k total, 9960256k used, 169716k free, 164k buffers
Swap: 2097144k total, 2095056k used, 2088k free, 3284k cached

PID   USER    PR  NI  VIRT   RES   SHR  S  %CPU  %MEM  TIME+     COMMAND
2310  tomcat  20  0   10.4g  5.3g  776  S  9.8   54.6  14:42.56  java
Java and JVM Version:
Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
Tomcat Version:
6.0.36
Linux Server:
Red Hat Enterprise Linux Server release 6.4 (Santiago)
So my questions are:
- Why would this issue happen? When the JVM runs out of memory, why is there no OutOfMemoryError thrown? And why does it go straight to using swap?
- Why does top's RES show that java is using only 5.3G of memory when much more memory is consumed?
I have been investigating and searching for a while, but still cannot find the root cause of this issue. Thanks a lot!
-
AngerClown: Better question is why is Tomcat using so much memory? You can still get a thread dump by sending the process SIGQUIT (kill -3), or a heap dump with jmap. Eclipse MAT is probably the easiest way to analyze the dump if most of the memory is all coming from one place.
-
baggiowen almost 11 years: thanks for your reply! But when I did top -a (sort by memory usage), I didn't see any other process consuming a lot of memory. And if you look at the memory usage before I started tomcat, only about 1G of memory was being used.
-
brady almost 11 years: This is pretty good advice, but the "OutOfMemoryException can leave the JVM in an 'all-bets-are-off' state" part is incorrect. First, it's OutOfMemoryError, but more importantly, this is an orderly process that doesn't create inherent instability. The problem is that your program isn't likely to do anything useful without more memory. But it isn't damaged or unstable in any way.
-
faffaffaff almost 11 years: The biggest problem with OutOfMemoryError is that it can happen at runtime deep within some java.* framework class or another third-party library, which perhaps doesn't have an exception handler ready to do cleanup, for example. At least that's the explanation I got many years ago while I was battling some instability in third-party libraries triggered by such errors.
-
baggiowen almost 11 years: thank you! That makes sense. But what I still don't understand is why there's so much difference when the heap size is set to 3G vs. 7G. Looking at the memory usage before starting tomcat, I thought the OS should be capable of handling a 7G heap.
-
Stephen C almost 11 years: Your JVM is actually using 10.4G. Maybe you've got a lot of off-heap memory usage going on under the covers. Note also there is a similar ~3.5G difference between the requested heap size and the observed VIRT size in the case where you used the smaller heap.
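That off-heap usage (thread stacks, JIT code cache, GC bookkeeping, direct buffers, mapped files) can be inspected from outside the JVM. A sketch using the Linux `/proc` interface, run against the current process here; the pid 2310 in the comments is the Tomcat JVM from the question's `top` output:

```shell
#!/bin/sh
# /proc/<pid>/status gives per-process virtual (VmSize) and resident (VmRSS)
# totals; pmap breaks the virtual size down mapping by mapping, which is
# where a heap-vs-VIRT gap becomes visible. Shown against this process:
grep -E '^Vm(Size|RSS)' /proc/self/status

# For the Tomcat JVM it would be:
#   grep -E '^Vm(Size|RSS)' /proc/2310/status
#   pmap -x 2310 | sort -k3 -n | tail   # largest resident mappings
```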
-
Stephen C almost 11 years: @faffaffaff - Or the OOME might happen on some worker thread ... which then dies, leaving other parts of the application in limbo waiting for notifications etc. that will never arrive.
-
faffaffaff almost 11 years: @StephenC excellent point; come to think of it, that was a big part of the problem way back when: worker threads that exit because nothing catches the unexpected OOME, and things start piling up or getting stuck.
-
baggiowen almost 11 years: thanks again. I just realized they both have a 3.5G difference. So is there a way I can find out about that off-heap memory usage under the covers? Also, I was reading another post, which indicates that RSS is more relevant than VIRT. I'm confused now... stackoverflow.com/questions/561245/…
-
Stephen C over 10 years: @baggiowen - the relevance of RES and VIRT depends on the question you are asking / the problem you are trying to solve. For this purpose VIRT is more relevant, but USED is the best measure. But either way, if you are going to draw accurate conclusions from the stats, you need to understand how Linux virtual memory works, and what those numbers actually mean. For the latter, read "man top" ... for example.