Troubleshooting Java process with very high CPU usage - Tomcat application

12,741

This isn't an easy problem. You are doing all of the basics to see if it something jumps out. It sounds like there is either a slow leak that builds up over time to the point where it can't operate. That sounds like GC is thrashing and app comes unresponsive. It could also be runaway background job(s) eating on the CPU and just doesn't complete, that might explain the long delay. You could try turning off any quartz to see if it stays up longer that might help lead you in a direction, or crank it up so it shows up sooner.

I know you've done some jconsole watching, but I think you need to revisit and watch your memory usage, the threads run time, how much time you're spending in GC, and watching what portions of memory are being eaten up (is it Eden, Tenure that's running out?).

I'd make sure you are writing out start and end messages for your background jobs running in Quartz. Then you can correlate when they start and finish with when this problem starts. Also will tell you if your jobs are finishing or not.

It's probably time to drop it into a profiler (instead of jconsole) so you can see where in the code it's spending time or what's blowing up memory. A real profiler will let you see all that data mashed up on your code and classes. My favorites is JProfiler, but YourKit is also good. You can get a 7-30 day trial so you'll have plenty of time to profile and figure your issue out without having to buy it.

Start this early in the morning so you'll hopefully see something by early night.

Share:
12,741
Nicole S.
Author by

Nicole S.

Updated on June 04, 2022

Comments

  • Nicole S.
    Nicole S. almost 2 years

    I have a java application that runs on Tomcat (which runs as a service on Windows), the java process for which continues to eat up CPU before eventually requiring me to restart the Tomcat service.

    First my setup: Windows 2003 server Tomcat 6, running as service using Wrapper JDK: 1.6.0_20

    I was seeing catch issues here and there leading up to yesterday. I had to restart midday yesterday, then at 2:30 this morning, then today I could barely restart the application and open jconsole to monitor it before it was hitting 99% CPU usage again. Through a combination of things I'm not quite sure of, it seems like I got the JVM to cycle itself and the app was hovering in the 10-30% CPU usage range for a couple hours. However, then it started to creep up again, finally going into its 99% CPU usage breakdown. I was also having trouble with high memory usage, but that has stayed fairly normal and steady since I so-called got the JVM to "cycle" (bad terminology perhaps, but this is really what it seemed to do - and in the wrapper log there was a dump of all the classes it was reloading after).

    Then I was digging around some more and found a JRE 6 Update 24 installed on the server (I didn't install it as I do thorough testing with each java update - but maybe my server admin did the update). I attempted, but can't uninstall this. Thus, I get different versions when I do a java -version versus javac -version

    java -version
        java version "1.6.0_24"
        Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
        Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing)
    
    javac -version
        javac 1.6.0_20
    

    Could this difference be causing a JVM conflict of sorts? JAVA_HOME and my PATH variables both point to the correct JDK installation.

    Hoping for more stability, I decided to change my app to run on the previous JDK that was still installed - JDK 1.6.0_04. I changed the wrapper.conf, set env variables, cleaned and rebuilt, and started. This does seem more stable and has been up for about 4 hours. The CPU usage has climbed to the 90s, then it seems to clear itself out again.

    I've done heapdumps then ran them through the Memory Analyzer in Eclipse (nothing new found there), I've used jconsole with jtop to look at threads - nothing jumps out, thus why I continue to be curious if it's a java/jvm issue. So, I know this is a long post - but I don't really know where to go from here. Any ideas?

    (I've done exhaustive web searching on this and some articles have pointed to possibly a Quartz issue or Hibernate queries not flushing. Nothing has changed in the app since I started seeing the CPU issues, so I'm not sure where to start troubleshooting if it could indeed be linked to either.)

  • Nicole S.
    Nicole S. over 12 years
    Thanks for responding. I identified an application code issue, which I think contributed. It's been stable since shortly after this save for one time, so I'm hoping for the best though I'm still profiling it and trying to find any other possible code problems.