Tracing slow Java I/O in Solaris 10


Solution 1

If you're using Solaris, you're lucky to be able to use DTrace. It lets you profile down to the kernel level and get more clues about how the JVM is interacting with the kernel.

More information here:

http://www.devx.com/Java/Article/33943

If you want to find out what your JVM is doing, run DTrace with the JVM probes:

http://java.sun.com/javase/6/docs/technotes/guides/vm/dtrace.html

http://www.solarisinternals.com/wiki/index.php/DTrace_Topics_Java#DTrace_Topics:_Java

These will give you much more meaningful output in relation to your program. Have a look at the 'Method Times' section.
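As a rough illustration of the hotspot provider those guides describe, here is a minimal sketch (not production-ready) that counts method entries per Java class. It assumes the target JVM was started with -XX:+ExtendedDTraceProbes so the method probes fire:

```d
/* methods.d -- count method entries per Java class.
 * Run as: dtrace -s methods.d -p <jvm-pid>
 * Requires the JVM flag -XX:+ExtendedDTraceProbes.
 */
hotspot$target:::method-entry
{
    /* arg1/arg2 are the pointer to, and length of, the class name */
    this->cls = (char *)copyin(arg1, arg2 + 1);
    this->cls[arg2] = '\0';
    @entries[stringof(this->cls)] = count();
}
```

On Ctrl-C, DTrace prints the aggregation sorted by count, so the busiest classes appear at the bottom.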

http://www.princeton.edu/~unix/Solaris/troubleshoot/diskio.html

is a great guide for finding I/O bottlenecks.

This may also help: http://prefetch.net/blog/index.php/2007/03/03/viewing-busy-code-paths-with-dtrace/

There are no hard-and-fast rules when tracking down problems like this, but information is key! If you follow these guides, you are well on the way to becoming a ninja systems engineer.

You may also be able to use JProfiler: http://www.ej-technologies.com/products/jprofiler/overview.html

It's not open source, but I've had great success tracking down Java performance problems with it.

You should also run your Java VM and your application with full debug logging. If you have access to the kernel logs, check them for unusual events.

Good luck.

Does anyone else on Server Fault have ninja tips for fault-finding problems like this? I've got my own way of debugging, but it would be interesting to know what others think.

-----------------UPDATE--------------

I've looked at the trace again; it seems you're doing a lot of resolvepath calls.

This needs to be fixed. Are you using very long paths, or running from a directory that is symlinked? Try removing the symlinks, or using hard links, and see if you get better results.

Does the app do a lot of directory traversal? If so, try using absolute paths rather than relative paths.
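For context: resolvepath(2) is what the JVM ends up calling when it canonicalizes a file path (e.g. File.getCanonicalPath()), walking every component and resolving symlinks. A small sketch of the difference, using a hypothetical file name:

```java
import java.io.File;
import java.io.IOException;

public class PathDemo {
    public static void main(String[] args) throws IOException {
        // A relative path with ".." segments is cheap to construct...
        File relative = new File("subdir/../subdir/../data.txt");
        System.out.println("absolute:  " + relative.getAbsolutePath());

        // ...but canonicalizing it walks each component and resolves
        // symlinks, which is what shows up as resolvepath(2) in the trace.
        File canonical = relative.getCanonicalFile();
        System.out.println("canonical: " + canonical.getPath());

        // The canonical form has no "." or ".." segments left.
        System.out.println(canonical.getPath().contains("..")); // false
    }
}
```

The shorter and more direct the paths the class loader starts from, the less work each canonicalization has to do.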

Again, just an idea I had while eating a cheesecake. Run DTrace with the Java probes; that should drill down enough to see which Java classes/methods are taking the most time.

Good luck (again). Don't give up now; I think we are very close to a solution.

Solution 2

On your Solaris system, just run sys_diag -G -I1 -l -v. It will aggregate all the performance information (CPU/memory/network/disk I/O/DTrace/kernel...) and analyze the output into a single color-coded .html report of findings/bottlenecks, characterizing the workload by subsystem. This will show any and all bottlenecks, as well as any locking that might be occurring (lockstat, ...). The latest version is v8.1. HTH.

Solution 3

Just to post the final result.

The server (Sun T Series) is just that slow for a single-threaded Java application. Go figure.

Thank you all for the answers and support.

Solution 4

Your dtrace output shows that your application spends most of its time on writes. You should check (using dtrace, of course :-)) where these writes go. If they go to a Solaris filesystem, then you should check whether the filesystem is a bottleneck.
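A minimal sketch of how you might break the writes down by destination (it assumes the fds[] builtin available in Solaris 10 DTrace; <pid> stands for the Java process id):

```d
/* writes.d -- sum bytes passed to write(2) per destination path.
 * Run as: dtrace -s writes.d -p <pid>
 */
syscall::write:entry
/pid == $target/
{
    @bytes[fds[arg0].fi_pathname] = sum(arg2);
}
```

On Ctrl-C this prints the total bytes written per file, so a hot log file or a slow filesystem mount stands out immediately.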

Author: fglez

Updated on September 17, 2022

Comments

  • fglez
    fglez over 1 year

    We have a Java application which is significantly slower on a Solaris 10 server than on a Windows PC.

    We have profiled (-Xprof) the application and observed that UnixFileSystem.getBooleanAttributes0 method consumes about 40% CPU time with native calls.

    How can we continue our investigation to identify the cause of the slow behaviour?

    Update:

    We have executed dtrace and observed that the CPU time is 0.489 s, while the time(1) results are

    
    real 0m40.759s
    user 0m47.669s
    sys 0m2.287s
    

    It seems the time is spent not on the CPU but somewhere else. How can we proceed with further investigation given these dtrace results? Can we use other dtrace options to show more info?

    Thank you very much.

    
    # /dtrace-toolkit/DTraceToolkit-0.99/procsystime -aTp 5695
    Tracing... Hit Ctrl-C to end...
    dtrace: 1032 dynamic variable drops with non-empty dirty list
    ^C
    
    Elapsed Times for PID 5695,
    
             SYSCALL          TIME (ns)
               gtime               7799
           sysconfig               8018
            lwp_kill              10105
         getsockname              10215
          systeminfo              10984
           getrlimit              11093
            lwp_self              20210
            schedctl              26361
        lwp_continue              27349
         priocntlsys              33390
              unlink              53050
         lwp_sigmask             136303
          setcontext             143447
       so_socketpair             202427
          lwp_create             223626
                 brk             227141
           so_socket             263276
             memcntl             461091
               ioctl            1094955
    lwp_cond_broadcast            1946728
    lwp_mutex_wakeup            2579492
     lwp_cond_signal            2903277
                mmap            3082973
              access            4446250
            mprotect            4534229
          getdents64            5713309
              mmap64            6272480
                fsat            6925239
                stat            6926115
                open            8324983
               fcntl            8840868
               close           11264404
              llseek           11678791
              munmap           11867401
             fstat64           17852129
            lwp_park           22098570
              open64           23549922
              stat64           39673214
                read           41876097
         resolvepath           81905859
               yield          175946072
               write          225507878
    lwp_mutex_timedlock          887174392
             pollsys        20393318412
       lwp_cond_wait       345099369278
              TOTAL:       367108549202
    
    CPU Times for PID 5695,
    
             SYSCALL          TIME (ns)
               gtime               1208
           sysconfig               1647
            lwp_self               2198
          systeminfo               2856
            lwp_kill               3186
         getsockname               3625
           getrlimit               4613
        lwp_continue              13839
            schedctl              14059
         priocntlsys              14498
         lwp_sigmask              20541
              unlink              45801
          setcontext              96874
                 brk             121804
            lwp_park             158601
       so_socketpair             195178
          lwp_create             209017
           so_socket             241199
             memcntl             365972
               ioctl             606397
    lwp_cond_broadcast            1367556
    lwp_mutex_wakeup            1561103
     lwp_cond_signal            1803831
              access            1885436
                mmap            2655710
              llseek            3266410
                open            3614795
                fsat            4043369
          getdents64            4253373
            mprotect            4345864
               close            4547395
               fcntl            4871095
                stat            5183012
              mmap64            5308789
             fstat64            8141116
              munmap           10774529
             pollsys           11928811
    lwp_mutex_timedlock           19212980
              open64           19443958
       lwp_cond_wait           23144761
                read           28043647
              stat64           31710269
         resolvepath           70641662
               yield           77731698
               write          137678654
              TOTAL:          489282936
    
    Syscall Counts for PID 5695,
    
             SYSCALL              COUNT
           getrlimit                  1
         getsockname                  1
               gtime                  1
            lwp_kill                  1
               rexit                  1
       so_socketpair                  1
           sysconfig                  1
          systeminfo                  1
              unlink                  1
        lwp_continue                  2
          lwp_create                  2
            schedctl                  2
            lwp_self                  3
         priocntlsys                  3
           so_socket                  3
            lwp_exit                  4
          setcontext                  7
            lwp_park                 11
             memcntl                 14
                 brk                 16
         lwp_sigmask                 19
            mprotect                 25
                mmap                 56
              access                 67
    lwp_cond_broadcast                 86
               ioctl                 87
                open                100
                fsat                129
              mmap64                133
    lwp_mutex_wakeup                148
              munmap                153
     lwp_cond_signal                162
          getdents64                224
                stat                283
             pollsys                348
       lwp_cond_wait                537
    lwp_mutex_timedlock                558
              open64                590
               fcntl                625
               close                777
              stat64               1146
              llseek               1368
                read               1404
             fstat64               1559
         resolvepath               1644
               yield              10049
               write              13416
              TOTAL:              35769
    

    Update 2:

    Here is the source of the dtraced Java application. It just loads several thousand classes from one single .jar file, reproducing the same behaviour observed in a bigger app.

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Enumeration;
    import java.util.Properties;
    public class TestCL {
        public static void main(String[] args) {
            Properties properties = new Properties();
            InputStream in = TestCL.class.getClassLoader().getResourceAsStream("TestCL.properties");
            if (in == null) {
                System.err.println("TestCL.properties not found");
                System.exit(-1);
            }
            try {
                properties.load(in);
            } catch (IOException e) {
                System.err.println("TestCL.properties loading error");
                e.printStackTrace();
                System.exit(-2);
            } finally {
                try {
                    in.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            Enumeration e = properties.keys();
            while (e.hasMoreElements()) {
                String className = (String) e.nextElement();
                try {
                    Class.forName(className);
                    System.out.println("OK " + className);
                } catch (Throwable t) {
                   System.err.println("KO " + className + " " + t.getClass().getName() + " " + t.getMessage());
                }
            }
        }
    }
    

    Update 3:

    All classes are packed in a single .jar file. We have also run the test with all classes unpacked, and there's no significant improvement in performance.

    java -cp library.jar:. TestCL

    • user3789902
      user3789902 about 14 years
      Could you configure dtrace to be more verbose about resolvepath? If you have truss, you can get output like: resolvepath("/export/home/sean/getlock", "/export/home/sean/getlock", 1023) = 25. Please post this.
    • fglez
      fglez about 14 years
      Just for updating, this problem is now in Sun's hands. I'll try to post the results.
  • fglez
    fglez about 14 years
    Thanks for your response. We'll try to get permissions for running dtrace.
  • fglez
    fglez about 14 years
    Updated question with dtrace results.
  • user3789902
    user3789902 about 14 years
    I've updated with probes specific to the Java VM. This should give you more clues. Don't forget JProfiler, too. Personally, I'm a ninja at tracking down bugs and performance problems like this. I wish I were there to help you :-(
  • fglez
    fglez about 14 years
    The dtraced application is a little Java program which loads (Class.forName) several thousand classes from one .jar. It reproduces the same problem we have in a bigger application.
  • user3789902
    user3789902 about 14 years
    Right, so it's a performance bottleneck in Java's class loader. Things to try: Could you provide a dump of your $CLASSPATH? Is it large? (In the java tool, the -classpath option is a shorthand way to set the java.class.path property.) Can you place the classes to be loaded on a RAM disk and rerun the tests? For some reason the class loader is doing lots of path resolution. You may be able to force Java to use a more direct path; make sure you're using full paths like /a/b/c/d rather than ../../../a/b/c/d or a/b/c/. Are your classes mounted on NFS?