Any way to boost JVM Startup Speed?


Solution 1

Try Nailgun.

Note: I don't use it personally.

Solution 2

Just learned about drip today, as an alternative to Nailgun: https://github.com/flatland/drip. For some general hints, see also https://github.com/jruby/jruby/wiki/Improving-startup-time

Solution 3

I refer you to Matthew Gilliard's (mjg) blog post on the topic. Any code examples below come straight from there. I won't include timing examples partly to keep this short and partly to induce you to visit his page. Matthew works on the Fn Project so he's very interested in figuring out how to keep startup times low.

Apparently there are a few ways to do it, and some are pretty easy as well. The core idea is that you cache the JVM's initialization cycle instead of executing it on every startup.

Class Data Sharing (CDS)

CDS caches the deterministic (but hardware-dependent) part of the JDK's startup, namely the loading of core classes. It's the easiest and oldest trick in the book (available since 1.5, I believe), yet not very well known.

From Oracle

When the JVM starts, the shared archive is memory-mapped to allow sharing of read-only JVM metadata for these classes among multiple JVM processes. The startup time is reduced thus saving the cost because restoring the shared archive is faster than loading the classes.

You can create the dump manually by running

⇒ java -Xshare:dump
Allocated shared space: 50577408 bytes at 0x0000000800000000
Loading classes to share ...
// ...snip ...
total   :  17538717 [100.0% of total] out of  46272512 bytes [ 37.9% used]

...and then use it with

java -Xshare:on HelloJava
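
To confirm that the shared archive is actually in use, one option is to look at the VM info string from inside the program. This is a minimal sketch, assuming a HotSpot JVM, where the `java.vm.info` system property typically contains the word `sharing` when CDS is active. The class name `CdsCheck` is made up for illustration.

```java
public class CdsCheck {
    // Returns true if the given VM info string reports class data sharing.
    // Assumption: HotSpot formats the string like "mixed mode, sharing".
    public static boolean isSharingEnabled(String vmInfo) {
        return vmInfo != null && vmInfo.contains("sharing");
    }

    public static void main(String[] args) {
        String vmInfo = System.getProperty("java.vm.info");
        System.out.println("java.vm.info = " + vmInfo);
        System.out.println("CDS active: " + isSharingEnabled(vmInfo));
    }
}
```

Running it once with `java -Xshare:on CdsCheck` and once with `java -Xshare:off CdsCheck` should show whether the flag makes a difference on your JVM.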

AOT: Ahead of Time Compilation (Java 9+)

From mjg's blog

Where CDS does some parts of classloading of core classes in advance, AOT actually compiles bytecode to native code (an ELF-format shared-object file) in advance, and can be applied to any bytecode.

Use SubstrateVM (Java 8+)

Not in the blog post but demonstrated during the talk he gave a few days ago.

From the readme:

Substrate VM is a framework that allows ahead-of-time (AOT) compilation of Java applications under closed-world assumption into executable images or shared objects (ELF-64 or 64-bit Mach-O).

Solution 4

Change your program to a client/server model, where the Java part is a persistent server that is started only once and is fed by a client that tells it what to do. The client could be a small Python script telling the server process which files to consume. Commands could be sent over a socket or via signals; it's up to you.
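
A rough sketch of the server side of this idea, under a few assumptions of my own: a line-based protocol (one file path per line in, one reply line out), a made-up default port 9999, and a placeholder `process` method standing in for the real parsing work. It binds to the loopback address only, so it is not reachable from other machines, and it serves one client at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ParseServer {
    // Hypothetical stand-in for the real work (e.g. calling a parser library).
    public static String process(String path) {
        return "processed " + path;
    }

    // Reads one path per line and writes one reply line per path.
    public static void respond(BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            out.println(process(line));
        }
    }

    // Accepts clients one after another; the JVM stays warm between requests.
    public static void serve(ServerSocket server) throws IOException {
        while (!server.isClosed()) {
            try (Socket client = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         client.getInputStream(), StandardCharsets.UTF_8));
                 PrintWriter out = new PrintWriter(new OutputStreamWriter(
                         client.getOutputStream(), StandardCharsets.UTF_8), true)) {
                respond(in, out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 9999 is an arbitrary choice for this sketch.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 9999;
        try (ServerSocket server =
                     new ServerSocket(port, 50, InetAddress.getLoopbackAddress())) {
            serve(server);
        }
    }
}
```

Any client that can open a TCP connection to localhost and write lines can drive it, so the startup cost is paid once instead of once per file.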

Solution 5

Um... write the documents to a directory (if they're not already) and have the Java program process all of them in one go?
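
As a sketch of what "in one go" could look like: a single JVM run that walks a directory and handles every file in it. The `process` method is a placeholder for the real per-file work, and the class name `BatchProcess` is made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class BatchProcess {
    // Hypothetical stand-in for the real per-file work.
    public static String process(Path file) {
        return "processed " + file.getFileName();
    }

    // Processes every regular file directly inside dir within one JVM run.
    public static List<String> processAll(Path dir) throws IOException {
        List<String> results = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(Files::isRegularFile)
                 .sorted()
                 .forEach(f -> results.add(process(f)));
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        processAll(dir).forEach(System.out::println);
    }
}
```

With 10k files the JVM startup is paid once rather than 10k times, at the cost of getting results back in a batch instead of per call.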

Author by

Phyo Arkar Lwin


Updated on June 05, 2022

Comments

  • Phyo Arkar Lwin
    Phyo Arkar Lwin almost 2 years

    It is said that Java is 10x faster than Python in terms of performance. That's what I see from benchmarks too. But what really brings Java down is the JVM startup time.

    This is a test I made:

    $time xlsx2csv.py Types\ of\ ESI\ v2.doc-emb-Package-9
    ...
    <output skipped>
    real    0m0.085s
    user    0m0.072s
    sys     0m0.013s
    
    
    $time java  -jar -client /usr/local/bin/tika-app-0.7.jar -m Types\ of\ ESI\ v2.doc-emb-Package-9
    
    real    0m2.055s
    user    0m2.433s
    sys     0m0.078s
    

    Same file, a 12 KB MS XLSX file embedded inside a DOCX, and Python is 25x faster! WTH!

    It takes 2.055 sec for Java.

    I know it is all due to startup time, but I need to call it from a script to parse some documents, because I do not want to reinvent the wheel in Python.

    But for parsing 10k+ files, it is just not practical.

    Is there any way to speed it up? (I already tried the -client option and it only sped things up a little, about 20%.)

    My other idea: run it as a long-running daemon and communicate locally using UDP or Linux IPC sockets?

  • Phyo Arkar Lwin
    Phyo Arkar Lwin over 13 years
    Sounds perfect!! That's what I need!! Let me try it out and I will let you know.
  • Phyo Arkar Lwin
    Phyo Arkar Lwin over 13 years
    Thanks, but what I am doing is a server-side, ajaxed web app. Yes, I already have a "process all" button, directory browser, search engine, everything already written in Python (the search engine is Sphinx, in C).
  • Phyo Arkar Lwin
    Phyo Arkar Lwin over 13 years
    The problem is, every time something is parsed it needs to communicate back (for processing and putting into the DB), so that's not the point. Thanks though, I already considered this option.
  • Phyo Arkar Lwin
    Phyo Arkar Lwin over 13 years
    PERFECT solution for me. I tested it and was amazed how simple it is: without ever needing to write a single line of Java, it directly gives you a long-running client-server process! Nailgun rocks!
  • Thilo-Alexander Ginkel
    Thilo-Alexander Ginkel about 11 years
    Does drip work for you with JRuby >= 1.7.2? My attempts to measure any significant speedup have not been successful so far (even rake environment executed on a trivial project generated via rails new <name> does not benefit).
  • rogerdpack
    rogerdpack about 11 years
    [I haven't tried it ever.] What OS? does nailgun work/help? (maybe ask the drip people?)
  • rogerdpack
    rogerdpack almost 11 years
    Apparently it "can" work with jruby, and should have a dripMain method since 1.7.1 I believe crashruby.com/2013/01/21/drip-with-jruby
  • rogerdpack
    rogerdpack almost 11 years
    stackoverflow.com/questions/1491325/… also mentions drip etc...
  • Gaurav
    Gaurav over 4 years
    @Zan great answer! P.S. Nailgun is not secure.