When is Java faster than C++ (or when is JIT faster than precompiled)?

java performance optimization compiler-construction jit

20,799

Solution 1

In practice, you're likely to find your naively written Java code outperform your naively written C++ code in these situations (all of which I've personally observed):

Lots of little memory allocations/deallocations. The major JVMs have extremely efficient memory subsystems, and garbage collection can be more efficient than requiring explicit freeing (plus it can shift memory addresses and such if it really wants to).
Efficient access through deep hierarchies of method calls. The JVM is very good at eliding anything that is not necessary, usually better in my experience than most C++ compilers (including gcc and icc). In part this is because it can do dynamic analysis at runtime (i.e. it can overoptimize and only deoptimize if it detects a problem).
Encapsulation of functionality into small short-lived objects.

In each case, if you put the effort in, C++ can do better (between free lists and block-allocated/deallocated memory, C++ can beat the JVM memory system in almost every specific case; with extra code, templates, and clever macros, you can collapse call stacks very effectively; and you can have small partially-initialized stack-allocated objects in C++ that outperform the JVM's short-lived object model). But you probably don't want to put the effort in.
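As a rough illustration of the "small short-lived objects" case, here is a minimal Java sketch. The class and method names (ShortLivedObjects, Vec2, sumOfScaled) are made up for this example and do not come from the answer. It only shows the allocation pattern being described; how fast it runs depends on the JVM, its garbage collector, and its flags.

```java
public class ShortLivedObjects {
    // Hypothetical small immutable value type; every operation allocates a new instance.
    static final class Vec2 {
        final double x, y;

        Vec2(double x, double y) {
            this.x = x;
            this.y = y;
        }

        Vec2 plus(Vec2 other) {
            return new Vec2(x + other.x, y + other.y);
        }

        Vec2 scale(double k) {
            return new Vec2(x * k, y * k);
        }
    }

    // Creates several temporary Vec2 objects per iteration.
    // Each temporary becomes garbage almost immediately.
    static double sumOfScaled(int n) {
        Vec2 acc = new Vec2(0, 0);
        for (int i = 0; i < n; i++) {
            Vec2 step = new Vec2(i, 1).scale(0.5);
            acc = acc.plus(step);
        }
        return acc.x + acc.y;
    }

    public static void main(String[] args) {
        System.out.println(sumOfScaled(1_000_000));
    }
}
```

A naive C++ translation that used new and delete for every Vec2 would go through the general-purpose heap on each iteration, whereas idiomatic C++ would keep these values on the stack. That is the extra effort the paragraph above refers to.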

Solution 2

Some examples:

The JIT compiler can produce very CPU-specific machine code using e.g. the newest SSE extensions that one would not use in precompiled code that needs to run on a wide range of CPUs.
The JIT knows when a virtual method (the default in Java) is not overridden anywhere, and thus can be inlined (though this requires the ability to un-inline it when a new class is loaded that does override the method; current Java JIT compilers actually do this).
Related to that, escape analysis allows several situation-specific optimizations.
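The last two points can be sketched in Java as follows. All names here (InliningSketch, Shape, Square, totalArea, areaOfTemp) are hypothetical, and the comments describe optimizations a JIT may apply, not ones it is guaranteed to apply on any given JVM.

```java
public class InliningSketch {
    interface Shape {
        double area();
    }

    // Assumption for this sketch: Square is the only Shape implementation ever loaded.
    static class Square implements Shape {
        private final double side;

        Square(double side) {
            this.side = side;
        }

        @Override
        public double area() {
            return side * side;
        }
    }

    // In bytecode, s.area() is an interface (virtual) call. While only Square
    // is loaded, a JIT can treat the call as direct and inline side * side,
    // then deoptimize if another Shape implementation is loaded later.
    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    // The Square created here never leaves the method. Escape analysis can
    // prove that, which may let the JIT skip the heap allocation entirely.
    static double areaOfTemp(double side) {
        Shape s = new Square(side);
        return s.area();
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2), new Square(3) };
        System.out.println(totalArea(shapes));
        System.out.println(areaOfTemp(4));
    }
}
```

A static C++ compiler can do the same for totalArea only if it can see the whole program or is told that no other subclass exists, for example through final or link-time optimization.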

Solution 3

Wikipedia: http://en.wikipedia.org/wiki/Just-in-time_compilation#Overview

In addition, it can in some cases offer better performance than static compilation, as many optimizations are only feasible at run-time:

The compilation can be optimized to the targeted CPU and the operating system model where the application runs. For example JIT can choose SSE2 CPU instructions when it detects that the CPU supports them. To obtain this level of optimization specificity with a static compiler, one must either compile a binary for each intended platform/architecture, or else include multiple versions of portions of the code within a single binary.

The system is able to collect statistics about how the program is actually running in the environment it is in, and it can rearrange and recompile for optimum performance. However, some static compilers can also take profile information as input.

The system can do global code optimizations (e.g. inlining of library functions) without losing the advantages of dynamic linking and without the overheads inherent to static compilers and linkers. Specifically, when doing global inline substitutions, a static compilation process may need run-time checks and ensure that a virtual call would occur if the actual class of the object overrides the inlined method, and boundary condition checks on array accesses may need to be processed within loops. With just-in-time compilation in many cases this processing can be moved out of loops, often giving large increases of speed.

Although this is possible with statically compiled garbage collected languages, a bytecode system can more easily rearrange executed code for better cache utilization.
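To make the point about array boundary checks inside loops concrete, here is a small Java sketch. The class and method names (LoopChecks, sum, sumSpelledOut) are assumptions for illustration. The second method writes out by hand the check that the language semantics imply on every array access, and the first is what one would normally write.

```java
public class LoopChecks {
    // Ordinary loop. The loop condition already guarantees 0 <= i < data.length,
    // so a JIT can prove the per-access bounds check redundant and drop it
    // or move it out of the loop.
    static long sum(int[] data) {
        long total = 0;
        for (int i = 0; i < data.length; i++) {
            total += data[i];
        }
        return total;
    }

    // The same loop with the implied check written out explicitly,
    // to show the work that would otherwise be repeated on every iteration.
    static long sumSpelledOut(int[] data) {
        long total = 0;
        for (int i = 0; i < data.length; i++) {
            if (i < 0 || i >= data.length) {
                throw new ArrayIndexOutOfBoundsException(i);
            }
            total += data[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = { 1, 2, 3, 4 };
        System.out.println(sum(data));
        System.out.println(sumSpelledOut(data));
    }
}
```

Both methods return the same result. Whether the check is actually eliminated depends on the JVM and on how the loop is written, so treat this as a sketch of the idea and not a performance claim.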


Author by

kostja


Updated on July 09, 2022

Comments

kostja almost 2 years

Possible Duplicate:
JIT compiler vs offline compilers

I have heard that under certain circumstances, Java programs or rather parts of java programs are able to be executed faster than the "same" code in C++ (or other precompiled code) due to JIT optimizations. This is due to the compiler being able to determine the scope of some variables, avoid some conditionals and pull similar tricks at runtime.

Could you give an (or better - some) example, where this applies? And maybe outline the exact conditions under which the compiler is able to optimize the bytecode beyond what is possible with precompiled code?

NOTE: This question is not about comparing Java to C++. It's about the possibilities of JIT compiling. Please no flaming. I am also not aware of any duplicates. Please point them out if you are.
Vishy over 13 years

The first point is additionally valid because many Java libraries were written before newer CPU architectures were available. These old libraries still make use of the latest CPU improvements. To make use of the latest architecture in C++ you have to be able to compile from source, which may not be possible or practical with third-party libraries, especially if the developer is not the end user. For example, if you have an application which must be deployed to many different types of PCs, it can be a nightmare to release a build for every possible platform, so the lowest common denominator is often chosen.
Vishy over 13 years

I believe the JIT can perform polymorphic inlining, i.e. it tracks up to two "virtual" method implementations which are usually called at a call site; these can be inlined, and if the object is not one of these classes there is a fallback. This means even virtual methods with multiple possible implementations can be inlined based on runtime behaviour.
kostja over 13 years

a good link is sometimes as good as an answer can get. very insightful. thank you
Ben Voigt over 13 years

C++ profile-guided optimizers use these same tricks.
Ben Voigt over 13 years

Great information, but reading closely reveals that precompilation actually can and does do many of the "JIT-only" optimizations.
Ben Voigt over 13 years

Nice straw man argument. C++ doesn't require you to use malloc, see for example wireshark's pool allocator.
kostja over 13 years

thank you. this is the level of detail i hoped for.
kostja over 13 years

so the argument doesn't apply to C++ allocation with new then? Is it because heap memory allocated by HeapAlloc and the like is used for allocation with new?
Aleksandr Dubinsky about 8 years

@BenVoigt That's a good point. The main remaining argument is that the JIT has access to information specific to the running process, instead of a pre-created profile or similar. Therefore, it can perform bold optimizations more frequently and with greater chances of success.