Why is Erlang slower than Java on all these small math benchmarks?

23,272

Solution 1

Erlang was not built for math. It was built with communication, parallel processing, and scalability in mind, so testing it on math tasks is a bit like testing whether your jackhammer gives you a refreshing massage.

That said, to go off topic a little:
If you want Erlang-style programming on the JVM, take a look at Scala Actors, the Akka framework, or Vert.x.

Solution 2

Benchmarks are never good for saying anything other than what they actually test. If you feel that a benchmark only tests primitives and a classic threading model, then that is what you learn about. You can now say with some confidence that Java is faster than Erlang at math on primitives, and with the classic threading model, for those types of problems. You don't know anything about performance with a large number of threads or on more involved problems, because the benchmark didn't test that.

If you are doing the types of math that the benchmark tested, go with Java because it is obviously the right tool for that job. If you want to do something heavily scalable with little to no shared state, find a benchmark for that or at least re-evaluate Erlang.

If you really need to do heavy math in Erlang, consider using HiPE (consider it anyway for that matter).

Solution 3

As pointed out in other answers, Erlang is designed to solve real-life problems effectively, and those are somewhat the opposite of benchmark problems.

But I'd like to highlight one more aspect: the concision of Erlang code (which in some cases means faster development). This is easy to see when comparing the benchmark implementations.

For example, k-nucleotide benchmark:
Erlang version: http://benchmarksgame.alioth.debian.org/u64q/program.php?test=knucleotide&lang=hipe&id=3
Java version: http://benchmarksgame.alioth.debian.org/u64q/program.php?test=knucleotide&lang=java&id=3

If you want more real-life benchmarks, I'd suggest "Comparing C++ And Erlang For Motorola Telecoms Software".

Solution 4

I took an interest in this because some of the benchmarks, such as gene sequencing, are a perfect fit for Erlang. So on http://benchmarksgame.alioth.debian.org/ the first thing I did was look at the reverse-complement implementations for both C and Erlang, as well as the testing details. I found that the test is biased because it does not discount the time Erlang takes to start the VM with its schedulers; natively compiled C starts much faster. The benchmarks basically measure: time erl -noshell -s revcomp5 main < revcomp-input.txt

Now, the benchmark says Java took 1.4 seconds and Erlang with HiPE took 11. Running the (single-threaded) Erlang code took me 0.15 seconds, and if you discount the time it took to start the VM, the actual workload took only 3000 microseconds (0.003 seconds).

So I have no idea how that is benchmarked. If it's run 100 times, it makes no sense, as the cost of starting the Erlang VM is also paid 100 times. If the input is a lot longer than the one given, it would make sense, but I see no details about that on the web page. To make the benchmarks fairer for managed languages, have the code (Erlang/Java) send a Unix signal to the Python script doing the benchmarking when it reaches the startup function.
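To make the startup-versus-workload distinction concrete, here is a minimal Java sketch that reports the two separately from inside the program. The class name, the `reverseComplement` helper, and the synthetic input are illustrative assumptions, not the benchmark's actual code; the startup figure relies on the JVM's own reported start time, which is only approximate.

```java
import java.lang.management.ManagementFactory;

final class StartupVsWorkload {
    // Hypothetical stand-in for the real workload: reverse-complement a DNA string.
    static String reverseComplement(String dna) {
        StringBuilder out = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (dna.charAt(i)) {
                case 'A': out.append('T'); break;
                case 'T': out.append('A'); break;
                case 'C': out.append('G'); break;
                case 'G': out.append('C'); break;
                default:  out.append(dna.charAt(i));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Approximate time between JVM launch and reaching main().
        long jvmStartMs = ManagementFactory.getRuntimeMXBean().getStartTime();
        long startupMs = System.currentTimeMillis() - jvmStartMs;

        // Synthetic input (an assumption; the real benchmark reads a file from stdin).
        StringBuilder input = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            input.append("AACGTT");
        }

        // Time only the workload, excluding VM startup.
        long t0 = System.nanoTime();
        String result = reverseComplement(input.toString());
        long workloadMicros = (System.nanoTime() - t0) / 1_000;

        System.out.println("startup (ms):  " + startupMs);
        System.out.println("workload (us): " + workloadMicros
                + " for " + result.length() + " bases");
    }
}
```

An external `time` wrapper around the whole process would report roughly the sum of both numbers, which is the measurement this answer objects to.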

Benchmark aside, the Erlang VM ultimately just executes machine code, as does the Java VM. So there is no way a math operation would take longer in Erlang than in Java.

What Erlang is bad at is data that needs to mutate often, such as a chained block cipher. Say you have the chars "0123456789". Your encryption XORs the first two chars with 7, then XORs the next two chars with the sum of the first two, then XORs the previous two chars with the difference of the current two, then XORs the next four chars, and so on.

Because data in Erlang is immutable, the entire char array needs to be copied each time you mutate it. That is why Erlang supports NIFs, which are C functions you can call into to solve this exact problem. In fact, all the encryption (SSL, AES, Blowfish, ...) and compression (zlib, ...) that ships with Erlang is implemented in C, and there is near-zero cost associated with calling C from Erlang.
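The copying cost described above can be illustrated with a small Java sketch that applies the same chained XOR twice: once by copying the whole array on every update, and once in place. This models the argument only, not how the Erlang runtime actually stores binaries; the cipher is a simplified stand-in (each byte is XORed with the previous result), and all names are hypothetical.

```java
import java.util.Arrays;

final class MutationCost {
    // Immutable style: every single-element update returns a fresh copy.
    static byte[] withXor(byte[] data, int index, int key) {
        byte[] copy = Arrays.copyOf(data, data.length); // O(n) copy per update
        copy[index] ^= (byte) key;
        return copy;
    }

    // Chained XOR built from immutable updates: n updates, each copying n bytes.
    static byte[] chainImmutable(byte[] input, int seed) {
        byte[] current = input;
        int key = seed;
        for (int i = 0; i < current.length; i++) {
            current = withXor(current, i, key);
            key = current[i] & 0xFF; // next key depends on the previous result
        }
        return current;
    }

    // The same chained XOR, mutating the array in place: no copies at all.
    static void chainInPlace(byte[] data, int seed) {
        int key = seed;
        for (int i = 0; i < data.length; i++) {
            data[i] ^= (byte) key;
            key = data[i] & 0xFF;
        }
    }
}
```

Both versions produce the same bytes; the difference is that the immutable version does quadratic work in the length of the array, which is the kind of overhead the answer proposes moving into C.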

So using Erlang you get the best of both worlds: the speed of C with the parallelism of Erlang.

If I were to implement reverse-complement in the FASTEST way possible, I would write the mutating code in C but the parallel code in Erlang. Assuming infinite input, I would have Erlang split on the ">" marker, roughly (as pseudocode):

<<Line/binary, ">", Rest/binary>> = read_stream

Then dispatch each block to the first available scheduler via round robin, on a cluster of infinite EC2 private-networked hidden nodes being added in real time every millisecond.

Those nodes then call out to C via NIFs for processing (C was the fastest implementation of reverse-complement on the alioth website), then send the output back to the master node, which sends it on to whoever supplied the input.
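The split-and-dispatch step can be sketched on a single machine as follows. This is a Java illustration under stated assumptions, not the Erlang design itself: it uses a local thread pool instead of a cluster, plain Java instead of a C NIF, and a simplified FASTA-like format in which each record is a header line followed by sequence lines. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelRevComp {
    static String reverseComplement(String seq) {
        StringBuilder out = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            switch (seq.charAt(i)) {
                case 'A': out.append('T'); break;
                case 'T': out.append('A'); break;
                case 'C': out.append('G'); break;
                case 'G': out.append('C'); break;
                default:  out.append(seq.charAt(i));
            }
        }
        return out.toString();
    }

    // Split the input into records at each '>' marker, dropping empty pieces.
    static List<String> splitRecords(String input) {
        List<String> records = new ArrayList<>();
        for (String part : input.split(">")) {
            if (!part.isEmpty()) {
                records.add(part);
            }
        }
        return records;
    }

    // Keep the header line, join the sequence lines, reverse-complement them.
    static String processRecord(String record) {
        int nl = record.indexOf('\n');
        String header = nl < 0 ? record : record.substring(0, nl);
        String seq = nl < 0 ? "" : record.substring(nl + 1).replace("\n", "");
        return ">" + header + "\n" + reverseComplement(seq);
    }

    // Dispatch every record to a worker pool and collect results in input order.
    static List<String> processAll(String input, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String record : splitRecords(input)) {
                pending.add(pool.submit(() -> processRecord(record)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, `ParallelRevComp.processAll(">a\nAACG\n>b\nTT\nGG\n", 2)` returns the two processed records in their original order. Extending this beyond one machine is the part the answer argues is far easier in Erlang.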

To implement all this in Erlang, I would write code as if I were writing a single-threaded program; it would take me under a day.

To implement this in Java, I would have to write the single-threaded code, take the performance hit of calling from managed to unmanaged code (as we would obviously use the C implementation for the grunt work), then rewrite it to support 64 cores. Then rewrite it to support multiple CPUs. Then rewrite it again to support clustering. Then rewrite it again to fix memory issues.

And that is Erlang in a nutshell.

Solution 5

The Erlang solution uses ETS (Erlang Term Storage), which is like an in-memory database running in a separate process. Because it is in a separate process, all messages to and from that process must be serialized and deserialized. I should think this accounts for a lot of the slowness. For example, in the "regex-dna" benchmark, which doesn't use ETS, Erlang is only slightly slower than Java.

Author by

yetanothercoder

java coder

Updated on December 09, 2020

Comments

  • yetanothercoder
    yetanothercoder over 3 years

    While considering alternatives to Java for a distributed/concurrent/failover/scalable backend environment, I discovered Erlang. I've spent some time on books and articles, nearly all of which (even those by Java devotees) say that Erlang is a better choice in such environments, as many useful things come out of the box in a less error-prone way.

    I was sure that Erlang would be faster in most cases, mainly because of its different garbage collection strategy (per process), absence of shared state (between threads and processes), and more compact data types. But I was very surprised when I found comparisons of Erlang vs Java math samples where Erlang is slower by one to two orders of magnitude, i.e. 10x to 100x.

    Even on concurrent tasks, both on several cores and a single one.

    What are the reasons for that? These answers came to mind:

    • Usage of Java primitives (=> no heap/gc) on most of the tasks
    • Same number of threads in Java code and Erlang processes so the actor model has no advantage here
    • Or just that Java is statically typed, while Erlang is not
    • Something else?
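The first bullet (primitives avoid the heap and the garbage collector) can be seen within Java itself. The sketch below is a hedged illustration with hypothetical names: it computes the same sum with a primitive `long` and with a boxed `Long`, where the boxed version unboxes and re-boxes the running total on every iteration. It says nothing definite about how Erlang represents numbers.

```java
final class PrimitiveVsBoxed {
    // Primitive accumulator: lives in a local variable, no objects involved.
    static long sumPrimitive(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Boxed accumulator: each iteration unboxes, adds, and boxes a new Long
    // (small values may come from the Long cache, larger ones are allocated).
    static Long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total = total + i;
        }
        return total;
    }
}
```

Both methods return the same value; any speed difference depends on the JVM and how well it optimizes the boxing away, so it should be measured rather than assumed.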

    If that's because these are very specific math algorithms, can anybody show more realistic, practical performance tests?

    UPDATE: The answers so far amount to saying that Erlang is not the right tool for this specific "fast Java" case, but one thing is still unclear to me: what is the main reason for Erlang's inefficiency here? Dynamic typing, GC, or poor native compilation?

  • Emil Vikström
    Emil Vikström over 11 years
    Ericsson, the creators and maintainers of Erlang, is one of the biggest providers of telecommunications equipment worldwide, and one of the largest companies in Sweden. Erlang is more than two decades old. None of your reasons are relevant to Java outperforming Erlang in this benchmark.
  • yetanothercoder
    yetanothercoder over 11 years
    OK, good point, but what do you think is the main reason for the poor math performance here: dynamic typing? Scala and Akka, being on top of the JVM, have the same "JVM architecture issues": a global GC, which is a very serious issue for big heaps, and no valid hot-redeploy option, only a restart, if you want to update PROD with minimal "strange" issues.
  • yetanothercoder
    yetanothercoder over 11 years
    By the way, I see why Java is faster here: almost everything is compiled to native code, plus no heap, plus static typing. What about Erlang here: since it's HiPE, isn't it compiled to native code too? What about the heap? Or does only static vs dynamic typing play the crucial role here?
  • rvirding
    rvirding over 11 years
    @EmilVikström Well, they sort of are. While Ericsson is a large company, only a small group within Ericsson, about 20 people, supports, maintains, and develops Erlang. What is probably more important is that Erlang was designed for a different type of application than Java: specifically, massively concurrent fault-tolerant applications. There are real-life products that run millions of TCP connections on one machine: blog.whatsapp.com/index.php/2012/01/1-million-is-so-2011
  • npe
    npe over 11 years
    I do not know Erlang well enough to tell why. And of course the JVM has its issues. What I meant was: "use the proper tool for your problem". Erlang is great for sending messages. Not necessarily so great for processing them. If you do math calculations, use Matlab, or C, or Assembler. If you do statistics, use R, and so on, and so on.
  • Blake
    Blake almost 11 years
    You made a false comparison: that Java program isn't included; it's listed under "'wrong' (different) algorithm / less comparable programs". The Java program from the comparison shows 13 secs and source code size 1630, versus 157 secs and source code size 932 for the Erlang program. benchmarksgame.alioth.debian.org/u64q/…
  • SudoKid
    SudoKid over 7 years
    Here is a great example, from one of the performance guys at Netflix, of why not to trust benchmarks. dtrace.org/blogs/brendan/2014/02/11/another-10-performance-wins
  • Emil Vikström
    Emil Vikström over 7 years
    @EmettSpeer Interesting post but I fail to see the connection to benchmarking.
  • zenw0lf
    zenw0lf over 7 years
    How can you comment on something if you don't know anything about it?
  • Lothar
    Lothar almost 7 years
    And the reason why Erlang has to do this is because it is not statically typed.
  • Peter R
    Peter R over 6 years
    Not all terms in and out of ETS have to be serialized/deserialized.