Can C/C++ software be compiled into bytecode for later execution? (Architecture independent unix software.)


Solution 1

There are several C to JVM compilers listed on Wikipedia's JVM page. I've never tried any of them, but they sound like an interesting exercise to build.

Because of its close association with the Java language, the JVM performs the strict runtime checks mandated by the Java specification. That requires C-to-bytecode compilers to provide their own "lax machine abstraction", for instance by producing compiled code that uses a Java array to represent main memory (so pointers can be compiled to integers), and by linking the C library to a centralized Java class that emulates system calls. Most or all of the compilers listed there use a similar approach.
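As a rough illustration of the "array as main memory" idea, here is a toy model in Python. It is only a sketch of the concept, not what any of those compilers actually emits: the `FlatMemory` class, its bump allocator, and the method names are all made up for this example. A `bytearray` stands in for the Java array, and a C pointer becomes a plain integer index into it.

```python
import struct


class FlatMemory:
    """Toy model of a C address space stored in one array (hypothetical helper)."""

    def __init__(self, size):
        self.cells = bytearray(size)  # plays the role of the Java array
        self.brk = 4                  # next free address; address 0 is kept as NULL

    def malloc(self, nbytes):
        """Bump allocator: returns an integer 'pointer', or 0 when out of memory."""
        addr = self.brk
        if addr + nbytes > len(self.cells):
            return 0
        self.brk += nbytes
        return addr

    def store_i32(self, addr, value):
        # Write a 32-bit little-endian signed int at the given address.
        struct.pack_into("<i", self.cells, addr, value)

    def load_i32(self, addr):
        # Read a 32-bit little-endian signed int from the given address.
        return struct.unpack_from("<i", self.cells, addr)[0]


mem = FlatMemory(64)

# Roughly what this C fragment could turn into:
#   int *p = malloc(2 * sizeof(int)); p[0] = 42; p[1] = p[0] + 1;
p = mem.malloc(8)
mem.store_i32(p, 42)
mem.store_i32(p + 4, mem.load_i32(p) + 1)  # pointer arithmetic is integer arithmetic
print(p, mem.load_i32(p + 4))
```

Because every load and store goes through an array index, the host runtime's bounds checks still apply, but only at the edges of the whole array, not per C object.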

Solution 2

C compiled to LLVM bitcode is not platform independent. Have a look at Google's Portable Native Client (PNaCl), which tries to address that.

Adobe has Alchemy, which lets you compile C to Flash.

There are C to Java or even JavaScript compilers. However, due to differences in memory management, they aren't very usable.

Solution 3

WebAssembly is trying to address that now by creating a standard bytecode format for the web. Unlike JVM bytecode, WebAssembly is low level: it works at the abstraction level of C/C++ rather than Java, so it is closer to what is typically called an "assembly language", which is what C/C++ code is normally compiled to.

Solution 4

LLVM is not a good solution for this problem. As beautiful as LLVM IR is, it is by no means machine independent, nor was it intended to be. It is very easy, and indeed necessary in some languages, to generate target dependent LLVM IR: sizeof(void*), for example, will be 4 or 8 or whatever when compiled into IR.
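To see the kind of value that gets baked in, here is a small Python sketch (not LLVM itself) that asks the platform it runs on for its pointer size, using the standard `ctypes` and `struct` modules. The number printed depends on the machine and build. A C front end does the equivalent lookup for its target and writes the result into the IR as a constant.

```python
import ctypes
import struct

# Size in bytes of a C `void *` on the platform this interpreter was built for.
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)

# struct's "P" format code is also a native `void *`, so the two should agree.
print(pointer_bytes, struct.calcsize("P"))
```

Once a constant like this has been folded into the IR, moving the same IR to a target with a different pointer size no longer gives a correct program.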

LLVM also does nothing to provide OS independence.

One interesting possibility might be QEMU. You could compile a program for a particular architecture and then use QEMU user space emulation to run it on different architectures. Unfortunately, this might solve the target machine problem, but doesn't solve the OS problem: QEMU Linux user mode emulation only works on Linux systems.

JVM is probably your best bet for both target and OS independence if you want to distribute binaries.

Solution 5

As Ankur mentions, C++/CLI may be a solution. You can use Mono to run it on Linux, as long as it has no native bits. But unless you already have a code base you are trying to port at minimal cost, using it may be counterproductive. If it makes sense in your situation, you should go with Java or C#.

Most people who go with C++ do it for performance reasons, but unless you play with very low level stuff, you'll be done coding earlier in a higher level language. This in turn gives you the time to optimize so that by the time you would have been done in C++, you'll have an even faster version in whatever higher level language you choose to use.

Author: jkj

A nerd, programmer and sysadmin.

Updated on June 26, 2022

Comments

  • jkj
    jkj almost 2 years

    I would like to compile existing software into a representation that can later be run on different architectures (and OSes).

    For that I need a (byte)code that can be easily run/emulated on another arch/OS (LLVM IR? Some RISC assembly?)

    Some random ideas:

    • Compiling into JVM bytecode and running with java. Too restricting? C-compilers available?
    • MS CIL. C-Compilers available?
    • LLVM? Can the intermediate representation be run later?
    • Compiling into RISC arch such as MMIX. What about system calls?

    Then there is the system call mapping thing, but e.g. BSD have system call translation layers.

    Are there any already working systems that compile C/C++ into something that can later be run with an interpreter on another architecture?


    Edit

    Could I compile existing Unix software into a not-so-low-level binary, which could be "emulated" more easily than by running a full x86 emulator? Something more like the JVM than Xen HVM.

  • jkj
    jkj almost 13 years
    User mode QEMU could work. It would just have to do the system call mapping itself rather than passing the calls straight to the host system.
  • Richard Pennington
    Richard Pennington almost 13 years
    That's what it does for Linux. It could also do it for other OSes. A big job, though.
  • jkj
    jkj almost 13 years
    My hidden agenda is to be able to run a somewhat virtualized Unix environment without going deep into emulation. It'd be okay to compile for a single architecture as long as it can be emulated in userland. I'd really like to find some not-so-low-level arch and an interpreter/JIT compiler for running its binaries on x86.
  • echristo
    echristo over 12 years
    I understand, you'll just have to limit your choice of language to something that's designed to work on multiple platforms or design a subset of C/C++ that has the same effect.
  • Suici Doga
    Suici Doga almost 7 years
    If you use one of these, will you have the GC and memory problems / lag of Java?
  • Don Kirkby
    Don Kirkby almost 7 years
    I've never tried any of them, but I don't think you would. The description says that they use an array to represent main memory, so there wouldn't be any garbage collection. I'm almost certain it would be less memory efficient than natively compiled C, though.
  • Suici Doga
    Suici Doga almost 7 years
    Will it be able to use optimizations for the user's CPU like normal Java applications (SSE4, AVX, etc.)?
  • Don Kirkby
    Don Kirkby almost 7 years
    I don't know, @SuiciDoga, I've never tried any of them.