Linux cannot compile without GCC optimizations; implications?


You've combined several different (but related) questions. A few of them aren't really on-topic here (e.g., coding standards), so I'm going to ignore those.

I'm going to start with whether the kernel is "technically incorrect C code". I'm starting here because the answer explains the special position a kernel occupies, which is critical to understanding the rest.

Is the Kernel Technically Incorrect C Code?

The answer is that it's definitely "incorrect".

There are a few ways in which a C program can be said to be incorrect. Let's get a few simple ones out of the way first:

  • A program which doesn't follow the C syntax (i.e., has a syntax error) is incorrect. The kernel uses various GNU extensions to the C syntax. Those are, as far as the C standard is concerned, syntax errors. (Of course, to GCC, they are not. Try compiling with -std=c99 -pedantic or similar... There's a minimal example just after this list.)
  • A program which doesn't do what it's designed to do is incorrect. The kernel is a huge program and, as even a quick check of its changelogs will prove, it surely doesn't always do what it's designed to do. Or, as we'd commonly say, it has bugs.
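
To make the first bullet concrete, here's a minimal, hypothetical example (mine, not code from the kernel) of one such GNU extension: a statement expression, the construct behind the kernel's min()/max() macros. GCC accepts it silently; with -std=c99 -pedantic it complains, because ISO C has no such construct.

#include <stdio.h>

/* GNU statement expression: a braced block used as an expression,
   whose value is that of its last statement. Not ISO C. */
#define DOUBLE(x) ({ int tmp_ = (x); tmp_ * 2; })

int main(void) {
    printf("%d\n", DOUBLE(21));   /* prints 42 under gcc */
    return 0;
}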

What Optimization Means in C

[NOTE: This section contains a very loose restatement of the actual rules; for details, see the standard and search Stack Overflow.]

Now for the one that takes more explanation. The C standard says that certain code must produce certain behavior. It also says certain things which are syntactically valid C have "undefined behavior"; an (unfortunately common!) example is to access beyond the end of an array (e.g., a buffer overflow).

Undefined behavior is powerful stuff: if a program contains even a tiny bit of it, the C standard no longer cares what behavior the program exhibits or what output a compiler produces when faced with it.
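
As a minimal sketch (my example, not the standard's), here's the "access beyond the end of an array" case from above. The standard places no requirements at all on what this program does; it might print garbage, crash, or appear to work:

#include <stdio.h>

int main(void) {
    int a[4] = {1, 2, 3, 4};
    printf("%d\n", a[4]);   /* one past the last element: undefined behavior */
    return 0;
}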

But even if the program contains only defined behavior, C still allows the compiler a lot of leeway. As a trivial example (note: for my examples, I'm leaving out #include lines, etc., for brevity):

void f() {
    int *i = malloc(sizeof(int));
    *i = 3;
    *i += 2;
    printf("%i\n", *i);
    free(i);
}

That should, of course, print 5 followed by a newline. That's what's required by the C standard.

If you compile that program and disassemble the output, you'd expect malloc to be called to get some memory, the pointer returned stored somewhere (probably a register), the value 3 stored to that memory, then 2 added to that memory (maybe even requiring a load, add, and store), then the value passed as an argument (on the stack or in a register) along with a pointer to the string "%i\n", then the printf function called. A fair bit of work. But instead, what you might see is as if you'd written:

/* Note this isn't hypothetical; gcc 4.9 at -O1 or higher does this. */
void f() { printf("%i\n", 5); }

and here's the thing: the C standard allows that. The C standard only cares about the results, not the way they are achieved.

That's what optimization in C is about. The compiler comes up with a smarter (generally either smaller or faster, depending on the flags) way to achieve the results required by the C standard. There are a few exceptions, such as GCC's -ffast-math option, but otherwise the optimization level does not change the behavior of technically correct programs (i.e., ones containing only defined behavior).
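
To illustrate that -ffast-math exception with a sketch of my own (whether gcc actually transforms this depends on version and flags): the program below contains only defined behavior, and IEEE arithmetic requires it to print 0. With -ffast-math, gcc is permitted to reassociate the expression as (big - big) + small and may print 1 instead.

#include <stdio.h>

int main(void) {
    double big = 1e16, small = 1.0;
    /* 1e16 + 1.0 rounds back to 1e16 in double precision, so the
       IEEE-faithful answer is 0; a reassociating compiler may say 1. */
    printf("%g\n", (big + small) - big);
    return 0;
}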

Can You Write a Kernel Using Only Defined Behavior?

Let's continue to examine our example program. The version we wrote, not what the compiler turned it into. The first thing we do is call malloc to get some memory. The C standard tells us what malloc does, but not how it does it.

If we look at an implementation of malloc aimed at clarity (as opposed to speed), we'd see that it makes some syscall (such as mmap with MAP_ANONYMOUS) to get a large chunk of memory. It internally keeps some data structures telling it which parts of that chunk are used vs. free. It finds a free chunk at least as large as what you asked for, carves out the amount you asked for, and returns a pointer to it. It's also entirely written in C, and contains only defined behavior. If it's thread-safe, it may contain some pthread calls.
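
Here's a toy, clarity-first sketch in that spirit (mine, not any real allocator's code: it never reuses freed memory, purely to show where the syscall boundary sits):

#include <stddef.h>
#include <sys/mman.h>

#define POOL_SIZE (1 << 20)   /* 1 MiB pool; size is arbitrary */

static unsigned char *pool;   /* big chunk obtained from the kernel */
static size_t used;           /* how much of it we've handed out */

void *toy_malloc(size_t n) {
    if (!pool) {              /* first call: ask the kernel for memory */
        pool = mmap(NULL, POOL_SIZE, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (pool == MAP_FAILED)
            return NULL;
    }
    n = (n + 15) & ~(size_t)15;   /* keep returned pointers aligned */
    if (used + n > POOL_SIZE)
        return NULL;              /* pool exhausted */
    void *p = pool + used;
    used += n;
    return p;
}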

Now, finally, if we look at what mmap does, we see all kinds of interesting stuff. First, it does some checks to see if the system has enough free RAM and/or swap for the mapping. Next, it finds some free address space to put the block in. Then it edits a data structure called the page table, and probably makes a bunch of inline assembly calls along the way. It may actually find some free pages of physical memory (i.e., actual bits in actual DRAM modules) as well, a process which may require forcing other memory out to swap. If it doesn't do that for the entire requested block, it'll instead set things up so that'll happen when said memory is first accessed. Much of this is accomplished with bits of inline assembly, writing to various magic addresses, etc. Note that it also uses large parts of the kernel, especially if swapping is required.

The inline assembly, writing to magic addresses, etc. is all outside the C specification. This isn't surprising; C runs across many different machine architectures, including a bunch that were barely imaginable in the early 1970s when C was invented. Hiding that machine-specific code is a core part of what a kernel (and to some extent the C library) is for.
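
For a taste of what such code looks like, here's a sketch of x86 port I/O via gcc inline assembly; it's roughly what the kernel's real outb() helper in its arch headers boils down to (toy_outb is my name for it). Nothing in it is describable in ISO C, and an ordinary user process calling it would simply fault:

/* Write one byte to an x86 I/O port. "a" pins value to the AL register,
   "Nd" lets the port be an immediate or the DX register. */
static inline void toy_outb(unsigned char value, unsigned short port) {
    __asm__ volatile ("outb %0, %1" : /* no outputs */
                      : "a"(value), "Nd"(port));
}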

Of course, if you go back to the example program, it becomes clear that printf must be similar. It's pretty clear how to do all the formatting, etc. in standard C; but actually getting it on the monitor? Or piped to another program? Once again, a lot of magic is done by the kernel (and possibly X11 or Wayland).
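
Sketching where that boundary sits (on a POSIX system; other platforms differ): once printf has formatted its bytes, the C library hands them to the kernel, typically through the write() syscall wrapper. Everything after this call is the kernel's problem:

#include <unistd.h>

int main(void) {
    /* the part standard C can't do by itself: hand bytes to the kernel */
    write(1, "5\n", 2);   /* fd 1 is stdout */
    return 0;
}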

If you think of other things the kernel does, a lot of them are outside C. For example, the kernel reads data from disks (C knows nothing of disks, PCIe buses, or SATA) into physical memory (C knows only of malloc, not of DIMMs, MMUs, etc.), makes it executable (C knows nothing of processor execute bits), and then calls it as a function (not only outside C, but very much disallowed by it).
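
You can sketch that last trick even in user space (x86-64 Linux only; the byte string below encodes "mov eax, 42; ret", and the cast from data to function pointer has no meaning in ISO C):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
    void *mem = mmap(NULL, sizeof(code), PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return 1;
    memcpy(mem, code, sizeof(code));      /* "load" the program */
    int (*f)(void) = (int (*)(void))mem;  /* not sanctioned by ISO C */
    printf("%d\n", f());                  /* prints 42 */
    return 0;
}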

The Relationship Between a Kernel and its Compiler(s)

Remember from before: if a program contains undefined behavior, so far as the C standard is concerned, all bets are off. But a kernel really has to contain undefined behavior. So there has to be some relationship between the kernel and its compiler, at least enough that the kernel developers can be confident the kernel will work despite violating the C standard. At least in the case of Linux, this includes the kernel having some knowledge of how GCC works internally.
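
A small example of that baked-in GCC knowledge: branch-prediction hints built on GCC's __builtin_expect, which mean nothing to the C standard but are well defined to gcc. The kernel's actual likely()/unlikely() definitions are essentially these (toy_check is just a made-up caller):

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

int toy_check(int err) {
    if (unlikely(err))   /* tell gcc the error path is the cold one */
        return -1;
    return 0;
}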

How likely is it to break?

Future GCC versions will probably break the kernel. I can say this pretty confidently, as it's happened several times before. Of course, things like the strict aliasing optimizations in GCC broke plenty of things besides the kernel, too.
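
Here's a sketch (mine) of the kind of code those strict-aliasing optimizations broke: reading an object through an incompatible pointer type. It looks reasonable, but it's undefined per the standard, so an optimizer may cache or reorder the accesses. Linux sidesteps the whole issue by building with -fno-strict-aliasing.

#include <stdio.h>

int main(void) {
    float f = 1.0f;
    /* type-punning through an int pointer violates strict aliasing */
    unsigned int *p = (unsigned int *)&f;
    printf("0x%x\n", *p);   /* bit pattern of 1.0f... if the compiler cooperates */
    return 0;
}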

Note also that the inlining the Linux kernel depends on is not automatic inlining; it's inlining that the kernel developers have manually specified. There are various people who have compiled the kernel with -O0 and report it basically works, after fixing a few minor problems. (One is even in the thread you linked to.) Mostly, it's that the kernel developers see no reason to compile with -O0, requiring optimization lets some tricks work as a side effect, and no one tests with -O0, so it's not supported.
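
For a flavor of that manual specification: the kernel's real definition, in its compiler headers, is essentially this plus extra attributes (twice is just a made-up example). Unlike a bare inline hint, gcc honors always_inline even when optimization is off.

#define __always_inline inline __attribute__((__always_inline__))

static __always_inline int twice(int x) {
    return 2 * x;   /* substituted into its callers, optimizer or not */
}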

As an example, this compiles and links with -O1 or higher, but not with -O0:

void f();   /* declared, but never defined anywhere */

int main() {
    int x = 0, *y;
    y = &x;

    if (*y)      /* always false, but only an optimizer proves it */
        f();
    return 0;
}

With optimization, gcc can figure out that f() will never be called, and omits it. Without optimization, gcc leaves the call in, and the linker fails because there isn't a definition of f(). The kernel developers rely on similar behavior to make the kernel code easier to read/write.


Comments

  • DanL4096 almost 2 years

    One can find several threads on the Internet such as this:

    http://www.gossamer-threads.com/lists/linux/kernel/972619

    where people complain they cannot build Linux with -O0, and are told that this is not supported; Linux relies on GCC optimizations to auto-inline functions, remove dead code, and otherwise do things that are necessary for the build to succeed.

    I've verified this myself for at least some of the 3.x kernels. The ones I've tried exit after a few seconds of build time if compiled with -O0.

    Is this generally considered acceptable coding practice? Are compiler optimizations, such as automatic inlining, predictable enough to rely on; at least when dealing with only one compiler? How likely is it that future versions of GCC might break builds of current Linux kernels with default optimizations (i.e. -O2 or -Os)?

    And on a more pedantic note: since 3.x kernels cannot compile without optimizations, should they be considered technically incorrect C code?

  • DanL4096 almost 10 years
    Are you sure the entire system must be compiled with the same optimization level? -Os kernels work fine with a -O2 userspace in my experience, and -Os turns off a lot of optimizations enabled in -O2.
  • eyoung100 almost 10 years
    @DanL4096 On a Gentoo system, at least, this isn't advised at a per-package level, to avoid the situation the OP is asking about. On a binary-only system like Debian, the -O level is determined by the OS maintainers and cannot be changed, AFAIK. -O0 is the baseline, but -O2 is the recommended setting because, as the wiki states, not all programs will compile with -O0.