GCC: why are constant variables not placed in .rodata?

Solution 1

The compiler has made it a common symbol, which can be merged with other compatible symbols and which can go in .bss (taking no space on disk) if it ends up with no explicitly initialized definition. Putting it in .rodata would be a trade-off: you'd save memory (commit charge) at runtime, but would use more space on disk (potentially a lot for a huge array).

If you'd rather it go in rodata, use the -fno-common option to GCC.
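To check which case you are in, it can help to put both forms of the declaration in one file and look at the generated assembly. The following is only a sketch: the file name `demo.c` and the identifiers `uninitialized_const`, `initialized_const` and `sum_of_constants` are made up for illustration, and the exact directives you see depend on the GCC version, target and flags.

```c
/*
 * Hypothetical file demo.c. To inspect where each object ends up:
 *
 *   gcc -O0 -S demo.c              (writes demo.s)
 *   gcc -O0 -S -fno-common demo.c  (for comparison)
 *
 * Then look in demo.s for a ".comm" line versus a ".rodata" section
 * around each symbol. The result varies by compiler version and target.
 */

/* No initializer: this is the form the question reports as ".comm". */
static const int uninitialized_const;

/* Explicit initializer: the form the question reports in .rodata. */
static const int initialized_const = 0;

/*
 * Reads both objects so they are not discarded as unused.
 * Both have static storage duration, so both read as zero.
 */
int sum_of_constants(void)
{
    return uninitialized_const + initialized_const;
}
```

Calling `sum_of_constants()` from your own `main` gives 0 either way; the only difference between the two objects is where the toolchain decides to store them.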

Solution 2

Why does GCC do it? I can't really answer that question without asking the developers themselves. If I'm allowed to speculate, I'd wager it has to do with optimization--compilers don't have to enforce const.

That said, I think it's better if we look at the language itself, particularly undefined behavior. There are a few mentions of undefined behavior, but none of them go in-depth.

Modifying a constant is undefined behavior. Const is a contract, and that is especially true in C (and C++).

"But what if I cast away the const (const_cast in C++) and modify a anyway?" Then you have undefined behavior.

What undefined behavior means is that the compiler is allowed to do quite literally anything it wants, and whatever the compiler decides to do will not be considered a violation of the ISO 9899 standard.

3.4.3

1 undefined behavior

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

2 NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

ISO/IEC 9899:1999, §3.4.3

What this means is that, because you have invoked undefined behavior, anything the compiler does is technically correct by way of not being incorrect. Ergo, it is correct for GCC to take...

static const int a = 0;

...and turn it into a .rodata symbol, while taking...

static const int a; // guaranteed to be zero

...and turning it into a .bss symbol.

In the former case, any attempt to modify a--even by proxy--will typically result in a segmentation violation, causing the kernel to force-kill the running program. In the latter case, the program will probably run without crashing.

That said, it is not reasonable to guess which one the compiler will do. Const is a contract, and it is up to you, the programmer, to uphold that contract by not modifying data that is supposed to be constant. Violating that contract means undefined behavior, and all the portability issues and program bugs that come with it.

So GCC can do any of several things.

It might write the symbol to .rodata, giving it protection under the OS kernel

It might write the object to somewhere where memory protection is not guaranteed, in which case...

It might change the value

It might change the value and immediately change it back

It might completely delete the offending code under the rationale that the value isn't changing (0 -> 0), essentially optimizing...

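static const int a;

int main(void){
    int *p = (int *)&a;  /* cast needed: &a has type const int * */
    *p = 0;              /* undefined behavior: a is const */
    return 0;
}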

...to...

int main(void){
    return 0;
}

It might even send a model T-800 back in time to terminate your parents before you're born.

All of these behaviors are legal (well, legal in the sense of adhering to the standard), so the bug report was not warranted.

Solution 3

Writing to an object that has been declared const-qualified is undefined behavior: anything can happen, even that.

There is no way in C to declare the object itself to be immutable; you only forbid modification through the particular access that you have to it. Here you have an int*, so modification is "allowed" in the sense that the compiler is not forced to issue a diagnostic. A cast in C means you are asserting that you know what you are doing.
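As a contrast, here is a small sketch of the case where casting away const is well defined: the pointer is const-qualified, but the object it points to was not declared const. The names `set_through_const_view` and `demo_modify_mutable` are hypothetical, chosen only for this illustration.

```c
/*
 * Hypothetical helper: receives a read-only view of an int and writes
 * through it anyway. This is only defined when the object behind
 * `view` was NOT itself declared const.
 */
static void set_through_const_view(const int *view, int value)
{
    *(int *)view = value;
}

/*
 * The object here is a plain (non-const) int, so the write above is
 * well defined. Passing the address of a `static const int` instead
 * would be undefined behavior, as in the question.
 */
int demo_modify_mutable(void)
{
    int mutable_value = 1;
    set_through_const_view(&mutable_value, 42);
    return mutable_value;
}
```

So the const on a pointer restricts one access path, while the const on the definition is what makes a write undefined.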

Solution 4

Are there any reasons that GCC can't place a const variable in .rodata?

Your program is optimized by the compiler (some optimizations are done even at -O0). One of them is constant propagation: http://en.wikipedia.org/wiki/Constant_folding

Try to deceive the compiler like this (note that this program is still technically undefined behavior):

#include <stdio.h>

static const int a;

int main(void)
{
    *(int *) &a = printf("");  // compiler cannot assume it is 0

    printf("%d\n", a);

    return 0;
}
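
The constant propagation mentioned above can also be seen in a program with no undefined behavior. This is a minimal sketch with made-up names (`scale_factor`, `scaled`): because the object is const and its value is known at compile time, the compiler may use the literal value directly instead of loading it from memory, which is why a later illegal write might never be observed.

```c
/* Known at compile time and declared const. */
static const int scale_factor = 8;

/*
 * A compiler is free to replace the read of scale_factor with the
 * literal 8 here. Whether it does depends on the compiler and the
 * optimization level.
 */
int scaled(int x)
{
    return x * scale_factor;
}
```

For example, `scaled(3)` evaluates to 24 whether or not the compiler ever reads `scale_factor` from memory.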

Author: starrify

Updated on May 15, 2020

Comments

  • starrify about 4 years

    I've always believed that GCC would place a static const variable in the .rodata segment (or in the .text segment as an optimization) of an ELF or similar file. But that does not seem to be the case.

    I'm currently using gcc (GCC) 4.7.0 20120505 (prerelease) on a laptop with GNU/Linux, and it places a static constant variable in the .bss segment:

    /*
     * this is a.c, and in its generated asm file a.s, the following line gives:
     *   .comm a,4,4 
     * which would place variable a in .bss but not .rodata(or .text)
     */
    static const int a;
    
    int main()
    {
        int *p = (int*)&a;
        *p = 0;  /* since a is in .bss, write access to that region */
                 /* won't trigger an exception */
        return 0;
    }
    

    So, is this a bug or a feature? I've decided to file this as a bug in Bugzilla, but it might be better to ask for help first.

    Are there any reasons that GCC can't place a const variable in .rodata?

    UPDATED:

    As tested, a constant variable with an explicit initializer (like const int a = 0;) is placed into .rodata by GCC, whereas I had left the variable uninitialized. This question might therefore be closed later, since I may not have asked the right question.

    Also, I originally wrote that the variable a is placed in the '.data' section, which is incorrect. It is actually placed in the .bss section because it is not initialized. The text above has been corrected.