Does printf() allocate memory in C?


Solution 1

Strictly speaking, the answer to the question in the title is that it depends on the implementation. Some implementations allocate memory, while others do not.

There are, however, other problems in your code, which I will elaborate on below.


Note: this was originally a series of comments I made on the question. I decided that it was too much for a comment, and moved them to this answer.


"When you check the output you will see that it will print some numbers as expected but the last ones are gibberish."

I believe that on systems with paged virtual memory, protection is enforced in units of whole pages, and allocators tend to "round up" allocations to a certain size. I.e. if you allocate X bytes, your program will indeed own those X bytes, but you'll also be able to (incorrectly) run past those X bytes for a while before the hardware detects an access to an unmapped page and the kernel sends a SIGSEGV.

This is most likely why your program isn't crashing in your particular configuration. Note that the 8 bytes you allocated will only cover two ints on systems where sizeof (int) is 4. The other 24 bytes needed for the other 6 ints do not belong to your array, so anything can write to that space. When you read from that space, you are going to get garbage, if your program doesn't crash first, that is.

The number 6 is important. Remember it for later!

"The magic part is that the resulting array will then have the correct numbers inside, the printf actually just prints each number another time. But this does change the array."

Note: The following is speculation, and I'm also assuming you're using glibc on a 64-bit system. I'm going to add this because I feel it might help you understand possible reasons why something might appear to work correctly, while actually being incorrect.

The reason it's "magically correct" most likely has to do with printf receiving those numbers through its variadic arguments (va_arg). printf is probably populating the memory area just past the array's allocated boundary (because vprintf may be allocating memory to perform the "itoa" operation needed to print i). In other words, those "correct" results are actually just garbage that "appears to be correct", but in reality, that's just what happens to be in RAM. If you try changing int to long while keeping the 8 byte allocation, your program will be more likely to crash because long is larger than int (8 bytes rather than 4 with glibc on a 64-bit system).

The glibc implementation of malloc has an optimization where it requests memory from the kernel in large chunks (a page or more at a time) whenever it runs out of heap. This makes it faster because, rather than asking the kernel for more memory on every allocation, it can just hand out available memory from the "pool" and grow the pool, or make another one, when the first one fills up.

That said, like the stack, malloc's heap pointers, coming from a memory pool, tend to be contiguous (or at least very close together). This means that memory obtained by printf's calls to malloc will likely sit just after the 8 bytes you allocated for your int array. However it works, the point is that no matter how "correct" the results may seem, they are just garbage. You're invoking undefined behavior, so there's no way of knowing what's going to happen, or whether the program will do something else under different circumstances, like crash or produce unexpected output.


So I tried running your program with and without the printf, and both times, the results were wrong.

# without printf
$ ./a.out 
0 1 2 3 4 5 1041 0 

For whatever reason, nothing interfered with the memory holding 2..5. However, something interfered with the memory holding 6 and 7. My guess is that this is allocator bookkeeping for a buffer that printf allocated right after the array: 1041 is 0x411, which looks like the size field glibc's malloc would keep for a 1040-byte chunk (a 1024-byte buffer plus 16 bytes of overhead) with its low "in use" flag bit set, and the 0 would be the upper half of that 64-bit field. Even if that's not the cause, something is writing to that address between the population and the printing of the array.

# with printf
$ ./a.out
*** Error in `./a.out': free(): invalid next size (fast): 0x0000000000be4010 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f9e5a720725]
/lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f9e5a728f4a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f9e5a72cabc]
./a.out[0x400679]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f9e5a6c9830]
./a.out[0x4004e9]
======= Memory map: ========
00400000-00401000 r-xp 00000000 08:02 1573060                            /tmp/a.out
00600000-00601000 r--p 00000000 08:02 1573060                            /tmp/a.out
00601000-00602000 rw-p 00001000 08:02 1573060                            /tmp/a.out
00be4000-00c05000 rw-p 00000000 00:00 0                                  [heap]
7f9e54000000-7f9e54021000 rw-p 00000000 00:00 0 
7f9e54021000-7f9e58000000 ---p 00000000 00:00 0 
7f9e5a493000-7f9e5a4a9000 r-xp 00000000 08:02 7995396                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7f9e5a4a9000-7f9e5a6a8000 ---p 00016000 08:02 7995396                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7f9e5a6a8000-7f9e5a6a9000 rw-p 00015000 08:02 7995396                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7f9e5a6a9000-7f9e5a869000 r-xp 00000000 08:02 7999934                    /lib/x86_64-linux-gnu/libc-2.23.so
7f9e5a869000-7f9e5aa68000 ---p 001c0000 08:02 7999934                    /lib/x86_64-linux-gnu/libc-2.23.so
7f9e5aa68000-7f9e5aa6c000 r--p 001bf000 08:02 7999934                    /lib/x86_64-linux-gnu/libc-2.23.so
7f9e5aa6c000-7f9e5aa6e000 rw-p 001c3000 08:02 7999934                    /lib/x86_64-linux-gnu/libc-2.23.so
7f9e5aa6e000-7f9e5aa72000 rw-p 00000000 00:00 0 
7f9e5aa72000-7f9e5aa98000 r-xp 00000000 08:02 7999123                    /lib/x86_64-linux-gnu/ld-2.23.so
7f9e5ac5e000-7f9e5ac61000 rw-p 00000000 00:00 0 
7f9e5ac94000-7f9e5ac97000 rw-p 00000000 00:00 0 
7f9e5ac97000-7f9e5ac98000 r--p 00025000 08:02 7999123                    /lib/x86_64-linux-gnu/ld-2.23.so
7f9e5ac98000-7f9e5ac99000 rw-p 00026000 08:02 7999123                    /lib/x86_64-linux-gnu/ld-2.23.so
7f9e5ac99000-7f9e5ac9a000 rw-p 00000000 00:00 0 
7ffc30384000-7ffc303a5000 rw-p 00000000 00:00 0                          [stack]
7ffc303c9000-7ffc303cb000 r--p 00000000 00:00 0                          [vvar]
7ffc303cb000-7ffc303cd000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
012345670 1 2 3 4 5 6 7 Aborted

This is the interesting part. You didn't mention in your question whether your program crashed. But when I ran it, it crashed. Hard.

It's also a good idea to check with valgrind, if you have it available. Valgrind is a helpful program that reports how you're using your memory. Here is valgrind's output:

$ valgrind ./a.out
==5991== Memcheck, a memory error detector
==5991== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==5991== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==5991== Command: ./a.out
==5991== 
==5991== Invalid write of size 4
==5991==    at 0x4005F2: make_array (in /tmp/a.out)
==5991==    by 0x40061A: main (in /tmp/a.out)
==5991==  Address 0x5203048 is 0 bytes after a block of size 8 alloc'd
==5991==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5991==    by 0x4005CD: make_array (in /tmp/a.out)
==5991==    by 0x40061A: main (in /tmp/a.out)
==5991== 
==5991== Invalid read of size 4
==5991==    at 0x40063C: main (in /tmp/a.out)
==5991==  Address 0x5203048 is 0 bytes after a block of size 8 alloc'd
==5991==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5991==    by 0x4005CD: make_array (in /tmp/a.out)
==5991==    by 0x40061A: main (in /tmp/a.out)
==5991== 
0 1 2 3 4 5 6 7 ==5991== 
==5991== HEAP SUMMARY:
==5991==     in use at exit: 0 bytes in 0 blocks
==5991==   total heap usage: 2 allocs, 2 frees, 1,032 bytes allocated
==5991== 
==5991== All heap blocks were freed -- no leaks are possible
==5991== 
==5991== For counts of detected and suppressed errors, rerun with: -v
==5991== ERROR SUMMARY: 12 errors from 2 contexts (suppressed: 0 from 0)

As you can see, valgrind reports an invalid write of size 4 and an invalid read of size 4 (4 bytes is the size of an int on my system). It also says that the offending address is "0 bytes after a block of size 8", i.e. it lies immediately past the end of the 8-byte block that you malloc'd. This tells you that you're going past the array and into garbage land. Another thing you might notice is that it generated 12 errors from 2 contexts. Specifically, that's 6 errors in a writing context and 6 errors in a reading context: one for each of the 6 ints that don't fit in the allocation I mentioned earlier.

Here's the corrected code:

#include <stdio.h>
#include <stdlib.h>

int *make_array(size_t n) {
    int *result = malloc(n * sizeof (int)); // Notice the sizeof (int)

    for (size_t i = 0; i < n; ++i) // size_t matches the type of n
        result[i] = (int)i;

    return result;
}

int main() {
    int *result = make_array(8);

    for (int i = 0; i < 8; ++i)
        printf("%d ", result[i]);

    free(result);
    return 0;
}

And here's valgrind's output:

$ valgrind ./a.out
==9931== Memcheck, a memory error detector
==9931== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9931== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9931== Command: ./a.out
==9931== 
0 1 2 3 4 5 6 7 ==9931== 
==9931== HEAP SUMMARY:
==9931==     in use at exit: 0 bytes in 0 blocks
==9931==   total heap usage: 2 allocs, 2 frees, 1,056 bytes allocated
==9931== 
==9931== All heap blocks were freed -- no leaks are possible
==9931== 
==9931== For counts of detected and suppressed errors, rerun with: -v
==9931== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Notice that it reports no errors and that the results are correct.

Solution 2

Whether printf() allocates any memory in the course of performing its work is unspecified. It would not be surprising if any given implementation did so, but there is no reason to assume that it does. Moreover, if one implementation does, that says nothing about whether a different implementation does.

That you see different behavior when the printf() is inside the loop tells you nothing. The program exhibits undefined behavior by overrunning the bounds of an allocated object. Once it does that, all subsequent behavior is undefined. You cannot reason about undefined behavior, at least not in terms of C semantics. The program has no C semantics once undefined behavior commences. That's what "undefined" means.

Author: AdHominem

Coffee addicted software engineer in the field of IT security.

Updated on June 04, 2022

Comments

  • AdHominem
    AdHominem almost 2 years

    This simple method just creates an array of dynamic size n and initializes it with values 0 ... n-1. It contains a mistake: malloc() allocates just n instead of sizeof(int) * n bytes:

    int *make_array(size_t n) {
        int *result = malloc(n);
    
        for (int i = 0; i < n; ++i) {
            //printf("%d", i);
            result[i] = i;
        }
    
        return result;
    }
    
    int main() {
        int *result = make_array(8);
    
        for (int i = 0; i < 8; ++i) {
            printf("%d ", result[i]);
        }
    
        free(result);
    }
    

    When you check the output you will see that it will print some numbers as expected but the last ones are gibberish. However, once I inserted the printf() inside the loop, the output was strangely correct, even though the allocation was still wrong! Is there some kind of memory allocation associated with printf()?

    • Jonathan Leffler
      Jonathan Leffler over 7 years
      Often, printf() — or many of the other <stdio.h> functions — will allocate a buffer associated with a FILE * when the buffer is first needed rather than when the file stream is created. So, the succinct answer to the headline question is "Yes".
    • Jongware
      Jongware over 7 years
      I'd guess that invoking the Demons of Undefined Behavior in the first place, you should not be surprised to get yet even more undefined behavior later on.
    • AnT stands with Russia
      AnT stands with Russia over 7 years
      "once I inserted the printf() inside the loop...". Where exactly did you insert the extra printf?
    • hetepeperfan
      hetepeperfan over 7 years
      malloc(8) returns memory for 8 bytes or returns NULL. You try to store 8 integers in there, which take (system dependent) 4 bytes each. Therefore C won't guarantee what happens to the last 6 ints, hence the undefined behavior.
    • Fantastic Mr Fox
      Fantastic Mr Fox over 7 years
      The second printf you mention //printf("%d", i); You are just printing i, not the buffer, so this will work as expected.
    • AdHominem
      AdHominem over 7 years
      @AnT I'm referring to the printf() which is commented out in the code.
    • AnT stands with Russia
      AnT stands with Russia over 7 years
      @AdHominem: If you are referring to that printf, then why are you surprised that that printf prints everything correctly? That printf just prints i directly. It is completely independent of any memory allocation.
    • AdHominem
      AdHominem over 7 years
      The magic part is that the resulting array will then have the correct numbers inside, the printf actually just prints each number another time. But this does change the array
    • Braden Best
      Braden Best over 7 years
      @AdHominem There's nothing magic about undefined behavior. See my answer where I attempt to explain it.
  • Fantastic Mr Fox
    Fantastic Mr Fox over 7 years
    "However, once I inserted the printf() inside the loop, the output was strangely correct". You should mention that the printf in the loop is just printing i, which is well defined behaviour.
  • 12431234123412341234123
    12431234123412341234123 over 7 years
    An int can be only one byte when CHAR_BIT is at least 16.
  • Braden Best
    Braden Best over 7 years
    @12431234123412341234123 Fact check: int is guaranteed by the standard to be 16 bits (2 bytes) at the minimum. int cannot be one byte. If it is, the compiler allowing it isn't standards-compliant and shouldn't be considered a C compiler.
  • 12431234123412341234123
    12431234123412341234123 over 7 years
    @Braden Best : int can be one byte. Neither ANSI C, C99, nor C11 forbids an int from being only one byte (as I already wrote). CHAR_BIT can be 16, and in that case a byte is 16 bits long and an int needs only a single byte.
  • Braden Best
    Braden Best over 7 years
    @12431234123412341234123 then you need to go read the standard again, because one byte cannot hold the guaranteed range of -32767..+32767.
  • 12431234123412341234123
    12431234123412341234123 over 7 years
    @BradenBest : A byte must have at least 8 bits. If an implementation has 8 bits per byte, then a valid implementation needs at least 2 bytes for an int. But a valid implementation can have more bits per byte, and if an implementation has at least 16 bits per byte, it may or may not have a single-byte int.
  • Braden Best
    Braden Best over 7 years
    @12431234123412341234123 there you have it. Two bytes is the minimum size for int.
  • 12431234123412341234123
    12431234123412341234123 over 7 years
    @BradenBest : (Sorry, I pressed enter unintentionally before I finished the comment.) This question is already answered: (stackoverflow.com/questions/1738568/…). No, 16 bits is the minimum, not 2 bytes. A byte can have 16 bits. It is also valid for a byte to have more than 63 bits and for all integer types to have a size of 1 byte.
  • Braden Best
    Braden Best over 7 years
    @12431234123412341234123 You are burying things in semantics. The whole world agrees that a byte is 8 bits. And it's been this way for decades. There is no rational reason to assume that any given computer will use anything else. Because nobody in their right mind would develop for such a brain-dead model.
  • dbush
    dbush over 7 years
    @BradenBest A byte does not necessarily have exactly 8 bits. Such models do exist. See stackoverflow.com/questions/6971886/…
  • Braden Best
    Braden Best over 7 years
    It would break things like sizeof, which returns a size_t value for the size in bytes. If a byte is 16 bits, then what is sizeof(char)? You can't say 1 because that would be inaccurate. Char is 8 bits. You can't say 0.5 because integer types don't work that way, and you can't say 0 because that would break existing code where sizeof (char) is used.
  • dbush
    dbush over 7 years
    @BradenBest By definition, sizeof(char) always returns 1. It doesn't matter if CHAR_BITS is 8, 9, 16, or something else. The standards account for architectures like this.
  • Braden Best
    Braden Best over 7 years
    @dbush but char is 1 byte. So while it wouldn't break code, it would waste a lot of memory when allocating for strings since ascii takes up only 7-8 bits. And I can't imagine how this system would deal with, say, a 250GB storage drive. Whoops, now it's 125GB.
  • 12431234123412341234123
    12431234123412341234123 over 7 years
    @BradenBest: C says nothing about maximum sizes; whether it makes sense or not is off topic here. @dbush I think you mean CHAR_BIT, not CHAR_BITS.
  • dbush
    dbush over 7 years
    @BradenBest A byte is defined as the minimum addressable piece of memory. On some architectures, it could be 9 bits, or 16 bits. If that's the way the hardware is set up, you can't really help "wasting" in that sense.
  • dbush
    dbush over 7 years
    @12431234123412341234123 Yes, my mistake, I meant CHAR_BIT.
  • Braden Best
    Braden Best over 7 years
    @dbush and we've been using 8 bit bytes for over 30 years. It's completely reasonable to assume that every single digital device in your house uses 8 bit bytes. It's not reasonable to assume that a hardware vendor would dare break this convention outside of some weird experimental in house supercomputer
  • dbush
    dbush over 7 years
    @BradenBest The post I linked above references a bluetooth device with a 16 bit byte, so it's not just legacy hardware. The point here is that architectures exist, however uncommon, that don't have 8 bit bytes. The standard must account for those architectures.
  • Braden Best
    Braden Best over 7 years
    @dbush doesn't it seem a little broken to you if potentially sizeof (int) == sizeof (char) can be true?
  • dbush
    dbush over 7 years
    @BradenBest It would be unusual yes but nothing in the standard prevents it.
  • Braden Best
    Braden Best over 7 years
    @dbush this has been an interesting conversation. I think numbers should have addressed the size of a byte before arguing about how many bytes are in an int. It would have prevented a lot of unnecessary frustration. Also, you should probably edit your answer since you said that an int is at least 2 bytes in it