When and why will a compiler initialise memory to 0xCD, 0xDD, etc. on malloc/free/new/delete?

Solution 1

A quick summary of what Microsoft's compilers use for various bits of unowned/uninitialized memory when compiled for debug mode (support may vary by compiler version):

Value     Name           Description 
------   --------        -------------------------
0xCD     Clean Memory    Allocated memory via malloc or new but never 
                         written by the application. 

0xDD     Dead Memory     Memory that has been released with delete or free. 
                         It is used to detect writing through dangling pointers. 

0xED or  Aligned Fence   'No man's land' for aligned allocations. Using a 
0xBD                     different value here than 0xFD allows the runtime
                         to detect not only writing outside the allocation,
                         but to also identify mixing alignment-specific
                         allocation/deallocation routines with the regular
                         ones.

0xFD     Fence Memory    Also known as "no man's land." This is used to wrap 
                         the allocated memory (surrounding it with a fence) 
                         and is used to detect indexing arrays out of 
                         bounds or other accesses (especially writes) past
                         the end (or start) of an allocated block.

0xFD or  Buffer slack    Used to fill slack space in some memory buffers 
0xFE                     (unused parts of `std::string` or the user buffer 
                         passed to `fread()`). 0xFD is used in VS 2005 (maybe 
                         some prior versions, too), 0xFE is used in VS 2008 
                         and later.

0xCC                     When the code is compiled with the /GZ option
                         (deprecated in later versions in favour of /RTC),
                         uninitialized stack variables are automatically
                         filled with this value (at byte level). 


// the following magic values are done by the OS, not the C runtime:

0xAB  (Allocated Block?) Memory allocated by LocalAlloc(). 

0xBAADF00D Bad Food      Memory allocated by LocalAlloc() with LMEM_FIXED, but 
                         not yet written to. 

0xFEEEFEEE               Fill used by the OS for heap memory that has been
                         marked for use but not allocated by HeapAlloc() or
                         LocalAlloc(), or that has just been freed by HeapFree(). 

Disclaimer: the table is from some notes I have lying around - they may not be 100% correct (or coherent).

Many of these values are defined in vc/crt/src/dbgheap.c:

/*
 * The following values are non-zero, constant, odd, large, and atypical
 *      Non-zero values help find bugs assuming zero filled data.
 *      Constant values are good, so that memory filling is deterministic
 *          (to help make bugs reproducible).  Of course, it is bad if
 *          the constant filling of weird values masks a bug.
 *      Mathematically odd numbers are good for finding bugs assuming a cleared
 *          lower bit.
 *      Large numbers (byte values at least) are less typical and are good
 *          at finding bad addresses.
 *      Atypical values (i.e. not too often) are good since they typically
 *          cause early detection in code.
 *      For the case of no man's land and free blocks, if you store to any
 *          of these locations, the memory integrity checker will detect it.
 *
 *      _bAlignLandFill has been changed from 0xBD to 0xED, to ensure that
 *      4 bytes of that (0xEDEDEDED) would give an inaccessible address under 3gb.
 */

static unsigned char _bNoMansLandFill = 0xFD;   /* fill no-man's land with this */
static unsigned char _bAlignLandFill  = 0xED;   /* fill no-man's land for aligned routines */
static unsigned char _bDeadLandFill   = 0xDD;   /* fill free objects with this */
static unsigned char _bCleanLandFill  = 0xCD;   /* fill new objects with this */
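
As a rough illustration of how the table and the constants above get used in practice, here is a minimal sketch of a lookup that maps a 32-bit value seen in a debugger to the fill pattern it most likely came from. The function name `describe_fill` and the returned strings are made up for this example; only the byte values come from the notes above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: classify a 32-bit value by the fill byte it repeats.
// The byte values are the ones listed in the table; nothing here is part of
// the CRT or the Windows API.
std::string describe_fill(std::uint32_t value) {
    switch (value) {
        case 0xCDCDCDCDu: return "clean memory (allocated, never written)";
        case 0xDDDDDDDDu: return "dead memory (freed)";
        case 0xFDFDFDFDu: return "no man's land (fence around a block)";
        case 0xEDEDEDEDu:
        case 0xBDBDBDBDu: return "aligned no man's land";
        case 0xCCCCCCCCu: return "uninitialized stack memory";
        case 0xBAADF00Du: return "OS heap: allocated, not yet written";
        case 0xFEEEFEEEu: return "OS heap: freed";
        default:          return "not a known fill pattern";
    }
}
```

For example, a pointer member that shows up as 0xDDDDDDDD in the watch window would be reported as "dead memory (freed)", which suggests the owning object has already been deleted.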

There are also a few times where the debug runtime will fill buffers (or parts of buffers) with a known value, for example, the 'slack' space in std::string's allocation or the buffer passed to fread(). Those cases use a value given the name _SECURECRT_FILL_BUFFER_PATTERN (defined in crtdefs.h). I'm not sure exactly when it was introduced, but it was in the debug runtime by at least VS 2005 (VC++8).

Initially, the value used to fill these buffers was 0xFD - the same value used for no man's land. However, in VS 2008 (VC++9) the value was changed to 0xFE. I assume that's because there could be situations where the fill operation would run past the end of the buffer, for example, if the caller passed in a buffer size that was too large to fread(). In that case, the value 0xFD might not trigger detecting this overrun since if the buffer size were too large by just one, the fill value would be the same as the no man's land value used to initialize that canary. No change in no man's land means the overrun wouldn't be noticed.

So the fill value was changed in VS 2008 so that such a case would change the no man's land canary, resulting in the detection of the problem by the runtime.
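
The reasoning above can be checked with a small simulation. This is only a sketch under stated assumptions: the block layout (user bytes followed by a 4-byte fence) is simplified, and `overrun_detected` is a name invented for the example, not anything in the CRT.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr unsigned char kNoMansLand = 0xFD;  // fence value from dbgheap.c
constexpr std::size_t kFenceSize = 4;        // assumed fence width for this sketch

// Simulates a debug-heap block of `size` user bytes followed by a fence, then
// a library routine that fills the buffer with `fill` but was told the buffer
// is `claimed_size` bytes long. Returns true if the fence bytes changed,
// i.e. if a later integrity check could notice the overrun.
bool overrun_detected(std::size_t size, std::size_t claimed_size, unsigned char fill) {
    std::vector<unsigned char> block(size + kFenceSize, 0);
    std::memset(block.data() + size, kNoMansLand, kFenceSize);

    // Clamp so the simulation itself never writes outside `block`.
    const std::size_t n = claimed_size < block.size() ? claimed_size : block.size();
    std::memset(block.data(), fill, n);

    for (std::size_t i = size; i < block.size(); ++i) {
        if (block[i] != kNoMansLand) return true;
    }
    return false;
}
```

With a 16-byte buffer and a claimed size of 17, `overrun_detected(16, 17, 0xFD)` returns false because the stray byte is indistinguishable from the fence, while `overrun_detected(16, 17, 0xFE)` returns true.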

As others have noted, one of the key properties of these values is that if a pointer variable with one of these values is de-referenced, it will result in an access violation, since on a standard 32-bit Windows configuration, user mode addresses will not go higher than 0x7fffffff.

Solution 2

One nice property about the fill value 0xCCCCCCCC is that in x86 assembly, the opcode 0xCC is the int3 opcode, which is the software breakpoint interrupt. So, if you ever try to execute code in uninitialized memory that's been filled with that fill value, you'll immediately hit a breakpoint, and the operating system will let you attach a debugger (or kill the process).

Solution 3

It's compiler and OS specific. Visual Studio sets different kinds of memory to different values so that in the debugger you can easily see whether you have overrun into malloced memory, a fixed array or an uninitialised object.

https://docs.microsoft.com/en-gb/visualstudio/debugger/crt-debug-heap-details?view=vs-2022

Solution 4

It's not the OS - it's the compiler. You can modify the behaviour too - see down the bottom of this post.

Microsoft Visual Studio generates (in Debug mode) a binary that pre-fills stack memory with 0xCC. It also inserts a space between every stack frame in order to detect buffer overflows. A very simple example of where this is useful is here (in practice Visual Studio would spot this problem and issue a warning):

...
   bool error; // uninitialised value
   if(something)
   {
      error = true;
   }
   return error;

If Visual Studio didn't preinitialise variables to a known value, then this bug could potentially be hard to find. With preinitialised variables (or rather, preinitialised stack memory), the problem is reproducible on every run.

However, there is a slight problem. The value Visual Studio uses reads as TRUE, as would anything except 0. It is actually quite likely that when you run your code in Release mode, uninitialised variables will be allocated to a piece of stack memory that happens to contain 0, which means you can have an uninitialised variable bug which only manifests itself in Release mode.
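
To make the Debug/Release difference concrete, here is a minimal sketch. It deliberately does not read a genuinely uninitialised variable (that would be undefined behaviour); instead it models the raw stack byte as a parameter. The names `reads_as_true` and `check_fixed` are invented for this example.

```cpp
#include <cassert>

constexpr unsigned char kDebugStackFill = 0xCC;  // Debug-mode stack pre-fill
constexpr unsigned char kLeftoverZero   = 0x00;  // what Release stack memory may happen to hold

// Models how a condition such as `if (error)` treats whatever byte happens to
// occupy the flag's stack slot: any non-zero byte counts as true.
bool reads_as_true(unsigned char raw_stack_byte) {
    return raw_stack_byte != 0;
}

// The fix for the snippet above: give the flag a defined value up front.
bool check_fixed(bool something) {
    bool error = false;
    if (something) {
        error = true;
    }
    return error;
}
```

Here `reads_as_true(kDebugStackFill)` is true while `reads_as_true(kLeftoverZero)` is false, which is why the buggy function can appear to return true consistently in Debug builds and false in Release builds.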

That annoyed me, so I wrote a script to modify the pre-fill value by directly editing the binary, allowing me to find uninitialised variable problems that only show up when the stack contains a zero. This script only modifies the stack pre-fill; I never experimented with the heap pre-fill, though it should be possible. Might involve editing the run-time DLL, might not.

Solution 5

Is this specific to the compiler used?

Actually, it's almost always a feature of the runtime library (like the C runtime library). The runtime is usually strongly correlated with the compiler, but there are some combinations you can swap.

I believe on Windows, the debug heap (HeapAlloc, etc.) also uses special fill patterns which are different than the ones that come from the malloc and free implementations in the debug C runtime library. So it may also be an OS feature, but most of the time, it's just the language runtime library.

Do malloc/new and free/delete work in the same way with regard to this?

The memory management portions of new and delete are usually implemented with malloc and free, so memory allocated with new and released with delete usually behaves the same way.

Is it platform specific?

The details are runtime specific. The actual values are often chosen not only to look unusual and obvious in a hex dump, but also to have properties that take advantage of processor features. For example, odd values are often used because they can cause an alignment fault when used as addresses. Large values are used (as opposed to 0) because they cause surprising delays if you loop up to an uninitialized counter. On x86, 0xCC is the int 3 instruction, so if you execute uninitialized memory, it will trap.
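
These properties are easy to verify for a given fill byte. The sketch below uses helper names (`repeat_byte`, `is_odd`, `high_bit_set`) that are invented for the example and checks the 0xCD clean-memory fill: repeated across 32 bits it is odd, has the top bit set (so on a standard 32-bit Windows configuration it is not a valid user-mode address), and is a very large loop count.

```cpp
#include <cassert>
#include <cstdint>

// Repeat one fill byte across a 32-bit word, e.g. 0xCD -> 0xCDCDCDCD.
constexpr std::uint32_t repeat_byte(unsigned char b) {
    return 0x01010101u * b;
}

// Odd values are misaligned for any type wider than one byte.
constexpr bool is_odd(std::uint32_t v) {
    return (v & 1u) != 0;
}

// Top bit set means the value is above 0x7fffffff.
constexpr bool high_bit_set(std::uint32_t v) {
    return (v & 0x80000000u) != 0;
}

static_assert(repeat_byte(0xCD) == 0xCDCDCDCDu, "byte repeated four times");
static_assert(is_odd(repeat_byte(0xCD)), "0xCDCDCDCD is odd");
static_assert(high_bit_set(repeat_byte(0xCD)), "0xCDCDCDCD is above 0x7fffffff");
static_assert(repeat_byte(0xCD) == 3452816845u, "about 3.4 billion iterations as a counter");
```

Note that 0xCC is even, so not every fill value has every property; the four dbgheap.c constants (0xFD, 0xED, 0xDD, 0xCD) are all odd.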

Will it occur on other operating systems, such as Linux or VxWorks?

It mostly depends on the runtime library you use.

Can you give any practical examples as to how this initialisation is useful?

I listed some above. The values are generally chosen to increase the chances that something unusual happens if you do something with invalid portions of memory: long delays, traps, alignment faults, etc. Heap managers also sometimes use special fill values for the gaps between allocations. If those patterns ever change, it knows there was a bad write (like a buffer overrun) somewhere.
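
As a sketch of the "gaps between allocations" idea, the class below wraps a buffer with fence bytes on both sides and checks later whether they still hold the pattern. It is a toy model and not how any particular heap manager is implemented: `GuardedBlock`, the 4-byte fence width, and the choice of 0xFD and 0xCD as fill bytes are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kFence = 4;              // assumed fence width
constexpr unsigned char kFenceFill = 0xFD;     // pattern placed around the block
constexpr unsigned char kCleanFill = 0xCD;     // pattern for fresh user bytes

// Toy model of a guarded allocation: [fence][user bytes][fence].
class GuardedBlock {
public:
    explicit GuardedBlock(std::size_t size)
        : size_(size), bytes_(size + 2 * kFence, kCleanFill) {
        std::memset(bytes_.data(), kFenceFill, kFence);
        std::memset(bytes_.data() + kFence + size_, kFenceFill, kFence);
    }

    // Pointer to the user-visible part of the block.
    unsigned char* data() { return bytes_.data() + kFence; }
    std::size_t size() const { return size_; }

    // True while both fences still hold the fill pattern.
    bool fences_intact() const {
        for (std::size_t i = 0; i < kFence; ++i) {
            if (bytes_[i] != kFenceFill) return false;
            if (bytes_[kFence + size_ + i] != kFenceFill) return false;
        }
        return true;
    }

private:
    std::size_t size_;
    std::vector<unsigned char> bytes_;
};
```

Writing one byte past the end, for instance `block.data()[block.size()] = 0;`, lands in the trailing fence, so the next call to `fences_intact()` returns false and the bad write is noticed even though nothing crashed at the time.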

I remember reading something (maybe in Code Complete 2) that it is good to initialise memory to a known pattern when allocating it, and certain patterns will trigger interrupts in Win32 which will result in exceptions showing in the debugger.

How portable is this?

Writing Solid Code (and maybe Code Complete) talks about things to consider when choosing fill patterns. I've mentioned some of them here, and the Wikipedia article on Magic Number (programming) also summarizes them. Some of the tricks depend on the specifics of the processor you're using (like whether it requires aligned reads and writes and what values map to instructions that will trap). Other tricks, like using large values and unusual values that stand out in a memory dump are more portable.

Author by

mahendra rajeshirke

Updated on November 12, 2021

Comments

  • strager
    strager over 15 years
    My guess is that it's used to check if you forget to terminate your strings properly too (since those 0xCD's or 0xDD's are printed).
  • strager
    strager over 15 years
    Doesn't VS issue a warning when using a value before it is initialized, like GCC?
  • FryGuy
    FryGuy over 15 years
    0xCC = uninitialized local (stack) variable 0xCD = uninitialized class (heap?) variable 0xDD = deleted variable
  • Michael Burr
    Michael Burr over 15 years
    Oh yeah - some of it is from the CRT source in DbgHeap.c.
  • sean e
    sean e almost 13 years
    Some of it is on MSDN (msdn.microsoft.com/en-us/library/bebs9zyz.aspx), but not all. Good list.
  • Tad Marshall
    Tad Marshall over 12 years
    And 0xCD is the int instruction, so executing 0xCD 0xCD will generate an int CD, which will also trap.
  • Simon Mourier
    Simon Mourier over 9 years
    @seane - FYI your link seems dead. The new one (text has been enhanced) is available here: msdn.microsoft.com/en-us/library/974tc9t1.aspx
  • Adrian McCarthy
    Adrian McCarthy over 7 years
    "It's not the OS - it's the compiler." Actually, it's not the compiler -- it's the runtime library.
  • MSalters
    MSalters about 7 years
    In today's world, Data Execution Prevention doesn't even allow the CPU to fetch an instruction from the heap. This answer is outdated since XP SP2.
  • Adam Rosenfield
    Adam Rosenfield about 7 years
    @MSalters: Yes, it's true that by default, newly allocated memory will be non-executable, but somebody could easily use VirtualProtect() or mprotect() to make the memory executable.
  • Miro
    Miro almost 7 years
    What is the name of these blocks? Is it memory barrier, membar, memory fence or fence instruction (en.wikipedia.org/wiki/Memory_barrier)?
  • Glenn Slayden
    Glenn Slayden over 6 years
    @FryGuy There's a practical reason which dictates (some of) these values, as I explain here.
  • Glenn Slayden
    Glenn Slayden over 6 years
    Your explanation doesn't follow, since you'd also get an access violation trying to read 0x00000000, which would be just as useful (or more, as a bad address). As I pointed out in another comment on this page, the real reason for 0xCD (and 0xCC) is that they are interpretable x86 opcodes which trigger a software interrupt, and this allows for graceful recovery into the debugger in just one single specific and rare type of error, namely, when the CPU mistakenly tries to execute bytes in a non-code region. Other than this functional use, fill values are just advisory hints, as you note.
  • Phil1970
    Phil1970 over 4 years
    When debugging, Visual Studio debugger will show the value of a bool if not 0 or 1 with something like true (204). So it is relatively easy to see that kind of bug if you trace code.
  • PhysicalEd
    PhysicalEd over 4 years
    This is a great summary! Here is another update - the /GZ flag has been deprecated, here is the latest doc on the replacement - /RTC docs.microsoft.com/en-us/cpp/build/reference/…
  • G.Vanem
    G.Vanem about 3 years
    @PhysicalEd And "/RTCc" is invalid (or tricky) in C++ and STL. Gotta love it!
  • AJM
    AJM over 2 years
    @PhysicalEd Many thanks for the link to the RTC documentation - after I couldn't find /GZ in the command line I was tearing my hair out trying to find the information!
  • AJM
    AJM over 2 years
    To anyone with enough rep to make a 1-character edit - there's now an https version of the URL in this post.