What is stored on heap and what is stored on stack?


Solution 1

Structure of a Program in Memory

The following is the basic structure of any program when loaded in the memory.

 +--------------------------+
 |                          |
 |      command line        |
 |        arguments         |
 |    (argc and argv[])     |
 |                          |
 +--------------------------+
 | Stack                    |
 | (grows-downwards)        |
 |                          |
 |                          |
 |                          |
 |         F R E E          |
 |        S P A C E         |
 |                          |
 |                          |
 |                          |
 |                          |
 |     (grows upwards) Heap |
 +--------------------------+
 |                          |
 |    Initialized data      |
 |         segment          |
 |                          |
 +--------------------------+
 |                          |
 |     Initialized to       |
 |        Zero (BSS)        |
 |                          |
 +--------------------------+
 |                          |
 |      Program Code        |
 |                          |
 +--------------------------+

A few points to note:

  • Data Segment
    • Initialized data segment (initialized to explicit initializers by programmers)
    • Uninitialized data segment (zero-initialized data segment, or BSS [Block Started by Symbol])
  • Code Segment
  • Stack and Heap areas

Data Segment

The data segment contains the global and static data that the programmer explicitly initializes, stored together with their initial values.

The other part of the data segment is called BSS (the name comes from "Block Started by Symbol", a directive in an old IBM assembler). It is the part of memory that the OS fills with zeros when the program is loaded. That is how uninitialized global and static data get their default value of zero. This area has a fixed size.

The data area is split into two parts, based on explicit initialization, because the variables that have initializers must be initialized one by one, each with its own value. The variables without initializers, however, need not be set to 0 one by one. Instead, the job of zeroing them is left to the OS. This bulk initialization can greatly reduce the time required to load the executable file.

The layout of the data segment is mostly under the control of the underlying OS, although some loaders give partial control to the user. This information may be useful in applications such as embedded systems.

This area can be addressed and accessed using pointers from the code. Auto variables carry an overhead: they are initialized each time they are created, and code is required to do that initialization. Variables in the data area have no such runtime overhead, because their initialization is done only once, at load time.

Code segment

The program code area holds the executable code of the program. This area also has a fixed size. It can be accessed only through function pointers, not through other data pointers. Another important point is that the system may treat this area as read-only memory, and any attempt to write to it leads to undefined behavior.

Constant strings may be placed either in code or data area and that depends on the implementation.

An attempt to write to the code area leads to undefined behavior. For example (I'm going to give only C-based examples), the following code may result in a runtime error or even a crash.

#include <stdio.h>
#include <string.h>

int main(void)
{
    static int i;
    // deliberately invalid: overwrites the machine code of main()
    strcpy((char *)main,"something");
    printf("%s",(char *)main);
    if(i++==0)
        main();
    return 0;
}

Stack and heap areas

For execution, the program uses two major areas, the stack and the heap. Stack frames for function calls are created on the stack, and the heap is used for dynamic memory allocation. The stack and heap are uninitialized areas. Therefore, whatever happens to be in that memory becomes the initial (garbage) value of the objects created there.

Let's look at a sample program to show which variables get stored where:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int initToZero1;
static float initToZero2;
FILE * initToZero3;
// all are stored in initialized to zero segment(BSS)

double initialized1 = 20.0;
// stored in initialized data segment

int main()
{
    size_t (*fp)(const char *) = strlen;
    // fp is an auto variable that is allocated in stack
    // but it points to code area where code of strlen() is stored

    char *dynamic = (char *)malloc(100);
    // dynamic memory allocation, done in heap

    int stringLength;
    // this is an auto variable that is allocated in stack

    static int initToZero4; 
    // stored in BSS

    static int initialized2 = 10; 
    // stored in initialized data segment   

    strcpy(dynamic,"something");
    // function call, uses stack

    stringLength = fp(dynamic);
    // again a function call

    free(dynamic);
    // releases the heap block
}

Or consider a still more complex example:

#include <stdio.h>

// command line arguments may be stored in a separate area
int main(int numOfArgs, char *arguments[])
{ 
    static int i;   
    // stored in BSS 

    int (*fp)(int,char **) = main;  
    // points to code segment 

    static char *str[] = {"thisFileName","arg1", "arg2",0};
    // stored in initialized data segment

    while(*arguments)
        printf("\n %s",*arguments++);

    if(!i++)
        fp(3,str);
}

Hope this helps!

Solution 2

In C/C++: Local variables are allocated on the current stack frame (belonging to the current function). If you declare an object as a local variable (rather than allocating it with new or malloc), the whole object is allocated on the stack, including all of its member variables. With recursion, each function call creates a new stack frame, and all of that call's local variables are allocated in it. The stack usually has a fixed size, and this value is usually written into the executable's binary header during compilation/linking. However, this is very OS and platform specific; some OSes may grow the stack dynamically when needed. Because the size of the stack is usually limited, you can run out of stack with deep recursion, or sometimes even without recursion when you declare large objects as local variables.

The heap is usually treated as an unlimited space (limited only by the available physical/virtual memory), and you can allocate objects on the heap using malloc/new (and other heap-allocating functions). When an object is created on the heap, all of its member variables are created within it. You should see an object as a contiguous area of memory (this area contains the member variables and, for C++ classes with virtual methods, a pointer to a virtual method table), no matter where it is allocated.

Literals, constants and other "fixed" stuff are usually compiled/linked into the binary as another segment, so they are not really in the code segment. Usually you can't allocate or free anything in this segment at runtime. However, this is also platform specific and might work differently on different platforms (for example, iOS Obj-C code has a lot of constant references inserted directly into the code segment, between functions).

Solution 3

In C and C++, at least, this is all implementation-specific. The standards do not mention "stack" or "heap".

Solution 4

Section 3.5 of the Java Virtual Machine Specification (Section 2.5 in the Java SE 7 and later editions) describes runtime data areas (stacks and the heap).

Neither the C nor C++ language standards specify whether something should be stored on a stack or a heap. They only define object lifetimes, visibility, and modifiability; it's up to the implementation to map those requirements to a particular platform's memory layout.

Typically, anything allocated with the *alloc functions resides on the heap, while auto variables and function parameters reside on a stack. String literals may live "somewhere else" (they must be allocated and visible over the lifetime of the program, but attempting to modify them is undefined); some platforms use a separate, read-only memory segment to store them.

Just remember that there are some truly oddball platforms out there that may not conform to the common stack-heap model.

Solution 5

In Java, local variables can be allocated on the stack (unless optimised away).

Primitives and references in an object are on the heap (as the object is on the heap).

A stack is pre-allocated when the thread is created. It doesn't use heap space. (However, creating a thread does result in a Thread Local Allocation Buffer being created, which decreases the free memory quite a bit.)

Unique string literals are added to the heap. Primitive literals may be in the code somewhere (if not optimised away). Whether a field is static or not makes no difference.

Author: Amogh Talpallikar

Updated on May 07, 2020

Comments

  • Amogh Talpallikar
    Amogh Talpallikar about 4 years

    Can anyone clearly explain, in terms of C, C++ and Java, what goes on the stack, what goes on the heap, and when the allocation is done?

    As far as I know:

    All local variables, whether primitives, pointers or reference variables, live in a new stack frame for each function call,

    and anything created with new or malloc goes on the heap.

    I am confused about a few things.

    Are references/primitives that are members of an object created on the heap also stored on the heap?

    And what about the locals of a method that are created recursively in each frame? Are they all on the stack? If yes, is that stack memory allocated at runtime? Also, are literals part of the code segment? And what about globals in C, static in C++/Java, and static in C?