Memory Allocation/Deallocation?

Solution 1

The Memory Model

The C++ standard has a memory model. It attempts to model the memory in a computer system in a generic way. The standard defines that a byte is a storage unit in the memory model and that memory is made up of bytes (§1.7):

The fundamental storage unit in the C++ memory model is the byte. [...] The memory available to a C++ program consists of one or more sequences of contiguous bytes.

The Object Model

The standard also provides an object model. This specifies that an object is a region of storage (so it is made up of bytes and resides in memory) (§1.8):

The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is a region of storage.

So there we go. Memory is where objects are stored. To store an object in memory, the required region of storage must be allocated.

Allocation and Deallocation Functions

The standard provides two implicitly declared global scope allocation functions:

void* operator new(std::size_t);
void* operator new[](std::size_t);

How these are implemented is not the standard's concern. All that matters is that they should return a pointer to some region of storage with the number of bytes corresponding to the argument passed (§3.7.4.1):

The allocation function attempts to allocate the requested amount of storage. If it is successful, it shall return the address of the start of a block of storage whose length in bytes shall be at least as large as the requested size. There are no constraints on the contents of the allocated storage on return from the allocation function.

It also defines two corresponding deallocation functions:

void operator delete(void*);
void operator delete[](void*);

These are defined to deallocate storage that has previously been allocated (§3.7.4.2):

If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage.
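As a rough sketch of what calling these functions directly looks like (the helper names `allocate_raw_ints`, `release_raw` and `round_trip` are made up for illustration, they are not part of the standard): the allocation function only hands back raw storage, so an object still has to be created in it, here with placement new.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: obtains raw storage for `count` ints straight from the
// global allocation function. No int objects are created or initialised here.
void* allocate_raw_ints(std::size_t count) {
    return ::operator new(count * sizeof(int)); // throws std::bad_alloc on failure
}

// Hypothetical helper: hands the storage back via the matching deallocation function.
void release_raw(void* storage) {
    ::operator delete(storage); // passing a null pointer is allowed and does nothing
}

// Creates an int in raw storage with placement new, reads it back, then
// deallocates the storage.
int round_trip(int value) {
    void* storage = allocate_raw_ints(1);
    int* p = new (storage) int(value);         // construct an object in the raw storage
    assert(static_cast<void*>(p) == storage);  // the object lives at the start of the block
    int result = *p;
    release_raw(storage);                      // int has a trivial destructor, so nothing else to do
    return result;
}
```

Calling `round_trip(42)` gives back 42. The point is only that allocation and initialisation are two separate steps here, which is exactly the bookkeeping a new-expression does for you.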

new and delete

Typically, you should not need to use the allocation and deallocation functions directly because they only give you uninitialised memory. Instead, in C++ you should be using new and delete to dynamically allocate objects. A new-expression obtains storage for the requested type by using one of the above allocation functions and then initialises that object in some way. For example new int() will allocate space for an int object and then initialise it to 0. See §5.3.4:

A new-expression obtains storage for the object by calling an allocation function (3.7.4.1).

[...]

A new-expression that creates an object of type T initializes that object [...]

In the opposite direction, delete will call the destructor of an object (if any) and then deallocate the storage (§5.3.5):

If the value of the operand of the delete-expression is not a null pointer value, the delete-expression will invoke the destructor (if any) for the object or the elements of the array being deleted.

[...]

If the value of the operand of the delete-expression is not a null pointer value, the delete-expression will call a deallocation function (3.7.4.2).
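A small sketch of that pairing, assuming a made-up type `Tracked` that counts its live instances (the type and both function names are hypothetical, purely for illustration):

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    int value;
    explicit Tracked(int v) : value(v) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the live-instance count observed while one dynamically allocated
// Tracked object exists.
int alive_during_lifetime() {
    Tracked* t = new Tracked(7); // obtain storage, then run the constructor
    assert(t->value == 7);
    int seen = Tracked::alive;
    delete t;                    // run the destructor, then deallocate the storage
    return seen;
}

// new int() value-initialises the int, so it reads back as 0.
int value_initialised_int() {
    int* p = new int();
    int v = *p;
    delete p;
    return v;
}
```

`alive_during_lifetime()` returns 1 and leaves `Tracked::alive` back at 0 afterwards, because the delete-expression ran the destructor before releasing the storage.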

Other Allocations

However, these are not the only ways that storage is allocated or deallocated. Many constructs of the language implicitly require allocation of storage. For example, giving an object definition, like int a;, also requires storage (§7):

A definition causes the appropriate amount of storage to be reserved and any appropriate initialization (8.5) to be done.

C standard library: malloc and free

In addition, the <cstdlib> header brings in the contents of the stdlib.h C standard library, which includes the malloc and free functions. They are also defined, by the C standard, to allocate and deallocate memory, much like the allocation and deallocation functions defined by the C++ standard. Here's the definition of malloc (C99 §7.20.3.3):

void *malloc(size_t size);
Description
The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.
Returns
The malloc function returns either a null pointer or a pointer to the allocated space.

And the definition of free (C99 §7.20.3.2):

void free(void *ptr);
Description
The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does not match a pointer earlier returned by the calloc, malloc, or realloc function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.

However, there's never a good excuse to be using malloc and free in C++. As described before, C++ has its own alternatives.
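Beyond plain new and delete, the standard library also has types that do the allocation and deallocation for you. A minimal sketch, assuming `std::vector` and `std::unique_ptr` as the illustration (my choice, not something the question asked about; the function names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <vector>

// Sums 1..n using a vector sized at run time. The vector allocates its
// storage when it is constructed and deallocates it when it is destroyed.
long sum_first(std::size_t n) {
    std::vector<long> values(n);
    std::iota(values.begin(), values.end(), 1L); // fill with 1, 2, ..., n
    return std::accumulate(values.begin(), values.end(), 0L);
}

// Owns a single dynamically allocated int. The storage is released
// automatically when `owner` goes out of scope.
int doubled(int x) {
    std::unique_ptr<int> owner(new int(x));
    assert(owner != nullptr);
    return *owner * 2;
}
```

Neither function contains a delete or a free, so there is no way to forget one.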


Answers to Questions

So to answer your questions directly:

  1. Where is the "memory" that is being allocated?

    The C++ standard doesn't care. It simply says that the program has some memory which is made up of bytes. This memory can be allocated.

  2. What is this "memory"? Space in an array? Or something else?

    As far as the standard is concerned, the memory is just a sequence of bytes. This is purposefully very generic, as the standard only tries to model typical computer systems. You can, for the most part, think of it as a model of the RAM of your computer.

  3. What happens exactly when this "memory" gets allocated?

    Allocating memory makes some region of storage available for use by the program. Objects are initialized in allocated memory. All you need to know is that you can allocate memory. The actual allocation of physical memory to your process tends to be done by the operating system.

  4. What happens exactly when the memory gets deallocated?

    Deallocating some previously allocated memory causes that memory to be unavailable to the program. It becomes deallocated storage.

  5. It would also really help me if someone could answer what malloc does in these C++ lines:

    char* x; 
    x = (char*) malloc (8);
    

    Here, malloc is simply allocating 8 bytes of memory. The pointer it returns is being cast to a char* and stored in x.

Solution 2

1) Where is the "memory" that is being allocated?

This depends entirely on your operating system, programming environment (gcc vs Visual C++ vs Borland C++ vs anything else), computer, available memory, etc. In general, memory is allocated from what is called the heap, a region of memory just waiting around for you to use. It will generally use your available RAM. But there are always exceptions. For the most part, so long as it gives us memory, where it comes from isn't a great concern. There are special types of memory, such as virtual memory, which may or may not actually be in RAM at any given time and may get moved off to your hard drive (or similar storage device) if you run out of real memory. A full explanation would be very long!

2) What is this "memory"? Space in an array? Or something else?

Memory is generally the RAM in your computer. If it is helpful to think of memory as a gigantic "array" (it certainly operates like one), then think of it as a ton of bytes (8-bit values, much like unsigned char values). It starts at an index of 0 at the bottom of memory. Just like before, though, there are tons of exceptions here and some parts of memory may be mapped to hardware, or may not even exist at all!

3) What happens exactly when this "memory" gets allocated?

At any given time there should be (we really hope!) some of it available for software to allocate. How it gets allocated is highly system dependent. In general, a region of memory is allocated, the allocator marks it as used, and then a pointer is given to you to use that tells the program where in all of your system's memory that memory is located. In your example, the program will find a consecutive block of 8 bytes (char) and return a pointer to where it found that block after it marks it as "in use".

4) What happens exactly when the memory gets deallocated?

The system marks that memory as available for use again. This is incredibly complicated because this will often cause holes in memory. Allocate 8 bytes then 8 more bytes, then deallocate the first 8 bytes and you've got a hole. There are entire books written on handling deallocation, memory allocation, etc. So hopefully the short answer will be sufficient!
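To make the "marks it as used" and "holes" ideas concrete, here is a toy model, nothing like a real allocator: a 32-byte pool with one in-use flag per byte and a first-fit search. The class `ToyPool` and everything in it is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model only: a 32-byte "memory" with one in-use flag per byte.
// Real allocators are far more sophisticated; all names here are hypothetical.
class ToyPool {
public:
    static constexpr std::size_t kSize = 32;
    static constexpr std::size_t kFail = kSize; // sentinel meaning "no room"

    // First-fit search for `n` consecutive free bytes. Marks them as used and
    // returns the index of the first one, or kFail if no such run exists.
    std::size_t allocate(std::size_t n) {
        if (n == 0 || n > kSize) return kFail;
        for (std::size_t start = 0; start + n <= kSize; ++start) {
            std::size_t run = 0;
            while (run < n && !used_[start + run]) ++run;
            if (run == n) {
                for (std::size_t i = 0; i < n; ++i) used_[start + i] = true;
                return start;
            }
            start += run; // jump to the used byte that ended this run
        }
        return kFail;
    }

    // Marks `n` bytes starting at `start` as free again.
    void deallocate(std::size_t start, std::size_t n) {
        assert(start + n <= kSize);
        for (std::size_t i = 0; i < n && start + i < kSize; ++i) used_[start + i] = false;
    }

private:
    std::array<bool, kSize> used_{}; // value-initialised: every byte starts free
};
```

Allocate 8 bytes twice and you get indices 0 and 8. Deallocate the first block and there is an 8-byte hole at index 0: a request for 16 bytes cannot use it and lands at index 16, while a later request for 8 bytes fits the hole and gets index 0 again.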

5) It would also really help me if someone could answer what malloc does in these C++ lines:

REALLY crudely, and assuming it's in a function (by the way, never do this because it doesn't deallocate your memory and causes a memory leak):

void mysample() {
  char *x; // 1
  x = (char *) malloc(8); // 2
}

1) This is a pointer reserved in the local stack space. It has not been initialized so it points to whatever that bit of memory had in it.

2) It calls malloc with a parameter of 8. The cast just lets C/C++ know you intend for it to be a (char *) because it returns a (void *) meaning it has no type applied. Then the resulting pointer is stored in your x variable.

In very crude x86 32bit assembly, this will look vaguely like

PROC mysample:
  ; char *x;
  x = DWord Ptr [ebp - 4]
  enter 4, 0   ; Enter and reserve 4 bytes of local stack space

  ; x = (char *) malloc(8);
  push 8       ; We're using 8 for Malloc
  call malloc  ; Call malloc to do its thing
  add esp, 4   ; Correct the stack (remove the pushed argument)
  mov x, eax   ; Store the return value, which is in EAX, into x

  leave
  ret

The actual allocation is vaguely described in point 3. Malloc usually just calls a system function for this that handles all the rest, and like everything else here, it's wildly different from OS to OS, system to system, etc.

Solution 3

1. Where is the "memory" that is being allocated?

From a language perspective, this isn't specified, and mostly because the fine details often don't matter. Also, the C++ standard tends to err on the side of under-specifying hardware details, to minimise unnecessary restrictions (both on the platforms compilers can run on, and on possible optimisations).

sftrabbit's answer gives a great overview of this end of things (and it's all you really need), but I can give a couple of worked examples in case that helps.

Example 1:

On a sufficiently old single-user computer (or a sufficiently small embedded one), most of the physical RAM may be directly available to your program. In this scenario, calling malloc or new is essentially internal book-keeping, allowing the runtime library to track which chunks of that RAM are currently in use. You can do this manually, but it gets tedious pretty quickly.

Example 2:

On a modern multitasking operating system, the physical RAM is shared with many processes and other tasks including kernel threads. It's also used for disk caching and I/O buffering in the background, and is augmented by the virtual memory subsystem which can swap data to disk (or some other storage device) when it's not being used.

In this scenario, calling new may first check whether your process already has enough space free internally, and request more from the OS if not. Whatever memory is returned may be physical, or it may be virtual (in which case physical RAM may not be assigned to store it until it's actually accessed). You can't even tell the difference, at least without using platform-specific APIs, because the memory hardware and kernel conspire to hide it from you.

2. What is this "memory"? Space in an array? Or something else?

In example 1, it's something like space in an array: the address returned identifies an addressable chunk of physical RAM. Even here, RAM addresses aren't necessarily flat or contiguous - some addresses may be reserved for ROM, or for I/O ports.

In example 2, it's an index into something more virtual: your process' address space. This is an abstraction used to hide the underlying virtual memory details from your process. When you access this address, the memory hardware may directly access some real RAM, or it might need to ask the virtual memory subsystem to provide some.

3. What happens exactly when this "memory" gets allocated?

In general, a pointer is returned which you can use to store as many bytes as you asked for. In both cases, malloc or the new operator will do some housekeeping to track which parts of your process' address space are used and which are free.

4. What happens exactly when the memory gets deallocated?

Again in general, free or delete will do some housekeeping so they know that memory is available to be re-allocated.

It would also really help me if someone could answer what malloc does in these C++ lines:

char* x; 
x = (char*) malloc (8);

It returns a pointer which is either NULL (if it couldn't find the 8 bytes you want), or some non-NULL value.

The only things you can usefully say about this non-NULL value are that:

  • it's legal (and safe) to access each of those 8 bytes x[0]..x[7],
  • it's illegal (undefined behaviour) to access x[-1] or x[8] or actually any x[i] unless 0 <= i <= 7
  • it's legal to compare any of x, x+1, ..., x+8 (although you can't dereference the last of those)
  • if your platform/hardware/whatever have any restrictions on where you can store data in memory, then x meets them
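Those rules can be sketched in code like this, assuming a made-up helper `fill_eight_bytes` that checks for a null return, touches only x[0]..x[7], and frees the block when done:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: allocates 8 bytes, writes only to x[0]..x[7], copies
// them into `out` (which must have room for 8 chars), and frees the block.
// Returns false if malloc could not provide the storage.
bool fill_eight_bytes(char fill, char* out) {
    char* x = static_cast<char*>(std::malloc(8));
    if (x == nullptr) return false;            // allocation failed, nothing to free
    for (int i = 0; i <= 7; ++i) x[i] = fill;  // every index in 0..7 is valid
    char* end = x + 8;                         // may be formed and compared, never dereferenced
    assert(end - x == 8);
    std::memcpy(out, x, 8);
    std::free(x);                              // hand the block back; x must not be used afterwards
    return true;
}
```

Note the static_cast instead of the C-style cast from the question: it does the same job here, it is just the more usual spelling in C++.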

Solution 4

To allocate memory means to ask the operating system for memory: the program itself asks for "space" in RAM only when it needs it. For example, if you want to use an array but you don't know its size before the program runs, you can do one of two things:

  • declare an array[x] with x decided by you and arbitrarily large, for example 100. But what if your program only needs an array of 20 elements? You are wasting memory for nothing.
  • have your program malloc an array of x elements once it knows the correct value of x.

Programs in memory are divided into 4 segments:

  • stack (needed for function calls)
  • code (the binary executable code)
  • data (global variables/data)
  • heap, the segment where you find the allocated memory

When you decide you don't need the allocated memory anymore, you give it back to the operating system.

If you want to allocate an array of 10 integers, you do:

int *array = (int *)malloc(sizeof(int) * 10);

And then you give it back to the OS with free(array);

Author by

Isaac

Updated on July 09, 2022

Comments

  • Isaac
    Isaac almost 2 years

    I have been looking at memory allocation lately and I am a bit confused about the basics. I haven't been able to wrap my head around the simple stuff. What does it mean to allocate memory? What happens? I would appreciate answers to any of these questions:

    1. Where is the "memory" that is being allocated?
    2. What is this "memory"? Space in an array? Or something else?
    3. What happens exactly when this "memory" gets allocated?
    4. What happens exactly when the memory gets deallocated?
    5. It would also really help me if someone could answer what malloc does in these C++ lines:

      char* x; 
      x = (char*) malloc (8);
      

    Thank you.

  • Isaac
    Isaac about 11 years
    Thank you! That was extremely helpful. It even answered a question that I thought of as I was reading it. I do have one more question that came up just now though. Is fragmentation an issue with memory allocation? Example: 10 unused bytes stuck inside two allocated blocks of memory. Or is that something that usually isn't considered an issue? Thanks again!
  • Joseph Mansfield
    Joseph Mansfield about 11 years
    @Isaac If you create local variables or dynamically allocate objects with new and delete, you don't have to care about allocation at all. The compiler will make sure the right amount of storage is allocated. Class types often contain padding bytes between members but they serve a purpose. As far as the standard goes, you shouldn't need to care about this stuff. However, practically, you might need to. Some of the top questions on SO are related to this (here, here, etc.)
  • Isaac
    Isaac about 11 years
    Thanks! That helped a lot. I'm a little scared of creating holes in memory now though. Is that something I should be worried about? Or is it something that just happens?
  • Isaac
    Isaac about 11 years
    Thanks! I got to your answer last. But it helped reinforce my confidence in what I learned from the others.
  • Mark Ormston
    Mark Ormston about 11 years
    Holes happen a lot. It is generally called fragmentation, and there are a great deal of methods designed to work around the problem. In general, unless you are allocating/deallocating over and over again, it won't affect you much... and in that case, you may need a more advanced memory manager than malloc/free (or new/delete). For more (albeit vague) information, they describe it sufficiently on Wikipedia: en.wikipedia.org/wiki/Fragmentation_%28computing%29
  • Isaac
    Isaac about 11 years
    Sorry, to bother again. If you have the time though, I'd really appreciate the help. When you say it "marks" it as used. What does that mean? I understand if the byte has not been allocated it will probably be set to 00, and if it is allocated and used, then it will be whatever it is set to. But what about the bytes that are allocated, but not used? Is there a way to differentiate them from bytes that are not allocated?
  • Isaac
    Isaac about 11 years
    Never mind! I messed around with same code and found a way.
  • Shameel Mohamed
    Shameel Mohamed over 7 years
    I understand your question. Say, you are allocating 100 bytes for a string and you are using only 50 bytes, then the remaining bytes are left empty. And the highlight is they still are allocated. That necessarily means they cannot be used/reallocated to any other tasks. So this obviously constitutes an issue as the unused bytes are unavailable. For this kind of issue there is a realloc() function in standard c, which would deallocate the existing memory, allocate the requested memory in a new location and copy the existing contents to this location.
  • Shameel Mohamed
    Shameel Mohamed over 7 years
    So you can use this realloc() to allocate additional memory whenever necessary and you need not worry about memory being left unused. I do not know if there's a doppelganger for realloc() in C++. Please let me know if you do find..