How are multi-dimensional arrays formatted in memory?

85,130

Solution 1

A static two-dimensional array looks like an array of arrays - it's just laid out contiguously in memory. Arrays are not the same thing as pointers, but because you can often use them pretty much interchangeably it can get confusing sometimes. The compiler keeps track properly, though, which makes everything line up nicely. You do have to be careful with static 2D arrays like you mention, since if you try to pass one to a function taking an int ** parameter, bad things are going to happen. Here's a quick example:

int array1[3][2] = {{0, 1}, {2, 3}, {4, 5}};

In memory looks like this:

0 1 2 3 4 5

exactly the same as:

int array2[6] = { 0, 1, 2, 3, 4, 5 };

But if you try to pass array1 to this function:

void function1(int **a);

you'll get a warning (and the app will fail to access the array correctly):

warning: passing argument 1 of ‘function1’ from incompatible pointer type

Because a 2D array is not the same as int **. The automatic decaying of an array into a pointer only goes "one level deep" so to speak. You need to declare the function as:

void function2(int a[][2]);

or

void function2(int a[3][2]);

To make everything happy.

This same concept extends to n-dimensional arrays. Taking advantage of this kind of funny business in your application generally only makes it harder to understand, though. So be careful out there.

Solution 2

The answer is based on the idea that C doesn't really have 2D arrays - it has arrays-of-arrays. When you declare this:

int someNumbers[4][2];

You are asking for someNumbers to be an array of 4 elements, where each element of that array is of type int [2] (which is itself an array of 2 ints).

The other part of the puzzle is that arrays are always laid out contiguously in memory. If you ask for:

sometype_t array[4];

then that will always look like this:

| sometype_t | sometype_t | sometype_t | sometype_t |

(4 sometype_t objects laid out next to each other, with no spaces in between). So in your someNumbers array-of-arrays, it'll look like this:

| int [2]    | int [2]    | int [2]    | int [2]    |

And each int [2] element is itself an array, that looks like this:

| int        | int        |

So overall, you get this:

| int | int  | int | int  | int | int  | int | int  |

Solution 3

unsigned char MultiArray[5][2]={{0,1},{2,3},{4,5},{6,7},{8,9}};

in memory is equal to:

unsigned char SingleArray[10]={0,1,2,3,4,5,6,7,8,9};

Solution 4

In answer to your also: Both, though the compiler is doing most of the heavy lifting.

In the case of statically allocated arrays, "The System" will be the compiler. It will reserve the memory like it would for any stack variable.

In the case of the malloc'd array, "The System" will be the implementer of malloc (the kernel usually). All the compiler will allocate is the base pointer.

The compiler is always going to handle the type as what they are declared to be except in the example Carl gave where it can figure out interchangeable usage. This is why if you pass in a [][] to a function it must assume that it is a statically allocated flat, where ** is assumed to be pointer to pointer.

Solution 5

Suppose, we have a1 and a2 defined and initialized like below (c99):

int a1[2][2] = {{142,143}, {144,145}};
int **a2 = (int* []){ (int []){242,243}, (int []){244,245} };

a1 is a homogeneous 2D array with plain continuous layout in memory and expression (int*)a1 is evaluated to a pointer to its first element:

a1 --> 142 143 144 145

a2 is initialized from a heterogeneous 2D array and is a pointer to a value of type int*, i.e. dereference expression *a2 evaluates into a value of type int*, memory layout does not have to be continuous:

a2 --> p1 p2
       ...
p1 --> 242 243
       ...
p2 --> 244 245

Despite totally different memory layout and access semantics, C-language grammar for array-access expressions looks exactly the same for both homogeneous and heterogeneous 2D array:

  • expression a1[1][0] will fetch value 144 out of a1 array
  • expression a2[1][0] will fetch value 244 out of a2 array

Compiler knows that the access-expression for a1 operates on type int[2][2], when the access-expression for a2 operates on type int**. The generated assembly code will follow the homogeneous or heterogeneous access semantics.

The code usually crashes at run-time when array of type int[N][M] is type-casted and then accessed as type int**, for example:

((int**)a1)[1][0]   //crash on dereference of a value of type 'int'
Share:
85,130

Related videos on Youtube

Chris Cooper
Author by

Chris Cooper

I'm currently the solo developer of a new virtual reality game on Steam, but in my career I've led in mobile development, bioinformatics, distributed computing, and web development, focusing on innovation and improving complex systems. Mentoring is one of my favorite past times, and I am a firm believer in the power of comprehensive systems understanding and lifelong learning. I'm never afraid to try on a new hat. Past organizations include Xbox Live, the Ontario Institute for Cancer Research, and Loblaw Digital.

Updated on January 14, 2020

Comments

  • Chris Cooper
    Chris Cooper over 4 years

    In C, I know I can dynamically allocate a two-dimensional array on the heap using the following code:

    int** someNumbers = malloc(arrayRows*sizeof(int*));
    
    for (i = 0; i < arrayRows; i++) {
        someNumbers[i] = malloc(arrayColumns*sizeof(int));
    }
    

    Clearly, this actually creates a one-dimensional array of pointers to a bunch of separate one-dimensional arrays of integers, and "The System" can figure out what I mean when I ask for:

    someNumbers[4][2];
    

    But when I statically declare a 2D array, as in the following line...:

    int someNumbers[ARRAY_ROWS][ARRAY_COLUMNS];
    

    ...does a similar structure get created on the stack, or is it of another form completely? (i.e. is it a 1D array of pointers? If not, what is it, and how do references to it get figured out?)

    Also, when I said, "The System," what is actually responsible for figuring that out? The kernel? Or does the C compiler sort it out while compiling?

    • Andrew Henle
      Andrew Henle over 4 years
      @toohonestforthissite Indeed. To expand on that: Looping and calling malloc() does not result in an N-dimensional array.. It results in arrays of pointers [to arrays of pointers[...]] to completely separate one-dimensional arrays. See Correctly allocating multi-dimensional arrays to see how to allocate TRUE N-dimensional array.
  • Chris Cooper
    Chris Cooper about 14 years
    Thanks for the explanation. So "void function2(int a[][2]);" will accept both statically and dynamically declared 2D's? And I guess it's still good practice/essential to pass in the length of the array as well if the first dimension is left as []?
  • Carl Norum
    Carl Norum about 14 years
    @Chris I don't think so - you'll have a hard time making C swizzle a stack- or globally-allocated array into a bunch of pointers.
  • Narcisse Doudieu Siewe
    Narcisse Doudieu Siewe almost 9 years
    looking at the final layout make me think that int a[][] can be accessed as int *...right?
  • caf
    caf almost 9 years
    @user3238855: The types are not compatible, but if you get a pointer to the first int in the array of arrays (eg. by evaluating a[0] or &a[0][0]) then yes, you can offset that to sequentially access every int).
  • Jason K.
    Jason K. over 7 years
    Certainly a[][] should work! In C every array is really just a pointer with an offset, the memory allocation is the only real difference. I wouldn't expect the warning for a** mentioned above to prevent you from compiling, though that would depend on the compiler and settings.
  • Carl Norum
    Carl Norum over 7 years
    @JasonK. - no. Arrays are not pointers. Arrays "decay" into pointers in some contexts, but they are absolutely not the same.
  • Jason K.
    Jason K. over 7 years
    To be clear: Yes Chris "it's still good practice to pass in the length of the array" as a separate parameter, otherwise use std::array or std::vector (which is C++ not old C). I think we agree @CarlNorum both conceptually for new users and practically, to quote Anders Kaseorg on Quora: “The first step to learning C is understanding that pointers and arrays are the same thing. The second step is understanding that pointers and arrays are different.”
  • Manuel Selva
    Manuel Selva over 7 years
    @Jon L. I would not say that malloc is implemented by the kernel, but by the libc on top of kernel primitives (such as brk)
  • too honest for this site
    too honest for this site over 7 years
    @JasonK. "The first step to learning C is understanding that pointers and arrays are the same thing." - This quote is so very wrong and missleading! It is indeed the most important step to understand they are not the same, but that arrays are converted to a pointer to the first element for most operators! sizeof(int[100]) != sizeof(int *) (unless you find a platform with 100 * sizeof(int) bytes/int, but that is a different thing.
  • too honest for this site
    too honest for this site over 7 years
    @ManuelSelva: Where and how malloc is implemented is not specified by the standard and left to the implementation, resp. environment. For freestanding environments it is optional like all parts of the standard library requiring linking functions (that's what the requirements actually result in, not literaly what the standard states). For some modern hosted environments, it indeed relies on kernel functions, either the complete stuff, or (e.g. Linux) as you wrote using both, stdlib and kernel-primitives. For non-virtual memory single-process systems, it can be stdlib only.