C / C++ How to copy a multidimensional char array without nested loops?

Solution 1

You could use memcpy.

If the multidimensional array size is given at compile time, e.g. mytype myarray[1][2], then only a single memcpy call is needed:

memcpy(dest, src, sizeof (mytype) * rows * columns);
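As a minimal sketch of the compile-time case, assuming a true 2D char array whose dimensions are fixed by the illustrative macros ROWS and COLS (neither name comes from the question), the whole array is one contiguous block and a single call copies it:

```c
#include <string.h>

#define ROWS 3   /* illustrative dimensions, not taken from the question */
#define COLS 8

/* Copy a true 2D array (one contiguous block) with a single memcpy. */
void copy_grid(char dest[ROWS][COLS], char src[ROWS][COLS])
{
    /* Inside the function dest and src are pointers to rows, so the
       byte count is taken from the array type, not from sizeof dest. */
    memcpy(dest, src, sizeof(char[ROWS][COLS]));
}
```

Calling copy_grid(b, a) on two char[ROWS][COLS] arrays leaves b as an independent copy, so later writes to b do not touch a.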

If, as you indicated, the array is dynamically allocated, you will need to know the size of both dimensions. When each row is allocated separately, the rows are not guaranteed to be contiguous in memory, so memcpy has to be called once per row.

Given a 2d array, the method to copy it would be as follows:

char** src;
char** dest;

int length = someFunctionThatFillsTmp(src);
dest = malloc(length*sizeof(char*));

for ( int i = 0; i < length; ++i ){
    //width must be known (see below)
    dest[i] = malloc(width);

    memcpy(dest[i], src[i], width);
}

Given that from your question it looks like you are dealing with an array of strings, you could use strlen to find the length of the string (It must be null terminated).

In which case the loop would become

for ( int i = 0; i < length; ++i ){
    size_t width = strlen(src[i]) + 1;  // +1 for the terminating null
    dest[i] = malloc(width);
    memcpy(dest[i], src[i], width);
}
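Putting the pieces above together, a deep copy of an array of null-terminated strings might look like the following sketch. The names copy_strings and free_strings are hypothetical, and the code assumes every src[i] is a valid null-terminated string and that length is greater than zero:

```c
#include <stdlib.h>
#include <string.h>

/* Deep-copy `length` null-terminated strings. Returns NULL on allocation
   failure, after freeing anything that was already allocated. */
char **copy_strings(char **src, size_t length)
{
    char **dest = malloc(length * sizeof *dest);
    if (dest == NULL)
        return NULL;

    for (size_t i = 0; i < length; ++i) {
        size_t width = strlen(src[i]) + 1;   /* +1 for the terminator */
        dest[i] = malloc(width);
        if (dest[i] == NULL) {
            while (i > 0)                    /* undo the earlier rows */
                free(dest[--i]);
            free(dest);
            return NULL;
        }
        memcpy(dest[i], src[i], width);
    }
    return dest;
}

/* Release a copy made by copy_strings. */
void free_strings(char **strings, size_t length)
{
    if (strings == NULL)
        return;
    for (size_t i = 0; i < length; ++i)
        free(strings[i]);
    free(strings);
}
```

There is still one loop over the rows, but no hand-written inner loop over the characters, and the copy can be edited without changing the source strings.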

Solution 2

When you have a pointer to a pointer in C, you have to know how the data is going to be used and laid out in the memory. Now, the first point is obvious, and true for any variable in general: if you don't know how some variable is going to be used in a program, why have it? :-). The second point is more interesting.

At the most basic level, a pointer to type T points to one object of type T. For example:

int i = 42;
int *pi = &i;

Now, pi points to one int. If you wish, you can make a pointer point to the first of many such objects:

int arr[10];
int *pa = arr;
int *pb = malloc(10 * sizeof *pb);

pa now points to the first of a sequence of 10 (contiguous) int values, and assuming that malloc() succeeds, pb points to the first of another set of 10 (again, contiguous) ints.

The same applies if you have a pointer to a pointer:

int **ppa = malloc(10 * sizeof *ppa);

Assuming that malloc() succeeds, now you have ppa pointing to the first of a sequence of 10 contiguous int * values.

So, when you do:

char **tmp = malloc(sizeof(char *)*CR_MULTIBULK_SIZE);

tmp points to the first char * object in a sequence of CR_MULTIBULK_SIZE such objects. Each of the pointers above is not initialized, so tmp[0] to tmp[CR_MULTIBULK_SIZE-1] all contain garbage. One way to initialize them would be to malloc() them:

size_t i;
for (i=0; i < CR_MULTIBULK_SIZE; ++i)
    tmp[i] = malloc(...);

The ... above is the size of the ith data we want. It could be a constant, or it could be a variable, depending upon i, or the phase of the moon, or a random number, or anything else. The main point to note is that you have CR_MULTIBULK_SIZE calls to malloc() in the loop, and that while each malloc() is going to return you a contiguous block of memory, the contiguity is not guaranteed across malloc() calls. In other words, the second malloc() call is not guaranteed to return a pointer that starts right where the previous malloc()'s data ended.

To make things more concrete, let's assume CR_MULTIBULK_SIZE is 3. In pictures, your data might look like this:

     +------+                                          +---+---+
tmp: |      |--------+                          +----->| a | 0 |
     +------+        |                          |      +---+---+
                     |                          |
                     |                          |
                     |         +------+------+------+
                     +-------->|  0   |  1   |  2   |
                               +------+------+------+
                                   |      |
                                   |      |    +---+---+---+---+---+
                                   |      +--->| t | e | s | t | 0 |
                            +------+           +---+---+---+---+---+
                            |
                            |
                            |    +---+---+---+
                            +--->| h | i | 0 |
                                 +---+---+---+

tmp points to a contiguous block of 3 char * values. The first of the pointers, tmp[0], points to a contiguous block of 3 char values. Similarly, tmp[1] and tmp[2] point to 5 and 2 chars respectively. But the memory pointed to by tmp[0] to tmp[2] is not contiguous as a whole.

Since memcpy() copies contiguous memory, what you want to do can't be done by one memcpy(). Further, you need to know how each tmp[i] was allocated. So, in general, what you want to do needs a loop:

char **realDest = malloc(CR_MULTIBULK_SIZE * sizeof *realDest);
/* assume malloc succeeded */
size_t i;
for (i=0; i < CR_MULTIBULK_SIZE; ++i) {
    realDest[i] = malloc(size * sizeof *realDest[i]);
    /* again, no error checking */
    memcpy(realDest[i], tmp[i], size);
}

As above, you can call memcpy() inside the loop, so you don't need a nested loop in your code. (Most likely memcpy() is implemented with a loop, so the effect is as if you had nested loops.)

Now, if you had code like:

char *s = malloc(size * CR_MULTIBULK_SIZE * sizeof *s);
size_t i;
for (i=0; i < CR_MULTIBULK_SIZE; ++i)
    tmp[i] = s + i*size;   /* each row is size chars long */

I.e., you allocated contiguous space for all the data in one malloc() call and pointed each tmp[i] into it, then you can copy all the data without a loop in your code:

size_t i;
char **realDest = malloc(CR_MULTIBULK_SIZE * sizeof *realDest);
*realDest = malloc(size * CR_MULTIBULK_SIZE * sizeof **realDest);
memcpy(*realDest, tmp[0], size*CR_MULTIBULK_SIZE);

/* Now set realDest[1]...realDest[CR_MULTIBULK_SIZE-1] to "proper" values */
for (i=1; i < CR_MULTIBULK_SIZE; ++i)
    realDest[i] = realDest[0] + i * size;   /* row stride is size chars */

From the above, the simple answer is, if you had more than one malloc() to allocate memory for tmp[i], then you will need a loop to copy all the data.
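For the single-allocation layout, here is a hedged sketch of how the pieces could fit together. The helper names alloc_rows, copy_rows and free_rows are hypothetical; the code assumes rows and size are both greater than zero and that the source table was built the same way, i.e. src[0] starts one block of rows * size chars:

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a table of `rows` pointers into ONE zeroed block of rows * size chars. */
char **alloc_rows(size_t rows, size_t size)
{
    char **table = malloc(rows * sizeof *table);
    char *block = calloc(rows, size);
    if (table == NULL || block == NULL) {
        free(table);
        free(block);
        return NULL;
    }
    for (size_t i = 0; i < rows; ++i)
        table[i] = block + i * size;     /* row stride is `size` chars */
    return table;
}

/* Copy such a table: the character data moves in a single memcpy. */
char **copy_rows(char **src, size_t rows, size_t size)
{
    char **dest = alloc_rows(rows, size);
    if (dest != NULL)
        memcpy(dest[0], src[0], rows * size);
    return dest;
}

/* Free a table made by alloc_rows or copy_rows. */
void free_rows(char **table)
{
    if (table != NULL) {
        free(table[0]);                  /* the single data block */
        free(table);                     /* the row pointers */
    }
}
```

The loop that remains only sets up the row pointers; it never touches the character data. This only works when you control how the source was allocated, which is not the case for a table handed back by a library.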

Solution 3

You can just calculate the overall size of the array and then use memcpy to copy it.

int cb = sizeof(char) * rows * columns;
memcpy (toArray, fromArray, cb);

Edit: new information in the question indicates that the number of rows and cols of the array is not known, and that the array may be ragged, so memcpy may not be a solution.

Solution 4

Let's explore some possibilities for what's going on here:

int main(int argc, char **argv){
  char **tmp1;         // Could point anywhere
  char **tmp2 = NULL;
  char **tmp3 = NULL;
  char **tmp4 = NULL;
  char **tmp5 = NULL;
  char **realDest;     // Never initialized either; assume it points at cb writable bytes

  int size = SIZE_MACRO; // Well, you never said
  int cb = sizeof(char) * size * 8; //string inside 2. level has 8 chars

  /* Case 1: did nothing with tmp */
  memcpy(realDest,tmp1,cb);  // copies 8*size bytes from WHEREVER tmp1 happens to be
                          // pointing. This is undefined behavior and might crash.
  printf("%p\n",tmp1[0]);    // Accesses WHEREVER tmp1 points, undefined behavior, 
                            // might crash.
  printf("%c\n",tmp1[0][0]); // Accesses WHEREVER tmp1 points, undefined behavior, 
                            // might crash. IF it hasn't crashed yet, dereferences THAT
                            // memory location, ALSO undefined behavior and 
                            // might crash


  /* Case 2: NULL pointer */
  memcpy(realDest,tmp2,cb);  // Dereferences a NULL pointer. Typically crashes with SIGSEGV
  printf("%p\n",tmp2[0]);    // Dereferences a NULL pointer. Typically crashes with SIGSEGV
  printf("%c\n",tmp2[0][0]); // Dereferences a NULL pointer. Typically crashes with SIGSEGV


  /* Case 3: Small allocation at the other end */
  tmp3 = calloc(1,sizeof(char*)); // Allocates space for ONE char* 
                                  // (4 bytes on most 32 bit machines), and 
                                  // initializes it to 0 (NULL on most machines)
  memcpy(realDest,tmp3,cb);  // Accesses at least 8 bytes of the 4 byte block: 
                             // undefined behavior, might crash
  printf("%p\n",tmp3[0]);    // FINALLY one that works. 
                             // Prints a representation of a 0 pointer   
  printf("%c\n",tmp3[0][0]); // Dereferences a 0 (i.e. NULL) pointer. 
                             // Crashes with SIGSEGV


  /* Case 4: Adequate allocation at the other end */
  tmp4 = calloc(32,sizeof(char*)); // Allocates space for 32 char*'s 
                                  // (4*32 bytes on most 32 bit machines), and 
                                  // initializes it to 0 (NULL on most machines)
  memcpy(realDest,tmp4,cb);  // Accesses at least 8 bytes of large block. Works.
  printf("%p\n",tmp4[0]);    // Works again. 
                             // Prints a representation of a 0 pointer   
  printf("%c\n",tmp4[0][0]); // Dereferences a 0 (i.e. NULL) pointer. 
                             // Crashes with SIGSEGV


  /* Case 5: Full ragged array */
  tmp5 = calloc(8,sizeof(char*)); // Allocates space for 8 char*'s
  for (int i=0; i<8; ++i){
    tmp5[i] = calloc(2*(i+1),sizeof(char)); // Allocates space for 2(i+1) characters
    tmp5[i][0] = '0' + i;               // Assigns the first character a digit for ID
  }
  // At this point we have finally allocated 8 strings of sizes ranging 
  // from 2 to 16 characters.
  memcpy(realDest,tmp5,cb);  // Accesses at least 8 bytes of large block. Works.
                             // BUT what works means is that 2*size elements of 
                             // realDest now contain pointers to the character 
                             // arrays allocated in the for block above.
                             //
                             // There are still only 8 strings allocated
  printf("%p\n",tmp5[0]);    // Works again. 
                             // Prints a representation of a non-zero pointer   
  printf("%c\n",tmp5[0][0]); // This is the first time this has worked. Prints "0\n"
  tmp5[0][0] = '*';
  printf("%c\n",realDest[0][0]); // Prints "*\n", because realDest[0] == tmp5[0],
                                 // So the change to tmp5[0][0] affects realDest[0][0]

  return 0;
}

The moral of the story is: you must know what is on the other side of your pointers. Or else.

The second moral of the story is: just because you can access a double pointer using the [][] notation does not make it the same as a two-dimensional array. Really.


Let me clarify the second moral a little bit.

An array (be it one dimensional, two dimensional, whatever) is an allocated piece of memory, and the compiler knows how big it is (but never does any range checking for you), and at what address it starts. You declare arrays with

char string1[32];
unsigned int histo2[10][20];

and similar things.

A pointer is a variable that can hold a memory address. You declare pointers with

char *string_ptr1;
double *matrix_ptr = NULL;

They are two different things.

But:

  1. If you use the [] syntax with a pointer, the compiler will do pointer arithmetic for you.
  2. In almost any place you use an array without dereferencing it, the compiler treats it as a pointer to the array's start location.

So, I can do

    strcpy(string1,"dmckee");

because rule 2 says that string1 (an array) is treated as a char*. Likewise, I can follow that with:

    char *string_ptr2 = string1;

Finally,

    if (string_ptr2[3] == 'k') {
      printf("OK\n");
    }

will print "OK" because of rule 1.
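To make the second moral concrete, here is a small sketch (the function names are made up for illustration) that contrasts what the compiler knows about a real two-dimensional array with what it knows about a double pointer:

```c
#include <stddef.h>

/* A real 2D array: the compiler knows both dimensions, so sizeof
   reports the whole block. */
size_t grid_bytes(void)
{
    unsigned int histo2[10][20] = {{0}};
    return sizeof histo2;            /* 10 * 20 * sizeof(unsigned int) */
}

/* A double pointer: sizeof reports one pointer, whatever it points at. */
size_t double_pointer_bytes(void)
{
    char **tmp = NULL;
    return sizeof tmp;               /* sizeof(char **) */
}

/* In a real 2D array, row 1 starts exactly where row 0 ends.
   Nothing guarantees that for rows reached through a char **. */
int rows_are_adjacent(void)
{
    char grid[3][8] = {{0}};
    return &grid[1][0] == &grid[0][0] + 8;
}
```

Both kinds of object accept the [][] notation, but only the array carries its layout with it. That is why a single memcpy over a char ** copies pointers rather than strings.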

Solution 5

Why are you not using C++?

#include <cstring>
#include <string>
#include <vector>

class C
{
    std::vector<std::string> data;
public:
    char** cpy();
};

// Returns a deep copy; the caller must delete [] each string and then the array.
char** C::cpy()
{
    char **ppsz = new char* [data.size()];
    for(size_t i = 0; i < data.size(); ++i)
    {
        ppsz[i] = new char [data[i].length() + 1];
        // Copy the characters; assigning c_str() would only copy a pointer.
        std::strcpy(ppsz[i], data[i].c_str());
    }
    return(ppsz);
}

Or something similar? Also, do you need to use C-strings? I doubt it.

dan
Author by

dan

Updated on July 09, 2022

Comments

  • dan
    dan almost 2 years

    I'm looking for a smart way to copy a multidimensional char array to a new destination. I want to duplicate the char array because I want to edit the content without changing the source array.

    I could build nested loops to copy every char by hand but I hope there is a better way.

    Update:

    I don't have the size of the second-level dimension. Only the length (number of rows) is given.

    The code looks like this:

    char **tmp;
    char **realDest;
    
    int length = someFunctionThatFillsTmp(&tmp);
    
    //now I want to copy tmp to realDest
    

    I'm looking for a method that copies all the memory of tmp into free memory and point realDest to it.

    Update 2:

    someFunctionThatFillsTmp() is the function credis_lrange() from the Redis C lib credis.c.

    Inside the lib tmp is created with:

    rhnd->reply.multibulk.bulks = malloc(sizeof(char *)*CR_MULTIBULK_SIZE)
    

    Update 3:

    I've tried to use memcpy with these lines:

    int cb = sizeof(char) * size * 8; //string inside 2. level has 8 chars
    memcpy(realDest,tmp,cb);
    cout << realDest[0] << endl;
    
    prints: mystring
    

    But I'm getting a: Program received signal: EXC_BAD_ACCESS

    • caf
      caf about 14 years
      It entirely depends on how your "multidimensional array" is constructed. Show the code that creates it.
    • John Knoeller
      John Knoeller about 14 years
      if you don't have the array dimensions, then you can't copy it with a loop either.
    • dan
      dan about 14 years
      @John Knoeller: Thanks. I have updated the description.
    • dmckee --- ex-moderator kitten
      dmckee --- ex-moderator kitten about 14 years
      When caf asked for code he meant we need to know what someFunctionThatFillsTmp does, at least in outline. Is this a ragged array or is it a monolithic single block allocation? (Note that if it is the latter, you don't need double indirection.)
    • dmckee --- ex-moderator kitten
      dmckee --- ex-moderator kitten about 14 years
      void * memcpy(void *dst, const void *src, size_t len); Are you sure you're using it right?
    • dan
      dan about 14 years
      @dmckee: Thanks for that. I had mixed up src & dest. Now I can access realDest but I am getting a Program received signal: EXC_BAD_ACCESS error.
    • dmckee --- ex-moderator kitten
      dmckee --- ex-moderator kitten about 14 years
      I note that tmp is a char **, while the argument to memcpy is a void *.
    • David Nehme
      David Nehme about 14 years
      1. You said there are 2 dimensions, but the function only returns one. Is it a square array? 2. Are you sure your destination was allocated enough memory?
    • dan
      dan about 14 years
      @David Nehme: One dimension is the list, the second is the chars per string. I guess the strings inside the list consist of 8 chars. I've also tried to raise the size, with the same result.
    • dmckee --- ex-moderator kitten
      dmckee --- ex-moderator kitten about 14 years
      @Dan: How sure are you that you know what the allocated structure of the thing pointed to by tmp is? The memcpy that you are doing is wrong in almost any case, but we can't tell you why unless you are specific about what is at the other end. And I am not going to dig through most of a thousand lines of library code to figure it out for you. My guess is that the thing returned by library is meant to be opaque, and you should be asking the library to hand you the string(s) you want.
  • Jon
    Jon about 14 years
    sizeof(char) == 1 byte by definition (whether that 1 byte is 8 bits or not is an entirely different question...)
  • John Knoeller
    John Knoeller about 14 years
    @Jon: Yes, but it's harmless, and it helps to make it clear that this is a byte count and not an element count - and would need to be updated if the array was wide chars.
  • dmckee --- ex-moderator kitten
    dmckee --- ex-moderator kitten about 14 years
    By all means, use memcpy, but the question is once for a real multi-dimensional array or many times for a ragged array (which is suggested by the OP's use of double indirection...)?
  • Yacoby
    Yacoby about 14 years
    @dmckee my original answer was written for the original question, not the updated question. My answer is now hopefully better suited to the updated question.
  • technosaurus
    technosaurus over 9 years
    doing strlen then memcpy is no different than just doing strdup(). see git.musl-libc.org/cgit/musl/tree/src/string/strdup.c
  • PC Luddite
    PC Luddite almost 8 years
    @technosaurus strdup() is not standard C or C++
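For readers who want the strdup() behaviour discussed in these comments without depending on it being available, a portable stand-in is short. This is a sketch under the assumption that s is a valid null-terminated string; dup_string is a hypothetical name, not a library function:

```c
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for POSIX strdup(): strlen, malloc, memcpy. */
char *dup_string(const char *s)
{
    size_t width = strlen(s) + 1;    /* +1 for the terminator */
    char *copy = malloc(width);
    if (copy != NULL)
        memcpy(copy, s, width);
    return copy;                     /* caller frees; NULL on failure */
}
```

This is the same strlen-then-memcpy pair used in the per-row loop of Solution 1, wrapped in one function.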