Why is the first argument of getline a pointer to pointer "char**" instead of "char*"?


Solution 1

Why use char **lineptr instead of char *lineptr as a parameter of function getline?

Imagine the prototype for getline looked like this:

ssize_t
getline(char *line, size_t n, FILE *stream);

And you called it like this:

char *buffer = NULL;
size_t len = 0;
ssize_t read = getline(buffer, len, stdin);

Before calling getline, buffer is null:

+------+
|buffer+-------> NULL
+------+

When getline is called, line gets a copy of buffer because function arguments are passed by value in C. Inside getline, we no longer have access to buffer:

+------+
|buffer+-------> NULL
+------+          ^
                  |
+------+          |
| line +----------+
+------+

getline allocates some memory with malloc and points line to the beginning of the block:

+------+
|buffer+-------> NULL
+------+

+------+        +---+---+---+---+---+
| line +------->+   |   |   |   |   |
+------+        +---+---+---+---+---+

After getline returns, we no longer have access to line:

+------+
|buffer+-------> NULL
+------+

And we're right back where we started. We can't re-point buffer to the newly-allocated memory inside getline because we only have a copy of buffer.
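
The by-value failure can be reproduced in a few lines. This is only a sketch: alloc_by_value and caller_pointer_unchanged are hypothetical names made up for the illustration, and the helper frees its block straight away because nothing outside it could ever reach that memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the imagined by-value prototype; not a real API. */
void alloc_by_value(char *line, size_t n)
{
    assert(n > 0);
    line = malloc(n);   /* only the local copy "line" is re-pointed */
    free(line);         /* free here: the caller can never reach this block */
}

/* Returns 1 if the caller's pointer is still NULL after the call. */
int caller_pointer_unchanged(void)
{
    char *buffer = NULL;
    alloc_by_value(buffer, 16);   /* passes a copy of buffer's value */
    return buffer == NULL;
}
```

Calling caller_pointer_unchanged() returns 1: buffer never sees the allocation.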


The prototype for getline is actually:

ssize_t
getline(char **lineptr, size_t *n, FILE *stream);

And you call it like this:

char *buffer = NULL;
size_t len = 0;
ssize_t read = getline(&buffer, &len, stdin);

&buffer returns a pointer to buffer, so we have:

+-------+        +------+
|&buffer+------->+buffer+-------> NULL
+-------+        +---+--+

When getline is called, lineptr gets a copy of &buffer because C is call-by-value. lineptr points to the same place as &buffer:

+-------+        +------+
|&buffer+------->+buffer+-------> NULL
+-------+        +---+--+
                     ^
+-------+            |
|lineptr+------------+
+-------+

getline allocates some memory with malloc and points the pointee of lineptr (i.e. the thing lineptr points to) at the beginning of the block:

+-------+        +------+        +---+---+---+---+---+
|&buffer+------->+buffer+------->+   |   |   |   |   |
+-------+        +---+--+        +---+---+---+---+---+
                     ^
+-------+            |
|lineptr+------------+
+-------+

After getline returns, we no longer have access to lineptr, but we can still access the newly-allocated memory via buffer:

+-------+        +------+        +---+---+---+---+---+
|&buffer+------->+buffer+------->+   |   |   |   |   |
+-------+        +---+--+        +---+---+---+---+---+
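
The same mechanism in miniature, as a sketch. alloc_via_pointer is a hypothetical helper, not the real getline, and the fixed size of 16 is an arbitrary choice for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper mimicking how getline hands memory back to its caller. */
int alloc_via_pointer(char **lineptr, size_t *n)
{
    assert(lineptr != NULL && n != NULL);
    char *block = malloc(16);
    if (block == NULL)
        return -1;
    block[0] = '\0';
    *lineptr = block;   /* writes through lineptr into the caller's variable */
    *n = 16;            /* same trick for the size */
    return 0;
}
```

After alloc_via_pointer(&buffer, &len) succeeds, buffer points at the new block and len is 16; the caller is then responsible for free(buffer).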

Solution 2

Because getline() will allocate the memory for you if you pass in a pointer to a null pointer.

From the man page:

getline() reads an entire line from stream, storing the address of the buffer containing the text into *lineptr. The buffer is null-terminated and includes the newline character, if one was found.

If *lineptr is NULL, then getline() will allocate a buffer for storing the line, which should be freed by the user program. (In this case, the value in *n is ignored.)

You need to pass in a char ** (i.e. a pointer to a pointer to a char) so that the function is able to update the value of the char * that it points to.

You could have used:

char *my_string = NULL;  // getline will allocate the buffer
size_t nbytes = 0;       // getline stores the buffer size here
ssize_t bytes_read;

puts("Please enter a line of text");

bytes_read = getline(&my_string, &nbytes, stdin);

Don't forget that if you do this you're responsible for free()-ing the memory allocated by getline().
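
A fuller sketch of that pattern, assuming a POSIX system. count_lines is a hypothetical helper written for this illustration; it lets getline allocate once, reuses the same buffer for every line, and frees it a single time at the end.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical helper: counts the lines in stream and the total bytes read. */
size_t count_lines(FILE *stream, size_t *total_bytes)
{
    assert(stream != NULL && total_bytes != NULL);
    char *line = NULL;   /* NULL and 0 ask getline to allocate for us */
    size_t cap = 0;
    ssize_t got;
    size_t lines = 0;

    *total_bytes = 0;
    while ((got = getline(&line, &cap, stream)) != -1) {
        lines++;
        *total_bytes += (size_t)got;   /* got includes the newline, if any */
    }
    free(line);   /* one free() covers the buffer however often it was grown */
    return lines;
}
```

Note that free(line) is called even though the last getline call returned -1; the buffer stays owned by the caller in that case too.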

Solution 3

Therefromhere's answer is correct for your first question. In future, check the man page; it has the information you need.

Your second snippet doesn't work because the pointer isn't initialised. If you want to do that, you'd need to write:

char **my_string = malloc(sizeof *my_string);  // room for one char *
*my_string = NULL;                             // so getline allocates the buffer

Essentially, in a declaration, * means "pointer to"; in an expression, it dereferences the pointer (gets what the pointer points to). & means "the address of this", i.e. a pointer which points to it.

Solution 4

Having taken over some legacy code at my new gig, I think I should offer a caution against calling calloc() and storing the result through a pointer-to-pointer. It should work, but it obscures how getline() operates. The & operator makes it clear that you are passing the address of the pointer you got back from malloc() or calloc(). While technically equivalent, declaring foo as char **foo instead of char *foo, and then calling getline(foo,,) instead of getline(&foo,,), obscures this important point.

  1. getline() allows you to allocate storage yourself and pass it a pointer to the pointer that malloc() or calloc() returned to you. E.g.:

    char *foo = calloc(arbitrarily_large, 1);  /* arbitrarily_large is a size_t */

  2. It is possible to pass &foo with foo set to NULL, in which case getline() does a blind allocation of storage for you by quietly calling malloc() or realloc(), hidden from view.

  3. char *foo, **p_foo = &foo; would also work. Then call foo = calloc(count, size), and then call getline(p_foo,,). I think getline(&foo,,) is better.

Blind allocations are very bad, and an invitation to problematic memory leaks. Nowhere in YOUR code are you calling malloc() or calloc(), so you, or whoever is later tasked with maintaining your code, won't know to free() the pointer to that storage: a function you called allocated memory without you knowing it (unless you read the function description and understood that it does a blind allocation).

Since getline() will realloc() the memory your call to malloc() or calloc() provided if it's too small, it's best to allocate your best guess at the required storage with a call to calloc(), and make it clear what the pointer char *foo is doing. I don't believe getline() touches the allocation as long as the storage you calloc()'d is sufficient.

Keep in mind that the value of your pointer may change if getline() has to call realloc() to get more storage, as the new storage probably WILL be at a different location on the heap. For example, if you pass &foo, and foo holds the address 12345, and getline() realloc()s your storage into a new location, foo might afterwards hold 45678.

This isn't an argument against doing your own call to calloc(), because if you set foo = NULL, you're guaranteed that getline() will have to allocate.

In summary, make a call to calloc() with some good guess as to size, which will make it obvious to anyone reading your code that memory IS BEING ALLOCATED which must be free()d, no matter what getline() does or doesn't do later.

if (NULL == line) {
    // getline() will realloc() if too small
    len = 512;  // len must be a size_t holding the buffer size
    line = calloc(len, sizeof(char));
}
getline(&line, &len, stdin);  // line is a char *, so no casts are needed
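
To see the growth behaviour described above, here is a rough sketch, assuming a POSIX system. read_line_prealloc is a hypothetical helper: it pre-allocates whatever capacity the caller asks for and lets getline grow the buffer if the line does not fit.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical helper: reads one line into a buffer that starts at *cap bytes.
   Returns the line (caller frees it) or NULL on EOF/error.
   On success, *cap holds the final capacity reported by getline. */
char *read_line_prealloc(FILE *stream, size_t *cap)
{
    assert(stream != NULL && cap != NULL && *cap > 0);
    char *line = calloc(*cap, sizeof *line);
    if (line == NULL)
        return NULL;
    /* getline may realloc(): both line and *cap can change during this call */
    if (getline(&line, cap, stream) == -1) {
        free(line);
        return NULL;
    }
    return line;
}
```

With a starting capacity of 4 and a 100-character line, the returned capacity is at least 102 (100 characters, the newline, and the terminating null byte), and the returned pointer is the one to free, not the one calloc() originally produced.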

Solution 5

Why use char **lineptr instead of char *lineptr as a parameter of function getline?

char **lineptr is used because getline() asks for the address of the pointer that points to where the string will be stored.
You would use char *lineptr if getline() expected the pointer itself (which wouldn't work; see why in ThisSuitIsBlackNot's answer).

Why is it wrong when I use the following code:
char **my_string; bytes_read = getline(my_string, &nbytes, stdin);

Here my_string is uninitialized, so getline() dereferences a garbage address. The following would work:

char *my_string = NULL;
char **pointer_to_my_string = &my_string;
bytes_read = getline(pointer_to_my_string, &nbytes, stdin);

I am confused with * and &.

The * has a double meaning.
When used in a declaration of a pointer, e.g. a pointer to char, it means that you want a pointer to char instead of a char.
When used elsewhere, it gets the variable at which a pointer points.

The & gets the address in memory of a variable (the kind of value that pointers were created to hold).

char letter = 'c';
char *ptr_to_letter = &letter;
char letter2 = *ptr_to_letter;
char *ptr2 = &*ptr_to_letter; //ptr2 will also point to letter

&*ptr_to_letter means "give me the address (&) of the variable at which ptr_to_letter points (*)", and is the same as writing ptr_to_letter.
You can think of * as the opposite of &, and that they cancel each other.
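
A small sketch of both operators at work. set_through and roundtrip_is_identity are hypothetical helper names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: overwrites the char that ptr points at. */
void set_through(char *ptr, char value)
{
    assert(ptr != NULL);
    *ptr = value;          /* '*' in an expression: follow the pointer */
}

/* Returns 1 when &*ptr is the same address as ptr. */
int roundtrip_is_identity(char *ptr)
{
    return &*ptr == ptr;   /* '&' and '*' cancel each other */
}
```

Given char letter = 'c'; a call to set_through(&letter, 'd') changes letter itself, which is the same write-through-a-pointer step that getline performs on *lineptr.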

Author: ct586

Updated on March 10, 2020

Comments

  • ct586 about 4 years

    I use getline function to read a line from STDIN.

    The prototype of getline is:

    ssize_t getline(char **lineptr, size_t *n, FILE *stream);
    

    I use this test program, which I got from http://www.crasseux.com/books/ctutorial/getline.html#getline

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int main(int atgc, char *argv[])
    {
        int bytes_read = 1;
        int nbytes = 10;
        char *my_string;
    
        my_string = (char *)malloc(nbytes+1);
    
        puts("Please enter a line of text");
    
        bytes_read = getline(&my_string, &nbytes, stdin);
    
        if (bytes_read == -1)
        {
            puts ("ERROR!");
        }
        else
        {
            puts ("You typed:");
            puts (my_string);
        }
    
        return 0;
    }
    

    This works fine.

    My questions are:

    1. Why use char **lineptr instead of char *lineptr as a parameter of function getline?

    2. Why is it wrong when I use the following code:

      char **my_string;
      bytes_read = getline(my_string, &nbytes, stdin); 
      
    3. I am confused with * and &.

    Here is part of warnings:

    testGetline.c: In function ‘main’: 
    testGetline.c:34: warning: pointer targets in passing argument 2 of  
      ‘getline’ differ in signedness 
    /usr/include/stdio.h:671: 
      note: expected ‘size_t * __restrict__’ but argument is of type ‘int *’  
    testGetline.c:40: warning: passing argument 1 of ‘putchar’ makes integer 
      from pointer without a cast 
    /usr/include/stdio.h:582: note: expected ‘int’ but argument is of 
      type ‘char *’
    

    I use GCC version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5).

    • Lightness Races in Orbit about 13 years
      BTW you typo'd the declaration of bytes_read. And what on earth is "incertitude"?
    • ct586 about 13 years
      I use "incertitude" to mean "puzzles, misunderstanding parts". Sorry.
  • ct586 about 13 years
    Thanks for your explanation. I think if you pass a NULL pointer instead of a pointer to a null pointer, the getline() will realloc() memory for that pointer, and save the string in it. Does this work ok?
  • Admin almost 10 years
    +1. I'm a big fan of pre-allocating an arbitrarily large amount of storage, as modern computing platforms typically have massive amounts of RAM available, and techniques like this offer big gains in preventing endless hits to the heap manager, memory which would otherwise be largely wasted trying to cache disk or some other marginally productive use. With careful planning memory can be allocated high enough up in the call-tree to remove almost ALL calls to the heap manager. Putting the returned pointers on structs and passing the struct's pointer down the tree is optimal.
  • chux - Reinstate Monica about 8 years
    getline(&my_string, &nbytes, stdin); is incorrect as int nbytes is the wrong type.
  • chux - Reinstate Monica about 8 years
    (size_t *)&len is just wrong. Use the right type size_t for the declaration of len.
  • chux - Reinstate Monica about 8 years
    OP code has int nbytes = 10; ... bytes_read = getline(&my_string, &nbytes, stdin); An int and a size_t are not necessarily the same size, nor do they have the same alignment requirement, so casting with (size_t *)&len is undefined behavior. Example: int is 4 bytes and size_t is 8. Using (size_t *)&len will have getline() writing to memory it does not own.
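
    For reference, a sketch of the question's program with the types these comments ask for, assuming a POSIX system. echo_line is a hypothetical function name; it takes the streams as parameters instead of using stdin and puts() directly, so that it can be exercised without a terminal.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical rework of the question's program as a function.
   Returns the number of bytes read, or -1 on EOF/error. */
ssize_t echo_line(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    size_t nbytes = 0;        /* size_t, so &nbytes really is a size_t * */
    char *my_string = NULL;   /* NULL lets getline allocate the buffer */
    ssize_t bytes_read = getline(&my_string, &nbytes, in);

    if (bytes_read == -1)
        fputs("ERROR!\n", out);
    else
        fprintf(out, "You typed:\n%s", my_string);

    free(my_string);          /* needed whether or not getline succeeded */
    return bytes_read;
}
```

    Calling echo_line(stdin, stdout) from main gives roughly the behaviour of the original program, without the signedness warning on argument 2.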