Creating and Understanding linked lists of structs in C

29,697

Linked Lists - 101 - Singly-linked Lists

This is a long answer. The reason I have gone into so much detail is that there are a large number of linked-list questions that I hope to answer in on place, with proper context.

When I learned C, I had a hard time with pointers. However, after implementing a linked list, I finally started to grasp the concept of pointers. Master linked lists is a good thing in C, and it will help you be comfortable with pointers. When things seem confusing, grab a pencil and paper and sketch out a diagram of a list and the associated links to nodes. I occasionally do that when I am working with complex list implementations.

A linked list is a way to store data records. Unlike an array where all elements occupy one contiguous memory block of memory, linked list elements occupy random fragments of memory.

There are two basic types of linked list; a singly-linked list, and a doubly-linked list. The difference is that a singly-linked list can only be traversed in one direction; while a doubly-linked list can be traversed in both directions.

A singly-linked list is accessed via its "head" pointer, or a pointer to the head list node. A doubly-linked list may also be accessed by via its "head" pointer, or via its "tail" pointer.

Unlike an array where each element of the array can be directly addressed by its array index, linked list elements are accessed sequentially.

Here is a layout of a singly-linked list:

            Node #1      Node #2      Node #3      EndOfList
            ----------   ----------   --------     ---------
HEADPTR-->  NEXTPTR-->   NEXTPTR-->   NEXTPTR-->   NULL
            DataPayload  DataPayload  DataPayload

Each node in the list, with its data payload, is separately allocated. A node structure (in C) might look something like this:

    typedef struct NODE_PAYLOAD_S
       {
       /* Data Payload (defined by coder) */
       char name[10];
       char desc[10];
       int hours;
       int workordernum;
       } NODE_PAYLOAD_T;

    typedef struct LIST_NODE_S
       {
       /* Next-node pointer */
       struct LIST_NODE_S *next;     /* pointer to the next node in the list. */
       NODE_PAYLOAD_T      payload;  /* Data Payload (defined by coder) */
       } LIST_NODE_T;

To initialize a singly-linked list of the above structure:

LIST_NODE_T *listHead = NULL;

'listHead' is now a pointer to a linked list (with no nodes).

Here is how to add a new node at the head of the this list:

int LIST_InsertHeadNode(
      LIST_NODE_T **IO_head,

Q: Why is a "double-pointer" required here (ie: LIST_NODE_T **...)? Why not use a "single-level' pointer (ie: LIST_NODE_T *...)?

A: A "single" pointer to the list head would not be sufficient for this operation. Specifically, this operation designates a new "head node". This means that the this function needs to modify the pointer that points to the head node.

BEFORE:

                         Node #Y      Node #Z      EndOfList
                         ----------   ----------   ---------
             HEADPTR-->  NEXTPTR-->   NEXTPTR-->   NULL
                         DataPayload  DataPayload  

AFTER:

            New Node      Node #Y      Node #Z      EndOfList
            ----------   ----------   --------     ---------
HEADPTR-->  NEXTPTR-->   NEXTPTR-->   NEXTPTR-->   NULL
            DataPayload  DataPayload  DataPayload

Note that before, HEADPTR was pointing to 'Node #Y'; then after, HEADPTR is pointing at 'New node'. When this function is called, the address of the listHead pointer is passed in, allowing this function to change where the listHead pointer is pointing. In otherwords, the address of the listHead pointer is being passed into this function, which is represented (internally to this function) as a pointer to the listHead pointer (a pointer to a pointer). That is why it is a "double-pointer".


      char         *I__name,
      char         *I__desc,
      int           I__hours,
      int           I__workordernum
      )
   {
   int rCode=0;
   LIST_NODE_T *newNode = NULL;

   /* Allocate memory for new node (with its payload). */
   newNode=malloc(sizeof(*newNode));
   if(NULL == newNode)
      {
      rCode=ENOMEM;   /* ENOMEM is defined in errno.h */
      fprintf(stderr, "malloc() failed.\n");
      goto CLEANUP;

Q: What's this 'goto CLEANUP;' business?

A: The C language, unlike C++ and JAVA, has no concept of an 'exception'. Error checking in C is critical. There are a number of reasons that the malloc() function might fail, and if it does, the code has to handle it as gracefully as possible. The 'goto CLEANUP' statement causes the normal program flow to skip code jumping to the 'CLEANUP:' label (within this function, below).

Obviously, if malloc() has failed, it would be unwise to try to initialize the NULL pointer (returned by the failed malloc) with the lines that immediately follow. So, it is important to divert the flow of the program to skip this initialization (and the linkage that comes later).

There is nothing special about the 'CLEANUP:' label. I could have called it 'ERROR:', 'EXIT:', 'FINISH:', 'LIAHONA:', 'MY_BAD' or anything else that suited my pleasure. (Labels don't have to be uppercase, nor do they have to be placed at the left margin. However, my style is to do that so that they stand out.)

Labels, such as 'CLEANUP:' has a scope that is limited to the boundaries of the function where they are placed; which allows each function to have a unique 'CLEANUP:' label (if needed).


      }

   /* Initialize the new node's payload. */       
   snprintf(newNode->payload.name, sizeof(newNode->payload.name), "%s", I__name);
   snprintf(newNode->payload.desc, sizeof(newNode->payload.desc), "%s", I__desc);
   newNode->payload.hours = I__hours;
   newNode->payload.workordernum = I__workordernum;

   /* Link this node into the list as the new head node. */
   newNode->next = *IO_head;
   *IO_head = newNode;

CLEANUP:

   return(rCode);
   }

The above function might be called as follows:

#include <stdio.h>
#include <errno.h>

int LIST_InsertHeadNode(LIST_NODE_T **, char *, char *, int, int);

int main(void)
   {
   int rCode=0;
   LIST_NODE_T *listHead = NULL;

   rCode=LIST_InsertHeadNode(&listHead, "Mahonri", "Jareds Bro", 4, 2421);
   if(rCode)
      {
      fprintf(stderr, "LIST_InsertHeadNode() reports: %d\n", rCode);
      goto CLEANUP;
      }

CLEANUP:

   return(rCode);
   }

The LIST_InsertHeadNode() function may be called multiple times. Each call will add a new node to the list. The new node will be placed at the "head" of the list, which has the effect of pushing the rest of the nodes further down the list.

After adding several nodes to the list, it might be good to access the list; perhaps to print the payload of each node:

int PrintListPayloads(
      LIST_NODE_T *head;
      )
   {
   int rCode=0;
   LIST_NODE_T *cur = head
   int nodeCnt=0;

   while(cur)
      {
      ++nodeCnt;
      printf("%s, %s, %d, %d\n",
            cur->payload.name,
            cur->payload.desc,
            cur->payload.hours,
            cur->payload.workordernum
            );
       cur=cur->next;
       }

    printf("%d nodes printed.\n", nodeCnt);

   return(rCode);
   }

The above function can be called from main():

#include <stdio.h>
#include <errno.h>

int LIST_InsertHeadNode(LIST_NODE_T **, char *, char *, int, int);
int PrintListPayloads(LIST_NODE_T *);

int main(void)
   {
   int rCode=0;
   LIST_NODE_T *listHead = NULL;

   /* Insert a linked-list node. */
   rCode=LIST_InsertHeadNode(&listHead, "Mahonri", "Jareds Bro", 4, 2421);
   if(rCode)
      {
      fprintf(stderr, "LIST_InsertHeadNode() reports: %d\n", rCode);
      goto CLEANUP;
      }

   /* Insert a linked-list node. */
   rCode=LIST_InsertHeadNode(&listHead, "Joe", "CEO", 5, 2419);
   if(rCode)
      {
      fprintf(stderr, "LIST_InsertHeadNode() reports: %d\n", rCode);
      goto CLEANUP;
      }

   /* Insert a linked-list node. */
   rCode=LIST_InsertHeadNode(&listHead, "Eve", "Mother", 24, 2);
   if(rCode)
      {
      fprintf(stderr, "LIST_InsertHeadNode() reports: %d\n", rCode);
      goto CLEANUP;
      }

   rCode=PrintListPayloads(listHerad);
   if(rCode)
      {
      fprintf(stderr, "PrintListPayloads() reports: %d\n", rCode);
      goto CLEANUP;
      }

CLEANUP:

   return(rCode);
   }

Adding nodes at the head of a list [ie: LIST_InsertHeadNode()] is one way to add nodes. However, there are times where adding nodes to the other end of the list (ie: the list 'tail') is preferable. The code below shows how this is done.

First, a function that will return the current 'tail node' of a list.

   int LIST_GetTailNode(
         LIST_NODE_T  *I__listHead,   /* The caller supplied list head pointer. */
         LIST_NODE_T **_O_listTail    /* The function sets the callers pointer to the
                                         last node. */
         )
      {
      int rCode=0;
      LIST_NODE_T *curNode = I__listHead;

      /* Iterate through all list nodes until the last node is found. */
      /* The last node's 'next' field, which is always NULL. */
      if(curNode)
         {
         while(curNode->next)
            curNode=curNode->next;
         }

      /* Set the caller's pointer to point to the last (ie: tail) node. */
      if(_O_listTail)
         *_O_listTail = curNode;

      return(rCode);
      }

Next, a function that will insert a node at the tail of the list.

   int LIST_InsertTailNode(
      LIST_NODE_T **IO_head,
      char         *I__name,
      char         *I__desc,
      int           I__hours,
      int           I__workordernum
      )
   {
   int rCode=0;
   LIST_NODE_T *tailNode;
   LIST_NODE_T *newNode = NULL;

   /* Get a pointer to the last node in the list. */
   rCode=LIST_GetTailNode(*IO_head, &tailNode);
   if(rCode)
      {
      fprintf(stderr, "LIST_GetTailNode() reports: %d\n", rCode);
      goto CLEANUP;
      }

Important note: The LIST_GetTailNode() function will set the tailNode pointer to the last node in the linked list; -unless- there are no nodes in the list. When the list is empty, LIST_GetTailNode() will set the tailNode pointer to NULL.


   /* Allocate memory for new node (with its payload). */
   newNode=malloc(sizeof(*newNode));
   if(NULL == newNode)
      {
      rCode=ENOMEM;   /* ENOMEM is defined in errno.h */
      fprintf(stderr, "malloc() failed.\n");
      goto CLEANUP;
      }

   /* Initialize the new node's payload. */       
   snprintf(newNode->payload.name, sizeof(newNode->payload.name), "%s", I__name);
   snprintf(newNode->payload.desc, sizeof(newNode->payload.desc), "%s", I__desc);
   newNode->payload.hours = I__hours;
   newNode->payload.workordernum = I__workordernum;

   /* Link this node into the list as the new tail node. */
   newNode->next = NULL;
   if(tailNode)
      tailNode->next = newNode;
   else

This 'else' case indicates occurs when tailNode is NULL, meaning that (currently) the linked list has no nodes. In this case, this node will be the first (head) node in the list (as well as the last). So, the caller's 'List Head' pointer is updated to indicate that this new node is now the head node.

      *IO_head = newNode;

CLEANUP:

   return(rCode);
   }

The LIST_InsertTailNode() function is called the same way LIST_InsertHeadNode() is called. The only difference is that with LIST_InsertTailNode(), the new node is inserted at the list's tail, rather than at the list's head.

OK, so now you can insert a new node at the head, or tail of the list. What about inserting a new node somewhere in the middle of the list?

For example, assume that you want to have a list where all the nodes are sorted by some field in the payload like 'name'. While it is possible to add all the nodes, and then sort the list afterwords; it is much easier to insert each new node into the list in it's proper place. Doing that, the list will always be maintained in sorted order automatically.

Accomplishing this is done in two steps. First, allocate and initialize the new node. Then figure out where its proper place is in the list, then link the new node into the list at that location.

First, a function that will return what will be the 'parent node' to the new node. (This node assumes that the list is being maintained in sorted order by name):

int LIST_FetchParentNodeByName( 
      LIST_NODE_T *I__head,
      const char  *I__name,
      LIST_NODE_T **_O_parent
      )
   {
   int rCode=0;
   LIST_NODE_T *parent = NULL;
   LIST_NODE_T *curNode = I__head;

   /* Inform the caller of an 'empty list' condition. */
   if(NULL == I__head)
      {
      rCode=ENOENT;
      goto CLEANUP;
      }

   /* Find a node with a payload->name string greater than the I__name string */
   while(curNode)
      {
      if(strcmp(curNode->payload.name, I__name) > 0)
         break;

      parent = curNode; /* Remember this node. It is the parent of the next node. */
      curNode=curNode->next;  /* On to the next node. */
      }

   /* Set the caller's 'parent' pointer. */
   if(_O_parent)
      *_O_parent = parent;

CLEANUP:

   return(rCode);
   }

And now, a function that will insert new nodes, keeping the list sorted by name.

   int LIST_InsertNodeByName(
      LIST_NODE_T **IO_head,
      char         *I__name,
      char         *I__desc,
      int           I__hours,
      int           I__workordernum
      )
   {
   int rCode=0;
   LIST_NODE_T *parent;
   LIST_NODE_T *newNode = NULL;

   /* Allocate memory for new node (with its payload). */
   newNode=malloc(sizeof(*newNode));
   if(NULL == newNode)
      {
      rCode=ENOMEM;   /* ENOMEM is defined in errno.h */
      fprintf(stderr, "malloc() failed.\n");
      goto CLEANUP;
      }

   /* Initialize the new node's payload. */       
   snprintf(newNode->payload.name, sizeof(newNode->payload.name), "%s", I__name);
   snprintf(newNode->payload.desc, sizeof(newNode->payload.desc), "%s", I__desc);
   newNode->payload.hours = I__hours;
   newNode->payload.workordernum = I__workordernum;

   /* Find the proper place to link this node */
   rCode=LIST_FetchParentNodeByName(*IO_head, I__name, &parent);
   switch(rCode)
      {
      case 0:
         break;

      case ENOENT:
         /* Handle empty list condition */ 
         newNode->next = NULL;
         *IO_head = newNode;
         rCode=0;
         goto CLEANUP;

      default:
         fprintf(stderr, "LIST_FetchParentNodeByName() reports: %d\n", rCode);
         goto CLEANUP;
      }

Important note: The LIST_FetchParentNodeByName() function will set the parent pointer to the node in the list that is immediately less than the specified I__name; -unless- the head node is greater than the specified I__name. For this special case, LIST_FetchParentNodeByName() will set the parent pointer to NULL.


   /* Handle the case where all current list nodes are greater than the new node. */
   /* (Where the new node will become the new list head.) */
   if(NULL == parent)
      {
      newNode->next = *IO_head;
      *IO_head = newNode;
      goto CLEANUP;
      }

   /* Final case, insert the new node just after the parent node. */
   newNode->next = parent->next;
   parent->next = newNode;

CLEANUP:

   return(rCode);
   }

The LIST_InsertNodeByName() function is called the same way LIST_InsertHeadNode() or LIST_InsertTailNode() is called. The only difference is that with LIST_InsertNodeByName(), the new node is inserted into its sorted (by name) location in the list; rather than at the list head or tail.

There will be occasions when a node will have to be deleted from the list. This is done by locating the node to be deleted, unlinking the node from the list, and then deleting the node and it's payload.

First, a function to locate a specific node by the payload name field.

int LIST_FetchNodeByName( 
      LIST_NODE_T  *I__head,
      const char   *I__name,
      LIST_NODE_T **_O_node,
      LIST_NODE_T **_O_parent
      )
   {
   int rCode=0;
   LIST_NODE_T *parent = NULL;
   LIST_NODE_T *curNode = I__head;

   /* Search the list for a matching payload name. */
   while(curNode)
      {
      if(0 == strcmp(curNode->payload.name, I__name))
         break;

      parent = curNode;   /* Remember this node; it will be the parent of the next. */
      curNode=curNode->next;
      }

   /* If no match is found, inform the caller. */
   if(NULL == curNode)
     {
     rCode=ENOENT;
     goto CLEANUP;
     }

   /* Return the matching node to the caller. */
   if(_O_node)
      *_O_node = curNode;

   /* Return parent node to the caller. */
   if(_O_parent)
      *_O_parent = parent;

CLEANUP:

   return(rCode);
   }

Here is a function that will delete a node from the list that matches a specific payload name.

   int LIST_DeleteNodeByName(
      LIST_NODE_T **IO_head,
      char         *I__name
      )
   {
   int rCode=0;
   LIST_NODE_T *parent;
   LIST_NODE_T *delNode = NULL;

   /* Find the node to delete. */
   rCode=LIST_FetchNodeByName(*IO_head, I__name, &delNode, &parent); 
   switch(rCode)
      {
      case 0:
         break;

      case ENOENT:
         fprintf(stderr, "Matching node not found.\n");
         goto CLEANUP;

      default:
         fprintf(stderr, "LIST_FetchNodeByName() reports: %d\n", rCode);
         goto CLEANUP;
      }

Important note: The LIST_FetchNodeByName() function will set the parent pointer of the delNode; -unless- the the delNode is the head node. For this special case, LIST_FetchNodeByName() will set the parent pointer to NULL.


   /* Unlink the delNode from the list. */
   if(NULL == parent)
      *IO_head = delNode->next;
   else
      parent->next = delNode->next;

   /* Free the delNode and its payload. */
   free(delNode);

CLEANUP:

   return(rCode);
   }       

NOTE: All code above has been tested and should be functional, and can be downloaded as: 23279119_List_101.c

(To be continued- as per request...)

Share:
29,697
user3570478
Author by

user3570478

Updated on April 02, 2021

Comments

  • user3570478
    user3570478 about 3 years

    I am having trouble grasping the concepts of struct and the linked list data structure together. For example lets say we have have this code: a struct that has a worker's content and a linked list of these structs that contains nodes of each worker and a pointer to the next node(?).

        typedef struct Schedule {
            char name[10];
            char description[10];
            int hours;
            int workordernum;
        } Work;
    
        typedef struct linkedlist {
            struct Schedule work;
            struct linkedlist *next;
        } Node;
    

    The problem is how do you make a method that always adds a node in the beginning of the list, a method that adds it anywhere(middle) in the list using user defined workordernum, and a method that always puts it in the end.

    I am not quite understanding -> and * usages properly. I did read online about creating head and tail nodes but I didn't quite get its usage properly, since they had a struct for a list and a struct for a node.

    Another thing which I didn't get is, lets say adding a node to the beginning of the list works, how do you go about then changing every single workordernum value for all the nodes that were previously there?

    I understand one must keep track of every time a node gets added, removed, or moved, meaning every time these methods are called we must have a variable that keeps track of the number. So if we have a node in the list all ready, its order would be one, then we add another one to the beginning, how would we change the order number 1 to 2 and the one being added to 1?

    Or how does node->next->next->next work if we only have one pointer? Then how would we print all of them? Since we cannot use a for loop.

    So these are the concepts I cant grasp code wise. I would KINDLY appreciate it if you take your time to explain it rather than just give the code, if possible. Because I would have to apply what I learn to moving around and deleting nodes as well. I want to learn it on my own. If something must be given as a code example then that's fine but please just don't post all the answer codes for me.

    -Thank You

    *Please forgive any format errors since I am new to this site.

    Edit: I do understand a pointer is an address and that -> pertains to "pointing to" a member. I mean I understand all the basics, but my understanding isn't firm enough else I could be doing what I need help with.

    Edit2: I will try and make a head node with linked list from what I have learned so far. I will be using the structs above and it will be loose code, not perfect. This is just to make sure I'm on the right track so far.

    int main() {
    
       // creates a work struct to hold user content
       Work *workstruct = (Work*)malloc((sizeof(struct Schedule));
    
       // creates first node to hold a work struct for linkedlist
       Node *initialNode = (Node*)malloc((sizeof(struct linkedlist));
    
       // Method call to add work nodes to list in main
       addWork(initialNode, workstruct);
    
    }
    
    void addWork(Node *initialNode, Work *workstruct) {
    
       // Suppose user already initialized workstruct
    
       // create double-pointer to make initialNode as head
       Node **head = (Node **)malloc(sizeof(struct linkedlist));
    
       // assigns user's workstruct to the workstruct of initialNode
       initialNode->work = *workstruct;
    
       // If im understanding what you have taught me so far,
       // this is how you always add a node on the head
    
       initialNode->next = *head;
       *head = initialNode;
    }
    

    The only issue I seem to have so far is that every time I try and add a new node to the list, it makes the new node the head but loses the previous node that was in the list.

  • Mahonri Moriancumer
    Mahonri Moriancumer about 10 years
    @PaulGriffiths Understood. Thank you for your down-vote restraint. The structures in the question have several problems that have to be ironed out; I will not be using the raw structures. I am interested in teaching, that is my motive.
  • user3570478
    user3570478 about 10 years
    @Mahonri Yes please continue teaching. I didnt mean trouble. I cant understand the complexity of the code structure you have with double pointers and having an int List and such. You dont need to use my structures but what I meant with making the code simpler was with teaching standards of the material itself. I mean, if I dont fully understand the notion of linked lists how can I understand half of the structure of your code your trying to teach me? I stand by my words, I TRULY want to learn. Once again, im sorry for causing your rep trouble. I appreciate dearly that your trying to teach me.
  • Mahonri Moriancumer
    Mahonri Moriancumer about 10 years
    @user3570478, Before I continue; there are two things. 1) I would like to know if there are parts of my explanation that confuse you... Perhaps I can clarify. 2) I headed home and will be off-line for a about 20 min or so...
  • user3570478
    user3570478 about 10 years
    To start off first, why must there be a double pointer? Can it not work without one?
  • user3570478
    user3570478 about 10 years
    Also how does cleanup work when adding a node? I have never seen that being used.
  • Mahonri Moriancumer
    Mahonri Moriancumer about 10 years
    @user3570478, points answered (see the explanations in the answer above).
  • user3570478
    user3570478 about 10 years
    Is it okay if I present you an example of what I have learned so far and you let me know if im doing it correctly or not?
  • user3570478
    user3570478 about 10 years
    I wish we could get in contact on skype it would be alot easier :( Okay so i guess I will post the code above.
  • Mustafa
    Mustafa about 8 years
    this is a very good explanation. the link is dead by the way.
  • Navoneel Talukdar
    Navoneel Talukdar almost 8 years
    awesome explanation.