I am having trouble grasping the concepts of struct
and the linked list data structure together. For example lets say we have have this code: a struct
Linked Lists - 101 - Singly-linked Lists
This is a long answer. The reason I have gone into so much detail is that there are a large number of linked-list questions that I hope to answer in on place, with proper context.
When I learned C, I had a hard time with pointers. However, after implementing a linked list, I finally started to grasp the concept of pointers. Master linked lists is a good thing in C, and it will help you be comfortable with pointers. When things seem confusing, grab a pencil and paper and sketch out a diagram of a list and the associated links to nodes. I occasionally do that when I am working with complex list implementations.
A linked list is a way to store data records. Unlike an array where all elements occupy one contiguous memory block of memory, linked list elements occupy random fragments of memory.
There are two basic types of linked list; a singly-linked list, and a doubly-linked list. The difference is that a singly-linked list can only be traversed in one direction; while a doubly-linked list can be traversed in both directions.
A singly-linked list is accessed via its "head" pointer, or a pointer to the head list node. A doubly-linked list may also be accessed by via its "head" pointer, or via its "tail" pointer.
Unlike an array where each element of the array can be directly addressed by its array index, linked list elements are accessed sequentially.
Here is a layout of a singly-linked list:
Node #1 Node #2 Node #3 EndOfList
---------- ---------- -------- ---------
HEADPTR--> NEXTPTR--> NEXTPTR--> NEXTPTR--> NULL
DataPayload DataPayload DataPayload
Each node in the list, with its data payload, is separately allocated. A node structure (in C) might look something like this:
typedef struct NODE_PAYLOAD_S
{
/* Data Payload (defined by coder) */
char name[10];
char desc[10];
int hours;
int workordernum;
} NODE_PAYLOAD_T;
typedef struct LIST_NODE_S
{
/* Next-node pointer */
struct LIST_NODE_S *next; /* pointer to the next node in the list. */
NODE_PAYLOAD_T payload; /* Data Payload (defined by coder) */
} LIST_NODE_T;
To initialize a singly-linked list of the above structure:
LIST_NODE_T *listHead = NULL;
'listHead' is now a pointer to a linked list (with no nodes).
Here is how to add a new node at the head of the this list:
int LIST_InsertHeadNode(
LIST_NODE_T **IO_head,
Q: Why is a "double-pointer" required here (ie: LIST_NODE_T **...)? Why not use a "single-level' pointer (ie: LIST_NODE_T *...)?
A: A "single" pointer to the list head would not be sufficient for this operation. Specifically, this operation designates a new "head node". This means that the this function needs to modify the pointer that points to the head node.
BEFORE:
Node #Y Node #Z EndOfList
---------- ---------- ---------
HEADPTR--> NEXTPTR--> NEXTPTR--> NULL
DataPayload DataPayload
AFTER:
New Node Node #Y Node #Z EndOfList
---------- ---------- -------- ---------
HEADPTR--> NEXTPTR--> NEXTPTR--> NEXTPTR--> NULL
DataPayload DataPayload DataPayload
Note that before, HEADPTR was pointing to 'Node #Y'; then after, HEADPTR is pointing at 'New node'. When this function is called, the address of the listHead pointer is passed in, allowing this function to change where the listHead pointer is pointing. In otherwords, the address of the listHead pointer is being passed into this function, which is represented (internally to this function) as a pointer to the listHead pointer (a pointer to a pointer). That is why it is a "double-pointer".
char *I__name,
char *I__desc,
int I__hours,
int I__workordernum
)
{
int rCode=0;
LIST_NODE_T *newNode = NULL;
/* Allocate memory for new node (with its payload). */
newNode=malloc(sizeof(*newNode));
if(NULL == newNode)
{
rCode=ENOMEM; /* ENOMEM is defined in errno.h */
fprintf(stderr, "malloc() failed.\n");
goto CLEANUP;
Q: What's this 'goto CLEANUP;' business?
A: The C language, unlike C++ and JAVA, has no concept of an 'exception'. Error checking in C is critical. There are a number of reasons that the malloc() function might fail, and if it does, the code has to handle it as gracefully as possible. The 'goto CLEANUP' statement causes the normal program flow to skip code jumping to the 'CLEANUP:' label (within this function, below).
Obviously, if malloc() has failed, it would be unwise to try to initialize the NULL pointer (returned by the failed malloc) with the lines that immediately follow. So, it is important to divert the flow of the program to skip this initialization (and the linkage that comes later).
There is nothing special about the 'CLEANUP:' label. I could have called it 'ERROR:', 'EXIT:', 'FINISH:', 'LIAHONA:', 'MY_BAD' or anything else that suited my pleasure. (Labels don't have to be uppercase, nor do they have to be placed at the left margin. However, my style is to do that so that they stand out.)
Labels, such as 'CLEANUP:' has a scope that is limited to the boundaries of the function where they are placed; which allows each function to have a unique 'CLEANUP:' label (if needed).
}
/* Initialize the new node's payload. */
snprintf(newNode->payload.name, sizeof(newNode->payload.name), "%s", I__name);
snprintf(newNode->payload.desc, sizeof(newNode->payload.desc), "%s", I__desc);
newNode->payload.hours = I__hours;
newNode->payload.workordernum = I__workordernum;
/* Link this node into the list as the new head node. */
newNode->next = *IO_head;
*IO_head = newNode;
CLEANUP:
return(rCode);
}
The above function might be called as follows:
#include <stdio.h>
#include <errno.h>
int LIST_InsertHeadNode(LIST_NODE_T **, char *, char *, int, int);
int main(void)
{
int rCode=0;
LIST_NODE_T *listHead = NULL;
rCode=LIST_InsertHeadNode(&listHead, "Mahonri", "Jareds Bro", 4, 2421);
if(rCode)
{
fprintf(stderr, "LIST_InsertHeadNode() reports: %d\n", rCode);
goto CLEANUP;
}
CLEANUP:
return(rCode);
}
The LIST_InsertHeadNode() function may be called multiple times. Each call will add a new node to the list. The new node will be placed at the "head" of the list, which has the effect of pushing the rest of the nodes further down the list.
After adding several nodes to the list, it might be good to access the list; perhaps to print the payload of each node:
int PrintListPayloads(
LIST_NODE_T *head;
)
{
int rCode=0;
LIST_NODE_T *cur = head
int nodeCnt=0;
while(cur)
{
++nodeCnt;
printf("%s, %s, %d, %d\n",
cur->payload.name,
cur->payload.desc,
cur->payload.hours,
cur->payload.workordernum
);
cur=cur->next;
}
printf("%d nodes printed.\n", nodeCnt);
return(rCode);
}
The above function can be called from main():
#include <stdio.h>
#include <errno.h>
int LIST_InsertHeadNode(LIST_NODE_T **, char *, char *, int, int);
int PrintListPayloads(LIST_NODE_T *);
int main(void)
{
int rCode=0;
LIST_NODE_T *listHead = NULL;
/* Insert a linked-list node. */
rCode=LIST_InsertHeadNode(&listHead, "Mahonri", "Jareds Bro", 4, 2421);
if(rCode)
{
fprintf(stderr, "LIST_InsertHeadNode() reports: %d\n", rCode);
goto CLEANUP;
}
/* Insert a linked-list node. */
rCode=LIST_InsertHeadNode(&listHead, "Joe", "CEO", 5, 2419);
if(rCode)
{
fprintf(stderr, "LIST_InsertHeadNode() reports: %d\n", rCode);
goto CLEANUP;
}
/* Insert a linked-list node. */
rCode=LIST_InsertHeadNode(&listHead, "Eve", "Mother", 24, 2);
if(rCode)
{
fprintf(stderr, "LIST_InsertHeadNode() reports: %d\n", rCode);
goto CLEANUP;
}
rCode=PrintListPayloads(listHerad);
if(rCode)
{
fprintf(stderr, "PrintListPayloads() reports: %d\n", rCode);
goto CLEANUP;
}
CLEANUP:
return(rCode);
}
Adding nodes at the head of a list [ie: LIST_InsertHeadNode()] is one way to add nodes. However, there are times where adding nodes to the other end of the list (ie: the list 'tail') is preferable. The code below shows how this is done.
First, a function that will return the current 'tail node' of a list.
int LIST_GetTailNode(
LIST_NODE_T *I__listHead, /* The caller supplied list head pointer. */
LIST_NODE_T **_O_listTail /* The function sets the callers pointer to the
last node. */
)
{
int rCode=0;
LIST_NODE_T *curNode = I__listHead;
/* Iterate through all list nodes until the last node is found. */
/* The last node's 'next' field, which is always NULL. */
if(curNode)
{
while(curNode->next)
curNode=curNode->next;
}
/* Set the caller's pointer to point to the last (ie: tail) node. */
if(_O_listTail)
*_O_listTail = curNode;
return(rCode);
}
Next, a function that will insert a node at the tail of the list.
int LIST_InsertTailNode(
LIST_NODE_T **IO_head,
char *I__name,
char *I__desc,
int I__hours,
int I__workordernum
)
{
int rCode=0;
LIST_NODE_T *tailNode;
LIST_NODE_T *newNode = NULL;
/* Get a pointer to the last node in the list. */
rCode=LIST_GetTailNode(*IO_head, &tailNode);
if(rCode)
{
fprintf(stderr, "LIST_GetTailNode() reports: %d\n", rCode);
goto CLEANUP;
}
Important note: The LIST_GetTailNode() function will set the tailNode pointer to the last node in the linked list; -unless- there are no nodes in the list. When the list is empty, LIST_GetTailNode() will set the tailNode pointer to NULL.
/* Allocate memory for new node (with its payload). */
newNode=malloc(sizeof(*newNode));
if(NULL == newNode)
{
rCode=ENOMEM; /* ENOMEM is defined in errno.h */
fprintf(stderr, "malloc() failed.\n");
goto CLEANUP;
}
/* Initialize the new node's payload. */
snprintf(newNode->payload.name, sizeof(newNode->payload.name), "%s", I__name);
snprintf(newNode->payload.desc, sizeof(newNode->payload.desc), "%s", I__desc);
newNode->payload.hours = I__hours;
newNode->payload.workordernum = I__workordernum;
/* Link this node into the list as the new tail node. */
newNode->next = NULL;
if(tailNode)
tailNode->next = newNode;
else
This 'else' case indicates occurs when tailNode is NULL, meaning that (currently) the linked list has no nodes. In this case, this node will be the first (head) node in the list (as well as the last). So, the caller's 'List Head' pointer is updated to indicate that this new node is now the head node.
*IO_head = newNode;
CLEANUP:
return(rCode);
}
The LIST_InsertTailNode() function is called the same way LIST_InsertHeadNode() is called. The only difference is that with LIST_InsertTailNode(), the new node is inserted at the list's tail, rather than at the list's head.
OK, so now you can insert a new node at the head, or tail of the list. What about inserting a new node somewhere in the middle of the list?
For example, assume that you want to have a list where all the nodes are sorted by some field in the payload like 'name'. While it is possible to add all the nodes, and then sort the list afterwords; it is much easier to insert each new node into the list in it's proper place. Doing that, the list will always be maintained in sorted order automatically.
Accomplishing this is done in two steps. First, allocate and initialize the new node. Then figure out where its proper place is in the list, then link the new node into the list at that location.
First, a function that will return what will be the 'parent node' to the new node. (This node assumes that the list is being maintained in sorted order by name):
int LIST_FetchParentNodeByName(
LIST_NODE_T *I__head,
const char *I__name,
LIST_NODE_T **_O_parent
)
{
int rCode=0;
LIST_NODE_T *parent = NULL;
LIST_NODE_T *curNode = I__head;
/* Inform the caller of an 'empty list' condition. */
if(NULL == I__head)
{
rCode=ENOENT;
goto CLEANUP;
}
/* Find a node with a payload->name string greater than the I__name string */
while(curNode)
{
if(strcmp(curNode->payload.name, I__name) > 0)
break;
parent = curNode; /* Remember this node. It is the parent of the next node. */
curNode=curNode->next; /* On to the next node. */
}
/* Set the caller's 'parent' pointer. */
if(_O_parent)
*_O_parent = parent;
CLEANUP:
return(rCode);
}
And now, a function that will insert new nodes, keeping the list sorted by name.
int LIST_InsertNodeByName(
LIST_NODE_T **IO_head,
char *I__name,
char *I__desc,
int I__hours,
int I__workordernum
)
{
int rCode=0;
LIST_NODE_T *parent;
LIST_NODE_T *newNode = NULL;
/* Allocate memory for new node (with its payload). */
newNode=malloc(sizeof(*newNode));
if(NULL == newNode)
{
rCode=ENOMEM; /* ENOMEM is defined in errno.h */
fprintf(stderr, "malloc() failed.\n");
goto CLEANUP;
}
/* Initialize the new node's payload. */
snprintf(newNode->payload.name, sizeof(newNode->payload.name), "%s", I__name);
snprintf(newNode->payload.desc, sizeof(newNode->payload.desc), "%s", I__desc);
newNode->payload.hours = I__hours;
newNode->payload.workordernum = I__workordernum;
/* Find the proper place to link this node */
rCode=LIST_FetchParentNodeByName(*IO_head, I__name, &parent);
switch(rCode)
{
case 0:
break;
case ENOENT:
/* Handle empty list condition */
newNode->next = NULL;
*IO_head = newNode;
rCode=0;
goto CLEANUP;
default:
fprintf(stderr, "LIST_FetchParentNodeByName() reports: %d\n", rCode);
goto CLEANUP;
}
Important note: The LIST_FetchParentNodeByName() function will set the parent pointer to the node in the list that is immediately less than the specified I__name; -unless- the head node is greater than the specified I__name. For this special case, LIST_FetchParentNodeByName() will set the parent pointer to NULL.
/* Handle the case where all current list nodes are greater than the new node. */
/* (Where the new node will become the new list head.) */
if(NULL == parent)
{
newNode->next = *IO_head;
*IO_head = newNode;
goto CLEANUP;
}
/* Final case, insert the new node just after the parent node. */
newNode->next = parent->next;
parent->next = newNode;
CLEANUP:
return(rCode);
}
The LIST_InsertNodeByName() function is called the same way LIST_InsertHeadNode() or LIST_InsertTailNode() is called. The only difference is that with LIST_InsertNodeByName(), the new node is inserted into its sorted (by name) location in the list; rather than at the list head or tail.
There will be occasions when a node will have to be deleted from the list. This is done by locating the node to be deleted, unlinking the node from the list, and then deleting the node and it's payload.
First, a function to locate a specific node by the payload name field.
int LIST_FetchNodeByName(
LIST_NODE_T *I__head,
const char *I__name,
LIST_NODE_T **_O_node,
LIST_NODE_T **_O_parent
)
{
int rCode=0;
LIST_NODE_T *parent = NULL;
LIST_NODE_T *curNode = I__head;
/* Search the list for a matching payload name. */
while(curNode)
{
if(0 == strcmp(curNode->payload.name, I__name))
break;
parent = curNode; /* Remember this node; it will be the parent of the next. */
curNode=curNode->next;
}
/* If no match is found, inform the caller. */
if(NULL == curNode)
{
rCode=ENOENT;
goto CLEANUP;
}
/* Return the matching node to the caller. */
if(_O_node)
*_O_node = curNode;
/* Return parent node to the caller. */
if(_O_parent)
*_O_parent = parent;
CLEANUP:
return(rCode);
}
Here is a function that will delete a node from the list that matches a specific payload name.
int LIST_DeleteNodeByName(
LIST_NODE_T **IO_head,
char *I__name
)
{
int rCode=0;
LIST_NODE_T *parent;
LIST_NODE_T *delNode = NULL;
/* Find the node to delete. */
rCode=LIST_FetchNodeByName(*IO_head, I__name, &delNode, &parent);
switch(rCode)
{
case 0:
break;
case ENOENT:
fprintf(stderr, "Matching node not found.\n");
goto CLEANUP;
default:
fprintf(stderr, "LIST_FetchNodeByName() reports: %d\n", rCode);
goto CLEANUP;
}
Important note: The LIST_FetchNodeByName() function will set the parent pointer of the delNode; -unless- the the delNode is the head node. For this special case, LIST_FetchNodeByName() will set the parent pointer to NULL.
/* Unlink the delNode from the list. */
if(NULL == parent)
*IO_head = delNode->next;
else
parent->next = delNode->next;
/* Free the delNode and its payload. */
free(delNode);
CLEANUP:
return(rCode);
}
NOTE: All code above has been tested and should be functional, and can be downloaded as: 23279119_List_101.c
(To be continued- as per request...)