MathJax 3

Wednesday, 21 January 2015

Linked lists in C: memory usage and differences with arrays.



The linked list is a type of data structure which doesn't requires the programmer to know at compile time the exact size of the data he needs to store. Another peculiarity of the linked list is the dynamic nature of its size, which can be modified during the program execution. The elements of a linked list can be added to or removed from both the extremities, making this data structure an ideal storage structure for both stack and queue. This dynamical behavior results in the non-contiguity of the memory addresses to which the linked list's elements are stored at. Differently from the array, which is made of a unique block of contiguous memory, in the linked list the elements forming its building blocks are stored at random locations.
To better understand the difference, let's see how arrays store their data:

#include <stdlib.h>
#include <stdio.h>
int main() 
{
   int myarray[80];
   int i;
   for ( i=0; i<80; i++ )
      printf("%p\n",myarray+i);     

   return 0;
}

The above program prints the memory address for each element of myarray.
Compiling and running the code above will generate the following output:

0xbffca380
0xbffca384
0xbffca388
0xbffca38c
0xbffca390
0xbffca394
0xbffca398
0xbffca39c
0xbffca3a0
0xbffca3a4

...               


Here are only reported the first 10 line of the actually 80 printed by the program. As we can see, the address of each element of myarray, increments by 4 on each iteration. This means that the cells of memory used to store myarray  elements are contiguous. Thus, myarray elements can be accessed directly and randomly, using an index. If you run the program few times, the addresses will change each time because the program is run from the RAM of your PC. Nevertheless, they will always be contiguous and equally spaced ( the space is determined by the array's data type ).

That brings us to the next difference between array and linked list:
while the array elements can be accessed randomly, by using the [] operator,
accessing the linked list elements requires writing a loop which traverses all the elements of the list, until the desired one is reached. We shall see later the pros and cons of using a linked list instead of an array. For now we just want to understand how differently memory is allocated in a linked list, compared to an array. We will see this by rewriting the previous program using a linked list. Linked list are used exclusively with structure, so as first thing we define a structure representing an element of the list:

/* begin code */
#include <stdio.h>
#include <stdlib.h>

typedef struct item
{
  int data;
  struct item * next;
}item;


this structure defines the most basic elements a linked list can be made of: the data stored by the element and a pointer which is used to link the element to the next one in the list.

int main()
{
  item * head, *node;
  head = node;
  node = malloc(sizeof(item));
  node->data = 2;


In the previous part of the code we declare two pointers, head and node. We use malloc() to allocate the memory for the first element of the list. node points to this element and head, in turn, points to node; we should notice that a piece of memory allocated with malloc() it's always accessible by the program, until we call free(), even if we use malloc() inside a function. This is important to understand because in professional written programs, a linked list is generally created and manipulated through functions, and that it's possible thanks to the particular way in which malloc() allocates memory (on the heap).

  node->next = malloc(sizeof(item));
  node->next->data = 3;

  node->next->next = NULL;

As we said before,  an element of the linked list has (at least) one link to the next element. Here, we allocate memory for  the second element while using the pointer next, contained in the first element, to point at it, so that they link together. Now, let's say stop, we don't need any more elements on this side of the list. To mark the end of a linked list, we nullify the last element, which in this case is represented by the pointer node->next->next.

  item * tmp =  malloc(sizeof(item)); 
  tmp->data = 1;


  head = tmp;
  tmp->next = node;


In the code above we see the power and flexibility of linked lists. Until now, the element of the list have been added in a contiguous fashion, just like an array, with the difference that for arrays, memory is allocated at compile time. Above we add an element at the beginning of the list, to the opposite side from where we added the second element. This is done by allocating memory for an new element with malloc() and using tmp to point to that memory. Then we copy tmp to head ( which will point always to the first element of the list ) . We finish by linking this new added element to the list, by copying node to tmp->next.

What we do next it's entirely up to our needs. We can keep appending elements to any of the two sides of our list.
To finish the program, we see how to traverse the list, for printing out the data stored into it. To achieve this, we use the head pointer and a while loop:

  while(head!=NULL)
    {
      printf( "address %p data %d\n", &head->next, head->data );
      head = head->next;
    }


  free(tmp);           //frees node containing data=1  
  tmp = NULL;

  free(node);          //frees node containing data=2
  node = NULL;

  free(node->next);    //frees node containing data=3
  node->next = NULL;

 
return 0;
}/* end code */

The following is the output of the program:

address 0x934d02c value 1
address 0x934d00c value 2
address 0x934d01c value 3
 
As you can see the address of the elements doesn't follow the order in which the elements are printed out by the program, or to be more precise, the order in which they are placed inside the list.

The obvious advantage of using linked lists is the fact that their size can be modified during the execution of the program. This property is empowered even more by the possibility to expand the linked lists in any direction and on any of their elements. Though, this comports using some extra time and memory. Extra time, which we need for scanning when we must access an element that isn't the next during a list traversal. Extra memory, needed to maintain one or more pointers ( depending on the type and implementation of a linked list ) to link and navigate the list.
To exploit the possibilities offered by this kind of data structure, they should be used in all those situations where the size of data is unknown before the program is executed or during it's execution, like in real time computing. For example, compilers or interpreters construction represent one of those cases where linked lists shine out, compared to arrays. In the next post, we will see how to implement a linked list.

    
Graphic representation of a linked list
(en.wikipedia.org)

No comments:

Post a Comment