Singly linked list (C Plus Plus)

From LiteratePrograms
Jump to: navigation, search
Other implementations: C | C++ | Java | Python

This is a straightforward implementation of a singly linked list in C++. It does not include performance features that a more complex implementation might.

The standard library in C++ comes with a std::list class which provides a doubly linked list which is more efficient in some operations, but uses more memory per node. It should generally be preferred except in specialized circumstances.

The following code shows how to design container classes that are fully compatible with the STL. Our operation set, including iteration, push_back(), push_front(), insert_after(), erase_after(), pop_front(), and size(), can be used to implement data structures including stacks and queues.

Contents

[edit] Internal representation

The list consists of a sequence of nodes. A node structure contains an element of type T and a pointer to the next node.

<<node structure>>=
struct node
{
  node(T val)
    : next(NULL), value(val)
  {}

  node* next;
  T     value;
};

A constructor is required, otherwise we could not generate node structures for types T that do not have a default constructor.

We need to know where our list starts, so our list needs a pointer to the head node of the list. We also need to keep track of the last node to facilitate rapid insertion at the end of the list (push_back()).

<<list private data>>=
node* _head;
node* _tail;

The head and tail pointers are initialized with NULL to indicate an empty list:

<<member initialisation>>=
_head(NULL), _tail(NULL) 

We also keep track of the number of elements in the list to permit a constant-time size() operation, which would otherwise take linear time:

<<list private data>>=
int _size;

Initially, the size will be 0:

<<member initialisation>>=
, _size(0)

[edit] Iterators

Iterators, objects that reference a particular element of a container, are an important concept in the STL library. Supporting iterators allow C++'s generic algorithms to work on your own data structures, provided that your iterator supports the necessary traits (discussed later).

[edit] Types

We'll provide a const_iterator, which does not allow the referenced element to be modified through it, as well as an iterator, which does allow this.

We will first write the const_iterator and then derive iterator from it. Although this this doesn't seem like an "is-a" relationship ("an iterator is a constant iterator" sounds suspicious), it is consistent with how C++ treats pointers and const. It's okay to take a pointer to a non-const object and assign it to a pointer to a const object, since this just further restricts how the pointer can be used:

int j = 42;
int* i = &j;
int const* ci  = i;   // This is correct
int k = *ci;          // Can't do anything with ci we can't do with i

But the reverse assignment is not okay, because it's not typesafe, allowing operations that before were forbidden:

const int j = 42;
int const* ci = &k;
i = ci2;              // This causes a compiler error
*i = 6;               // Because otherwise this violates constness of j

Our design ensures that our iterators behave as expected: you can assign an iterator to a const_iterator but not remove the constness of the const_iterator by assigning it to an iterator. So we have this hierarchy:

<<iterators>>=
class const_iterator
  : public const_iterator baseclass
{
  public:
    const_iterator interface
  const_iterator state
};

class iterator
  : public const_iterator
{
  public:
    iterator
};

[edit] Traits

Another concept required to be fully compatible with STL are iterator traits. They help algorithms determine what operations an iterator supports. Some standard algorithms can use iterator traits to switch to a more efficient algorithm at compile-time when supplied with more powerful iterators. std::rotate() is an example of such a function.

Here are the traits for the const_iterator:

<<const_iterator interface>>=
typedef iterator category iterator_category;
typedef T         value_type;
typedef ptrdiff_t difference_type;
typedef T const*  pointer;
typedef T const&  reference;

Next we choose an iterator category. Random access and bidirectional movement are not efficient in a singly linked list, so we choose a simple forward iterator that can store and retrieve values:

<<iterator category>>=
std::forward_iterator_tag

Finally, our iterator must be derived from std::iterator:

<<const_iterator baseclass>>=
std::iterator<std::forward_iterator_tag, T, size_t, T const*, T const&>

[edit] State

In our iterator, we store a pointer to the current node:

<<const_iterator state>>=
protected:
  node* _curpos;

If the iterator is pointing one element past the last element, _curpos will contain NULL. Construction is straightforward:

<<const_iterator interface>>=
const_iterator(node* cur_node = NULL)
  : _curpos(cur_node)
{}

This iterator representation has the benefit that it allows us to insert and remove elements as we please without invalidating any iterators (other than one referencing the node being removed of course), which is ideal.

Unfortunately, this choice of iterator state also limits the available operations: the standard insert() operation of std::list inserts a node before the current node, and the standard erase() operation removes the current node. In either case, we require access to the previous node to do the same in a singly-linked list. We can store a pointer to this node in the iterator state, but this leads to a more complex implementation with unexpected invalidation of iterators on insertion and deletion. Instead, we supply only insert_after() and erase_after() operations, which are efficient given only the current node.

[edit] Implementation

The remaining functions to do are element access and increment. These are trivial to write.

<<const_iterator interface>>=
reference operator*()
{
  return _curpos->value;
}

pointer operator ->()
{
  return &(**this);
}

// The lack of arguments indicates that this is prefix ++
const_iterator& operator++()
{
  _curpos = _curpos->next;
  return *this;
}

// The dummy argument indicates that this is the postfix ++
const_iterator operator++(int)
{
  const_iterator old_it = *this;
  ++(*this);
  return old_it;
}

bool operator ==(const_iterator const& rhs)
{
  return _curpos == rhs._curpos;
}

bool operator !=(const_iterator const& rhs)
{
  return !(*this == rhs);
}

The implementation of the subclass iterator is identical except that we change constant-referencing types to non-constant-referencing types:

<<iterator>>=
iterator(node* iter_node = NULL)
  : const_iterator(iter_node)
{}

typedef std::forward_iterator_tag iterator_category;
typedef T      value_type;
typedef size_t difference_type;
typedef T*     pointer;
typedef T&     reference;

using const_iterator::_curpos;

reference operator*()
{
  return _curpos->value;
}

pointer operator ->()
{
  return &(**this);
}

iterator& operator++()
{
  _curpos = _curpos->next;
  return *this;
}

iterator operator++(int)
{
  iterator old_it = *this;
  ++(*this);
  return old_it;
}

[edit] Implementation

[edit] Iteration

Now that we've got iterators, we should give the user of the class access to them. The standard functions for these are:

<<public interface>>=
iterator begin();
iterator end();
const_iterator begin() const;
const_iterator end() const;

For begin() we just use the list's head pointer. For end(), we use NULL, which was reserved for this purpose.

<<list implementation>>=
template <typename T>
inline typename list<T>::iterator list<T>::begin()
{
    return iterator(_head);
}

template <typename T>
inline typename list<T>::iterator list<T>::end()
{
    return iterator(NULL);
}

The const versions are similar:

<<list implementation>>=
template <typename T>
inline typename list<T>::const_iterator list<T>::begin() const
{
    return const_iterator(_head);
}

template <typename T>
inline typename list<T>::const_iterator list<T>::end() const
{
    return const_iterator(NULL);
}

[edit] Insertion

We'll provide three methods that insert elements into lists:

<<public interface>>=
void push_front(T const& elt);
void push_back(T const& elt);
void insert_after(iterator where, T const& elt);

We do not supply insert(), since this cannot be implemented efficiently in the current framework, and so could be dangerous to use.

insert_after() is the most general method. It inserts an element after the element referred to by the given iterator. It is not valid to invoke it on end(). We implement it by creating a node and placing it between the current node and its successor:

<<list implementation>>=
template <typename T>
inline void list<T>::insert_after(typename list<T>::iterator where, T const& elt)
{
    node* newnode = new node(elt);
    newnode->next = where._curpos->next;
    where._curpos->next = newnode;
    _size++;
    insert: update tail
}

Note the typename keyword in the argument list. It is necessary here because the compiler cannot determine by itself that iterator is a type since it is dependent on the template parameter T.

To write the above, we need to ensure that the list has access to the iterator's internal pointer _curpos using a friend declaration. This violates encapsulation, but the iterator and list are tightly coupled, and it's better than exposing the pointer to everyone via a public method:

<<list friend declarations>>=
friend class list<T>::iterator;

If we're inserting at the end of the list, we must update the tail pointer:

<<insert: update tail>>=
if (newnode->next == NULL)
{
    _tail = newnode;
}

The above method has the limitation that it cannot be used to insert an element at the beginning of the list. For this we supply push_front, which simply prepends a new node and updates the head pointer:

<<list implementation>>=
template <typename T>
inline void list<T>::push_front(T const& elt)
{
    node* newnode = new node(elt);
    newnode->next = _head;
    _head = newnode;
    push_front: update tail
    _size++;
}

If the list is empty prior to the insertion, _tail may be NULL; we set it to point to the new element, which is both first and last at this point:

<<push_front: update tail>>=
if (_tail == NULL)
{
    _tail = newnode;
}

Finally, push_back() just inserts after the last element, except in the special case where the list is empty, where we use push_front():

<<list implementation>>=
template <typename T>
inline void list<T>::push_back(T const& elt)
{
    if (_tail != NULL)
        insert_after(iterator(_tail), elt);
    else
        push_front(elt);
}

[edit] Deletion

The following methods delete elements from the list:

<<public interface>>=
iterator erase_after(iterator it);
void clear();
void pop_front();

erase_after() erases the element following the one the iterator points to and returns an iterator to the next valid element after the one deleted. clear() erases all elements. pop_front() removes the first element of the list. We cannot implement pop_back() efficiently because, although we have a pointer to the tail node, we cannot update this pointer efficiently without access to the previous node.

We begin with erase_after(), which simply sets the next pointer of the current node to the node after the following one. It is not valid to invoke on the final element of the list or on end(). We then delete the following node:

<<list implementation>>=
template <typename T>
inline typename list<T>::iterator list<T>::erase_after(typename list<T>::iterator it)
{
    node* todelete = it._curpos->next;
    it._curpos->next = it._curpos->next->next;
    _size--;

    erase: update tail
    delete todelete;
    return iterator(it._curpos->next);
}

This can be used to delete the final element of the list, since it._curpos->next->next is NULL in this case. We have to update the tail pointer if we delete the final element of the list:

<<erase: update tail>>=
if (it._curpos->next == NULL)
{
    _tail = it._curpos;
}

Because we have the iterator state handy, we don't have the same problem as we would have implementing pop_back(). This method has the limitation though that it cannot delete the first node in the list. For this purpose we supply pop_front(), which just updates head and then deletes the old head:

<<list implementation>>=
template <typename T>
inline void list<T>::pop_front()
{
    node* todelete = _head;
    _head = _head->next;
    if (_head == NULL) {
        _tail = NULL;
    }
    _size--;
    delete todelete;
}

We demonstrate the use of pop_front() with clear(), which simply removes elements until none remain:

<<list implementation>>=
template <typename T>
inline void list<T>::clear()
{
    while (begin() != end())
    {
        pop_front();
    }
}

[edit] Assignment

We next implement the assignment operator, which can simply use the insertion operation and iteration without ever examining internal state. It must also use clear() to clean out any existing list elements in the target list.

<<public interface>>=
list& operator =(list<T> const& rhs);

<<list implementation>>=
template <typename T>
list<T>& list<T>::operator =(list<T> const& rhs)
{
    clear();

    const_iterator rhs_it = rhs.begin();
    for (rhs_it = rhs.begin(); rhs_it != rhs.end(); ++rhs_it)
    {
        push_back(*rhs_it);
    }

    return *this;
}

[edit] List size

We supply two methods that query the number of elements in the list and determine whether it is empty:

<<public interface>>=
size_t size() const;
bool empty() const;

Their implementation is trivial:

<<list implementation>>=
template <typename T>
inline size_t list<T>::size() const
{
  return _size;
}

template <typename T>
inline bool list<T>::empty() const
{
  return _size == 0;
}

[edit] Construction and destruction

Finally, we wish to provide the following constructors and destructor:

<<construction interface>>=
list();
list(list<T> const& rhs);
~list();

The default constructor creates an empty list, while the copy constructor and destructor are trivial to implement using the methods we have already implemented:

<<list implementation>>=
template <typename T>
inline list<T>::list()
  : member initialisation
{}

template <typename T>
inline list<T>::list(list<T> const& rhs)
  : member initialisation
{
    *this = rhs;
}

template <typename T>
inline list<T>::~list()
{
    clear();
}

[edit] Files

All of the code is placed in header files, since this is a templated class. We separate declarations from definitions by placing one in a .h file and the other in a .hpp file:

<<sllist.h>>=
#include <iterator>

template <typename T>
class list
{
  private:
    node structure
    list private data

  public:
    iterators

    construction interface
    public interface
    list friend declarations
};

#include "sllist.hpp"

<<sllist.hpp>>=
list implementation

[edit] Test code

<<list_test.cpp>>=
#include <iostream>

#include "sllist.h"

template <typename T>
void print_list(list<T> const& lst)
{
  typename list<T>::const_iterator it;
  for (it = lst.begin(); it != lst.end(); ++it)
    std::cout << *it << " ";
  std::cout << std::endl;
}

int main()
{
  list<int> lst;
  lst.push_back(2);
  lst.push_back(3);
  lst.push_front(1);
  lst.push_back(4);
  list<int> list2;
  list2 = lst;
  print_list(lst);
  list<int>::iterator it;
  it = lst.begin();
  it++;
  lst.insert_after(it, 42);
  ++it;
  print_list(lst);
  lst.erase_after(++it);
  print_list(lst);
  it = lst.begin();
  *it = 10;
  print_list(lst);
  lst.clear();
  print_list(lst);
  print_list(list2);
  std::fill(list2.begin(), list2.end(), 42);
  print_list(list2);
  return 0;
}

If you build and run this, you should see this:

<<list_test_output.txt>>=
1 2 3 4 
1 2 42 3 4 
1 2 42 3 
10 2 42 3 

1 2 3 4 
42 42 42 42 
Download code
hijacker
hijacker
hijacker
hijacker