Shell sort (C)

From LiteratePrograms
Jump to: navigation, search
Other implementations: C | Fortran95

Shell sort is a relatively simple sorting algorithm based on insertion sort. It operates by taking every kth element of the list and performing an insertion sort on this subset. It does this for several decreasing values of k, such as 23, 10, 4, and 1, until k=1, at which point it is performing insertion sort; but the previous iterations moved each element closer to its final position, speeding up this final pass dramatically.


In this implementation, for simplicity we'll concentrate on sorting integer words.

[edit] Choosing an interval size list

In Shellsort or shell sort, the key to efficiency is a good list of interval or gap sizes. To maximize speed, we begin with a list of interval sizes devised by careful empirical research by Marcin Ciura in order to minimize average runtime. For large enough lists, we extend the sequence geometrically by multiplying the largest element of Ciura's list successively by 2.3 (since Ciura's sequence is similar to a geometric sequence with multiplier 2.3).

<<Ciura's list>>=
int ciura_intervals[] = {701, 301, 132, 57, 23, 10, 4, 1};
double extend_ciura_multiplier = 2.3;

<<compute initial interval>>=
int interval_idx = 0;
int interval = ciura_intervals[0];
if (length > interval) {
    while(length > interval) {
        interval_idx--;
        interval = (int)(interval*extend_ciura_multiplier);
    }
} else {
    while(length < interval) {
        interval_idx++;
        interval = ciura_intervals[interval_idx];
    }
}

<<advance to next interval>>=
interval_idx++;
if (interval_idx >= 0) {
    interval = ciura_intervals[interval_idx];
} else {
    interval = (int)(interval/extend_ciura_multiplier);
}

[edit] Main sort

The main sort is very similar to insertion sort (see Insertion sort (C, simple)), with an added parameter determining the interval:

<<function to perform a single pass of shell sort with a given interval>>=
void shell_sort_pass(int a[], int length, int interval) {
    int i;
    for (i=0; i < length; i++) {
        /* Insert a[i] into the sorted sublist */
        int j, v = a[i];
 
        for (j = i - interval; j >= 0; j -= interval) {
            if (a[j] <= v) break;
            a[j + interval] = a[j];
        }
        a[j + interval] = v;
   }
}

The only difference from insertion sort is that the inner loop progresses in steps of interval instead of steps of 1. Note that the outer loop still visits every element; the reason for this is clear if we write it like this:

    /* Visits same elements as the loop above in different order */
    for (column=0; column < k; column++) {
        for (i=column; i < length; i+=k) {
            /* Perform insertion */
        }
    }

Imagine the list is written as a matrix of k by length/k elements. If i were to progress in steps of k, we would only be sorting the first column of this matrix, leaving the other k−1 unsorted, which will not give us the speedup we need. The above code sorts each column, and it is equivalent to the original code.

We now execute this function once for each interval size in our sequence:

<<main shell sort>>=
void shell_sort(int a[], int length) {
    Ciura's list
    compute initial interval
    while (interval > 1) {
        advance to next interval
        shell_sort_pass(a, length, interval);
    }
}

We can test it on a random list of any desired size with this sample code:

<<shellsort.c>>=
#include <stdio.h>
#include <stdlib.h>

function to perform a single pass of shell sort with a given interval
main shell sort

int main(int argc, char* argv[]) {
    int length = atoi(argv[1]);
    int* a = malloc(sizeof(int)*length);
    int i;
    /* Fill list with random values */
    for(i=0; i<length; i++) {
        a[i] = rand();
    }
    shell_sort(a, length);
    /* Verify list is in order */
    for(i=1; i<length; i++) {
        if (a[i-1] > a[i]) {
            puts("ERROR");
            return -1;
        }
    }
    return 0;
}

This completes the implementation.

[edit] Performance

The implementation was compiled using gcc with the -O3 flag and run on a 2.8 GhZ Pentium 4 machine under Gentoo Linux with 1 GB of RAM. For comparison, we used C++'s carefully tuned and template-specialized std::sort (we did not use qsort() because it would have an unfair disadvantage due to overhead for function pointer calls). The results are as follows:

List size 1,000 2,500 5,000 10,000 25,000 50,000 75,000 100,000 200,000 500,000 1M 5M 10M
Shell sort time (s) 0.002 0.002 0.003 0.004 0.008 0.014 0.019 0.024 0.061 0.147 0.299 1.65 3.51
Quicksort time (s) 0.002 0.002 0.003 0.003 0.005 0.009 0.012 0.016 0.031 0.078 0.158 0.846 1.752

Although shellsort begins to lose ground around 25000 elements, it remains competitive with the tuned quicksort algorithm even up to millions of elements, performing at worst about twice as slow.

Download code
hijacker
hijacker
hijacker
hijacker