Quicksort (Erlang)

From LiteratePrograms
Jump to: navigation, search
Other implementations: AWK | C | C++ | Eiffel | Erlang | Forth | Haskell | Java | JavaScript | Mathematica | Mercury | Oz | Python | Python, arrays | Scala | Sed | Standard ML | Visual Basic .NET | XProc

We describe several different implementations of the quicksort algorithm in Erlang. All of these implementations operate on lists. Therefore, they avoid use of the randomized pivot techniques found in, for example, the C implementation of quicksort, since access to a random element of a list takes O(n) time.


[edit] Implementation

[edit] Using list comprehensions

The most straightforward way to implement quicksort in Erlang is a direct implementation using list comprehensions. It generates two lists, one of elements greater than or equal to the "pivot" element (in this case the first element of the list), and one of elements less than the pivot. These two lists are then recursively sorted, before being concatenated around the pivot to form the final sorted list.

qsort1([]) ->
qsort1([H | T]) -> 
    qsort1([ X || X <- T, X < H ]) ++ [H] ++ qsort1([ X || X <- T, X >= H ]).

This implementation has the advantage of being very clear and easy to understand, yet quite compact. However, it sacrifices some performance in order to achieve its remarkable clarity.

[edit] Using a partitioning function

One problem with the list comprehension implementation is that it makes two passes through the list to generate the lesser and greater lists. An alternative approach is to partition the list in a single pass. One way to accomplish this is through the standard lists module's partition function, which splits a list based upon some predicate. We have instead elected to implement our own partitioning function. This choice was made partly for pedagogical reasons, but also because our custom partitioning function can accumulate list items that are equal to the pivot, as well as those that are lesser and greater than the pivot. This eliminates the need to process more than once any items that are equal to the pivot.

The new implementation is conceptually the same as the earlier list comprehension implementation, except that the Lesser, Equal, and Greater lists are now generated by the part function.

qsort2([]) ->
qsort2([H | T]) ->
    {Less, Equal, Greater} = part(H, T, {[], [H], []}),
    qsort2(Less) ++ Equal ++ qsort2(Greater).

partition function

The partitioning function is a straightforward recursion. The base case has an empty list in its second argument, which corresponds to the list to be partitioned. When the list being partitioned is non-empty, we compare the head of the list to the pivot, and add the head to the appropriate list (lesser, equal, or greater) based on the outcome of these comparisons.

<<partition function>>= 
part(_, [], {L, E, G}) ->
    {L, E, G};
part(X, [H | T], {L, E, G}) ->
        H < X ->
            part(X, T, {[H | L], E, G});
        H > X ->
            part(X, T, {L, E, [H | G]});
        true ->
            part(X, T, {L, [H | E], G})

Since the part function performs only a single pass through the list to create the the partitions, this implementation should be faster than the version using list comprehensions.

[edit] Using an accumulator

The qsort2 implementation of quicksort is more efficient than the naive implementation, but still involves many costly list traversals to concatenate the sorted list segments together. It also stores lots of sorted list segments as separate lists, which increases the amount of storage required by the implementation. A more efficient alternative is to add an accumulator to the quicksort implementation. The accumulator is a list which is used to accumulate (hence the name) sorted list elements. The qsort3 implementation initializes the accumulator to the empty list, and calls the auxiliary qsort3_acc function to perform the actual sorting.

qsort3([]) ->
qsort3([H | T]) ->
    qsort3_acc([H | T], []).


The qsort3_acc function is the workhorse of this implementation. This function has two possible cases. If it has run out of unsorted list elements, it returns the accumulated list of sorted elements.

qsort3_acc([], Acc) ->

If there is a non-empty unsorted list, the first element is pulled off to act as a pivot for partitioning the unsorted list.

qsort3_acc([H | T], Acc) ->
    part_acc(H, T, {[], [H], []}, Acc).


The partitioning function resembles the approach to partitioning used for the qsort2 implementation. However, unlike the previous partitioning function, when the qsort3_acc partitioning function finishes partitioning the list it recursively calls qsort3_acc on the "greater" portion of the list and the current value of the accumulator. The result is a sorted list of elements greater than the pivot, to which the list of elements equal to the pivot is concatenated. This new list becomes the accumulator for a recursive call of qsort3_acc on the "lesser" portion of the partitioned list.

part_acc(_, [], {L, E, G}, Acc) ->
    qsort3_acc(L, (E ++ qsort3_acc(G, Acc)));
part_acc(X, [H | T], {L, E, G}, Acc) ->
        H < X ->
            part_acc(X, T, {[H | L], E, G}, Acc);
        H > X ->
            part_acc(X, T, {L, E, [H | G]}, Acc);
        true ->
            part_acc(X, T, {L, [H | E], G}, Acc)

The result of using the accumulator is that the list traversals required for segment concatenation in qsort2 are eliminated. Instead, the sorted list is assembled as the sorting process takes place. The only concatentation that is required involves the (hopefully short) list of items equal to the pivot.

[edit] Testing

We package the various quicksort implementations into a module called qsort.

-export([qsort1/1, qsort2/1, qsort3/1]).


The simple test-driver test_qsort checks that each of the quicksort implementations can successfully sort lists of numbers and of strings.


start() ->
    io:format("Testing qsort...~n"),
    test_numeric_sort({qsort, qsort1}),
    test_string_sort({qsort, qsort1}),
    test_numeric_sort({qsort, qsort2}),
    test_string_sort({qsort, qsort2}),
    test_numeric_sort({qsort, qsort3}),
    test_string_sort({qsort, qsort3}),

test_numeric_sort({M,F}) ->
    Stimulus = [3,1,4,1,5,9,2,6,5,3,5,8,9,7,9,3],
    Response = [1,1,2,3,3,3,4,5,5,5,6,7,8,9,9,9],
    test_sort({M, F}, "numeric sort", Stimulus, Response).
test_string_sort({M, F}) ->
    Stimulus = ["bob","alice","barry","zoe","charlotte","fred","marvin"],
    Response = ["alice","barry","bob","charlotte","fred","marvin","zoe"],
    test_sort({M, F}, "string sort", Stimulus, Response).
test_sort({M, F}, Testname, S, R) ->
    FunName = erlang:atom_to_list(F),
    ModName = erlang:atom_to_list(M),
    Result =  M:F(S),
        R == Result -> 
            io:format("~s:~s - passed - ~s.~n", [ModName, FunName, Testname]);
        true -> 
            io:format("~s:~s - failed - ~s.~n", [ModName, FunName, Testname])

Both of these modules can be compiled at the command line, using the erlc compiler:

$ erlc qsort.erl 
$ erlc test_qsort.erl 

Running test_qsort produces the following results:

$ erl -run test_qsort
Erlang (BEAM) emulator version 5.4.12 [source] [hipe] 

Testing qsort...
qsort:qsort1 - passed - numeric sort.
qsort:qsort1 - passed - string sort.
qsort:qsort2 - passed - numeric sort.
qsort:qsort2 - passed - string sort.
qsort:qsort3 - passed - numeric sort.
qsort:qsort3 - passed - string sort.
Download code