Quicksort (Haskell)

From LiteratePrograms
Jump to: navigation, search
Other implementations: AWK | C | C++ | Eiffel | Erlang | Forth | Haskell | Java | JavaScript | Mathematica | Mercury | Oz | Python | Python, arrays | Scala | Sed | Visual Basic .NET

This program is under development.
Please help to debug it. When debugging
is complete, remove the {{develop}} tag.

We describe several different implementations of the quicksort algorithm. All of these implementations operate on lists. Therefore, they avoid use of the randomized pivot techniques found in, for example, the C implementation of quicksort, since access to a random element of a list takes O(n) time.


[edit] Implementation

[edit] Using list comprehensions

The classic presentation of quicksort in Haskell is a direct implementation using list comprehensions.

qsort1 :: Ord a => [a] -> [a]
qsort1 []     = []
qsort1 (p:xs) = qsort1 lesser ++ [p] ++ qsort1 greater
        lesser  = [ y | y <- xs, y < p ]
        greater = [ y | y <- xs, y >= p ]

This implementation has the advantage of being very clear and easy to understand, yet quite compact. However, it sacrifices some performance in order to achieve its remarkable clarity.

[edit] Using a partitioning function

One problem with the list comprehension implementation is that it makes two passes through the list to generate the lesser and greater lists. An alternative approach is to partition the list in a single pass. One way to accomplish this is through the standard List module's partition function, which splits a list based upon some predicate. We have elected instead to implement our own partitioning function. This choice was made partly for pedagogical reasons, but also because our custom partitioning function can accumulate list items that are equal to the pivot, as well as those that are lesser and greater than the pivot. This eliminates the need to process more than once any items that are equal to the pivot.

qsort2 :: Ord a => [a] -> [a]
qsort2 []     = []
qsort2 (x:xs) = qsort2 lesser ++ equal ++ qsort2 greater
        (lesser,equal,greater) = part x xs ([],[x],[])
part :: Ord a => a -> [a] -> ([a],[a],[a]) -> ([a],[a],[a])
part _ [] (l,e,g) = (l,e,g)
part p (x:xs) (l,e,g)
    | x > p     = part p xs (l,e,x:g)
    | x < p     = part p xs (x:l,e,g)
    | otherwise = part p xs (l,x:e,g)

This implementation should be faster than the version using list comprehensions.

[edit] Using an accumulator

An implementation that uses an accumulator to reduce the amount of storage, and avoid costly concatenation operations.

qsort3 :: Ord a => [a] -> [a]
qsort3 x = qsort3' x []

qsort3' [] y     = y
qsort3' [x] y    = x:y
qsort3' (x:xs) y = part xs [] [x] []  
        part [] l e g = qsort3' l (e ++ (qsort3' g y))
        part (z:zs) l e g 
            | z > x     = part zs l e (z:g) 
            | z < x     = part zs (z:l) e g 
            | otherwise = part zs l (z:e) g

[edit] Testing


[edit] Functionality

The functionality of the quicksort implementations is most easily tested by loading qsort.hs into an interactive Haskell environment, such as GHCi or Hugs. Here's an example of a few tests, and their results:

*Main> let testlist = [3,1,4,1,5,9,2,6,5,3,5,8,9,7,9,3]
*Main> qsort1 testlist
*Main> qsort2 testlist
*Main> qsort3 testlist
*Main> let testlist = ["bob","alice","barry","zoe","charlotte","fred"]
*Main> qsort1 testlist
*Main> qsort2 testlist
*Main> qsort3 testlist

So, our sorting functions actually sort. But how quickly? Let's take a look at the performance of the different quicksort implementations.

[edit] Performance

The performance statistics shown in the table below were collected using GHC's profiling capabilities. Each implementation was run several times on a randomly-generated 100,000-element list of integers, and the resulting profile data was aggregated.

Implementation Average Time (s) Average Allocation (MB)
qsort1 2.53 139
qsort2 2.03 139
qsort3 1.64 83
GHC List.sort 2.14 107

There are several interesting observations that we can make about this table:

  1. The move from list comprehension to a single pass partitioning function does provide a slight improvement in performance.
  2. The inclusion of an accumulator significantly decreases the amount of storage required, while also yielding an improvement in execution time (presumably as a result of eliminating expensive list concatenations).
  3. The standard GHC sort (which is a quicksort) does not perform quite as well in terms of time and space as the qsort3 implementation. Since GHC's sort implements the same accumulator approach used by qsort3, it seems likely that its lower performance can be attributed to the extra work it does to provide a stable sort (something that qsort3 does not do).

[edit] Parting thoughts

Quicksort was designed to operate on arrays, and works very well on them. However, quicksort is somewhat less suitable for sorting lists, since many of the techniques available for improving the performance of the algorithm rely on random access. It should perhaps not come as a surprise that the standard Haskell list sorting function, Data.List's sort, is a version of the insertion sort. The Hugs Haskell interpreter at one point implemented sort as a stable version of quicksort, but has since replaced it with a mergesort. The GHC compiler continues to use a stable quicksort to implement sort.

Download code