Median cut algorithm (C Plus Plus)

Other implementations: C++ | J
If you're looking to find the median of a list of elements, see the nth_element standard library function or Category:Selection_algorithm.

The median cut algorithm is a popular algorithm for color quantization, but can be applied to virtually any point clustering problem. In outline, it works as follows:

  • Let B be a set of boxes containing points, initially containing only a single box containing all points.
  • While the number of boxes in B is less than desired number of clusters...
    • Find the largest side length of any side of any box.
    • Cut that box into two boxes along its largest side in such a way that half the contained points fall into each new box.
    • Shrink the two new boxes so that they are just large enough to contain their points.
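
The rest of this article develops a three-dimensional implementation of this outline as literate chunks. As a rough preview, here is a minimal stand-alone sketch of the same idea in one dimension, where a "box" degenerates to an interval over an array of values and "shrinking" is implicit; it is only illustrative and is not part of the program tangled below:

#include <algorithm>
#include <iostream>
#include <queue>

// A "box" in one dimension: a contiguous range of values.
struct Range
{
    unsigned char* begin;
    unsigned char* end;
    int extent() const
    {
        return *std::max_element(begin, end) - *std::min_element(begin, end);
    }
    bool operator<(const Range& rhs) const { return extent() < rhs.extent(); }
};

int main()
{
    unsigned char values[] = {2, 250, 4, 130, 9, 200, 7, 180};
    std::priority_queue<Range> boxes;
    Range all = {values, values + 8};
    boxes.push(all);                        // one box containing every value

    const unsigned int desired = 4;
    while (boxes.size() < desired)
    {
        Range widest = boxes.top();         // box with the largest extent
        boxes.pop();
        unsigned char* median = widest.begin + (widest.end - widest.begin + 1) / 2;
        std::nth_element(widest.begin, median, widest.end);  // cut at the median
        Range lower = {widest.begin, median};
        Range upper = {median, widest.end};
        boxes.push(lower);
        boxes.push(upper);
    }

    // Report each cluster's average as its representative value.
    while (!boxes.empty())
    {
        Range r = boxes.top();
        boxes.pop();
        int sum = 0;
        for (unsigned char* p = r.begin; p != r.end; ++p) sum += *p;
        std::cout << sum / (int)(r.end - r.begin) << std::endl;
    }
    return 0;
}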


C++ includes a number of standard containers and algorithms that greatly facilitate the simple and efficient implementation of median cut.

Point and Block data structures

We represent each point with a simple structure reserving an unsigned byte for each component:

<<constants>>=
const int NUM_DIMENSIONS = 3;

<<class declarations>>=
struct Point
{
    unsigned char x[NUM_DIMENSIONS];
};

We don't include a comparison operator because we'll want to compare them in several different ways.

We represent a block by storing the points corresponding to two opposite corners, where one has all three coordinates as small as possible and the other as large as possible. For example, a block might have minCorner set to (1, 1, 1) and maxCorner set to (2, 2, 2) — this block has all sides of length one. Blocks also keep track of the input points that they contain. For efficiency reasons to be explained later, they track points using a pointer and a length:

<<class declarations>>=
class Block
{
    Point minCorner, maxCorner;
    Point* points;
    int pointsLength;
public:
    Block constructor declaration
    Block public method declarations
private:
    Block helper method declarations
};

We'll see later that we need random access to the points. The only constructor we need is one that takes an array of points and initializes the block to the largest possible box:

<<Block constructor declaration>>=
Block(Point* points, int pointsLength);
<<Block constructor definition>>=
Block::Block(Point* points, int pointsLength)
{
    this->points = points;
    this->pointsLength = pointsLength;
    for(int i=0; i < NUM_DIMENSIONS; i++)
    {
        minCorner.x[i] = std::numeric_limits<unsigned char>::min();
        maxCorner.x[i] = std::numeric_limits<unsigned char>::max();
    }
}

We need the <limits> header for std::numeric_limits<unsigned char>::min() and max():

<<header files>>=
#include <limits>

We provide simple accessors for the points and points length:

<<Block public method declarations>>=
Point * getPoints();
int numPoints() const;
<<Block public methods>>=
Point * Block::getPoints()
{
    return points;
}

int Block::numPoints() const
{
    return pointsLength;
}


In each step of the algorithm, we will need to find the block with the longest side. We first add a method that figures out which side of a block is longest:

<<Block public method declarations>>=
int longestSideIndex() const;
<<Block public methods>>=
int Block::longestSideIndex() const
{
    int m = maxCorner.x[0] - minCorner.x[0];
    int maxIndex = 0;
    for(int i=1; i < NUM_DIMENSIONS; i++)
    {
        int diff = maxCorner.x[i] - minCorner.x[i];
        if (diff > m)
        {
            m = diff;
            maxIndex = i;
        }
    }
    return maxIndex;
}

Now, we define a method to determine how long the block's longest side is:

<<Block public method declarations>>=
int longestSideLength() const;
<<Block public methods>>=
int Block::longestSideLength() const
{
    int i = longestSideIndex();
    return maxCorner.x[i] - minCorner.x[i];
}

To avoid standards-compliance issues with std::min and std::max in some versions of Visual C++, we refrain from using them and define our own min and max helpers instead:

<<Block helper method declarations>>=
template <typename T>
static T min(const T a, const T b)
{
    if (a < b)
        return a;
    else
        return b;
}

template <typename T>
static T max(const T a, const T b)
{
    if (a > b)
        return a;
    else
        return b;
}

Next, we define a priority queue of blocks:

<<header files>>=
#include <queue>
<<median cut declarations>>=
std::priority_queue<Block> blockQueue;

Finally, we define a less-than operator on Block so that the block with the longest side has the highest priority in the queue, meaning it will be selected first:

<<Block public method declarations>>=
bool operator<(const Block& rhs) const;
<<Block public methods>>=
bool Block::operator<(const Block& rhs) const
{
    return this->longestSideLength() < rhs.longestSideLength();
}

Now, blockQueue.top() will rapidly yield the block we desire with no search. Old blocks (blocks being cut) are removed with pop(), and new blocks (the pieces resulting from the cut) can be added using push():

<<median cut algorithm main loop>>=
create initial block
blockQueue.push(initialBlock);
while (blockQueue.size() < desiredSize)
{
    Block longestBlock = blockQueue.top();
    blockQueue.pop();
    split longestBlock into block1 and block2
    blockQueue.push(block1);
    blockQueue.push(block2);
}

Finally, we need a way to shrink a block so that it just barely contains its points; that is, its minimum and maximum coordinates are chosen according to the minimum and maximum coordinates of its points. We add a method shrink() for this, which computes the minimums and maximums of each coordinate of the points in a block and sets the two corners accordingly:

<<Block public method declarations>>=
void shrink();
<<Block public methods>>=
void Block::shrink()
{
    int i,j;
    for(j=0; j<NUM_DIMENSIONS; j++)
    {
        minCorner.x[j] = maxCorner.x[j] = points[0].x[j];
    }
    for(i=1; i < pointsLength; i++)
    {
        for(j=0; j<NUM_DIMENSIONS; j++)
        {
            minCorner.x[j] = min(minCorner.x[j], points[i].x[j]);
            maxCorner.x[j] = max(maxCorner.x[j], points[i].x[j]);
        }
    }
}

We compute the minimums and maximums in a single pass to maximize cache performance.

Performing the median cut

Our median cut algorithm will take an array of image points as its input and produce a list of points denoting the optimized palette entries:

<<median cut algorithm declaration>>=
std::list<Point> medianCut(Point* image, int numPoints, unsigned int desiredSize)
<<median cut algorithm>>=
median cut algorithm declaration
{
    median cut declarations
    median cut algorithm main loop
    std::list<Point> result;
    find a representative point for each block and add it to the result
    return result;
}
<<type definition headers>>=
#include <list>

Our initial block will wrap all the points:

<<create initial block>>=
Block initialBlock(image, numPoints);
initialBlock.shrink();

To split a block, we use nth_element, which locates the median and simultaneously rearranges the point list so that the elements before the median position are no greater than the median and the elements after it are no smaller:

<<split longestBlock into block1 and block2>>=
Point * begin  = longestBlock.getPoints();
Point * median = longestBlock.getPoints() + (longestBlock.numPoints()+1)/2;
Point * end    = longestBlock.getPoints() + longestBlock.numPoints();
partition the points into two sublists using nth_element
Block block1(begin, median-begin), block2(median, end-median);
block1.shrink();
block2.shrink();

The two new blocks block1 and block2 wrap around the lower and upper halves of the rearranged list. We need the algorithm header for nth_element:

<<header files>>=
#include <algorithm>
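
The effect of nth_element can be seen in isolation with a handful of integers. This stand-alone example is only illustrative and is not part of the tangled program:

#include <algorithm>
#include <iostream>

int main()
{
    int values[] = {9, 1, 8, 2, 7, 3, 6};
    int* median = values + 3;                    // position of the 4th-smallest element
    std::nth_element(values, median, values + 7);
    // *median is now 6; everything before it is <= 6 and everything after
    // it is >= 6, but neither side is individually sorted.
    std::cout << *median << std::endl;
    return 0;
}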

Depending on the direction we're cutting, we need to examine a different coordinate of the points when partitioning. One straightforward but slow way to do this is to use a comparator object that knows the block's longest side's index:

<<slow: partition the points into two sublists using nth_element>>=
CoordinatePointComparator pointComparator(longestBlock.longestSideIndex());
std::nth_element(begin, median, end, pointComparator);

<<slow: class declarations>>=
class CoordinatePointComparator
{
    int index;
public:
    CoordinatePointComparator(int index)
    {
        this->index = index;
    }

    bool operator()(Point left, Point right)
    {
        return left.x[index] < right.x[index];
    }
};

If we can assume we're dealing with three dimensions, a more efficient approach is to use templates to specialize the comparator object to the particular index being compared, moving costly branches from the inner loop of the partition operation to an initial branch outside the calls:

<<partition the points into two sublists using nth_element>>=
switch(longestBlock.longestSideIndex())
{
case 0: std::nth_element(begin, median, end, CoordinatePointComparator<0>()); break;
case 1: std::nth_element(begin, median, end, CoordinatePointComparator<1>()); break;
case 2: std::nth_element(begin, median, end, CoordinatePointComparator<2>()); break;
}

<<class declarations>>=
template <int index>
class CoordinatePointComparator
{
public:
    bool operator()(Point left, Point right)
    {
        return left.x[index] < right.x[index];
    }
};

Experiments show that this gives a 30% time improvement over the slower solution.

To find a representative point for each block, we merely compute the arithmetic mean (average) of all points in the cluster:

<<find a representative point for each block and add it to the result>>=
while(!blockQueue.empty())
{
    Block block = blockQueue.top();
    blockQueue.pop();
    Point * points = block.getPoints();

    int sum[NUM_DIMENSIONS] = {0};
    for(int i=0; i < block.numPoints(); i++)
    {
        for(int j=0; j < NUM_DIMENSIONS; j++)
        {
            sum[j] += points[i].x[j];
        }
    }

    Point averagePoint;
    for(int j=0; j < NUM_DIMENSIONS; j++)
    {
        averagePoint.x[j] = sum[j] / block.numPoints();
    }

    result.push_back(averagePoint);
}
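
One caveat about the averaging loop: the int accumulators could overflow if a single block ever contained more than roughly INT_MAX / 255 points (about 8.4 million). That will not happen for images of the sizes measured below, but for very large inputs wider accumulators could be used. The fragment below mirrors the loop above and is only a sketch, not one of the tangled chunks:

    // Same averaging loop, but with 64-bit accumulators to avoid overflow
    // on blocks holding more than INT_MAX / 255 points.
    long long sum[NUM_DIMENSIONS] = {0};
    for(int i=0; i < block.numPoints(); i++)
    {
        for(int j=0; j < NUM_DIMENSIONS; j++)
        {
            sum[j] += points[i].x[j];
        }
    }

    Point averagePoint;
    for(int j=0; j < NUM_DIMENSIONS; j++)
    {
        averagePoint.x[j] = (unsigned char)(sum[j] / block.numPoints());
    }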

Files

Finally, we wrap up the implementation in source and header files for reuse:

<<median_cut.h>>=
#ifndef _MEDIAN_CUT_H_
#define _MEDIAN_CUT_H_
type definition headers
constants
class declarations
median cut algorithm declaration;
#endif /* #ifndef _MEDIAN_CUT_H_ */
<<median_cut.cpp>>=
header files
#include "median_cut.h"

Block constructor definition
Block public methods
median cut algorithm

Sample code

We can try out the algorithm using a small command-line driver. We feed it a raw image file along with its dimensions and the desired palette size. For example, if you download Image:Amorphophallus titanum USBG small.raw, and assuming your compiled binary is named "median_cut_sample", you can run median cut on it like this:

./median_cut_sample Amorphophallus_titanum_USBG_small.raw 100 145 16

The output will be:

74 68 86
51 49 39
165 126 127
135 141 134
118 84 78
20 12 13
179 153 159
205 216 148
41 51 16
171 178 94
241 240 229
134 147 80
57 40 57
74 85 58
120 95 124
184 210 103

I reduced color depth in Photoshop using this custom palette. The resulting image is a bit muted compared to the proprietary color quantization used by Photoshop, but is nevertheless quite suited to the image:

[Images: Amorphophallus titanum USBG small.png (original) and Amorphophallus titanum USBG small median cut.png (reduced to the 16-color palette above)]

<<median_cut_sample.cpp>>=
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include "median_cut.h"

int main(int argc, char* argv[]) {
    if (argc < 5)
    {
        std::cerr << "usage: " << argv[0]
                  << " <raw image> <width> <height> <palette size>" << std::endl;
        return 1;
    }

    int numPoints = atoi(argv[2]) * atoi(argv[3]);
    Point* points = (Point*)malloc(sizeof(Point) * numPoints);

    // Read width*height raw pixels, one unsigned byte per component.
    FILE * raw_in = fopen(argv[1], "rb");
    if (raw_in == NULL)
    {
        std::cerr << "could not open " << argv[1] << std::endl;
        return 1;
    }
    for(int i = 0; i < numPoints; i++)
    {
        fread(&points[i], 3, 1, raw_in);
    }
    fclose(raw_in);

    std::list<Point> palette =
        medianCut(points, numPoints, atoi(argv[4]));

    // Print one palette entry per line as "R G B".
    std::list<Point>::iterator iter;
    for (iter = palette.begin() ; iter != palette.end(); iter++)
    {
        std::cout << (int)iter->x[0] << " "
                  << (int)iter->x[1] << " "
                  << (int)iter->x[2] << std::endl;
    }

    free(points);
    return 0;
}
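
To build the sample, compile and link the two tangled source files together. Assuming g++ and the file names used above, a command along these lines should work:

g++ -O3 -o median_cut_sample median_cut.cpp median_cut_sample.cpp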

Performance

Because it avoids sorting the points and is very simple, this implementation performs well. As the following table demonstrates, both image size and palette size affect the runtime. The same photograph, scaled to various sizes, was used for every test. The program was compiled on Linux with gcc using the "-O3" flag and run on a 2.8 GHz Pentium 4 processor. User-space time is measured, including load time; user times were very consistent between runs, with only a few milliseconds of variation.

Image pixels    Time in milliseconds for palette of size
                   1    2    4    8   16   32   64  128  256  512 1024
     196,608      24   28   32   40   48   48   56   56   64   76   84
     393,132      48   56   68   76   92   96  112  116  132  144  164
     786,432      88  100  120  152  172  188  208  232  252  288  308
   1,572,528     168  200  236  292  324  356  408  448  492  548  612
   3,145,728     336  404  480  580  640  712  832  876  972 1084 1184

Time scales roughly linearly with image size and logarithmically with palette size, as theory would predict: the blocks split at a given depth of cutting together cover the whole image, so each doubling of the palette size adds roughly one more full pass over the points.

According to profiling, the runtime is split roughly evenly between the partition operation and the block shrinking operation (mainly the computation of minimums and maximums). A more efficient approach would combine these operations, finding the minimums and maximums at the same time as it performs the partition instead of scanning the points all over again, but this would require opening up or reimplementing nth_element, so we don't do it here.
