Merge sort (C Plus Plus)

From LiteratePrograms
Jump to: navigation, search
Other implementations: ACL2 | C | C++ | dc | Eiffel | Erlang | Haskell | Java | JavaScript | Lisp | OCaml | Oz | Perl | Prolog | Python | Ruby | Scheme

This is an implementation of mergesort in C++.

C++'s Standard Template Library provides a std::stable_sort() function to perform stable sort on a range between random access iterators, which uses mergesort by default. Also, std::list (doubly-linked list) has a method called sort() which sorts the linked list, which is probably also implemented using mergesort by default. Both of the above can use either the < operator or a user-supplied comparator for comparison.

<<mergesort.cpp>>=

credit

#include<vector>
#include<string>
#include<algorithm>
#include<iostream>

merge
mergesort
test

[edit] Mergesort

The mergesort algorithm works by:

  • Splitting the data in 2 halves
  • Sorting each half
  • Merging the halves

mergesort() takes two iterators, one representing the start of the data to be sorted, and one representing the end of the data to be sorted (but in fact pointed to the element that follows the section of data to be sorted). The data is split in two parts, and mergesort() is recursively called on each. The merge() function is applied on the result, making the data completely sorted. If the provided iterators represent a single element or no elements at all, mergesort() can just return.

<<mergesort>>=

template<typename IT> void mergesort(IT begin, IT end)
{
	size_t size(end-begin);
	if(size<2) return;

	IT begin_right=begin+size/2;

	mergesort(begin, begin_right);
	mergesort(begin_right, end);
	merge(begin, begin_right, end);
}

[edit] Merging

We use in-place merging, which doesn't waste much memory, but can be a little complicated. An alternative method is to create a new container and copying the elements in to it.

Each element in the left part is compared with the first element in the right part. If it is larger, the right element is copied to the left half, and the old left element is inserted in the correct position in the right half, using insert(). When all left half elements are tested, the complete container is sorted. Swap is used to eliminate potentially expensive/throwing construction for types which provide an overloaded swap within their namespace.

<<merge>>=

insert

template<typename IT> void merge(IT begin, IT begin_right, IT end)
{
	for(;begin<begin_right; ++begin) {
		if(*begin_right < *begin) {
			typename std::iterator_traits<IT>::value_type v;
			using ::std::swap;
			swap(v, *begin);
			swap(*begin, *begin_right);
			insert(begin_right, end, v);
		}
	}
}

Inserting an element into the right part is simple. The first element is already copied to the left part, so it can be thrown away. The insert() function moves all elements smaller than the value to be inserted 1 position left. The value to be inserted is then copied to the free position. The unqualified call to swap used a user defined swap within the user type's namespace if one exists, and std::swap otherwise.

<<insert>>=

template<typename IT, typename VT> void insert(IT begin, IT end, const VT &v)
{
	using ::std::swap;
	while (begin+1!=end && *(begin+1)<v) {
		swap(*begin, *(begin+1));
		++begin;
	}
	swap(*begin, v);
}

[edit] Testing

This test code will run the algorithm on an std::vector of std::strings and an array of ints. This implementation will not work on std::list, since it uses operators + and - on the iterators.

<<test>>=

template<typename T> void print(const T &data)
{
	std::cout<<" "<<data;
}


int main()
{
	std::vector<std::string> v(10);
	v[0]="Paris";
	v[1]="London";
	v[2]="Stockholm";
	v[3]="Berlin";
	v[4]="Oslo";
	v[5]="Rome";
	v[6]="Madrid";
	v[7]="Tallinn";
	v[8]="Amsterdam";
	v[9]="Dublin";

	std::cout<<"v before qsort: ";
	std::for_each(v.begin(), v.end(), print<std::string>);
	std::cout<<'\n';

	mergesort(v.begin(), v.end());

	std::cout<<"v after mergesort: ";
	std::for_each(v.begin(), v.end(), print<std::string>);
	std::cout<<'\n';

	int a[]={3,8,0,6,7,4,2,1,9,3,1,8,3,9,2,0,9};
	int *a_end=a+sizeof a/sizeof(int);
	

	std::cout<<"a before mergesort: ";
	std::for_each(a, a_end, print<int>);
	std::cout<<'\n';

	mergesort(a, a_end);

	std::cout<<"a after mergesort: ";
	std::for_each(a, a_end, print<int>);
	std::cout<<'\n';

	return 0;
}

This implementation is originally based on a public domain D program posted by David Medlock to Digital Mars, who as a courtesy we credit in the source:

<<credit>>=
// This implementation is originally based on a public domain D program by
// David Medlock (http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/32113).
Download code
hijacker
hijacker
hijacker
hijacker