Binary search tree (Python)

From LiteratePrograms
Other implementations: C | C++ | Haskell | Java | Python | Scheme | Standard ML

This program is under development.
Please help to debug it. When debugging
is complete, remove the {{develop}} tag.

((Enliteration forthcoming on delete, enumerations. Code forthcoming on advanced operations.))

Linked lists, while robust under insertions and deletions, preclude the efficient binary search algorithm. One could simulate a binary search, but it would involve iterating through the linked list to the correct 'index' (node) before testing the value. Instead, one can structure an ordered linked structure such that the links dispose it to a binary search by having nodes connect to distant values. This organization invites complexity in adding and deleting nodes to preserve its properties, but appropriate use cases can diminish that cost. This data structure is the binary search tree.
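A sketch of the simulation just described (the helpers here are illustrative, not part of this article's code): each probe of a simulated binary search over a sorted linked list must walk to its 'index' from the head, so the O(log n) comparisons still pay for linear walking.

```python
def make_linked(values):
    """Build a chain of (value, next) pairs from a sorted list."""
    head = None
    for v in reversed(values):
        head = (v, head)
    return head

def nth(head, index):
    """Walk 'index' links from the head -- the expensive part."""
    node = head
    for _ in range(index):
        node = node[1]
    return node[0]

def simulated_binary_search(head, length, target):
    lo, hi = 0, length - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        val = nth(head, mid)   # O(mid) walk before the O(1) compare
        if val == target:
            return True
        elif val < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False

chain = make_linked([1, 3, 5, 7, 9])
```

A binary search tree keeps the comparisons but replaces the walk with a single pointer hop per step.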


BST Properties

Technically, a tree is a 'connected, acyclic, undirected graph.' Connected: there is a path to every node from any other node. Acyclic: there is only one path to any node from some other node (it won't form a loop). Undirected: the notion that the link (edge) between nodes is traversable in either direction. Whether a tree implementation is actually undirected depends on whether nodes have pointers to their parent. A binary tree node has at most three edges: two children and one parent. The node without a parent is called the root. Finally, a binary search tree is a binary tree whose nodes' values adhere to a 'total order': one child of a node will have a value less than the node in focus and the other, a value greater than the focus node.

[Image:Red-black_tree_example.png|A colored binary search tree|right]

For us, this means a tree is a linked structure of nodes, each with a key and two pointers to child nodes. A tree holds a pointer to the root and traverses to whichever node an operation demands via their internal pointers. Maintaining the total order of the keys demands care when inserting and deleting keys (nodes). Insertion involves appending to the deepest node such that its key is either the minimum, the maximum, or in between its parent and grandparent. Deletion involves replacing the obsolete key with the next closest to its value. Depending on the key distribution, this may be as simple as deleting the node (where it has no children) or as complex as adopting the minimum of a child's subtree and replacing that node with its greater valued child.


These techniques can ensure much better performance than simpler structures. A bst can insert and delete values faster than a sorted array, as it needn't move aside every value greater than the target. (However, a bst cannot search as quickly, given the algorithm involves more steps and more complex lookups.) A bst can also insert, search, and delete faster than a sorted linked list. Here, though, the emphasis is on 'can.' A small collection of nodes leaves a bst's conservative operations lagging behind the naive linked list. Thus, most comparisons of operations assume at least ten thousand elements, so such fine grained differences are smoothed out.

Even then, it is possible to create a bst whose performance is inferior to a linked list, namely by inserting the values in sorted order. In that case, the tree is called 'degenerate': checks for a pointer in between the parent and grandparent on the other side always fail because each node has only one child. The optimal binary search tree, instead, is one whose keys have been inserted by selecting the center element of the sorted list and then recursing on the lesser and greater halves of the list. This produces a tree of minimum depth. Alternately, others have augmented binary search trees to balance their nodes, even if the values are inserted in sorted order.
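The median-first insertion order can be sketched standalone (the helper names here are inventions for this illustration, not this article's code). Emitting the middle element, then recursing on each half, yields an insertion order that produces a minimum-depth tree, where sorted order produces a degenerate one.

```python
def median_first_order(sorted_vals):
    """Center element first, then recurse on the lesser and greater halves."""
    if not sorted_vals:
        return []
    mid = len(sorted_vals) // 2
    return ([sorted_vals[mid]]
            + median_first_order(sorted_vals[:mid])
            + median_first_order(sorted_vals[mid + 1:]))

def depth_if_inserted(order):
    """Depth of the naive BST built by inserting 'order' sequentially."""
    root = None
    depth = 0
    for v in order:
        if root is None:
            root = {'val': v, 'less': None, 'more': None}
            depth = max(depth, 1)
            continue
        d, node = 1, root
        while True:
            side = 'less' if v < node['val'] else 'more'
            if node[side] is None:
                node[side] = {'val': v, 'less': None, 'more': None}
                depth = max(depth, d + 1)
                break
            node = node[side]
            d += 1
    return depth
```

Seven sorted values inserted directly give a depth-7 chain; the same values in median-first order give the minimum depth of 3.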

Nodes

Nodes will form the atoms of the tree molecule we will build. They will hold (and report) values and pointers to greater and lesser nodes. Further, we will prompt them for information about how many children they have and how their keys compare to those of another node.

class Node( object ) :
child get-set
key get-set
value comparisons
query children

Fields

Like the (doubly) linked list, a tree is made of nodes with two node pointers. Rather than fore and aft, these are less and more. Others - whose native languages predispose them - prefer to call them left and right. As the value comparison is more appropriate to the tree, we adopt the former naming scheme. For clarity, clients will access or mutate these members through a Node's methods only.

<<child get-set>>=
	def g_less( self ) :
		return self.less
	def g_more( self ) :
		return self.more
	def change_less( self, another ) :
		self.less = another
	def change_more( self, another ) :
		self.more = another

Further, a node holds the key or value. Key is a more appropriate term when the information provided to a node is more nuanced than the value used by the tree explicitly. For example, one could have an employee id number as the key for lookups, but also provide a boolean field for gender, fields for years worked, an enum for the employee's department, and so on. As such, the values are more typically wrapped in another object which would provide the id as the key, but surrender the whole struct to the client for unwrapping.

Were we working with a statically, rather than dynamically, typed language, the value field would need to be generic or templated somehow. Python abstracts this requirement, and we can provide it strings, integers, and floats. If anticipating peeling the key from a struct, one could change node to explicitly deal with that type of object or query isinstance(key, some_object) and then peel out the key value for comparison. In that case, comparisons should definitely call g_val() rather than look at the val so the state handling is encapsulated rather than spread to any method touching a node's value.
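As an illustration of such a wrapped key, a hypothetical record type and a key-peeling helper might look like the following (Employee, g_key, and key_of are inventions for this sketch, not part of this article's Node):

```python
class Employee(object):
    """A record whose id number serves as the lookup key."""
    def __init__(self, emp_id, name, department):
        self.emp_id = emp_id          # the key used for ordering
        self.name = name
        self.department = department

    def g_key(self):
        return self.emp_id

def key_of(val):
    """Peel the key out of a wrapped record; atomic values are their own key."""
    if isinstance(val, Employee):
        return val.g_key()
    return val
```

A node's comparison methods could call key_of() on both values, keeping the isinstance() handling in one place as the paragraph above suggests.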

<<key get-set>>=
	def g_val( self ) :
		return self.val

	def change_val( self, new_key ) :
		self.val = new_key

When creating a new node, we can assume that we will always know the value of the key. We do not set children, on the other hand, until one is inserted beneath a particular node. So, the constructor presupposes that the left and right arguments will be blank and supplies None. This particular implementation would not suffer from omitting these arguments at all. But, it is possible to handcraft a tree - perhaps for a test - by creating nodes and then supplying the intended links rather than invoking the tree's add() functionality.

	def __init__( self, key, left=None, right=None ) :
		self.val = key
		self.less = left
		self.more = right	

Queries

The core bst operations involve comparing two nodes' values. One could overload __cmp__() with a single function that provides a flag (-1, 0, 1) to indicate a node's relation to the other node (less, same, more; respectively). Here, we prefer rich comparison by overloading __lt__(), __eq__(), __gt__(). That way, when Python encounters node_fuji < node_barstow, it calls the appropriate method here.

<<value comparisons>>=
	def __eq__( self, another ) :
		return self.val == another.val

	def __gt__( self, another ) :
		return self.val > another.val

	def __lt__( self, another ) :
		return self.val < another.val
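These rich comparisons can be exercised with a minimal stand-in class (assembled here only for illustration, since this article's Node is spread across several chunks):

```python
class MiniNode(object):
    """A stand-in node defining only the rich comparisons from the chunk above."""
    def __init__(self, key):
        self.val = key
    def __eq__(self, another):
        return self.val == another.val
    def __gt__(self, another):
        return self.val > another.val
    def __lt__(self, another):
        return self.val < another.val

node_fuji = MiniNode('fuji')
node_barstow = MiniNode('barstow')
```

When Python evaluates `node_barstow < node_fuji`, it dispatches to `__lt__` and compares the wrapped values.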

Finally, a node will be called upon to report how many children it has. These provide suitable base cases for traversing the tree. Insertion will append to the appropriate empty side. Deletion will ask for the maximum (or min) of either subtree, which involves following only more links until a node reports more_is_empty( ).

<<query children>>=
	def am_leaf( self ) :
		return self.less is None and self.more is None

	def less_is_empty( self ) :
		return self.less is None

	def more_is_empty( self ) :
		return self.more is None

Binary Search Tree

Equipped with node, we are ready to create a binary search tree. Its interface will consist of the same four operations we expect for all collections: insertion, search, deletion, and enumeration. (Enumeration means returning a list of all the elements within the tree.) The typical advanced operations also ascribed to a bst are merge, split, enumerate values within a range, and find the kth element.

class Bst( object ) :
client add
add multiple
add single
client find
find node
perform delete
deletion cases
deletion utilities
advanced interfaces

Initialization

A binary search tree only needs a single member: a reference to the first node inserted, the root. Python's OO only permits a single class constructor, so __init__ must handle the various possible initialization scenarios. We account for three: no initial value - the client will add the rest later, a list of initial values, and a single initial value.

	def __init__( self, first=None ) :
		self.root = None
		if first is None :
			pass # an empty tree; the client will add values later
		elif isinstance( first, list ) :
			self._add_multiple( first )
		else :
			self.add( first )

Insertion

We accomplish insertion by wending a path down to a leaf that is slightly larger or smaller than the client's value and appending a new node. This preserves the total order, so that every node on the less side of a node is indeed less, and every node on the more side has a greater value. As per tradition, insertion will rely on recursion rather than loop-assisted iteration, though the body would be the same. We also provide the opportunity, at runtime or instantiation, for a client to provide a list of values. This is largely sugar, though, as the tree will simply insert them sequentially.

client add
add multiple
add single

This is the interface function for the client to insert a list or a single value into the tree. First, add() tests whether the client provided a list and uses our private method to handle it. Otherwise, the reaction depends upon whether the root is still null. In that case, we simply replace it with a new node. Otherwise, _add_leaf() will find a leaf to append a new node (here called insertion) with the client's value.

<<client add>>=
	def add( self, wanted ) :
		if isinstance( wanted, list ) :
			self._add_multiple( wanted )
		elif self.is_empty() :
			self.root = Node( wanted )
		else :
			insertion = Node( wanted )
			self._add_leaf( self.root, insertion )

Adding values from a list is as easy as iterating over the list with a foreach loop. However, it is not impossible that a client sent in a multi-dimensional list (perhaps a matrix), so we invoke add() again, just in case. Such are the concessions in a dynamically typed language.

<<add multiple>>=
	def _add_multiple( self, wanted_vals ) :
		for val in wanted_vals :
			self.add( val )

Adding a leaf finally accomplishes the goal. We will assume that duplicates are not meaningful and exit. Otherwise, the node in focus will either be greater or less than the new_node. In either case, should we have reached a leaf on the appropriate side, comparison adopts new_node as a child with change_less() (or more) and the task is done. Otherwise, we have visited an internal node and must examine whether its child (g_less() or g_more(), as appropriate) is a leaf yet.

<<add single>>=
	def _add_leaf( self, comparison, new_node ) : 
		'currently, not checking until I enter'
		if comparison == new_node :
			return # no duplicates
		elif comparison > new_node :
			if comparison.less_is_empty( ) :
				comparison.change_less( new_node )
			else :
				self._add_leaf( comparison.g_less( ), new_node )
		else :
			if comparison.more_is_empty( ) :
				comparison.change_more( new_node )
			else :
				self._add_leaf( comparison.g_more( ), new_node )
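The same descend-and-append walk, condensed into a standalone sketch over dict nodes so the resulting shape can be inspected directly (an illustration, not this article's Node-based code):

```python
def add_leaf(comparison, key):
    """Recursively descend to the appropriate empty side and append."""
    if key == comparison['val']:
        return                          # no duplicates, as decided above
    side = 'less' if key < comparison['val'] else 'more'
    if comparison[side] is None:
        comparison[side] = {'val': key, 'less': None, 'more': None}
    else:
        add_leaf(comparison[side], key)

root = {'val': 10, 'less': None, 'more': None}
for key in (4, 17, 4, 12):              # the second 4 is silently ignored
    add_leaf(root, key)
```

After these insertions, 4 sits on the root's less side, 17 on its more side, and 12 hangs off 17's less side.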

Tests I

There are a few cases to test for insertion: adding a new root to an empty tree and a child to a node. We add a redundant key, given the earlier assumption to ignore them. Finally, bst's add() will add list contents via _add_multiple(); we must check that works too. While we could test these with more nodes, the algorithms are indifferent to the tree height and, by induction, should work given these base cases.

A word on flag and, below, something_wrong. When a test fails, it will return something_wrong (True) and print a reason. These results are collected at the end to determine whether anything went wrong.

<<insert tests>>=
def should_insert( tree ) :
	flag = should_add_to_empty( tree )
	flag = should_add_to_root_leaf( tree ) or flag
	flag = should_add_multiple( tree ) or flag
	flag = should_not_add_copy( tree ) or flag
	return flag
T I empty tree
T I to root
T I to a child
T I multiple
T I redundant

As the first insertion is handled separately, we should test it separately from the rest. Failure here involves the root not adopting the intended value. If it does fail, this implementation reports the problem and then returns a something_wrong flag. It is also possible to test these conditions with assert statements; however, a failed assert will stop execution, keeping other bugs hidden until the earliest problem is resolved.

<<T I empty tree>>=
def should_add_to_empty( tree ) :
	if not tree.is_empty( ) :
		tree.erase( )
	root_v = 'aa'
	tree.add( root_v )
	if not tree.root.g_val() == root_v :
		print "I> didn't add val to empty tree"
		return something_wrong
	tree.erase( )
	return seems_okay

Once the root exists, we test the more common insertion: to the leaf of another node. If the root doesn't have children or has a child on the wrong side, we implemented _add_leaf() wrong. If the root does have one, but the value doesn't match what we sent, we signal failure.

<<T I to root>>=
def should_add_to_root_leaf( tree ) :
	root_v = -12
	new_val = 21
	tree.add( root_v )
	tree.add( new_val )
	if tree.root.am_leaf( ) :
		print "I> didn't add val to a node"
		return something_wrong
	elif tree.root.more_is_empty() :
		print "I> added new node to opposite side"
		return something_wrong
	elif tree.root.g_more( ).g_val() != new_val :
		print "I> added to right side, wrong value"
		return something_wrong
	return seems_okay

Add() extracts values from lists it's given, so we send it one. While it might be more explicit to test whether one of the nodes has a list, that node's value can't be equal to root_val or l_child, so there is no danger. It's also worth noting that all these tests check the tree's structure explicitly as the new organization is the point of nearly all the functionality. So, we will chain the pointer calls (g_less) to match the anticipated layout.

<<T I multiple>>=
def should_add_multiple( tree ) :
	tree.erase( )
	root_val = .5
	l_child = .25
	lis = [ root_val, l_child ]
	tree.add( lis )
	if not tree.root.g_val() == root_val :
		print "I> multiple didn't deposit first val"
		return something_wrong
	elif not tree.root.g_less().g_val() == l_child :
		print "I> multiple didn't add second val"
		return something_wrong
	return seems_okay

Finally, we confirm that redundant insertions fail, as per the earlier decision. Were root to receive a new child the first time, add_leaf() would have been inappropriately called. If, rather than ignoring subsequent insertions, we were recording times inserted, the conditions would check that the increments matched insertions. If we kept a linked list of redundant or tangentially distinct nodes, we could check the list's structure like we just checked the tree.

<<T I redundant>>=
def should_not_add_copy( tree ) :
	tree.erase( )
	root_v = 'do'
	child = 're'
	tree.add( root_v )
	tree.add( root_v )
	if not tree.root.am_leaf() :
		print "I> added a second copy of a val"
		return something_wrong
	tree.add( child )
	tree.add( child )
	if not tree.root.g_more().am_leaf() :
		print "I> added redundant node to interior"
		return something_wrong
	tree.add( root_v )
	root_okay = tree.root.less_is_empty()
	child_okay = tree.root.g_more().am_leaf()
	if not root_okay or not child_okay :
		print "I> added redundant on second try"
		return something_wrong
	return seems_okay

These tests represent all the cases add() might trip on. Provided no test complains, we can use add() without fear in later tests now.

Search

Search differs only slightly from the algorithm for insertion above: compare the client's key with that of the node in focus and descend through the less pointer if less, or through the more pointer if the key is greater. We will, unfortunately, reimplement this algorithm for each of the three main tasks because each reacts differently upon reaching the base case.

client find
find node

As with insertion, our interface function will test the easiest cases dealing with the root. Namely, is the tree empty? Yes - then, there are no values to find. Is the root's value the target? Yes - then, report success. Note, we will merely provide the client with a confirmation of a value's presence. Where the value is atomic - integers - the client possesses the value queried. Were this bst to be used for a map, the value would contain different information the client needs to reference. In that case, this function would extract the value and return it rather than a boolean.

<<client find>>=
	def confirm( self, desire ) :
		if self.is_empty() :
			return False
		elif self.root.g_val() == desire :
			return True
		else :
			fake = Node( desire )
			result = self._find( self.root, fake )
			if result is None :
				return False
			else : # found
				return True	

Because confirm() checks the client's query against the root's value, _find() is written to assume that the maybe served to this iteration has already been checked. Hence, we determine which child to pull into focus before testing it. If that node is null, the client's value is in between leaves, ie not in the tree. If the values match, we return that node. These together constitute the base case and are united in the lesser block. Otherwise, we must search farther down the tree. The next call of _find() will determine which child to focus on, so maybe is passed naively.

<<find node>>=
	def _find( self, maybe, unconfirmed ) :
		if maybe < unconfirmed :
			maybe = maybe.g_more()
			if maybe is None :
				return maybe # failure
			elif maybe == unconfirmed :
				return maybe # success
			else :
				return self._find( maybe, unconfirmed )
		else : # maybe > unconfirmed
			maybe = maybe.g_less()
			if maybe == unconfirmed or maybe is None :
				return maybe
			else :
				return self._find( maybe, unconfirmed )
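The walk _find() performs can also be written iteratively; this standalone sketch over dict nodes shows the same descend-by-comparison path (illustrative only, not this article's code):

```python
def contains(node, target):
    """Descend less/more by comparison until a match or a None pointer."""
    while node is not None:
        if target == node['val']:
            return True
        node = node['less'] if target < node['val'] else node['more']
    return False

def leaf(v):
    return {'val': v, 'less': None, 'more': None}

# hand-built tree:      8
#                      / \
#                     3   10
demo = {'val': 8, 'less': leaf(3), 'more': leaf(10)}
```

Following a None pointer means the target sits 'between leaves', exactly the failure base case described above.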

Test S

We test confirm()'s four relevant behaviors. Where the tree is empty, confirm should fail. If the client asks to confirm a value that is in the root, we should see a success. Otherwise, bst employs _find() rather than confirm(). _find() involves comparisons, rather than tree structure, so we can simply hand it a path with the three comparisons (greater, less, & equal) to validate it. (Val_yes is greater than root, less than 6, and is equal to the inserted val_yes.) Finally, we rule out false positives by asking for val_no, which we didn't insert.

<<search tests>>=
def should_confirm( tree ) :
	tree.erase( )
	root_v = 4
	val_yes = 5
	val_no = val_yes + val_yes
	bunch = [ root_v, 3, 6, val_yes]
	if tree.confirm( val_no ) :
		print "C> confirmed value when tree empty"
		return something_wrong
	tree.add( bunch )
	if not tree.confirm( root_v ) :
		print "C> didn't confirm root's value"
		return something_wrong
	elif not tree.confirm( val_yes ) :
		print "C> didn't find key in subtree"
		return something_wrong
	elif tree.confirm( val_no ) :
		print "C> tree thinks it contains an uninserted value"
		return something_wrong
	else :
		return seems_okay

Deletion

Deletion represents the most complex operation in a binary search tree. Its complexity stems from handling differently shaped trees when preserving the order of the keys. Our goal is to replace the unwanted value with its closest value. Where they exist, these are the maximum of a node's less subtree and the minimum of a node's more subtree. Those nodes will also need to be replaced. Traditionally, this is done by deleting the replacer. Here, we will handle each case explicitly instead.

perform delete
deletion cases
deletion utilities

As we have done so far, delete will handle the simplest cases first. Obviously, an empty tree has no values to delete, so we return in success. If the client no longer requires the root, that is handled as a special case below. Otherwise, delete() demands the parent pointer, in order to update its child pointer to a new value; _path_to_obsolete() provides this. As with the insert & confirm base cases, following a None pointer indicates the obsolete value is not within the tree, so deletion can declare success.

Otherwise, it is time to perform the deletion. This involves replacing the offending node with its closest successor or predecessor. The offending node may harbor several possible child configurations, so _perform_delete() will select the appropriate case to apply.

Strictly speaking, a stack is overkill for this deletion. The indicated node and its parent provide all the information we need. However, a stack may be useful if we were to augment our tree with another operation that requires parent pointers and might recurse upward (splay tree rotation, perhaps). The alternative involves giving each node a third pointer, to its parent. This is an instance of the space/time trade off; we favor spending time over space, given our undemanding use case.
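For contrast, the traditional delete-the-replacer formulation mentioned above can be sketched recursively over dict nodes (an illustration of the standard technique, not this article's explicit-case approach; the helper names are inventions):

```python
def insert(node, key):
    """Minimal recursive insert, returning the (possibly new) subtree root."""
    if node is None:
        return {'val': key, 'less': None, 'more': None}
    side = 'less' if key < node['val'] else 'more'
    node[side] = insert(node[side], key)
    return node

def delete(node, key):
    """Return the subtree rooted at node with key removed."""
    if node is None:
        return None
    if key < node['val']:
        node['less'] = delete(node['less'], key)
    elif key > node['val']:
        node['more'] = delete(node['more'], key)
    elif node['less'] is None:            # zero or one child: splice out
        return node['more']
    elif node['more'] is None:
        return node['less']
    else:                                 # two children: adopt the successor
        succ = node['more']               # minimum of the more subtree
        while succ['less'] is not None:
            succ = succ['less']
        node['val'] = succ['val']
        node['more'] = delete(node['more'], succ['val'])
    return node

root = None
for k in (8, 3, 10, 1, 6, 14, 4, 7):
    root = insert(root, k)
root = delete(root, 3)                    # interior node with two children
root = delete(root, 8)                    # the root itself, with two children
```

Deleting the replacer recurses at most once more, because a subtree's minimum can have no less child.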

<<perform delete>>=
	def delete( self, unwanted_val ) :
		if self.is_empty() :
			return # an empty tree has nothing to delete
		elif self.root.g_val() == unwanted_val :
			self._delete_root( )
		else :
			obsolete = Node( unwanted_val )
			ancestor_stack = [ ]
			ancestor_stack = self._path_to_obsolete( obsolete, self.root, ancestor_stack )
			if ancestor_stack[ -1 ] is None :
				return # obsolete value is not in the tree
			else :
				self._perform_delete( ancestor_stack )
select case

To keep _perform_delete() simple, we will establish some useful terms to refer to the obsolete node's surroundings. We need obsolete and parent, both of which come right off the ancestor_stack, in that order. We will update the parent's pointer with a new value (None or another node) and will send this information down. Otherwise, the four selection functions would test the same relation, which isn't DRY. Finally, as a convenience, o_less_Leaf names the flag indicating that obsolete's less side is empty.

<<prep delete>>=
o_less_Leaf = True
obsolete = ancestor_stack.pop( )
parent = ancestor_stack.pop( )
o_was_less = obsolete < parent

Thus equipped, _perform_delete() can select the appropriate case. A binary node can exist in four configurations: no children, one child on either its less or more side, or two children. Where the node is a leaf, only the parent's pointer must be updated. Failing that, if obsolete's less child is None, then we know it has a more child. The reverse is also true and can be accomplished with almost the same algorithm; in doing so, we provide or negate o_less_Leaf so it queries the correct pointer. Replacing a node with two children requires more care: the parent must adopt the node closest to obsolete's value, which may not be an immediate child, so that case is handled separately.

<<select case>>=
	def _perform_delete( self, ancestor_stack ) :
		prep delete
		if obsolete.am_leaf() :
			self._update_parent( parent, None, o_was_less )
		elif obsolete.less_is_empty() :
			self._replace_with_solo( parent, o_was_less, obsolete, o_less_Leaf )
		elif obsolete.more_is_empty() :
			self._replace_with_solo( parent, o_was_less, obsolete, not o_less_Leaf )
		else : # two children
			self._replace_with_two_children( parent, o_was_less, obsolete )

Now that we've seen how deletion works in general, we can return to the special case for the root. _perform_delete() demands a parent to update, so we supply a fake one whose value is the root's value doubled (for strings, concatenated with itself); this merely determines which side of fake receives the new root. If fake's less side is empty, the replacement sits in the more pointer (possibly None, which is valid when the root was a leaf); otherwise, it sits in less.

<<cleanup root>>=
	def _delete_root( self ) :
		fake = Node( self.root.g_val() + self.root.g_val() )
		fake_stack = [fake, self.root]
		self._perform_delete( fake_stack )
		if fake.less_is_empty() :
			self.root = fake.g_more()
		else :
			self.root = fake.g_less()

((Will put picture of cases here))

<<deletion cases>>=
update parent
single child replaces
resolve parent of two
cleanup root

The simplest configuration to delete is a node with no children. The only node to update in that case is its parent. The caller provides which side to null, and we do so. This parent update is actually useful for all three types of cases, so the function is abstracted to its current form, where the caller decides what to provide to the parent and on which side.

<<update parent>>=
	def _update_parent( self, parent, replacer, from_less ) :
		if from_less :
			parent.change_less( replacer )
		else :
			parent.change_more( replacer )

When the obsolete node has a single child, we can simply promote that child to the vacated position. o_less_Leaf indicates which of obsolete's sides is empty; we take hold of the other side's child, and the tree pinches off the redundant intermediary node.

<<single child replaces>>=
	def _replace_with_solo( self, parent, from_less, obsolete, o_less_Leaf ) :
		if o_less_Leaf :
			self._update_parent( parent, obsolete.g_more(), from_less )
		else : # more is empty
			self._update_parent( parent, obsolete.g_less(), from_less )

The final case involves choosing a node with the closest value to obsolete. On the less side, this is the maximum of the subtree, and the opposite for the more side. However, always deleting from the less subtree could eventually unbalance the tree and negatively impact all its operations. So we need to choose between the less and more replacements roughly equally. We could keep a bool & flip it for every deletion, but that's inelegant, as the whole tree would be concerned with a single case in deletion. One could query a randomizer for a bool, but this too is overkill. (Our emphasis is alternation, not randomness.) This implementation sides with a simple technique that may not work with nonnumeric data: if obsolete's value is even, 'resolve parent of two' will select from the less subtree. Otherwise, it uses the minimum of more's subtree.

We will see in more detail in a moment, but both min & max_and_parent() provide the replacer's parent as it, too, may need an update.

<<resolve parent of two>>=
	def _replace_with_two_children( self, parent, from_less, obsolete ) :
		left_replace = obsolete.g_val() % 2 == 0 # magic, or use rand_bool( )
		if left_replace :
			replacer, reps_parent = self._max_and_parent( obsolete.g_less(), obsolete )
			self._less_replace_2_kid( parent, from_less, obsolete, replacer, reps_parent )
		else : # right_replace
			replacer, reps_parent = self._min_and_parent( obsolete.g_more( ), obsolete )
			self._more_replace_2_kid( parent, from_less, obsolete, replacer, reps_parent )
del 2, less
del 2, more

((Enliteration forthcoming; code dump follows.))

<<del 2, less>>=
	def _less_replace_2_kid( self, parent, from_less, obsolete, replacer, reps_parent ) :
		if reps_parent is obsolete :
			replacer.change_more( obsolete.g_more() )
			o_less_Leaf = True
			self._replace_with_solo( parent, from_less, obsolete, not o_less_Leaf )
		else :
			self._update_parent( parent, replacer, from_less )
			replacer.change_more( obsolete.g_more() )
			reps_parent.change_more( replacer.g_less() )
			replacer.change_less( obsolete.g_less() )


<<del 2, more>>=
	def _more_replace_2_kid( self, parent, from_less, obsolete, replacer, reps_parent ) :
		if reps_parent is obsolete :
			replacer.change_less( obsolete.g_less() )
			o_less_Leaf = True
			self._replace_with_solo( parent, from_less, obsolete, o_less_Leaf )
		else :
			self._update_parent( parent, replacer, from_less )
			replacer.change_less( obsolete.g_less() )
			reps_parent.change_less( replacer.g_more() ) # adopt replacer's orphaned subtree
			replacer.change_more( obsolete.g_more() )

<<deletion utilities>>=
	def _path_to_obsolete( self, obsolete, focus, ancestorStack ) :
		ancestorStack.append( focus )
		if focus is None or focus == obsolete :
			return ancestorStack
		else :
			if focus < obsolete :
				return self._path_to_obsolete( obsolete, focus.g_more( ), ancestorStack )
			else :
				return self._path_to_obsolete( obsolete, focus.g_less( ), ancestorStack )

	def _max_and_parent( self, a_child, a_root ) :
		if a_child.more_is_empty() :
			return a_child, a_root
		else :
			return self._max_and_parent( a_child.g_more(), a_child )

	def _min_and_parent( self, a_child, a_root ) :
		if a_child.less_is_empty() :
			return a_child, a_root
		else :
			return self._min_and_parent( a_child.g_less(), a_child )

Test D

<<delete tests>>=
def should_delete( tree ) :
	print "D> deletion tests are still incomplete"
	flag = test_deletion_cases( tree )
	flag = test_deletion_utilities( tree ) or flag
	return flag

def test_deletion_cases( tree ) :
	flag = should_D_leaf_root( tree )
	flag = should_D_only_child( tree ) or flag
	flag = should_D_replace_w_sole( tree ) or flag
	flag = should_D_with_2( tree ) or flag
	#flag = ( tree ) or flag
	return flag

def should_D_leaf_root( tree ) :
	tree.erase( )
	val = 5
	tree.add( val )
	tree.delete( val )
	if not tree.is_empty() :
		print "D> solitary root didn't delete"
		return something_wrong
	else :
		return seems_okay

def should_D_only_child( tree ) :
	ro = 4
	l_ch = 3
	r_ch = 5
	tree.add( ro )
	tree.add( l_ch )
	tree.delete( l_ch )
	if not tree.root.am_leaf() :
		print "D> didn't cut solitary less child"
		return something_wrong
	tree.add( r_ch )
	tree.delete( r_ch )
	if not tree.root.am_leaf() :
		print "D> didn't cut solitary more child"
		return something_wrong
	return seems_okay

def should_D_replace_w_sole( tree ) :
	tree.erase( )
	mid_v = 'm'
	l_ch = 'd'
	r_ch = 'r'
	tree.add( [ mid_v, l_ch ] )
	tree.delete( mid_v )
	# from root
	if not tree.root.g_val() == l_ch :
		print "D> root not replaced by sole (less) child"
		return something_wrong
	# from interior
	tree.add( [ mid_v, r_ch ] )
	tree.delete( mid_v )
	if tree.root.g_more().g_val() != r_ch :
		print "D> interior not replaced by sole (more) child"
		return something_wrong
	else :
		return seems_okay

def should_D_with_2( tree ) : # fill
	print "D> doesn't test deleting with two children"
	return something_wrong

def test_deletion_utilities( tree ) : # verify
	flag = should_get_parent( tree )
	flag = should_get_min_max( tree ) or flag
	return flag

def should_get_parent( tree ) : # verify
	left_val = 3
	root_val = 5
	between_val = 6
	parent_val = 7
	outer_val = 8
	from_left = True
	tree.add( root_val )
	tree.add( left_val )
	tree.add( parent_val )
	tree.add( between_val )
	tree.add( outer_val )
	print "D> does not Test get_path()"
	return something_wrong

def should_get_min_max( tree ) : # verify
	left_val = 3
	root_val = 5
	between_val = 6
	parent_val = 7
	outer_val = 8
	from_left = True
	tree.add( root_val )
	tree.add( left_val )
	tree.add( parent_val )
	tree.add( between_val )
	tree.add( outer_val )
	print "D> does not test _min_and_parent() or _max_and_parent() yet"
	return something_wrong

Enumeration

Provide the client with a view of the tree, either as a list of values or on standard output.

	def g_all_elements( self ) :
		everything = []
		if self.root is None :
			return everything
		else :
			everything = self._traverse( self.root, everything )
			return everything

	def _traverse( self, focus, container ) :
		'in order, obviously'
		if focus is None :
			return container
		else :
			container = self._traverse( focus.g_less(), container )
			container.append( focus.g_val() )
			container = self._traverse( focus.g_more(), container )
			return container

	def is_empty( self ) :
		return self.root is None

	def erase( self ) :
		self.root = None
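The in-order traversal behind _traverse() can be demonstrated standalone over dict nodes (an illustrative sketch, with a minimal insert helper invented for it): visiting the less subtree, then the node, then the more subtree emits the keys in sorted order.

```python
def inorder(focus, container):
    """In-order walk, mirroring _traverse(): less, self, more."""
    if focus is None:
        return container
    inorder(focus['less'], container)
    container.append(focus['val'])      # Python lists use append, not push
    inorder(focus['more'], container)
    return container

def insert(node, key):
    if node is None:
        return {'val': key, 'less': None, 'more': None}
    side = 'less' if key < node['val'] else 'more'
    node[side] = insert(node[side], key)
    return node

root = None
for k in [4, 2, 1, 3, 7, 5, 8]:
    root = insert(root, k)
```

Whatever the insertion order, the enumeration comes back sorted, which the test below relies upon.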

Test E

<<enumeration tests>>=
def should_traverse( tree ) :
	tree.erase( )
	bunch = [ 4, 2, 1, 3, 7, 5, 8 ]
	tree.add( bunch )
	bundle = tree.g_all_elements()
	expected = sorted( bunch )
	for elem in range( len( bunch ) ) :
		if not bundle[ elem ] == expected[ elem ] :
			print "T> ret " + str( bundle[ elem ] ) + " is not supp " + str( expected[ elem ] )
			print bundle
			return something_wrong
	return seems_okay

Print

Enumerate, above, takes care of the traditional, simple printing algorithm: perform an in-order traversal, passing the depth down from the root, and tab over that many times. Here, we take the more interesting case of printing the tree at right angles.

	def pr_hV( self ) :
		if self.is_empty() :
			print ">--"
		else :
			print "v"
			self._pr_hV( [ self.root ] )

	def _pr_hV( self, pr_stack ) :
		vert = pr_stack.__len__() - 1
		if vert >= 0 :
			self._pr_hV_links( pr_stack, vert + 1 )
			self._pr_hV_links( pr_stack, vert )
			pr_stack = self._pr_hV_row( pr_stack )
			self._pr_hV( pr_stack )

	def _pr_hV_links( self, pr_stack, vert ) : # test this addition
		for ind in range( 0, vert ) :
			nn = pr_stack[ ind ]
			if nn is None :
				print "  ",
			else :
				print "| ",

	def _pr_hV_row( self, pr_stack ) :
		## python print \n suppression substitutes a space
		row_str = []
		focus = pr_stack.pop()
		while not focus is None :
			row_str.append( str( focus.g_val() ) )
			if not focus.less_is_empty() :
				pr_stack.append( focus.g_less() )
			elif not focus.more_is_empty() :
				pr_stack.append( None )
			if not focus.more_is_empty() :
				row_str.append( "-" )
			focus = focus.g_more()
		row_str = ''.join( row_str ) # turn list to string.
		print row_str
		#self._pr_node_stack( pr_stack )
		pr_stack = self._strip_placeholders( pr_stack )
		return pr_stack

	def _strip_placeholders( self, pr_stack ) :
		'ie None'
		focus = None
		while pr_stack.__len__() > 0 :
			focus = pr_stack.pop()
			if not focus is None :
				pr_stack.append( focus )
		return pr_stack

Advanced

Merge: test easy cases; otherwise, convert to linked lists, merge (like latter half of mergesort), kink into trees again using stack to hold subtrees.

Split: find node and parent, temp = node, parent.side_of_node() = node.g_less()

Enumerate range: find lower bound, use stack to iterate up & down until upper bound is found

Find Kth: deserves augmentation, otherwise use stack to iterate up & down.

Tests

test framework
node tests
insert tests
search tests
delete tests
enumeration tests
advanced tests

Test nodes, and all the cases within the standard api: insert, delete, confirm, enumerate.

<<test framework>>=
tree = Bst()
something_wrong = True
seems_okay = not something_wrong
status = seems_okay # at least initially
print "\n\t PROBLEMS"
status = node_should_work() or status
status = should_erase()  or status
status = should_insert( tree ) or status # propagates truth
status = should_confirm( tree ) or status
status = should_delete( tree ) or status
status = should_traverse( tree ) or status
if status == seems_okay :
	print "> Silent test run"
else :
	print "\n<> so go fix that"
#check_by_eye( tree )