CSC 153: Computer Science Fundamentals  Grinnell College  Spring, 2005 
Laboratory Exercise Reading  
This laboratory builds on the previous lab on trees to provide more experience with binary search trees. Work in this lab involves expanding program ~walker/java/examples/trees/BSTree.java that formed the basis for the previous lab.
Several terms describe characteristics of nodes and paths within a tree:
For example, consider the following tree:
In this tree, the path from the root (containing value 44) to the node
containing value 29 has length 4.
Similarly, the length of the path from the nodes between values 54 and 56
is 2.
Also, the node at level 0 contains value 44;
the nodes at level 1 contain values 8 and 56;
the nodes at level 2 contain values 3, 23, 51, and 63;
the nodes at level 3 contain values 16, 33, 46, 54, and 72; and
the nodes at level 4 contain values 29, 49, 53, 55, 68, and 75.
Finally, since the nodes farthest away from the root are at level 4, the tree has height 4.
The following discussion is taken, with minor editing, from Computer Science 2: Principles of Software Engineering, Data Types, and Algorithms by Henry M. Walker [Scott, Foresman and Company, 1989]. This material is used with permission of the copyright holder.
While many tree algorithms work for binary search trees of any shape, the efficiency of tree algorithms often requires that the trees are reasonably compact or balanced. The following definition provides one perspective on the notion of a balanced tree.
A tree of height h is said to be balanced or completely balanced if all leaves occur at level h or h1, and if all nodes at levels lower than h1 have two children.
Applying this definition to the following trees,
Tree C is not balanced, because the tree has height 3, but 'q' is a leaf at level 1. On the other hand, Tree D is balanced, because all leaves occur at levels 2 and 3 and all nodes at levels 0 and 1 have two children.
While completely balanced trees are particularly compact, it may take considerable processing to maintain such balance in a tree as nodes are inserted and deleted. Thus, completely balanced trees may be of more theoretical than practical interest. However, it turns out that another notion of balance can be maintained with reasonable processing (although the details must be left to a more advanced course).
A tree is said to be heightbalanced if, for each node in the tree, the height of the left subtree differs from the height of the right subtree by no more than 1.
Applying this definition to Trees A and B in the earlier section of this lab on Computing the Height of a Tree, Tree A is not heightbalanced, because the node containing the value 8 has a left subtree of height 0, while its right subtree has height 2. In the same tree, the node with value 63 has a similar problem, although all other nodes have the desired property. Since at least one node is unbalanced, this first tree is not height balanced.
In contrast, Tree B (in the following diagram) is height balanced.
Finally, in our discussion of binary search trees, we consider how to delete a node from a tree. This tasks sometimes is a bit tricky, because the resulting tree must be a binary tree (with no more than two children for any node), and because it must have an appropriate ordering of data to remain a search tree.
The work required for removing a node from a tree depends upon what children that node might have. To determine the alternatives, consider again Tree B from earlier in this lab:
The removal of a node from a tree involves four cases:
We now examine each of these cases in turn.
If the node to be deleted is a leaf, then it can be removed without any further effect on the tree. The resulting tree may or may not be balanced or height balanced, but the new tree will still be a valid binary search tree. For example, the following diagram shows Tree B with both nodes a and q removed.
When a node has a no left child, then the right child may be promoted to the position of the removed node. For example, in the original Tree B, when removing node l, the subtree beginning with node m may take the place of l.
Similarly, when a node has no right child, the the left subtree may be promoted. For example, in the original Tree B, when removing node h, the subtree beginning with node g may move upwards.
The following diagram shows the original Tree B with both nodes 'h' and 'l' removed.
In the original Tree B, suppose we wish to remove the root, node k. We cannot just remove node k, since it connects two subtrees. Also, we cannot just move the left or right subtree upward to the root, because we must include the other subtree in the final structure as well. Instead, we first consider what values might replace 'k' in the root. In order to retain the ordering of nodes within a search tree, there are only two choices:
To find either of these candidates for promotion, we follow a similar strategy:
While moving either of these candidates to the root maintains the tree structure, Tree B shows that further work might be needed if a candidate still has a nonempty subtree. In particular, if we promote node l to the root, then we must reattach m (and any subtree of m) in its place. These steps for promoting l in Tree B are shown below:
As this illustrates, removal of a node with two children involves three steps:
As a technical note, a programmer should observe that changing a tree requires changing a pointer in the parent node (or changing the root pointer). Thus, in working with the removal of nodes, code typically must contain a variable to keep track of a parent node, while other variables keep track of nodes to be deleted and candidates for promotion.
This document is available on the World Wide Web as
http://www.walker.cs.grinnell.edu/courses/153.sp05/readings/readingmoreBSTtrees.shtml
created April 21, 2002 last revised March 24, 2005 

For more information, please contact Henry M. Walker at walker@cs.grinnell.edu. 