CSC 153: Computer Science Fundamentals Grinnell College Spring, 2005
 
Laboratory Exercise Reading
 

More Binary Search Trees

Summary

This laboratory builds on the previous lab on trees to provide more experience with binary search trees. Work in this lab involves expanding program ~walker/java/examples/trees/BSTree.java that formed the basis for the previous lab.

Computing the Height of a Tree

Several terms describe characteristics of nodes and paths within a tree:

For example, consider the following tree:

a binary search tree

In this tree, the path from the root (containing value 44) to the node containing value 29 has length 4.
Similarly, the length of the path from the nodes between values 54 and 56 is 2.

Also, the node at level 0 contains value 44;
the nodes at level 1 contain values 8 and 56;
the nodes at level 2 contain values 3, 23, 51, and 63;
the nodes at level 3 contain values 16, 33, 46, 54, and 72; and
the nodes at level 4 contain values 29, 49, 53, 55, 68, and 75.

Finally, since the nodes farthest away from the root are at level 4, the tree has height 4.

Determining if a Tree is Height Balanced

The following discussion is taken, with minor editing, from Computer Science 2: Principles of Software Engineering, Data Types, and Algorithms by Henry M. Walker [Scott, Foresman and Company, 1989]. This material is used with permission of the copyright holder.

While many tree algorithms work for binary search trees of any shape, the efficiency of tree algorithms often requires that the trees are reasonably compact or balanced. The following definition provides one perspective on the notion of a balanced tree.

Applying this definition to the following trees,

examples of binary search trees

Tree C is not balanced, because the tree has height 3, but 'q' is a leaf at level 1. On the other hand, Tree D is balanced, because all leaves occur at levels 2 and 3 and all nodes at levels 0 and 1 have two children.

While completely balanced trees are particularly compact, it may take considerable processing to maintain such balance in a tree as nodes are inserted and deleted. Thus, completely balanced trees may be of more theoretical than practical interest. However, it turns out that another notion of balance can be maintained with reasonable processing (although the details must be left to a more advanced course).

Applying this definition to Trees A and B in the earlier section of this lab on Computing the Height of a Tree, Tree A is not height-balanced, because the node containing the value 8 has a left subtree of height 0, while its right subtree has height 2. In the same tree, the node with value 63 has a similar problem, although all other nodes have the desired property. Since at least one node is unbalanced, this first tree is not height balanced.

In contrast, Tree B (in the following diagram) is height balanced.

another binary search tree

Deletion of a Node

Finally, in our discussion of binary search trees, we consider how to delete a node from a tree. This tasks sometimes is a bit tricky, because the resulting tree must be a binary tree (with no more than two children for any node), and because it must have an appropriate ordering of data to remain a search tree.

The work required for removing a node from a tree depends upon what children that node might have. To determine the alternatives, consider again Tree B from earlier in this lab:

another binary search tree

The removal of a node from a tree involves four cases:

We now examine each of these cases in turn.

Removal of a leaf

If the node to be deleted is a leaf, then it can be removed without any further effect on the tree. The resulting tree may or may not be balanced or height balanced, but the new tree will still be a valid binary search tree. For example, the following diagram shows Tree B with both nodes a and q removed.

Tree B with nodes a and q removed

Removal of a node with no left child (but with right children)

When a node has a no left child, then the right child may be promoted to the position of the removed node. For example, in the original Tree B, when removing node l, the subtree beginning with node m may take the place of l.

Removal of a node with no right child (but with left children)

Similarly, when a node has no right child, the the left subtree may be promoted. For example, in the original Tree B, when removing node h, the subtree beginning with node g may move upwards.

The following diagram shows the original Tree B with both nodes 'h' and 'l' removed.

Tree B with nodes a and q removed

Removal of a node with both left and right children

In the original Tree B, suppose we wish to remove the root, node k. We cannot just remove node k, since it connects two subtrees. Also, we cannot just move the left or right subtree upward to the root, because we must include the other subtree in the final structure as well. Instead, we first consider what values might replace 'k' in the root. In order to retain the ordering of nodes within a search tree, there are only two choices:

To find either of these candidates for promotion, we follow a similar strategy:

While moving either of these candidates to the root maintains the tree structure, Tree B shows that further work might be needed if a candidate still has a non-empty subtree. In particular, if we promote node l to the root, then we must re-attach m (and any subtree of m) in its place. These steps for promoting l in Tree B are shown below:

Tree B with node k removed and node l promoted

As this illustrates, removal of a node with two children involves three steps:

  1. Finding the smallest value in the right subtree (or analogously, the largest value in the left subtree),
  2. Promoting that value to replace the node to be removed, and
  3. Replace the promoted node further down in the tree by its right subtree (or analogously, the left subtree).

As a technical note, a programmer should observe that changing a tree requires changing a pointer in the parent node (or changing the root pointer). Thus, in working with the removal of nodes, code typically must contain a variable to keep track of a parent node, while other variables keep track of nodes to be deleted and candidates for promotion.


This document is available on the World Wide Web as

    http://www.walker.cs.grinnell.edu/courses/153.sp05/readings/reading-more-BST-trees.shtml

created April 21, 2002
last revised March 24, 2005
Valid HTML 4.01! Valid CSS!
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.