CSC 207 Grinnell College Spring, 2012 Algorithms and Object-Oriented Design

# An Introduction to Trees in Java

## Summary

This reading introduces the concept of a tree data structure, describes a binary search tree as a specific type of tree, and considers how such a tree structure might be implemented in Java.

## Some Definitions

A general tree is defined recursively as follows:

• A null tree (or null list) is a tree.
• If I is a data element and if T1, T2, T3, ..., Tn are n trees, then

is also a tree.

### Terminology

I is called the root or root node, and each Ti is called a subtree of the tree. Nodes that have only null subtrees are called leaves or leaf nodes.

Since this definition is recursive, it may be applied multiple times to construct more complex trees, such as the one shown below:

In this example, e, k, l, m, n, h, o, p, and q are leaves (or trees with null subtrees); a is the root of the overall tree. f is the root of a tree with k and l as subtrees, etc. Similarly, b is the root of a tree with two subtrees, one containing e and one containing f, k, and l.

## Binary Search Trees

A binary search tree or BST is a special type of tree, in which

• each node has at most a left and a right subtree
• for each node, all values in the left subtree come before the value in the root, and
• for each node, all values in the right subtree come after the value in the root.

A schematic view of such a tree follows:

## Implementing Binary Search Trees in Java

The implementation of a binary search tree in Java follows a similar approach to the implementation of lists in either C or Java. First, we consider each data element in a tree to be an object of a TreeNode class. Then we define a BSTree class which combines the nodes into an overall structure.

### A TreeNode Class

For a binary search tree (or any binary tree, for that matter), each node will contain data, and each node will have a left and right field to designate the relevant subtrees — although either or both of the subtrees could be null. Thus, a TreeNode should have fields data, left, and right, together with constructors and methods to access and modify these fields.

For the most part, the code for TreeNode can be parallel to QueueNode from the lab on queues with singly-linked lists. However, binary search trees require that nodes conform to a special ordering — all nodes in a left subtree must have data smaller than in a root node, while all nodes in a right subtree must have larger (or equal) data. In order to maintain this property throughout the tree, we must be able to test the relative ordering of data. That is, the data stored in a TreeNode must implement the Comparable interface.

Class TreeNode shows a typical declaration for such a node class.

### A Binary Search Tree Class

Before implementing any class, we must identify the appropriate operations. Some common operations for a Binary Search Tree class include:

• construct a tree,
• retrieve an entry, given the name,
• update an entry,
• print all entries (alphabetically by name), and
• remove an entry.

In the following discussion, we outline an algorithm for several of these operations, sometimes describing an iterative approach and sometimes using recursiion; often either iteration or recursion could be used for these operations.

For our implementation, we also need an image of how a BSTree class will package tree information. Again, we use the discussion of lists as a model — considering various tree nodes to point to their subtrees within the BSTree. Thus, the BSTree class itself only need specify the initial node or root. Thus, the first binary search tree identified in this lab might be annotated as follows as a BSTree object:

## A BSTree Class, with Commentary

Class BSTree implements a binary search tree, including several methods already identified.

The reader is encouraged to review various elements of the code in conjunction with the following commentary on the various methods.

#### Construction

As with lists, an initial binary search tree will be empty. This may be implemented by setting the root field to null.

#### lookup

Searching in a binary search tree proceeds downward from the root. Following the recursive patterns that are familiar from Scheme, we identify the following cases:

• Base Case 1: The desired item is found and can be returned.
• Base Case 2: A null is encountered at the bottom of the tree. With nothing more to search, the search can report the item is not found.
• Recursive Case 1: The desired item comes after the item in the current node, and the search should continue in the right subtree.
• Recursive Case 2: The desired item comes before the item in the current node, and the search should continue in the left subtree.

#### print

A recursive algorithm starts at the root and applies the following steps for each node:

• print the node's left subtree,
• print the data at the node,
• print the node's right subtree.

The above sequence of events is called an in-order traversal of a tree. Similarly, printing the data at the node, then the left subtree, and then the right subtree is called a pre-order traversal. Printing the data in the node last gives a post-order traversal.

#### insert

For variety, we use an iterative approach to insert entries into a tree. To start, a simple base case checks whether the tree is empty. If so, a new node is generated, initialized with the relevant data, and identified as the new root.

To understand the rest of the insertion process, consider the insertion of the number 153 into the following tree (which repeats the tree given above).

To insert 153, we start at the top of the tree. Checking that 153 comes after the value in the root (123), we advance to the right subtree. We now check that 153 comes before 285, so we advance to the left subtree of 285. Again, we compare 153 with value 185, and realize we should move left. Here, however, we discover there are no further nodes. Thus, we create a new node, place 153 in that node, and identify the new node as the left child of 185's node.

In the code for insert, the variable ptr keeps track of where we are as we work downward node-by-node from the root. To test if an item (newEntry for the BSTree code) comes before the value in a node, we use the following sequence:

• ptr identifies a given node, so
• ptr.getData() identifies an entry at that node. Also,
• newEntry specifies the new data to add, so
• newEntry.compareTo is a method to compare data. Thus,
• newEntry.compareTo(ptr.getData()) uses the comparison method to compare the information for the new newEntry with the information for the node ptr.getData().

## Example: A Campus Directory

To illustrate the use of a binary tree structure, consider the application of developing a campus directory that contains both students and faculty. In this context, directory information might be stored in an Entry class with two subclasses, Student and Faculty:

This class hierarchy allows students and faculty to be intermixed within a campus directory. For efficiency in searching the directory for a name, the entries will be stored in a binary search tree (class BSTree). This requires the Entry class to implement Comparable. A simple implementation of these classes is reasonably straight forward:

With this framework, class DirectoryBST illustrates how a binary search tree might be used (with class BSTree) to store and retrieve information with a campus directory.

In this application, note that a user would search a directory by name, and a successful search would return additional information about the individual.

This document is available on the World Wide Web as

```http://www.walker.cs.grinnell.edu/courses/207.sp12/readings/reading-intro-trees.shtml
```