CSC 153: Computer Science Fundamentals Grinnell College Spring, 2005
 
Laboratory Exercise Reading
 

Lists in Java

Summary

Our experience with Scheme through the first part of this course indicates that lists can be a particularly helpful structure for the storage and processing of a wide variety of data. Lists provide a very flexible context for processing, and unlike arrays, we do not need to specify a maximum size of a list when we create it. This reading discusses how lists might be implemented in Java, following an approach analogous to the lists we studied previously in Scheme.

Since Scheme incorporates lists as a built-in data structure, Scheme supplies several built-in operations (e.g., cons, car, and cdr) for list processing, and we could program in Scheme using lists without considering mechanics of how lists were implemented. To process lists in Java, however, we first need to consider some internal details of Scheme lists. We then can translate these details to Java.

Background: Lists in Scheme

The implementation of lists in Scheme is usually described with a graphical model called the box-and-pointer representation. The basic idea is to use a rectangle, divided in half, to represent the result of a cons. From the first half of the rectangle, we draw an arrow to the head of the list; from the second half, we draw an arrow to the rest of the list. For example, (cons 'a '()) would be represented as follows:
[Figure: the list (a)]
Here, the line to a indicates that this is the head of the list. The diagonal line through the right half of the rectangle indicates that nothing comes later in this list. Since (cons 'a '()) gives the list (a), this diagram represents (a) as well.

Now consider the list (cons 'b '(a)), that is, (b a). Here, we draw another rectangle, where the head points to b and the tail points to the representation of (a) that we have already seen. The result is:
[Figure: the list (b a)]
Similarly, the list (d c b a) is constructed as


     (cons 'd (cons 'c (cons 'b (cons 'a '())))) 
and would be drawn as follows:
[Figure: the list (d c b a)]
A similar approach may be used for lists whose components are themselves lists. For example, consider the list ((a) b (c d) e). This is a list with four components, so at the top level we need four rectangles, just as in the previous example for the list (d c b a). Here, however, the first component is the list (a), which has the box-and-pointer diagram already discussed. Similarly, the third component, (c d), has two boxes for its two components (just as we discussed for (b a) earlier). The resulting diagram follows:
[Figure: the list ((a) b (c d) e)]
Throughout these diagrams, the null list is represented by a null pointer, drawn as a diagonal line. Thus, the list containing the null list, (()), that is, (cons '() '()), is represented by a rectangle with lines through both halves:
[Figure: the list containing the null list]
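
Looking ahead to Java, these diagrams can be mirrored by a small object with two fields. The following is only a sketch: the class name Cell, its fields head and tail, and the helper show are illustrative assumptions, not names from the course materials. A null reference plays the role of the diagonal line, and a head that refers to another Cell models a sublist.

```java
// Hypothetical two-field cell mirroring one rectangle of a box-and-pointer diagram.
class Cell {
    Object head;   // left half: the element (may itself be a Cell, or null for '())
    Cell tail;     // right half: the rest of the list, or null if nothing follows

    Cell(Object head, Cell tail) {
        this.head = head;
        this.tail = tail;
    }

    // Render a chain of cells in Scheme notation, e.g. ((a) b (c d) e).
    static String show(Cell list) {
        StringBuilder sb = new StringBuilder("(");
        for (Cell c = list; c != null; c = c.tail) {
            if (c != list) sb.append(' ');
            if (c.head == null) sb.append("()");
            else if (c.head instanceof Cell) sb.append(show((Cell) c.head));
            else sb.append(c.head);
        }
        return sb.append(')').toString();
    }
}

class BoxAndPointerDemo {
    public static void main(String[] args) {
        Cell a = new Cell("a", null);                        // (a)
        Cell cd = new Cell("c", new Cell("d", null));        // (c d)
        Cell nested = new Cell(a, new Cell("b", new Cell(cd, new Cell("e", null))));
        System.out.println(Cell.show(nested));               // ((a) b (c d) e)
        System.out.println(Cell.show(new Cell(null, null))); // (())
    }
}
```

Each call to the Cell constructor corresponds to one cons, so the nesting of constructor calls matches the nesting of cons calls in the Scheme expressions above.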

Representing a Box-and-Pointer in Java

In computer science, this box-and-pointer representation is a primary mechanism used to describe lists -- not just in Scheme, but in most contexts. An implementation of lists in Java, therefore, typically follows this graphical perspective and involves two main elements: a node class for the individual boxes, and a list class that combines nodes into a list.

A generic ListNode class contains two fields: one for data and the other to identify the next node in the list. To use a ListNode effectively, a programmer must be able to access and change each of these fields. In considering what to store in a ListNode, one choice is to declare the data field with Java's Object class, as this is the base of the class hierarchy in Java. All Java classes ultimately are derived from Object, so any object can be stored as an Object. Program ~walker/java/examples/lists/ListNode.java contains the definition of this type of generic ListNode.
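
The description above might translate into a class along the following lines. This is a minimal sketch written from the description in this reading, not a copy of the ListNode.java file named above; the constructor and the method names getData, setData, getNext, and setNext are assumptions.

```java
// Sketch of a generic list node: one field for data, one for the next node.
// Accessor and mutator names are assumptions, not taken from the course's file.
class ListNode {
    private Object data;     // any object can be stored, since every class derives from Object
    private ListNode next;   // the following node, or null at the end of the list

    ListNode(Object data, ListNode next) {
        this.data = data;
        this.next = next;
    }

    Object getData() { return data; }
    void setData(Object data) { this.data = data; }
    ListNode getNext() { return next; }
    void setNext(ListNode next) { this.next = next; }
}

class ListNodeDemo {
    public static void main(String[] args) {
        // Build the chain for (d c b a), innermost node first, as with nested cons calls.
        ListNode list = new ListNode("d",
                new ListNode("c", new ListNode("b", new ListNode("a", null))));
        for (ListNode n = list; n != null; n = n.getNext()) {
            System.out.print(n.getData() + " ");   // prints: d c b a
        }
        System.out.println();
    }
}
```

Because the data field has type Object, a caller who needs the original type must cast the value returned by getData, for example (String) n.getData().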

Some Designs for a List Class

While the ListNode class provides appropriate support to build lists that implement box-and-pointer representations, the design of a List class may combine these ListNodes in one of several ways, and some basic design issues arise.

To clarify one such issue, namely whether adding an element to a list copies the existing nodes or reuses them, consider the Scheme statements:


(define x '(b c))
(define y (cons 'a x))

Thus, we can consider y to be the list (a b c). The following figure shows two possible structures that could result:

[Figure: two alternatives]

In the first option, the nodes of the original list are copied, and thus are explicitly distinct from those in the new list. In the second option, cons creates a single new node that holds the new value, and the next field of that node refers to the existing list.

For the example shown, both options may be reasonable. However, suppose we now change x so that its second element is d rather than c, using Scheme's set-cdr! operation to replace the rest of x after b; the new x is the list (b d). In the first option, y is not affected, while in the second option y becomes the list (a b d). Since y refers to the nodes of x when nodes are reused, any change to x also affects y. This may or may not be the desired result of changing x.
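
The difference between the two options can be reproduced in a few lines of Java. The sketch below is illustrative only: the class Node with its public fields, and the helpers consCopy, consShare, copy, and show, are hypothetical names introduced here rather than part of the course code.

```java
// Hypothetical node with public fields, kept minimal for the comparison.
class Node {
    Object data;
    Node next;
    Node(Object data, Node next) { this.data = data; this.next = next; }
}

class SharingDemo {
    // Option 1: duplicate every node of rest before attaching the new head.
    static Node consCopy(Object item, Node rest) {
        return new Node(item, copy(rest));
    }

    static Node copy(Node list) {
        return (list == null) ? null : new Node(list.data, copy(list.next));
    }

    // Option 2: create one new node whose next field refers to the existing list.
    static Node consShare(Object item, Node rest) {
        return new Node(item, rest);
    }

    static String show(Node list) {
        StringBuilder sb = new StringBuilder("(");
        for (Node n = list; n != null; n = n.next) {
            if (n != list) sb.append(' ');
            sb.append(n.data);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Node x = new Node("b", new Node("c", null));   // x is (b c)
        Node yCopied = consCopy("a", x);
        Node yShared = consShare("a", x);

        x.next = new Node("d", null);                  // like (set-cdr! x '(d)): x is now (b d)

        System.out.println(show(yCopied));             // (a b c)
        System.out.println(show(yShared));             // (a b d)
    }
}
```

Note that consCopy visits every node of the existing list, while consShare does a fixed amount of work regardless of the list's length.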

Overall, both approaches have advantages in certain cases. However, the first approach requires considerable overhead to duplicate nodes. Furthermore, in a purely functional context, lists are not altered during processing. In such a context, we can reuse nodes without fear of altering other lists unexpectedly, as old lists are never changed. Together, these observations explain why Scheme uses the second approach -- reusing nodes when possible.

Implementing Scheme List Operations in Java

To illustrate how to implement Scheme-style list operations in Java, we consider writing a ListLikeScheme class for the Scheme operations cons, car, cdr, null?, and length. (Actually, we add a size operation as well. Both length and size return the number of items in the list. Thus, size is redundant, but we use length and size to illustrate both iterative and recursive processing, respectively.)
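
The contrast between the two counting styles can be sketched independently of the full class. The names below (Link, Counting, and the static methods length and size that take the first node as a parameter) are assumptions for illustration; the course's ListLikeScheme.java may organize this differently.

```java
// Hypothetical minimal node, used only to compare the two counting styles.
class Link {
    Object data;
    Link next;
    Link(Object data, Link next) { this.data = data; this.next = next; }
}

class Counting {
    // Iterative: walk the chain, incrementing a counter at each node.
    static int length(Link first) {
        int count = 0;
        for (Link n = first; n != null; n = n.next) {
            count++;
        }
        return count;
    }

    // Recursive: an empty chain has size 0; otherwise 1 plus the size of the rest.
    static int size(Link first) {
        return (first == null) ? 0 : 1 + size(first.next);
    }

    public static void main(String[] args) {
        Link abc = new Link("a", new Link("b", new Link("c", null)));
        System.out.println(length(abc));   // 3
        System.out.println(size(abc));     // 3
        System.out.println(size(null));    // 0
    }
}
```

The recursive version mirrors the usual Scheme definition, but each pending addition occupies a stack frame in Java, so the iterative version is the safer choice for very long lists.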

In implementing Scheme lists, we consider a list class as a framework which contains both a sequence of ListNodes and the desired operations. We use a variable first to identify the first node in the list. As in the box-and-pointer representation, each subsequent node then identifies the next node in the list. For example, in this model, the list (a b c) is represented as follows:

[Figure: modeling the list (a b c)]

As this picture suggests, the list (a b c) is represented by an object of class ListLikeScheme. Inside the object, a first variable identifies the node for element a, and each subsequent node specifies its successor.

This picture also suggests how several methods should work:

These definitions and methods combine to give program ~walker/java/examples/lists/ListLikeScheme.java.
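
For orientation, a class of this shape could be sketched as follows. This is not the contents of the ListLikeScheme.java file named above: the class name SchemeStyleList, the nested Node class, and the method name isNull (standing in for Scheme's null?, which is not a legal Java identifier) are all assumptions. The sketch follows the node-reusing design discussed earlier and leaves out the counting operations.

```java
import java.util.NoSuchElementException;

// Hypothetical Scheme-style list: an object holding a reference to its first node.
class SchemeStyleList {
    private static class Node {
        final Object data;
        final Node next;
        Node(Object data, Node next) { this.data = data; this.next = next; }
    }

    private final Node first;   // null represents the empty list

    SchemeStyleList() { this.first = null; }
    private SchemeStyleList(Node first) { this.first = first; }

    boolean isNull() { return first == null; }

    // cons: a new list whose first node holds item and whose rest reuses this list's nodes.
    SchemeStyleList cons(Object item) {
        return new SchemeStyleList(new Node(item, first));
    }

    // car: the data stored in the first node.
    Object car() {
        if (first == null) throw new NoSuchElementException("car of empty list");
        return first.data;
    }

    // cdr: a list object whose first node is this list's second node.
    SchemeStyleList cdr() {
        if (first == null) throw new NoSuchElementException("cdr of empty list");
        return new SchemeStyleList(first.next);
    }

    public static void main(String[] args) {
        SchemeStyleList x = new SchemeStyleList().cons("c").cons("b");   // (b c)
        SchemeStyleList y = x.cons("a");                                 // (a b c)
        System.out.println(y.car());                        // a
        System.out.println(y.cdr().car());                  // b
        System.out.println(y.cdr().cdr().cdr().isNull());   // true
        System.out.println(x.car());                        // b (x is unchanged)
    }
}
```

Making the fields final means no list can be altered after it is built, which is what makes sharing nodes between x and y safe in this sketch.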


This document is available on the World Wide Web as

http://www.walker.cs.grinnell.edu/courses/153.sp05/readings/reading-lists-in-java.shtml

created May 4, 2000
last revised March 24, 2005
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.