Our experience with Scheme through the first part of this course indicates that lists can be a particularly helpful structure for storing and processing a wide variety of data. Lists provide a very flexible context for processing, and unlike arrays, lists do not require us to specify a maximum size when we create them. In this lab, we consider how lists might be implemented in Java, following an approach analogous to the lists we studied previously in Scheme.
Since Scheme incorporates lists as a built-in data structure, Scheme supplies several built-in operations (e.g., cons, car, and cdr) for list processing, and we could program in Scheme using lists without considering the mechanics of how lists are implemented. To process lists in Java, however, we first need to consider some internal details of Scheme lists. We then can translate these details to Java.
Lists in Scheme are implemented based on a graphical model, called a
box-and-pointer representation. The basic idea is to use a
rectangle - divided in half - to represent the result of the cons.
From the first half of the rectangle, we draw an arrow to the head of a
list; from the second half of the rectangle, we draw an arrow to the rest
of the list. For example, (cons 'a '()) would be represented as
follows:
Here, the line to a indicates that this is the head of the list.
The diagonal line through the right half of the rectangle indicates that
nothing comes later in this list. Since (cons 'a '()) gives
the list (a), this diagram represents (a) as well.
Now consider the list (cons 'b '(a)) or (b a). Here, we
draw another rectangle, where the head points to b and the tail
points to the representation of (a) that we already have seen.
The result is:
Similarly, the list (d c b a) is constructed as
(cons 'd (cons 'c (cons 'b (cons 'a '())))) and would be drawn as
follows:
A similar approach may be used for lists that have components that
are sublists. For example, consider the list ((a) b (c d) e).
This is a list with four components, so at the top level we will need
four rectangles, just as in the previous example for the list (d c b
a). Here, however, the first component designates the list
(a), which itself involves the box-and-pointer diagram already
discussed. Similarly, the list (c d) has two boxes for its two
components (just as we discussed for (b a) earlier). The
resulting diagram follows:
Throughout these diagrams, the null list is represented by a null pointer
or line. Thus, the list containing the null list, ( ( ) ) - that
is (cons '() '()) - is represented by a rectangle with lines
through both halves:
((x) y z) (x (y z)) ((a) b (c ()))
In computer science, this box-and-pointer representation is a primary mechanism used to describe lists -- not just in Scheme, but in most contexts. An implementation of lists in Java, therefore, typically utilizes this graphical perspective and involves two main elements:
A generic ListNode class contains two elements -- one for data and the other to identify the next node on a list. To use a ListNode effectively, a programmer must be able to access and change each of these elements. In considering what to store in a ListNode, one choice is to designate Java's Object class, as this is the base of the class hierarchy in Java. All Java classes ultimately are derived from Object, so any object could be stored as an Object. Program ListNode.java contains the definition of this type of generic ListNode.
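ListNode.java itself is not reproduced in this handout. As a rough sketch of what such a generic node might look like, the following assumes accessors named getData and getNext (the names used later in this lab) and a two-argument constructor taking the data and the next node. The setter names setData and setNext and the ListNodeDemo class are hypothetical, and the course's actual file may differ.

```java
// A sketch of a generic list node; the course's ListNode.java may differ.
class ListNode {
    private Object data;    // the item stored in this node (first half of the box)
    private ListNode next;  // the following node, or null at the end of the list

    // Assumed two-argument constructor: store the data and link to the next node.
    ListNode(Object data, ListNode next) {
        this.data = data;
        this.next = next;
    }

    Object getData() { return data; }
    ListNode getNext() { return next; }

    // Hypothetical setter names; the lab only says both fields must be changeable.
    void setData(Object data) { this.data = data; }
    void setNext(ListNode next) { this.next = next; }
}

class ListNodeDemo {
    public static void main(String[] args) {
        // Build the two-node chain for (b a): b's node points to a's node.
        ListNode a = new ListNode("a", null);
        ListNode b = new ListNode("b", a);
        System.out.println(b.getData() + " -> " + b.getNext().getData());  // b -> a
    }
}
```

Because the data field has type Object, the same node class can hold strings, numbers wrapped as objects, or even other lists.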
While the ListNode class provides appropriate support to build lists that implement box-and-pointer representations, the design of a List class may combine these ListNodes in one of several ways. For example, here are some basic issues:
In Scheme, list processing follows a functional perspective: procedures such as cons, car, cdr, null?, and length take lists as parameters and return new lists or data. In contrast, Java often is motivated by an object-oriented perspective: messages are passed to objects, and objects may change themselves in response to these messages. In designing a List class in Java,
When modifying a list, perhaps with cons or cdr, should there be a connection between the old list and the new one; that is,
To clarify this second point, consider the Scheme statements:
(define x '(b c)) (define y (cons 'a x))
Thus, we can consider y to be the list (a b c). The following figure shows two possible structures that could result:
In the first option, the nodes of the original list are copied, and thus are explicitly distinct from those in the new list. In the second option, a new node is created for the cons node, a new value is added within that node, but the next part of that list refers to the old list.
For the example shown, both options may be reasonable. However, suppose we now change the second element of x from c to d, using Scheme's set-cdr! operation. (I.e., the new x is the list (b d).) In the first option, y is not affected, while in the second option y becomes the list (a b d). Since y refers to x when nodes are reused, any change to x also affects y. This may or may not be the desired result of changing x.
Overall, both approaches have some advantages in certain cases. However, the first approach requires considerable overhead to duplicate nodes. Furthermore, in a purely functional context, lists are not altered during processing. In such a context, we could reuse nodes without fear of altering other lists unexpectedly, as old lists are never changed. Both of these observations explain why Scheme uses the second approach -- reusing nodes when possible.
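The difference between copying and reusing nodes can be tried directly in Java. The sketch below is only an illustration: the class SharingDemo, its nested Node class, and the helpers consCopy, consShare, copy, and show are all hypothetical names, not part of the lab's files. The change to x is made here by overwriting the data field of x's second node.

```java
class SharingDemo {
    // Minimal node used only for this illustration (hypothetical, not ListNode.java).
    static class Node {
        Object data;
        Node next;
        Node(Object data, Node next) { this.data = data; this.next = next; }
    }

    // Option 1: copy every node of the old list before adding the new head.
    static Node consCopy(Object item, Node old) {
        return new Node(item, copy(old));
    }

    // Recursively duplicate a chain of nodes.
    static Node copy(Node n) {
        return (n == null) ? null : new Node(n.data, copy(n.next));
    }

    // Option 2: create one new node whose next field refers to the old list.
    static Node consShare(Object item, Node old) {
        return new Node(item, old);
    }

    // Format a chain of nodes in Scheme style, e.g. (a b c).
    static String show(Node n) {
        StringBuilder sb = new StringBuilder("(");
        for (Node p = n; p != null; p = p.next) {
            sb.append(p.data);
            if (p.next != null) sb.append(" ");
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        Node x = new Node("b", new Node("c", null));  // x is (b c)
        Node yCopy = consCopy("a", x);                 // (a b c), separate nodes
        Node yShare = consShare("a", x);               // (a b c), tail is x itself

        x.next.data = "d";                             // x becomes (b d)

        System.out.println(show(x));       // (b d)
        System.out.println(show(yCopy));   // (a b c)
        System.out.println(show(yShare));  // (a b d)
    }
}
```

Only the shared version of y sees the change, which matches the behavior described above for the second option.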
To illustrate how to implement Scheme-style list operations in Java, we consider writing a ListLikeScheme class for the Scheme operations cons, car, cdr, null?, and length. (Actually, we add a size operation as well. Both length and size return the number of items in the list. Thus, size is redundant, but we use length and size to illustrate both iterative and recursive processing, respectively.)
In implementing Scheme lists, we consider a list class as a framework which contains both a sequence of ListNodes and the desired operations. We use a variable first to identify the first node in the list. As in the box-and-pointer representation, each subsequent node then identifies the next node in the list. For example, in this model, the list (a b c) is represented as follows:
As this picture suggests, the list (a b c) is represented by an object of class ListLikeScheme. Inside the object, a first variable identifies the node for element a, and each subsequent node specifies its successor.
This picture also suggests how several methods should work:
The constructor should create a new object, and initialize first to null.
The car method should return the contents of the data field in the ListNode designated by first. This information is obtainable as first.getData() -- applying the getData method for that node. However, if the list contains no elements (i.e. is a null list), car might throw a NullPointerException.
The isNull method could test whether or not first is null.
For the cdr method, we must choose whether to alter the current list or return a new one. For illustration, we create and return a new list. Thus, we declare a local variable temp, initialize it with the ListLikeScheme constructor, and set its first variable to the next element after the first node. The relevant syntax is
    ListLikeScheme temp = new ListLikeScheme ();
    temp.first = first.getNext();
    return temp;
For variety, we implement cons to alter the current list object rather than to return a new one. In this form, cons must create a new ListNode, put the data in it, update the next field and set the variable first to this new node. Since much of this work can be done with a ListNode constructor, the actual code is quite short:
first = new ListNode (newData, rest.first);
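To see how the constructor, isNull, car, cdr, and cons fit together, here is a minimal sketch under stated assumptions. The class name SchemeListSketch and its nested Node class are hypothetical stand-ins for ListLikeScheme and ListNode, and the real ListLikeScheme.java may be organized differently. In particular, this sketch assumes cons takes only the new data and links the new node to the list's own first node; the fragment above writes rest.first, and where rest comes from is not shown in this handout.

```java
class SchemeListSketch {
    // Minimal node for this sketch (hypothetical stand-in for ListNode).
    static class Node {
        private final Object data;
        private final Node next;
        Node(Object data, Node next) { this.data = data; this.next = next; }
        Object getData() { return data; }
        Node getNext() { return next; }
    }

    private Node first;   // first node of the list, or null for the null list

    // Constructor: a new list starts out empty.
    SchemeListSketch() { first = null; }

    // isNull: true exactly when the list has no nodes.
    boolean isNull() { return first == null; }

    // car: data in the first node; throws NullPointerException on the null list.
    Object car() { return first.getData(); }

    // cdr: returns a NEW list object that shares all nodes after the first.
    SchemeListSketch cdr() {
        SchemeListSketch temp = new SchemeListSketch();
        temp.first = first.getNext();
        return temp;
    }

    // cons (assumed signature): ALTERS this list by placing a new node
    // in front of the current first node.
    void cons(Object newData) { first = new Node(newData, first); }

    public static void main(String[] args) {
        SchemeListSketch list = new SchemeListSketch();
        list.cons("c");
        list.cons("b");
        list.cons("a");                                        // list is now (a b c)
        System.out.println(list.car());                        // a
        System.out.println(list.cdr().car());                  // b
        System.out.println(list.cdr().cdr().cdr().isNull());   // true
    }
}
```

Note the asymmetry this design produces: cdr leaves the original list untouched, while cons changes the object it is called on.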
The size method parallels the recursive list processing we enjoyed in Scheme. Using a husk-and-kernel approach, the real work is done in a kernel which keeps recursing and adding one until a null object is identified. Movement from one ListNode to the next is accomplished by utilizing the getNext method from the current node. As in Scheme, the husk just starts the process -- in this case, calling the kernel with the first node and a count of 0.
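The husk-and-kernel pattern for size might be sketched as follows. This is an illustration only: the names SizeSketch, Node, and sizeKernel are hypothetical, and the constructor that wraps an existing chain of nodes is added just to keep the example short.

```java
class SizeSketch {
    // Minimal node for this sketch (hypothetical stand-in for ListNode).
    static class Node {
        private final Object data;
        private final Node next;
        Node(Object data, Node next) { this.data = data; this.next = next; }
        Node getNext() { return next; }
    }

    private final Node first;   // first node, or null for the null list

    // Hypothetical constructor that wraps an existing chain of nodes.
    SizeSketch(Node first) { this.first = first; }

    // Husk: start the kernel with the first node and a count of 0.
    int size() { return sizeKernel(first, 0); }

    // Kernel: add one for each node, recursing until a null object is found.
    private static int sizeKernel(Node current, int count) {
        if (current == null) {
            return count;
        }
        return sizeKernel(current.getNext(), count + 1);
    }

    public static void main(String[] args) {
        Node abc = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(new SizeSketch(abc).size());   // 3
        System.out.println(new SizeSketch(null).size());  // 0
    }
}
```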
The length method parallels an iterative movement through the nodes, counting as it goes. Here, a separate variable ptr is used to keep track of how far processing has proceeded through the sequence of nodes. As in the recursive version, processing continues until a null object is found.
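An iterative count along these lines might look like the sketch below. The variable name ptr comes from the description above; the names LengthSketch and Node, and the constructor that wraps an existing chain of nodes, are hypothetical and used only for illustration.

```java
class LengthSketch {
    // Minimal node for this sketch (hypothetical stand-in for ListNode).
    static class Node {
        private final Object data;
        private final Node next;
        Node(Object data, Node next) { this.data = data; this.next = next; }
        Node getNext() { return next; }
    }

    private final Node first;   // first node, or null for the null list

    // Hypothetical constructor that wraps an existing chain of nodes.
    LengthSketch(Node first) { this.first = first; }

    // Walk the nodes with ptr, counting each one, until a null object is found.
    int length() {
        int count = 0;
        Node ptr = first;
        while (ptr != null) {
            count++;
            ptr = ptr.getNext();
        }
        return count;
    }

    public static void main(String[] args) {
        Node abc = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(new LengthSketch(abc).length());   // 3
        System.out.println(new LengthSketch(null).length());  // 0
    }
}
```

The loop variable ptr plays the role that the kernel's node parameter plays in the recursive version.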
Finally, our ListLikeScheme class includes a toString method which formats a string of the data elements just as might be seen in Scheme. As noted in the lab on generalization, Java often utilizes a toString method and inheritance as part of printing.
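One way such a toString method could be written is sketched below. The exact output format of the course's version is an assumption here (elements separated by single spaces inside parentheses), and the names ToStringSketch and Node, along with the constructor that wraps an existing chain of nodes, are hypothetical.

```java
class ToStringSketch {
    // Minimal node for this sketch (hypothetical stand-in for ListNode).
    static class Node {
        private final Object data;
        private final Node next;
        Node(Object data, Node next) { this.data = data; this.next = next; }
        Object getData() { return data; }
        Node getNext() { return next; }
    }

    private final Node first;   // first node, or null for the null list

    // Hypothetical constructor that wraps an existing chain of nodes.
    ToStringSketch(Node first) { this.first = first; }

    // Overrides Object's toString, so println uses this format automatically.
    @Override
    public String toString() {
        StringBuilder result = new StringBuilder("(");
        for (Node ptr = first; ptr != null; ptr = ptr.getNext()) {
            result.append(ptr.getData());   // relies on each element's own toString
            if (ptr.getNext() != null) {
                result.append(" ");         // separator between elements only
            }
        }
        return result.append(")").toString();
    }

    public static void main(String[] args) {
        Node abc = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(new ToStringSketch(abc));    // (a b c)
        System.out.println(new ToStringSketch(null));   // ()
    }
}
```

Because println calls toString on any object it is given, no separate print method is needed.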
These definitions and methods combine to give program ListLikeScheme.java.
Copy ListNode.java and ListLikeScheme.java to your account. Compile and run them to check that they work. Also, read through the code. Ask questions about any sections you do not understand.
Add a method second to ListLikeScheme which returns the second element in a list (if present), or which throws a NullPointerException if the list is null or has only one element. (In this and subsequent exercises, you will want to add lines to main to test your methods.)
Add a method count which counts how many times a specified object appears on a list. count should have one parameter -- the object to be counted.
Note: If item is the object to be counted, you can compare item with any other element with the test item.equals(element). Here you will need to compare item with the data in successive nodes.
Add a method last which returns the last item on the list.
Add a method toReverseString which returns a string representing the elements of the list -- ordered from last to first. In writing this code, you should modify the iterative method toString.
Add a method toReverseStringAlt which uses recursion rather than iteration to return the same result as toReverseString.
This document is available on the World Wide Web as
http://www.walker.cs.grinnell.edu/courses/153.sp00/lab-lists-in-java.html
created May 4, 2000 by Henry M. Walker
last revised May 4, 2000
Henry Walker (walker@cs.grinnell.edu)