CSC 153 | Grinnell College | Spring, 2005 |
Computer Science Fundamentals | ||
Laboratory Exercise | ||
This laboratory exercises provides some practice with the insertion sort and Quicksort and outlines a mechanism for determining experimentally the efficiency of these sorting algorithms.
Many applications require the maintenance of ordered data. In Java (and Scheme), such data often are structured in an array (or vector). The reading for this lab provides considerable detail on both the insertion sort and quicksort. You should read that material before proceeding further in this lab.
The insertion sort assumes that the first part of an array is ordered and then adds successive items to this array segment until the entire array is sorted.
A main helper method adds add one item to an ordered array segment. More specifically, suppose items A[0], ..., A[k-1] are ordered in array A. The following code inserts an item into the array, so that items A[0], ..., A[k] become ordered:
int i = k-1;
while ((i >= 0) && a[i] > item){
a[i+1] = a[i];
i--;
}
a[i+1] = item;
Consider the array
Review the steps in the above code for this array segment with
k-1 = 7 and with item having the value 17. Explain
what happens and why.
Using this basic insertion step, an array A can be sorted iteratively. This outline gives rise the the following code, called an insertion sort.
public static void insertionSort (int [] a) {
// method to sort using the insertion sort
for (int k = 1; k < a.length; k++) {
int item = a[k];
int i = k-1;
while ((i >= 0) && a[i] > item){
a[i+1] = a[i];
i--;
}
a[i+1] = item;
}
}
insertionSort
, so that the array is sorted in descending
order rather than ascending order.
The Quicksort, originally devised by C. A. R. Hoare [Computing Journal, Volume 5 (1962), pages 10-15], proceeds by applying an initial step that partitions the array into two pieces. The algorithm then is applied recursively to each of the two pieces. This approach of dividing processing into pieces and applying recursion to the two pieces turns out to be extremely general and powerful. Such an approach is called a divide and conquer algorithm.
The following code implements this basic step:
int left=first+1;
int right=last;
int temp;
while (right >= left) {
// search left to find small array item
while ((right >= left) && (a[first] <= a[right]))
right--;
// search right to find large array item
while ((right >= left) && (a[first] >= a[left]))
left++;
// swap large left item and small right item, if needed
if (right > left) {
temp = a[left];
a[left] = a[right];
a[right] = temp;
}
}
// put a[first] in its place
temp = a[first];
a[first] = a[right];
a[right] = temp;
Additional details are given in the reading for this lab.
Given the above code to place the first element of an array segment appropriately and rearrange small and large items, the full array may be sorted by applying the algorithm recursively to the first part and the last part of the array. The base case of the recursion arises if there are no further elements in an array segment to sort.
This gives rise the the following code, called a quicksort.
public static void quicksort (int [] a) {
// method to sort using the quicksort
quicksortKernel (a, 0, a.length-1);
}
private static void quicksortKernel (int [] a, int first, int last) {
int left=first+1;
int right=last;
int temp;
while (right >= left) {
// search left to find small array item
while ((right >= left) && (a[first] <= a[right]))
right--;
// search right to find large array item
while ((right >= left) && (a[first] >= a[left]))
left++;
// swap large left item and small right item, if needed
if (right > left) {
temp = a[left];
a[left] = a[right];
a[right] = temp;
}
}
// put a[first] in its place
temp = a[first];
a[first] = a[right];
a[right] = temp;
// recursively apply algorithm to a[first]..a[right-1]
// and a[right+1]..a[last], provided these segments contain >= 2 items
if (first < right-1)
quicksortKernel (a, first, right-1);
if (right+1 < last)
quicksortKernel (a, right+1, last);
}
This code illustrates the husk-and-kernel programming style which arose frequently earlier in CSC 153.
quicksort
and/or quicksortKernel
, so that
the array is sorted in descending order rather than ascending order.
The quicksort is called a divide-and-conquer algorithm, because the first step normally divides the array into two pieces and the approach is applied recursively to each piece.
Suppose this code is applied an array containing n randomly ordered data. For the most part, we might expect that the quicksort's divide-and-conquer strategy will divide the array into two pieces, each of size n/2, after the first main step. Applying the algorithm to each half, in turn, will divide the array further -- roughly 4 pieces, each of size n/4. Continuing a third time, we would expect to get about 8 pieces, each of size n/8.
Applying this process i times, would would expect to get about 2i pieces, each of size n/2i.
This process continues until each array piece just has 1 element in it, so 1 = n/2i or 2i = n or i = log2 n. Thus, the total number of main steps for this algorithms should be about log2 n. For each main step, we must examine the various array elements to move the relevant first items of each array segment into their correct places, and this requires us to examine roughly n items.
Altogether, for random data, this suggests that the quicksort requires about log2 n main steps with n operations per main step. Combining these results, quicksort on random data has O(n log2 n).
The Java programming language provides a mechanism to compute run times of various programs. Specifically, System.currentTimeMillis() returns the amount of time (in milliseconds) since January 1, 1970. This method allows us to use the computer to time various algorithms.
long start_time;
long end_time;
(Here, long allows these variables to store larger integers than
the typical int integer data type.) Next, place the following
lines just before and after the call to each sorting algorithm in the body
of main.
start_time = System.currentTimeMillis();
// the call insertionSort (c); goes here
end_time = System.currentTimeMillis();
Finally, place the following print statements after the printing of the
c, d, and e arrays.
out.println ("start: " + start_time);
out.println ("end: " + end_time);
out.println ("elapsed: " + (end_time - start_time));
Now, compile and run the program, and explain what happens.
In your test, the elapsed time likely was shown as 0, since the time to sort a dozen items is smaller than a millisecond -- the accuracy of the clock. To get measurements that better illustrate efficiency issues, we must ask the machine to sort a larger data set. Rather than typing many data elements into the program during execution, we could generate test data as follows:
line = in.readLine(); c[i] = new Integer(line).intValue();by c[i] = i.
line = in.readLine(); c[i] = new Integer(line).intValue();by c[i] = -i.
line = in.readLine(); c[i] = new Integer(line).intValue();by (int)(Math.random()*10000)
Data Size | Ascending Data | Random Data | Descending Data |
---|---|---|---|
1000 | |||
2000 | |||
4000 | |||
8000 | |||
16000 |
This document is available on the World Wide Web as
http://www.walker.cs.grinnell.edu/courses/153.sp05/labs/lab-sorting.shtml
created 24 April 2001 last revised 28 April 2006 |
![]() ![]() |
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu. |