CSC 207  Grinnell College  Spring, 2012 
Algorithms and ObjectOriented Design  
This laboratory exercises provides some practice with the insertion sort and quicksort and outlines a mechanism for determining experimentally the efficiency of these sorting algorithms.
Many applications require the maintenance of ordered data. In Java (and Scheme and C), such data often are structured in an array (or vector). The reading for this lab provides considerable detail on both the insertion sort and quicksort. You should read that material before proceeding further in this lab.
The insertion sort assumes that the first part of an array is ordered and then adds successive items to this array segment until the entire array is sorted.
A main helper method adds add one item to an ordered array segment. More specifically, suppose items A[0], ..., A[k1] are ordered in array A. The following code inserts an item into the array, so that items A[0], ..., A[k] become ordered:
int i = k1;
while ((i >= 0) && a[i] > item){
a[i+1] = a[i];
i;
}
a[i+1] = item;
Consider the array
Review the steps in the above code for this array segment with
k1 = 7 and with item having the value 17. Explain
what happens and why.
Using this basic insertion step, an array A can be sorted iteratively. This outline gives rise the the following code, called an insertion sort.
public static void insertionSort (int [] a) {
// method to sort using the insertion sort
for (int k = 1; k < a.length; k++) {
int item = a[k];
int i = k1;
while ((i >= 0) && a[i] > item){
a[i+1] = a[i];
i;
}
a[i+1] = item;
}
}
insertionSort
, so that the array is sorted in descending
order rather than ascending order.
The Quicksort, originally devised by C. A. R. Hoare [Computing Journal, Volume 5 (1962), pages 1015], proceeds by applying an initial step that partitions the array into two pieces. The algorithm then is applied recursively to each of the two pieces. This approach of dividing processing into pieces and applying recursion to the two pieces turns out to be extremely general and powerful. Such an approach is called a divide and conquer algorithm.
The reading for this lab identifies two main approaches for applying this divideandconquer approach to sorting.
The first approach is motivated by the following pictorial loop invariant:
This loop invariant motivates the following code to implement the basic step of the quicksort:
int left=first+1;
int right=last;
int temp;
while (right >= left) {
// search left to find small array item
while ((right >= left) && (a[first] <= a[right]))
right;
// search right to find large array item
while ((right >= left) && (a[first] >= a[left]))
left++;
// swap large left item and small right item, if needed
if (right > left) {
temp = a[left];
a[left] = a[right];
a[right] = temp;
}
}
// put a[first] in its place
temp = a[first];
a[first] = a[right];
a[right] = temp;
Additional details are given in the reading for this lab.
Given the above code to place the first element of an array segment appropriately and rearrange small and large items, the full array may be sorted by applying the algorithm recursively to the first part and the last part of the array. The base case of the recursion arises if there are no further elements in an array segment to sort.
This gives rise the the following code, called a quicksort.
public static void quicksort (int [] a) { // method to sort using the quicksort quicksortKernel (a, 0, a.length1); } private static void quicksortKernel (int [] a, int first, int last) { int left=first+1; int right=last; int temp; while (right >= left) { // search left to find small array item while ((right >= left) && (a[first] <= a[right])) right; // search right to find large array item while ((right >= left) && (a[first] >= a[left])) left++; // swap large left item and small right item, if needed if (right > left) { temp = a[left]; a[left] = a[right]; a[right] = temp; } } // put a[first] in its place temp = a[first]; a[first] = a[right]; a[right] = temp; // recursively apply algorithm to a[first]..a[right1] // and a[right+1]..a[last], provided these segments contain >= 2 items if (first < right1) quicksortKernel (a, first, right1); if (right+1 < last) quicksortKernel (a, right+1, last); }
This code illustrates the huskandkernel programming style which arose frequently earlier in CSC 151.
quicksort
and/or quicksortKernel
, so that
the array is sorted in descending order rather than ascending order.
The second implementation of the quicksort is motivated by the following pictorial loop invariant.
This loop invariant motivates the following code to implement the basic step of the quicksort:
// progress through array, // moving small elements to a[first+1] .. a[left1] // and moving large elements to a[left] .. a[right1] while (right <= last) { if (a[first] < a[right]) right++; else { // swap a[left] and a[right] temp = a[right]; a[right] = a[left]; a[left] = temp; left++; right++; } } // put a[first] in its place temp = a[first]; a[first] = a[left1]; a[left1] = temp;
Review the code segment and explain why the "then" and "else" clauses of the if statement in the loop maintain the loop invariant.
Explain why the swap at the end of the code swaps a[first] with a[left1] rather than with a[left].
Explain why the loop guard for this approach is right <= last
Explain why the recursive calls involve left2 and left rather than left1 and left+1 that are used in Approach 1.
Check your understanding of this approach to the quicksort by repeating steps 810 for this alternative approach.
Program SortShell4.java provides a shell for testing this alternative quicksort. Run it with several sets of test data.
The quicksort is called a divideandconquer algorithm, because the first step normally divides the array into two pieces and the approach is applied recursively to each piece.
Suppose this code is applied an array containing n randomly ordered data. For the most part, we might expect that the quicksort's divideandconquer strategy will divide the array into two pieces, each of size n/2, after the first main step. Applying the algorithm to each half, in turn, will divide the array further  roughly 4 pieces, each of size n/4. Continuing a third time, we would expect to get about 8 pieces, each of size n/8.
Applying this process i times, would would expect to get about 2^{i} pieces, each of size n/2^{i}.
This process continues until each array piece just has 1 element in it, so 1 = n/2^{i} or 2^{i} = n or i = log_{2} n. Thus, the total number of main steps for this algorithms should be about log_{2} n. For each main step, we must examine the various array elements to move the relevant first items of each array segment into their correct places, and this requires us to examine roughly n items.
Altogether, for random data, this suggests that the quicksort requires about log_{2} n main steps with n operations per main step. Combining these results, quicksort on random data has O(n log_{2} n).
Identify circumstances under which the best case and the worst case analyses could occur. What is the order of each sorting algorithm in the best case and in the worst case? (While class discussion may help answer this question for the insertion sort, and the discussion provides an answer for the quicksort, you still need to tabulate those answers, and you will need to formulate the answer for the refined quicksort.)
Suppose each algorithm requires t milliseconds to sort N items. Using the order analysis you just gave, how long would you expect the algorithm would take to support 2N items. Briefly explain your answer.
The Java programming language provides a mechanism to compute run times of various programs. Specifically, System.currentTimeMillis() returns the amount of time (in milliseconds) since January 1, 1970. This method allows us to use the computer to time various algorithms.
long start_time;
long end_time;
(Here, long allows these variables to store larger integers than
the typical int integer data type.) Next, place the following
lines just before and after the call to each sorting algorithm in the body
of main.
start_time = System.currentTimeMillis();
// the call insertionSort (c); goes here
end_time = System.currentTimeMillis();
Finally, place the following print statements after the printing of the
c, d, and e arrays.
out.println ("start: " + start_time);
out.println ("end: " + end_time);
out.println ("elapsed: " + (end_time  start_time));
Now, compile and run the program, and explain what happens.
In your test, the elapsed time likely was shown as 0, since the time to sort a dozen items is smaller than a millisecond  the accuracy of the clock. To get measurements that better illustrate efficiency issues, we must ask the machine to sort a larger data set. Rather than typing many data elements into the program during execution, we could generate test data as follows:
line = in.readLine(); c[i] = new Integer(line).intValue();by c[i] = i.
line = in.readLine(); c[i] = new Integer(line).intValue();by c[i] = i.
line = in.readLine(); c[i] = new Integer(line).intValue();by (int)(Math.random()*10000)
Data Size  Ascending Data  Random Data  Descending Data 

10000  
20000  
40000  
80000  
160000 
Summarize the results in the table in a few sentences, and relate the results obtained to the order analysis you did earlier.
Modify SortShell5.java to use the alternative implementation, and run the code for a few cases that you timed in step 26. Is there any noticeable difference in time between the two approaches with the alternative loop invariants?
The insertion sort has the characteristic that processing times may vary dramatically. That is, it can be extremely efficient for some types of data, but its efficiency decreases dramatically for some data sets. Would you describe the refined quicksort as having this characteristic as well? Briefly explain.
This document is available on the World Wide Web as
http://www.walker.cs.grinnell.edu/courses/207.sp12/labs/labsorting.shtml
created 24 April 2001 revised 28 April 2006 revised 1113 March 2012 

For more information, please contact Henry M. Walker at walker@cs.grinnell.edu. 