|CSC 105||Grinnell College||Spring, 2005|
|An Algorithmic and Social Overview of Computer Science|
Previously, you have considered the efficiency of sorting algorithms -- programs to search a collection of data for a specified item. Our work proceeded in three main steps:
We now consider another common computing task: placing a collection of data in order -- in this case, in ascending order. Reordering of data as necessary, so the result is in ascending order, is called sorting. Here, we consider three sorting algorithms:
Each of these algorithms have been outlined in class.
The insertion sort has been described in class. Write a paragraph or two explanation of how an insertion sort works.
A quicksort is an effective, but more complex, sorting algorithm. Developed by C. A. R. Hoare in 1962, the quicksort proceeds by trying to place a selected array element in its correct position, with items located before the selected element smaller than it and items located after being larger. With this basic step, the process then is repeated on the smaller first and last halves of the array.
While details require some care, the basic idea allows processing to divide the array in half with each main step -- in a way analogous to a binary search. The work involved with each main step is significantly greater for the quicksort than for the binary search. However, the approach of dividing the array in half repeated often yields considerable efficiencies.
When analyzing sorting algorithms, it often happens that the analysis of efficiency depends considerably on the nature of the data. For example, if the original data set already is almost ordered, a sorting algorithm may behave rather differently than if the data set originally contains random data or is ordered in the reverse direction. For this reason, the analysis for sorting algorithms often considers separate cases, depending on the original nature of the data.
Consider the insertion sort:
The program sortTest performs both an insertion sort and quicksort for three types of data sets: data initially already in ascending order, data in random order, and data initially in descending order. In each case, the user specifies the number of data items to be sorted. The program then sorts the three data sets and records the time required for this work. As with our searching experiments, due to the speed of the computer and the limitations of the clocking mechanism available, we repeat each experiment several times -- in this case 500 rather than the 60,000 times done in the searching experiment. As with searching, this means that times are magnified, so we can easily see differences.
The mechanics of running these search experiments are similar to those for the searching experiments:
Log onto a computer in MathLAN, and open a terminal window.
In the terminal window, type the command:
The program itself can be run with the command
The program will ask you to enter the size of the sequence to be searched. Suggested values might be selected from the range from 500 to 10,000, although you can choose smaller or larger sizes for the array.
As noted for the experiments on searching, after you run the above command the first time, you can use the upward-arrow key at your keyboard to retrieve the same command again. After hitting the upward-arrow key to get the desired command, hit Return or Enter to run the program again.
In this part of the lab, you are to gather experimental data regarding the times for sorting various size arrays using both the insertion sort and quicksort. In running your experiments, you should record separately the results for the different types of data sets.
Run sortTest for a variety of array sizes between 500 and 10,000, recording your results in each case in a spreadsheet.
Use the spreadsheet to plot the times for various array sizes. The horizontal axis should indicate the size of the array, and the vertical axis should indicate time. Construct separate graphs for sorting data initially in ascending order, in random order, and in descending order. You should conduct sufficient experiments, so that a fairly consistent pattern emerges.
Describe (in words) the nature of the graphs you have observed.
Examine the graphs to determine how the efficiency of the two sorting algorithms varies with different types of initial data.
In each case, explain your answer.
The permutation sort generates and checks all permutations of a data set until it finds one that is ordered. If the permutation sort is lucky, it may happen upon an ordered arrangement quickly. If the algorithm is unlucky, the permutation sort may try all permutations of the data set before finding the ordered one.
If a data set has n elements, then the number of permutations may be computed as follows:
Altogether, the number of permutations possible for a data set with n elements is n*(n-1)*(n-2)*(n-3)*...*3*2*1. This value is called n factorial, written n!. Altogether, a permutation sort might have to generate n! permutations of data before finding the one that is ordered.
For your information, the first 10 values of n! are given in the following table:
As you can see, the factorial function increases very rapidly!
In the permutation sort, once a permutation is generated, then it must be examined to determine if it is ordered. This typically requires about n steps -- one for each item in the array.
Altogether, with generating and checking permutations, the amount of work for a permutation sort can be proportional to n * n!
Program permutationSortTest performs a permutation sort on different types of data, given the size of the array. Note that this program does not do this work multiple times, in contrast to the previous experiments you have run. The timings you record in these experiments reflect the actual time required to perform the sort just once.
Run the program permutationSortTest, following the same steps followed for sortTest above. In collecting data, use array sizes of 4, 5, 6, 7, 8, 9. (You can try 10 as well, but have something else to do as you are waiting.)
As with insertion sort and quicksort, plot the times you experienced using a spreadsheet
In reviewing your data, explain how the permutation sort provides an example of an algorithm that is caught by the combinatorial explosion.
created December 31, 2003
last revised March 1, 2005