CSC 161  Grinnell College  Spring, 2009 
Imperative Problem Solving and Data Structures  
This laboratory exercise applies the concept of loop invariants to problems involving array structures, specifically to the binary search.
The idea of a binary search is reasonably straightforward, but 'Binary search is one of the trickiest "simple" algorithms to program correctly.' [e.g., Wikipedia, article on Binary Search]. In this reading, we will see that loop invariants provide a helpful tool that allows us to actually get the code right.
Acknowledgement: The following four-paragraph description is a slightly edited version of Henry M. Walker, Computer Science 2: Principles of Software Engineering, Data Types, and Algorithms, Little, Brown, and Company, 1989, Section 10.1, p. 389, with programming examples translated from Pascal to C. This material is used with permission from the copyright holder.
The binary search involves looking for an item within an array that has already been sorted. We begin with an array of data a[0], ..., a[size-1], and we wish to search for a particular item. The approach is to look for item in the middle of the array and make inferences about where to look next. Overall, the binary search allows us to divide the amount of data under consideration in half each time.
To understand how this is done, we consider how we might look up a name in a telephone book. We begin by opening the telephone book to the middle. If we are lucky, we see the name on the page in front of us. However, even if we are unlucky, we can tell which half of the book contains the name.
Once we know which half the name is in, we turn to the middle of that half. Again, we might be lucky and find the name immediately. Otherwise, we can restrict our attention to just that part where the name must be. (We are now looking at just one-quarter of the original book.)
As we proceed in subsequent steps, we continue looking at the middle page of the section remaining, and dividing that section in halves until we find the name or until we run out of pages to look at.
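The halving described above can be made concrete with a short sketch. The function name halving_steps is ours and does not appear in the original text; it simply counts how many times a section of n pages can be cut in half before no pages remain.

```c
#include <assert.h>

/* Hypothetical helper (not from the original reading): counts how many
   times a section of n items can be halved before no items remain. */
int halving_steps(int n)
{
    int steps = 0;
    assert(n >= 0);          /* a section cannot have a negative size */
    while (n > 0) {
        n = n / 2;           /* keep only half of the current section */
        steps++;
    }
    return steps;
}
```

Under this sketch, a book with 1,000,000 entries runs out of pages after 20 halvings, which suggests why the approach scales well to large arrays.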
As we focus more closely on the binary search, we need to consider more clearly just what result we might want when we are done. Here are two of the various possibilities:
Here, we ask for the second result. In practice, if data are in the first part of a large array, then the index returned will indicate where to insert a new item so the array will remain ordered; we would just slide larger elements to the right within the large array and insert the new item.
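The "slide larger elements to the right" step might look like the following sketch. The function insert_at and its parameters are our own illustration, and the code assumes the caller has already reserved room for one more element in the array.

```c
#include <assert.h>

/* Hypothetical helper (our naming, not from the original reading).
   Assumes a[0..count-1] is sorted and the array has room for at least
   count+1 elements.  Inserts item at position index, sliding the
   elements at index..count-1 one place to the right.
   Returns the new number of elements stored. */
int insert_at(int a[], int count, int index, int item)
{
    int i;
    assert(0 <= index && index <= count);
    for (i = count; i > index; i--)
        a[i] = a[i - 1];     /* slide one element to the right */
    a[index] = item;
    return count + 1;
}
```

If index is the value returned by the search, the array remains ordered after the call.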
To describe processing, we first translate the algorithm to a general picture:
In this picture, array elements on the left of the array have been determined to be smaller than the desired item, and elements on the right have been determined to be larger. The variables left and right mark the boundaries of these checked regions, and middle marks the location halfway between left and right.
Although this high-level picture presents a useful vision for the algorithm, three details require clarification:
If there are an odd number of items remaining unchecked, then middle can indicate exactly the middle array element to be checked. However, if there are an even number of items, should middle be rounded up or down? In C, the two likely computations are:
    middle = (left + right) / 2;     /* when dealing with integers, C rounds down */
    middle = (left + right + 1) / 2; /* adding 1 ensures rounding up in C */
For example, the following figure shows six unprocessed elements, so middle may be either the third or fourth element in the array segment.
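A small sketch can confirm the two computations on a concrete segment. The function names below are ours, and the sketch assumes non-negative indices with left <= right, which is where C's truncating integer division coincides with rounding down.

```c
#include <assert.h>

/* Hypothetical helpers (our naming) for the two midpoint choices.
   Both assume 0 <= left <= right. */
int middle_round_down(int left, int right)
{
    assert(0 <= left && left <= right);
    return (left + right) / 2;
}

int middle_round_up(int left, int right)
{
    assert(0 <= left && left <= right);
    return (left + right + 1) / 2;
}
```

With six unprocessed elements at indices 2 through 7, rounding down gives index 4 (the third element of the segment) and rounding up gives index 5 (the fourth). With five elements at indices 2 through 6, both computations give index 4.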
In coding the binary search, any combination of the above choices can lead to correct code. Difficulties arise, however, when a programmer does not carefully plan which picture to follow. When the meanings of variables change within the code, the code likely fails — at least in some cases, and fixing the identified errors often creates new ones.
To illustrate the use of pictorial loop invariants in developing code, we choose one variation of assignments from above and develop the code. Then, to show other choices also might work, we choose a different variation and develop code for that as well.
In this variation, we choose left and right to be the unprocessed items next to the boundary; we defer the choice of computation for middle until later.
With this choice of loop invariant, we initialize left and right to the extreme ends of the array which have not been processed:
    left = 0;
    right = size - 1;
    middle = ???   /* one of the computations above, does it matter? */
When we consider a guard for our loop, we need to decide when to continue and when to exit. To determine the right conditions, we extend our picture of the loop invariant to when the unprocessed area has shrunk to nothing:
At first, this diagram may seem peculiar — left and right have moved past each other, but let's examine this carefully.
Translating this picture into C code, we first identify the needed condition for continuing the loop. We only stop when right < left or when we have found the desired item, so the main loop should begin:
while ((left <= right) && (a[middle] != item)) {
Within the loop, we will compare a[middle] with item and update either left or right, but what should the update value be? In order to maintain the loop invariant, we need to change the left or right variable to an unprocessed value, and we have already checked a[middle]. Thus, we should move up or down from middle in our assignment:
    if (a[middle] < item)
        left = middle + 1;
    else
        right = middle - 1;
Finally, what about the computation of middle? We have already noted that at the end we want middle == left. Also, from the picture, we know that at the end left = right + 1. Let's try these values for left or right in the two computations above:
Rounding down:
    middle = (left + right) / 2;
           = (right + 1 + right) / 2       /* substitution */
           = (2*right + 1) / 2
           = right + 1/2
           = right                         /* C's integer division rounds down */
Rounding up:
    middle = (left + right + 1) / 2;
           = (right + 1 + right + 1) / 2   /* substitution */
           = (2*right + 2) / 2
           = right + 2/2
           = right + 1
           = left
This shows that if we round up, middle will have the needed value, but if we round down, our computation will be off by one.
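The substitution can also be checked mechanically. The following sketch (the function name is ours) tries every crossed pair left = right + 1 up to a chosen bound and reports whether the two roundings behave as derived. It starts at right = 0 because the rounding-down step of the derivation relies on a non-negative sum.

```c
#include <assert.h>

/* Hypothetical check (our naming).  Returns 1 if, for every right in
   0..max_right with left = right + 1, rounding down yields right and
   rounding up yields left; returns 0 otherwise. */
int crossed_case_holds(int max_right)
{
    int right;
    assert(max_right >= 0);
    for (right = 0; right <= max_right; right++) {
        int left = right + 1;
        if ((left + right) / 2 != right)
            return 0;        /* rounding down should stop at right */
        if ((left + right + 1) / 2 != left)
            return 0;        /* rounding up should reach left */
    }
    return 1;
}
```

One case falls outside this check: when item is smaller than every array element, the loop ends with left = 0 and right = -1, and rounding up gives (0 + -1 + 1) / 2 = 0, which is again left.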
Putting all the pieces together, we get the following code based on this loop invariant:
    /* Binary Search, Version 1 */
    left = 0;
    right = size - 1;
    middle = (left + right + 1) / 2;   /* we must round up */
    while ((left <= right) && (a[middle] != item)) {
        if (a[middle] < item)
            left = middle + 1;
        else
            right = middle - 1;
        middle = (left + right + 1) / 2;
    }
As we have discussed, middle is the index where either a[middle] == item or middle is the place to insert item to keep the array elements ordered.
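For experimenting with Version 1, it may help to wrap the fragment in a function. The wrapper below is a sketch: the name binary_search_v1, the int element type, and the assert that restates the loop invariant are our additions, not part of the original code.

```c
#include <assert.h>

/* Version 1 wrapped as a function (wrapper and name are ours).
   Precondition: a[0..size-1] is sorted in ascending order.
   Returns an index middle such that either a[middle] == item, or
   middle is where item could be inserted to keep the array ordered. */
int binary_search_v1(const int a[], int size, int item)
{
    int left = 0;
    int right = size - 1;
    int middle = (left + right + 1) / 2;   /* round up */

    assert(size >= 0);
    /* The order of the tests matters: && evaluates left <= right first,
       so a[middle] is never read once left and right have crossed. */
    while ((left <= right) && (a[middle] != item)) {
        if (a[middle] < item)
            left = middle + 1;
        else
            right = middle - 1;
        /* invariant: only positions left..right remain unprocessed */
        assert(0 <= left && left <= right + 1 && right < size);
        middle = (left + right + 1) / 2;
    }
    return middle;
}
```

For the array {1, 3, 5, 7, 9}, searching for 7 returns 3 (found), searching for 6 also returns 3 (the insertion point), searching for 0 returns 0, and searching for 10 returns 5.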
In this variation, we choose left as in version 1, but we choose right to be the last processed item next to the boundary; as before, we defer the choice of computation for middle until later.
With this choice of loop invariant, we initialize left to the extreme left end of the array, which has not been processed, but we must initialize right to the position just to the right of the array. Again, we leave computation of middle until later.
    left = 0;
    right = size;
    middle = ???   /* one of the computations above, does it matter? */
When we consider a guard for our loop, we need to decide when to continue and when to exit. To determine the right conditions, we extend our picture of the loop invariant to when the unprocessed area has shrunk to nothing:
In this case, we want left, middle, and right to all come together just after the small elements, where they designate the first large element. Again we look at the diagram carefully:
Translating this picture into C code, we first identify the needed condition for continuing the loop. We only stop when right == left or when we have found the desired item, so the main loop should begin:
while ((left < right) && (a[middle] != item)) {
Within the loop, we will compare a[middle] with item and update either left or right, but what should the update value be? In order to maintain the loop invariant, we need to change the left variable to an unprocessed value, but we should change right to a processed one. In either case, we have already checked a[middle]. This gives rise to the following assignments:
    if (a[middle] < item)
        left = middle + 1;
    else
        right = middle;
Finally, what about the computation of middle? We have already noted that at the end we want middle == left == right. Let's try these values for left or right in the two computations above:
Rounding down:
    middle = (left + right) / 2;
           = (right + right) / 2           /* substitution */
           = (2*right) / 2
           = right                         /* the division is exact */
Rounding up:
    middle = (left + right + 1) / 2;
           = (right + right + 1) / 2       /* substitution */
           = (2*right + 1) / 2
           = right + 1/2
           = right                         /* C's integer division rounds down */
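This shows that we will get the same final result whether we round up or down, so at the end of the search the choice of rounding does not matter. Inside the loop, however, rounding up can yield middle == right when right == left + 1, and right is a processed position (initially index size, just past the end of the array). Rounding down always keeps middle within the unprocessed region, so we round down.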
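The substitution above looks only at the final state, where left == right. A sketch such as the following (names ours) can be used to compare the two roundings while left < right, counting the pairs for which middle would fall outside the unprocessed positions left..right-1 of this invariant.

```c
#include <assert.h>

/* Hypothetical check (our naming).  For every pair 0 <= left < right <= size,
   computes middle with the chosen rounding and counts the pairs where
   middle is NOT one of the unprocessed positions left..right-1. */
int count_bad_middles(int size, int round_up)
{
    int left, right, bad = 0;
    assert(size >= 0);
    for (left = 0; left < size; left++) {
        for (right = left + 1; right <= size; right++) {
            int middle = round_up ? (left + right + 1) / 2
                                  : (left + right) / 2;
            if (middle < left || middle >= right)
                bad++;       /* middle landed on a processed position */
        }
    }
    return bad;
}
```

For size 4, rounding down produces no bad pairs, while rounding up produces 4 of them: exactly the pairs with right == left + 1, where middle equals right. With right == size this would read a[size], one position past the end of the array.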
Putting all the pieces together, we get the following code based on this loop invariant:
    /* Binary Search, Version 2 */
    left = 0;
    right = size;
    middle = (left + right) / 2;   /* round down, so middle < right whenever left < right */
    while ((left < right) && (a[middle] != item)) {
        if (a[middle] < item)
            left = middle + 1;
        else
            right = middle;
        middle = (left + right) / 2;
    }
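As with Version 1, the fragment can be wrapped in a function for testing. This is a sketch: the name binary_search_v2, the int element type, and the assert restating the invariant are our additions.

```c
#include <assert.h>

/* Version 2 wrapped as a function (wrapper and name are ours).
   Precondition: a[0..size-1] is sorted in ascending order.
   Returns an index middle such that either a[middle] == item, or
   middle is where item could be inserted to keep the array ordered. */
int binary_search_v2(const int a[], int size, int item)
{
    int left = 0;
    int right = size;                  /* one past the last element */
    int middle = (left + right) / 2;   /* round down */

    assert(size >= 0);
    while ((left < right) && (a[middle] != item)) {
        if (a[middle] < item)
            left = middle + 1;
        else
            right = middle;
        /* invariant: only positions left..right-1 remain unprocessed */
        assert(0 <= left && left <= right && right <= size);
        middle = (left + right) / 2;
    }
    return middle;
}
```

On the array {1, 3, 5, 7, 9}, this sketch returns 3 for both 7 (found) and 6 (insertion point), 0 for the item 0, and 5 for the item 10.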
Both versions of code developed for this lab are available in program ~walker/c/examples/binarysearches.c. Also, it is useful to observe that both binary search algorithms ran correctly the first time they were run.
We can follow a similar approach to develop code for the binary search, based on the other two loop invariants as well.
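As one illustration of what such a development might produce, the sketch below assumes an invariant in which both left and right mark processed items: left is the last position known to hold a smaller item, and right is the first position known to hold an item that is not smaller. This choice of invariant, the function name, and the decision to omit the a[middle] != item test from the guard are all our assumptions, made to keep the example short.

```c
#include <assert.h>

/* A possible third variation (our assumption, not from the original text).
   Invariant: a[0..left] are smaller than item, a[right..size-1] are not
   smaller than item; positions left+1..right-1 are unprocessed.
   Precondition: a[0..size-1] is sorted in ascending order.
   Returns the first index whose element is not smaller than item
   (size if there is none), which is also a valid insertion point. */
int binary_search_v3(const int a[], int size, int item)
{
    int left = -1;       /* just to the left of the array */
    int right = size;    /* just to the right of the array */
    int middle;

    assert(size >= 0);
    while (right - left > 1) {
        middle = (left + right) / 2;   /* left < middle < right here */
        if (a[middle] < item)
            left = middle;             /* a[middle] is now processed */
        else
            right = middle;
    }
    return right;
}
```

Because this sketch never stops early, the caller would need to compare a[result] with item (after checking result < size) to learn whether the item was actually found.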
Such code development can be the basis for wonderful test questions.
This document is available on the World Wide Web as
http://www.walker.cs.grinnell.edu/courses/161.sp09/readings/readingloopinvpic..shtml
created 20 April 2008; last revised 5 October 2011

For more information, please contact Henry M. Walker at walker@cs.grinnell.edu. 