CSC 153 | Grinnell College | Spring, 2005
Computer Science Fundamentals
Laboratory Exercise
This laboratory exercise provides experience with the basic elements of processing files, when the files are viewed as streams of data.
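The first problem is to compute the sum of the numbers stored in a file. A solution opens the file, reads the numbers one at a time while keeping a running total, and reports the total once the end of the file is reached.
Here is a Scheme implementation of that outline: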

    (define sum-of-file
      (lambda (source-file-name)
        ;Pre-condition:  source-file-name is the logical name of a file of numbers
        ;Post-condition: returns sum of numbers in the given file
        (let ((source (open-input-file source-file-name)))  ; Open the file.
          (let loop ((total 0)               ; Initialize the running total.
                     (next (read source)))   ; Try to read a number.
            (if (eof-object? next)           ; If you get the end-of-file object,
                (begin
                  (close-input-port source)  ; close the file
                  total)                     ; and report the final total.
                (loop (+ next total)         ; Otherwise, add the number to
                                             ; the running total,
                      (read source)))))))    ; try to read another number,
                                             ; and repeat the loop.

A typical interaction using this procedure would look like this:

    > (sum-of-file "/home/walker/151s/labs/file1.dat")
    200

The file /home/walker/151s/labs/file1.dat contains the four numbers 50 50 75 25.
The expression (read source) appears twice in the above code. What is the purpose of each appearance of this expression?
In reviewing this processing, note that the data file was viewed as containing a sequence or stream of numbers. Processing data in this file then involved reading the numbers, value-by-value, until we reached the end of the file.
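The same read-until-end-of-file pattern applies to other computations over a stream of values. The following is a minimal sketch, not part of the original lab: count-numbers-in-file is a hypothetical name, and the procedure assumes the file contains only data that read can parse.

```scheme
; Hypothetical helper (not from the lab): counts the values in a file,
; using the same loop structure as sum-of-file.
(define count-numbers-in-file
  (lambda (source-file-name)
    (let ((source (open-input-file source-file-name)))
      (let loop ((count 0)                  ; values seen so far
                 (next (read source)))      ; try to read a value
        (if (eof-object? next)
            (begin
              (close-input-port source)     ; end of stream: clean up
              count)                        ; and report the count
            (loop (+ count 1)               ; otherwise count this value
                  (read source)))))))       ; and try to read another
```

Only the update of the loop parameter differs from sum-of-file; the opening, reading, end-of-file test, and closing steps are unchanged.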
Copy one file to another. That is, create a second file which is character-by-character identical to the first.
The following Scheme code provides a solution of this problem.

    (define copy-file
      (lambda (source-file-name target-file-name)
        ;Pre-condition:  source-file-name is the logical name of a file
        ;Post-condition: copies contents of source file to target file
        (let ((source (open-input-file source-file-name))
              (target (open-output-file target-file-name)))
          (let loop ((ch (read-char source)))
            (if (eof-object? ch)
                (begin
                  (close-input-port source)
                  (close-output-port target))
                (begin
                  (write-char ch target)
                  (loop (read-char source))))))))

Check that this procedure works as claimed by using it to copy the file /home/walker/151s/labs/file2.dat to a file named lab.data in your account.
Modify this procedure so that every lower-case letter that is read in is converted to upper case before being written to the output file.
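As a starting point for the case conversion, the standard Scheme procedure char-upcase may be useful. The expressions below are a small illustration only; where to call it inside copy-file is left as part of the exercise.

```scheme
; char-upcase is a standard Scheme procedure on single characters.
(char-upcase #\a)   ; a lower-case letter is converted: #\A
(char-upcase #\A)   ; an upper-case letter is returned unchanged: #\A
(char-upcase #\5)   ; a non-letter is returned unchanged: #\5
```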
Write a Scheme procedure tally-char that takes two arguments, the name of an input file and a character, and returns a tally of the number of occurrences of that character in the specified file.

    (tally-char "/home/walker/151s/labs/file1.dat" #\5) ===> 4
    (tally-char "/home/walker/151s/labs/file2.dat" #\0) ===> 16
    (tally-char "/home/walker/151s/labs/file2.dat" #\newline) ===> 3

Hint: Within a main loop, add a parameter to contain the desired count, and update the count appropriately in any recursive call(s).
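The hint can be illustrated on a simpler, related task. The sketch below is not a solution to the exercise: count-chars is a hypothetical procedure that counts every character in a file, showing only how an extra loop parameter carries a count through the recursive calls.

```scheme
; Hypothetical helper (not from the lab): counts all characters in a file.
(define count-chars
  (lambda (source-file-name)
    (let ((source (open-input-file source-file-name)))
      (let loop ((ch (read-char source))    ; current character
                 (count 0))                 ; extra parameter: count so far
        (if (eof-object? ch)
            (begin
              (close-input-port source)
              count)                        ; the loop returns the final count
            (loop (read-char source)
                  (+ count 1)))))))         ; updated count in the recursive call
```

For tally-char, the count would be updated only when the character read matches the one being tallied.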
Assume that a sentence is any sequence of characters ending with a period, question mark, or exclamation point. Modify tally-char to get a procedure count-sentences, which determines the number of sentences in a file.
File /home/walker/151s/labs/lab-file-description contains the first paragraph of introductory material in this lab, starting "Up to this point, ...". Check that this paragraph contains 2 sentences:
(count-sentences "/home/walker/151s/labs/lab-file-description") ===> 2
Write and test a Scheme procedure that takes two arguments -- the name of an input file containing zero or more integers, and the name of an output file to be created by the procedure -- and copies each integer from the input file to the output file if it is in the range from 0 to 99. Values outside of this range should be read in but not copied out again. The idea is that this procedure will act as a filter, ensuring that only the values that are in the correct range will make it into the output file.
Line breaks in the input file should be ignored. In the output file, arrange for each integer to be printed on a line by itself.
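The output half of this problem can be sketched separately. The procedure below is an illustration under stated assumptions, not the requested filter: write-number-lines is a hypothetical name, and it takes its values from a list rather than from an input file.

```scheme
; Hypothetical helper (not from the lab): writes each number in a list
; to a new file, one number per line.
(define write-number-lines
  (lambda (numbers target-file-name)
    (let ((target (open-output-file target-file-name)))
      (let loop ((rest numbers))
        (if (null? rest)
            (close-output-port target)      ; nothing left: close the file
            (begin
              (write (car rest) target)     ; print the number
              (newline target)              ; and end its line
              (loop (cdr rest))))))))
```

In the filter itself, the values would come from (read source) instead of a list, and a test such as (<= 0 n 99) could decide whether a value n is written.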
Approximate the number of words in a file.
The following code solves this problem by reading the file word by word:

    (define count-words
      (lambda (source-file-name)
        ;Pre-condition:  source-file-name is the logical name of a file
        ;Post-condition: returns number of words in the given file
        (letrec ((source (open-input-file source-file-name))
                 (print-result
                  (lambda (count)
                    (display "File ")
                    (display source-file-name)
                    (display " contains ")
                    (display count)
                    (display " words.")
                    (newline)))
                 (find-start-word
                  (lambda (next-char count)
                    (cond ((eof-object? next-char)
                           (begin
                             (close-input-port source)
                             (print-result count)))
                          ((char-alphabetic? next-char)
                           (find-end-word (read-char source) (+ 1 count)))
                          (else
                           (find-start-word (read-char source) count)))))
                 (find-end-word
                  (lambda (next-char count)
                    (cond ((eof-object? next-char)
                           (begin
                             (close-input-port source)
                             (print-result count)))
                          ((char-whitespace? next-char)
                           (find-start-word (read-char source) count))
                          (else
                           (find-end-word (read-char source) count))))))
          (find-start-word (read-char source) 0))))

Check that this program works by running it on the file /home/walker/151s/labs/lab-file-description which contains the first paragraph of introductory material in this lab, starting "Up to this point, ...". (The paragraph contains 44 words.)
Describe in several sentences how this program works.
count-words is incomplete, in that it contains few comments beyond pre- and post-conditions. Add appropriate commentary to clarify the purpose of each main part of the code.
Modify count-words so that a word is considered to be only a sequence of letters. That is, for this part, a word is a sequence of letters -- without punctuation or digits.
Challenge Problem: Use the ideas of count-words and count-sentences to write procedure average-words, which determines the average number of words in a sentence. Note that for efficiency, average-words should only read through the file once.
This document is available on the World Wide Web as
http://www.walker.cs.grinnell.edu/courses/153.sp05/labs/lab-file-intro.shtml
material for Problems 1 and 2 created in two labs on March 11, 1997 by John D. Stone
material merged and reorganized April 5, 1999 by Clif Flynt and Henry M. Walker
last revised February 7, 2005 by Henry M. Walker
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.