CSC 153: Computer Science Fundamentals Grinnell College Spring, 2005
 
Laboratory Exercise Reading
 

Strings

Abstract

This reading builds upon background material on character data and discusses string processing within Scheme. Topics include string literals, zero-based indexing, common string procedures, and string comparison predicates.

Literal Strings:

A string is a sequence of characters. The external form of a string is the characters enclosed in double-quotes:

   "This is a string!"

Special characters, such as the double-quote itself, can be included in a string by escaping them with a backslash:

   "Type \"stop\" to quit."

Zero-based Indexing:

While much work with strings does not require access to individual characters within a string, some procedures reference positions in a string. In such cases, Scheme numbers character positions within strings starting at position 0. For example, consider the string:


   "I am very excited by the Scheme programming language!!!"

Scheme regards the first character (I) as being in position 0, followed by a blank or space character in position 1. The letters 'a' and 'm' follow in positions 2 and 3, respectively.
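
To make the zero-based positions concrete, the following short sketch applies string-ref (described in the table of string procedures in the next section) to the string above. The name sample is introduced here only for illustration.

```scheme
; sample is a hypothetical name used only for this illustration
(define sample "I am very excited by the Scheme programming language!!!")

(string-ref sample 0)    ; => #\I
(string-ref sample 1)    ; => #\space
(string-ref sample 2)    ; => #\a
(string-ref sample 3)    ; => #\m

; The last valid position is one less than the length of the string.
(string-ref sample (- (string-length sample) 1))    ; => #\!
```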

Some String Procedures:

Some common string procedures are shown in the following table:

Procedure       Sample Call                                    Result of Example              Comment
string?         (string? "sample string")                      True (#t)                      is argument a string?
string-length   (string-length "sample string")                13                             number of characters in string
string-append   (string-append "Big" "Small")                  "BigSmall"                     concatenate two strings
substring       (substring "sample string" 3 10)               "ple str"                      extract characters from first to before second designated position from string
string-ref      (string-ref "sample string" 4)                 #\l                            return character at given position
string->list    (string->list "example")                       (#\e #\x #\a #\m #\p #\l #\e)  makes a list of the characters in a string
list->string    (list->string '(#\e #\x #\a #\m #\p #\l #\e))  "example"                      makes a string of the characters in a list
symbol->string  (symbol->string 'example)                      "example"                      change a given symbol to a string
string->symbol  (string->symbol "example")                     example                        convert a given string to a symbol
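
These procedures are often used in combination. The following sketch shows a few such combinations; the name phrase is introduced only for illustration and is not part of the reading's own examples.

```scheme
; phrase is a hypothetical name used only for this illustration
(define phrase "sample string")

; Keep the first six characters (positions 0 through 5) and add a suffix.
(string-append (substring phrase 0 6) "!")            ; => "sample!"

; Positions run from 0 to one less than the length of the string.
(string-ref phrase (- (string-length phrase) 1))      ; => #\g

; Convert to a list, rearrange the list, and convert back to a string.
(list->string (reverse (string->list "example")))     ; => "elpmaxe"

; A symbol converted to a string and back again
(string->symbol (symbol->string 'example))            ; => example
```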

Some Comparisons of Strings:

Scheme also provides various predicates to compare two strings. Ordering is lexicographic, based on the ordering of the individual characters:

Procedure   Comment
string=?    Are the two strings equal?
string<?    Does the first string come before the second?
string>?    Does the first string come after the second?
string<=?   Is the first string equal to the second, or does it come before the second?
string>=?   Is the first string equal to the second, or does it come after the second?

Scheme also provides string predicates which are case-insensitive:

Procedure     Comment
string-ci=?   Same as string=?, but ignoring case
string-ci<?   Same as string<?, but considering uppercase and lowercase letters to be equivalent
string-ci>?   Same as string>?, but ignoring case
string-ci<=?  Same as string<=?, but ignoring case
string-ci>=?  Same as string>=?, but ignoring case
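
A few sample calls may help illustrate these predicates. The strings below are chosen only for illustration. The result of a case-sensitive comparison between an uppercase and a lowercase letter depends on the character ordering of the particular Scheme implementation, so that case is noted without a guaranteed result.

```scheme
(string=? "apple" "apple")         ; => #t
(string=? "apple" "Apple")         ; => #f  (case matters)
(string-ci=? "apple" "Apple")      ; => #t  (case ignored)

(string<? "apple" "banana")        ; => #t
(string<? "apple" "applesauce")    ; => #t  (a proper prefix comes first)
(string>? "pear" "peach")          ; => #t  (#\r comes after #\c)
(string<=? "pear" "pear")          ; => #t

; Ignoring case, "Zebra" comes after "apple".
(string-ci<? "Zebra" "apple")      ; => #f

; Implementation-dependent: where character codes follow ASCII,
; uppercase letters precede lowercase ones, so this is typically #t.
(string<? "Zebra" "apple")
```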

Example:

Consider the problem of counting the number of vowels within a string.

Approach 1:

Convert the letters of the string to a list, and recursively count the vowels on the list. This might lead to the following code (which assumes a previously defined procedure vowel? that determines if a given character is a vowel).


(define number-vowels
   (lambda (str)
   ;Pre-condition:  str is a character string
   ;Post-condition:  returns number of vowels in str
      (number-vowels-kernel (string->list str))
   )
)

(define number-vowels-kernel
   (lambda (ls)
   ;Pre-condition:  ls is a list of characters
   ;Post-condition:  returns number of vowels in ls
      (cond ((null? ls) 0)
            ((vowel? (car ls)) (+ 1 (number-vowels-kernel (cdr ls))))
            (else (number-vowels-kernel (cdr ls)))
      )
   )
)
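
The code above takes vowel? as previously defined. One possible definition is sketched below. It rests on two assumptions that the reading does not state: the vowels are taken to be a, e, i, o, and u (so y is not counted), and uppercase and lowercase letters are treated alike.

```scheme
(define vowel?
   (lambda (ch)
   ;Pre-condition:  ch is a character
   ;Post-condition:  returns #t if ch is a, e, i, o, or u (in either case)
   ;                 returns #f otherwise
   ;Assumption:  y is not treated as a vowel
      (if (memv (char-downcase ch) '(#\a #\e #\i #\o #\u))
          #t
          #f)
   )
)

(vowel? #\e)    ; => #t
(vowel? #\E)    ; => #t
(vowel? #\s)    ; => #f
```

With this definition loaded together with number-vowels, the call (number-vowels "sample string") would return 3, for the letters a, e, and i.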

Approach 2:

Examine each letter in the string, and increase your count (from 0) each time a vowel is encountered. This approach motivates the following code, which moves position by position from the start of the string to the end:


(define number-vowels
   (lambda (str)
   ;Pre-condition:  str is a character string
   ;Post-condition:  returns number of vowels in str
      (count-vowels-by-position str 0 0)
   )
)

(define count-vowels-by-position
   (lambda (str current-count current-position)
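   ;Pre-condition:  str is a character string
   ;                current-position is between 0 and (string-length str), inclusive
   ;                current-count is the number of vowels in str before current-position
   ;Post-condition:  returns number of vowels in str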
      (cond ((= current-position (string-length str)) current-count)
            ((vowel? (string-ref str current-position))
                  (count-vowels-by-position str 
                              (+ 1 current-count)
                              (+ 1 current-position)))
            (else (count-vowels-by-position str current-count
                              (+ 1 current-position)))
      )
   )
)

Approach 3:

Outline: Proceed with recursion directly. The base case involves the empty string, which contains zero vowels. For other cases, examine the first letter and add one, if necessary, to the result of applying the procedure to the substring consisting of all letters except the first.
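
One way this outline might be turned into code is sketched below. This is an illustration under stated assumptions rather than a definitive solution: the helper vowel?, which the reading takes as previously defined, is given a hypothetical definition here (a, e, i, o, u in either case) so that the sketch can be run on its own.

```scheme
; Hypothetical helper: the reading assumes vowel? already exists.
(define vowel?
   (lambda (ch)
      (if (memv (char-downcase ch) '(#\a #\e #\i #\o #\u)) #t #f)
   )
)

(define number-vowels
   (lambda (str)
   ;Pre-condition:  str is a character string
   ;Post-condition:  returns number of vowels in str
      (cond ((= (string-length str) 0) 0)     ; base case: empty string
            ((vowel? (string-ref str 0))      ; first letter is a vowel
                (+ 1 (number-vowels (substring str 1 (string-length str)))))
            (else (number-vowels (substring str 1 (string-length str))))
      )
   )
)

(number-vowels "")                 ; => 0
(number-vowels "sample string")    ; => 3
```

Each call to substring builds a new string containing all but the first character, so this version does more copying than Approach 2, which walks through the original string by position.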

Encryption

A common approach for encoding messages involves replacing one letter by another throughout the message. Such an encoding method is called monoalphabetic substitution. As an example, consider the following encoding scheme:


Plain alphabet:   ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher alphabet:  XDQTVBKRAUGMZHYWCJOSENILPF

Now consider the message, "THIS IS A MESSAGE TO ENCODE." We encode each letter of the message by looking it up in the plain alphabet and replacing it with the letter in the same position of the cipher alphabet. Characters not in the plain alphabet (e.g., spaces and punctuation) are left unchanged. Thus, the letter T is replaced by the letter S, "THIS" becomes "SRAO", and the entire message is encoded as "SRAO AO X ZVOOXKV SY VHQYTV." Note that the space and period characters are not changed.

The following procedure encodes a letter following this approach:


(define encode-char
   (lambda (ch plain cipher)
   ;Pre-condition:  ???
   ;Post-condition: ???
      (encode-char-kernel ch plain cipher 0)
   )
)

(define encode-char-kernel
   (lambda (ch plain cipher position)
   ;Pre-condition:  ???
   ;Post-condition: ???
      (cond ((= position (string-length plain)) ch)
            ((char-ci=? ch (string-ref plain position))
                 (string-ref cipher position))
            (else (encode-char-kernel ch plain cipher (+ position 1))))
   )
)

Using these procedures, a message may be encoded as follows:


(define encode-message
   (lambda (str plain cipher)
   ;Pre-condition:  str is a character string
   ;                plain and cipher are as in encode-char
   ;Post-condition: returns transformation of str 
   ;                    using monoalphabetic substitution
      (list->string (encode-message-kernel (string->list str) plain cipher))
   )
)

(define encode-message-kernel
   (lambda (lst plain cipher)
   ;Pre-condition:  lst is a list of characters
   ;                plain and cipher are as in encode-char
   ;Post-condition: returns transformation of lst
   ;                    using monoalphabetic substitution
      (if (null? lst) 
          '()
          (cons (encode-char (car lst) plain cipher)
                (encode-message-kernel (cdr lst) plain cipher))
      )
   )
)

The corresponding procedure for deciphering a message would have the following form:


(define decode-message
   (lambda (str plain cipher)
   ;Pre-condition:  str is a character string
   ;                plain and cipher are as in encode-char
   ;Post-condition: returns a decoding of str 
   ;                    using monoalphabetic substitution
      '(--- details of deciphering would go here ---)
   )
)

This document is available on the World Wide Web as

http://www.walker.cs.grinnell.edu/courses/153.sp05/readings/reading-strings.shtml

created March 5, 1997
last revised February 1, 2005
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.