Review of Character Processing:
Literal Strings: A string is a sequence of characters. The external form of a string is the characters enclosed in double quotes:

    "This is a string!"

Special characters can be included in a string by escaping them with a backslash:

    "Type \"stop\" to quit."

Zero-based Indexing: While much work with strings does not require access to individual characters within a string, some procedures reference positions in a string. In such cases, Scheme numbers character positions within strings starting at position 0. For example, consider the string:

    "I am very excited by the Scheme programming language!!!"

Scheme regards the first character (I) as being in position 0, followed by a blank or space character in position 1. The letters 'a' and 'm' follow in positions 2 and 3, respectively.
Some String Procedures: Some common string procedures are shown in the following table:
Procedure | Sample Call | Result of Example | Comment |
---|---|---|---|
string? | (string? "sample string") | True (#t) | is argument a string? |
string-length | (string-length "sample string") | 13 | number of characters in string |
string-append | (string-append "Big" "Small") | "BigSmall" | concatenate two strings |
substring | (substring "sample string" 3 10) | "ple str" | extract the characters from the first designated position up to, but not including, the second |
string-ref | (string-ref "sample string" 4) | #\l | return character at given position |
string->list | (string->list "example") | (#\e #\x #\a #\m #\p #\l #\e) | makes a list of the characters in a string |
list->string | (list->string '(#\e #\x #\a #\m #\p #\l #\e)) | "example" | makes a string of the characters in a list |
symbol->string | (symbol->string 'example) | "example" | change a given symbol to a string |
string->symbol | (string->symbol "example") | example | convert a given string to a symbol |
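The zero-based positions used by string-ref and substring can be checked directly at the interpreter. The following is a minimal sketch; the name sample is hypothetical and simply holds the example string from the indexing discussion.

```scheme
; sample is a hypothetical name for the example string from the text
(define sample "I am very excited by the Scheme programming language!!!")

(string-ref sample 0)    ; => #\I
(string-ref sample 1)    ; => #\space
(string-ref sample 2)    ; => #\a
(string-ref sample 3)    ; => #\m

; substring includes the start position and excludes the end position
(substring sample 2 4)   ; => "am"
(substring sample 5 9)   ; => "very"

; the last valid position is one less than the length of the string
(string-ref sample (- (string-length sample) 1))   ; => #\!
```

Asking for position (string-length sample) itself is an error, since positions run from 0 to one less than the length.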
Some Comparisons of Strings: Scheme also provides various predicates to compare two strings:
Procedure | Comment |
---|---|
string=? | Are two strings equal? |
string<? | Does first string come first? |
string>? | Does first string come after? |
string<=? | Are the strings equal or does the first come before the second? |
string>=? | Are the strings equal or does the first come after the second? |
Scheme also provides string predicates which are case-insensitive:
Procedure | Comment |
---|---|
string-ci=? | Same as string=?, but ignoring case |
string-ci<? | Same as string<?, but considering uppercase and lowercase letters to be equivalent |
string-ci>? | Same as string>?, but ignoring case |
string-ci<=? | Same as string<=?, but ignoring case |
string-ci>=? | Same as string>=?, but ignoring case |
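A few sample calls may help show how the case-sensitive and case-insensitive predicates differ. This is only a sketch; the strings are arbitrary, and the result noted for "Zebra" versus "apple" assumes an ASCII-based character ordering, in which uppercase letters precede lowercase ones.

```scheme
(string=? "apple" "apple")      ; => #t
(string=? "apple" "Apple")      ; => #f
(string-ci=? "apple" "Apple")   ; => #t

(string<? "apple" "banana")     ; => #t
(string>? "apple" "banana")     ; => #f
(string<=? "apple" "apple")     ; => #t

; Assumption: ASCII ordering, so #\Z comes before #\a
(string<? "Zebra" "apple")      ; => #t
(string-ci<? "Zebra" "apple")   ; => #f
```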
Example: Consider the problem of counting the number of vowels within a string.
Approach 1: Convert the string to a list of characters, and recursively count the vowels in the list. This might lead to the following code (which uses vowel? from earlier in this lab).

    (define number-vowels
      (lambda (str)
        ;Pre-condition:  str is a character string
        ;Post-condition: returns number of vowels in str
        (number-vowels-kernel (string->list str))))

    (define number-vowels-kernel
      (lambda (ls)
        ;Pre-condition:  ls is a list of characters
        ;Post-condition: returns number of vowels in ls
        (cond ((null? ls) 0)
              ((vowel? (car ls)) (+ 1 (number-vowels-kernel (cdr ls))))
              (else (number-vowels-kernel (cdr ls))))))

Approach 2: Step through the string by position, using string-ref to examine each character directly. This might lead to the following code.

    (define number-vowels
      (lambda (str)
        ;Pre-condition:  str is a character string
        ;Post-condition: returns number of vowels in str
        (count-vowels-by-position str 0 0)))

    (define count-vowels-by-position
      (lambda (str current-count current-position)
        ;Pre-condition:  str is a character string;
        ;                current-count is the number of vowels in str
        ;                before current-position (both are 0 initially)
        ;Post-condition: returns number of vowels in str
        (cond ((= current-position (string-length str)) current-count)
              ((vowel? (string-ref str current-position))
               (count-vowels-by-position str (+ 1 current-count) (+ 1 current-position)))
              (else
               (count-vowels-by-position str current-count (+ 1 current-position))))))

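Both versions of number-vowels rely on vowel?, which is defined earlier in the lab and is not reproduced here. To try the code on its own, a minimal stand-in might look like the following. This is an assumption, not the lab's definition: it treats only a, e, i, o, and u (in either case) as vowels.

```scheme
; Hypothetical stand-in for vowel? from earlier in the lab.
; Assumption: the vowels are a, e, i, o, u in either case.
(define vowel?
  (lambda (ch)
    ;Pre-condition:  ch is a character
    ;Post-condition: returns #t if ch is a vowel, #f otherwise
    (if (memv (char-downcase ch) '(#\a #\e #\i #\o #\u)) #t #f)))

(vowel? #\e)   ; => #t
(vowel? #\E)   ; => #t
(vowel? #\s)   ; => #f
```

With this stand-in loaded, either version of number-vowels should return 3 for "sample string" (the vowels a, e, and i).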
A common approach for encoding messages involves replacing one letter by another throughout the message. Such an encoding method is called monoalphabetic substitution. As an example, consider the following encoding scheme:

    Plain alphabet:  ABCDEFGHIJKLMNOPQRSTUVWXYZ
    Cipher alphabet: XDQTVBKRAUGMZHYWCJOSENILPF

Now consider the message "THIS IS A MESSAGE TO ENCODE." We encode each letter in the message by looking it up in the plain alphabet and replacing it with the corresponding letter in the cipher alphabet. Characters not in the plain alphabet (e.g., punctuation) are left unchanged. Thus, the letter T is replaced by the letter S, "THIS" becomes "SRAO", and the entire message is encoded as "SRAO AO X ZVOOXKV SY VHQYTV." Note that the space and period characters are not changed.
The following procedure encodes a letter following this approach:

    (define encode-char
      (lambda (ch plain cipher)
        ;Pre-condition:  ???
        ;Post-condition: ???
        (encode-char-kernel ch plain cipher 0)))

    (define encode-char-kernel
      (lambda (ch plain cipher position)
        ;Pre-condition:  ???
        ;Post-condition: ???
        (cond ((= position (string-length plain)) ch)
              ((char-ci=? ch (string-ref plain position))
               (string-ref cipher position))
              (else (encode-char-kernel ch plain cipher (+ position 1))))))

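To see what encode-char does with individual characters, it may help to try a few calls. The sketch below repeats the two definitions from the text (without their comments) so that it runs on its own; the names plain-alphabet and cipher-alphabet are hypothetical and simply hold the alphabets from the example above.

```scheme
; The two definitions from the text, repeated so the example runs on its own
(define encode-char
  (lambda (ch plain cipher)
    (encode-char-kernel ch plain cipher 0)))

(define encode-char-kernel
  (lambda (ch plain cipher position)
    (cond ((= position (string-length plain)) ch)
          ((char-ci=? ch (string-ref plain position))
           (string-ref cipher position))
          (else (encode-char-kernel ch plain cipher (+ position 1))))))

; plain-alphabet and cipher-alphabet are hypothetical names
(define plain-alphabet  "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
(define cipher-alphabet "XDQTVBKRAUGMZHYWCJOSENILPF")

(encode-char #\T plain-alphabet cipher-alphabet)   ; => #\S
(encode-char #\A plain-alphabet cipher-alphabet)   ; => #\X
(encode-char #\. plain-alphabet cipher-alphabet)   ; => #\.  (not in plain, so unchanged)
(encode-char #\t plain-alphabet cipher-alphabet)   ; => #\S  (char-ci=? ignores case)
```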
Using these procedures, a message may be encoded as follows:

    (define encode-message
      (lambda (str plain cipher)
        ;Pre-condition:  str is a character string
        ;                plain and cipher are as in encode-char
        ;Post-condition: returns transformation of str
        ;                using monoalphabetic substitution
        (list->string (encode-message-kernel (string->list str) plain cipher))))

    (define encode-message-kernel
      (lambda (lst plain cipher)
        ;Pre-condition:  lst is a list of characters
        ;                plain and cipher are as in encode-char
        ;Post-condition: returns transformation of lst
        ;                using monoalphabetic substitution
        (if (null? lst)
            '()
            (cons (encode-char (car lst) plain cipher)
                  (encode-message-kernel (cdr lst) plain cipher)))))

Check that encode-message works correctly with the data from the above example:

    (encode-message "THIS IS A MESSAGE TO ENCODE."
                    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                    "XDQTVBKRAUGMZHYWCJOSENILPF")

Add pre- and post-conditions to the above procedures. Be careful to specify all assumptions regarding the plain and cipher alphabets.
Write the corresponding procedure decode-message:

    (define decode-message
      (lambda (str plain cipher)
        ;Pre-condition:  str is a character string
        ;                plain and cipher are as in encode-char
        ;Post-condition: returns a decoding of str
        ;                using monoalphabetic substitution
        '(--- your part replaces this line ---)))

Note that this is VERY EASY. In particular, your solution should fit easily on a single line.
Suppose that lowercase letters are to be encoded with a different substitution:

    Plain alphabet:  abcdefghijklmnopqrstuvwxyz
    Cipher alphabet: rstlnejpaxkdzvqmhbyuofcgiw

Thus, an uppercase T would be replaced by S as before, but a lowercase t would be replaced by the letter u.
How would you change the code and/or the call to encode-message and decode-message to allow these different substitutions for upper and lower case letters? Be sure to run tests to check your conclusions!
In the above procedures, the user must supply the plain and cipher alphabets. However, one would expect the plain alphabet always to be the same (i.e., the alphabet with both uppercase and lowercase letters in their usual order). Revise encode-message and decode-message so that only a cipher alphabet must be supplied by the user. Again, be sure to test the resulting code.
Write a procedure string-reverse that reverses the letters in a string. Thus, (string-reverse "this is a string") should return "gnirts a si siht".
A palindrome is a string which reads the same from front to back and from back to front. For example, "this is a palindromemordnilap a si siht" is a palindrome. Write a procedure palindrome that checks if a string is a palindrome.
This document is available on the World Wide Web as
http://www.math.grin.edu/~walker/courses/153.sp00/lab-strings.html