CSC 153: Computer Science Fundamentals Grinnell College Spring, 2005
 
Laboratory Exercise Reading
 

Character Data

Abstract

This reading introduces character data in Scheme.

Reference

Throughout this lab, you may want to refer to the Revised^5 Report on the Algorithmic Language Scheme (R5RS) for information on the various procedures involving character data.

Character Data

Scheme differentiates between symbols and characters. A quoted name such as 'a is treated as a symbol and is not recognized as character data:


    'a --> a
    (symbol? 'a) --> #t
    (char? 'a) --> #f

Character data in Scheme must be entered using the #\name notation instead of quoting, where name is either the character itself or a character name such as space:


    #\a --> #\a
    (symbol? #\a) --> #f
    (char? #\a) --> #t
    (char? #\space) --> #t

Character Coding

For many purposes, we can use characters directly. Behind the scenes, however, characters are stored in a coded form, and that form may vary from one machine to another. Scheme therefore provides the char->integer procedure, which converts an individual character into its corresponding integer code. Similarly, the procedure integer->char converts such an integer back into the character. For example, (char->integer #\a) gives the character code for the lowercase letter a, (char->integer #\G) gives the code for the uppercase letter G, (char->integer #\;) gives the code for a semicolon, and (char->integer #\5) gives the code for the digit 5. (Note that the character #\5 is different from the number 5.)
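The conversions described above can be tried directly at the Scheme prompt. The following is a minimal sketch. The integer codes in the last group assume an ASCII-based implementation and may differ elsewhere; the round trip and the character/number distinction hold in any standard Scheme.

```scheme
; Round trip: integer->char undoes char->integer.
(integer->char (char->integer #\a))        ; --> #\a

; The character 5 and the number 5 are different kinds of data.
(char? #\5)                                ; --> #t
(number? #\5)                              ; --> #f
(eqv? #\5 5)                               ; --> #f

; Assumption: an ASCII-based system.  Other implementations
; may report different integers for these four expressions.
(char->integer #\a)                        ; 97 under ASCII
(char->integer #\G)                        ; 71 under ASCII
(char->integer #\;)                        ; 59 under ASCII
(char->integer #\5)                        ; 53 under ASCII
```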

Comparing Characters

Scheme allows characters to be compared in several ways. Normally, a comparison examines the underlying code. That is, one character is considered less than another if the code of the first is smaller than the code for the second. Some common comparison procedures are given in the following table:

Procedure   Comment
char=?      Are the two characters equal?
char<?      Does the first character come before the second?
char>?      Does the first character come after the second?
char<=?     Is the first character equal to the second, or does it come before the second?
char>=?     Is the first character equal to the second, or does it come after the second?
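A few sample comparisons may make the table concrete. This sketch relies only on orderings the Scheme report guarantees (letters of the same case and digits are in their usual order). How a lowercase letter compares with an uppercase letter depends on the underlying codes, so that result is deliberately not shown. The procedure same-order? is a hypothetical helper written for this illustration, not a standard procedure.

```scheme
; Letters of the same case, and digits, are ordered as expected.
(char<? #\a #\b)     ; --> #t
(char>? #\Z #\Y)     ; --> #t
(char<=? #\3 #\3)    ; --> #t
(char>=? #\3 #\7)    ; --> #f
(char=? #\a #\A)     ; --> #f

; same-order? (hypothetical helper) checks that char<? agrees
; with comparing the underlying integer codes.
(define same-order?
   (lambda (ch1 ch2)
      (eq? (char<? ch1 ch2)
           (< (char->integer ch1) (char->integer ch2)))
   )
)

(same-order? #\a #\A)   ; --> #t
(same-order? #\x #\m)   ; --> #t
```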

Ignoring Capitalization

While it sometimes is convenient to distinguish between uppercase and lowercase letters, at other times, one wants to ignore capitalization. Thus, Scheme also provides character predicates which are case-insensitive:

char-ci=?    Same as char=?, but ignoring case
char-ci<?    Same as char<?, but ignoring case
char-ci>?    Same as char>?, but ignoring case
char-ci<=?   Same as char<=?, but ignoring case
char-ci>=?   Same as char>=?, but ignoring case
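The difference between the case-sensitive and case-insensitive predicates can be seen in a short sketch such as the following, which uses only the standard procedures listed above:

```scheme
; Case-sensitive versus case-insensitive equality.
(char=? #\a #\A)       ; --> #f
(char-ci=? #\a #\A)    ; --> #t

; Ordering with case ignored: a/A precedes b/B, and so on.
(char-ci<? #\a #\B)    ; --> #t
(char-ci>? #\Z #\y)    ; --> #t
(char-ci<=? #\q #\Q)   ; --> #t
(char-ci>=? #\a #\B)   ; --> #f
```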

To help process characters and strings of characters, Scheme provides several procedures for common tasks. For example, string->list converts a character string to a list of characters, and list->string converts a list of characters back to a string. These procedures make it simple to examine the characters of a string one at a time.
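As a brief illustration of these two conversions, consider the sketch below. The sample strings are arbitrary choices made for this example.

```scheme
; A string becomes a list of its characters, in order.
(string->list "cat")                     ; --> (#\c #\a #\t)

; A list of characters becomes a string.
(list->string (list #\d #\o #\g))        ; --> "dog"

; Converting to a list and back yields an equal string.
(list->string (string->list "Scheme"))   ; --> "Scheme"

; Once a string is a list, the usual list procedures apply.
(car (string->list "cat"))               ; --> #\c
(length (string->list "a b"))            ; --> 3
```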

For example, the following procedure counts the number of times the letter A appears within a string of characters in either uppercase or lowercase form:


(define count-As
   (lambda (str)
   ;Pre-condition:  str is a character string
   ;Post-condition:  returns number of As in str
      (count-As-kernel (string->list str))
   )
)

(define count-As-kernel
   (lambda (ls)
   ;Pre-condition:  ls is a list of characters
   ;Post-condition:  returns number of As in ls, ignoring case differences
      (cond ((null? ls) 0)
            ((char-ci=? #\a (car ls)) (+ 1 (count-As-kernel (cdr ls))))
            (else (count-As-kernel (cdr ls)))
      )
   )
)
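The same pattern extends to counting any character, not just the letter A. The following is one possible sketch of that idea. The names count-char and count-char-kernel are hypothetical and do not appear elsewhere in this reading; the structure simply mirrors count-As, with the character to match passed as an extra parameter.

```scheme
(define count-char
   (lambda (ch str)
   ;Pre-condition:  ch is a character and str is a character string
   ;Post-condition:  returns number of times ch appears in str,
   ;                 ignoring case differences
      (count-char-kernel ch (string->list str))
   )
)

(define count-char-kernel
   (lambda (ch ls)
   ;Pre-condition:  ch is a character and ls is a list of characters
   ;Post-condition:  returns number of times ch appears in ls,
   ;                 ignoring case differences
      (cond ((null? ls) 0)
            ((char-ci=? ch (car ls)) (+ 1 (count-char-kernel ch (cdr ls))))
            (else (count-char-kernel ch (cdr ls)))
      )
   )
)

(count-char #\a "A banana")   ; --> 4
(count-char #\N "A banana")   ; --> 2
(count-char #\z "A banana")   ; --> 0
```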


This document is available on the World Wide Web as

http://www.walker.cs.grinnell.edu/courses/153.sp05/readings/reading-characters.shtml

created February 26, 1997
last revised January 31, 2005
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.