CSC 153: Computer Science Fundamentals Grinnell College Spring, 2005
 
Laboratory Exercise Reading
 

CGI Programming

Abstract

The lab discusses some basic elements of CGI programming, which allows a Web developer to tailor documents to an individual Web user.

CGI Scripts

As you have seen in the previous lab, the Common Gateway Interface (CGI) provides a mechanism that allows a Web browser to provide information to a server as part of a request for a document. In addition, a Web document can specify that a Web server should run a CGI script or program as part of its response to a request. One simple example of a CGI script is what-i-know-script.cgi, shown below:


#!/bin/csh

/usr/bin/mzscheme --mute-banner --version --load /home/walker/public_html/cgi-bin/what-i-know-script.ss

As explained in a previous lab, the primary purpose of this script is to run MzScheme and load the Scheme program ~walker/public_html/cgi-bin/what-i-know-script.ss.

Retrieving Information from the Browser

The previous lab included the script http://www.walker.cs.grinnell.edu/cgi-bin/what-i-know-script.cgi.

In running this script, you can see that the CGI script and its Scheme program receive and print information sent by your browser. On our system, using MzScheme, this information is retrieved through a non-standard, predefined procedure named getenv, which takes a string as its argument and returns another string as its value.

For example, the following five arguments are used in the what-i-know-script.ss program:

In using the Web, your browser automatically sends along this type of information as part of its communication with a Web server. Within MzScheme, the getenv procedure retrieves pieces of this information. Thus, the call (getenv "REMOTE_HOST") returns the name of the computer running your browser.
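
As a rough illustration of the getenv mechanism described above, the following sketch prints two of these values. It assumes MzScheme's predefined getenv, which gives #f when a variable is not set; the names value-or-default and report-variable are hypothetical helpers introduced only for this example and are not part of what-i-know-script.ss.

```scheme
;; Sketch only: report a few CGI environment variables.
;; getenv is MzScheme's predefined (non-standard) procedure; it returns
;; a string, or #f when the variable is not set.

;; value-or-default (hypothetical helper): use default when value is #f.
(define value-or-default
  (lambda (value default)
    (if (string? value) value default)))

;; report-variable (hypothetical helper): print one "NAME: value" line.
(define report-variable
  (lambda (name)
    (display name)
    (display ": ")
    (display (value-or-default (getenv name) "(not set)"))
    (newline)))

(report-variable "REMOTE_HOST")
(report-variable "QUERY_STRING")
```

Run outside a Web server, both lines will most likely show "(not set)", since these variables are normally supplied only when the server starts the CGI script.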

Query Strings

In addition to the information automatically sent by the browser, you can supply information to a Web server by appending a string at the end of a URL. As an example, consider the URL http://www.walker.cs.grinnell.edu/cgi-bin/what-i-know-script.cgi?CSC-153-is-cool

In this example, the URL begins with the program or page name already discussed (http://www.walker.cs.grinnell.edu/cgi-bin/what-i-know-script.cgi). This is followed by a question mark (?) and the data CSC-153-is-cool. In working with the Web, this data string is called a query string, and a question mark is used to separate the page information from the query string; everything before the question mark identifies the Web server and page, and everything after is the query string data.
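
The split at the question mark can be sketched in a few lines of Scheme. In practice the browser and server do this work before the CGI program runs, so the procedures below (index-of-char and split-url, both hypothetical names) are only meant to make the rule concrete.

```scheme
;; index-of-char (hypothetical helper): position of the first
;; occurrence of ch in str, or #f if ch does not appear.
(define index-of-char
  (lambda (ch str)
    (let loop ((i 0))
      (cond ((= i (string-length str)) #f)
            ((char=? (string-ref str i) ch) i)
            (else (loop (+ i 1)))))))

;; split-url (hypothetical helper): returns a pair whose car is the
;; server-and-page part and whose cdr is the query string, or #f
;; when the URL contains no question mark.
(define split-url
  (lambda (url)
    (let ((pos (index-of-char #\? url)))
      (if pos
          (cons (substring url 0 pos)
                (substring url (+ pos 1) (string-length url)))
          (cons url #f)))))

(split-url "http://www.walker.cs.grinnell.edu/cgi-bin/what-i-know-script.cgi?CSC-153-is-cool")
;; ===> ("http://www.walker.cs.grinnell.edu/cgi-bin/what-i-know-script.cgi" . "CSC-153-is-cool")
```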

Forms

A special case of query strings arises when a user is asked to fill in boxes within a Web page. For example, consider the following framework that is commonly used in directory searches and other search engines:


Enter the person's first and last names below:

First Name: [__________]   Last Name: [__________]


Such a capability is illustrated with the interface http://www.walker.cs.grinnell.edu/cgi-bin/fac-directory.html. This example allows you to retrieve information about a member of the 1998-1999 Mathematics and Computer Science Department at Grinnell College.

In this form, the box for the last name is identified as "lastname", and the box for the first name is identified as "firstname". With this background, we can trace the full interaction of using this form, assuming you search for instructor "Henry" "Walker":

  1. You enter "Henry" and "Walker" in the boxes above and press the submit button.
  2. Behind the scenes, the form names the CGI script http://www.walker.cs.grinnell.edu/cgi-bin/fac-directory.cgi as its target. When you click submit, the Web browser uses this address as the base for its URL.
  3. The Web browser adds information as a query string for this URL:

    http://www.walker.cs.grinnell.edu/cgi-bin/fac-directory.cgi?firstname=Henry&lastname=Walker

    Note that at the end of this URL, following the file name fac-directory.cgi, there is a question mark (?), followed by the data: firstname=Henry&lastname=Walker

  4. The CGI script fac-directory.cgi calls the program fac-directory.ss, in the process described previously.
  5. Program fac-directory.ss retrieves this query string with the getenv procedure using the parameter "QUERY_STRING". Program fac-directory.ss then extracts specific name information, and looks up data in a data file http://www.walker.cs.grinnell.edu/cgi-bin/math-cs.faculty.98 .

To reiterate, in the last step above, the procedure call


(getenv "QUERY_STRING")

returns the string firstname=Henry&lastname=Walker. More generally, the procedure call (getenv "QUERY_STRING") looks at the URL that the browser used to activate the CGI program. If that URL just ends in .cgi, then (getenv "QUERY_STRING") returns #f. However, if there is a question mark after the .cgi, and a string of characters after that, then (getenv "QUERY_STRING") returns the characters in this additional string. The user can supply almost any kind of information to the CGI program through a query string, and the CGI program recovers it by decoding and parsing that string.

In CGI programming using forms, a query string usually consists of a sequence of equations separated by ampersands, with some attribute on the left-hand side of each equation and the value of that attribute on the right-hand side. For instance, in our form example, the query string had the form firstname=Henry&lastname=Walker.
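
To make this structure concrete, here is a minimal sketch that breaks a query string at each ampersand and then at the first equals sign of each piece. The procedure names (split-at-char, equation->pair, raw-attributes) are hypothetical and chosen for this example; the sketch does no decoding of special characters, so it is not a substitute for a real library routine.

```scheme
;; split-at-char (hypothetical helper): list of the substrings of str
;; that are separated by the character ch.
(define split-at-char
  (lambda (ch str)
    (let loop ((start 0) (i 0) (pieces '()))
      (cond ((= i (string-length str))
             (reverse (cons (substring str start i) pieces)))
            ((char=? (string-ref str i) ch)
             (loop (+ i 1) (+ i 1) (cons (substring str start i) pieces)))
            (else (loop start (+ i 1) pieces))))))

;; equation->pair (hypothetical helper): "name=value" becomes
;; ("name" . "value"); a piece with no equals sign gets an empty value.
(define equation->pair
  (lambda (equation)
    (let loop ((i 0))
      (cond ((= i (string-length equation)) (cons equation ""))
            ((char=? (string-ref equation i) #\=)
             (cons (substring equation 0 i)
                   (substring equation (+ i 1) (string-length equation))))
            (else (loop (+ i 1)))))))

;; raw-attributes (hypothetical helper): undecoded attribute/value pairs.
(define raw-attributes
  (lambda (query)
    (map equation->pair (split-at-char #\& query))))

(raw-attributes "firstname=Henry&lastname=Walker")
;; ===> (("firstname" . "Henry") ("lastname" . "Walker"))
```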

Because the user often wants to supply attribute values that contain spaces, slashes, question marks, or other special characters that would wreak havoc if attached to URLs, CGI requires that such characters be encoded. The conventional encoding is to replace each space with a plus sign and each special character with a sequence of three characters beginning with a percent sign. The CGI program is expected to decode the strings recovered from the query string. This is usually done with the help of a "library routine" (a procedure that someone else has written). In this lab, you'll find it convenient to use the extract-attributes procedure in http://www.walker.cs.grinnell.edu/public_html/cgi-bin/cgi-utilities.scm. This Scheme code, written by John Stone, takes a query string as argument and returns a list of pairs, with the car in each pair being a fully decoded attribute and the corresponding cdr being its fully decoded value:


(extract-attributes "firstname=Henry&lastname=Walker")
===> (("firstname" . "Henry") ("lastname" . "Walker"))

Given an association list of this form, a program can extract the value for a specific key with the assoc procedure, as discussed in a lab on pairs and association lists.
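
The sketch below illustrates both ideas from this section: undoing the plus-sign and percent-sign encoding of one attribute or value, and looking up a key with assoc. It is not John Stone's code from cgi-utilities.scm; decode-component and hex-digit-value are hypothetical names, and the sketch assumes that the two characters after a percent sign are hexadecimal digits giving the character's code, handling only single-byte codes.

```scheme
;; hex-digit-value (hypothetical helper): numeric value of a
;; hexadecimal digit character, or #f if ch is not such a digit.
(define hex-digit-value
  (lambda (ch)
    (cond ((and (char>=? ch #\0) (char<=? ch #\9))
           (- (char->integer ch) (char->integer #\0)))
          ((and (char>=? ch #\a) (char<=? ch #\f))
           (+ 10 (- (char->integer ch) (char->integer #\a))))
          ((and (char>=? ch #\A) (char<=? ch #\F))
           (+ 10 (- (char->integer ch) (char->integer #\A))))
          (else #f))))

;; decode-component (hypothetical helper): replace each + with a space
;; and each %XX sequence with the character whose code is hex XX.
;; A percent sign not followed by two hex digits is copied unchanged.
(define decode-component
  (lambda (str)
    (let ((len (string-length str)))
      (let loop ((i 0) (chars '()))
        (cond ((= i len) (list->string (reverse chars)))
              ((char=? (string-ref str i) #\+)
               (loop (+ i 1) (cons #\space chars)))
              ((and (char=? (string-ref str i) #\%)
                    (< (+ i 2) len)
                    (hex-digit-value (string-ref str (+ i 1)))
                    (hex-digit-value (string-ref str (+ i 2))))
               (loop (+ i 3)
                     (cons (integer->char
                            (+ (* 16 (hex-digit-value (string-ref str (+ i 1))))
                               (hex-digit-value (string-ref str (+ i 2)))))
                           chars)))
              (else (loop (+ i 1) (cons (string-ref str i) chars))))))))

(decode-component "Henry+M.+Walker")   ; ===> "Henry M. Walker"
(decode-component "a%2Fb%3F")          ; ===> "a/b?"

;; Looking up decoded values with assoc, which compares keys with equal?
(define attributes '(("firstname" . "Henry") ("lastname" . "Walker")))
(cdr (assoc "lastname" attributes))    ; ===> "Walker"
(assoc "middlename" attributes)        ; ===> #f
```

Because assoc returns #f when a key is missing, a program should check the result before taking its cdr.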


This document is available on the World Wide Web as

http://www.walker.cs.grinnell.edu/courses/153.sp06/readings/reading-cgi-programming.shtml

created 3 November 1998
last revised 7 March 2006 by Henry M. Walker
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.