CSC 223 Grinnell College Fall, 2006
 
Software Design
 

Laboratory Exercise: Introduction to Databases and Java

This laboratory exercise provides a basic overview of databases, together with how databases may be accessed using a programming language. Specific details involve the use of the mySQL database and the Java programming language, running on MathLAN. However, many of the elements discussed here apply with many other databases and programming languages.

What is a Database?

A database is a software system that facilitates the efficient storage, retrieval, modification, and deletion of data. Although the general field of databases has evolved considerably over the years, most modern database systems organize data into tables. Databases based on tables are called relational databases, and most contemporary database packages follow this model.

Database systems typically support sophisticated operations that allow a user to combine information from multiple tables and select items that meet specified complex criteria. Much of this usage requires considerably more study and insight than can be introduced a single laboratory exercise (or even in a single course). Thus, this lab focuses on just the simplest organization and manipulation of data within a single database table. Be sure to talk to the instructor if you are considering using databases in your group's CSC 223 project!

Running Example

To illustrate concepts throughout this laboratory exercise, we consider using a database to store information for a campus directory, based on the same collection of data used in the lab on cgi programming, html forms, and query strings. In that lab, directory information was stored in a comma-delimited file, and a Java program read this information line-by-line. All processing involved reading the entire file, and the Java program explicitly tested each directory entry to determine its relevance for the application. That lab did not consider the possibility that a directory entry might change.

In this lab, we will see that a database routinely will do filtering for us, so much of our processing can be limited to only those directory entries that are relevant.

Tables: Fields and Values

The basic organizational unit within a relational database is the table. Data are organized into rows, called records. Columns indicate a type of information; and each column is given a title, called a field name. For our campus directory example, the first records for the 1998-1999 Mathematics and Computer Science Department at Grinnell College might look as follows:


+---------+---------+-----------+------------------------------------+------------------------+-----+------------+
|facultyID|first    |last       |title                               |email                   |phone|office      |
+---------+---------+-----------+------------------------------------+------------------------+-----+------------+
|       1 |Arnold   |Adelberg   |Myra Steele Professor of Mathematics|adelbe@math.grin.edu    |4201 |Science 2405|
|       2 |Nathaniel|Borenstein |Noyce Visiting Professor            |borenste@cs.grinnell.edu|3017 |Science 1131|
|       3 |Marc     |Chamberland|Assistant Professor of Mathematics  |chamberl@math.grin.edu  |4207 |Science 2410|
|       4 |Pamela   |Ferguson   |Professor of Mathematics            |fergusoa@ac.grin.edu    |3017 |Science 1131|

In this table, the first row shows the field names. Each subsequent row provides a separate directory entry, beginning with a unique facultyID number. Thus, the first record provides information for Arnold Adelberg, the Myra Steele Professor of Mathematics, whose office was Science 2405 and whose telephone extension was 4201. Prof. Adelberg's facultyID in this record is the integer 1.

Preliminaries with mySQL

The software package mySQL provides a full-service database capabilities, with several characteristics:

We now consider each of these characteristics in somewhat more detail.

Create database

On MathLAN, the mySQL database runs on the Web server, aiken.cs.grinnell.edu. The MathLAN System Administrator controls the primary use of mySQL databases. In particular, the MathLAN Administrator must create any new database and grant initial privileges to a user. Thus, if your group wants to use a mySQL database for your CSC 223 project, you should contact the MathLAN System Administrator.

For this laboratory exercise, we will use the database csc223 that has been created specifically for this discussion.

Grant user permissions

Once a database has been established, the user(s) designated by the MathLAN Administrator can grant access to the database to other users. For the csc223 database, this adding of permissions utilizes a SQL statement of the form:


GRANT insert, select, update, delete
    ON csc223.*
    TO username IDENTIFIED BY 'password';

This allows the person with the given username to access all tables related to the csc223 database using the given password. Specific capabilities allow the specified user to add new records to any table (insert), retrieve records (select), modify existing records (update), and remove current records (delete). Additional permissions are possible:

For the current lab, permission has been granted for a group account to insert, select, update, delete, but not to change the nature of the tables. Username and password information will be distributed in class, rather than posted on this Web site.

Create tables

The creation of a table involves a specification of the various fields for that table. For the current directory example, this might be done with the following statement:


create table faculty
( facultyID bigint(20) unsigned not null auto_increment,
  first     varchar(25) not null,
  last      varchar(25) not null,
  title     varchar(255) not null,
  email     varchar(255) not null,
  phone     bigint(20) unsigned not null,
  office    varchar(255) not null,
  primary key (facultyID)
);

This SQL statement establishes a new table, called faculty. This table has 7 fields:

Further, the specification indicates that none of these fields can be null: that is, each of these fields must contain data, and cannot be omitted from a record.

The facultyID field has two additional properties. First, it is a "primary key", meaning that the value in this field must be unique; two records are not allowed to have the same facultyID. In this case, the primacy key is a single field, although in other tables a primary key could involve a combination of fields that together would identify a record uniquely. Second, when information is inserted into this table, one could specify the facultyID explicitly, but one also could ask the database to take then next highest integer value for the facultyID field; if this field is not specified, the database will auto_increment the highest field value to maintain a new, unique integer for this field.

Four Basic Operations

A database has four basic operations for data manipulation: insertion of new records into the database, selection of specified records, updating or changing of records, and deletion of existing records. We now consider some simple versions of these operations.

Insert

The insert statement adds new records to the database. The simplest version of this statement specifies each field, in the order given in the original create statement. Thus, the following statement would add Arnold Adelberg's record to the campus directory:


INSERT INTO faculty VALUES
(NULL, "Arnold",    "Adelberg",
              "Myra Steele Professor of Mathematics",
              "adelbe@math.grin.edu",     4201,"Science 2405");

As this example illustrates, it is common for keywords in an SQL statement to be capitalized, although this is not necessary.

The complete statement to add all records to this departmental database lists each record in parentheses, separated by commas.

When inserting records, we could include the facultyID number explicitly, using a statement:


INSERT INTO faculty VALUES
(1, "Arnold",    "Adelberg",
              "Myra Steele Professor of Mathematics",
              "adelbe@math.grin.edu",     4201,"Science 2405");

However, since facultyID is defined with auto_increment, we can leave that field as NULL, in which case the database will determine an appropriate value.

Select

Once records are inserted into a database, a select statement allows us to retrieve those records. The simplest version of select is:


     SELECT faculty.* FROM faculty;

Here, faculty.* indicates that all fields of a record are to be retrieved, and from faculty indicates that the records are to come from the faculty table within the database.

Rather than retrieve all records, we also can select records that meet given criteria:


     SELECT faculty.* FROM faculty WHERE last="Moore";
     SELECT faculty.* FROM faculty WHERE last="Moore" AND first="Emily";
     SELECT faculty.* FROM faculty WHERE last="Moore" AND first="Henry";

The first of these statements will retrieve those records whose last names match "Moore", namely "Thomas Moore" and "Emily Moore". The second of these queries retrieves the record for only "Emily Moore". The third query retrieves no records for this record information.

The select statement also allows you to sort the records that are returned. Two examples follow:


     SELECT faculty.* FROM faculty WHERE last="Moore" ORDER BY first ASC;
     SELECT faculty.* FROM faculty ORDER BY phone ASC, last DESC;

The first of these statements would return "Emily Moore" followed by "Thomas Moore". Replacing ASC by DESC would sort in descending alphabetical order and place "Thomas Moore" first.

The second statement sorts all records in the database by phone number. Further, if two phone numbers agree, then those records are sorted in reverse alphabetical order by last name.

Delete

The Delete statement removes records from the database.
In using Delete be sure to limit the scope of your work, since this statement has the potential to remove all records from the database.

The following statement removes all records for those with the last name "Hill" from the database:


   DELETE FROM faculty WHERE last="Hill";

Update

The Update statement changes fields within records.
As with Delete be sure to limit the scope of a Update, since this statement has the potential to change all records in the database.

The following statement updates Adelberg's e-mail and office information from the 1998-1999 directory:


   UPDATE faculty
      SET email="adelbe@math.grinnell.edu",
          office="Carnegie Hall 308"
      WHERE last="Adelberg";

Using mySQL with Java over the Web

Although databases packages may be used directly through a terminal window or a direct interface, Web applications typically proceed by utilizing a program (written in languages, such as Java, PHP, perl, or asp) that interacts with the database over a special connection. In the rest of this lab, we consider various details for writing Web-based Java programs that connect with mySQL on MathLAN. The use of other languages on other computer systems is likely to be similar, although specifics may vary.

Processing with Java over the Web to access records in mySQL utilizes the following basic steps:

  1. The user's browser calls a cgi script that runs on a Web server.
  2. The cgi script sets a CLASSPATH and calls the Java program, passing along a query string and/or cookies if appropriate.
  3. The Java program does most of the work:
    1. Cookies or a query string are passed as parameters to main.
    2. The header for a Web page is printed.
    3. The Java program registers the database to initialize a database driver.
    4. The Java program uses the driver to establish a connection with the database.
    5. SQL statements are composed and executed, as part of general processing.
    6. Before finishing, the Java program releases the resources it has used (the SQL statement resource, the database driver).
    7. The footer for the Web page is printed.

The Directory Example, Continued

These various steps for using a database within a Java program will be explained through a cgi/Java script to list faculty in a departmental directory. This particular script prints an entire directory, placing entries in reverse alphabetical order by last and first name. (Hey, this lab's author is "Walker".)

The next several sections of this lab explain the various steps for this database processing within a Java program.

cgi Preliminaries

Processing begins with a cgi script that parallels the work discussed in the lab on html and cgi programing. A sample cgi script is shown below:


    #!/bin/bash
    export CLASSPATH=":/home/walker/public_html/cgi-bin/database:/usr/share/java/mysql.jar"
    /opt/jdk1.5.0/bin/java facDirJavaDb

Some Notes:

The discussion of the next several steps refer to program facDirJavaDb.java that produces the above directory listing. (Actually, this listing has been slightly edited, as explained below, but it is almost the working program.)

In the program, two methods, printWebPageHeader and printWebPageFooter, handle the html chores in just the same manner as discussed in the earlier html/cgi lab.

Register the database

With mySQL, we can load the Java Database Connector (JDBC) driver with the statement


     Class.forName("com.mysql.jdbc.Driver").newInstance();

Class.forName returns and loads a new object for the given driver class. In principle, newInstance is not needed (although it does no real harm). However, documentation indicates that some Java environments contain errors which are resolved by creating a new instance of the given class.

Make a database connection

The connection to the database has the form:


     // Step 2: Establish the connection to the database. 
     String url = "jdbc:mysql://localhost/csc223?user=***&password=***";
     Connection conn = DriverManager.getConnection(url);

To decipher the string url:

Database Processing

After a connection is established, processing proceeds by creating a SQL statement object, forming SQL strings, and executing them. In the current example, this work is accomplished with these Java statements:


     Statement stmt = null;
     ResultSet rs = null;

     // Step 3: Set up a database query, execute it, and get a ResultSet
     stmt = conn.createStatement();
     rs = stmt.executeQuery("SELECT faculty.* FROM faculty ORDER BY last DESC, first DESC");
Alternatively, the query string could be defined first and then executed:

     Statement stmt = null;
     ResultSet rs = null;

     // Step 3: Set up a database query, execute it, and get a ResultSet
     stmt = conn.createStatement();
     String query = "SELECT faculty.* FROM faculty ORDER BY last DESC, first DESC";
     rs = stmt.executeQuery(query);

The first declarations perform the initial setup; the latter are within a try-catch block, as statements creating and execution may fail (and generate an exception).

Execute the query

When executing a SELECT statement, the executeQuery method provides a mechanism to obtain the results easily and simply. For other SQL statements, the relevant query would be


     stmt.execute(query);

where query is a string for an insert, delete, or update statement.

When a program will execute several SQL statements, processing can proceed with only one statement conn.createStatement(). Each SQL statement then can be executed in sequence.

Working with a RecordSet

As noted above, the result of stmt.execute("SELECT ...") is called a RecordSet. Think of a RecordSet as an iterated type, where each item functions like a Hashtable. To expand on this overview,

Clean up

After processing within this program is completed, the program should release the database connection and statement. As shown in the sample program, this is done with statements rs.close() and stmt.close(), each within a try-catch block.

Another Example

As a second example using SQL statements, program facDirJavaDbReset.java

Most of this program follows a similar approach to the previous program. Here, however, the SQL statement is formed in a sequence of steps, by concatenating many strings together.

This second program may be run with the script fac-dir-java-db-reset.cgi

Two final programming notes highlight common coding practices:

Some Practice: First Steps for this Lab

  1. Copy sample scripts/programs, fac-dir-java-db.cgi and facDirJavaDb.java, to the public_html/cgi-bin your account (MathLAN users will find the relevant programs in directory ~walker/java/examples/database)

  2. Edit the cgi scrips, so the CLASSPATH references your cgi folder rather than mine.

  3. Compile the Java program, and set both the .cgi and .class files to be publicly accessible.

  4. Run the program in your browser.

  5. Edit the SELECT statement, as follows:

    1. filter the records to include only records with last name "McRitchie" or later in alphabetical order.
    2. order the records by phone number
  6. Change the filtering clause, so that it returns no records at all.

Directory Lookup

The lab on html and cgi programming demonstrated an application in which names were entered into an html form, a cgi script was called to start a Java program, and the Java program searched through a file to locate directory entries that matched the names from the form. Much of that program involved reading the file and analyzing names.

  1. Using the facDirectoryJava.java as a base, replace the file handling code by interaction with the mySQL database.

  2. Write a new cgi script/Java program that inserts your name into the database as a new record.

  3. Write one more program that changes your record in the database (but no other record).

Note: If your program contains an error that changes the records for the 1998-1999 faculty in the database, you can use the program fac-dir-java-db-reset.cgi to reset those records (but not others that might have been added.).

Work to Turn In

This lab is not required, but steps 7, 8, and 9 may be turned in for extra credit.


This document is available on the World Wide Web as

http://www.walker.cs.grinnell.edu/courses/223.fa05/labs/forms-and-java.shtml

created 31 October 2005
last revised 2 November 2005
Valid HTML 4.01! Valid CSS!
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.