CSC 223 | Grinnell College | Fall, 2006 |
Software Design | ||
This laboratory exercise provides a basic overview of databases, together with how databases may be accessed using a programming language. Specific details involve the use of the mySQL database and the Java programming language, running on MathLAN. However, many of the elements discussed here apply with many other databases and programming languages.
A database is a software system that facilitates the efficient storage, retrieval, modification, and deletion of data. Although the general field of databases has evolved considerably over the years, most modern database systems organize data into tables. Databases based on tables are called relational databases, and most contemporary database packages follow this model.
Database systems typically support sophisticated operations that allow a user to combine information from multiple tables and select items that meet specified complex criteria. Much of this usage requires considerably more study and insight than can be introduced a single laboratory exercise (or even in a single course). Thus, this lab focuses on just the simplest organization and manipulation of data within a single database table. Be sure to talk to the instructor if you are considering using databases in your group's CSC 223 project!
To illustrate concepts throughout this laboratory exercise, we consider using a database to store information for a campus directory, based on the same collection of data used in the lab on cgi programming, html forms, and query strings. In that lab, directory information was stored in a comma-delimited file, and a Java program read this information line-by-line. All processing involved reading the entire file, and the Java program explicitly tested each directory entry to determine its relevance for the application. That lab did not consider the possibility that a directory entry might change.
In this lab, we will see that a database routinely will do filtering for us, so much of our processing can be limited to only those directory entries that are relevant.
The basic organizational unit within a relational database is the table. Data are organized into rows, called records. Columns indicate a type of information; and each column is given a title, called a field name. For our campus directory example, the first records for the 1998-1999 Mathematics and Computer Science Department at Grinnell College might look as follows:
+---------+---------+-----------+------------------------------------+------------------------+-----+------------+ |facultyID|first |last |title |email |phone|office | +---------+---------+-----------+------------------------------------+------------------------+-----+------------+ | 1 |Arnold |Adelberg |Myra Steele Professor of Mathematics|adelbe@math.grin.edu |4201 |Science 2405| | 2 |Nathaniel|Borenstein |Noyce Visiting Professor |borenste@cs.grinnell.edu|3017 |Science 1131| | 3 |Marc |Chamberland|Assistant Professor of Mathematics |chamberl@math.grin.edu |4207 |Science 2410| | 4 |Pamela |Ferguson |Professor of Mathematics |fergusoa@ac.grin.edu |3017 |Science 1131|
In this table, the first row shows the field names. Each subsequent row provides a separate directory entry, beginning with a unique facultyID number. Thus, the first record provides information for Arnold Adelberg, the Myra Steele Professor of Mathematics, whose office was Science 2405 and whose telephone extension was 4201. Prof. Adelberg's facultyID in this record is the integer 1.
The software package mySQL provides a full-service database capabilities, with several characteristics:
We now consider each of these characteristics in somewhat more detail.
On MathLAN, the mySQL database runs on the Web server, aiken.cs.grinnell.edu. The MathLAN System Administrator controls the primary use of mySQL databases. In particular, the MathLAN Administrator must create any new database and grant initial privileges to a user. Thus, if your group wants to use a mySQL database for your CSC 223 project, you should contact the MathLAN System Administrator.
For this laboratory exercise, we will use the database csc223 that has been created specifically for this discussion.
Once a database has been established, the user(s) designated by the MathLAN Administrator can grant access to the database to other users. For the csc223 database, this adding of permissions utilizes a SQL statement of the form:
GRANT insert, select, update, delete ON csc223.* TO username IDENTIFIED BY 'password';
This allows the person with the given username to access all tables related to the csc223 database using the given password. Specific capabilities allow the specified user to add new records to any table (insert), retrieve records (select), modify existing records (update), and remove current records (delete). Additional permissions are possible:
For the current lab, permission has been granted for a group account to insert, select, update, delete, but not to change the nature of the tables. Username and password information will be distributed in class, rather than posted on this Web site.
The creation of a table involves a specification of the various fields for that table. For the current directory example, this might be done with the following statement:
create table faculty ( facultyID bigint(20) unsigned not null auto_increment, first varchar(25) not null, last varchar(25) not null, title varchar(255) not null, email varchar(255) not null, phone bigint(20) unsigned not null, office varchar(255) not null, primary key (facultyID) );
This SQL statement establishes a new table, called faculty. This table has 7 fields:
Further, the specification indicates that none of these fields can be null: that is, each of these fields must contain data, and cannot be omitted from a record.
The facultyID field has two additional properties. First, it is a "primary key", meaning that the value in this field must be unique; two records are not allowed to have the same facultyID. In this case, the primacy key is a single field, although in other tables a primary key could involve a combination of fields that together would identify a record uniquely. Second, when information is inserted into this table, one could specify the facultyID explicitly, but one also could ask the database to take then next highest integer value for the facultyID field; if this field is not specified, the database will auto_increment the highest field value to maintain a new, unique integer for this field.
A database has four basic operations for data manipulation: insertion of new records into the database, selection of specified records, updating or changing of records, and deletion of existing records. We now consider some simple versions of these operations.
The insert statement adds new records to the database. The simplest version of this statement specifies each field, in the order given in the original create statement. Thus, the following statement would add Arnold Adelberg's record to the campus directory:
INSERT INTO faculty VALUES (NULL, "Arnold", "Adelberg", "Myra Steele Professor of Mathematics", "adelbe@math.grin.edu", 4201,"Science 2405");
As this example illustrates, it is common for keywords in an SQL statement to be capitalized, although this is not necessary.
The complete statement to add all records to this departmental database lists each record in parentheses, separated by commas.
When inserting records, we could include the facultyID number explicitly, using a statement:
INSERT INTO faculty VALUES (1, "Arnold", "Adelberg", "Myra Steele Professor of Mathematics", "adelbe@math.grin.edu", 4201,"Science 2405");
However, since facultyID is defined with auto_increment, we can leave that field as NULL, in which case the database will determine an appropriate value.
Once records are inserted into a database, a select statement allows us to retrieve those records. The simplest version of select is:
SELECT faculty.* FROM faculty;
Here, faculty.* indicates that all fields of a record are to be retrieved, and from faculty indicates that the records are to come from the faculty table within the database.
Rather than retrieve all records, we also can select records that meet given criteria:
SELECT faculty.* FROM faculty WHERE last="Moore"; SELECT faculty.* FROM faculty WHERE last="Moore" AND first="Emily"; SELECT faculty.* FROM faculty WHERE last="Moore" AND first="Henry";
The first of these statements will retrieve those records whose last names match "Moore", namely "Thomas Moore" and "Emily Moore". The second of these queries retrieves the record for only "Emily Moore". The third query retrieves no records for this record information.
The select statement also allows you to sort the records that are returned. Two examples follow:
SELECT faculty.* FROM faculty WHERE last="Moore" ORDER BY first ASC; SELECT faculty.* FROM faculty ORDER BY phone ASC, last DESC;
The first of these statements would return "Emily Moore" followed by "Thomas Moore". Replacing ASC by DESC would sort in descending alphabetical order and place "Thomas Moore" first.
The second statement sorts all records in the database by phone number. Further, if two phone numbers agree, then those records are sorted in reverse alphabetical order by last name.
The Delete statement removes records from the database.
In using Delete be sure to limit the scope of your work, since
this statement has the potential to remove all records from the
database.
The following statement removes all records for those with the last name "Hill" from the database:
DELETE FROM faculty WHERE last="Hill";
The Update statement changes fields within records.
As with Delete be sure to limit the scope of a Update,
since this statement has the potential to change all records in the
database.
The following statement updates Adelberg's e-mail and office information from the 1998-1999 directory:
UPDATE faculty SET email="adelbe@math.grinnell.edu", office="Carnegie Hall 308" WHERE last="Adelberg";
Although databases packages may be used directly through a terminal window or a direct interface, Web applications typically proceed by utilizing a program (written in languages, such as Java, PHP, perl, or asp) that interacts with the database over a special connection. In the rest of this lab, we consider various details for writing Web-based Java programs that connect with mySQL on MathLAN. The use of other languages on other computer systems is likely to be similar, although specifics may vary.
Processing with Java over the Web to access records in mySQL utilizes the following basic steps:
These various steps for using a database within a Java program will be explained through a cgi/Java script to list faculty in a departmental directory. This particular script prints an entire directory, placing entries in reverse alphabetical order by last and first name. (Hey, this lab's author is "Walker".)
The next several sections of this lab explain the various steps for this database processing within a Java program.
Processing begins with a cgi script that parallels the work discussed in the lab on html and cgi programing. A sample cgi script is shown below:
#!/bin/bash export CLASSPATH=":/home/walker/public_html/cgi-bin/database:/usr/share/java/mysql.jar" /opt/jdk1.5.0/bin/java facDirJavaDb
Some Notes:
The discussion of the next several steps refer to program facDirJavaDb.java that produces the above directory listing. (Actually, this listing has been slightly edited, as explained below, but it is almost the working program.)
In the program, two methods, printWebPageHeader and printWebPageFooter, handle the html chores in just the same manner as discussed in the earlier html/cgi lab.
With mySQL, we can load the Java Database Connector (JDBC) driver with the statement
Class.forName("com.mysql.jdbc.Driver").newInstance();
Class.forName returns and loads a new object for the given driver class. In principle, newInstance is not needed (although it does no real harm). However, documentation indicates that some Java environments contain errors which are resolved by creating a new instance of the given class.
The connection to the database has the form:
// Step 2: Establish the connection to the database. String url = "jdbc:mysql://localhost/csc223?user=***&password=***"; Connection conn = DriverManager.getConnection(url);
To decipher the string url:
After a connection is established, processing proceeds by creating a SQL statement object, forming SQL strings, and executing them. In the current example, this work is accomplished with these Java statements:
Statement stmt = null; ResultSet rs = null; // Step 3: Set up a database query, execute it, and get a ResultSet stmt = conn.createStatement(); rs = stmt.executeQuery("SELECT faculty.* FROM faculty ORDER BY last DESC, first DESC");Alternatively, the query string could be defined first and then executed:
Statement stmt = null; ResultSet rs = null; // Step 3: Set up a database query, execute it, and get a ResultSet stmt = conn.createStatement(); String query = "SELECT faculty.* FROM faculty ORDER BY last DESC, first DESC"; rs = stmt.executeQuery(query);
The first declarations perform the initial setup; the latter are within a try-catch block, as statements creating and execution may fail (and generate an exception).
When executing a SELECT statement, the executeQuery method provides a mechanism to obtain the results easily and simply. For other SQL statements, the relevant query would be
stmt.execute(query);
where query is a string for an insert, delete, or update statement.
When a program will execute several SQL statements, processing can proceed with only one statement conn.createStatement(). Each SQL statement then can be executed in sequence.
As noted above, the result of stmt.execute("SELECT ...") is called a RecordSet. Think of a RecordSet as an iterated type, where each item functions like a Hashtable. To expand on this overview,
while (rs.next()) { }
After processing within this program is completed, the program should release the database connection and statement. As shown in the sample program, this is done with statements rs.close() and stmt.close(), each within a try-catch block.
As a second example using SQL statements, program facDirJavaDbReset.java
Most of this program follows a similar approach to the previous program. Here, however, the SQL statement is formed in a sequence of steps, by concatenating many strings together.
This second program may be run with the script fac-dir-java-db-reset.cgi
Two final programming notes highlight common coding practices:
String query = "SELECT * from faculty "; if (nameGiven) query += "ORDER by last ASC;"; else query += "ORDER by phone ASC;";
String query = "Select * from faculty "; query += " where last = \"" + nameFromForm + "\"";
In this last example, note that quotes are included within the string on each side of the variable nameFromForm, as this is required by SQL syntax.
Copy sample scripts/programs, fac-dir-java-db.cgi and facDirJavaDb.java, to the public_html/cgi-bin your account (MathLAN users will find the relevant programs in directory ~walker/java/examples/database)
Edit the cgi scrips, so the CLASSPATH references your cgi folder rather than mine.
Compile the Java program, and set both the .cgi and .class files to be publicly accessible.
Run the program in your browser.
Edit the SELECT statement, as follows:
Change the filtering clause, so that it returns no records at all.
The lab on html and cgi programming demonstrated an application in which names were entered into an html form, a cgi script was called to start a Java program, and the Java program searched through a file to locate directory entries that matched the names from the form. Much of that program involved reading the file and analyzing names.
Using the facDirectoryJava.java as a base, replace the file handling code by interaction with the mySQL database.
Write a new cgi script/Java program that inserts your name into the database as a new record.
Write one more program that changes your record in the database (but no other record).
Note: If your program contains an error that changes the records for the 1998-1999 faculty in the database, you can use the program fac-dir-java-db-reset.cgi to reset those records (but not others that might have been added.).
This lab is not required, but steps 7, 8, and 9 may be turned in for extra credit.
This document is available on the World Wide Web as
http://www.walker.cs.grinnell.edu/courses/223.fa05/labs/forms-and-java.shtml
created 31 October 2005 last revised 2 November 2005 |
![]() ![]() |
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu. |