| CSC 161 | Grinnell College | Spring, 2012 |
| Imperative Problem Solving and Data Structures | ||
This page outlines techniques available within C for working with strings. The discussion falls into four categories: accessing individual characters, extracting substrings, splitting a string into pieces, and converting strings to numbers.
All examples that follow may be found in the C program string-processing.c.
In C, a string is an array of characters terminated by a null character, and each character can be accessed directly through an index within the array.
Example:
char myArray[17] = "computer science";
int i;
for (i = 0; i < strlen(myArray); i++)
  printf ("\tmyArray[%d] = %c\n", i, myArray[i]);
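Because a string is an array, an index can be used to change a character as well as to read it. The following is a minimal sketch of that idea; the helper name upcaseString is hypothetical and is not part of string-processing.c.

```c
#include <ctype.h>   /* toupper */
#include <string.h>  /* strlen */

/* upcaseString is a hypothetical helper, used only for illustration.
   It replaces each lowercase letter in str by its uppercase form,
   reaching every character through its index in the array. */
void upcaseString (char str[])
{
  size_t i;
  size_t length = strlen (str);  /* compute the length once, before the loop */
  for (i = 0; i < length; i++)
    str[i] = (char) toupper ((unsigned char) str[i]);
}
```

Applied to myArray from the example above, a call upcaseString (myArray) would leave the array holding "COMPUTER SCIENCE". The argument must be a modifiable array, not a string literal.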
A common task is to extract a substring from a given string. For example, in Scheme, one might use the substring function:
(substring str start end)
According to the Scheme R5RS report, "`Substring' returns a newly allocated string formed from the characters of string beginning with index start (inclusive) and ending with index end (exclusive)."
In C, there is no function named substring, but the function strncpy accomplishes the same result: one copies end-start characters from str, starting at the offset start. strncpy writes a null character only when it reaches the end of the source string within that count, so if the extracted substring stops before the end of str, one must remember to add a null character at the end.
The header of strncpy reads:
char *strncpy(char *result, const char *original, size_t numberCharacters);
In addition to copying the characters from original to result, the function returns a pointer to result.
Several examples follow:
/* extracting substrings from a string */
/* the example uses myArray from above as the starting point */
printf ("extracting substrings from a string\n");
char subArray[20]; /* for a string copy, remember to have allocated space! */
strncpy (subArray, myArray, 8);
subArray[8] = 0; /* null terminate array */
printf ("\t extract the first 8 characters (plus the null): %s\n", subArray);
strncpy (subArray, myArray+3, strlen(myArray)); /* when the source ends before the count is reached, strncpy fills the rest with nulls */
printf ("\t extract all characters after the first 3: %s\n", subArray);
int start = strlen(myArray) - 8;
strncpy (subArray, myArray+start, 8);
subArray[8] = 0; /* null terminate array after the 8 copied characters */
printf ("\t extract the last 8 characters (plus the null): %s\n", subArray);
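The pattern above (copy end-start characters from an offset, then null terminate) can be wrapped in a function that mirrors Scheme's substring. This is only a sketch under stated assumptions: the name substring and its parameter order are hypothetical, the caller supplies the result array, and no bounds checking is done.

```c
#include <string.h>  /* strncpy */

/* substring is a hypothetical helper modeled on Scheme's substring.
   It copies str[start] through str[end-1] into result, adds the
   terminating null, and returns result.
   Assumptions: start <= end <= strlen(str), and result has room
   for at least end-start+1 characters. */
char * substring (char *result, const char *str, size_t start, size_t end)
{
  size_t count = end - start;          /* number of characters to copy */
  strncpy (result, str + start, count);
  result[count] = 0;                   /* strncpy may not null terminate */
  return result;
}
```

With subArray and myArray as declared above, substring (subArray, myArray, 0, 8) would place "computer" in subArray. Unlike the Scheme function, this version does not allocate a new string.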
Some applications require a string to be separated into pieces. For example, consider the string
"January,February,March,April,May,June,July,August,September,October,November,December"
We might want to break it into a sequence of substrings, separated by a given separator or delimiter. This task is accomplished by the following function:
char *strtok(char *strptr, const char *delimiter);
The first call to strtok returns the first piece of the string, up to the given delimiter. For subsequent calls, NULL is passed as the strptr parameter, and each call returns the next piece in the sequence. When no more pieces are present, strtok returns NULL.
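Warnings: strtok modifies the string it scans, overwriting each delimiter it finds with a null character, and the pointers it returns point into that string rather than to new copies. To preserve the original string, apply strtok to a copy.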
The following example places the twelve months from the string year into an array of string pointers:
char year[] = "January,February,March,April,May,June,July,August,September,October,November,December" ;
/* copy original array */
char * yearCopy = malloc(sizeof(char)*(strlen(year) + 1));
strcpy (yearCopy, year);
/* array to receive the string pieces, including space for NULL strtok at end */
char * months[13];
/* loop puts strings into successive elements of months array */
months[0] = strtok(yearCopy, ",");
i = 1;
while ((months[i] = strtok(NULL, ",")) != NULL) /* continue as long as token not NULL */
{
i++;
}
printf ("result of using strtok to break up a string of months\n");
for (i = 0; i < 12; i++)
printf ("\t month %d: %s\n", i, months[i]);
printf ("original year array: %s\n", year);
free (yearCopy); /* release the copy once the month pointers are no longer needed */
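The loop above can be generalized to any string and any number of pieces. The following sketch assumes a hypothetical helper named splitString; it also shows why the example works on a copy, since strtok writes null characters into the string it scans.

```c
#include <string.h>  /* strtok */

/* splitString is a hypothetical helper, used only for illustration.
   It breaks text at any character found in delimiters, stores a
   pointer to each piece in pieces (at most maxPieces of them),
   and returns the number of pieces stored.
   Note: text itself is modified, and the stored pointers point
   into text rather than to new copies. */
int splitString (char *text, const char *delimiters,
                 char *pieces[], int maxPieces)
{
  int count = 0;
  char *token = strtok (text, delimiters);   /* first call names the string */
  while (token != NULL && count < maxPieces)
    {
      pieces[count] = token;
      count++;
      token = strtok (NULL, delimiters);     /* later calls pass NULL */
    }
  return count;
}
```

For instance, given char text[] = "a,b,c" and an array of four pointers, splitString (text, ",", pieces, 4) would return 3, and afterward text, printed as a string, would show only "a" because the first comma has been replaced by a null character.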
Input from a command line (and input from scanf using %s format) is considered a string. The header stdlib.h declares several functions that convert strings to numbers:
int atoi(const char *strptr); /* converts a string to an int */
double atof(const char *strptr); /* converts a string to a double */
Examples:

| Function call | Result | Comments |
|---|---|---|
| atoi("1234") | 1234 | int type returned |
| atoi("3.14") | 3 | int type returned; digits after the decimal point ignored |
| atof("1234") | 1234.0 | double type returned |
| atof("3.14") | 3.14 | double type returned |
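As a sketch of how these conversions might be used, the following function adds up numbers that arrive as strings, as they would from a command line. The name totalOfStrings is hypothetical. Note that atoi and atof report no errors: a string that does not begin with a number simply converts to 0, so this sketch assumes every string is a well-formed number.

```c
#include <stdlib.h>  /* atof */

/* totalOfStrings is a hypothetical helper, used only for illustration.
   It converts each of the count strings in values to a double with
   atof and returns the sum.
   Assumption: every string holds a well-formed number; atof gives
   no indication when a conversion fails. */
double totalOfStrings (const char *values[], int count)
{
  double total = 0.0;
  int i;
  for (i = 0; i < count; i++)
    total += atof (values[i]);
  return total;
}
```

In a program whose main receives argc and argv, the numbers following the program name could be totaled by passing argv + 1 and argc - 1, with a cast of argv + 1 to const char ** to match the parameter type.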