Random-Access Files
This chapter introduces the concept of random file access. Random file access enables you to read or write any data in your disk file without having to read or write every piece of data before it. You can quickly search for, add, retrieve, change, and delete information in a random-access file. Although you need a few new functions to access files randomly, you will find that the extra effort pays off in flexibility, power, and speed of disk access.
This chapter introduces
- Random-access files
- File records
- The seekg() function
- Special-purpose file I/O functions
With C++’s sequential and random-access files, you can do everything you would ever want to do with disk data.
A record to a file is like a structure to variables.
You do not have to rewrite an entire file to change random-access file data.
Random File Records
Random files exemplify the power of data processing with C++. Sequential file processing is slow unless you read the entire file into arrays and process them in memory. As explained in Chapter 30, however, you have much more disk space than RAM, and most disk files do not even fit in your RAM at one time. Therefore, you need a way to quickly read individual pieces of data from a file in any order and process them one at a time.
Generally, you read and write file records. A record to a file is analogous to a C++ structure. A record is a collection of one or more data values (called fields) that you read and write to disk. You usually store data in structures and write the structures to disk, where they are called records. When you read a record from disk, you typically read it into a structure variable and process it with your program.
Unlike in many other programming languages, disk data for C++ programs does not have to be stored in record format. Typically, you write a stream of characters to a disk file and access that data either sequentially or randomly by reading it into variables and structures.
The process of randomly accessing data in a file is simple. Think about the data files of a large credit card organization. When you make a purchase, the store calls the credit card company to receive authorization. Millions of names are in the credit card company’s files. The credit card company cannot quickly read, in sequence, every record on the disk that comes before yours. Sequential files do not lend themselves to quick access. In many situations, it is not feasible to look up individual records in a data file with sequential access.
The credit card companies must use random file access so their computers can go directly to your record, just as you go directly to a song on a compact disc or record album. The functions you use are different from the sequential functions, but the power that results from learning the added functions is worth the effort.
When your program reads and writes files randomly, it treats the file like a big array. With arrays, you know you can add, print, or remove values in any order. You do not have to start at the first array element, sequentially looking at the next one, until you get the element you need. You can view your random-access file in the same way, accessing the data in any order.
Most random file records are fixed-length records. Each record (usually a row in the file) takes the same amount of disk space. Most of the sequential files you read and wrote in the previous chapters used variable-length records. When you are reading or writing sequentially, there is no need for fixed-length records because you input each value one character, word, string, or number at a time, and look for the data you want. With fixed-length records, your computer can calculate where on the disk the desired record is located: counting the first record as record 0, record number n starts n times the record size in bytes from the beginning of the file.
Although you waste some disk space with fixed-length records (the padding that fills out a field when the data do not actually fill the structure size), the advantages of random file access compensate for the “wasted” disk space.
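As a minimal sketch of the fixed-length idea, the following fragment forces every name into a field of the same size, so every record occupies the same number of bytes on disk. The FriendRecord structure, its field sizes, and the make_record() and record_offset() helpers are hypothetical names invented for this illustration. The sketch also assumes a standard C++ compiler and its standard headers rather than the older .h headers used by this chapter's programs.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical fixed-length record: every record is sizeof(FriendRecord) bytes.
struct FriendRecord
{
   char name[20];   // Shorter names are padded with null bytes.
   int  age;
};

// Builds a record, truncating or padding the name to fit the field.
FriendRecord make_record(const std::string &name, int age)
{
   FriendRecord rec;
   std::memset(&rec, 0, sizeof(rec));   // Fill the whole record with padding.
   std::strncpy(rec.name, name.c_str(), sizeof(rec.name) - 1);
   rec.age = age;
   return rec;
}

// Byte offset of record number index (the first record is number 0).
long record_offset(long index)
{
   return index * static_cast<long>(sizeof(FriendRecord));
}
```

Because the record size never changes, record_offset() needs nothing but the record number. That is the calculation a program relies on when it jumps straight to one record.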
Opening Random-Access Files
Just as with sequential files, you must open random-access files before reading or writing to them. If you only need to read a file randomly, you can use any of the read access modes mentioned in Chapter 30 (such as ios::in). However, to modify data in a file, you must open the file in one of the update modes, repeated for you in Table 31.1.
Table 31.1. Random-access update modes.
Mode | Description |
---|---|
app | Open the file for appending (adding to it) |
ate | Seek to end of file on opening it |
in | Open file for reading |
out | Open file for writing |
binary | Open file in binary mode |
trunc | Discard contents if file exists |
nocreate | If file doesn’t exist, open fails |
noreplace | If file exists, open fails unless appending or seeking to end of file on opening |
There is really no difference between sequential files and random files in C++. The difference between the files is not physical, but lies in the method you use to access them and update them.
Examples
- Suppose you want to write a program to create a file of your friends’ names. The following open() function call suffices, assuming fp is declared as a file pointer:
fp.open("NAMES.DAT", ios::out);
if (!fp)
   { cout << "\n*** Cannot open file ***\n"; }
No update open() access mode is needed if you are only creating the file. However, what if you wanted to create the file, write names to it, and give the user a chance to change any of the names before closing the file? You then have to open the file like this:
fp.open("NAMES.DAT", ios::in | ios::out);
if (!fp)
   cout << "\n*** Cannot open file ***\n";
This code enables you to create the file, then change data you wrote to the file.
- As with sequential files, the only difference between using a binary open() access mode and a text mode is that the file you create is more compact and saves disk space. You cannot, however, read that file from other programs as an ASCII text file. The previous open() function can be rewritten to create and allow updating of a binary file. All other file-related commands and functions work for binary files just as they do for text files.
fp.open("NAMES.DAT", ios::in | ios::out | ios::binary);
if (!fp)
   cout << "\n*** Cannot open file ***\n";
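The open() calls above can be wrapped in a small helper. The sketch below assumes a standard C++ compiler, where the headers are <fstream> and <string> and the names live in namespace std. On such compilers, ios::in | ios::out opens only a file that already exists, so the sketch falls back to ios::trunc to create an empty file when the first attempt fails. The open_for_update() name is hypothetical and is not part of any library.

```cpp
#include <fstream>
#include <string>

// Opens path for reading and writing in binary mode.
// Assumption: in standard C++, ios::in | ios::out fails if the file does not
// exist, so this sketch creates an empty file with ios::trunc in that case.
bool open_for_update(std::fstream &fp, const std::string &path)
{
   fp.open(path.c_str(), std::ios::in | std::ios::out | std::ios::binary);
   if (!fp)
   {
      fp.clear();   // Reset the error state left by the failed open.
      fp.open(path.c_str(),
              std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc);
   }
   return fp.is_open();
}
```

Trying the non-destructive mode first matters: opening with ios::trunc straight away would discard the names already stored in an existing file.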
You can read forwards or backwards from any point in the file with seekg().
The seekg() Function
C++ provides a function that enables you to read to a specific point in a random-access data file. This is the seekg() function. The format of seekg() is
file_ptr.seekg(long_num, origin);
file_ptr is the pointer to the file that you want to access, initialized with an open() statement. long_num is the number of bytes in the file you want to skip. C++ does not read this many bytes, but literally skips the data by the number of bytes specified in long_num. Skipping the bytes on the disk is much faster than reading them. If long_num is negative, C++ skips backwards in the file (this allows for rereading of data several times). Because data files can be large, you must declare long_num as a long integer to hold a large number of bytes.
origin is a value that tells C++ where to begin the skipping of bytes specified by long_num. origin can be any of the three values shown in Table 31.2.
Table 31.2. Possible origin values.
Description | origin | Equivalent |
---|---|---|
Beginning of file | SEEK_SET | ios::beg |
Current file position | SEEK_CUR | ios::cur |
End of file | SEEK_END | ios::end |
The origins SEEK_SET, SEEK_CUR, and SEEK_END are defined in stdio.h. The equivalents ios::beg, ios::cur, and ios::end are defined in fstream.h.
Examples
- No matter how far into a file you have read, the following seekg() function positions the file pointer back to the beginning of a file:
fp.seekg(0L, SEEK_SET);   // Position file pointer at beginning.
The constant 0L passes a long integer 0 to the seekg() function, which matches the long offset in the prototype for seekg() that is located in fstream.h. Without the L, the literal is a regular integer that the compiler must convert to a long; the suffix makes the intended type explicit. Chapter 4, “Variables and Literals,” explained the use of data type suffixes on numeric constants, but the suffixes have not been used until now.
This seekg() function literally reads “move the file pointer 0 bytes from the beginning of the file.”
- The following example reads a file named MYFILE.TXT twice, once to send the file to the screen and once to send the file to the printer. Three file pointers are used, one for each device (the file, the screen, and the printer).
// Filename: C31TWIC.CPP
// Writes a file to the screen, rereads it,
// and sends it to the printer.
#include <fstream.h>
#include <stdlib.h>
#include <stdio.h>

ifstream in_file;   // Input file pointer.
ofstream scrn;      // Screen pointer.
ofstream prnt;      // Printer pointer.

void main()
{
   char in_char;

   in_file.open("MYFILE.TXT", ios::in);
   if (!in_file)
   {
      cout << "\n*** Error opening MYFILE.TXT ***\n";
      exit(0);
   }
   scrn.open("CON", ios::out);   // Open screen device.
   while (in_file.get(in_char))
      { scrn << in_char; }   // Output characters to the screen.
   scrn.close();   // Close screen because it is no
                   // longer needed.
   in_file.clear();               // Reset the end-of-file state before seeking.
   in_file.seekg(0L, SEEK_SET);   // Reposition file pointer.
   prnt.open("LPT1", ios::out);   // Open printer device.
   while (in_file.get(in_char))
      { prnt << in_char; }   // Output characters to the
                             // printer.
   prnt.close();   // Always close all open files.
   in_file.close();
   return;
}
You also can close and then reopen a file to position the file pointer at the beginning, but using seekg() is a more efficient method.
Of course, you could have used regular I/O functions to write to the screen, rather than having to open the screen as a separate device.
- The following seekg() function skips the first 30 bytes in the file, so the file pointer sits just past the 30th byte. (The next byte read is the 31st byte.)
file_ptr.seekg(30L, SEEK_SET);   // Position file pointer
                                 // past the 30th byte.
This seekg() function literally reads “move the file pointer 30 bytes from the beginning of the file.”
If you write structures to a file, you can quickly seek any structure in the file using the sizeof() function. Suppose you want the 123rd occurrence of the structure tagged with inventory. Because the first structure starts at byte 0, you skip the 122 structures that come before it. You would search using the following seekg() function:
file_ptr.seekg((122L * sizeof(struct inventory)), SEEK_SET);
- The following program writes the letters of the alphabet to a file called ALPH.TXT. The seekg() function is then used to read and display the ninth and 17th letters (I and Q).
// Filename: C31ALPH.CPP
// Stores the alphabet in a file, then reads
// two letters from it.
#include <fstream.h>
#include <stdlib.h>
#include <stdio.h>

fstream fp;

void main()
{
   char ch;   // Holds A through Z.

   // Open in update mode so you can read file after writing to it.
   fp.open("alph.txt", ios::in | ios::out);
   if (!fp)
   {
      cout << "\n*** Error opening file ***\n";
      exit(0);
   }
   for (ch = 'A'; ch <= 'Z'; ch++)
      { fp << ch; }   // Write letters.
   fp.seekg(8L, ios::beg);   // Skip eight letters, point to I.
   fp >> ch;
   cout << "The first character is " << ch << "\n";
   fp.seekg(16L, ios::beg);   // Skip 16 letters, point to Q.
   fp >> ch;
   cout << "The second character is " << ch << "\n";
   fp.close();
   return;
}
- To point to the end of a data file, you can use the seekg() function to position the file pointer just past the last byte. Subsequent seekg()s should then use a negative long_num value to skip backwards in the file. The following seekg() function makes the file pointer point to the end of the file:
file_ptr.seekg(0L, SEEK_END);   // Position file
                                // pointer at the end.
This seekg() function literally reads “move the file pointer 0 bytes from the end of the file.” The file pointer now points to the end-of-file marker, but you can seekg() backwards to find other data in the file.
- The following program reads the ALPH.TXT file (created in Example 4) backwards, printing each character as it skips back in the file.
// Filename: C31BACK.CPP
// Reads and prints a file backwards.
#include <fstream.h>
#include <stdlib.h>
#include <stdio.h>

ifstream fp;

void main()
{
   int ctr;   // Steps through the 26 letters in the file.
   char in_char;

   fp.open("ALPH.TXT", ios::in);
   if (!fp)
   {
      cout << "\n*** Error opening file ***\n";
      exit(0);
   }
   fp.seekg(-1L, SEEK_END);   // Point to last byte in
                              // the file.
   for (ctr = 0; ctr < 26; ctr++)
   {
      fp >> in_char;
      fp.seekg(-2L, SEEK_CUR);
      cout << in_char;
   }
   fp.close();
   return;
}
This program also uses the SEEK_CUR origin value. The last seekg() in the program seeks two bytes backwards from the current position, not from the beginning or end as the previous examples have. The for loop towards the end of the program performs a “skip-two-bytes-back, read-one-byte-forward” method to skip through the file backwards.
- The following program performs the same actions as Example 4 (C31ALPH.CPP), with one addition. When the letters I and Q are found, the letter x is written over the I and Q. The seekg() must be used to back up one byte in the file to overwrite the letter just read.
// Filename: C31CHANG.CPP
// Stores the alphabet in a file, reads two letters from it,
// and changes those letters to xs.
#include <fstream.h>
#include <stdlib.h>
#include <stdio.h>

fstream fp;

void main()
{
   char ch;   // Holds A through Z.

   // Open in update mode so you can read file after writing to it.
   fp.open("alph.txt", ios::in | ios::out);
   if (!fp)
   {
      cout << "\n*** Error opening file ***\n";
      exit(0);
   }
   for (ch = 'A'; ch <= 'Z'; ch++)
      { fp << ch; }   // Write letters
   fp.seekg(8L, SEEK_SET);   // Skip eight letters, point to I.
   fp >> ch;
   // Change the I to an x.
   fp.seekg(-1L, SEEK_CUR);
   fp << 'x';
   cout << "The first character is " << ch << "\n";
   fp.seekg(16L, SEEK_SET);   // Skip 16 letters, point to Q.
   fp >> ch;
   cout << "The second character is " << ch << "\n";
   // Change the Q to an x.
   fp.seekg(-1L, SEEK_CUR);
   fp << 'x';
   fp.close();
   return;
}
The file named ALPH.TXT now looks like this:
ABCDEFGHxJKLMNOPxRSTUVWXYZ
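The in-place change can also be sketched as a helper that overwrites a single byte of an existing file. This sketch assumes a standard C++ compiler. The overwrite_at() name is hypothetical. It calls seekp(), the output-side counterpart of seekg(); for a file stream the two move the same single file position.

```cpp
#include <fstream>
#include <string>

// Replaces the byte at the given offset of an existing file.
// Returns false if the file cannot be opened or the write fails.
bool overwrite_at(const std::string &path, long position, char replacement)
{
   // Update mode: the file is neither truncated nor created.
   std::fstream fp(path.c_str(),
                   std::ios::in | std::ios::out | std::ios::binary);
   if (!fp)
      { return false; }
   fp.seekp(position, std::ios::beg);   // Skip `position` bytes.
   fp.put(replacement);                 // Overwrite one byte in place.
   return fp.good();
}
```

Calling overwrite_at() with offsets 8 and 16 on a file holding the alphabet produces the same result as the program above, without touching the other 24 bytes.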
This program forms the basis of a more complete data file management program. After you master the seekg() functions and become more familiar with disk data files, you will begin to write programs that store more advanced data structures and access them.
The mailing list application in Appendix F is a good example of what you can do with random file access. The user is given a chance to change names and addresses already in the file. The program, using random access, seeks for and changes selected data without rewriting the entire disk file.
Other Helpful I/O Functions
There are several more disk I/O functions available that you might find useful. They are mentioned here for completeness. As you perform more powerful disk I/O, you might find a use for many of these functions. The read() and write() functions are prototyped in the fstream.h header file; remove() is prototyped in stdio.h.
- read(array, count): Reads count bytes from the file into the array or pointer specified by array. read() is called a buffered I/O function. read() enables you to read much data with a single function call.
- write(array, count): Writes count bytes from array to the specified file. write() is a buffered I/O function. write() enables you to write much data in a single function call.
- remove(filename): Erases the file named by filename. remove() returns a 0 if the file was erased successfully and a nonzero value (typically -1) if an error occurred.
Many of these (and other built-in I/O functions that you learn in your C++ programming career) are helpful functions that you could duplicate using what you already know.
The buffered I/O file functions enable you to read and write entire arrays (including arrays of structures) to the disk in a single function call.
Examples
- The following program requests a filename from the user and erases the file from the disk using the remove() function.
// Filename: C31ERAS.CPP
// Erases the file specified by the user.
#include <stdio.h>
#include <iostream.h>

void main()
{
   char filename[13];   // Room for an 8.3 filename plus the terminating null.

   cout << "What is the filename you want me to erase? ";
   cin >> filename;
   if (remove(filename) != 0)   // Any nonzero return value means failure.
      { cout << "\n*** I could not remove the file ***\n"; }
   else
      { cout << "\nThe file " << filename << " is now removed\n"; }
   return;
}
- The following function is part of a larger program that receives inventory data, in an array of structures, from the user. This function is passed the array name and the number of elements (structure variables) in the array. The write() function then writes the complete array of structures to the disk file pointed to by fp.
void write_str(inventory items[], int inv_cnt)
{
   // write() expects a pointer to bytes, so cast the array of structures.
   fp.write((char *)items, inv_cnt * sizeof(inventory));
   return;
}
If the inventory array had 1,000 elements, this one-line function would still write the entire array to the disk file. You could use the r ead() function to read the entire array of structures from the disk in a single function call.
Review Questions
The answers to the review questions are in Appendix B.
- What is the difference between records and structures?
- True or false: You have to create a random-access file before reading from it randomly.
- What happens to the file pointer as you read from a file?
- What are the two buffered file I/O functions?
- What is wrong with this program?
#include <fstream.h>
ifstream fp;
void main()
{
   char in_char;
   fp.open(ios::in | ios::binary);
   if (fp.get(in_char))
      { cout << in_char; }   // Write to the screen
   fp.close();
   return;
}
Review Exercises
- Write a program that asks the user for a list of five names, then writes the names to a file. Rewind the file and display its contents on-screen using the seekg() and get() functions.
- Rewrite the program in Exercise 1 so it displays every other character in the file of names.
- Write a program that reads characters from a file. If the input character is a lowercase letter, change it to uppercase. If the input character is an uppercase letter, change it to lowercase. Do not change other characters in the file.
- Write a program that displays the number of nonalphabetic characters in a file.
- Write a grade-keeping program for a teacher. Allow the teacher to enter up to 10 students’ grades. Each student has three grades for the semester. Store the students’ names and their three grades in an array of structures and store the data on the disk. Make the program menu-driven. Include options of adding more students, viewing the file’s data, or printing the grades to the printer with a calculated class average.
Summary
C++ supports random-access files with several functions. These functions include error checking, file pointer positioning, and the opening and closing of files. You now have the tools you need to save your C++ program data to disk for storage and retrieval.
The mailing-list application in Appendix F offers a complete example of random-access file manipulation. The program enables the user to enter names and addresses, store them to disk, edit them, change them, and print them from the disk file. The mailing-list program combines almost every topic from this book into a complete application that “puts it all together.”