Communicating with Files
Often you need programs that can read information from files or write results into a file. One such form of program-file communication is file redirection. This method is simple but limited. For example, suppose you want to write an interactive program that asks you for book titles and then saves the complete listing in a file. If you use redirection, as in
books > bklist
your interactive prompts are redirected into bklist. Not only does this put unwanted text into bklist, it prevents you from seeing the questions you are supposed to answer.
C, as you might expect, offers more powerful methods of communicating with files. It enables you to open a file from within a program and then use special I/O functions to read from or write to that file. Before investigating these methods, however, let’s briefly review the nature of a file.
What Is a File?
A file is a named section of storage, usually on a disk. You think of stdio.h, for instance, as the name of a file containing some useful information. To the operating system, however, a file is a bit more complicated. A large file, for example, could wind up stored in several scattered fragments, or it might contain additional data that allows the operating system to determine what kind of file it is. However, these are the operating system’s concerns, not yours (unless you are writing operating systems). Your concern is how files appear to a C program.
C views a file as a continuous sequence of bytes, each of which can be read individually. This corresponds to the file structure in the Unix environment, where C grew up. Because other environments may not correspond exactly to this model, ANSI C provides two ways to view files: the text view and the binary view.
The Text View and the Binary View
The two ANSI-mandated views of a file are binary and text. In the binary view, each and every byte of the file is accessible to a program. In the text view, what the program sees can differ from what is in the file. With the text view, the local environment’s representation of such things as the end of a line is mapped to the C view when a file is read. Similarly, the C view is mapped to the local representation when a file is written. For example, MS-DOS text files represent the end of a line with the carriage-return/linefeed combination: \r\n. Classic Macintosh text files represent the end of a line with just a carriage-return, \r. C programs represent the end of a line with just \n. Therefore, when a C program takes the text view of an MS-DOS text file, it converts \r\n to \n when reading from a file, and it converts \n to \r\n when writing to a file. When a C program takes the text view of a classic Macintosh text file, it converts the \r to \n when reading from a file, and it converts \n to \r when writing to a file.
You aren’t restricted to using only the text view for an MS-DOS text file. You can also use the binary view of the same file. If you do, your program sees both the \r and the \n characters in the file; no mapping takes place (see following Figure). MS-DOS distinguishes between text and binary files, but C provides for text and binary views. Normally, you use the text view for text files and the binary view for binary files. However, you can use either view of either type of file, although a text view of a binary file works poorly.
Although ANSI C provides for both a binary view and a text view, these views can be implemented identically. For example, because Unix uses just one file structure, both views are the same for Unix implementations.
Levels of I/O
In addition to selecting the view of a file, you can, in most cases, choose between two levels of I/O (that is, between two levels of handling access to files). Low-level I/O uses the fundamental I/O services provided by the operating system. Standard high-level I/O uses a standard package of C library functions and stdio.h header file definitions. ANSI C supports only the standard I/O package because there is no way to guarantee that all operating systems can be represented by the same low-level I/O model. Because ANSI C establishes the portability of the standard I/O model, we will concentrate on it.
C programs automatically open three files on your behalf. They are termed the standard input, the standard output, and the standard error output. The standard input, by default, is the normal input device for your system, usually your keyboard. Both the standard output and the standard error output, by default, are the normal output device for your system, usually your display screen.
The standard input, naturally, provides input to your program. It’s the file that is read by getchar(), gets(), and scanf(). The standard output is where normal program output goes. It is used by putchar(), puts(), and printf(). Redirection causes other files to be recognized as the standard input or standard output. The purpose of the standard error output file is to provide a logically distinct place to send error messages. If, for example, you use redirection to send output to a file instead of to the screen, output sent to the standard error output still goes to the screen. This is good because if the error messages were routed to the file, you would not see them until you viewed the file.