I'm quite new to C. I faced a problem while studying the last chapter of K&R.
I'm trying to implement the fopen() and fillbuf() functions.
The code shown in the question consists of parts, but not all, of the code from K&R "The C Programming Language, 2nd Edition" (1988; my copy is marked 'Based on Draft Proposed ANSI C'), pages 176-178, plus a sample main program that is not from the book at all. The type was also renamed from FILE to myFILE, and fopen() was renamed to myfopen(). I note that the expressions in the code in the question have far fewer spaces than the original code in K&R. The compiler doesn't mind; human readers generally prefer spaces around operators.
As stated in another (later) question and answer, the diagnosis given by Mark Yisri in the currently accepted answer is incorrect: the problem is not a null pointer in the for loop. The prescribed remedy works (as long as the program is invoked correctly), but the memory allocation is not necessary. Fortunately for all concerned, the fclose() function was not included in the implementations, so it wasn't possible to close a file once it was opened.
In particular, the loop:
for (fp = _iob; fp < _iob + OPEN_MAX; fp++)
    if ((fp->flag & (_READ | _WRITE)) == 0)
        break;
is perfectly OK because the array _iob is defined as:
FILE _iob[OPEN_MAX] = {
    …initializers for stdin, stdout, stderr…
};
This is an array of structures, not structure pointers. The first three elements are initialized explicitly; the remaining elements are implicitly initialized to all zeros. Consequently, there is no chance of there being a null pointer in fp as it steps through the array. The loop might also be written as:
for (fp = &_iob[0]; fp < &_iob[OPEN_MAX]; fp++)
    if ((fp->flag & (_READ | _WRITE)) == 0)
        break;
Empirically, if the code shown in the question (including the main(), which was not, repeat not, written by K&R) is invoked correctly, it works without crashing. However, the code in the main() program does not protect itself from:
- being invoked without a file name, so that there is no usable argv[1];
- being given the name of a file that cannot be opened for reading in argv[1].
These are very common problems, and with the main program as written, either could cause the program to crash.
Although it is hard to be sure 16 months later, it seems likely to me that the problem was in the way that the program was invoked rather than anything else. If the main program is written more-or-less appropriately, you end up with code similar to this (you also need to add #include <string.h> to the list of included headers):
int main(int argc, char *argv[])
{
    myFILE *fp;
    int c;

    /* Exactly one file name argument is required. */
    if (argc != 2)
    {
        static const char usage[] = "Usage: mystdio filename\n";
        write(2, usage, sizeof(usage) - 1);
        return 1;
    }
    if ((fp = myfopen(argv[1], "r")) == NULL)
    {
        static const char filenotopened[] = "mystdio: failed to open file ";
        write(2, filenotopened, sizeof(filenotopened) - 1);
        write(2, argv[1], strlen(argv[1]));
        write(2, "\n", 1);
        return 1;
    }
    /* sizeof counts the terminating null byte, so subtract 1. */
    write(1, "opened\n", sizeof("opened\n") - 1);
    while ((c = getc(fp)) != EOF)
    {
        char ch = (char)c;      /* write one byte, not sizeof(int) bytes */
        write(1, &ch, 1);
    }
    return 0;
}
This can't use fprintf() etc. because the surrogate implementation of the standard I/O library is not complete. Writing the errors directly to file descriptor 2 (standard error) with write() is fiddly, if not painful. It also means that I've taken shortcuts like assuming that the program is called mystdio rather than actually using argv[0] in the error messages. However, if it is invoked without any file name (or if more than one file name is given), or if the named file cannot be opened for reading, then it produces a more or less appropriate error message, and it does not crash.
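Note that the C standard reserves identifiers starting with underscores. In general, you should not create function, variable or macro names that start with an underscore. C11 §7.1.3 Reserved identifiers says (in part):
- All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
- All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.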
See also What does double underscore (__const) mean in C?
In fairness, K&R were essentially describing the standard implementation of the standard I/O library at the time when the 1st Edition was written (1978), modernized sufficiently to be using function prototype notation in the 2nd Edition. The original code was on pages 165-168 of the 1st Edition.
Even today, if you are implementing the standard library, you would use names starting with underscores precisely because they are reserved for use 'by the implementation'. If you are not implementing the standard library, you do not use names starting with underscores because that uses the namespace reserved for the implementation. Most people, most of the time, are not writing the standard library — most people should not be using leading underscores.
EDIT: Please see Jonathan Leffler's answer. It is more accurate and provides a better diagnosis. My answer works, but there is a better way to do things.
I see the problem.
myFILE *fp;
if(*mode!='r' && *mode!='w' && *mode!='a')
    return NULL;
for(fp=_iob; fp<_iob+OPEN_MAX; fp++)
    if((fp->flag & (_READ | _WRITE))==0) // marked line
        break;
When you reach the marked line, you try to dereference the fp pointer. Since it is (likely, but not certainly) initialized to zero (but I should say NULL), you are dereferencing a null pointer. Boom. Segfault.
Here's what you need to change.
myFILE *fp = (myFILE *)malloc(sizeof(myFILE));
Be sure to #include <stdlib.h> to use malloc.
Also, your close function should later free() your myFILE to prevent memory leaks.