Question
When using fgetc to read the next character of a stream, you usually check that end-of-file has not been reached with
if ((c = fgetc (stream)) != EOF)
where c is of type int. Then either end-of-file has been reached and the condition fails, or c is an unsigned char converted to int, which is expected to differ from EOF, since EOF is guaranteed to be negative. Fine... apparently.
But there is a small problem... Usually the char type has no more than 8 bits, while int must have at least 16 bits, so every unsigned char is representable as an int. Nevertheless, if char had 16 or 32 bits (I know, this is never the case in practice...), nothing would prevent sizeof(int) == 1, so it would be (theoretically!) possible for fgetc (stream) to return EOF (or another negative value) even though end-of-file has not been reached...
Am I mistaken? Is there something in the C standard that prevents fgetc from returning EOF if end-of-file has not been reached? (If so, I could not find it!) Or is the if ((c = fgetc (stream)) != EOF) idiom not fully portable?...
EDIT: Indeed, this was a duplicate of Question #3860943. I did not find that question on my first search. Thanks for your help! :-)
Answer 1:
You asked:
Is it something in the C standard that prevents fgetc to return EOF if end-of-file has not been attained?
On the contrary, the standard explicitly allows EOF to be returned when an error occurs:
If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF.
In the footnotes, I see:
An end-of-file and a read error can be distinguished by use of the feof and ferror functions.
You also asked:
Or is the if ((c = fgetc (stream)) != EOF) syntax not fully portable?
On a theoretical platform where CHAR_BIT is more than 8 and sizeof(int) == 1, that won't be a valid way to check that end-of-file has been reached. For that, you'll have to resort to feof and ferror.
c = fgetc (stream);
if ( !feof(stream) && !ferror(stream) )
{
    // Got valid input in c.
}
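A minimal sketch of how the feof/ferror check above might be wrapped in a read loop. The helper names read_char and count_chars are hypothetical, and the sketch assumes the stream's end-of-file and error indicators are clear before the first call.

```c
#include <stdio.h>

/* Hypothetical helper: reads one character from stream into *out.
   Returns 1 if a character was read, 0 on end-of-file or read error.
   A return value equal to EOF is treated as failure only when one of
   the stream indicators is set, so it stays correct even on a platform
   where a valid character could compare equal to EOF. */
static int read_char(FILE *stream, int *out)
{
    int c = fgetc(stream);
    if (c == EOF && (feof(stream) || ferror(stream)))
        return 0;
    *out = c;
    return 1;
}

/* Hypothetical helper: counts the characters remaining in stream. */
static long count_chars(FILE *stream)
{
    long n = 0;
    int c;
    while (read_char(stream, &c))
        n++;
    return n;
}
```

After the loop ends, the caller can still call ferror(stream) to tell a read error apart from a normal end-of-file.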
Answer 2:
I think you need to rely on stream error.
ch = fgetc(stream);
if (ferror(stream) && (ch == EOF)) /* read error */;
From the standard:
If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF.
Edit: a better version:
ch = fgetc(stream);
if (ch == EOF) {
    if (ferror(stream)) /* error reading */;
    else if (feof(stream)) /* end of file */;
    else /* read valid character with value equal to EOF */;
}
Answer 3:
If you are reading a stream that is standard ASCII only, there's no risk of receiving the char equivalent to EOF before the real end-of-file, because valid ASCII char codes only go up to 127. But it could happen when reading a binary file. The byte would need to be 255 (unsigned) to correspond to a signed char value of -1, and nothing prevents it from appearing in a binary file.
But about your specific question (if there's something in the standard), not exactly... but notice that fgetc returns the character as an unsigned char converted to int, so it won't ever be negative in this case anyway. The only risk would be if you had explicitly or implicitly narrowed the return value to signed char (for instance, if your c variable were a signed char).
NOTE: as @Ulfalizer mentioned in the comments, there's one rare case in which you may need to worry: if sizeof(int)==1, and you're reading a file that contains non-ASCII characters, then you may get a -1 return value that is not the real EOF. Notice that environments in which this happens are quite rare (to my knowledge, mainly word-addressed DSPs whose char is 16 bits or wider, since int needs at least 16 bits). In such a case, the safe option would be to test feof() as @pmg suggested.
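The narrowing pitfall described above can be sketched as follows. Both function names are hypothetical, and the sketch assumes a typical platform with 8-bit char, a wider int, and two's complement conversion to signed char.

```c
#include <stdio.h>

/* Hypothetical: the value as fgetc delivers it, i.e. an unsigned char
   converted to int. Returns 1 if it compares equal to EOF. */
static int as_int_matches_eof(unsigned char byte)
{
    int c = byte;               /* non-negative when int is wider than char */
    return c == EOF;
}

/* Hypothetical: the same byte after being narrowed to signed char, as
   happens when the result of fgetc is stored in a (signed) char variable.
   Returns 1 if it compares equal to EOF. */
static int as_signed_char_matches_eof(unsigned char byte)
{
    signed char c = (signed char)byte;  /* 0xFF typically becomes -1 */
    return c == EOF;
}
```

Under those assumptions the byte 0xFF is distinguishable from EOF only while it is kept in an int, which is why c should be declared as int.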
Answer 4:
I agree with your reading.
The C Standard says (C11, 7.21.7.1 The fgetc function p3):
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF.
There is nothing in the Standard (assuming UCHAR_MAX > INT_MAX) that prevents fgetc in a hosted implementation from returning a value equal to EOF that indicates neither an end-of-file nor an error condition.
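As a rough sketch, the UCHAR_MAX > INT_MAX condition can be tested at compile time with the limits from <limits.h>, so a program can tell whether the plain comparison against EOF is unambiguous on the platform it is built for. The function name eof_check_is_unambiguous is hypothetical.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when every unsigned char value fits in
   an int. In that case fgetc returns a non-negative value for any valid
   character, so a result equal to EOF can only mean end-of-file or a
   read error. Returns 0 on the exotic platforms discussed above. */
static int eof_check_is_unambiguous(void)
{
#if UCHAR_MAX <= INT_MAX
    return 1;
#else
    return 0;
#endif
}
```

When it returns 0, code would need the additional feof and ferror checks to interpret a return value equal to EOF.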
Source: https://stackoverflow.com/questions/29975874/might-an-unsigned-char-be-equal-to-eof