Question
According to the C standard, any characters returned by fgetc are returned in the form of unsigned char values, "converted to an int" (that quote comes from the C standard, stating that there is indeed a conversion).
When sizeof (int) == 1, many unsigned char values are outside the range of int. It is thus possible that some of those unsigned char values end up being converted to an int value equal to EOF (the result of the conversion being "implementation-defined or an implementation-defined signal is raised"), which would then be returned even though the file is not actually in an error or end-of-file state.
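The collision described above can be modelled on an ordinary host. This is only a sketch under stated assumptions: uint16_t stands in for a 16-bit unsigned char, int16_t for a 16-bit int, and EOF is modelled as -1. The names model_convert, count_out_of_range and MODEL_EOF are hypothetical. The out-of-range conversion is implementation-defined, so the wrap-around noted in the comments is what common compilers do, not something the standard guarantees.

```c
#include <stdint.h>

/* Assumption: EOF is -1 on the modelled platform. */
enum { MODEL_EOF = -1 };

/* Hypothetical helper: converts a 16-bit "character" to a 16-bit "int".
   For values above INT16_MAX the result is implementation-defined;
   common compilers reduce the value modulo 2^16. */
static int16_t model_convert(uint16_t ch)
{
    return (int16_t)ch;
}

/* Counts the character values that cannot be represented in the 16-bit int. */
static long count_out_of_range(void)
{
    long n = 0;
    uint32_t v;

    for (v = 0; v <= UINT16_MAX; v++) {
        if (v > (uint32_t)INT16_MAX) {
            n++;
        }
    }
    return n;
}
```

On compilers that wrap, model_convert(0xFFFF) compares equal to MODEL_EOF, which is exactly the ambiguity the question is about.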
I was surprised to find that such an implementation actually exists. The TMS320C55x CCS manual documents UCHAR_MAX as 65535, INT_MAX as 32767, and fputs and fopen as supporting binary mode... What's even more surprising is that it seems to describe the environment as a fully conforming, complete implementation (minus signals).
The C55x C/C++ compiler fully conforms to the ISO C standard as defined by the ISO specification ...
The compiler tools come with a complete runtime library. All library functions conform to the ISO C library standard. ...
Is an implementation that can return a value indicating an error where there is none really fully conforming? Could this justify using feof and ferror in the condition section of a loop (as hideous as that seems)? For example, while ((c = fgetc(stdin)) != EOF || !(feof(stdin) || ferror(stdin))) { ... }
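For illustration, here is the loop condition from the question wrapped in a small function. The name count_chars is hypothetical. The loop stops only when fgetc returns EOF and the stream really reports end-of-file or an error, so a character value that happens to compare equal to EOF would still be counted.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts characters read from fp until a genuine
   end-of-file or read error, not merely an EOF-valued return. */
static long count_chars(FILE *fp)
{
    long n = 0;
    int c;

    assert(fp != NULL);

    /* Continue while the value is not EOF, or while it is EOF but the
       stream indicators show neither end-of-file nor an error. */
    while ((c = fgetc(fp)) != EOF || !(feof(fp) || ferror(fp))) {
        n++;
    }
    return n;
}
```

On platforms where every unsigned char value fits in int, the extra feof/ferror test is redundant but harmless: it is only evaluated once, when fgetc finally returns EOF.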
Answer 1:
The function fgetc() returns an int value in the range of unsigned char only when a proper character is read; otherwise it returns EOF, which is a negative value of type int.
My original answer (since changed) assumed that there was an integer conversion to int, but this is not the case, since the function fgetc() already returns a value of type int.
I think that, to be conforming, the implementation has to make fgetc() return nonnegative values in the range of int, unless EOF is returned.
In this way, the values from 32768 to 65535 will never be associated with character codes in the TMS320C55x implementation.
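Whether a given platform is in this situation at all can be checked from <limits.h>. The following is a minimal sketch, with eof_is_unambiguous as a hypothetical name: if every unsigned char value fits in int, no character can collide with the negative EOF. With the limits quoted in the question (UCHAR_MAX 65535, INT_MAX 32767) the helper would return 0.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* The standard requires EOF to be a negative int constant. */
static_assert(EOF < 0, "EOF must be negative");

/* Hypothetical helper: returns 1 when every unsigned char value is
   representable as a nonnegative int, so EOF cannot be mistaken for
   a character; returns 0 otherwise. */
static int eof_is_unambiguous(void)
{
    return UCHAR_MAX <= INT_MAX;
}
```

Code that must also run where this returns 0 would need the feof/ferror test from the question rather than a bare comparison with EOF.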
Source: https://stackoverflow.com/questions/30836207/can-an-implementation-that-has-sizeof-int-1-fully-conform