Question
Normally, to indicate EOF to a program attached to standard input on a Linux terminal, I need to press Ctrl+D once if I just pressed Enter, or twice otherwise. I noticed that the patch command is different, though. With it, I need to press Ctrl+D twice if I just pressed Enter, or three times otherwise. (Doing cat | patch instead doesn't have this oddity. Also, if I press Ctrl+D before typing any real input at all, it doesn't have this oddity.) Digging into patch's source code, I traced this back to the way it loops on fread. Here's a minimal program that does the same thing:
#include <stdio.h>

int main(void) {
    char buf[4096];
    size_t charsread;
    while((charsread = fread(buf, 1, sizeof(buf), stdin)) != 0) {
        printf("Read %zu bytes. EOF: %d. Error: %d.\n", charsread, feof(stdin), ferror(stdin));
    }
    printf("Read zero bytes. EOF: %d. Error: %d. Exiting.\n", feof(stdin), ferror(stdin));
    return 0;
}
When compiling and running the above program exactly as-is, here's a timeline of events:
1. My program calls fread.
2. fread calls the read system call.
3. I type "asdf".
4. I press Enter.
5. The read system call returns 5.
6. fread calls the read system call again.
7. I press Ctrl+D.
8. The read system call returns 0.
9. fread returns 5.
10. My program prints Read 5 bytes. EOF: 1. Error: 0.
11. My program calls fread again.
12. fread calls the read system call.
13. I press Ctrl+D again.
14. The read system call returns 0.
15. fread returns 0.
16. My program prints Read zero bytes. EOF: 1. Error: 0. Exiting.
Why does this means of reading stdin have this behavior, unlike the way that every other program seems to read it? Is this a bug in patch? How should this kind of loop be written to avoid this behavior?
UPDATE: This seems to be related to libc. I originally experienced it on glibc 2.23-0ubuntu3 from Ubuntu 16.04. @Barmar noted in the comments that it doesn't happen on macOS. After hearing this, I tried compiling the same program against musl 1.1.9-1, also from Ubuntu 16.04, and it didn't have this problem. On musl, the sequence of events has steps 12 through 14 removed, which is why it doesn't have the problem, but is otherwise the same (except for the irrelevant detail of readv in place of read).
Now, the question becomes: is glibc wrong in its behavior, or is patch wrong in assuming that its libc won't have this behavior?
Answer 1:
I've managed to confirm that this is due to an unambiguous bug in glibc versions prior to 2.28 (commit 2cc7bad). Relevant quotes from the C standard:
The byte input/output functions — those functions described in this subclause that perform input/output: [...],
fread
The byte input functions read characters from the stream as if by successive calls to the fgetc function.
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream.
(emphasis on "or" mine)
The following program demonstrates the bug with fgetc:
#include <stdio.h>

int main(void) {
    while(fgetc(stdin) != EOF) {
        puts("Read and discarded a character from stdin");
    }
    puts("fgetc(stdin) returned EOF");
    if(!feof(stdin)) {
        /* Included only for completeness. Doesn't occur in my testing. */
        puts("Standard violation! After fgetc returned EOF, the end-of-file indicator wasn't set");
        return 1;
    }
    if(fgetc(stdin) != EOF) {
        /* This happens with glibc in my testing. */
        puts("Standard violation! When fgetc was called with the end-of-file indicator set, it didn't return EOF");
        return 1;
    }
    /* This happens with musl in my testing. */
    puts("No standard violation detected");
    return 0;
}
To demonstrate the bug:
- Compile the program and execute it
- Press Ctrl+D
- Press Enter
The exact bug is that if the end-of-file stream indicator is set, but the stream is not at end-of-file, glibc's fgetc will return the next character from the stream, rather than EOF as the standard requires.
Since fread is defined in terms of fgetc, this is the cause of what I originally saw. It's previously been reported as glibc bug #1190 and has been fixed since commit 2cc7bad in February 2018, which landed in glibc 2.28 in August 2018.
Source: https://stackoverflow.com/questions/52674057/why-does-an-fread-loop-require-an-extra-ctrld-to-signal-eof-with-glibc