I am trying to read efficiently from stdin by using setvbuf in `_IOFBF` mode. I am new to buffering. I am looking for working examples.
I am going to recommend trying full buffering with setvbuf and ditching fread. If the specification is that there is one number per line, I will take that for granted, use fgets to read in a full line, and pass it to strtoul to parse the number that is supposed to be on that line.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define INITIAL_BUFFER_SIZE 2 /* for testing */
int main(void) {
int n;
int divisor;
int answer = 0;
int current_buffer_size = INITIAL_BUFFER_SIZE;
char *line = malloc(current_buffer_size);
if ( line == NULL ) {
return EXIT_FAILURE;
}
setvbuf(stdin, (char*)NULL, _IOFBF, 0);
scanf("%d%d\n", &n, &divisor);
while ( n > 0 ) {
unsigned long dividend;
char *endp;
int offset = 0;
while ( fgets(line + offset, current_buffer_size - offset, stdin) ) { /* never write past the end of line */
if ( line[strlen(line) - 1] == '\n' ) {
break;
}
else {
int new_buffer_size = 2 * current_buffer_size;
char *tmp = realloc(line, new_buffer_size);
if ( tmp ) {
line = tmp;
offset = current_buffer_size - 1;
current_buffer_size = new_buffer_size;
}
else {
break;
}
}
}
errno = 0;
dividend = strtoul(line, &endp, 10);
if ( !( (endp == line) || errno ) ) {
if ( dividend % divisor == 0 ) {
answer += 1;
}
}
n -= 1;
}
printf("%d\n", answer);
free(line);
return 0;
}
I used a Perl script to generate 1,000,000 random integers between 0 and 1,000,000 and checked if they were divisible by 5 after compiling this program with gcc version 3.4.5 (mingw-vista special r3) on my Windows XP laptop. The whole thing took less than 0.8 seconds.
When I turned buffering off with setvbuf(stdin, (char*)NULL, _IONBF, 0), the time went up to about 15 seconds.
Here's my byte-by-byte take on it:
/*
Buffered reading from stdin using fread in C,
http://stackoverflow.com/questions/2371292/buffered-reading-from-stdin-for-performance
compile with:
gcc -Wall -O3 fread-stdin.c
create numbers.txt:
echo 1000000 5 > numbers.txt
jot -r 1000000 1 1000000 $RANDOM >> numbers.txt
time -p cat numbers.txt | ./a.out
*/
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#define BUFSIZE 32
int main() {
int n, k, tmp, ans=0, i=0, countNL=0;
char *endp = 0;
setvbuf(stdin, (char*)NULL, _IOFBF, 0); // turn buffering mode on
//setvbuf(stdin, (char*)NULL, _IONBF, 0); // turn buffering mode off
scanf("%d%d\n", &n, &k);
char singlechar = 0;
char intbuf[BUFSIZE + 1] = {0};
while(fread(&singlechar, 1, 1, stdin)) // fread byte-by-byte
{
if (singlechar == '\n')
{
countNL++;
intbuf[i] = '\0';
tmp = strtoul(intbuf, &endp, 10);
if( tmp % k == 0) ++ans;
i = 0;
} else if (i < BUFSIZE) { // drop extra characters instead of overflowing intbuf
intbuf[i] = singlechar;
i++;
}
if (countNL == n) break;
}
printf("%d integers are divisible by %d.\n", ans, k);
return 0;
}
You can use the value of n to stop reading the input after you've seen n integers.
Change the condition of the outer while loop to the following (with an element size of 1, because in C sizeof('1') is sizeof(int), not 1):
while(n > 0 && fread(buf, 1, BUFSIZE, stdin))
and change the body of the inner one to:
{
n--;
if(tmp%k == 0) ++ans;
}
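As a rough sketch of the same stop-after-n idea, assuming one integer per line: the helper below swaps the fread loop for fgets plus strtol, and only decrements n when a number was actually parsed. The name count_first_n and the 64-byte line buffer are made up for illustration and are not part of the original code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads at most n integers (one per line) from fp
   and returns how many of them are divisible by k (k must be nonzero).
   Assumption: every line fits in the 64-byte buffer. */
int count_first_n(FILE *fp, int n, long k)
{
    char line[64];
    int ans = 0;

    /* Stop as soon as n numbers were seen, even if more input follows. */
    while (n > 0 && fgets(line, sizeof line, fp) != NULL) {
        char *end;
        long value = strtol(line, &end, 10);

        if (end != line) {      /* a number was parsed on this line */
            n--;
            if (value % k == 0)
                ++ans;
        }
    }
    return ans;
}
```

Calling count_first_n(stdin, n, k) after reading n and k would leave any input beyond the first n numbers unread, so the loop no longer depends on reaching EOF.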
The problem you're continuing to have is that, because you never adjust buf in the inner while loop, sscanf keeps reading the same number over and over again.
If you switch to using strtol() instead of sscanf(), then you can use the endptr output parameter to move through the buffer as numbers are read.
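A minimal sketch of walking a buffer with strtol() and its endptr parameter might look like this. The function name count_divisible is hypothetical, and the sketch assumes the buffer is NUL-terminated and holds complete numbers (it does not handle a number split across two reads).

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: counts the integers in the NUL-terminated string
   buf that are divisible by k (k must be nonzero). */
int count_divisible(const char *buf, long k)
{
    const char *p = buf;
    int count = 0;

    for (;;) {
        char *end;
        long value;

        errno = 0;
        value = strtol(p, &end, 10);   /* skips leading whitespace */
        if (end == p)                  /* nothing parsed: stop */
            break;
        if (errno == 0 && value % k == 0)
            ++count;
        p = end;                       /* resume right after this number */
    }
    return count;
}
```

For example, count_divisible("10\n3\n25\n", 5) returns 2, because each call to strtol() starts where the previous one stopped instead of re-reading the first number.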
The problem when you are not using redirection is that you are not causing EOF.
Since this appears to be POSIX (based on the fact you are using gcc), just type ctrl-D (i.e. while holding the control key, press and release d), which will cause EOF to be reached.
If you are using Windows, I believe you use ctrl-Z instead.
The outermost while() loop will only exit when the read from stdin returns EOF. This can only happen when reaching the actual end-of-file on an input file, or if the process writing to an input pipe exits. Hence the printf() statement is never executed. I don't think this has anything to do with the call to setvbuf().
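To illustrate when such a read loop ends, here is a small sketch of a chunked fread() loop that runs until the stream has nothing more to deliver. The name read_until_eof and the 4096-byte chunk size are assumptions for the example, not taken from the question.

```c
#include <stdio.h>

/* Hypothetical helper: reads fp in chunks until fread() returns 0.
   Returns the number of bytes read, or -1 if the loop ended because of
   a read error rather than end-of-file. */
long read_until_eof(FILE *fp)
{
    char chunk[4096];
    long total = 0;
    size_t got;

    /* fread() returns 0 only at end-of-file or on error; while input
       may still arrive (terminal, open pipe), the call simply blocks. */
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (long)got;

    return ferror(fp) ? -1 : total;
}
```

Run against a terminal, read_until_eof(stdin) would not return until end-of-file is signalled from the keyboard, which matches the behavior described above.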