After providing the same program, which reads a randomly generated input file and echoes the same string it read to an output. The only difference is that on one side I'm providin
You can also disable buffering with the setbuf() function. When buffering is disabled, fwrite() will be as slow as write(), if not slower.
More information on this subject can be found here: http://www.gnu.org/s/libc/manual/html_node/Controlling-Buffering.html
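To make the point about disabling buffering concrete, here is a minimal sketch. The helper names open_unbuffered and file_size_now are made up for illustration; setvbuf() with _IONBF is the standard call and is equivalent to setbuf(fp, NULL). The stat() call is POSIX and is only there so you can observe that the bytes reach the file without an fflush().

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: open `path` for writing with stdio buffering disabled.
   setvbuf() has to be called before any other operation on the stream. */
FILE *open_unbuffered(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "wb");
    if (fp == NULL)
        return NULL;
    /* _IONBF: no buffering, same effect as setbuf(fp, NULL). */
    if (setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}

/* Hypothetical helper: current size of the file on disk, or -1 on error. */
long file_size_now(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1L;
}
```

With a stream opened this way, every fwrite() is handed to the kernel right away, so you pay the system call cost on each call, which is why it ends up no faster than calling write() yourself.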
write(2) is the fundamental kernel operation. fwrite(3) is a library function that adds buffering on top of write(2).
For small (e.g., line-at-a-time) byte counts, fwrite(3) is faster, because buffering avoids paying the kernel-call overhead on every write.
For large (block I/O) byte counts, write(2) is faster, because it doesn't bother with buffering and you have to call the kernel in both cases.
If you look at the source to cp(1), you won't see any buffering.
Finally, there is one last consideration: ISO C vs. POSIX. The buffered library functions like fwrite are specified in ISO C, whereas kernel calls like write are POSIX. While many systems claim POSIX compatibility, especially when trying to qualify for government contracts, in practice it's specific to Unix-like systems. So the buffered operations are more portable. As a result, a Linux cp will certainly use write, but a C program that has to work cross-platform may have to use fwrite.
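For the portability point, here is a rough sketch of a copy loop that sticks to ISO C stdio only, so it should build anywhere a hosted C implementation exists. The function name copy_stream and the 4096-byte block size are my own choices for illustration, not anything taken from cp.

```c
#include <assert.h>
#include <stdio.h>

/* Copy everything from `in` to `out` using only ISO C stdio calls.
   Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[4096];   /* block size is an arbitrary choice */
    long total = 0;
    size_t n;

    assert(in != NULL && out != NULL);
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    /* fread returns 0 on both EOF and error, so check which one it was. */
    if (ferror(in) || fflush(out) != 0)
        return -1;
    return total;
}
```

Because the loop already moves data in large blocks, the stream buffer adds little here; the reason to write it this way is portability, not speed.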
Timing my application with a 10 MB input, echoing it to /dev/null, and making sure the file is not cached, I've found that libc's fwrite is faster by a large margin when using very small buffers (1 byte in this case).
fwrite works on streams, which are buffered. Therefore many small buffers will be faster because it won't run a costly system call until the buffer fills up (or you flush it or close the stream). On the other hand, small buffers being sent to write will run a costly system call for each buffer - that's where you're losing the speed. With a 1024 byte stream buffer, and writing 1 byte buffers, you're looking at 1024 write calls for each kilobyte, rather than 1024 fwrite calls turning into one write - see the difference?
For big buffers the difference will be small, because there will be less buffering, and therefore a more consistent number of system calls between fwrite and write.
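If you want to see the system call counts for yourself, here is a toy version of what a buffered stream does internally. This is not how libc implements fwrite. The struct mini_stream and the functions around it are invented for this sketch, and the 1024-byte buffer is just the figure used above (real stdio buffers are usually larger). It needs POSIX for open/write/close.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BUF_SIZE 1024            /* assumed stream buffer size, as in the text */

/* Hypothetical miniature of a buffered output stream. */
struct mini_stream {
    int fd;
    size_t used;                 /* bytes currently waiting in buf */
    unsigned long syscalls;      /* number of write(2) calls issued so far */
    char buf[BUF_SIZE];
};

/* Push the buffered bytes to the kernel. Returns 0 on success, -1 on error. */
int mini_flush(struct mini_stream *s)
{
    size_t off = 0;
    while (off < s->used) {
        ssize_t n = write(s->fd, s->buf + off, s->used - off);
        s->syscalls++;
        if (n < 0)
            return -1;
        off += (size_t)n;
    }
    s->used = 0;
    return 0;
}

/* Buffered write: copy into buf and call write(2) only when it fills up. */
int mini_write(struct mini_stream *s, const void *data, size_t len)
{
    const char *p = data;
    assert(s != NULL);
    while (len > 0) {
        size_t room = BUF_SIZE - s->used;
        size_t n = len < room ? len : room;
        memcpy(s->buf + s->used, p, n);
        s->used += n;
        p += n;
        len -= n;
        if (s->used == BUF_SIZE && mini_flush(s) != 0)
            return -1;
    }
    return 0;
}

/* Write `nbytes` one byte at a time to /dev/null through the mini stream and
   report how many write(2) calls it took (0 if /dev/null can't be opened). */
unsigned long buffered_syscalls(size_t nbytes)
{
    struct mini_stream s = { .fd = open("/dev/null", O_WRONLY) };
    if (s.fd < 0)
        return 0;
    for (size_t i = 0; i < nbytes; i++)
        if (mini_write(&s, "x", 1) != 0)
            break;
    mini_flush(&s);
    close(s.fd);
    return s.syscalls;
}
```

Calling buffered_syscalls(4096) makes 4096 one-byte mini_write calls but only 4 write(2) calls, whereas passing each byte straight to write(2) would cost 4096 system calls.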
In other words, fwrite(3) is just a library routine that collects up output into chunks, and then calls write(2). Now, write(2) is a system call which traps into the kernel. That's where the I/O actually happens. There is some overhead for simply calling into the kernel, and then there is the time it takes to actually write something. If you use large buffers, you will find that write(2) is faster because it eventually has to be called anyway, and if each fwrite ends up triggering one or more write(2) calls, then the fwrite buffering overhead is just that: more overhead.
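If you do go the write(2) route with large buffers, keep in mind that write may accept fewer bytes than you asked for, so you normally wrap it in a loop. Here is a minimal sketch; write_all is a name I made up, and treating EINTR as "try again" is a common convention rather than the only valid policy.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write all `len` bytes of `buf` to `fd`, retrying on short writes and EINTR.
   Returns 0 on success, -1 on error (errno is left as set by write). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before writing anything */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With a large buffer this is usually one system call per block, and there is no extra copy into a stream buffer first.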
If you want to read more about it, you can have a look at this document, which explains standard I/O streams.