I was running some benchmarks to find the most efficient way to write a huge array to a file in C++ (more than 1 GB in ASCII).
So I compared std::ofstream with fprintf (s
Well, fprintf() does have to do a bit more work at runtime, since it has to parse and process the format string. However, given the size of your output file, I would expect those differences to be of little consequence and the code to be I/O bound.
I therefore suspect that your benchmark is flawed in some way.
fsync()/sync() at the end? Have you set sync_with_stdio somewhere upstream of the code you have shown?
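One way to check whether the two approaches really differ is to time them side by side on the same data. The following is only a minimal sketch: the helper names write_with_ofstream, write_with_fprintf and time_ms are made up for illustration, and a serious benchmark would also repeat the runs and account for the OS page cache.

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Write one integer per line through std::ofstream.
void write_with_ofstream(const std::string& path, const std::vector<int>& data)
{
    std::ofstream out(path);
    for (int v : data)
        out << v << '\n';   // '\n' rather than std::endl, to avoid a flush per line
}                           // the destructor flushes and closes the file

// Write one integer per line through fprintf.
void write_with_fprintf(const std::string& path, const std::vector<int>& data)
{
    std::FILE* f = std::fopen(path.c_str(), "w");
    if (!f)
        return;
    for (int v : data)
        std::fprintf(f, "%d\n", v);
    std::fclose(f);         // flushes the stdio buffer
}

// Run a callable once and return the elapsed wall-clock time in milliseconds.
template <class F>
double time_ms(F&& fn)
{
    auto t0 = std::chrono::steady_clock::now();
    fn();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling time_ms([&] { write_with_ofstream("a.txt", data); }) and the same for write_with_fprintf gives two comparable numbers, provided both are compiled with the same optimisation flags.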
What you report is the opposite of what is usually measured, although most people believe that what you see should be the norm. iostreams are type-safe and the overload is chosen at compile time, whereas the printf family are variadic functions that have to infer the types in the va_list from the format specifier at run time.
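To make the difference concrete, here is a small sketch that formats the same value both ways. The function names are hypothetical; only snprintf and ostringstream are real library facilities.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// printf family: "%ld" is interpreted at run time and must match the
// argument type by hand; a mismatch is undefined behaviour.
std::string format_with_snprintf(long value)
{
    char buf[32];
    std::snprintf(buf, sizeof buf, "%ld", value);
    return buf;
}

// iostreams: operator<<(long) is selected by overload resolution at
// compile time, so there is no format string to get wrong.
std::string format_with_stream(long value)
{
    std::ostringstream os;
    os << value;
    return os.str();
}
```

Both functions return the same text for the same input; the difference is only in when the type information is used.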
Here is a heavily optimized way to write integers to a text file using the Unix functions open, read and write. They are also available on Windows, where they only produce some warnings that you can work around. This implementation works only for 32-bit integers.
In your include file:
class FastIntegerWriter
{
private:
    const int bufferSize;
    int offset;
    int file;
    char* buffer;
public:
    FastIntegerWriter(int bufferSize = 4096);
    int Open(const char *filename);
    void Close();
    virtual ~FastIntegerWriter();
    void Flush();
    void Writeline(int value);
};
In your source file
#ifdef _MSC_VER
# include <io.h>
# define open _open
# define write _write
# define read _read
# define close _close
#else
# include <unistd.h>
#endif
#include <fcntl.h>
FastIntegerWriter::FastIntegerWriter(int bufferSize) :
    // Listed in declaration order, which is the order members are initialized in.
    bufferSize(bufferSize),
    offset(0),
    file(0),
    buffer(new char[bufferSize])
{
}
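int FastIntegerWriter::Open(const char* filename)
{
    this->Close();
    if (filename != NULL)
    {
        // O_CREAT requires a mode argument; 0600 is owner read/write
        // (the same bits as _S_IREAD | _S_IWRITE on MSVC).
        this->file = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    }
    return this->file;
}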
void FastIntegerWriter::Close()
{
    this->Flush();
    if (this->file > 0)
    {
        close(this->file);
        this->file = 0;
    }
}
FastIntegerWriter::~FastIntegerWriter()
{
    this->Close();
    delete[] this->buffer;
}
void FastIntegerWriter::Flush()
{
    if (this->offset != 0)
    {
        write(this->file, this->buffer, this->offset);
        this->offset = 0;
    }
}
void FastIntegerWriter::Writeline(int value)
{
    // Flush unless more than 12 bytes are free
    // (the longest line, "-2147483648\n", needs 12).
    if (this->offset >= this->bufferSize - 12)
    {
        this->Flush();
    }
    // Position in the buffer where this number starts
    char* output = this->buffer + this->offset;
    if (value < 0)
    {
        if (value == -2147483648)
        {
            // Special case, the minimum integer does not have a corresponding positive value.
            // We use a hard-coded string and copy it directly to the buffer.
            // (Thanks to Eugene Ryabtsev for the suggestion).
            static const char s[] = "-2147483648\n";
            for (int i = 0; i < 12; ++i)
                output[i] = s[i];
            this->offset += 12;
            return;
        }
        *output = '-';
        ++output;
        ++this->offset;
        value = -value;
    }
    // Compute number of digits (log base 10(value) + 1)
    int digits =
        (value >= 1000000000) ? 10 : (value >= 100000000) ? 9 : (value >= 10000000) ? 8 :
        (value >= 1000000) ? 7 : (value >= 100000) ? 6 : (value >= 10000) ? 5 :
        (value >= 1000) ? 4 : (value >= 100) ? 3 : (value >= 10) ? 2 : 1;
    // Convert number to string, filling the digits from right to left
    output[digits] = '\n';
    for (int i = digits - 1; i >= 0; --i)
    {
        output[i] = value % 10 + '0';
        value /= 10;
    }
    this->offset += digits + 1;
}
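If you want to convince yourself that the conversion is correct before trusting it with a gigabyte of data, the same logic can be pulled out into a free function and compared against std::to_string. This is only a sketch: count_digits and format_int_line are hypothetical names that are not part of the class above, and like the class it assumes a 32-bit int.

```cpp
#include <climits>
#include <cstring>
#include <string>

// Number of decimal digits of a non-negative 32-bit value.
int count_digits(int value)
{
    return (value >= 1000000000) ? 10 : (value >= 100000000) ? 9 :
           (value >= 10000000) ? 8 : (value >= 1000000) ? 7 :
           (value >= 100000) ? 6 : (value >= 10000) ? 5 :
           (value >= 1000) ? 4 : (value >= 100) ? 3 : (value >= 10) ? 2 : 1;
}

// Write value followed by '\n' into out (at least 12 bytes).
// Returns the number of characters written; no terminating '\0' is added.
int format_int_line(int value, char* out)
{
    if (value == INT_MIN)
    {
        // -INT_MIN overflows, so copy a hard-coded string (32-bit int assumed).
        std::memcpy(out, "-2147483648\n", 12);
        return 12;
    }
    int written = 0;
    if (value < 0)
    {
        *out++ = '-';
        written = 1;
        value = -value;
    }
    int digits = count_digits(value);
    out[digits] = '\n';
    // Fill the digits from right to left.
    for (int i = digits - 1; i >= 0; --i)
    {
        out[i] = static_cast<char>('0' + value % 10);
        value /= 10;
    }
    return written + digits + 1;
}
```

Looping over boundary values such as 0, 9, 10, 999999999, 1000000000, INT_MAX and INT_MIN and comparing with std::to_string(v) + "\n" covers every branch of the ternary chain.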
I guess this will outperform every other method of writing to an ASCII file :) You might get a little more performance from the low-level Windows APIs WriteFile and ReadFile, but it isn't worth the effort.
To use it...
int main()
{
    FastIntegerWriter fw;
    fw.Open("test.txt");
    for (int i = -2000; i < 1000000; ++i)
        fw.Writeline(i);
    return 0;
}
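If you don't open any file, the writer keeps its initial descriptor 0 and the output usually still appears on the console. Note that descriptor 0 is formally standard input, so this relies on the terminal being opened read-write.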
fprintf("%d" requires runtime parsing of the format string, once per integer. ostream& operator<<(ostream&, int) is resolved by the compiler, once per compilation.
ofstream has its own file buffer, which may reduce the number of disk accesses. In addition, fprintf is a variadic function that has to go through the va_* macros, whereas ofstream does not. I think you could run a test with fwrite() or putc() as well.