I've been informed that my library is slower than it should be, on the order of 30+ times too slow parsing a particular file (a 326 KB text file). The user suggested that switching from fstream to FILE* would be faster.
It should be slightly slower, but as you said, it might not be the bottleneck. Why don't you profile your program and see whether that's the case?
I don't think that would make a difference. Especially if you're reading char by char, the overhead of I/O is likely to completely dominate anything else. Why do you read single bytes at a time? Do you realize how inefficient that is?
On a 326 KB file, the fastest solution will most likely be to just read it into memory all at once.
The difference between std::ifstream and the C equivalents is basically a virtual function call or two. It may make a difference if executed a few tens of millions of times per second; otherwise, not really. File I/O is generally so slow that the API used to access it doesn't matter much. What matters far more is the read/write pattern: lots of seeks are bad, sequential reads/writes are good.
I agree that you should profile. But if you're reading the file a character at a time, how about creating a memory-mapped file? That way you can treat the file like an array of characters, and the OS should take care of all the low-level buffering for you. The simplest and probably fastest solution is a win in my book. :)
I think it is unlikely that your problem will be fixed by switching from fstream to FILE*; usually both are buffered by the C library. The OS can also cache reads (Linux is very good in that respect). Given the size of the file you are accessing, it is pretty likely to end up entirely in RAM.
As PolyThinker says, your best bet is to run your program through a profiler and determine where the problem is.
Also, you are using seekg/tellg. That can cause notable delays if your disk is heavily fragmented, because the first time the file is read the disk has to move its heads to the correct position.
Here is an excellent benchmark which shows that under extreme conditions, fstreams are actually quite slow compared to FILE*. You shouldn't optimize prematurely, though. fstreams are generally better, and if you need to optimize them down the road, you can always do it later at little cost. To prepare for the worst in advance, I suggest creating a minimal proxy for fstream now so that you can optimize it later without needing to touch anything else.
All benchmarks are evil. Just profile your code for the data you expect.
I once performed an I/O performance comparison between Ruby, Python, Perl, and C++. For my data, language versions, etc., the C++ variant was several times slower (which was a big surprise at the time).