There is a large text file of 6.53 GiB. Each line is either a data line or a comment line. Comment lines are usually short, less than 80 characters, while a data line conta
Yes, there's a faster way to read lines and create strings.
Query the file size, then load the whole file into a buffer. Then iterate over the buffer, replacing each newline with a NUL and storing a pointer to the start of the next line.
This will be quite a bit faster if, as is likely, your platform has a call to load (or map) a whole file into memory.
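On POSIX systems, one way to sketch this load-and-scan idea is with mmap. The function name, file-name convention, and return convention below are this example's choices, not part of the answer; MAP_PRIVATE gives a copy-on-write view, so the newlines can be overwritten with NULs without modifying the file on disk.

```cpp
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

// Map the file privately, replace each '\n' with '\0', and return the
// number of lines found; (size_t)-1 signals an error.
std::size_t map_and_split(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return static_cast<std::size_t>(-1);

    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return static_cast<std::size_t>(-1); }
    if (st.st_size == 0)    { close(fd); return 0; }

    char *buf = static_cast<char *>(mmap(nullptr, st.st_size,
                                         PROT_READ | PROT_WRITE,
                                         MAP_PRIVATE, fd, 0));
    close(fd);
    if (buf == MAP_FAILED) return static_cast<std::size_t>(-1);

    std::size_t nlines = 0;
    char *p = buf, *end = buf + st.st_size;
    while (p < end) {
        char *nl = static_cast<char *>(
            std::memchr(p, '\n', static_cast<std::size_t>(end - p)));
        if (!nl) { nlines++; break; }  // last line has no trailing newline
        *nl = '\0';  // p is now a NUL-terminated line; store p somewhere
        nlines++;
        p = nl + 1;
    }
    // Real code would keep the mapping alive while the stored line
    // pointers are in use; here we only counted, so unmap immediately.
    munmap(buf, st.st_size);
    return nlines;
}
```

The same code handles a multi-GiB file: the kernel pages the data in on demand, and only the pages whose newlines you overwrite are actually copied.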
As I commented, on Linux and other POSIX systems you could consider using getline(3); I guess the following could compile both as C and as C++ (assuming you have some valid fopen-ed FILE *fil; ...):
char *linbuf = NULL;  // or nullptr in C++
size_t linsiz = 0;
ssize_t linlen = 0;
while ((linlen = getline(&linbuf, &linsiz, fil)) >= 0) {
    // do something useful with linbuf; but no C++ exceptions
}
free(linbuf);
linbuf = NULL;
linsiz = 0;
I guess this might work (or be easily adapted) in C++. But then, beware of C++ exceptions: they should not escape through the while loop, or you should ensure that an appropriate destructor or catch clause does the free(linbuf).
Also, getline could fail (e.g. if an internal malloc fails), and you might need to handle that failure sensibly.
Well, the C standard library is a subset of the C++ standard library. From the N4296 draft of the C++14 standard:
17.2 The C standard library [library.c]
The C++ standard library also makes available the facilities of the C standard library, suitably adjusted to ensure static type safety.
So, provided you explain in a comment that a performance bottleneck requires it, it is perfectly fine to use fgets
in a C++ program - you should simply encapsulate it carefully in a utility class, in order to preserve the high-level OO structure.
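Such an encapsulation might look like the sketch below; the class name FastLineFile and the default buffer size are this example's assumptions, not anything prescribed by the standard:

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical utility class hiding fgets behind an OO interface,
// reusing one buffer so the hot loop does no per-line allocation.
class FastLineFile {
    std::FILE *fil_;
    std::vector<char> buf_;
public:
    explicit FastLineFile(const char *path, std::size_t bufsize = 1 << 16)
        : fil_(std::fopen(path, "r")), buf_(bufsize) {
        if (!fil_) throw std::runtime_error("cannot open file");
    }
    ~FastLineFile() { std::fclose(fil_); }
    FastLineFile(const FastLineFile &) = delete;
    FastLineFile &operator=(const FastLineFile &) = delete;

    // Returns false at end of file; otherwise line holds the next
    // line, still including its '\n' (truncated if it exceeds bufsize).
    bool next(std::string &line) {
        if (!std::fgets(buf_.data(), static_cast<int>(buf_.size()), fil_))
            return false;
        line.assign(buf_.data());
        return true;
    }
};
```

Callers then loop with a std::string, keeping the rest of the program idiomatic C++ while the raw C call stays confined to one class.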