How to read extremely long lines from a text file quickly and safely in C++?

逝去的感伤 2021-01-04 20:37

There is a large text file of 6.53 GiB. Each line of it can be a data line or comment line. Comment lines are usually short, less than 80 characters, while a data line conta

3 Answers
  • 2021-01-04 20:40

    Yes, there's a faster way to read lines and create strings.

    Query the file size, then load the whole file into a buffer. Then iterate over the buffer, replacing each newline with a NUL and storing a pointer to the start of the next line.

    It will be quite a bit faster if, as is likely, your platform has a call to load a file into memory (mmap on POSIX systems is one such call).
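
    The steps above can be sketched as follows. This is only a minimal sketch under stated assumptions: it assumes the whole file fits in memory (which may not hold for a multi-gigabyte file on every machine), it uses plain std::ifstream rather than a platform-specific call, and the helper names load_file and split_lines are made up for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file into one contiguous buffer.
// Assumption: the file fits in memory.
std::string load_file(const std::string& path) {
    // Open at the end so that tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);
    const std::streamsize size = in.tellg();
    if (size < 0) throw std::runtime_error("cannot query size of " + path);
    std::string buffer(static_cast<std::size_t>(size), '\0');
    in.seekg(0, std::ios::beg);
    if (size > 0 && !in.read(&buffer[0], size)) {
        throw std::runtime_error("short read on " + path);
    }
    return buffer;
}

// Replaces every '\n' with '\0' in place and returns a pointer to the start
// of each line. The pointers are valid only while `buffer` is alive and
// unmodified.
std::vector<const char*> split_lines(std::string& buffer) {
    std::vector<const char*> lines;
    if (buffer.empty()) return lines;
    char* const begin = &buffer[0];
    char* const end = begin + buffer.size();
    char* line = begin;
    for (char* p = begin; p != end; ++p) {
        if (*p == '\n') {
            *p = '\0';             // terminate the current line
            lines.push_back(line);
            line = p + 1;          // the next line starts after the newline
        }
    }
    // A final line without a trailing newline is terminated by the
    // std::string's own NUL terminator.
    if (line != end) lines.push_back(line);
    return lines;
}
```

    Each returned pointer is a NUL-terminated C string that points into the buffer, so no per-line allocation or copy takes place.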

  • 2021-01-04 20:52

    As I commented, on Linux and other POSIX systems you could consider using getline(3). The following should compile both as C and as C++ (assuming you have a valid fopen-ed FILE *fil):

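    char *linbuf = NULL;  /* or nullptr in C++; getline allocates and grows it */
    size_t linsiz = 0;    /* capacity of linbuf, maintained by getline */
    ssize_t linlen = 0;   /* length of the line just read, including its '\n' */

    while ((linlen = getline(&linbuf, &linsiz, fil)) >= 0) {
      // do something useful with linbuf; but no C++ exceptions
    }
    free(linbuf);  /* needed even if getline failed */
    linbuf = NULL;
    linsiz = 0;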
    

    I guess this works in C++ as is, or can easily be adapted. But beware of C++ exceptions: they should not propagate out of the while loop, or else you should ensure that an appropriate destructor or catch block calls free(linbuf).

    Also, getline can fail (e.g. if its internal malloc fails). It returns -1 both on failure and at end of file, so you may need to tell the two apart and handle the failure sensibly.
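
    One way to address both concerns in C++ is a small RAII wrapper, sketched below. The class name PosixLineReader and its interface are hypothetical, not part of any library; the sketch assumes a POSIX system where getline(3) is available and leaves ownership of the FILE* with the caller.

```cpp
#include <cerrno>
#include <cstddef>
#include <stdio.h>      // FILE, feof, and POSIX getline
#include <stdlib.h>     // free
#include <string>
#include <sys/types.h>  // ssize_t (POSIX)
#include <system_error>

// Hypothetical RAII wrapper around POSIX getline(3). The destructor frees
// the buffer, so an exception thrown while processing a line cannot leak it.
class PosixLineReader {
public:
    // The caller keeps ownership of `file` and must fclose it.
    explicit PosixLineReader(FILE* file) : file_(file) {}
    ~PosixLineReader() { free(buf_); }
    PosixLineReader(const PosixLineReader&) = delete;
    PosixLineReader& operator=(const PosixLineReader&) = delete;

    // Returns true when a line was read (data()/size() describe it, with
    // the trailing '\n' included if present) and false at end of file.
    // Throws std::system_error when getline fails for another reason.
    bool next() {
        errno = 0;
        const ssize_t n = ::getline(&buf_, &cap_, file_);
        if (n >= 0) {
            len_ = static_cast<std::size_t>(n);
            return true;
        }
        len_ = 0;
        if (feof(file_)) return false;  // plain end of file
        throw std::system_error(errno, std::generic_category(),
                                "getline failed");
    }

    const char* data() const { return buf_; }
    std::size_t size() const { return len_; }

private:
    FILE* file_;
    char* buf_ = nullptr;   // allocated and grown by getline
    std::size_t cap_ = 0;   // capacity of buf_
    std::size_t len_ = 0;   // length of the current line
};
```

    A loop such as `while (reader.next()) { /* use reader.data() */ }` can then throw freely from its body, since the buffer is released when reader goes out of scope.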

  • 2021-01-04 20:55

    Well, the C standard library is essentially a subset of the C++ standard library. From the N4296 working draft (2014) of the C++ standard:

    17.2 The C standard library [library.c]

    The C++ standard library also makes available the facilities of the C standard library, suitably adjusted to ensure static type safety.

    So, provided you explain in a comment that a performance bottleneck requires it, it is perfectly fine to use fgets in a C++ program. You should simply encapsulate it carefully in a utility class, in order to preserve the high-level OO structure.
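
    Such an encapsulation might look like the sketch below. The class name FgetsLineReader and the 64 KiB chunk size are arbitrary choices made for illustration. The sketch assumes the file contains no embedded NUL bytes (strlen is used to find the end of each chunk) and does not distinguish end of file from a read error.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// fgets is used here on purpose: line reading is a performance bottleneck.
// Hypothetical utility class that hides the C API behind a C++ interface.
class FgetsLineReader {
public:
    explicit FgetsLineReader(const char* path)
        : file_(std::fopen(path, "r")) {}
    ~FgetsLineReader() {
        if (file_) std::fclose(file_);
    }
    FgetsLineReader(const FgetsLineReader&) = delete;
    FgetsLineReader& operator=(const FgetsLineReader&) = delete;

    bool is_open() const { return file_ != nullptr; }

    // Reads the next line into `line`, without its trailing '\n'.
    // Returns false when nothing more could be read.
    bool next(std::string& line) {
        line.clear();
        if (!file_) return false;
        bool got_any = false;
        while (std::fgets(chunk_, sizeof chunk_, file_)) {
            got_any = true;
            const std::size_t n = std::strlen(chunk_);
            if (n > 0 && chunk_[n - 1] == '\n') {
                line.append(chunk_, n - 1);  // complete line, drop the '\n'
                return true;
            }
            // The line is longer than one chunk (or is the last line
            // without a newline): keep what we have and read on.
            line.append(chunk_, n);
        }
        return got_any;
    }

private:
    std::FILE* file_;
    char chunk_[64 * 1024];
};
```

    Reusing the same std::string across calls lets its capacity grow to the longest line seen, so very long data lines stop triggering reallocations after the first few.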
