C++: fastest way to read only the last line of a text file?

Submitted by ☆樱花仙子☆ on 2020-01-08 17:18:09

Question


I would like to read only the last line of a text file (I'm on UNIX, can use Boost). All the methods I know require scanning through the entire file to get the last line which is not efficient at all. Is there an efficient way to get only the last line?

Also, I need this to be robust enough that it works even if the text file in question is constantly being appended to by another process.


Answer 1:


Use seekg to jump to the end of the file, then read back until you find the first newline. Below is some sample code off the top of my head using MSVC.

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main()
{
    string filename = "test.txt";
    ifstream fin;
    fin.open(filename);
    if(fin.is_open()) {
        fin.seekg(-1,ios_base::end);                // go to one spot before the EOF

        bool keepLooping = true;
        while(keepLooping) {
            char ch = '\0';
            fin.get(ch);                            // Get current byte's data

            if(ch == '\n') {                        // If the data was a newline
                keepLooping = false;                // Stop at the current position (just past the newline).
            }
            else if((int)fin.tellg() <= 1) {        // If the data was at or before the 0th byte
                fin.seekg(0);                       // The first line is the last line
                keepLooping = false;                // So stop there
            }
            else {                                  // If the data was neither a newline nor at the 0 byte
                fin.seekg(-2,ios_base::cur);        // Move to the front of that data, then to the front of the data before it
            }
        }

        string lastLine;            
        getline(fin,lastLine);                      // Read the current line
        cout << "Result: " << lastLine << '\n';     // Display it

        fin.close();
    }

    return 0;
}

And below is a test file. It succeeds with empty, one-line, and multi-line data in the text file.

This is the first line.
Some stuff.
Some stuff.
Some stuff.
This is the last line.



Answer 2:


Jump to the end, and start reading blocks backwards until you find whatever your criterion for a line is. If the last block doesn't "end" with a line, you'll probably need to scan forward as well (assuming a really long line in a file that is actively being appended to).
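
A minimal C++ sketch of the block-wise approach described above. The function name readLastLine, the 4096-byte default block size, and the decision to ignore a single trailing newline are assumptions made for illustration; they are not part of the original answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the last line of the file at `path` by
// reading fixed-size blocks from the end of the file towards the start.
// Assumption: one trailing '\n' at the very end of the file is ignored.
std::string readLastLine(const std::string& path, std::size_t blockSize = 4096)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return "";

    in.seekg(0, std::ios::end);
    std::streamoff end = in.tellg();            // file size, sampled once
    if (end <= 0) return "";                    // empty file or tellg() failed

    in.seekg(end - 1, std::ios::beg);
    if (in.peek() == '\n') --end;               // skip the final newline, if any

    const std::streamoff step =
        static_cast<std::streamoff>(std::max<std::size_t>(blockSize, 1));
    std::vector<char> block(static_cast<std::size_t>(step));
    std::string tail;                           // bytes collected so far, in file order
    std::streamoff pos = end;

    while (pos > 0) {
        const std::streamoff n = std::min(step, pos);
        pos -= n;                               // start of the next block to read
        in.seekg(pos, std::ios::beg);
        in.read(block.data(), n);
        if (in.gcount() != n) break;            // short read: give up with what we have
        tail.insert(0, block.data(), static_cast<std::size_t>(n));
        const std::string::size_type nl = tail.rfind('\n');
        if (nl != std::string::npos)
            return tail.substr(nl + 1);         // everything after the last newline
    }
    return tail;                                // no newline found: the file is one line
}
```

Because the file size is sampled once, bytes that another process appends after that point are not seen, and a line that is still being written may come back incomplete. A caller that needs the forward scan mentioned above would have to re-check the size and read on from the old end.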




Answer 3:


Initially this was designed to read the last syslog entry. Given that the last character before EOF is '\n', we seek back to find the previous occurrence of '\n' and then store the line in a string.

#include <fstream>
#include <iostream>
#include <string>

int main()
{
  const std::string filename = "test.txt";
  std::ifstream fs;
  fs.open(filename.c_str(), std::fstream::in);
  if(fs.is_open())
  {
    //Go to the last character before EOF
    fs.seekg(-1, std::ios_base::end);
    if(fs.peek() == '\n')
    {
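      //Search backwards for the previous \n, starting with the
      //character just before the trailing one
      fs.seekg(-1, std::ios_base::cur);
      for(std::streamoff i = fs.tellg(); i >= 0; i--)
      {
        //Look at the character at offset i
        fs.seekg(i, std::ios_base::beg);
        if(fs.peek() == '\n')
        {
          //Found: step past it to the start of the last line
          fs.get();
          break;
        }
      }
      //If no \n was found, we are now at offset 0 and the whole file is the last line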
    }
    std::string lastline;
    getline(fs, lastline);
    std::cout << lastline << std::endl;
  }
  else
  {
    std::cout << "Could not open file" << std::endl;
  }
  return 0;
}



Answer 4:


While the answer by derpface is definitely correct, it often returns unexpected results. The reason for this is that, at least on my operating system (Mac OSX 10.9.5), many text editors terminate their files with an 'end line' character.

For example, when I open vim, type just the single character 'a' (no return), and save, the file will now contain (in hex):

61 0A

Where 61 is the letter 'a' and 0A is an end of line character.

This means that the code by derpface will return an empty string on all files created by such a text editor.

While I can certainly imagine cases where a file terminated with an 'end line' should return the empty string, I think ignoring the last 'end line' character would be more appropriate when dealing with regular text files; if the file is terminated by an 'end line' character we properly ignore it, and if the file is not terminated by an 'end line' character we don't need to check it.

My code for ignoring the last character of the input file is:

#include <iostream>
#include <string>
#include <fstream>
#include <iomanip>

int main() {
    std::string result = "";
    std::ifstream fin("test.txt");

    if(fin.is_open()) {
        fin.seekg(0,std::ios_base::end);      //Start at end of file
        char ch = ' ';                        //Init ch not equal to '\n'
        while(ch != '\n'){
            fin.seekg(-2,std::ios_base::cur); //Two steps back, this means we
                                              //will NOT check the last character
            if((int)fin.tellg() <= 0){        //If passed the start of the file,
                fin.seekg(0);                 //this is the start of the line
                break;
            }
            fin.get(ch);                      //Check the next character
        }

        std::getline(fin,result);
        fin.close();

        std::cout << "final line length: " << result.size() <<std::endl;
        std::cout << "final line character codes: ";
        for(size_t i =0; i<result.size(); i++){
            std::cout << std::hex << (int)result[i] << " ";
        }
        std::cout << std::endl;
        std::cout << "final line: " << result <<std::endl;
    }

    return 0;
}

Which will output:

final line length: 1
final line character codes: 61 
final line: a

On the single 'a' file.

EDIT: The line if((int)fin.tellg() <= 0){ actually causes problems if the file is too large (> 2GB), because tellg does not just return the number of characters from the start of the file (see the Stack Overflow question "tellg() function give wrong size of file?"). It may be better to test separately for the start of the file, fin.tellg()==tellgValueForStartOfFile, and for errors, fin.tellg()==-1. The tellgValueForStartOfFile is probably 0, but a safer way to obtain it would probably be:

fin.seekg(0, std::ios_base::beg);
std::streampos tellgValueForStartOfFile = fin.tellg();
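
As a rough sketch of how those separate tests might look in practice, here is a variant of the loop that never casts tellg() to int and compares stream positions directly. The function name lastLineIgnoringFinalNewline is hypothetical, and this is a reworked illustration under the same "ignore the final character" assumption, not the code from the answer itself.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns the last line of an already opened stream,
// never examining the final character of the file (so one trailing '\n'
// is ignored). Positions are kept as std::streampos, not int.
std::string lastLineIgnoringFinalNewline(std::ifstream& fin)
{
    fin.seekg(0, std::ios_base::beg);
    const std::streampos startOfFile = fin.tellg();   // "tellgValueForStartOfFile"

    fin.seekg(0, std::ios_base::end);
    std::streampos pos = fin.tellg();
    if (pos == std::streampos(-1)) return "";         // tellg() reported an error
    if (pos == startOfFile) return "";                // empty file

    pos -= std::streamoff(1);                         // position of the final character
    while (pos != startOfFile) {
        fin.seekg(pos - std::streamoff(1));           // look at the character before pos
        if (fin.get() == '\n') break;                 // pos is the start of the last line
        pos -= std::streamoff(1);
    }

    fin.seekg(pos);
    std::string line;
    std::getline(fin, line);
    return line;
}
```

The start-of-file test is an equality comparison between two std::streampos values and the error test is a comparison against std::streampos(-1), so no position is ever narrowed to a 32-bit int.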



Answer 5:


You can use seekg() to jump to the end of the file and read backward in blocks. The pseudocode looks like this:

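ifstream fs
fs.seekg(0, ios_base::end)
bytecount = fs.tellg()
if the file ends with endlinecharacter
    bytecount = bytecount - 1                 // do not stop at the trailing one
step = 4096                                   // block size, any convenient value
index = 1
while true
    start = max(0, bytecount - step * index)  // never seek before the start of the file
    fs.seekg(start, ios_base::beg)
    fs.read(buf, min(step, bytecount - start))
    if endlinecharacter in buf
        get the index of the last endlinecharacter in buf, say ei
        fs.seekg(start + ei + 1, ios_base::beg)
        fs.read(lastline, bytecount - (start + ei + 1))
        break
    if start == 0                             // no endlinecharacter at all
        fs.seekg(0, ios_base::beg)
        fs.read(lastline, bytecount)          // the whole file is the last line
        break
    ++index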



Answer 6:


I was also struggling with this problem because I ran uberwulu's code and also got a blank line. Here is what I found. I am using the following .csv file as an example:

date       test1  test2
20140908       1      2
20140908      11     22
20140908     111    235

To understand the commands in the code, please notice the following locations and their corresponding chars. (Loc, char) : ... (63,'3') , (64,'5') , (65,-) , (66,'\n'), (EOF,-).

#include<iostream>
#include<string>
#include<fstream>
#include<stdexcept>

using namespace std;

int main()
{
    std::ifstream infile; 
    std::string filename = "C:/projects/MyC++Practice/Test/testInput.csv";
    infile.open(filename);

    if(infile.is_open())
    {
        char ch = '\0';
        infile.seekg(-1, std::ios::end);        // move to location 65 
        infile.get(ch);                         // get next char at loc 66
        if (ch == '\n')
        {
            infile.seekg(-2, std::ios::cur);    // move to loc 64 for get() to read loc 65 
            infile.seekg(-1, std::ios::cur);    // move to loc 63 to avoid reading loc 65
            infile.get(ch);                     // get the char at loc 64 ('5')
            while(ch != '\n')                   // read each char backward till the next '\n'
            {
                if(static_cast<std::streamoff>(infile.tellg()) <= 1)
                {                               // reached the start of the file (or a seek failed):
                    infile.clear();             // the file has only one line,
                    infile.seekg(0);            // so read it from the beginning
                    break;
                }
                infile.seekg(-2, std::ios::cur);    
                infile.get(ch);
            }
            string lastLine;
            std::getline(infile,lastLine);
            cout << "The last line : " << lastLine << '\n';     
        }
        else
            throw std::runtime_error("check .csv file format");  // std::exception has no standard string constructor
    }
    std::cin.get();
    return 0;
}  



Answer 7:


I took alexandros' solution and spruced it up a bit:

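#include <fstream>
#include <iostream>
#include <string>

// Moves the read position to the start of the line that ends just before
// the current position. Returns false only if the stream has failed.
bool moveToStartOfLine(std::ifstream& fs)
{
    fs.seekg(-1, std::ios_base::cur);
    for(std::streamoff i = fs.tellg(); i >= 0; i--)
    {
        fs.seekg(i, std::ios_base::beg);    // look at the character at offset i
        if(fs.peek() == '\n')
        {
            fs.get();                       // step past the newline
            return true;
        }
    }
    return !fs.fail();                      // no newline: the line starts at offset 0
}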

std::string getLastLineInFile(std::ifstream& fs)
{
    // Go to the last character before EOF
    fs.seekg(-1, std::ios_base::end);
    if (!moveToStartOfLine(fs))
        return "";

    std::string lastline = "";
    getline(fs, lastline);
    return lastline;
}

int main()
{
    const std::string filename = "test.txt";
    std::ifstream fs;
    fs.open(filename.c_str(), std::fstream::in);
    if(!fs.is_open())
    {
        std::cout << "Could not open file" << std::endl;
        return -1;
    }

    std::cout << getLastLineInFile(fs) << std::endl;

    return 0;
}


Source: https://stackoverflow.com/questions/11876290/c-fastest-way-to-read-only-last-line-of-text-file
