How does std::getline decide to skip the last empty line?

谁都会走 submitted on 2019-12-10 20:35:24


I noticed some strange behaviour when reading a file line by line. If the file ends with \n (an empty last line), that last line may be skipped... but not always, and I don't see what determines whether it is skipped or not.

I wrote this little function splitting a string into lines to reproduce the issue easily:

#include <sstream>
#include <string>
#include <vector>

// Split a string into lines using std::getline.
std::vector<std::string> SplitLines( const std::string& inputStr )
{
    std::vector<std::string> lines;

    std::stringstream str;
    str << inputStr;

    std::string sContent;
    while ( std::getline( str, sContent ) )
        lines.push_back( sContent );

    return lines;
}

When I test it, I get these outputs:

(1) "a\nb"       was splitted to 2 line(s):"a" "b" 
(2) "a"          was splitted to 1 line(s):"a" 
(3) ""           was splitted to 0 line(s):
(4) "\n"         was splitted to 1 line(s):"" 
(5) "\n\n"       was splitted to 2 line(s):"" "" 
(6) "\nb\n"      was splitted to 2 line(s):"" "b" 
(7) "a\nb\n"     was splitted to 2 line(s):"a" "b" 
(8) "a\nb\n\n"   was splitted to 3 line(s):"a" "b" ""

So the last \n is skipped for cases (6), (7) and (8), fine. But then why is it not skipped for (4) and (5)?

What's the rationale behind this behaviour?


There is an interesting post that quickly mentions this "strange" behaviour: getline() sets failbit and skips last line
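To see the failbit part concretely, here is a minimal sketch that reports what a single std::getline call leaves behind in the stream. ReadResult and ReadOne are names I made up for illustration; they are not from the linked post.

```cpp
#include <cassert>   // only for the assert-based checks mentioned below
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper type: the outcome of one std::getline call.
struct ReadResult {
    bool succeeded;    // stream still converts to true (failbit not set)
    bool hitEof;       // eofbit is set after the call
    std::string line;  // content extracted by the call
};

// Hypothetical helper: performs one std::getline and records the stream state.
ReadResult ReadOne( std::istream& in )
{
    ReadResult r;
    r.succeeded = static_cast<bool>( std::getline( in, r.line ) );
    r.hitEof = in.eof();
    return r;
}
```

On a stream holding "a\n", the first ReadOne succeeds with line "a" and no eofbit. The second one extracts nothing, so both eofbit and failbit are set and the loop in SplitLines stops without pushing an empty line. On a stream holding just "a", the first ReadOne succeeds with eofbit already set, because characters were extracted before the end was reached.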

As mentioned in Rob's answer, \n is a terminator (that's actually why it's named End Of Line), not a separator, meaning that lines are defined as "ending with a '\n'", not as being "separated by a '\n'".
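For contrast, this is roughly what a split would look like if '\n' were treated as a separator instead. It is only a sketch to illustrate the distinction; SplitOnSeparator is a hypothetical helper, not something std::getline or the standard library provides.

```cpp
#include <cassert>   // only for the assert-based checks mentioned below
#include <string>
#include <vector>

// Hypothetical helper: treats '\n' as a separator between pieces,
// so an input with n newlines always yields n + 1 pieces.
std::vector<std::string> SplitOnSeparator( const std::string& input )
{
    std::vector<std::string> pieces;
    std::string::size_type start = 0;
    while ( true )
    {
        const std::string::size_type pos = input.find( '\n', start );
        if ( pos == std::string::npos )
        {
            // Whatever follows the last separator (possibly nothing) is a piece.
            pieces.push_back( input.substr( start ) );
            break;
        }
        pieces.push_back( input.substr( start, pos - start ) );
        start = pos + 1;
    }
    return pieces;
}
```

With separator semantics, "a\nb\n" gives three pieces ("a", "b", "") and even "" gives one empty piece, whereas std::getline, which treats '\n' as a terminator, gives two lines and zero lines respectively.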

At first it was unclear to me how this answered the question, but it actually does. Reformulated as below, it becomes crystal clear:

If your content contains x occurrences of '\n', then you'll end up with x lines, or x+1 if there are some extra non-'\n' characters after the last '\n' at the end of the file.

(1) "a\nb"       splitted to 2 line(s):"a" "b"    (1 EOL + extra characters = 2 lines)
(2) "a"          splitted to 1 line(s):"a"        (0 EOL + extra characters = 1 line)
(3) ""           splitted to 0 line(s):           (0 EOL + no extra characters = 0 lines)
(4) "\n"         splitted to 1 line(s):""         (1 EOL + no extra characters = 1 line) 
(5) "\n\n"       splitted to 2 line(s):"" ""      (2 EOL + no extra characters = 2 lines)
(6) "\nb\n"      splitted to 2 line(s):"" "b"     (2 EOL + no extra characters = 2 lines)
(7) "a\nb\n"     splitted to 2 line(s):"a" "b"    (2 EOL + no extra characters = 2 lines)
(8) "a\nb\n\n"   splitted to 3 line(s):"a" "b" "" (3 EOL + no extra characters = 3 lines)
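The counting rule above can be checked mechanically. The sketch below compares the count predicted by the rule with what a std::getline loop actually yields; PredictedLineCount and ActualLineCount are names I introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>   // only for the assert-based checks mentioned below
#include <cstddef>
#include <sstream>
#include <string>

// Count predicted by the rule: one line per '\n', plus one more if the
// input is non-empty and does not end with '\n' (extra trailing characters).
std::size_t PredictedLineCount( const std::string& input )
{
    std::size_t count = static_cast<std::size_t>(
        std::count( input.begin(), input.end(), '\n' ) );
    if ( !input.empty() && input.back() != '\n' )
        ++count;
    return count;
}

// Count of lines a std::getline loop actually produces for the same input.
std::size_t ActualLineCount( const std::string& input )
{
    std::istringstream in( input );
    std::string line;
    std::size_t count = 0;
    while ( std::getline( in, line ) )
        ++count;
    return count;
}
```

For the eight inputs listed above, both functions should agree, e.g. PredictedLineCount( "a\nb\n\n" ) and ActualLineCount( "a\nb\n\n" ) both give 3.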

