Question
I noticed some strange behaviour when reading a file line by line. If the file ends with \n (an empty last line), that line may be skipped, but not always, and I don't see what determines whether it is skipped.
I wrote this small function, which splits a string into lines, to reproduce the issue easily:
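#include <sstream>
#include <string>
#include <vector>

// Splits inputStr into lines using std::getline with the default '\n' delimiter.
std::vector<std::string> SplitLines( const std::string& inputStr )
{
    std::vector<std::string> lines;
    std::stringstream str;
    str << inputStr;
    std::string sContent;
    while ( std::getline( str, sContent ) )
    {
        lines.push_back( sContent );
    }
    return lines;
}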
When I test it (http://cpp.sh/72dgw), I get these outputs:
(1) "a\nb" was splitted to 2 line(s):"a" "b"
(2) "a" was splitted to 1 line(s):"a"
(3) "" was splitted to 0 line(s):
(4) "\n" was splitted to 1 line(s):""
(5) "\n\n" was splitted to 2 line(s):"" ""
(6) "\nb\n" was splitted to 2 line(s):"" "b"
(7) "a\nb\n" was splitted to 2 line(s):"a" "b"
(8) "a\nb\n\n" was splitted to 3 line(s):"a" "b" ""
So the last \n is skipped in cases (6), (7) and (8), fine. But then why is it not skipped in (4) and (5)?
What's the rationale behind this behaviour?
Answer 1:
There is an interesting post that briefly mentions this "strange" behaviour: getline() sets failbit and skips last line
As mentioned in Rob's answer, \n is a terminator (that's actually why it's named End Of Line), not a separator, meaning that lines are defined as "ending with a '\n'", not as being "separated by a '\n'".
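To see the terminator behaviour at the stream level, here is a minimal sketch that records the stream flags after every std::getline call, including the last, failing one. The names ReadStep and TraceGetline are hypothetical and only serve this illustration. It relies on std::getline setting failbit only when it extracts no characters at all (a consumed '\n' counts as an extracted character).

```cpp
#include <sstream>
#include <string>
#include <vector>

// Outcome of one std::getline call: the text read and the stream flags afterwards.
struct ReadStep
{
    std::string text;  // line content (left empty when the call fails)
    bool eof;          // stream.eof() after the call
    bool fail;         // stream.fail() after the call
};

// Hypothetical helper: calls std::getline until it fails and records every call,
// including the final failing one.
std::vector<ReadStep> TraceGetline( const std::string& input )
{
    std::istringstream stream( input );
    std::vector<ReadStep> steps;
    bool ok = true;
    while ( ok )
    {
        std::string line;  // fresh string each time, so a failed call leaves it empty
        ok = static_cast<bool>( std::getline( stream, line ) );
        steps.push_back( { line, stream.eof(), stream.fail() } );
    }
    return steps;
}
```

For "a\n", the trace has two entries: "a" with neither flag set, then a failing call with both eofbit and failbit set, so no extra empty line is produced. For "a", the first call sets eofbit but not failbit because characters were extracted, so the unterminated last line is still kept.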
It was unclear to me at first how the terminator definition answers the question, but it actually does. Reformulated as below, it becomes crystal clear:
If your content contains x occurrences of '\n', then you'll end up with x lines, or x+1 if there are some extra non-'\n' characters at the end of the file.
(1) "a\nb" splitted to 2 line(s):"a" "b" (1 EOL + extra characters = 2 lines)
(2) "a" splitted to 1 line(s):"a" (0 EOL + extra characters = 1 line)
(3) "" splitted to 0 line(s): (0 EOL + no extra characters = 0 line)
(4) "\n" splitted to 1 line(s):"" (1 EOL + no extra characters = 1 line)
(5) "\n\n" splitted to 2 line(s):"" "" (2 EOL + no extra characters = 2 lines)
(6) "\nb\n" splitted to 2 line(s):"" "b" (2 EOL + no extra characters = 2 lines)
(7) "a\nb\n" splitted to 2 line(s):"a" "b" (2 EOL + no extra characters = 2 lines)
(8) "a\nb\n\n" splitted to 3 line(s):"a" "b" "" (3 EOL + no extra characters = 3 lines)
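The counting rule can be checked against the getline-based splitter with a short sketch. ExpectedLineCount is a hypothetical helper introduced only for this check. SplitLines follows the function from the question, except that it constructs a std::istringstream directly from the input.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Same approach as the SplitLines function from the question.
std::vector<std::string> SplitLines( const std::string& inputStr )
{
    std::vector<std::string> lines;
    std::istringstream str( inputStr );
    std::string sContent;
    while ( std::getline( str, sContent ) )
    {
        lines.push_back( sContent );
    }
    return lines;
}

// Hypothetical helper: line count predicted by the terminator rule.
// x occurrences of '\n' give x lines, plus one more if non-'\n' characters
// follow the last '\n' (or if there is no '\n' at all but the input is not empty).
std::size_t ExpectedLineCount( const std::string& inputStr )
{
    const std::size_t eolCount = static_cast<std::size_t>(
        std::count( inputStr.begin(), inputStr.end(), '\n' ) );
    const bool hasTrailingText = !inputStr.empty() && inputStr.back() != '\n';
    return eolCount + ( hasTrailingText ? 1 : 0 );
}
```

For each of the eight inputs listed above, SplitLines( input ).size() should equal ExpectedLineCount( input ).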
Source: https://stackoverflow.com/questions/45610075/how-does-stdgetline-decides-to-skip-last-empty-line