How to test whether stringstream operator>> has parsed a bad type and skip it

只愿长相守 submitted on 2019-11-25 21:46:55

Question


I am interested in discussing methods for using stringstream to parse a line with multiple types. I would begin by looking at the following line:

"2.832 1.3067 nana 1.678"

Now let's assume I have a long line that has multiple strings and doubles. The obvious way to solve this is to tokenize the string and then try converting each token. I am interested in skipping this second step and using stringstream directly to find only the numbers.

I figured a good way to approach this would be to read through the string and check if the failbit has been set, which it will if I try to parse a string into a double.

Say I have the following code:

string a("2.832 1.3067 nana 1.678");

 stringstream parser;
 parser.str(a);

 for (int i = 0; i < 4; ++i)
 {
     double b;
     parser >> b;
     if (parser.fail())
     {
         std::cout << "Failed!" << std::endl;
         parser.clear();
     }
     std::cout << b << std::endl;
 }

It will print out the following:

2.832
1.3067
Failed!
0
Failed!
0

I am not surprised that it fails to parse a string, but what is happening internally such that it fails to clear its failbit and parse the next number?


Answer 1:


The following code skips the bad word and collects the valid double values:

istringstream iss("2.832 1.3067 nana 1.678");
double num = 0;
while(iss >> num || !iss.eof()) {
    if(iss.fail()) {
        iss.clear();
        string dummy;
        iss >> dummy;
        continue;
    }
    cout << num << endl;
}



Your sample almost got it right; it was only missing a step to consume the invalid input field from the stream after detecting its wrong format:

 if (parser.fail()) {
     std::cout << "Failed!" << std::endl;
     parser.clear();
     string dummy;
     parser >> dummy;
 }

In your case, clear() resets the error flags but does not advance the read position, so the extraction tries to read from "nana" again on the last iteration; hence the last two lines in the output.

Also note the interplay between iostream::fail() and iostream::eof() in the loop condition of my first sample. There's a well-known Q&A on why simply testing for EOF as a loop condition is considered wrong, and it explains well how to break out of the input loop when unexpected or invalid values are encountered. But how to skip or ignore invalid input fields isn't explained there (and wasn't asked for).




Answer 2:


A few minor differences from πάντα ῥεῖ's answer: this version also handles e.g. negative number representations, and is, IMHO, a little simpler to read.

#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::istringstream iss("2.832 1.3067 nana1.678 x-1E2 xxx.05 meh.ugh");
    double num = 0;
    for (; iss; )
        if (iss >> num)
            std::cout << num << '\n';
        else if (!iss.eof())
        {
            iss.clear();
            iss.ignore(1);
        }
}

Output:

2.832
1.3067
1.678
-100
0.05





Answer 3:


I have built a more fine-tuned version that is able to skip invalid input character by character (without needing whitespace to separate the double numbers):

#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main() {

    istringstream iss("2.832 1.3067 nana1.678 xxx.05 meh.ugh");
    double num = 0;
    while(iss >> num || !iss.eof()) {
        if(iss.fail()) {
            iss.clear();
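            while(iss) {
                int next = iss.peek(); // int, so that EOF is representable
                if(std::isdigit(next) || next == '.') {
                    // Stop consuming invalid double characters
                    break;
                }
                else {
                    // Consume one invalid character. Unlike >>, ignore()
                    // does not skip whitespace and swallow the digit after it.
                    iss.ignore();
                }
            }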
            continue;
        }
        cout << num << endl;
    }
    return 0;
}

Output

 2.832
 1.3067
 1.678
 0.05





Answer 4:


If you like concision - here's another option that (ab?)uses && to get cout done only when a number's been parsed successfully, and when a number isn't parsed it uses the comma operator to be able to clear() stream error state inside the conditional before reading a character to be ignored...

#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::istringstream iss("2.832 1.3067 nana1.678 x-1E2 xxx.05 meh.ugh");
    double num = 0;
    char ignored;
    while (iss >> num && std::cout << num << '\n' ||
           (iss.clear(), iss) >> ignored)
        ;
}

http://ideone.com/WvtvfU




Answer 5:


You can use std::istringstream::eof() to validate input like this:

#include <string>
#include <sstream>
#include <iostream>

// remove white-space from each end of a std::string
inline std::string& trim(std::string& s, const char* t = " \t")
{
    s.erase(s.find_last_not_of(t) + 1);
    s.erase(0, s.find_first_not_of(t));
    return s;
}

// serial input
std::istringstream in1(R"~(
 2.34 3 3.f 3.d .75 0 wibble 
)~");

// line input
std::istringstream in2(R"~(
2.34
 3

3.f
3.d
.75
0
wibble 
)~");

int main()
{
    std::string input;

    // NOTE: This technique will not work if input is empty
    // or contains only white-space characters. Therefore
    // it is safe to use after a conditional extraction
    // operation >> but it is not reliable after std::getline()
    // without further checks.

    while(in1 >> input)
    {
        // input will not be empty and will not contain white-space.
        double d;
        if((std::istringstream(input) >> d >> std::ws).eof())
        {
            // d is a valid double
            std::cout << "d1: " << d << '\n';
        }
    }

    std::cout << '\n';

    while(std::getline(in2, input))
    {
        // eliminate blank lines and lines
        // containing only white-space (trim())
        if(trim(input).empty())
            continue;

        // NOW this is safe to use

        double d;
        if((std::istringstream(input) >> d >> std::ws).eof())
        {
            // d is a valid double
            std::cout << "d2: " << d << '\n';
        }
    }
}

This works because the eof() check ensures that only the double was entered and not garbage like 12d4.



Source: https://stackoverflow.com/questions/24504582/how-to-test-whether-stringstream-operator-has-parsed-a-bad-type-and-skip-it
