C++ regex exclusion double quotes not working

渐次进展 2021-01-25 00:32

I am considering input files with lines like

\"20170103\",\"MW JANE DOE\",\"NL01 INGB 1234 5678 90\",\"NL02 INGB 1234 5678 90\",\"GT\",\"Af\",\"12,34\",\"Interne         


        
2 Answers
  • 2021-01-25 01:06

    Your pattern contains a capturing group. When the regex finds a match, the double quotes are part of the whole match value (stored in element [0] of the match), while the captured text is stored in element [1].

    So, you just need to access capturing group #1 contents:

    linePart=it->str(1);
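
    To make the difference concrete, here is a minimal sketch that collects a chosen group from every match. The pattern "([^"]*)" and the helper name quotedFields are assumptions for illustration, since the question's own regex is not shown above.

    ```cpp
    #include <cstddef>
    #include <regex>
    #include <string>
    #include <vector>

    // Hypothetical helper: returns group `group` of every match in `line`.
    // group == 0 is the whole match (quotes included),
    // group == 1 is the text captured between the quotes.
    std::vector<std::string> quotedFields(const std::string& line, std::size_t group) {
        // Assumed pattern: a double quote, any run of non-quote characters, a double quote.
        static const std::regex re("\"([^\"]*)\"");
        std::vector<std::string> parts;
        for (std::sregex_iterator it(line.begin(), line.end(), re), end; it != end; ++it) {
            parts.push_back(it->str(group));
        }
        return parts;
    }
    ```

    Calling quotedFields(line, 0) keeps the surrounding quotes on each field, while quotedFields(line, 1) drops them.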
    

    See regular-expressions.info Finding a Regex Match:

    When the function call returns true, you can call the str(), position(), and length() member functions of the match_results object to get the text that was matched, or the starting position and its length of the match relative to the subject string. Call these member functions without a parameter or with 0 as the parameter to get the overall regex match. Call them passing 1 or greater to get the match of a particular capturing group. The size() member function indicates the number of capturing groups plus one for the overall match. Thus you can pass a value up to size()-1 to the other three member functions.
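
    The member functions described in that passage can be tried out with a single regex_search call. This is only a sketch: the pattern and the MatchSummary struct are made up for illustration.

    ```cpp
    #include <cstddef>
    #include <regex>
    #include <string>

    // Hypothetical container for what str(), position(), length() and size() report.
    struct MatchSummary {
        bool found = false;
        std::size_t size = 0;           // number of capturing groups plus one
        std::string whole;              // str()  or str(0)
        std::ptrdiff_t wholePos = -1;   // position()
        std::ptrdiff_t wholeLen = 0;    // length()
        std::string group1;             // str(1)
        std::ptrdiff_t group1Pos = -1;  // position(1)
        std::ptrdiff_t group1Len = 0;   // length(1)
    };

    MatchSummary summarizeFirstField(const std::string& subject) {
        // Assumed pattern with exactly one capturing group.
        static const std::regex re("\"([^\"]*)\"");
        MatchSummary s;
        std::smatch m;
        if (std::regex_search(subject, m, re)) {
            s.found = true;
            s.size = m.size();
            s.whole = m.str();
            s.wholePos = m.position();
            s.wholeLen = m.length();
            s.group1 = m.str(1);
            s.group1Pos = m.position(1);
            s.group1Len = m.length(1);
        }
        return s;
    }
    ```

    Positions are relative to the start of the subject string, so group 1 begins one character after the whole match and is two characters shorter.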

  • 2021-01-25 01:23

    As others have said, regex_iterator::operator-> gives you access to a match_results, and the argument of match_results::str defaults to 0:

    The first sub_match (index 0) contained in a match_result always represents the full match within a target sequence made by a regex, and subsequent sub_matches represent sub-expression matches corresponding in sequence to the left parenthesis delimiting the sub-expression in the regex

    So the problem with your code is that you're not using linePart = it->str(1).

    A better solution would be to use a regex_token_iterator, with which you could use your re to initialize lineParts directly:

    vector<string> lineParts { sregex_token_iterator(cbegin(line), cend(line), re, 1), sregex_token_iterator() };
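
    For completeness, a self-contained version of that one-liner might look like the following. The pattern and the function name splitQuoted are assumptions; the trailing 1 asks the token iterator for capture group 1 rather than the whole match.

    ```cpp
    #include <regex>
    #include <string>
    #include <vector>

    // Hypothetical helper: every quoted field of `line`, without the quotes.
    std::vector<std::string> splitQuoted(const std::string& line) {
        // Assumed pattern; the question's own regex may differ.
        static const std::regex re("\"([^\"]*)\"");
        // Submatch index 1 selects the capture group for each match.
        std::sregex_token_iterator first(line.begin(), line.end(), re, 1);
        std::sregex_token_iterator last;  // default-constructed end-of-sequence iterator
        return std::vector<std::string>(first, last);
    }
    ```

    Because the pattern only stops at a double quote, a comma inside a field (as in "12,34") stays part of that field.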
    

    But I'd just like to point out that C++14 introduced quoted, which does exactly what you're trying to do here, and more (it even handles escaped quotes for you). It'd be a shame not to use it.

    You are probably already getting your input from a stream, but in case you're not, you'd need to initialize an istringstream. For the purposes of this example I'll call mine line. Then you can use quoted to populate lineParts like this:

    for(string linePart; line >> quoted(linePart); line.ignore(numeric_limits<streamsize>::max(), ',')) {
        lineParts.push_back(linePart);
    }
    

    Live Example
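
    Wrapped into a function, the loop could be used as sketched below. The name parseQuotedCsv is hypothetical, and the sketch assumes that every field is double-quoted and that fields are separated by commas.

    ```cpp
    #include <iomanip>
    #include <limits>
    #include <sstream>
    #include <string>
    #include <vector>

    // Hypothetical helper: splits one line of quoted, comma-separated fields.
    std::vector<std::string> parseQuotedCsv(const std::string& text) {
        std::istringstream line(text);
        std::vector<std::string> lineParts;
        // std::quoted strips the surrounding quotes and unescapes \" and \\.
        // After each field, skip everything up to and including the next comma.
        for (std::string linePart; line >> std::quoted(linePart);
             line.ignore(std::numeric_limits<std::streamsize>::max(), ',')) {
            lineParts.push_back(linePart);
        }
        return lineParts;
    }
    ```

    With the default escape character, a field written as "say \"hi\"" comes back with the inner quotes kept and the backslashes removed.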
