Parse quoted strings with boost::spirit

梦毁少年i 2020-11-27 21:53

I would like to parse a sentence where some strings may be unquoted, 'quoted' or "quoted". The code below almost works - but it fails to match closing quotes. I'm guess

1 Answer
  • 2020-11-27 22:27

    The reference to qq becomes dangling after leaving the constructor, so that is indeed a problem.

    qi::locals is the canonical way to keep local state inside parser expressions. Your other option would be to extend the lifetime of qq (by making it a member of the grammar class, e.g.). Lastly, you might be interested in inherited attributes as well. This mechanism gives you a way to call a rule/grammar with 'parameters' (passing local state around).

    NOTE: There are caveats with the use of the plus operator + (and likewise the Kleene star *): both are greedy, and parsing fails if the string is not terminated with the expected quote.
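
    To make the caveat concrete, here is a hand-written sketch of the same matching logic without Spirit. ParseResult and parse_any_string are hypothetical names, and unlike a phrase parser this version does not skip leading whitespace; it only shows why an unterminated quote yields no match at all.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Result of one parse attempt; `consumed` counts the matched characters.
struct ParseResult {
    bool ok;
    std::string value;
    std::size_t consumed;
};

ParseResult parse_any_string(const std::string& in)
{
    const ParseResult fail = { false, std::string(), 0 };
    if (in.empty())
        return fail;

    const char q = in[0];
    if (q == '\'' || q == '"') {
        // Remember the opening quote and look for the same character again.
        const std::size_t close = in.find(q, 1);
        if (close == std::string::npos)
            return fail;  // unterminated: the quoted branch cannot succeed
        const ParseResult quoted = { true, in.substr(1, close - 1), close + 1 };
        return quoted;
    }

    // Unquoted branch: one or more alphanumerics, taken greedily.
    std::size_t n = 0;
    while (n < in.size() && std::isalnum(static_cast<unsigned char>(in[n])))
        ++n;
    if (n == 0)
        return fail;
    const ParseResult bare = { true, in.substr(0, n), n };
    return bare;
}
```

    An input that opens with a quote but never closes it fails outright, since the fallback branch cannot start on a quote character either.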

    See another answer I wrote for more complete examples of treating arbitrary contents in (optionally/partially) quoted strings, that allow escaping of quotes inside quoted strings and more things like that:

    • How to make my split work only on one real line and be capable to skip quoted parts of string?

    I've reduced the grammar to the relevant bit, and included a few test cases:

    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <boost/fusion/adapted.hpp>
    #include <iostream>
    #include <string>
    
    namespace qi = boost::spirit::qi;
    
    template <typename Iterator>
    struct test_parser : qi::grammar<Iterator, std::string(), qi::space_type, qi::locals<char> >
    {
        test_parser() : test_parser::base_type(any_string, "test")
        {
            using namespace qi;
    
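            // Store the opening quote in the local _a, read everything up to
            // the same character (without skipping whitespace), then require it.
            quoted_string =
                   omit    [ char_("'\"") [_a = _1] ]
                >> no_skip [ *(char_ - char_(_a))  ]
                >> lit(_a)
            ;

            // Otherwise accept a bare run of alphanumerics.
            any_string = quoted_string | +qi::alnum;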
        }
    
        qi::rule<Iterator, std::string(), qi::space_type, qi::locals<char> > quoted_string, any_string;
    };
    
    int main()
    {
        test_parser<std::string::const_iterator> grammar;
        const char* strs[] = { "\"str1\"", 
                               "'str2'",
                               "'str3' trailing ok",
                               "'st\"r4' embedded also ok",
                               "str5",
                               "str6'",
                               NULL };
    
        for (const char** it = strs; *it; ++it)
        {
            const std::string str(*it);
            std::string::const_iterator iter = str.begin();
            std::string::const_iterator end  = str.end();
    
            std::string data;
            bool r = phrase_parse(iter, end, grammar, qi::space, data);
    
            if (r)
                std::cout << "Parsed:    " << str << " --> " << data << "\n";
            if (iter!=end)
                std::cout << "Remaining: " << std::string(iter,end) << "\n";
        }
    }
    

    Output:

    Parsed:    "str1" --> str1
    Parsed:    'str2' --> str2
    Parsed:    'str3' trailing ok --> str3
    Remaining: trailing ok
    Parsed:    'st"r4' embedded also ok --> st"r4
    Remaining: embedded also ok
    Parsed:    str5 --> str5
    Parsed:    str6' --> str6
    Remaining: '
    