I\'m hoping someone can shine a light through my ignorance of using the >
and >>
operators in spirit parsing.
I have a working gra
Is it worth preventing back-tracking?
Absolutely. Preventing back tracking in general is a proven way to improve parser performance.
Using expectation points (>
) does not essentially reduce backtracking: it will just disallow it. This will enable targeted error messages, prevent useless 'parsing-into-the-unknown'.
Why do I need the grouping operator ()
I'm not sure. I had a check using my what_is_the_attr helpers from here
ident >> op >> repeat(1,3)[any] >> "->" >> any
synthesizes attribute:
fusion::vector4<string, string, vector<string>, string>
ident >> op > repeat(1,3)[any] > "->" > any
synthesizes attribute:
fusion::vector3<fusion::vector2<string, string>, vector<string>, string>
I haven't found the need to group subexpressions using parentheses (things compile), but obviously DataT
needs to be modified to match the changed layout.
typedef boost::tuple<
boost::tuple<std::string, std::string>,
std::vector<std::string>,
std::string
> DataT;
The full code below shows how I'd prefer to do that, using adapted structs.
Does my above example really stop back-tracking after operationRule is matched (I suspect not, it seems that if the entire parser inside the (...) fails backtracking will be allowed)?
Absolutely. If the expectation(s) is not met, a qi::expectation_failure<>
exception is thrown. This by default aborts the parse. You could use qi::on_error to retry
, fail
, accept
or rethrow
. The MiniXML example has very good examples on using expectation points with qi::on_error
If the answer to the previous question is /no/, how do I construct a rule that allows backtracking if operation is /not/ matched, but does not allow backtracking once operation /is/ matched?
Why does the grouping operator destroy the attribute grammar - requiring actions?
It doesn't destroy the attribute grammar, it just changes the exposed type. So, if you bind an appropriate attribute reference to the rule/grammar, you won't need semantic actions. Now, I feel there should be ways to go without the grouping , so let me try it (preferrably on your short selfcontained sample). And indeed I have found no such need. I've added a full example to help you see what is happening in my testing, and not using semantic actions.
The full code show 5 scenarios:
OPTION 1: Original without expectations
(no relevant changes)
OPTION 2: with expectations
Using the modified typedef for DataT (as shown above)
OPTION 3: adapted struct, without expectations
Using a userdefined struct with BOOST_FUSION_ADAPT_STRUCT
OPTION 4: adapted struct, with expectations
Modifying the adapted struct from OPTION 3
OPTION 5: lookahead hack
This one leverages a 'clever' (?) hack, by making all >>
into expectations, and detecting the presence of a operationRule
-match beforehand. This is of course suboptimal, but allows you to keep DataT
unmodified, and without using semantic actions.
Obviously, define OPTION
to the desired value before compiling.
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/adapted.hpp>
#include <iostream>
namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;
#ifndef OPTION
#define OPTION 5
#endif
#if OPTION == 1 || OPTION == 5 // original without expectations (OR lookahead hack)
typedef boost::tuple<std::string, std::string, std::vector<std::string>, std::string> DataT;
#elif OPTION == 2 // with expectations
typedef boost::tuple<boost::tuple<std::string, std::string>, std::vector<std::string>, std::string> DataT;
#elif OPTION == 3 // adapted struct, without expectations
struct DataT
{
std::string identifier, operation;
std::vector<std::string> values;
std::string destination;
};
BOOST_FUSION_ADAPT_STRUCT(DataT, (std::string, identifier)(std::string, operation)(std::vector<std::string>, values)(std::string, destination));
#elif OPTION == 4 // adapted struct, with expectations
struct IdOpT
{
std::string identifier, operation;
};
struct DataT
{
IdOpT idop;
std::vector<std::string> values;
std::string destination;
};
BOOST_FUSION_ADAPT_STRUCT(IdOpT, (std::string, identifier)(std::string, operation));
BOOST_FUSION_ADAPT_STRUCT(DataT, (IdOpT, idop)(std::vector<std::string>, values)(std::string, destination));
#endif
template <typename Iterator>
struct test_parser : qi::grammar<Iterator, DataT(), qi::space_type, qi::locals<char> >
{
test_parser() : test_parser::base_type(test, "test")
{
using namespace qi;
quoted_string =
omit [ char_("'\"") [_a =_1] ]
>> no_skip [ *(char_ - char_(_a)) ]
> lit(_a);
any_string = quoted_string | +qi::alnum;
identifier = lexeme [ alnum >> *graph ];
operationRule = string("add") | "sub";
arrow = "->";
#if OPTION == 1 || OPTION == 3 // without expectations
test = identifier >> operationRule >> repeat(1,3)[any_string] >> arrow >> any_string;
#elif OPTION == 2 || OPTION == 4 // with expectations
test = identifier >> operationRule > repeat(1,3)[any_string] > arrow > any_string;
#elif OPTION == 5 // lookahead hack
test = &(identifier >> operationRule) > identifier > operationRule > repeat(1,3)[any_string] > arrow > any_string;
#endif
}
qi::rule<Iterator, qi::space_type/*, qi::locals<char> */> arrow;
qi::rule<Iterator, std::string(), qi::space_type/*, qi::locals<char> */> operationRule;
qi::rule<Iterator, std::string(), qi::space_type/*, qi::locals<char> */> identifier;
qi::rule<Iterator, std::string(), qi::space_type, qi::locals<char> > quoted_string, any_string;
qi::rule<Iterator, DataT(), qi::space_type, qi::locals<char> > test;
};
int main()
{
std::string str("addx001 add 'str1' \"str2\" -> \"str3\"");
test_parser<std::string::const_iterator> grammar;
std::string::const_iterator iter = str.begin();
std::string::const_iterator end = str.end();
DataT data;
bool r = phrase_parse(iter, end, grammar, qi::space, data);
if (r)
{
using namespace karma;
std::cout << "OPTION " << OPTION << ": " << str << " --> ";
#if OPTION == 1 || OPTION == 3 || OPTION == 5 // without expectations (OR lookahead hack)
std::cout << format(delimit[auto_ << auto_ << '[' << auto_ << ']' << " --> " << auto_], data) << "\n";
#elif OPTION == 2 || OPTION == 4 // with expectations
std::cout << format(delimit[auto_ << '[' << auto_ << ']' << " --> " << auto_], data) << "\n";
#endif
}
if (iter!=end)
std::cout << "Remaining: " << std::string(iter,end) << "\n";
}
Output for all OPTIONS:
for a in 1 2 3 4 5; do g++ -DOPTION=$a -I ~/custom/boost/ test.cpp -o test$a && ./test$a; done
OPTION 1: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
OPTION 2: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
OPTION 3: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
OPTION 4: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
OPTION 5: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3