Boost::Spirit Expression Parser


I have another problem with my boost::spirit parser.

template
struct expression: qi::grammar


                      
1 answer

甜味超标 (2020-11-30 09:47)
                                                                       
It isn't entirely clear to me what you are trying to achieve. Most importantly, are you not worried about operator associativity? I'll just show simple answers based on right recursion. Note that this parses every operator as right-associative and ignores precedence, as the grouping in the debug output further down shows.

The straight answer to your visible question would be to juggle a fusion::vector2<char, ast::expression>, which isn't really any fun, especially in Phoenix lambda semantic actions. (I'll show what that looks like below.)

Meanwhile, I think you should read up on the Spirit docs:

the old Spirit docs (eliminating left recursion): though the syntax no longer applies, Spirit still generates LL recursive descent parsers, so the concept behind left recursion still applies. The code below shows this applied to Spirit Qi.
the Qi examples: they contain three calculator samples, which should give you a hint on why operator associativity matters, and how you would express a grammar that captures the associativity of binary operators. Obviously, they also show how to support parenthesized expressions to override the default evaluation order.
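The left-recursion point can be illustrated without Spirit at all. The following is a minimal hand-written sketch, not Spirit code: `parse_left` is a hypothetical helper that assumes single-character operands and well-formed input. The left-recursive rule `expr := expr op operand | operand` is rewritten as `operand (op operand)*`, and folding inside the loop keeps the grouping left-associative. Like the grammars in this answer, it ignores operator precedence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Spirit): parses single-character operands
// separated by single-character operators and returns the grouping as a
// fully parenthesized string.
// Grammar after eliminating left recursion: expr := operand (op operand)*
std::string parse_left(const std::string& in)
{
    assert(!in.empty());
    std::size_t pos = 0;
    std::string acc(1, in[pos++]);          // first operand
    while (pos + 1 < in.size()) {           // each iteration consumes "op operand"
        const char op  = in[pos++];
        const char rhs = in[pos++];
        acc = "(" + acc + op + rhs + ")";   // fold to the left
    }
    return acc;
}
```

For the sample input `1/2+3-4*5` this sketch yields `((((1/2)+3)-4)*5)`, the mirror image of the right-nested grouping that the right-recursive rules in this answer produce.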


Code:

I have three versions of the code that work, parsing input like:

std::string input("1/2+3-4*5");


into an ast::expression grouped like (using BOOST_SPIRIT_DEBUG):

<expr>
  ....
  <success></success>
  <attributes>[[1, [2, [3, [4, 5]]]]]</attributes>
</expr>



  The links to the code are here:
  
  
  step_#1_reduce_semantic_actions.cpp
  step_#2_drop_rule.cpp
  step_#0_vector2.cpp
  


Step 1: Reduce semantic actions

First thing, I'd get rid of the alternative parse expressions per operator; this leads to excessive backtracking¹. Also, as you've found out, it makes the grammar hard to maintain. So, here is a simpler variation that uses a function for the semantic action:

¹ Check that using BOOST_SPIRIT_DEBUG!

static ast::expression make_binop(char discriminant, 
     const ast::expression& left, const ast::expression& right)
{
    switch(discriminant)
    {
        case '+': return ast::binary_op<ast::add>(left, right);
        case '-': return ast::binary_op<ast::sub>(left, right);
        case '/': return ast::binary_op<ast::div>(left, right);
        case '*': return ast::binary_op<ast::mul>(left, right);
    }
    throw std::runtime_error("unreachable in make_binop");
}

// rules:
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];

simple = varname | number;
binop = (simple >> char_("-+*/") >> expr) 
    [ _val = phx::bind(make_binop, qi::_2, qi::_1, qi::_3) ]; 

expr = binop | simple;
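To get a feel for the backtracking claim, here is a toy model in plain C++ (not Spirit, and not a measurement of Spirit itself). It counts how often an operand gets parsed when there is one alternative per operator, compared with a single alternative that uses a character set as in the rules above. All names (`parse_operand`, `expr_alternatives`, `expr_charset`) are hypothetical, and operands are assumed to be single digits.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Parses one digit and counts every attempt, successful or not.
bool parse_operand(const std::string& in, std::size_t& pos, int& count)
{
    ++count;
    if (pos < in.size() && std::isdigit(static_cast<unsigned char>(in[pos]))) {
        ++pos;
        return true;
    }
    return false;
}

// One alternative per operator:
//   expr := d '+' expr | d '-' expr | d '/' expr | d '*' expr | d
bool expr_alternatives(const std::string& in, std::size_t& pos, int& count)
{
    static const char ops[] = {'+', '-', '/', '*'};
    for (char op : ops) {
        std::size_t p = pos;                       // every alternative restarts here
        if (parse_operand(in, p, count) && p < in.size() && in[p] == op) {
            ++p;
            if (expr_alternatives(in, p, count)) { pos = p; return true; }
        }
    }
    return parse_operand(in, pos, count);          // fall back to a lone operand
}

// One binop alternative with a character set:
//   expr := d [-+*/] expr | d
bool expr_charset(const std::string& in, std::size_t& pos, int& count)
{
    std::size_t p = pos;
    if (parse_operand(in, p, count) && p < in.size()
        && std::string("-+*/").find(in[p]) != std::string::npos) {
        ++p;
        if (expr_charset(in, p, count)) { pos = p; return true; }
    }
    return parse_operand(in, pos, count);          // fall back to a lone operand
}
```

In this model, the input `1/2+3-4*5` costs 15 operand parses with per-operator alternatives and 6 with the character set. The exact numbers depend on the order of the alternatives; the point is only that each failed alternative re-parses the left operand.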


Step 2: Remove redundant rules, use _val

As you can see, this has the potential to reduce complexity. It is now only a small step to remove the binop intermediate rule (which has become quite redundant):

number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];

simple = varname | number;
expr = simple [ _val = _1 ] 
    > *(char_("-+*/") > expr) 
            [ _val = phx::bind(make_binop, qi::_1, _val, qi::_2) ]
    > eoi;


As you can see, 


within the expr rule, the _val lazy placeholder is used as a pseudo-local variable that accumulates the binops. Across rules, you'd have to use qi::locals<ast::expression> for such an approach. (This was your question regarding _r1).
there are now explicit expectation points, making the grammar more robust
the expr rule no longer needs to be an auto-rule (expr = instead of expr %=)
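The accumulator role of _val can be mimicked in a hand-written recursive descent parser. This is only a sketch under stated assumptions: `node`, `make_leaf`, `parse_expr` and `show` are hypothetical stand-ins (the real ast::expression is not shown in this answer), operands are plain alphanumeric runs, and whitespace is not skipped. The local variable `val` plays the part of _val.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ast::expression: a leaf (op == '\0') or a binary node.
struct node {
    char op;
    std::string leaf;
    std::shared_ptr<node> left, right;
};
typedef std::shared_ptr<node> expression;

expression make_leaf(const std::string& text)
{
    expression e = std::make_shared<node>();
    e->op = '\0';
    e->leaf = text;
    return e;
}

expression make_binop(char discriminant, const expression& left, const expression& right)
{
    if (std::string("-+*/").find(discriminant) == std::string::npos)
        throw std::runtime_error("unreachable in make_binop");
    expression e = std::make_shared<node>();
    e->op = discriminant;
    e->left = left;
    e->right = right;
    return e;
}

// expr := simple (op expr)*
expression parse_expr(const std::string& in, std::size_t& pos)
{
    const std::size_t start = pos;
    while (pos < in.size() && std::isalnum(static_cast<unsigned char>(in[pos])))
        ++pos;
    if (pos == start)
        throw std::runtime_error("operand expected");
    expression val = make_leaf(in.substr(start, pos - start));  // like _val = _1

    while (pos < in.size() && std::string("-+*/").find(in[pos]) != std::string::npos) {
        const char op = in[pos++];
        val = make_binop(op, val, parse_expr(in, pos));         // like make_binop(_1, _val, _2)
    }
    return val;
}

// Renders the tree fully parenthesized so the grouping is visible.
std::string show(const expression& e)
{
    if (e->op == '\0') return e->leaf;
    return "(" + show(e->left) + e->op + show(e->right) + ")";
}
```

Parsing `1/2+3-4*5` with this sketch and calling `show` on the result gives `(1/(2+(3-(4*5))))`, the same right-nested shape as the attributes in the debug output. Because the recursive call consumes the rest of the input, the loop body runs at most once per call here.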


Step 0: Wrestle fusion types directly

Finally, for fun and gory, let me show how you could have handled your suggested code, along with the shifting bindings of _1, _2 etc.:

static ast::expression make_binop(
        const ast::expression& left, 
        const boost::fusion::vector2<char, ast::expression>& op_right)
{
    switch(boost::fusion::get<0>(op_right))
    {
        case '+': return ast::binary_op<ast::add>(left, boost::fusion::get<1>(op_right));
        case '-': return ast::binary_op<ast::sub>(left, boost::fusion::get<1>(op_right));
        case '/': return ast::binary_op<ast::div>(left, boost::fusion::get<1>(op_right));
        case '*': return ast::binary_op<ast::mul>(left, boost::fusion::get<1>(op_right));
    }
    throw std::runtime_error("unreachable in make_binop");
}

// rules:
expression::base_type(expr) {
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];

simple = varname | number;
binop %= (simple >> (char_("-+*/") > expr)) 
    [ _val = phx::bind(make_binop, qi::_1, qi::_2) ]; // note _2!!!

expr %= binop | simple;


As you can see, not nearly as much fun writing the make_binop function that way!
                                                                        
                                                        
            
            
              
                