Equation (expression) parser with precedence?

遇见更好的自我 2020-11-22 11:44

I've developed an equation parser using a simple stack algorithm that will handle binary (+, -, |, &, *, /, etc.) operators, unary (!) operators, and parentheses.

23 answers
  •  失恋的感觉
    2020-11-22 12:38

    The algorithm can easily be written in C as a recursive descent parser.

    #include <stdio.h>
    #include <ctype.h>
    
    /*
     *  expression -> sum
     *  sum -> product | product "+" sum
     *  product -> term | term "*" product
     *  term -> number | "(" expression ")"
     *  number -> [0..9]+
     */
    
    typedef struct {
        int value;
        const char* context;
    } expression_t;
    
    expression_t expression(int value, const char* context) {
        return (expression_t) { value, context };
    }
    
    /* begin: parsers */
    
    expression_t eval_expression(const char* symbols);
    
    expression_t eval_number(const char* symbols) {
        // number -> [0..9]+
        int number = 0;
        while (isdigit((unsigned char)*symbols)) {
            number = 10 * number + (*symbols - '0');
            symbols++;
        }
        return expression(number, symbols);
    }
    
    expression_t eval_term(const char* symbols) {
        // term -> number | "(" expression ")"
        if (*symbols != '(')
            return eval_number(symbols);
    
        // consume "(", evaluate the inner expression, then skip ")" if present
        expression_t inner = eval_expression(symbols + 1);
        if (*inner.context == ')')
            inner.context++;
        return inner;
    }
    
    expression_t eval_product(const char* symbols) {
        // product -> term | term "*" product
        expression_t term = eval_term(symbols);
        if (*term.context != '*')
            return term;
    
        expression_t product = eval_product(term.context + 1);
        return expression(term.value * product.value, product.context);
    }
    
    expression_t eval_sum(const char* symbols) {
        // sum -> product | product "+" sum
        expression_t product = eval_product(symbols);
        if (*product.context != '+')
            return product;
    
        expression_t sum = eval_sum(product.context + 1);
        return expression(product.value + sum.value, sum.context);
    }
    
    expression_t eval_expression(const char* symbols) {
        // expression -> sum
        return eval_sum(symbols);
    }
    
    /* end: parsers */
    
    int main() {
        const char* expression = "1+11*5";
        printf("eval(\"%s\") == %d\n", expression, eval_expression(expression).value);
    
        return 0;
    }
    

    The following libraries might also be useful: yupana (strictly arithmetic operations); tinyexpr (arithmetic operations, C math functions, and functions provided by the user); mpc (parser combinators).

    Explanation

    Let's describe the sequences of symbols that represent an algebraic expression. The first one is a number, that is, a decimal digit repeated one or more times. We will refer to such a notation as a production rule.

    number -> [0..9]+
    

    The addition operator with its operands is another rule. A sum is either a number or any sequence of symbols of the form sum "+" sum.

    sum -> number | sum "+" sum
    

    Try substituting number into sum "+" sum. This gives number "+" number, which in turn expands into [0..9]+ "+" [0..9]+ and can finally be reduced to 1+8, which is a correct addition expression.

    Other substitutions also produce correct expressions: sum "+" sum -> number "+" sum -> number "+" sum "+" sum -> number "+" sum "+" number -> number "+" number "+" number -> 12+3+5
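
    The two rules so far can also be read as a recognizer: a function that answers whether a string can be derived from sum. The sketch below is only an illustration and is not part of the original answer. It assumes the equivalent right-recursive form sum -> number | number "+" sum, because the rule sum "+" sum cannot be coded directly without recursing forever, and the names match_number, match_sum and is_sum are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* number -> [0..9]+
 * Returns a pointer just past the digits, or NULL if there are none. */
static const char* match_number(const char* s) {
    if (!isdigit((unsigned char)*s))
        return NULL;
    while (isdigit((unsigned char)*s))
        s++;
    return s;
}

/* sum -> number | number "+" sum   (assumed rewrite of sum "+" sum)
 * Returns a pointer just past the matched sum, or NULL on a mismatch. */
static const char* match_sum(const char* s) {
    const char* rest = match_number(s);
    if (rest == NULL || *rest != '+')
        return rest;                /* left alternative: a single number */
    return match_sum(rest + 1);     /* right alternative: number "+" sum */
}

/* True when the whole string can be derived from sum. */
bool is_sum(const char* s) {
    assert(s != NULL);
    const char* rest = match_sum(s);
    return rest != NULL && *rest == '\0';
}
```

    With these definitions, is_sum("1+8") and is_sum("12+3+5") hold, while strings such as "1++8" or "1+" are rejected because no substitution produces them.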

    Bit by bit we can assemble a set of production rules, also known as a grammar, that expresses all possible algebraic expressions.

    expression -> sum
    sum -> difference | difference "+" sum
    difference -> product | difference "-" product
    product -> fraction | fraction "*" product
    fraction -> term | fraction "/" term
    term -> "(" expression ")" | number
    number -> digit+
    

    To control operator precedence, change the position of an operator's production rule relative to the others. In the grammar above, the production rule for * is placed below the one for +, which forces a product to be evaluated before the sum that contains it. The implementation simply combines pattern recognition with evaluation and therefore closely mirrors the production rules.

    expression_t eval_product(const char* symbols) {
        // product -> term | term "*" product
        expression_t term = eval_term(symbols);
        if (*term.context != '*')
            return term;
    
        expression_t product = eval_product(term.context + 1);
        return expression(term.value * product.value, product.context);
    }
    

    Here we evaluate the term first and return it if no * character follows it; this is the left choice in our production rule. Otherwise we evaluate the symbols after the * and return term.value * product.value; this is the right choice in our production rule, i.e. term "*" product.
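
    The same pattern extends to the full grammar given earlier. The following is a minimal sketch under a few assumptions, not the original author's code: the left-recursive rules for "-" and "/" are written as loops (a direct recursive call would never terminate), division is integer division, a zero divisor yields 0, and the names result_t, parse_* and eval are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Parse result: the computed value and a pointer to the unread input. */
typedef struct {
    int value;
    const char* rest;
} result_t;

static result_t parse_sum(const char* s);

/* number -> digit+ */
static result_t parse_number(const char* s) {
    int n = 0;
    while (isdigit((unsigned char)*s))
        n = 10 * n + (*s++ - '0');
    return (result_t){ n, s };
}

/* term -> "(" expression ")" | number */
static result_t parse_term(const char* s) {
    if (*s != '(')
        return parse_number(s);
    result_t inner = parse_sum(s + 1);
    if (*inner.rest == ')')
        inner.rest++;               /* consume the closing parenthesis */
    return inner;
}

/* fraction -> term | fraction "/" term, left recursion written as a loop */
static result_t parse_fraction(const char* s) {
    result_t left = parse_term(s);
    while (*left.rest == '/') {
        result_t right = parse_term(left.rest + 1);
        /* integer division; a zero divisor yields 0 in this sketch */
        left.value = right.value != 0 ? left.value / right.value : 0;
        left.rest = right.rest;
    }
    return left;
}

/* product -> fraction | fraction "*" product */
static result_t parse_product(const char* s) {
    result_t left = parse_fraction(s);
    if (*left.rest != '*')
        return left;
    result_t right = parse_product(left.rest + 1);
    return (result_t){ left.value * right.value, right.rest };
}

/* difference -> product | difference "-" product, again as a loop */
static result_t parse_difference(const char* s) {
    result_t left = parse_product(s);
    while (*left.rest == '-') {
        result_t right = parse_product(left.rest + 1);
        left.value -= right.value;
        left.rest = right.rest;
    }
    return left;
}

/* sum -> difference | difference "+" sum */
static result_t parse_sum(const char* s) {
    result_t left = parse_difference(s);
    if (*left.rest != '+')
        return left;
    result_t right = parse_sum(left.rest + 1);
    return (result_t){ left.value + right.value, right.rest };
}

/* expression -> sum */
int eval(const char* s) {
    assert(s != NULL);
    return parse_sum(s).value;
}
```

    The loops make "-" and "/" left-associative, so eval("10-3-2") evaluates (10-3)-2 rather than 10-(3-2), while "+" and "*" keep the right-recursive shape used above, which is harmless because both operators are associative.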
