问题
I am trying to parse a sequence of expressions without delimiters in order to be able to parse ML/F# style function invocations:
myfunc expr1 expr2 expr3
However, the sequence of expressions is giving me a list of shift/reduce conflicts.
My guess is that the conflicts are caused by the recursive nature of my grammar, but I don't know how to fix these conflicts.
My (simplified) precedence rules and grammar looks like this:
/* Lowest precedence */
%left PLUS
%left TIMES
%left LPAR
/* Highest precedence */
Expr:
| CSTINT { CstI $1 }
| LPAR Expr RPAR { $2 }
| Expr TIMES Expr { Prim("*", $1, $3) }
| Expr PLUS Expr { Prim("+", $1, $3) }
| NAME ExprList { Call(Var $1, $2) }
ExprList:
| { [] }
| Expr ExprList { $1::$2 }
When I pass this to fsyacc I get a list of shift/reduce and reduce/reduce conflicts. An example shift/reduce conflict is
state 11: shift/reduce error on PLUS
The output from fsyacc for state 11 is:
state 11:
items:
Expr -> Expr . 'TIMES' Expr
Expr -> Expr . 'PLUS' Expr
ExprList -> Expr . ExprList
actions:
action 'EOF' (noprec): reduce ExprList -->
action 'LPAR' (explicit left 10000): shift 6
action 'RPAR' (noprec): reduce ExprList -->
action 'COMMA' (noprec): reduce ExprList -->
action 'PLUS' (explicit left 9998): shift 13
action 'TIMES' (explicit left 9999): shift 12
action 'NAME' (noprec): shift 14
action 'CSTINT' (noprec): shift 5
action 'error' (noprec): reduce ExprList -->
action '#' (noprec): reduce ExprList -->
action '$$' (noprec): reduce ExprList -->
immediate action: <none>
gotos:
goto Expr: 11
goto ExprList: 16
It has been a while since I took a course in compiler theory, so while I kind of know what shift/reduce and reduce/reduce conflicts are, I am not used to thinking about them. In particular, I don't see how reducing on PLUS
could lead to a valid parse. All in all, any insight into one or more of the following questions would be highly appreciated:
- Why is my grammar seemingly ambiguous?
- Can I fix it using precedence and/or associativity rules or, if not,
- Do I need to rewrite the grammar, and if so, roughly, how do I do it?
- Is yacc a suitable tool for a construct like this?
回答1:
1. Why is my grammar seemingly ambiguous?
Your grammar is ambiguous. It's not an illusion.
Suppose f is a function.
f x + 7
Is that f(x) + 7
or f(x+7)
?. Your grammar produces both.
IIRC, function application binds very tightly and associates to the left. So the above expression should parse as f(x) + 7
.
2. Can I fix it using precedence and/or associativity rules, or if not,
You can disambiguate function application with precedence and associativity rules; you just need to declare a precedence for it with %prec
. However, it ends up looking a bit ugly and ...
3. Do I need to rewrite the grammar, and if so, roughly, how do I do it?
... I don't believe it is correct to represent function application as Name ExprList
. It's a lot cleaner if you curry the arguments one at a time, at least while building the AST, and it looks prettier if you do it in the grammar rather than with precedence rules, which really weren't designed for invisible operators. See below.
4. Is yacc a suitable tool for a construct like this?
Sure, why not?
Here are two working (as far as I know) yacc grammars. The first uses precedence declarations for everything; the second separates out function application, which I think is cleaner:
// grammar1.y:
%left '+'
%left '*'
%left ATOM ';' '(' ')'
%%
program: /* empty */ { $$ = ""; }
| program statement ';' { std::cout << $2 << std::endl; }
| program ';'
;
statement: expr
;
expr: ATOM
| '(' expr ')' { $$ = $2; }
| expr expr %prec ATOM { $$ = '(' + $1 + ' ' + $2 + ')'; }
| expr '+' expr { $$ = "(+ " + $1 + ' ' + $3 + ')'; }
| expr '*' expr { $$ = "(* " + $1 + ' ' + $3 + ')'; }
;
// grammar2.y
%token ATOM
%left '+'
%left '*'
%%
program: /* empty */ { $$ = ""; }
| program statement ';' { std::cout << $2 << std::endl; }
| program ';'
;
statement: expr
;
term : ATOM
| '(' expr ')' { $$ = $2; }
;
apply: term
| apply term { $$ = '(' + $1 + ' ' + $2 + ')'; }
;
expr : apply
| expr '+' expr { $$ = "(+ " + $1 + ' ' + $3 + ')'; }
| expr '*' expr { $$ = "(* " + $1 + ' ' + $3 + ')'; }
;
来源:https://stackoverflow.com/questions/17933925/parsing-a-sequence-of-expressions-using-yacc