How do I remove shift-reduce conflict for bison for the given grammar?
selection-stmt -> if ( expression ) statement |
if ( express
make if else higher level than normal statements, like:
statements:
statements lineEnd statement
| statements lineEnd IfStat
| statements lineEnd IfElseStat
| IfStat
| IfElseStat
;
IfStat:
if ( statement )
;
IfElse:
IfStat else statement
;
There is a much simpler solution. If you know how LR parsers work, then you know that the conflict happens here:
if ( expression ) statement * else statement
where the star marks the current position of the cursor. The question the parser must answer is "should I shift, or should I reduce". Usually, you want to bind the else
to the closest if
, which means you want to shift the else
token now. Reducing now would mean that you want the else
to wait to be bound to an "older" if
.
Now you want to "tell" your parser generator that "when there is a shift/reduce conflict between the token "else"
and the rule "stm -> if ( exp ) stm", then the token must win". To do so, "give a name" to the precedence of your rule (e.g., "then"
), and specify that "then"
has less precedence than "else"
. Something like:
// Precedences go increasing, so "then" < "else".
%nonassoc "then"
%nonassoc "else"
%%
stm: "if" "(" exp ")" stm %prec "then"
| "if" "(" exp ")" stm "else" stm
using Bison syntax.
Actually, my favorite answer is even to give "then"
and "else"
the same precedence. When the precedences are equal, to break the tie between the token that wants to be shifted, and the rule that wants to be reduced, Bison/Yacc will look at associativity. Here, you want to promote right-associativity so to speak (more exactly, you want to promote "shift"), so:
%right "then" "else" // Same precedence, but "shift" wins.
will suffice.
You need to recognize the fact that the middle statement
in the if-else case cannot be (or end with) a dangling if (an if with no else.) The easiest way to do that is to split the stmt
rule in two:
stmt -> stmt-ending-with-dangling-if | stmt-not-ending-with-dangling-if
stmt-not-ending-with-dangling-if ->
if ( expression ) stmt-not-ending-with-dangling-if else stmt-not-ending-with-dangling-if |
...other statements not ending with dangling if...
stmt-ending-with-dangling-if ->
if ( expression ) stmt |
if ( expression ) stmt-not-ending-with-dangling-if else stmt-ending-with-dangling-if |
...other statements ending with dangling if...
Any other stmt -> whatever
rule where whatever
doesn't end with a stmt
goes in the stmt-not-ending-with-if
rule, while any stmt
rule that DOES end in stmt
get split in two versions; an not-ending-with-if
version in the not-ending-with-if
rule and a dangling-if
version in the dangling-if
rule.
edit
A more complete grammar with other productions:
stmt : stmt-ending-with-dangling-if | stmt-not-ending-with-dangling-if
stmt-not-ending-with-dangling-if :
IF '(' expr ')' stmt-not-ending-with-dangling-if ELSE stmt-not-ending-with-dangling-if |
WHILE '(' expr ')' stmt-not-ending-with-dangling-if |
DO stmt WHILE '(' expr ')' ';' |
expr ';' |
'{' stmt-list '}'
stmt-ending-with-dangling-if:
IF '(' expr ')' stmt |
IF '(' expr ')' stmt-not-ending-with-dangling-if ELSE stmt-ending-with-dangling-if |
WHILE '(' expr ')' stmt-ending-with-dangling-if
Rules like WHILE (expr) stmt
get split in two (as they end with stmt
), while rules like expr;
do not.