问题
A piece of code of my gramamar its driveing me crazy.
I have to write a grammar that allow write functions with multiple inputs
e.g.
function
begin
a:
<statments>
b:
<statements>
end
The problem with that its that is statements that are assignments like this
ID = Expresion.
in the following quote you can see the output produced by yacc.
0 $accept : InstanciasFuncion $end
1 InstanciasFuncion : InstanciasFuncion InstanciaFuncion
2 | InstanciaFuncion
3 InstanciaFuncion : PuntoEntrada Sentencias
4 PuntoEntrada : ID ':'
5 Sentencias : Sentencias Sentencia
6 | Sentencia
7 Sentencia : ID '=' ID
State 0
0 $accept: . InstanciasFuncion $end
ID shift, and go to state 1
InstanciasFuncion go to state 2
InstanciaFuncion go to state 3
PuntoEntrada go to state 4
State 1
4 PuntoEntrada: ID . ':'
':' shift, and go to state 5
State 2
0 $accept: InstanciasFuncion . $end
1 InstanciasFuncion: InstanciasFuncion . InstanciaFuncion
$end shift, and go to state 6
ID shift, and go to state 1
InstanciaFuncion go to state 7
PuntoEntrada go to state 4
State 3
2 InstanciasFuncion: InstanciaFuncion .
$default reduce using rule 2 (InstanciasFuncion)
State 4
3 InstanciaFuncion: PuntoEntrada . Sentencias
ID shift, and go to state 8
Sentencias go to state 9
Sentencia go to state 10
State 5
4 PuntoEntrada: ID ':' .
$default reduce using rule 4 (PuntoEntrada)
State 6
0 $accept: InstanciasFuncion $end .
$default accept
State 7
1 InstanciasFuncion: InstanciasFuncion InstanciaFuncion .
$default reduce using rule 1 (InstanciasFuncion)
State 8
7 Sentencia: ID . '=' ID
'=' shift, and go to state 11
State 9
3 InstanciaFuncion: PuntoEntrada Sentencias .
5 Sentencias: Sentencias . Sentencia
ID shift, and go to state 8
ID [reduce using rule 3 (InstanciaFuncion)]
$default reduce using rule 3 (InstanciaFuncion)
Sentencia go to state 12
State 10
6 Sentencias: Sentencia .
$default reduce using rule 6 (Sentencias)
State 11
7 Sentencia: ID '=' . ID
ID shift, and go to state 13
State 12
5 Sentencias: Sentencias Sentencia .
$default reduce using rule 5 (Sentencias)
State 13
7 Sentencia: ID '=' ID .
$default reduce using rule 7 (Sentencia)
Maybe somebody can help me to disambiguate this grammar
回答1:
Bison provides you with at least a hint. In State 9, which is really the only relevant part of the output other than the grammar itself, we see:
State 9
3 InstanciaFuncion: PuntoEntrada Sentencias .
5 Sentencias: Sentencias . Sentencia
ID shift, and go to state 8
ID [reduce using rule 3 (InstanciaFuncion)]
$default reduce using rule 3 (InstanciaFuncion)
Sentencia go to state 12
There's a shift/reduce conflict with ID
, in the context in which the possibilities are:
Complete the parse of an
InstanciaFuncion
(reduce)Continue the parse of a
Sentencias
(shift)
In both of those contexts, an ID
is possible. It's easy to construct an example. Consider these two instancias:
f : a = b c = d ...
f : a = b c : d = ...
We've finished with the b
and c
is the lookahead, so we can't see the symbol which follows the c. Now, have we finished parsing the funcion f
? Or should we try for a longer list of sentencias? No se sabe. (Nobody knows.)
Yes, your grammar is unambiguous, so it doesn't need to be disambiguated. It's not LR(1), though: you cannot tell what to do by only looking at the next one symbol. However, it is LR(2), and there is a proof than any LR(2) grammar has a corresponding LR(1) grammar. (For any value of 2 :) ). But, unfortunately, actually doing the transformation is not always very pretty. It can be done mechanically, but the resulting grammar can be hard to read. (See Notes below for references.)
In your case, it's pretty easy to find an equivalent grammar, but the parse tree will need to be adjusted. Here's one example:
InstanciasFuncion : PuntoEntrada
| InstanciasFuncion PuntoEntrada
| InstanciasFuncion Sentencia
PuntoEntrada: ID ':' Sentencia
Sentencia : ID '=' ID
It's a curious fact that this precise shift/reduce conflict is a feature of the grammar of bison itself, since bison accepts grammars as written above (i.e. without semi-colons). Posix insists that yacc
do so, and bison tries to emulate yacc
. Bison itself solves this problem in the scanner, not in the grammar: it's scanner recognizes "ID :" as a single token (even if separated with arbitrary whitespace). That might also be your best bet.
There is an excellent description of the proof than any LR(k) grammar can be covered by an LR(1) grammar, including the construction technique and a brief description of how to recover the original parse tree, in Sippu & Soisalon-Soininen, Parsing Theory, Vol. II (Springer Verlag, 1990) (Amazon). This two-volume set is a great reference for theoreticians, and has a lot of valuable practical information, but its heavy reading and its also a serious investment. If you have a university library handy, there should be a copy of it available. The algorithm presented is due to MD Mickunas, and was published in 1976 in JACM 23:17-30 (paywalled), which you should also be able to find in a good university library. Failing that, I found a very abbreviated description in Richard Marion Schell's thesis.
Personally, I wouldn't bother with all that, though. Either use a GLR parser, or use the same trick bison uses for the same purpose. Or use the simple grammar in the answer above and fiddle with the AST afterwards; it's not really difficult.
来源:https://stackoverflow.com/questions/19335353/a-yacc-shift-reduce-conflict-on-an-unambiguous-grammar