How do I rewrite a context free grammar so that it is LR(1)?

江枫思渺然 提交于 2019-12-08 09:12:14

问题


For the given context free grammar:

S -> G $
G -> PG | P
P -> id : R
R -> id R | epsilon

How do I rewrite the grammar so that it is LR(1)?
The current grammar has shift/reduce conflicts when parsing the input "id : .id", where "." is the input pointer for the parser.
This grammar produces the language satisfying the regular expression (id:(id)*)+


回答1:


It's easy enough to produce an LR(1) grammar for the same language. The trick is finding one which has a similar parse tree, or at least from which the original parse tree can be recovered easily.

Here's a manually generated grammar, which is slightly simplified from the general algorithm. In effect, we rewrite the regular expression:

(id:id*)+

to:

id(:id+)*:id*

which induces the grammar:

S  → id G $ 
G  → P G | P'
P' → : R'
P  → : R
R' → ε | id R'
R  → ε | id R

which is LALR(1).

In effect, we've just shifted all the productions one token to the right, and there is a general algorithm which can be used to create an LR(1) grammar from an LR(k+1) grammar for any k≥1. (The version of this algorithm I'm using comes from Parsing Theory by S. Sippu & E. Soisalon-Soininen, Vol II, section 6.7.)

The non-terminals of the new grammar will have the form (x, V, y) where V is a symbol from the original grammar (either a terminal or a non-terminal) and x and y are terminal sequences of maximum length k such that:

y ∈ FOLLOWk(V)
x ∈ FIRSTk(Vy)

(The lengths of y and consequently x might be less than k if the end of input is included in the follow set. Some people avoid this issue by adding k end symbols, but I think this version is just as simple.)

A non-terminal (x, V, y) will generate the x-derivative of the strings derived from Vy from the original grammar. Informally, the entire grammar is shifted k tokens to the right; each non-terminal matches a string which is missing the first k tokens but is augmented with the following k tokens.

The productions are generated mechanically from the original productions. First, we add a new start symbol, S' with productions:

S' → x (x, S, ε)

for every x ∈ FIRSTk(S). Then, for every production

T → V0 V1 … Vm

we generate the set of productions:

(x0,T,xm+1) → (x0,V0,x1) (x1,V1,x2) … (xm,Vm,xm+1)

and for every terminal A we generate the set of productions

(Ax,A,xB) → B    if |x| = k
(Ax,A,x) → ε     if |x| ≤ k

Since there is an obvious homomorphism from the productions in the new grammar to the productions in the old grammar, we can directly create the original parse tree, although we need to play some tricks with the semantic values in order to correctly attach them to the parse tree.



来源:https://stackoverflow.com/questions/26786746/how-do-i-rewrite-a-context-free-grammar-so-that-it-is-lr1

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!