问题
I'm having trouble building the collection of sets of items for LR(1) parsers with a grammar containing epsilon productions. For example, given the following grammar (where eps stands for epsilon)
S -> a S U
U -> b
| eps
State0 would be
S' -> .S, $
S -> .a S U, $
Moving with 'a' from State0 would give the following state, let's call it State2
S -> a .S U, $
S -> .a S U, $/???
In order to have the lookahead for the second item of State2 I need to calculate FIRST(U$). I know that FIRST(U) = {'b', eps}. My first question is: the lookaheads of the second item of State2 are $ and 'b'? Since U can be eps, my brain tells me that I can have $ as a lookahead as well, not just 'b'. It would have been just 'b' if FIRST(U) would have been just {'b'}. Is that correct?
Second question: at some point I will have a state as the following one
S -> a S .U, $
U -> .b, $
U -> .eps, $
What do I do here? Do I need to move with eps and have a set with the item U -> eps., $
? What if I have another terminal as lookahead, i.e. X -> .eps, a/$
? And if I move, ending up having a set of the form X -> eps., $
, do I reduce?
And more: do I need to insert eps in the parse table as a symbol?
Thanks
回答1:
FIRST(U$)
means "the set of symbols which could be first in a derivation ofU$
". Clearly, ifU
can derive the empty string,$
must be part of this set. The end-of-input marker$
ensures that we never have to worry about epsilons in theFIRST
sets. (If we were doing LR(k) instead of LR(1), we would usek
end markers so that all the strings inFIRSTk
had lengthk
.The item associated with
U →
(or withU → ε
if you insist) isU → •
. In other words, it is reducible and should trigger a reduce action on matching lookahead.ε
is not a symbol; we only use it (sometimes) to make the empty string visible. But the empty string is empty.
来源:https://stackoverflow.com/questions/50393772/lr1-parsing-table-with-epsilon-productions