问题
I have this ANTLR 4 grammar:
constantFixedExpresion : term (('+'|'-') term)+;
term : factor (('*'|'//'|'REM')factor)+;
factor : ('+'|'-')*
( wholeNumberConstant
| constantFixedExpresion
| 'TOFIXED' (stringConstant | bitCodeConstant)
| identifier)
('FIT'constantFixedExpresion)*;
I get the following error:
error(119): LanguageA.g4::: The following sets of rules are mutually left-recursive [constantFixedExpresion, factor, term]
I tried so many ways but can't fix it. What is the problem and how can I solve it?
回答1:
Antlr is a LL(*) parser, which is in many ways "better" than a LL(k) parser, but still has many of its disavantages. One of these being the fact it can't deal with left-recursion (in fact, the version 4 can deal with left-recursion within the same rule). What the error is saying is that you have a left-recursion of a grammar, a bane for the LL parsers.
This is caused by this construction in your grammar:
constantFixedExpression: term ...;
term: factor ...;
factor: ('+' | '-')* (constantFixedExpression | ...) ...;
Since the *
operator means 0 or more, I can instantiate it with 0, so the parser will do this: "try constantFixedExpression
, so it needs to try term
, so it needs to try factor
, so it needs to try constantFixedEXpression
, so it [...]" and you've got yourself an infinite loop.
Fortunately, context-free formal grammars have an equivalent transformation for removing left-recursion! It can be expressed generically by:
A -> Aa | b
-- becomes --
A -> bR
R -> aR | ε
Or in Antlr notation:
A: Aa | b;
// becomes
A: bR;
R: (aR)?;
More information about this process can be found in automaton/grammars books or in the Wikipedia.
I'll leave correcting your grammar with the refactoration to remove left-recursion as your work. However, I want to touch in another point: Antlr 4 can do left-recursion! As I mentioned, the version 4 can deal with left-recursion within the same rule. There are ways to indicate precedence and associativity of operators other than directly in parsing, as you're doing, in Antlr4. Let's see how it works:
expr: NUMBER
|<assoc=right> expr '^' expr
| expr '*' expr
| expr '/' expr
| expr '+' expr
| expr '-' expr;
This is an example of a basic calculator grammar. The operators at the top are those with highest precedence, and those at the bottom are of lower precedence. This means 2+2*3
will be parsed as 2+(2*3)
rather than (2+2)*3
. The <assoc=right>
construction means the operator in right-associative, so 1^2^3
will be parsed as 1^(2^3)
rather than (1^2)^3
.
As you can see, it is much easier to specify operators with left-recursion, so Antlr 4 is of big help in these moments! I recommend re-writing your grammar to make use of this feature.
来源:https://stackoverflow.com/questions/26460013/antlr4-mutually-left-recursive-error-when-parsing