问题
I am working on my Antlr grammar to parse polynomial functions in multiple variables using Java. Examples for legal input are
42; X; +42X; Y^42; 1337HelloWorld; 13,37X^42;
The following grammar does compile without warnings or errors:
grammar Function;
parseFunction returns [java.util.List<java.util.List<Object>> list] :
{ list = new java.util.ArrayList(); } ( f=functionPart { list.add($f.list); } )+
| { list = new java.util.ArrayList(); } ( fb=functionBegin ) { list.add($fb.list); } ( f=functionPart { list.add($f.list); } )*
;
functionBegin returns [java.util.List<Object> list]:
m=NUMBER v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); list.add($v.text); list.add($e.value); }
| m=NUMBER v=VARIABLE { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); list.add($v.text); }
| v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add("+"); list.add("1"); list.add($v.text); list.add($e.value); }
| v=VARIABLE { list = new java.util.ArrayList(); list.add("+"); list.add("1"); list.add($v.text); }
| m=NUMBER { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); }
;
functionPart returns [java.util.List<Object> list] :
s=SIGN m=NUMBER v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); list.add($v.text); list.add($e.value); }
| s=SIGN m=NUMBER v=VARIABLE { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); list.add($v.text); }
| s=SIGN v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add($s.text); list.add("1"); list.add($v.text); list.add($e.value); }
| s=SIGN v=VARIABLE { list = new java.util.ArrayList(); list.add($s.text); list.add("1"); list.add($v.text); }
| s=SIGN m=NUMBER { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); }
;
exponent returns [int value]: ('^' n=INTEGER) { $value = 1; if ( $n != null && $n.text.length() > 0) $value = Integer.parseInt($n.text); }
;
VARIABLE : ('a'..'z'|'A'..'Z')+
;
INTEGER : ('0'..'9')+
;
NUMBER : ('0'..'9')+ (','('0'..'9')+)?
;
SIGN : ('+'|'-')
;
WS : (' ' | '\t' | '\r'| '\n')+ {skip();}
;
This grammar, if compiled and used in Java does accept most input values. Apparently, not all valid input values are accepted. As soon as a number not using a comma pops up, like the inputs
+42; 42; 42X^1337;
an error is thrown (error from input "+42"):
line 1:1 no viable alternative at input '+'
The error is not thrown if I modify the inputs to
+42,0; 42,0; 42,0X^1337
Can anyone say, why and how to fix it?
回答1:
The first lexer rule with the longest match wins, thus 42
is an INTEGER
, and NUMBER
in fact only matches when the comma part is present, i.e. when NUMBER
has a longer match than INTEGER
.
This can be fixed by adding a parser rule
number : NUMBER | INTEGER ;
and using that instead of NUMBER
from other parser rules.
来源:https://stackoverflow.com/questions/11736681/generated-antlr-parser-in-java-not-all-inputs-are-read