Question
I have the following grammar and am trying to start out slowly, working up to more complex arguments.
grammar Command;
commands : command+ EOF;
command : NAME args NL;
args : arg | ;
arg : DASH LOWER | LOWER;
//arg : DASH 'a' | 'x';
NAME : [_a-zA-Z0-9]+;
NL : '\n';
WS : [ \t\r]+ -> skip ; // spaces, tabs, carriage returns
DASH : '-';
LOWER: [a-z];//'a' .. 'z';
I was hoping (for now) to parse files like this:
cmd1
cmd3 -a
If I run that input through grun I get an error:
$ java org.antlr.v4.gui.TestRig Command commands -tree
...
`line 3:6 mismatched input 'a' expecting LOWER`
It seems like LOWER should match 'a'. If I change the arg definition to the commented-out line, it works fine and I get '-a' as an arg. What's the difference between using LOWER and using 'a' explicitly?
Answer 1:
As soon as you get a "mismatched" error, add -tokens to grun to display the tokens. It helps find the discrepancy between what you THINK the lexer will do and what it actually DOES. With your grammar:
$ alias grun='java org.antlr.v4.gui.TestRig'
$ grun Command commands -tokens -diagnostics t.text
[@0,0:3='cmd1',<NAME>,1:0]
[@1,4:4='\n',<'
'>,1:4]
[@2,5:8='cmd3',<NAME>,2:0]
[@3,10:10='-',<'-'>,2:5]
[@4,11:11='a',<NAME>,2:6]
[@5,12:12='\n',<'
'>,2:7]
[@6,13:12='<EOF>',<EOF>,3:0]
line 2:6 mismatched input 'a' expecting LOWER
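you immediately see that the letter a is a NAME and not the expected LOWER. Both NAME and LOWER match the single character 'a' with the same length, and in that case the lexer picks the rule defined first in the grammar, which here is NAME. The literal 'a' in the commented-out parser rule works because ANTLR creates an implicit token for it, and implicit tokens take precedence over the explicit lexer rules.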
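The tie-break can be illustrated with a small toy lexer. This is only a sketch in Python, not ANTLR itself: the `tokenize` helper and the two rule lists are hypothetical names invented for the illustration, and the sketch assumes the usual lexer behavior of "longest match wins, and on equal length the rule listed first wins".

```python
import re

def tokenize(text, rules):
    """Toy lexer: longest match wins; ties go to the rule listed first."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_name != "WS":  # mimic '-> skip'
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Rule order as in the question's grammar: NAME precedes LOWER.
question_rules = [
    ("NAME", r"[_a-zA-Z0-9]+"),
    ("NL", r"\n"),
    ("WS", r"[ \t\r]+"),
    ("DASH", r"-"),
    ("LOWER", r"[a-z]"),
]

# Same rules with LOWER moved ahead of NAME.
reordered_rules = [
    ("LOWER", r"[a-z]"),
    ("NAME", r"[_a-zA-Z0-9]+"),
    ("DASH", r"-"),
    ("NL", r"\n"),
    ("WS", r"[ \t\r]+"),
]

line = "cmd3 -a\n"
print(tokenize(line, question_rules))   # third token is ('NAME', 'a')
print(tokenize(line, reordered_rules))  # third token is ('LOWER', 'a')
```

With LOWER listed first, 'cmd3' is still a NAME because the longer match wins, but any single lowercase letter becomes LOWER, including a one-letter command name such as 'a'.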
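Also watch out for rules with an empty alternative:
args
: arg
|
;
These may lead to problems in some circumstances. I prefer to explicitly add the ? suffix, which means zero or one time. So my solution, which also moves LOWER ahead of NAME so that a single lowercase letter is tokenized as LOWER, would be: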
grammar Command;
commands
@init {System.out.println("Question last update 1829");}
: command+ EOF
;
command
: NAME args? NL
;
args
: arg
;
arg : DASH? LOWER ;
LOWER : [a-z] ;
NAME : [_a-zA-Z0-9]+;
DASH : '-' ;
NL : '\n' ;
WS : [ \t\r]+ -> skip ;
Execution:
$ grun Command commands -tokens -diagnostics t.text
[@0,0:3='cmd1',<NAME>,1:0]
[@1,4:4='\n',<'
'>,1:4]
[@2,5:8='cmd3',<NAME>,2:0]
[@3,10:10='-',<'-'>,2:5]
[@4,11:11='a',<LOWER>,2:6]
[@5,12:12='\n',<'
'>,2:7]
[@6,13:12='<EOF>',<EOF>,3:0]
Question last update 1829
Source: https://stackoverflow.com/questions/46527648/antlr4-cant-extract-literal-into-token