ANTLR lexer rule consumes characters even if not matched?

Submitted by 笑着哭i on 2019-12-05 10:16:13

The '0' is discarded by the lexer and the following errors are produced:

line 1:3 no viable alternative at character '.'
line 1:2 extraneous input '..' expecting INTEGER

This happens because when the lexer encounters '0.', it commits to creating a FLOAT token but cannot complete it. Since no other rule can match '0.', it reports the errors, discards the '0' and creates a DOT token.

This is simply how ANTLR's lexer works: it will not backtrack to match an INTEGER followed by a DDOTS (note that backtrack=true only applies to parser rules!).

Inside the FLOAT rule, you must make sure that an INTEGER token is produced instead when a double '.' is ahead. You can do that by adding a syntactic predicate (the ('..')=> part) and by producing FLOAT tokens only when a single '.' is followed by a digit (the ('.' DIGIT)=> part). See the following demo:

declaration
 : LBRACEVAR INTEGER DDOTS INTEGER RBRACEVAR
 ;

LBRACEVAR : '[';
RBRACEVAR : ']';
DOT       : '.';
DDOTS     : '..';

INTEGER
 : DIGIT+
 ;

FLOAT
 : DIGIT+ ( ('.' DIGIT)=> '.' DIGIT+ EXP? 
          | ('..')=>      {$type=INTEGER;} // change the token here
          |               EXP
          )
 ;

fragment EXP   : ('e' | 'E') DIGIT+;
fragment DIGIT : ('0'..'9');
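
To make the lookahead decisions in the FLOAT rule easier to follow, here is a minimal hand-written sketch in Python of the same idea. It is not how ANTLR generates or runs its lexer. The names tokenize and match_exp are hypothetical helpers invented for this illustration, and the fall-through to INTEGER for a bare digit run stands in for the grammar's separate INTEGER rule.

```python
DIGITS = "0123456789"


def match_exp(text, i):
    """Return the index just past an EXP ('e' or 'E' plus digits) at i, or i if none."""
    if i < len(text) and text[i] in "eE":
        k = i + 1
        while k < len(text) and text[k] in DIGITS:
            k += 1
        if k > i + 1:  # at least one digit followed the 'e'
            return k
    return i


def tokenize(text):
    """Split text into (type, lexeme) pairs using explicit lookahead."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == "[":
            tokens.append(("LBRACEVAR", ch))
            i += 1
        elif ch == "]":
            tokens.append(("RBRACEVAR", ch))
            i += 1
        elif text.startswith("..", i):
            tokens.append(("DDOTS", ".."))
            i += 2
        elif ch == ".":
            tokens.append(("DOT", ch))
            i += 1
        elif ch in DIGITS:
            start = i
            while i < n and text[i] in DIGITS:
                i += 1
            if i + 1 < n and text[i] == "." and text[i + 1] in DIGITS:
                # Mirrors ('.' DIGIT)=> : a single '.' followed by a digit.
                i += 1
                while i < n and text[i] in DIGITS:
                    i += 1
                i = match_exp(text, i)  # optional EXP
                kind = "FLOAT"
            elif text.startswith("..", i):
                # Mirrors ('..')=> : leave the dots alone and emit INTEGER.
                kind = "INTEGER"
            else:
                # Mirrors the EXP alternative; otherwise a plain INTEGER.
                j = match_exp(text, i)
                kind = "FLOAT" if j > i else "INTEGER"
                i = j
            tokens.append((kind, text[start:i]))
        else:
            raise ValueError(f"unexpected character {ch!r} at index {i}")
    return tokens
```

For the input [0..9], the digit branch sees '..' ahead and stops after the '0', so the two dots remain available for the DDOTS token instead of being pulled into a failed FLOAT match.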