how to resolve simple ambiguity

廉价感情. 提交于 2019-12-11 13:50:17

问题


I just started using Antlr and am stuck. I have the below grammar and am trying to resolve the ambiguity to parse input like Field:ValueString.

expression : Field ':' ValueString;
Field : Letter LetterOrDigit*;
ValueString : ~[:];
Letter : [a-zA-Z];
LetterOrDigit : [a-zA-Z0-9];
WS: [ \t\r\n\u000C]+ -> skip;

suppose a:b is passed in to the grammar, a and b are both identified as Field. How do I resolve this in Antlr4 (C#)?


回答1:


You can use a semantic predicate in your lexer rules to perform lookahead (or behind) without consuming characters (ANTLR4 negative lookahead in lexer)

In you case, to remove ambiguity, you can check if the char after the Field rule is : or you can check if the char before the ValueString is :.

Ïn the first case:

expression : Field ':' ValueString;
Field : Letter LetterOrDigit* {_input.LA(1) == ':'}?;
ValueString : ~[:];
Letter : [a-zA-Z];
LetterOrDigit : [a-zA-Z0-9];
WS: [ \t\r\n\u000C]+ -> skip;

In the second one (please note that Field and ValueString order have been inversed):

expression : Field ':' ValueString;
ValueString : {_input.LA(-1) == ':'}? ~[:];
Field : Letter LetterOrDigit*;
Letter : [a-zA-Z];
LetterOrDigit : [a-zA-Z0-9];
WS: [ \t\r\n\u000C]+ -> skip;

Also consider using fragment keyword for Letter and LetterOrDigit

fragment Letter : [a-zA-Z];
fragment LetterOrDigit : [a-zA-Z0-9];

"[With fragment keyword] You can also define rules that are not tokens but rather aid in the recognition of tokens. These fragment rules do not result in tokens visible to the parser." (source https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Lexer+Rules)




回答2:


A way to resolve this without lookahead is to simple define a parser rule that can be any of those two lexer tokens:

expression : Field ':' value;
value : Field | ValueString;
Field : Letter LetterOrDigit*;
ValueString : ~[:];
Letter : [a-zA-Z];
LetterOrDigit : [a-zA-Z0-9];
WS: [ \t\r\n\u000C]+ -> skip;

The parser will work as expected and the grammar is kept simple, but you may need to add a method to your visitor or listener implementation.



来源:https://stackoverflow.com/questions/28493732/how-to-resolve-simple-ambiguity

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!