ANTLR Lua long string grammar rules

旧巷老猫 提交于 2019-12-24 07:39:20

问题


I'm trying to create ANTLR parser for Lua. So i took grammar produced by Nicolai Mainero(available at ANTLR's site, Lua 5.1 grammar) and begin to work.

Grammar is good. One thing not working: LONG STRINGS.

Lua specification rule: Literal

strings can also be defined using a long format enclosed by long brackets. We define an opening long bracket of level n as an opening square bracket followed by n equal signs followed by another opening square bracket. So, an opening long bracket of level 0 is written as [[, an opening long bracket of level 1 is written as [=[, and so on. A closing long bracket is defined similarly; for instance, a closing long bracket of level 4 is written as ]====]. A long string starts with an opening long bracket of any level and ends at the first closing long bracket of the same level. Literals in this bracketed form can run for several lines, do not interpret any escape sequences, and ignore long brackets of any other level. They can contain anything except a closing bracket of the proper level.the proper level.

My question is close by meaning to this but tools are different.

Some little example of LONGSTRING:

local a = [==[ Some interesting string [=[ sub string in string ]=] [hello indexes] [[And some line strings]] ]==] - its correct string. 
local f = [==[ Not interesting string ]=] - incorrect string

Here my rule for LONGSTRING with out '=' symbol:

LONGSTRING: '[[' (~(']') | ']'(~(']')))* ']]';

Can somebody help me? Thanks!


回答1:


I once wrote a Lua grammar according the specs and solved it like this:

grammar Lua;

// ... options ...

// ... tokens ...

@lexer::members {
    public boolean noCloseAhead(int numEqSigns) {
        if(input.LA(1) != ']') return true;
        for(int i = 2; i < numEqSigns+2; i++) {
            if(input.LA(i) != '=') return true;
        }
        return input.LA(numEqSigns+2) != ']';
    }

    public void matchClose(int numEqSigns) throws MismatchedTokenException {
        StringBuilder eqSigns = new StringBuilder();
        for(int i = 0; i < numEqSigns; i++) {
            eqSigns.append('=');
        }
        match("]"+eqSigns+"]");
    }
}

// ... parser rules ...

String
  :  '"'  (~('"'  | '\\') | EscapeSequence)* '"'
  |  '\'' (~('\'' | '\\') | EscapeSequence)* '\''
  |  LongBracket
  ;

Comment
  :  (BlockComment | LineComment) {skip();}
  ;

fragment
BlockComment
  :  '--' LongBracket 
  ;

fragment
LineComment
  :  '--' ~('\r' | '\n')* ('\r'? '\n' | EOF) 
  ;

fragment
LongBracket
@init {int openEq = 0;}
  :  '[' ('=' {openEq++;})* '[' ({noCloseAhead(openEq)}?=> .)* {matchClose(openEq);}
  ;

// ... more lexer rules ...

Be careful with what you find on the ANTLR Wiki! As the name suggests: it's a Wiki and one can post stuff fairly easy. The Lua grammar you mention is a nice start, but has quite a bit of errors in it (binary or hex literals are incorrect as well, at least, at the time I looked at it...).



来源:https://stackoverflow.com/questions/4065327/antlr-lua-long-string-grammar-rules

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!