ANTLR4 negative lookahead in lexer

终归单人心 2020-12-19 22:39

I am trying to define lexer rules for PostgreSQL SQL.

The problem is with the operator definition and the line comments conflicting with each other.

for exam

1 Answer
  • 2020-12-19 23:36

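    You can use a semantic predicate in your lexer rules to perform lookahead (or lookbehind) without consuming characters. Inside the predicate, _input.LA(1) is the next character that has not yet been consumed. For example, the following rule matches an operator but stops before a "--" line comment or a "/*" block comment would begin.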

    OPERATOR
      : ( [+*<>=~!@#%^&|`?]
        | '-' {_input.LA(1) != '-'}?
        | '/' {_input.LA(1) != '*'}?
        )+
      ;
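
To make the effect of the two predicates concrete, here is a hand-written Java sketch of the same one-character lookahead, outside of ANTLR. The class name OperatorScanner and the method scanOperator are hypothetical names chosen for this illustration; they are not part of ANTLR or of the grammar above, and the loop only approximates what the generated lexer does.

```java
// Illustration only: a plain-Java approximation of the OPERATOR rule above.
// OperatorScanner and scanOperator are hypothetical names, not ANTLR API.
final class OperatorScanner {
    // Characters the rule accepts unconditionally.
    private static final String ALWAYS_OK = "+*<>=~!@#%^&|`?";

    /**
     * Returns the index just past the longest operator starting at start,
     * or start itself if no operator begins there.
     */
    static int scanOperator(String input, int start) {
        int i = start;
        while (i < input.length()) {
            char c = input.charAt(i);
            // Peek at the following character without consuming it.
            // '\0' stands in for end of input in this sketch.
            char next = (i + 1 < input.length()) ? input.charAt(i + 1) : '\0';
            if (ALWAYS_OK.indexOf(c) >= 0) {
                i++;
            } else if (c == '-' && next != '-') {
                i++; // a lone '-' belongs to the operator
            } else if (c == '/' && next != '*') {
                i++; // a '/' that does not open a block comment
            } else {
                break; // "--", "/*", or a non-operator character ends the token
            }
        }
        return i;
    }
}
```

For instance, scanning "<=--c" from index 0 stops at index 2, so the operator is "<=" and the "--" is left for the line comment rule.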
    

    However, the above rule does not address the restrictions on including a + or - at the end of an operator. To handle that in the easiest way possible, I would probably separate the two cases into separate rules.

    // this rule does not allow + or - at the end of an operator
    OPERATOR
      : ( [*<>=~!@#%^&|`?]
        | ( '+'
          | '-' {_input.LA(1) != '-'}?
          )+
          [*<>=~!@#%^&|`?]
        | '/' {_input.LA(1) != '*'}?
        )+
      ;
    
    // this rule allows + or - at the end of an operator and sets the token type to OPERATOR
    // it requires a character from the special subset [~!@#%^&|`?] to appear
    OPERATOR2
      : ( [*<>=+]
        | '-' {_input.LA(1) != '-'}?
        | '/' {_input.LA(1) != '*'}?
        )*
        [~!@#%^&|`?]
        OPERATOR?
        ( '+'
        | '-' {_input.LA(1) != '-'}?
        )+
        -> type(OPERATOR)
      ;
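
The restriction that the two rules encode can also be stated as a small check on an already-scanned operator: a trailing + or - is kept only if the operator contains a character from the special subset. The Java sketch below is one way to express that check, under the assumption that the operator text has already been extracted; OperatorTail and trimTrailingSigns are hypothetical names for this illustration and are not an alternative offered by the answer itself.

```java
// Illustration only: the trailing +/- restriction as a post-check.
// OperatorTail and trimTrailingSigns are hypothetical names.
final class OperatorTail {
    // The special subset used by OPERATOR2 above.
    private static final String SPECIAL = "~!@#%^&|`?";

    // True if the operator contains at least one special character.
    static boolean hasSpecial(String op) {
        for (int i = 0; i < op.length(); i++) {
            if (SPECIAL.indexOf(op.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    // Drops trailing '+' and '-' unless a special character permits them.
    // At least one character is always kept, so "+" and "-" stay operators.
    static String trimTrailingSigns(String op) {
        if (hasSpecial(op)) {
            return op;
        }
        int end = op.length();
        while (end > 1 && (op.charAt(end - 1) == '+' || op.charAt(end - 1) == '-')) {
            end--;
        }
        return op.substring(0, end);
    }
}
```

With this check, "<=-" is cut back to "<=", while "@-" is kept whole because "@" is in the special subset.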
    