How can I modify the text of tokens in a CommonTokenStream with ANTLR?

梦如初夏 2021-02-09 13:38

I'm trying to learn ANTLR and at the same time use it for a current project.

I've gotten to the point where I can run the lexer on a chunk of code and output it to a C

4 Answers
  •  旧时难觅i
    2021-02-09 14:15

    ANTLR has a way to do this in its grammar file.

    Let's say you're parsing input consisting of numbers and strings delimited by commas. A grammar could look like this:

    grammar Foo;
    
    parse
      :  value ( ',' value )* EOF
      ;
    
    value
      :  Number
      |  String
      ;
    
    String
      :  '"' ( ~( '"' | '\\' ) | '\\\\' | '\\"' )* '"'
      ;
    
    Number
      :  '0'..'9'+
      ;
    
    Space
      :  ( ' ' | '\t' ) {skip();}
      ;
    

    This should all look familiar to you. Let's say you want to wrap square brackets around all integer values. Here's how to do that:

    grammar Foo;
    
    options {output=template; rewrite=true;} 
    
    parse
      :  value ( ',' value )* EOF
      ;
    
    value
      :  n=Number -> template(num={$n.text}) "[<num>]"
      |  String
      ;
    
    String
      :  '"' ( ~( '"' | '\\' ) | '\\\\' | '\\"' )* '"'
      ;
    
    Number
      :  '0'..'9'+
      ;
    
    Space
      :  ( ' ' | '\t' ) {skip();}
      ;
    

    As you can see, I've added some options at the top, and a rewrite rule (everything after the ->) following Number in the value parser rule.

    Now to test it all, compile and run this class:

    import org.antlr.runtime.*;
    
    public class FooTest {
      public static void main(String[] args) throws Exception {
        String text = "12, \"34\", 56, \"a\\\"b\", 78";
        System.out.println("parsing: "+text);
        ANTLRStringStream in = new ANTLRStringStream(text);
        FooLexer lexer = new FooLexer(in);
        CommonTokenStream tokens = new TokenRewriteStream(lexer); // Note: a TokenRewriteStream!
        FooParser parser = new FooParser(tokens);
        parser.parse();
        System.out.println("tokens: "+tokens.toString());
      }
    }
    

    which produces:

    parsing: 12, "34", 56, "a\"b", 78
    tokens: [12],"34",[56],"a\"b",[78]
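
To make the mechanism behind that output a bit more concrete: a rewriting token stream leaves the original tokens untouched and only records replacements, which are applied when the text is rendered. Below is a minimal plain-Java sketch of that idea, assuming nothing beyond the JDK. It is not ANTLR's implementation; `RewriteSketch`, `RewriteBuffer`, `tokenize` and `rewriteNumbers` are hypothetical names, and the hand-written scanner only approximates the lexer rules of the Foo grammar.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RewriteSketch {

    // Hypothetical stand-in for a rewriting token buffer (not an ANTLR class).
    static final class RewriteBuffer {
        private final List<String> tokens;
        private final Map<Integer, String> replacements = new HashMap<>();

        RewriteBuffer(List<String> tokens) {
            this.tokens = new ArrayList<>(tokens);
        }

        // Record a replacement; the token list itself is never modified.
        void replace(int index, String text) {
            replacements.put(index, text);
        }

        // Text of the tokens exactly as they were scanned.
        String originalText() {
            return String.join("", tokens);
        }

        // Text with the recorded replacements applied at render time.
        String rewrittenText() {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < tokens.size(); i++) {
                out.append(replacements.getOrDefault(i, tokens.get(i)));
            }
            return out.toString();
        }
    }

    // Rough approximation of the Foo lexer: Number, String, ',' and skipped spaces/tabs.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t') {
                i++; // skipped, like the Space rule
            } else if (c == ',') {
                tokens.add(",");
                i++;
            } else if (c >= '0' && c <= '9') {
                int j = i;
                while (j < input.length() && Character.isDigit(input.charAt(j))) {
                    j++;
                }
                tokens.add(input.substring(i, j));
                i = j;
            } else if (c == '"') {
                int j = i + 1;
                while (j < input.length() && input.charAt(j) != '"') {
                    if (input.charAt(j) == '\\') {
                        j++; // step over the escaped character
                    }
                    j++;
                }
                if (j >= input.length()) {
                    throw new IllegalArgumentException("unterminated string at " + i);
                }
                tokens.add(input.substring(i, j + 1));
                i = j + 1;
            } else {
                throw new IllegalArgumentException("unexpected character at " + i);
            }
        }
        return tokens;
    }

    // Wrap every number token in square brackets by recording replacements.
    static RewriteBuffer rewriteNumbers(String input) {
        List<String> tokens = tokenize(input);
        RewriteBuffer buffer = new RewriteBuffer(tokens);
        for (int i = 0; i < tokens.size(); i++) {
            if (Character.isDigit(tokens.get(i).charAt(0))) {
                buffer.replace(i, "[" + tokens.get(i) + "]");
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        RewriteBuffer buffer = rewriteNumbers("12, \"34\", 56, \"a\\\"b\", 78");
        System.out.println("original:  " + buffer.originalText());
        System.out.println("rewritten: " + buffer.rewrittenText());
    }
}
```

Because the replacements are kept apart from the token list, the original text stays available after rewriting, which is the property that makes this approach convenient for source-to-source tweaks.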
    
