How to implement a function call with Antlr so that it can be called even before it is defined?

情歌与酒 2021-02-09 15:00

Once the AST is built, what is the best way to implement the tree walker so that functions can be defined and called in any order?

For example, this is valid in PHP:

1 Answer
  • 2021-02-09 15:19

    Yes, you are right: this is done in more than one pass over the AST.

    You first create a grammar that builds an AST of the source, then you create a tree grammar that iterates over the tree and discovers all defined functions. You could then evaluate the script using another tree grammar that takes the discovered functions from the previous tree grammar.

    A demo.

    Take the source:

    <?php
    f(); // function called before it’s defined
    function f() {
      g();
    }
    function g() {}
    ?>
    

    which is parsed into the following AST:

    [image: diagram of the AST]

    using the (combined) grammar:

    grammar PHPMin;
    
    options { 
      output=AST; 
    }
    
    tokens {
      SCRIPT; F_CALL; F_DECL; F_BODY;
    }
    
    parse
      :  script EOF -> script
      ;
    
    script
      :  '<?php' atom* '?>' -> ^(SCRIPT atom*)
      ;
    
    atom
      :  functionCall
      |  functionDecl
      ;
    
    functionCall
      :  Identifier '(' ')' ';' -> ^(F_CALL Identifier)
      ;
    
    functionDecl
      :  'function' Identifier '(' ')' '{' functionBody '}' -> ^(F_DECL Identifier functionBody)
      ;
    
    functionBody
      :  functionCall* -> ^(F_BODY functionCall*)
      ;
    
    Identifier  : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')* ;
    LineComment : '//' ~('\r' | '\n')* ('\r'? '\n' | EOF){skip();} ;
    Space       : (' ' | '\t' | '\r' | '\n'){skip();} ;
    

    Then discover the declared functions using a "tree-walker" generated from the following tree grammar:

    tree grammar PHPMinFunctionWalker;
    
    options {
        tokenVocab=PHPMin;
        ASTLabelType=CommonTree;
    }
    
    @members {
        java.util.Set<String> declared = new java.util.HashSet<String>();
    }
    
    discover
      :  script
      ;
    
    script
      :  ^(SCRIPT atom*)
      ;
    
    atom
      :  functionCall
      |  functionDecl
      ;
    
    functionCall
      :  ^(F_CALL Identifier)
      ;
    
    functionDecl
      :  ^(F_DECL Identifier functionBody) {declared.add($Identifier.text);}
      ;
    
    functionBody
      :  ^(F_BODY functionCall*)
      ;
    

    To test it all, create a lexer and parser (A), generate the "tree-walker" (B), compile all source files (C):

    # A
    java -cp antlr-3.2.jar org.antlr.Tool PHPMin.g
    
    # B
    java -cp antlr-3.2.jar org.antlr.Tool PHPMinFunctionWalker.g
    
    # C
    javac -cp antlr-3.2.jar *.java
    
    # D (*nix)
    java -cp .:antlr-3.2.jar Main
    
    # D (Windows)
    java -cp .;antlr-3.2.jar Main
    

    and run the following main class (D):

    import org.antlr.runtime.*;
    import org.antlr.runtime.tree.*;
    
    public class Main {
    
        public static void main(String[] args) throws Exception {
    
            String source = "<?php                                          \n" + 
                            "f(); // function called before it’s defined    \n" + 
                            "function f() {                                 \n" + 
                            "  g();                                         \n" + 
                            "}                                              \n" + 
                            "function g() {}                                \n" + 
                            "?>                                             \n";
    
            // create a lexer and parser for the source
            ANTLRStringStream in = new ANTLRStringStream(source);
            PHPMinLexer lexer = new PHPMinLexer(in);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            PHPMinParser parser = new PHPMinParser(tokens);
            PHPMinParser.parse_return returnValue = parser.parse();
            CommonTree tree = (CommonTree)returnValue.getTree();
    
            // create a tree walker to discover all declared functions
            CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
            nodes.setTokenStream(tokens);
            PHPMinFunctionWalker functions = new PHPMinFunctionWalker(nodes);
            functions.discover();
            System.out.println("Declared functions: "+functions.declared);
        }
    }
    

    which produces the following output:

    Declared functions: [f, g]
    

    Of course, this is just an example of how to approach it, not of how it is best done. I can imagine (when using Java to interpret the script), you wouldn't store the declared functions as simple Strings in a Set<String>, but rather as a Map<String, CommonTree> to easily get the root of a function and evaluate it when called.
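
    The Map idea can be sketched in plain Java without ANTLR. This is only an illustration: the Node class is a hypothetical stand-in for CommonTree, and the names discover, evaluate and call are invented for this sketch. The first pass maps each function name to the root of its declaration, and the second pass resolves every call through that map, so the order of declaration no longer matters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FunctionTableDemo {

    // Hypothetical stand-in for an AST node (CommonTree plays this role in the answer).
    static final class Node {
        final String type;   // "SCRIPT", "F_CALL", "F_DECL" or "F_BODY"
        final String name;   // function name for F_CALL / F_DECL, otherwise null
        final List<Node> children = new ArrayList<>();

        Node(String type, String name, Node... kids) {
            this.type = type;
            this.name = name;
            for (Node k : kids) children.add(k);
        }
    }

    // Pass 1: map every declared function name to the root of its F_DECL subtree.
    static Map<String, Node> discover(Node script) {
        Map<String, Node> table = new LinkedHashMap<>();
        for (Node atom : script.children) {
            if (atom.type.equals("F_DECL")) table.put(atom.name, atom);
        }
        return table;
    }

    // Pass 2: run the top-level calls, resolving each callee through the table.
    static List<String> evaluate(Node script, Map<String, Node> table) {
        List<String> trace = new ArrayList<>();
        for (Node atom : script.children) {
            if (atom.type.equals("F_CALL")) call(atom.name, table, trace);
        }
        return trace;
    }

    // "Executes" a function by recording its name and then running the calls in its body.
    // Note: a self-recursive script would recurse forever, since this toy language has no conditions.
    static void call(String name, Map<String, Node> table, List<String> trace) {
        Node decl = table.get(name);
        if (decl == null) throw new RuntimeException("no such function: " + name);
        trace.add(name);
        Node body = decl.children.get(0); // the F_BODY child
        for (Node c : body.children) call(c.name, table, trace);
    }

    // Same shape as the AST for: f(); function f() { g(); } function g() {}
    static Node sample() {
        return new Node("SCRIPT", null,
            new Node("F_CALL", "f"),
            new Node("F_DECL", "f", new Node("F_BODY", null, new Node("F_CALL", "g"))),
            new Node("F_DECL", "g", new Node("F_BODY", null)));
    }

    public static void main(String[] args) {
        Node script = sample();
        Map<String, Node> table = discover(script);
        System.out.println("Declared: " + table.keySet());              // Declared: [f, g]
        System.out.println("Call trace: " + evaluate(script, table));   // Call trace: [f, g]
    }
}
```

    With real ANTLR trees the same table would hold CommonTree roots instead of Node objects, and the evaluating walker would look the callee up in it when it reaches an F_CALL.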

    Further reading: http://www.antlr.org/wiki/display/ANTLR3/Simple+tree-based+interpeter

    Good luck!

    EDIT

    The second pass could then check that every called function is declared, using the set collected by the previous tree walker:

    tree grammar PHPMinValidateWalker;
    
    options {
        tokenVocab=PHPMin;
        ASTLabelType=CommonTree;
    }
    
    @members {
        java.util.Set<String> declared = new java.util.HashSet<String>();
    }
    
    validate
      :  script
      ;
    
    script
      :  ^(SCRIPT atom*)
      ;
    
    atom
      :  functionCall
      |  functionDecl
      ;
    
    functionCall
      :  ^(F_CALL Identifier) 
         {
           if(!declared.contains($Identifier.text)) {
             throw new RuntimeException("no such function: " +  $Identifier.text);
           }
         }
      ;
    
    functionDecl
      :  ^(F_DECL Identifier functionBody)
      ;
    
    functionBody
      :  ^(F_BODY functionCall*)
      ;
    

    Using the test:

    import org.antlr.runtime.*;
    import org.antlr.runtime.tree.*;
    
    public class Main {
    
        public static void main(String[] args) throws Exception {
    
            String source = "<?php                                          \n" + 
                            "f(); // function called before it’s defined    \n" + 
                            "function f() {                                 \n" + 
                            "  g();                                         \n" + 
                            "  x();                                         \n" + 
                            "}                                              \n" + 
                            "function g() {}                                \n" + 
                            "?>                                             \n";
    
            // create a lexer and parser for the source
            ANTLRStringStream in = new ANTLRStringStream(source);
            PHPMinLexer lexer = new PHPMinLexer(in);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            PHPMinParser parser = new PHPMinParser(tokens);
            PHPMinParser.parse_return returnValue = parser.parse();
            CommonTree tree = (CommonTree)returnValue.getTree();
    
            // create a tree walker to discover all declared functions
            CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
            nodes.setTokenStream(tokens);
            PHPMinFunctionWalker functions = new PHPMinFunctionWalker(nodes);
            functions.discover();
            System.out.println("Declared functions: "+functions.declared);
    
            // second pass: check that every called function was declared
            nodes = new CommonTreeNodeStream(tree);
            nodes.setTokenStream(tokens);
            PHPMinValidateWalker validator = new PHPMinValidateWalker(nodes);
            validator.declared = functions.declared;
            validator.validate();
        }
    }
    

    produces an exception since x() is not defined anywhere. Removing the call from the source makes the tree walker finish without an exception.
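
    A variation on the validation pass, sketched in plain Java rather than as a tree grammar: instead of throwing at the first unknown callee, it gathers all of them so they can be reported together. The Node class and the method names here are hypothetical and are not part of ANTLR or of the grammars above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ValidateAllDemo {

    // Hypothetical stand-in for an AST node; not an ANTLR class.
    static final class Node {
        final String type;   // "SCRIPT", "F_CALL", "F_DECL" or "F_BODY"
        final String name;   // function name for F_CALL / F_DECL, otherwise null
        final List<Node> children = new ArrayList<>();

        Node(String type, String name, Node... kids) {
            this.type = type;
            this.name = name;
            for (Node k : kids) children.add(k);
        }
    }

    // Pass 1: walk the whole tree and record the name of every F_DECL.
    static void collectDeclared(Node n, Set<String> declared) {
        if (n.type.equals("F_DECL")) declared.add(n.name);
        for (Node c : n.children) collectDeclared(c, declared);
    }

    // Pass 2: walk the whole tree again and record every F_CALL whose name was never declared.
    static void collectUndefined(Node n, Set<String> declared, Set<String> missing) {
        if (n.type.equals("F_CALL") && !declared.contains(n.name)) missing.add(n.name);
        for (Node c : n.children) collectUndefined(c, declared, missing);
    }

    // Runs both passes and returns the undefined callees in order of first appearance.
    static Set<String> undefinedCalls(Node script) {
        Set<String> declared = new LinkedHashSet<>();
        collectDeclared(script, declared);
        Set<String> missing = new LinkedHashSet<>();
        collectUndefined(script, declared, missing);
        return missing;
    }

    // Same shape as the AST for: f(); function f() { g(); x(); } function g() {}
    static Node sample() {
        return new Node("SCRIPT", null,
            new Node("F_CALL", "f"),
            new Node("F_DECL", "f",
                new Node("F_BODY", null, new Node("F_CALL", "g"), new Node("F_CALL", "x"))),
            new Node("F_DECL", "g", new Node("F_BODY", null)));
    }

    public static void main(String[] args) {
        System.out.println("Undefined functions: " + undefinedCalls(sample())); // Undefined functions: [x]
    }
}
```

    In the tree grammar, the equivalent change would be to add the name to a collection inside the functionCall action instead of throwing, and to inspect that collection after validate() returns.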
