Tutorial for walking ANTLR ASTs in C#?

一生所求 2021-01-31 10:06

Is anyone aware of tutorials for walking ANTLR-generated ASTs in C#? The closest I was able to find is this, but it's not terribly helpful.

My goal is to walk through t

4 Answers
  • 2021-01-31 10:41

    Normally you walk ASTs with recursion, performing different actions based on the kind of node. If you're using polymorphic tree nodes (i.e. different subclasses for different nodes in the tree), then double dispatch via the Visitor pattern may be appropriate; however, that's usually not very convenient with ANTLR.

    In pseudocode, walking usually looks somewhat like this:

    func processTree(t)
        case t.Type of
            FOO: processFoo t
            BAR: processBar t
        end
    
    // a post-order process
    func processFoo(foo)
        // visit children
        for (i = 0; i < foo.ChildCount; ++i)
            processTree(foo.GetChild(i))
        // visit node
        do_stuff(foo.getText())
    
    // a pre-order process
    func processBar(bar)
        // visit node
        do_stuff(bar.getText())
        // visit children
        for (i = 0; i < bar.ChildCount; ++i)
            processTree(bar.GetChild(i))
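
    The pseudocode above could be sketched in C# roughly as follows. This is only an illustration: Node is a hypothetical stand-in for an ANTLR tree node (it is not the runtime's class), FOO and BAR are made-up token types, and collecting node text into a list takes the place of do_stuff.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an ANTLR tree node; not the real runtime class.
public sealed class Node
{
    public const int FOO = 1; // made-up token types for illustration
    public const int BAR = 2;

    public int Type { get; }
    public string Text { get; }
    private readonly List<Node> children = new List<Node>();

    public Node(int type, string text, params Node[] kids)
    {
        Type = type;
        Text = text;
        children.AddRange(kids);
    }

    public int ChildCount => children.Count;
    public Node GetChild(int i) => children[i];
}

public static class Walker
{
    // Dispatch on the node type, as processTree does in the pseudocode.
    public static void ProcessTree(Node t, List<string> output)
    {
        switch (t.Type)
        {
            case Node.FOO: ProcessFoo(t, output); break;
            case Node.BAR: ProcessBar(t, output); break;
            default: throw new ArgumentException("Unknown node type: " + t.Type);
        }
    }

    // Post-order: visit the children first, then the node itself.
    private static void ProcessFoo(Node foo, List<string> output)
    {
        for (int i = 0; i < foo.ChildCount; ++i)
            ProcessTree(foo.GetChild(i), output);
        output.Add(foo.Text);
    }

    // Pre-order: visit the node first, then its children.
    private static void ProcessBar(Node bar, List<string> output)
    {
        output.Add(bar.Text);
        for (int i = 0; i < bar.ChildCount; ++i)
            ProcessTree(bar.GetChild(i), output);
    }
}
```

    For a FOO root "f1" whose children are a BAR "b1" (with one FOO child "f2") and a FOO "f3", the collected order is b1, f2, f3, f1: the BAR node is recorded before its child, while each FOO node is recorded after its children.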
    

    The kind of processing depends heavily on the semantics of the language. For example, when generating code for a stack machine like the JVM or CLR, handling an IF statement with the structure (IF <predicate> <if-true> [<if-false>]) might look somewhat like this:

    func processIf(n)
        predicate = n.GetChild(0)
        processExpr(predicate) // get predicate value on stack
        falseLabel = createLabel()
        genCode(JUMP_IF_FALSE, falseLabel) // JUMP_IF_FALSE is called brfalse in CLR,
                                           // ifeq in JVM
        if_true = n.GetChild(1)
        processStmt(if_true)
        if_false = n.ChildCount > 2 ? n.GetChild(2) : null
        if (if_false != null)
            doneLabel = createLabel()
            genCode(JUMP, doneLabel)
        markLabel(falseLabel)
        if (if_false != null)
            processStmt(if_false) // if-false branch
            markLabel(doneLabel)
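
    As a rough C# sketch of the same label handling: everything here is hypothetical. AstNode is a minimal stand-in for a tree node, the instruction strings are invented mnemonics rather than real CLR or JVM opcodes, and ProcessExpr/ProcessStmt are placeholders that emit a single line instead of recursing into real sub-trees.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal AST node exposing the GetChild/ChildCount shape used above.
public sealed class AstNode
{
    public string Text { get; }
    private readonly List<AstNode> children = new List<AstNode>();

    public AstNode(string text, params AstNode[] kids)
    {
        Text = text;
        children.AddRange(kids);
    }

    public int ChildCount => children.Count;
    public AstNode GetChild(int i) => children[i];
}

public sealed class CodeGen
{
    // Emitted pseudo-instructions, in order (invented mnemonics).
    public List<string> Code { get; } = new List<string>();
    private int nextLabel;

    private string CreateLabel() => "L" + nextLabel++;
    private void Emit(string instruction) => Code.Add(instruction);
    private void MarkLabel(string label) => Code.Add(label + ":");

    // Placeholders: a real generator would recurse into these sub-trees.
    private void ProcessExpr(AstNode expr) => Emit("push " + expr.Text);
    private void ProcessStmt(AstNode stmt) => Emit("exec " + stmt.Text);

    // Structure: (IF <predicate> <if-true> [<if-false>])
    public void ProcessIf(AstNode n)
    {
        ProcessExpr(n.GetChild(0));             // predicate value on the stack
        string falseLabel = CreateLabel();
        Emit("jump_if_false " + falseLabel);
        ProcessStmt(n.GetChild(1));             // if-true branch
        if (n.ChildCount > 2)
        {
            string doneLabel = CreateLabel();
            Emit("jump " + doneLabel);          // skip over the if-false branch
            MarkLabel(falseLabel);
            ProcessStmt(n.GetChild(2));         // if-false branch
            MarkLabel(doneLabel);
        }
        else
        {
            MarkLabel(falseLabel);              // no else: fall through here
        }
    }
}
```

    Splitting on ChildCount keeps doneLabel scoped to the branch that needs it, which avoids the two separate null checks in the pseudocode while emitting the same sequence.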
    

    Generally, everything is done recursively, dispatching on the type of the current node.

  • 2021-01-31 10:41

    I did something similar (but not really) and I ended up with a TreeParser.

    I also suggest buying the ANTLR book. I found it to be more valuable than any web resource. It may not have all the answers but it sure helps with the basics.

  • 2021-01-31 10:50

    I managed to figure this out by adapting the example at the end of Manuel Abadia's article.

    Here's my version, which I happen to be using to convert parsed code to C#. These are the steps:

    1. Instantiate an ANTLRStringStream or subclass with your input (it can be a file or string).
    2. Instantiate your generated lexer, passing in that string stream.
    3. Instantiate a token stream with the lexer.
    4. Instantiate your parser with that token stream.
    5. Get the top-level value from your parser, and turn it into a CommonTree.
    6. Traverse the tree:

    To get the literal text of a node, use node.Text. To get the token name of a node, use node.Token.Text.

    Note that node.Token.Text will only give you the actual name of your token if it's an imaginary token with no corresponding string. If it's a real token, then node.Token.Text will return its string.

    For example, if you had the following in your grammar:

    tokens { PROGRAM, FUNCDEC }
    
    EQUALS : '==';
    ASSIGN : '=';
    

    Then you'll get "PROGRAM", "FUNCDEC", "==", and "=" from the corresponding accesses of node.Token.Text.

    You can see part of my example below, or you can browse the full version.


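    using Antlr.Runtime;       // ANTLRStringStream, CommonTokenStream
    using Antlr.Runtime.Tree;  // CommonTree

    // header, body, and footer are assumed to be static string fields declared
    // elsewhere in the class (not shown); Print appends generated C# to body.
    public static string Convert(string input)
    {
        // Steps 1-2: wrap the input and hand it to the generated lexer.
        ANTLRStringStream sStream = new ANTLRStringStream(input);
        MyGrammarLexer lexer = new MyGrammarLexer(sStream);

        // Steps 3-4: build a token stream and the generated parser.
        CommonTokenStream tStream = new CommonTokenStream(lexer);
        MyGrammarParser parser = new MyGrammarParser(tStream);

        // Step 5: invoke the top-level rule and take its tree.
        MyGrammarParser.program_return parserResult = parser.program();
        CommonTree ast = (CommonTree)parserResult.Tree;

        // Step 6: traverse the tree, accumulating output in body.
        Print(ast);
        string output = header + body + footer;

        return output;
    }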
    
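    // Prints all children, appending a space after each one.
    public static void PrintChildren(CommonTree ast)
    {
        PrintChildren(ast, " ", true);
    }

    // Prints all children separated by delim; when final is true,
    // delim is also appended after the last child.
    public static void PrintChildren(CommonTree ast, string delim, bool final)
    {
        // Leaf nodes have no child list.
        if (ast.Children == null)
        {
            return;
        }

        int num = ast.Children.Count;

        for (int i = 0; i < num; ++i)
        {
            CommonTree d = (CommonTree)(ast.Children[i]);
            Print(d);
            if (final || i < num - 1)
            {
                body += delim;
            }
        }
    }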
    
    public static void Print(CommonTree ast)
    {
        switch (ast.Token.Text)
        {
            case "PROGRAM":
                //body += header;
                PrintChildren(ast);
                //body += footer;
                break;
            case "GLOBALS":
                body += "\r\n\r\n// GLOBALS\r\n";
                PrintChildren(ast);
                break;
            case "GLOBAL":
                body += "public static ";
                PrintChildren(ast);
                body += ";\r\n";
                break;
    
          ....
        }
    }
    
  • 2021-01-31 10:50

    You should look into writing a TreeParser; it can make the job of interpreting the tree much simpler.

    For ANTLR 2.x, see http://www.antlr2.org/doc/sor.html

    For ANTLR 3.x, see http://www.antlr.org/wiki/display/ANTLR3/Tree+construction (Java-based parser and tree parser example)
