Building own C# compiler using ANTLR: Compilation Unit

Posted by China☆狼群 on 2019-12-03 12:40:52

Question


// Create a scanner that reads from the input file f
CSLexer lexer = new CSLexer(new ANTLRFileStream(f));

// Token buffer that pulls its tokens from the lexer
// (assumed here to be a CommonTokenStream)
CommonTokenStream tokens = new CommonTokenStream();
tokens.TokenSource = lexer;

// Create a parser that reads from the token buffer
CSParser parser = new CSParser(tokens);

// Start parsing at the compilation_unit rule
CSParser.compilation_unit_return x = parser.compilation_unit();
object ast = x.Tree;

What can I do with x, which is of type compilation_unit_return, to extract its root, its classes, its methods, and so on? Do I have to extract its Adaptor? How do I do that? Note that compilation_unit_return is defined as follows in my CSParser (which is generated automatically by ANTLR):

public class compilation_unit_return : ParserRuleReturnScope
{
    private object tree;
    override public object Tree
    {
        get { return tree; }
        set { tree = (object) value; }
    }
}

However, the tree I get back is of type object. Running it in the debugger, I can see that it appears to be of type BaseTree. But BaseTree is an interface! I don't know how my tree relates to BaseTree, and I don't know how to extract details from it.

I need to write a visitor that visits its classes, methods, variables, and so on. The ParserRuleReturnScope class extends RuleReturnScope and has start and stop objects, and I don't know what they are.

Furthermore, ANTLR provides a TreeVisitor class, which looks confusing. It requires an Adaptor to be passed as a parameter to its constructor (otherwise it uses the default CommonTreeAdaptor), which is why I asked earlier how to obtain the Adaptor. There are other issues too. For the API, you can refer to http://www.antlr.org/api/CSharp/annotated.html


Answer 1:


I've never worked with ANTLR from C#, but following your link to the API, BaseTree is clearly not an interface. It's a class, and it has public properties: Type to get the type of the node, Text to get (I assume) the source text corresponding to it, and Children to get the child nodes. What else do you need to walk it?
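
As a rough illustration of walking a tree through those three properties, here is a minimal sketch. It deliberately does not reference the ANTLR runtime: Node is a hypothetical stand-in that models only Type, Text and Children, and TreeDump is an invented helper name. With the real runtime you would cast x.Tree to the tree class and may also need to guard against a null child list on leaf nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for an ANTLR tree node; only the three members
// named in the answer (Type, Text, Children) are modeled.
public sealed class Node
{
    public int Type { get; }
    public string Text { get; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(int type, string text, params Node[] children)
    {
        Type = type;
        Text = text;
        Children.AddRange(children);
    }
}

public static class TreeDump
{
    // Depth-first, pre-order walk that emits one indented line per node.
    public static string Dump(Node root)
    {
        var sb = new StringBuilder();
        Walk(root, 0, sb);
        return sb.ToString();
    }

    private static void Walk(Node node, int depth, StringBuilder sb)
    {
        // Two spaces of indentation per tree level, then the node text.
        sb.Append(' ', depth * 2).Append(node.Text).Append('\n');
        foreach (Node child in node.Children)
        {
            Walk(child, depth + 1, sb);
        }
    }
}
```

A visitor for classes, methods and variables follows the same shape: switch on node.Type inside Walk and dispatch to a handler per node kind before recursing into the children.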




Answer 2:


You can set the AST node type in the grammar options at the top of the file, like so:

tree grammar CSharpTree;
options {
    ASTLabelType = CommonTree;
}

I would build a third grammar, or work it into your existing parser grammar, that turns the tree into classes that you create. For example, assume you've got a rule that matches the plus operator and its two arguments. You can define a rule matching that tree that creates a class you've written (let's call it PlusExpression), like this:

plusExpr returns [PlusExpression value]
   : ^(PLUS left=expr right=expr) { $value = new PlusExpression($left.value, $right.value); }
   ;

expr would be another rule in your grammar, one that matches expressions. left and right are just labels given to the matched subtrees. The part between the braces is turned into C# code pretty much verbatim, except that the variable references are replaced. The .value property on $left and $right comes from the return value declared by the rule they were matched with (here, expr).
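
The classes that such actions construct are ordinary C# types that you write yourself. A minimal sketch of what they might look like follows. Expr, IntLiteral and the Evaluate method are hypothetical names added for illustration; only PlusExpression and its two-argument constructor come from the rule above.

```csharp
using System;

// Hypothetical base class for the nodes that the tree grammar builds.
public abstract class Expr
{
    public abstract int Evaluate();
}

// Hypothetical leaf node, standing in for whatever the expr rule returns
// when it matches an integer literal.
public sealed class IntLiteral : Expr
{
    private readonly int value;

    public IntLiteral(int value) { this.value = value; }

    public override int Evaluate() { return value; }
}

// The class created by the plusExpr action:
// new PlusExpression($left.value, $right.value)
public sealed class PlusExpression : Expr
{
    public Expr Left { get; }
    public Expr Right { get; }

    public PlusExpression(Expr left, Expr right)
    {
        Left = left;
        Right = right;
    }

    public override int Evaluate()
    {
        return Left.Evaluate() + Right.Evaluate();
    }
}
```

Under these assumptions the expr rule would be declared as returning an Expr, so that $left.value and $right.value can be passed straight to the constructor.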




Answer 3:


If I were going to make a C# compiler today, here's what I would try as a first attempt:

  1. Start with the ANTLR C# 3 target (of course I'm biased here - seriously you can use either the CSharp2 or CSharp3 target).
  2. Get Visual Studio 2010 with the .NET Framework 4. The key here is .NET 4 and its sweet new expression trees.
  3. Build a basic combined parser. Put as little logic in the parser as absolutely possible. It should have few (if any) actions, and the output should be an undecorated AST that can be walked with an LL(1) walker.
  4. Build a tree grammar to walk the tree and identify all declared types. It should also keep the member_declaration sub-trees for later use.
  5. Build a tree walker that walks a single member_declaration and adds the member to the TypeBuilder. Keep track of the method bodies but don't deep-walk them yet.
  6. Build a tree walker that walks the body of a method. Generate an Expression<TDelegate> matching the method, and use my own API rather than the CompileToMethod method (see Pavel's and my comments) to generate the IL code.
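
To make step 6 more concrete, here is a minimal sketch of building an Expression<TDelegate> by hand, the way a method-body walker might for a trivial method. MethodBodyDemo and BuildAdd are hypothetical names. The sketch stops at the expression tree and does not reproduce the answerer's own IL-generation API.

```csharp
using System;
using System.Linq.Expressions;

public static class MethodBodyDemo
{
    // Builds the expression-tree equivalent of:
    //     int Add(int a, int b) { return a + b; }
    public static Expression<Func<int, int, int>> BuildAdd()
    {
        ParameterExpression a = Expression.Parameter(typeof(int), "a");
        ParameterExpression b = Expression.Parameter(typeof(int), "b");

        // The body is a single binary node; a real walker would assemble
        // this from the statements and expressions in the method's subtree.
        BinaryExpression body = Expression.Add(a, b);

        return Expression.Lambda<Func<int, int, int>>(body, a, b);
    }
}
```

For a quick check, MethodBodyDemo.BuildAdd().Compile() yields an in-memory delegate that can be invoked directly. That is a stand-in for experimentation only, since a compiler needs the code emitted into the type being built.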

If you do things in this order, then when you finally parse the expressions (method bodies, field initializers), you can use the string-parameterized methods in the Expression class to save work resolving members.
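
As a small illustration of the string-parameterized factory methods, the sketch below resolves a method and a property purely by name, using Expression.Call and Expression.Property. StringMemberDemo and BuildTrimmedLength are hypothetical names, and the choice of string.Trim and Length is only an example.

```csharp
using System;
using System.Linq.Expressions;

public static class StringMemberDemo
{
    // Builds s => s.Trim().Length, looking up both members by name
    // instead of resolving MethodInfo/PropertyInfo objects by hand.
    public static Func<string, int> BuildTrimmedLength()
    {
        ParameterExpression s = Expression.Parameter(typeof(string), "s");

        // Call(instance, methodName, typeArguments, arguments):
        // null type arguments because Trim is not generic, and no call
        // arguments, which selects the parameterless overload.
        MethodCallExpression trim = Expression.Call(s, "Trim", null);

        // Property(instance, propertyName) finds string.Length by name.
        MemberExpression length = Expression.Property(trim, "Length");

        return Expression.Lambda<Func<string, int>>(length, s).Compile();
    }
}
```

This only works once the types and their members already exist, which is why the member names can be taken directly from the AST at this late stage.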



Source: https://stackoverflow.com/questions/1291153/building-own-c-sharp-compiler-using-antlr-compilation-unit
