I am trying to understand ASTs in C#. What exactly does the Compile() method in this example do?
// Some code skipped
Expression
An Expression represents a data structure in the form of an expression tree. Using Compile(), this expression tree can be compiled into executable code in the form of a delegate (something you can call like a method).
After compilation you can invoke the delegate normally; in your example the delegate is a Func<string,int,int,string>. This approach might be needed when you build the expression tree dynamically from data that is only available at run time, with the end goal of creating and executing the corresponding delegate.
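As a rough illustration of building a tree at run time and compiling it, here is a minimal sketch. The tree in the question is not shown, so the parameter names and the choice of Substring are hypothetical; only the delegate type Func<string,int,int,string> comes from the question.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical parameters; the question's original tree is not shown.
ParameterExpression s = Expression.Parameter(typeof(string), "s");
ParameterExpression start = Expression.Parameter(typeof(int), "start");
ParameterExpression length = Expression.Parameter(typeof(int), "length");

// Body of the lambda: s.Substring(start, length)
MethodCallExpression body = Expression.Call(
    s,
    typeof(string).GetMethod("Substring", new[] { typeof(int), typeof(int) }),
    start,
    length);

// Wrap the body and its parameters in a typed lambda expression.
Expression<Func<string, int, int, string>> lambda =
    Expression.Lambda<Func<string, int, int, string>>(body, s, start, length);

// Compile() turns the tree into an ordinary delegate.
Func<string, int, int, string> compiled = lambda.Compile();

string result = compiled("expression", 0, 4);
Console.WriteLine(result); // prints "expr"
```

Until Compile() is called, the tree is only data; nothing in it can be executed directly.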
You cannot see the "code" for the delegate. The expression tree it is based on is the closest thing to that.
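To show what inspecting the tree looks like in practice, here is a small sketch. The lambda is an arbitrary example chosen for illustration, not the one from the question.

```csharp
using System;
using System.Linq.Expressions;

// An arbitrary example lambda; the C# compiler builds the tree for us.
Expression<Func<int, int, int>> add = (a, b) => a + b;

// The tree can be printed and walked like any other object graph.
Console.WriteLine(add);               // textual form of the whole lambda
Console.WriteLine(add.Body.NodeType); // kind of the root node of the body

var binary = (BinaryExpression)add.Body;
string leftName = ((ParameterExpression)binary.Left).Name;
string rightName = ((ParameterExpression)binary.Right).Name;
Console.WriteLine(leftName + " and " + rightName);
```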
What interests me is the Compile() method. Does it somehow produce real MSIL?
Yes. The Compile method runs a visitor over the lambda body block and generates IL dynamically for each subexpression.
If you're interested in learning how to emit IL yourself, see this "Hello World" example of how to use Lightweight Codegen. (I note that if you are in the unfortunate position of having to use Lightweight Codegen in a partially trusted appdomain then things can get a bit weird in a world with Restricted Skip Visibility; see Shawn Farkas's article on the subject if that interests you.)
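For a feel of what Lightweight Codegen involves, here is a minimal sketch using DynamicMethod. It is not the linked "Hello World" example; the method name "Add" and the choice of adding two integers are assumptions made for illustration, and it requires a runtime where Reflection.Emit is available.

```csharp
using System;
using System.Reflection.Emit;

// A dynamic method with signature int Add(int, int); the name is arbitrary.
var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });

ILGenerator il = method.GetILGenerator();
il.Emit(OpCodes.Ldarg_0); // push the first argument onto the evaluation stack
il.Emit(OpCodes.Ldarg_1); // push the second argument
il.Emit(OpCodes.Add);     // pop both, push their sum
il.Emit(OpCodes.Ret);     // return the value on top of the stack

// Bind the emitted IL to a delegate so it can be called like any method.
var addDelegate = (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));

int sum = addDelegate(2, 3);
Console.WriteLine(sum); // prints 5
```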
Can I see the MSIL?
Yes, but you need a special "visualizer". The visualizer I used to debug Compile() while I was implementing my portions of it can be downloaded here: http://blogs.msdn.com/b/haibo_luo/archive/2005/10/25/484861.aspx
The answer above is now partly outdated, in that compiling to IL is now only sometimes what happens.
Compilation of expressions to IL requires Reflection.Emit, which isn't always available, particularly with AOT. So in those cases, instead of compiling to IL, the expression is "compiled" to a list of objects representing instructions. Each of these instructions has a Run method that causes it to carry out the appropriate action, working on a stack of values much as IL works on a stack. A method that calls Run on these objects can then be returned as the delegate.
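The idea can be sketched in a few lines. This is a deliberately simplified stand-in, not the framework's actual interpreter: each delegate in the list plays the role of one instruction's Run method, and all names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Each delegate stands in for one instruction object's Run method.
// This hypothetical "program" evaluates 2 + 3.
var instructions = new List<Action<Stack<int>>>
{
    stack => stack.Push(2),                          // load constant 2
    stack => stack.Push(3),                          // load constant 3
    stack => stack.Push(stack.Pop() + stack.Pop()),  // add the top two values
};

// The "compiled" delegate just runs every instruction in order
// against a fresh value stack, then returns what is left on top.
Func<int> interpreted = () =>
{
    var stack = new Stack<int>();
    foreach (Action<Stack<int>> run in instructions)
    {
        run(stack);
    }
    return stack.Pop();
};

int answer = interpreted();
Console.WriteLine(answer); // prints 5
```

Building such a list is cheap compared with emitting and jitting IL, which is where the faster "compile" step comes from; the cost is paid on every call instead.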
Generally, running such a delegate is slower than running jitted IL, but it's the only option when compiling to IL isn't available. The compilation step is also often faster, so for one-off expressions the total time of compile + run is very often less with the interpreter than with IL.
For that reason, in .NET Core there is now an overload of Compile that takes a boolean requesting interpretation even if compiling to IL is available.
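A short sketch of calling both overloads side by side; the lambda is an arbitrary example. Note that the boolean is a preference, so the runtime decides which strategy is actually used.

```csharp
using System;
using System.Linq.Expressions;

// An arbitrary example lambda.
Expression<Func<int, int, int>> multiply = (a, b) => a * b;

// Default overload: compiles to IL where that is available.
Func<int, int, int> viaDefault = multiply.Compile();

// Overload taking a boolean: asks for the interpreter instead.
Func<int, int, int> viaInterpreter = multiply.Compile(preferInterpretation: true);

int first = viaDefault(6, 7);
int second = viaInterpreter(6, 7);
Console.WriteLine(first);  // prints 42
Console.WriteLine(second); // prints 42
```

Both delegates compute the same result; they differ only in how much work is done up front versus on each call.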
All of which makes for an interesting mix of languages: Expressions themselves are a language, the assembly is written in C#, it can compile to IL, and the interpreted instruction objects constitute a fourth language.