问题
I'm working on the byte code compiler for Renjin (R for the JVM) and am experimenting with translating our intermediate three address code (TAC) representation to byte code. All the textbooks on compilers that I've consulted discuss register allocation during code generation, but I haven't been able to find any resources for code generation on stack-based virtual machines like the JVM.
Simple TAC instructions are trivial to translate into bytecode, but I get a bit lost when temporaries are involved. Does any one have any pointers to resources that describe this?
Here is a complete example:
Original R code looks like this:
x + sqrt(x * y)
TAC IR:
0: _t2 := primitive<*>(x, y)
1: _t3 := primitive<sqrt>(_t2)
2: return primitive<+>(x, _t3)
(ignore for a second the fact taht we can't always resolve function calls to primitives at compile time)
The resulting JVM byte code would look (roughly) something like this:
aload_x
dup
aload_y
invokestatic r/primitives/Ops.multiply(Lr/lang/Vector;Lr/lang/Vector;)
invokestatic r/primitives/Ops.sqrt(Lr/lang/Vector;)
invokestatic r/primitives/Ops.plus(Lr/lang/Vector;Lr/lang/Vector;)
areturn
Basically, at the top of the program, I already need to be thinking that I'm going to need local variable x at the beginning of the stack by the time that i get to TAC instruction 2. I can think this through manually but I'm having trouble thinking through an algorithm to do this correctly. Any pointers?
回答1:
Transforming a 3-address representation into stack is easier than a stack one into 3-address.
Your sequence should be the following:
- Form basic blocks
- Perform an SSA-transform
- Build expression trees within the basic blocks
- Perform a register schedulling (and phi- removal simultaneously) to allocate local variables for the registers not eliminated by the previous step
- Emit a JVM code - registers goes into variables, expression trees are trivially expanded into stack operations
来源:https://stackoverflow.com/questions/8427888/code-generation-from-three-address-code-to-jvm-bytecode