Java - Is binary code the same as ByteCode?


后悔当初 2021-01-31 03:33

In Java, does "binary code" mean the same as "Java bytecode"?

Is this the flow in Java?

Java File (.java) -> [javac] -> ByteCode File (.

6 answers

清酒与你 (original poster)

2021-01-31 04:02
The JVM is a very complex program, and the flow inside it is to some degree unpredictable. For example, the flow inside the HotSpot JVM looks roughly like this:

1) It takes your bytecode and interprets it.
2) If a method is executed frequently (a certain number of times within a certain time span), it is marked as a "hot" method and the JVM schedules it for compilation to platform-dependent machine code (is that what you called binary code?). That flow looks like the following:
```
ByteCode
--> High-level Intermediate Representation (HIR)
  --> Middle-level Intermediate Representation (MIR)
    --> Low-level Intermediate Representation (LIR)
      --> Register Allocation
        --> EMIT (platform dependent machine code)
```
Each step in that flow matters and helps the JVM optimize your code. It does not change your algorithm, of course; optimization just means that certain code sequences can be detected and replaced with better-performing code that produces the same result. From the LIR stage onward, the code becomes platform dependent (!).

Bytecode is well suited to interpretation, but it is not easy to transform directly into native machine code. HIR takes care of that: its purpose is to quickly transform bytecode into an intermediate representation. MIR transforms all operations into three-operand operations, whereas bytecode is based on stack operations:
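The "hot method" behaviour from steps 1 and 2 can be observed from ordinary Java code. The following is only a sketch: the class name `HotMethodDemo`, the method names, and the iteration count are made up for illustration, and whether (and when) the method actually gets compiled is up to the JVM. The `CompilationMXBean` used here is the standard `java.lang.management` API.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class; names and loop count are illustrative only.
class HotMethodDemo {

    // A tiny method that is called often enough to be a candidate "hot" method.
    static int and(int a, int b) {
        return a & b;
    }

    // Calls and() repeatedly and accumulates the results so the work
    // cannot simply be thrown away as dead code.
    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += and(i, 0xFF);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("result = " + run(5_000_000));

        // May be null on a JVM that has no JIT compiler.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            System.out.println("No JIT compiler reported by this JVM");
        } else if (jit.isCompilationTimeMonitoringSupported()) {
            System.out.println(jit.getName() + " spent about "
                    + jit.getTotalCompilationTime() + " ms compiling");
        } else {
            System.out.println("JIT compiler: " + jit.getName());
        }
    }
}
```

On HotSpot, running this with `java -XX:+PrintCompilation HotMethodDemo` should additionally log methods as they are compiled, which is a convenient way to watch the interpreter hand work over to the JIT.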
```
iload_0
iload_1
iand
```
That was the bytecode for a simple "and" operation; the middle-level representation for it would be something like the following:
```
and v0 v1 -> v2
```
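The step from stack-based bytecode to a three-operand form can be sketched with a toy translator. This is not how HotSpot is implemented; it is a minimal illustration that simulates the operand stack, understands only `iload_N` and `iand`, and uses invented names (`StackToThreeOperand`, `translate`). Temporaries are numbered after the highest local variable index seen, which is an assumption made to reproduce the `v2` in the example above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy translator (hypothetical, not HotSpot code): supports only iload_N and iand.
class StackToThreeOperand {

    static List<String> translate(List<String> bytecode) {
        // First pass: find the first variable number not used by any local.
        int nextTemp = 0;
        for (String insn : bytecode) {
            if (insn.startsWith("iload_")) {
                int local = Integer.parseInt(insn.substring("iload_".length()));
                nextTemp = Math.max(nextTemp, local + 1);
            }
        }

        Deque<String> stack = new ArrayDeque<>(); // simulated operand stack
        List<String> out = new ArrayList<>();
        for (String insn : bytecode) {
            if (insn.startsWith("iload_")) {
                // Loading a local just pushes its name; no instruction is emitted.
                stack.push("v" + insn.substring("iload_".length()));
            } else if (insn.equals("iand")) {
                String right = stack.pop();
                String left = stack.pop();
                String result = "v" + nextTemp++;
                out.add("and " + left + " " + right + " -> " + result);
                stack.push(result);
            } else {
                throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints: [and v0 v1 -> v2]
        System.out.println(translate(List.of("iload_0", "iload_1", "iand")));
    }
}
```

The point of the sketch is that the stack disappears: every value gets a name, so later stages can reason about variables instead of stack positions.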
LIR depends on the platform. Taking our simple "and" example and assuming the platform is x86, the code snippet becomes:
```
x86_and v1 v0 -> v1
x86_move v1 -> v2
```
This is because the x86 "and" operation takes two operands, where the first is the destination and the second is the source, so we then have to put the result into another "variable". The next stage is "register allocation", because x86 (and probably most other platforms) works with registers rather than variables (as the intermediate representation does) or a stack (as bytecode does). Here our code snippet would look like the following:
```
x86_and eax ecx -> eax
```
Here you can see that the "move" operation is gone. Our code contained only one line, and the JVM figured out that creating a new virtual variable was not needed; it can simply reuse the eax register. If the code is big enough, with many variables that are used intensively (e.g. eax is used somewhere below, so its value can't be changed), then you will see the move operation left in the machine code. That's again about optimization :)
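Why the "move" disappears can be shown with another toy sketch. It assumes the register assignment is already given by hand (a real allocator would derive it from how long each variable is live), rewrites the LIR lines from above, and drops any move whose source and destination ended up in the same register. The class and method names are hypothetical, and the string format is just the notation used in this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy illustration (hypothetical names): apply a given variable-to-register
// mapping and drop moves that became "register -> same register".
class RedundantMoveDemo {

    static List<String> assignRegisters(List<String> lir, Map<String, String> registers) {
        List<String> out = new ArrayList<>();
        for (String insn : lir) {
            String[] parts = insn.split(" ");
            // Replace every variable name with its register; keep opcode and "->" as-is.
            for (int i = 1; i < parts.length; i++) {
                parts[i] = registers.getOrDefault(parts[i], parts[i]);
            }
            boolean selfMove = parts[0].equals("x86_move")
                    && parts.length == 4
                    && parts[1].equals(parts[3]);
            if (!selfMove) {
                out.add(String.join(" ", parts));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lir = List.of("x86_and v1 v0 -> v1", "x86_move v1 -> v2");

        // v1 and v2 share eax, so the move becomes a no-op and is dropped.
        System.out.println(assignRegisters(lir, Map.of("v0", "ecx", "v1", "eax", "v2", "eax")));

        // If eax must keep its value, v2 gets its own register and the move stays.
        System.out.println(assignRegisters(lir, Map.of("v0", "ecx", "v1", "eax", "v2", "edx")));
    }
}
```

The second call corresponds to the "code is big enough" case: once two variables cannot share a register, the move survives into the machine code.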

That was the JIT flow, but depending on the VM implementation there can be one more step: if code has already been compiled (being "hot") and is still executed many, many times, the JVM schedules further optimization of that code (e.g. using inlining).

The conclusion is that the path from bytecode to machine code is quite interesting, somewhat unpredictable, and depends on many things.

By the way, the process described above is called "Mixed mode interpretation" (the JVM first interprets bytecode and then uses JIT compilation); HotSpot is an example of such a JVM. Some JVMs (like JRockit from Oracle) use JIT compilation only.

This was a very simple description of what goes on there. I hope it helps you understand the flow inside the JVM at a very high level, and that it addresses the question about the difference between bytecode and binary code. For references and for related issues not mentioned here, please read the similar topic "Why are compiled Java class files smaller than C compiled files?".
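If you want to check which mode your own JVM reports, a small sketch like the one below may help. It relies on the `java.vm.info` system property, which HotSpot sets but which is not guaranteed on every JVM, so the code falls back to "unknown"; the class name `VmModeDemo` is made up for this example.

```java
// Hypothetical helper class; System.getProperty is the only real API used.
class VmModeDemo {

    // Returns the JVM's own description of its execution mode, if it provides one.
    // Assumption: the VM sets "java.vm.info" (HotSpot does; others may not).
    static String vmInfo() {
        return System.getProperty("java.vm.info", "unknown");
    }

    public static void main(String[] args) {
        System.out.println(System.getProperty("java.vm.name", "unknown") + ": " + vmInfo());
    }
}
```

On HotSpot the reported text typically mentions "mixed mode" by default, and starting the JVM with `-Xint` (interpreter only) or `-Xcomp` (compile before first execution) should change what is reported.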

Also, feel free to critique this answer and point out my mistakes or misunderstandings; I'm always willing to improve my knowledge of the JVM :)