Could someone list major tasks that the bytecode verifier has to perform to guarantee correctness of the program? Is there a standard, minimal set of responsibilities define
Alternatively you might like to give it a look at the Java Language Environment white paper by James Gosling.
The bytecode verifier traverses the bytecodes, constructs the type state information, and verifies the types of the parameters to all the bytecode instructions.
The illustration shows the flow of data and control from Java language source code through the Java compiler, to the class loader and bytecode verifier and hence on to the Java virtual machine, which contains the interpreter and runtime system. The important issue is that the Java class loader and the bytecode verifier make no assumptions about the primary source of the bytecode stream--the code may have come from the local system, or it may have travelled halfway around the planet. The bytecode verifier acts as a sort of gatekeeper: it ensures that code passed to the Java interpreter is in a fit state to be executed and can run without fear of breaking the Java interpreter. Imported code is not allowed to execute by any means until after it has passed the verifier's tests. Once the verifier is done, a number of important properties are known:
- There are no operand stack overflows or underflows
- The types of the parameters of all bytecode instructions are known to always be correct
- Object field accesses are known to be legal--private, public, or protected
While all this checking appears excruciatingly detailed, by the time the bytecode verifier has done its work, the Java interpreter can proceed, knowing that the code will run securely. Knowing these properties makes the Java interpreter much faster, because it doesn't have to check anything. There are no operand type checks and no stack overflow checks. The interpreter can thus function at full speed without compromising reliability.
This is specified in the JVM Specification: Chapter 4.10. Verification of class Files .
The bulk of the page describes the various aspects of type safety. To check that the program is type-safe the verifier needs to figure out what types of operands reside in the operand stack at each program point, and make sure that they match the type expected by the respective instruction.
Other things it verifies include, but is not limited to the following:
Branches must be within the bounds of the code array for the method.
The targets of all control-flow instructions are each the start of an instruction. In the case of a wide instruction, the wide opcode is considered the start of the instruction, and the opcode giving the operation modified by that wide instruction is not considered to start an instruction. Branches into the middle of an instruction are disallowed.
No instruction can access or modify a local variable at an index greater than or equal to the number of local variables that its method indicates it allocates.
All references to the constant pool must be to an entry of the appropriate type. (For example, the instruction getfield must reference a field.)
The code does not end in the middle of an instruction.
Execution cannot fall off the end of the code.
For each exception handler, the starting and ending point of code protected by the handler must be at the beginning of an instruction or, in the case of the ending point, immediately past the end of the code. The starting point must be before the ending point. The exception handler code must start at a valid instruction, and it must not start at an opcode being modified by the wide instruction.
As a final step the verifier also performs a data-flow analysis, which makes sure that no instruction reference any uninitialized local variables.
It does the following:
Reference: http://java.sun.com/docs/white/langenv/Security.doc3.html