Implementation of invokevirtual in JVM

问题

I had expected that the Java Virtual Machine would use a simple lookup table for normal method invocations: The object contains a pointer to a lookup table with the addresses for all methods that the class implements (including the methods implemented by super-classes). A specific method is simply represented by an index into that table. The JVM looks up the address of the method in the table, and jumps to that address.

But the JVM specification specifies a long and complex procedure for looking up the correct method at run-time (see the official spec):

The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class [which] must be a symbolic reference to a method , which gives the name and descriptor of the method as well as a symbolic reference to the class in which the method is to be found. The named method is resolved. [...] then the invokevirtual instruction proceeds as follows.

If C contains a declaration for an instance method m that overrides (§5.4.5) the resolved method, then m is the method to be invoked, and the lookup procedure terminates.

Otherwise, if C has a superclass, this same lookup procedure is performed recursively using the direct superclass of C; the method to be invoked is the result of the recursive invocation of this lookup procedure.

I would expect this complex and long procedure to take a long time. Because it is done for every normal method call, almost all time for JVM based programs would be spent in this procedure.

Is this really how it is implemented in the real (Oracle) JVM? Or does the JVM do a JIT type of compilation to a lookup table? Is there a description of how the concrete JVM actually implements this?

回答1:

There is nothing in the Java Language Specification or the Java Virtual Machine Specification that prescribes any particular implementation strategy. Every implementor is free to choose any implementation strategy they want, as long as the result is the same AS-IF they had implemented the algorithm described in the spec.

In other words, the algorithm in the spec describes the end result but not the recipe.

The simplest and most obvious possible optimization is to just stupidly perform the algorithm as described, but cache the result instead of throwing it away.

Most modern high-performance JVMs are derived from Smalltalk VMs and use techniques that were invented in the 1980s by the Smalltalk community.

Eclipse OpenJ9 started its life as IBM J9, which in turn is derived from the IBM VisualAge for Java Universal Virtual Machine (which was capable of seamlessly executing a mix of JVM byte code and Smalltalk byte code), which in turn was based on the IBM VisualAge for Smalltalk VM.

Oracle HotSpot is based on the Animorphic Smalltalk VM by LongView, which in turn is based on the Self VM. (The Animorphic Smalltalk VM was also the original basis for Google's V8 ECMAScript engine.)

Azul Zing is derived from HotSpot. Oracle Labs Maxine RVM was developed by some old Smalltalk and Self developers based on ideas from the Klein VM (an experimental meta-circular Self VM written in Self).

Some of the most well-known techniques for eliminating dynamic runtime virtual message dispatch overhead are

Devirtualization – turning dynamic runtime virtual message dispatch into static method lookup:
- A Study of Devirtualization Techniques for a JavaTM Just-In-Time Compiler Kazuaki Ishizaki, Motohiro Kawahito, Toshiaki Yasue, Hideaki Komatsu, Toshio Nakatani (IBM Research, Tokyo)
Callsite Customization – compiling multiple different versions of the code, each for a specific receiver type:
- Customization: Optimizing Compiler Technology for Self, a Dynamically-Typed Object-Oriented Programming Language, Craig Chambers and David Ungar
- An Efficient Implementation of Self, a Dynamically-Typed Object-Oriented Language Based on Prototypes, Craig Chambers, David Ungar, and Elgin Lee
Dynamic Type Feedback:
- Optimizing Dynamically-Dispatched Calls with Run-Time Type Feedback, Urs Hölzle and David Ungar
Dynamic Type Inference:
- Iterative Type Analysis and Extended Message Splitting: Optimizing Dynamically-Typed Object-Oriented Programs, Craig Chambers and David Ungar
Inline Caching – remembering what the result of the lookup was the last time
- Monomorphic Inline Caching,
- Duomorphic Inline Caching,
- Polymorphic Inline Caching:
  - Optimizing Dynamically-Typed Object-Oriented Programming Languages with Polymorphic Inline Caches, Urs Hölzle, Craig Chambers, and David Ungar
- Megamorphic Inline Caching,
Speculative Inlining:
- Sista: Speculative inlining, Smalltalk-style, Eliot Miranda of Cadence Design Systems and Clément Béra of INRIA, Lille, France (YouTube video from Stanford Seminar)
various other forms of Adaptive Optimizations:
- Adaptive optimization for Self: Reconciling High Performance with Exploratory Programming, Urs Hölzle
- A Third-Generation Self Implementation: Reconciling Responsiveness with Performance, Urs Hölzle and David Ungar

[You will note that almost all of the sources are for either Self or Smalltalk. The two major reasons are that Self pioneered a lot of these techniques, and Smalltalk and the Smalltalk VM were a major influence on Java and the JVM.]

The JVMs I am most familiar with (Eclipse OpenJ9, Oracle HotSpot, Oracle Labs Maxine RVM, Azul Zing) implement most of the above.

The invokedynamic bytecode introduced into the JVM Specification in Java 7 allows programmers access to the above optimizations but supply their own method lookup algorithm instead of the one hard-coded into the JVM. This makes it possible to create high-performance implementations on top of the JVM for languages whose method lookup algorithm is incompatible with Java's.

来源：https://stackoverflow.com/questions/65552062/implementation-of-invokevirtual-in-jvm

标签

java

methods

jvm