In a recent discussion about how to optimize some code, I was told that breaking code up into lots of small methods can significantly increase performance, because the JIT compi
The HotSpot JIT only inlines methods whose bytecode is smaller than a certain (configurable) size. So using smaller methods allows more inlining, which is good.
See the various inlining options on this page.
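To check which thresholds your own JVM is actually using, something like the following should work. It is only a sketch: it assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean, and the class name InlineThresholds and the helper read are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class InlineThresholds {

    // Hypothetical helper: returns the current value of a HotSpot VM option as a string.
    // Throws IllegalArgumentException if the running JVM does not know the option.
    public static String read(String optionName) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(optionName).getValue();
    }

    public static void main(String[] args) {
        // Both values are bytecode sizes in bytes.
        System.out.println("MaxInlineSize  = " + read("MaxInlineSize"));
        System.out.println("FreqInlineSize = " + read("FreqInlineSize"));
    }
}
```

The values printed depend on the JVM version, the platform and any -XX flags passed on the command line, so treat them as a starting point rather than fixed numbers.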
EDIT
To elaborate a little:
Example (full code to have the same line numbers if you try it)
package javaapplication27;
public class TestInline {
    private int count = 0;
    public static void main(String[] args) throws Exception {
        TestInline t = new TestInline();
        int sum = 0;
        for (int i = 0; i < 1000000; i++) {
            sum += t.m();
        }
        System.out.println(sum);
    }
    public int m() {
        int i = count;
        if (i % 10 == 0) {
            i += 1;
        } else if (i % 10 == 1) {
            i += 2;
        } else if (i % 10 == 2) {
            i += 3;
        }
        i += count;
        i *= count;
        i++;
        return i;
    }
}
When running this code with the following JVM flags: -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:FreqInlineSize=50 -XX:MaxInlineSize=50 -XX:+PrintInlining (yes, I have used values that prove my case: m is too big, but both the refactored m and m2 are below the threshold; with other values you might get a different output), you will see that m() and main() get compiled, but m() does not get inlined:
56 1 javaapplication27.TestInline::m (62 bytes)
57 1 % javaapplication27.TestInline::main @ 12 (53 bytes)
@ 20 javaapplication27.TestInline::m (62 bytes) too big
You can also inspect the generated assembly to confirm that m is not inlined (I used these JVM flags: -XX:+PrintAssembly -XX:PrintAssemblyOptions=intel); it will look like this:
0x0000000002780624: int3 ;*invokevirtual m
; - javaapplication27.TestInline::main@20 (line 10)
If you refactor the code like this (I have extracted the if/else into a separate method):
public int m() {
    int i = count;
    i = m2(i);
    i += count;
    i *= count;
    i++;
    return i;
}
public int m2(int i) {
    if (i % 10 == 0) {
        i += 1;
    } else if (i % 10 == 1) {
        i += 2;
    } else if (i % 10 == 2) {
        i += 3;
    }
    return i;
}
You will see the following compilation actions:
60 1 javaapplication27.TestInline::m (30 bytes)
60 2 javaapplication27.TestInline::m2 (40 bytes)
@ 7 javaapplication27.TestInline::m2 (40 bytes) inline (hot)
63 1 % javaapplication27.TestInline::main @ 12 (53 bytes)
@ 20 javaapplication27.TestInline::m (30 bytes) inline (hot)
@ 7 javaapplication27.TestInline::m2 (40 bytes) inline (hot)
So m2 gets inlined into m, which you would expect, so we are back to the original scenario. But when main gets compiled, it actually inlines the whole thing. At the assembly level, it means you won't find any invokevirtual instructions any more. You will find lines like this:
0x00000000026d0121: add ecx,edi ;*iinc
; - javaapplication27.TestInline::m2@7 (line 33)
; - javaapplication27.TestInline::m@7 (line 24)
; - javaapplication27.TestInline::main@20 (line 10)
where basically the instructions common to the inlined methods are shared ("mutualised").
Conclusion
I am not saying that this example is representative but it seems to prove a few points:
And finally: if a portion of your code is so critical for performance that these considerations matter, you should examine the JIT output to fine-tune your code and, importantly, profile before and after.
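As a rough illustration of "before and after", here is a naive timing sketch of the two variants of m from the example. The class name InlineTiming, the method names mOriginal and mRefactored and the constructor argument are made up for this sketch, and System.nanoTime timing like this is easily distorted by warm-up and dead-code elimination, so for real measurements a proper harness such as JMH is a better choice.

```java
public class InlineTiming {
    private final int count;

    // Hypothetical constructor: the original example always uses count = 0.
    public InlineTiming(int count) {
        this.count = count;
    }

    // The single-method version from the example.
    public int mOriginal() {
        int i = count;
        if (i % 10 == 0) {
            i += 1;
        } else if (i % 10 == 1) {
            i += 2;
        } else if (i % 10 == 2) {
            i += 3;
        }
        i += count;
        i *= count;
        i++;
        return i;
    }

    // The refactored version: the if/else chain lives in m2.
    public int mRefactored() {
        int i = count;
        i = m2(i);
        i += count;
        i *= count;
        i++;
        return i;
    }

    public int m2(int i) {
        if (i % 10 == 0) {
            i += 1;
        } else if (i % 10 == 1) {
            i += 2;
        } else if (i % 10 == 2) {
            i += 3;
        }
        return i;
    }

    public static long sumOriginal(InlineTiming t, int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += t.mOriginal();
        }
        return sum;
    }

    public static long sumRefactored(InlineTiming t, int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += t.mRefactored();
        }
        return sum;
    }

    public static void main(String[] args) {
        InlineTiming t = new InlineTiming(0);
        int iterations = 1_000_000;

        // Warm-up rounds so that both loops have a chance to be JIT-compiled.
        for (int round = 0; round < 5; round++) {
            sumOriginal(t, iterations);
            sumRefactored(t, iterations);
        }

        long start = System.nanoTime();
        long a = sumOriginal(t, iterations);
        long middle = System.nanoTime();
        long b = sumRefactored(t, iterations);
        long end = System.nanoTime();

        // The sums are printed so the JIT cannot discard the loops entirely.
        System.out.println("original:   sum=" + a + ", ns=" + (middle - start));
        System.out.println("refactored: sum=" + b + ", ns=" + (end - middle));
    }
}
```

Running it with and without the -XX:MaxInlineSize / -XX:FreqInlineSize flags used above, together with -XX:+PrintInlining, lets you relate any timing difference to what actually got inlined.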