In a recent discussion about how to optimize some code, I was told that breaking code up into lots of small methods can significantly increase performance, because the JIT compiler is then more likely to be able to inline them.
I've read numerous articles stating that smaller methods (measured by the number of bytes of Java bytecode needed to represent them) are more likely to be eligible for inlining when the JIT (just-in-time) compiler turns hot methods (those being run most frequently) into machine code, and that this inlining produces faster machine code. In short: smaller methods give the JIT more options when it compiles a hot method into machine code, and this allows more sophisticated optimizations.
To test this theory, I created a JMH class with two benchmark methods, each containing identical behaviour but factored differently. The first benchmark is named monolithicMethod (all of the code in a single method), and the second is named smallFocusedMethods, refactored so that each major behaviour is moved out into its own method. The smallFocusedMethods benchmark looks like this:
@Benchmark
public void smallFocusedMethods(TestState state) {
    int i = state.value;
    if (i < 90) {
        actionOne(i, state);   // ~90% of the random values land here
    } else {
        actionTwo(i, state);   // the remaining ~10%
    }
}

private void actionOne(int i, TestState state) {
    state.sb.append(Integer.toString(i)).append(": has triggered the first type of action.");
    int result = i;
    for (int j = 0; j < i; ++j) {
        result += j;
    }
    state.sb.append("Calculation gives result ").append(Integer.toString(result));
}

private void actionTwo(int i, TestState state) {
    state.sb.append(i).append(" has triggered the second type of action.");
    int result = i;
    for (int j = 0; j < 3; ++j) {
        for (int k = 0; k < 3; ++k) {
            result *= k * j + i;
        }
    }
    state.sb.append("Calculation gives result ").append(Integer.toString(result));
}
You can imagine how monolithicMethod looks: the same code, but contained entirely within the one method. The TestState class simply does the work of creating a new StringBuilder (so that the creation of this object is not counted in the benchmark time) and of choosing a random number between 0 and 100 for each invocation. It has been deliberately configured so that both benchmarks see exactly the same sequence of random numbers, to avoid any risk of bias.
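For completeness, a state class along those lines can be written something like this (a sketch rather than my exact code; the seed value and method name are illustrative, and it is the fixed seed that keeps the sequence of random numbers identical for both benchmarks):

import java.util.Random;

import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class TestState {

    // Fixed seed, so every run of either benchmark consumes the same sequence.
    private final Random random = new Random(42);

    StringBuilder sb;
    int value;

    @Setup(Level.Invocation)
    public void prepare() {
        sb = new StringBuilder();     // created in setup, so the allocation is not timed
        value = random.nextInt(100);  // 0 to 99 inclusive, so ~90% of values fall below 90
    }
}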
After running the benchmark with six "forks", each involving five warmup iterations of one second followed by six measurement iterations of five seconds, the results look like this:
Benchmark                                          Mode  Cnt        Score        Error  Units
monolithicMethod                                  thrpt   30  7609784.687 ± 118863.736  ops/s
monolithicMethod:·gc.alloc.rate                   thrpt   30     1368.296 ±     15.834  MB/sec
monolithicMethod:·gc.alloc.rate.norm              thrpt   30      270.328 ±      0.016  B/op
monolithicMethod:·gc.churn.G1_Eden_Space          thrpt   30     1357.303 ±     16.951  MB/sec
monolithicMethod:·gc.churn.G1_Eden_Space.norm     thrpt   30      268.156 ±      1.264  B/op
monolithicMethod:·gc.churn.G1_Old_Gen             thrpt   30        0.186 ±      0.001  MB/sec
monolithicMethod:·gc.churn.G1_Old_Gen.norm        thrpt   30        0.037 ±      0.001  B/op
monolithicMethod:·gc.count                        thrpt   30     2123.000               counts
monolithicMethod:·gc.time                         thrpt   30     1060.000               ms
smallFocusedMethods                               thrpt   30  7855677.144 ±  48987.206  ops/s
smallFocusedMethods:·gc.alloc.rate                thrpt   30     1404.228 ±      8.831  MB/sec
smallFocusedMethods:·gc.alloc.rate.norm           thrpt   30      270.320 ±      0.001  B/op
smallFocusedMethods:·gc.churn.G1_Eden_Space       thrpt   30     1393.473 ±     10.493  MB/sec
smallFocusedMethods:·gc.churn.G1_Eden_Space.norm  thrpt   30      268.250 ±      1.193  B/op
smallFocusedMethods:·gc.churn.G1_Old_Gen          thrpt   30        0.186 ±      0.001  MB/sec
smallFocusedMethods:·gc.churn.G1_Old_Gen.norm     thrpt   30        0.036 ±      0.001  B/op
smallFocusedMethods:·gc.count                     thrpt   30     1986.000               counts
smallFocusedMethods:·gc.time                      thrpt   30     1011.000               ms
In short, these numbers show that the smallFocusedMethods approach ran about 3.2% faster, and the difference is statistically significant with 99.9% confidence: the 99.9% confidence intervals reported by JMH (7,609,785 ± 118,864 ops/s versus 7,855,677 ± 48,987 ops/s) do not overlap. Note also that the memory usage (based on the garbage collection profiling) was not significantly different, so you get the faster performance without any increase in overhead.
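If you want to confirm that inlining really is behind a difference like this, HotSpot can be asked to log its inlining decisions. One convenient way to do that from JMH is to append the relevant diagnostic flags to the forked JVM, roughly as in the sketch below (the class and benchmark here are placeholders, and the exact log format varies between JVM versions):

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;

public class InliningLogSketch {

    // -XX:+PrintInlining is a diagnostic flag, so diagnostic options must be
    // unlocked first. The forked JVM then reports, for each hot method it
    // compiles, which callees it inlined and which it rejected.
    @Fork(jvmArgsAppend = {"-XX:+UnlockDiagnosticVMOptions", "-XX:+PrintInlining"})
    @Benchmark
    public long placeholder() {
        return System.nanoTime();
    }
}

Adding the same @Fork line to the smallFocusedMethods benchmark and searching the fork's output for actionOne and actionTwo shows whether (and why) they were inlined.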
I've run a variety of similar benchmarks to test whether small, focused methods give better throughput, and in every case I've tried the improvement has been somewhere between 3% and 7%. The actual gain is likely to depend strongly on the version of the JVM being used, on the distribution of executions across your if/else blocks (I went for 90% on the first and 10% on the second to exaggerate the heat on the first "action", but I've seen throughput improvements even with a more even spread across a chain of if/else blocks), and on the complexity of the work done by each of the possible actions. So be sure to write your own benchmarks if you need to determine what works for your specific application.
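If you do write your own version, the run configuration described above can be expressed with JMH's programmatic API roughly as follows (the include pattern is illustrative and should match your own benchmark class; the GC profiler is what produces the ·gc.* rows in the results):

import org.openjdk.jmh.profile.GCProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;

public class BenchmarkRunner {
    public static void main(String[] args) throws RunnerException {
        Options options = new OptionsBuilder()
                .include("MethodSizeBenchmark")        // illustrative: regex matching the benchmark class
                .forks(6)
                .warmupIterations(5)
                .warmupTime(TimeValue.seconds(1))
                .measurementIterations(6)
                .measurementTime(TimeValue.seconds(5))
                .addProfiler(GCProfiler.class)         // adds the ·gc.* rows to the output
                .build();
        new Runner(options).run();
    }
}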
My advice is this: write small, focused methods because it makes the code tidier, easier to read, and much easier to override specific behaviours when inheritance is involved. The fact that the JIT is likely to reward you with slightly better performance is a bonus, but tidy code should be your main goal in the majority of cases. Oh, and it's also important to give each method a clear, descriptive name which exactly summarises the responsibility of the method (unlike the terrible names I've used in my benchmark).