What can explain the huge performance penalty of writing a reference to a heap location?

Submitted by 跟風遠走 on 2019-12-03 06:24:29

Quoting the authoritative answer given by Vladimir Kozlov on the hotspot-compiler-dev mailing list:

Hi Marko,

For primitive arrays we use handwritten assembler code which uses XMM registers as vectors for initialization. For object arrays we did not optimize it because it is not a common case. We can improve it similar to what we did for arraycopy but we decided to leave it for now.

Regards,
Vladimir

I also wondered why the optimized code is not inlined, and got an answer to that as well:

The code is not small, so we decided not to inline it. Look at MacroAssembler::generate_fill() in macroAssembler_x86.cpp:

http://hg.openjdk.java.net/hsx/hotspot-main/hotspot/file/54f0c207dc35/src/cpu/x86/vm/macroAssembler_x86.cpp


My original answer:

I missed an important bit in the machine code, apparently because I was looking at the On-Stack Replacement version of the compiled method instead of the one used for subsequent calls. It turns out that HotSpot was able to prove that my loop amounts to what a call to Arrays.fill would have done and replaced the entire loop with a call instruction to such code. I can't see that function's code, but it probably uses every possible trick, such as MMX instructions, to fill a block of memory with the same 32-bit value.
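
To make the pattern concrete, here is a minimal sketch of the kind of loop that can be recognized as a fill, next to the explicit Arrays.fill call it amounts to. The class and method names (FillIdiom, loopFill, libraryFill) are hypothetical, and whether the JIT actually replaces the loop depends on the JVM version and flags. The code only shows that the two forms produce the same result.

```java
import java.util.Arrays;

public class FillIdiom {
    // Hand-written loop: every element receives the same value.
    // This is the shape of loop described above as being replaced by a fill call.
    static void loopFill(int[] a, int value) {
        for (int i = 0; i < a.length; i++) {
            a[i] = value;
        }
    }

    // The library call with the same observable effect.
    static void libraryFill(int[] a, int value) {
        Arrays.fill(a, value);
    }

    public static void main(String[] args) {
        int[] viaLoop = new int[1024];
        int[] viaLibrary = new int[1024];
        loopFill(viaLoop, 42);
        libraryFill(viaLibrary, 42);
        // Both arrays now hold the same contents.
        System.out.println(Arrays.equals(viaLoop, viaLibrary));
    }
}
```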

This gave me the idea to measure the actual Arrays.fill calls. I got another surprise:

Benchmark                  Mode Thr    Cnt  Sec         Mean   Mean error    Units
fillPrimitiveArray         avgt   1      5    2      155.343        1.318  nsec/op
fillReferenceArray         avgt   1      5    2      682.975       17.990  nsec/op
loopFillPrimitiveArray     avgt   1      5    2      156.114        0.523  nsec/op
loopFillReferenceArray     avgt   1      5    2      682.209        7.047  nsec/op

The results with a loop and with a call to fill are identical. If anything, this is even more confusing than the results which motivated the question. I would have at least expected fill to benefit from the same optimization ideas regardless of value type.
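
For reference, the four measured variants have roughly the following shape. This is a plain-Java sketch, not the original benchmark harness: the method names mirror the rows of the table, but the array length (SIZE), the stored values, and the crude System.nanoTime timing loop are all assumptions. A simple loop like this is only indicative. Among other things, passing all four variants through one Runnable call site can itself distort the numbers, so a proper harness is needed for trustworthy results.

```java
import java.util.Arrays;

public class FillVariants {
    // Assumption: the array length is not stated in the text above.
    static final int SIZE = 1024;
    static final int[] ints = new int[SIZE];
    static final Object[] refs = new Object[SIZE];
    static final Object VALUE = new Object();

    static void fillPrimitiveArray() { Arrays.fill(ints, 1); }

    static void fillReferenceArray() { Arrays.fill(refs, VALUE); }

    static void loopFillPrimitiveArray() {
        for (int i = 0; i < ints.length; i++) ints[i] = 1;
    }

    static void loopFillReferenceArray() {
        for (int i = 0; i < refs.length; i++) refs[i] = VALUE;
    }

    // Crude average time per call in nanoseconds (not a substitute for a real harness).
    static double avgNanos(Runnable r, int iterations) {
        for (int i = 0; i < iterations; i++) r.run(); // warm-up so the JIT can compile the code
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) r.run();
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        int n = 200_000;
        System.out.printf("fillPrimitiveArray      %.1f ns/op%n", avgNanos(FillVariants::fillPrimitiveArray, n));
        System.out.printf("fillReferenceArray      %.1f ns/op%n", avgNanos(FillVariants::fillReferenceArray, n));
        System.out.printf("loopFillPrimitiveArray  %.1f ns/op%n", avgNanos(FillVariants::loopFillPrimitiveArray, n));
        System.out.printf("loopFillReferenceArray  %.1f ns/op%n", avgNanos(FillVariants::loopFillReferenceArray, n));
    }
}
```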
