How much does Java optimize string concatenation with +?

Anonymous (unverified), submitted on 2019-12-03 08:44:33

Question:

I know that in more recent Java versions, a string concatenation such as

String test = one + "two" + three;

will get optimized to use a StringBuilder.

However, will a new StringBuilder be created each time execution hits this line, or will a single thread-local StringBuilder be created and then used for all string concatenation?

In other words, can I improve the performance of a frequently called method by creating my own thread-local StringBuilder to re-use, or will there be no significant gains from doing so?

I could just write a test for this, but I wonder whether it is compiler/JVM specific or something that can be answered more generally.

Answer 1:

As far as I know, there is no compiler generating code reusing StringBuilder instances, most notably javac and ECJ don’t generate reusing code.
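To make the comparison concrete, here is a minimal sketch. The class and method names (BuilderReuseSketch, expanded, reused) are made up for illustration: expanded spells out the per-call StringBuilder chain that pre-Java 9 javac typically emits for one + "two" + three, and reused is the kind of hand-rolled thread-local variant the question asks about. It shows only the shape of the code and says nothing about which one is faster.

```java
final class BuilderReuseSketch {
    // Hypothetical per-thread builder, as proposed in the question.
    private static final ThreadLocal<StringBuilder> BUILDER =
            ThreadLocal.withInitial(StringBuilder::new);

    // Roughly what the compiler-generated code does: a fresh builder per call.
    static String expanded(String one, String three) {
        return new StringBuilder().append(one).append("two").append(three).toString();
    }

    // Hand-rolled reuse: a ThreadLocal lookup plus a reset on every call.
    static String reused(String one, String three) {
        StringBuilder sb = BUILDER.get();
        sb.setLength(0); // clears the content but keeps the (possibly large) capacity alive
        return sb.append(one).append("two").append(three).toString();
    }

    public static void main(String[] args) {
        System.out.println(expanded("1", "3")); // prints "1two3"
        System.out.println(reused("1", "3"));   // prints "1two3"
    }
}
```

Both methods return the same string; the reusing variant trades the allocation for a ThreadLocal lookup and keeps the builder's buffer reachable for the lifetime of the thread.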

It’s important to emphasize that not doing such re-use is reasonable. It’s not safe to assume that retrieving an instance from a ThreadLocal variable is faster than a plain allocation from a TLAB (thread-local allocation buffer). Even if we try to add the potential cost of a local GC cycle for reclaiming that instance, as far as its share of the cost can be identified at all, we can’t conclude that.

So code trying to reuse the builder would be more complicated and would waste memory, since it keeps the builder alive without knowing whether it will ever actually be reused, all without a clear performance benefit.

This holds especially when we also consider that

  • JVMs like HotSpot have Escape Analysis, which can elide purely local allocations like these altogether and may also elide the copying costs of array resize operations
  • Such sophisticated JVMs usually also have optimizations dedicated specifically to StringBuilder-based concatenation, which work best when the compiled code follows the common pattern

With Java 9, string concatenation is compiled to an invokedynamic instruction, which gets linked to a JRE-provided factory at runtime (see StringConcatFactory). The JRE then decides what the code will look like, which allows tailoring it to the specific JVM, including buffer re-use if that has a benefit on that particular JVM. This also reduces the code size, as it requires only a single instruction rather than an allocation followed by multiple calls into the StringBuilder.
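The linkage step can be imitated by hand. The following is a sketch, assuming Java 9 or later: it calls the public bootstrap method StringConcatFactory.makeConcatWithConstants directly for the expression one + "two" + three. The class name ConcatBootstrapSketch and the name "concat" passed to the factory are made up; in real bytecode the JVM performs this linkage once per call site, whereas the sketch re-links on every call for simplicity.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

final class ConcatBootstrapSketch {
    static String concat(String one, String three) {
        try {
            // Shape of the call site: (String, String) -> String.
            MethodType type = MethodType.methodType(String.class, String.class, String.class);
            // Recipe: \u0001 marks a dynamic argument, other characters are literal text.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(), "concat", type, "\u0001two\u0001");
            // The call site's target is the MethodHandle that does the actual work.
            return (String) site.getTarget().invokeExact(one, three);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concat("1", "3")); // prints "1two3"
    }
}
```

How the returned handle builds the string is left entirely to the JRE, which is the point made above.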



Answer 2:

You would be amazed how much effort was put into String concatenation in jdk-9. First, javac emits an invokedynamic instead of invocations of StringBuilder#append. The bootstrap method behind that invokedynamic returns a CallSite, which contains a MethodHandle (actually a chain of MethodHandles).

Thus the decision of what is actually done for a String concatenation is moved to the runtime. The downside is that the first time you concatenate Strings (for a given combination of argument types), it is going to be slower.

Then there is a series of strategies to choose from when concatenating Strings (you can override the default one via the java.lang.invoke.stringConcat system property):

    private enum Strategy {
        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder}.
         */
        BC_SB,

        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder};
         * but trying to estimate the required storage.
         */
        BC_SB_SIZED,

        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder};
         * but computing the required storage exactly.
         */
        BC_SB_SIZED_EXACT,

        /**
         * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
         * This strategy also tries to estimate the required storage.
         */
        MH_SB_SIZED,

        /**
         * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
         * This strategy also estimate the required storage exactly.
         */
        MH_SB_SIZED_EXACT,

        /**
         * MethodHandle-based generator, that constructs its own byte[] array from
         * the arguments. It computes the required storage exactly.
         */
        MH_INLINE_SIZED_EXACT
    }

The default strategy is: MH_INLINE_SIZED_EXACT which is a beast!

It uses the package-private constructor to build the String (which is the fastest):

    /*
     * Package private constructor which shares value array for speed.
     */
    String(byte[] value, byte coder) {
        this.value = value;
        this.coder = coder;
    }

First, this strategy creates so-called filters; these are basically method handles that transform each incoming parameter to a String value. As one might expect, these MethodHandles live in a class called Stringifiers, which in most cases produces a MethodHandle that calls:

String.valueOf(YourInstance) 

So if you have 3 Objects that you want to concatenate, there will be 3 MethodHandles that delegate to String.valueOf(YourObject), which effectively means that your objects have been transformed into Strings. There are certain tweaks inside this class that I still can't understand, like the need for the separate classes StringifierMost (which converts only references, floats and doubles to String) and StringifierAny.
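The filter idea can be reproduced with the public MethodHandles API. This is only a sketch of the general technique, not the JDK's internal Stringifiers code: the class name StringifierSketch and the method join are hypothetical, and String#concat stands in for the real concatenation target.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class StringifierSketch {
    static String join(Object a, Object b) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            // The "filter": (Object) -> String, delegating to String.valueOf(Object).
            MethodHandle valueOf = lookup.findStatic(String.class, "valueOf",
                    MethodType.methodType(String.class, Object.class));
            // A stand-in target that only accepts Strings: (String, String) -> String.
            MethodHandle concat = lookup.findVirtual(String.class, "concat",
                    MethodType.methodType(String.class, String.class));
            // Put one filter in front of each argument: (Object, Object) -> String.
            MethodHandle filtered = MethodHandles.filterArguments(concat, 0, valueOf, valueOf);
            return (String) filtered.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(join(42, null)); // prints "42null"
    }
}
```

Each argument passes through its own String.valueOf handle before the target sees it, so the target only ever deals with Strings.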

Since MH_INLINE_SIZED_EXACT promises that the byte array is allocated at its exact size, there has to be a way to compute that size.

This is done via the StringConcatHelper#mixLen methods, which take the stringified versions of your input parameters (references/float/double). At this point we know the size of our final String. Well, we don't actually know it; we have a MethodHandle that will compute it.

There's one more jdk-9 change in String that is worth mentioning here: the addition of a coder field. It is needed to compute the size, equality and charAt of a String. Since it's needed for the size, we need to compute it as well; this is done via StringConcatHelper#mixCoder.

It is safe at this point to delegate to a MethodHandle that will create our array:

    @ForceInline
    private static byte[] newArray(int length, byte coder) {
        return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, length << coder);
    }

How is each element appended? Via methods in StringConcatHelper#prepend.

And only now do we have all the details needed to invoke the String constructor that takes a byte[] and a coder.
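The sequence described above (stringify, mix lengths, mix coders, allocate once, prepend from the end) can be imitated in plain Java. This is a rough sketch under stated assumptions, not the JDK code: the class SizedExactSketch and its methods are hypothetical analogues that only borrow the names mixLen, mixCoder and prepend, the array is zero-initialized instead of uninitialized, UTF-16 is written big-endian for simplicity, and the final String is created through a public constructor that copies the bytes rather than sharing them.

```java
import java.nio.charset.StandardCharsets;

final class SizedExactSketch {
    static final byte LATIN1 = 0; // one byte per char
    static final byte UTF16 = 1;  // two bytes per char

    // Analogue of mixLen: running total of the final length, with overflow check.
    static int mixLen(int current, String s) {
        return Math.addExact(current, s.length());
    }

    // Analogue of mixCoder: the result needs UTF16 as soon as one part needs it.
    static byte mixCoder(byte current, String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) return UTF16;
        }
        return current;
    }

    // Analogue of prepend: write s so that it ends right before index; return the new index.
    static int prepend(int index, byte[] buf, byte coder, String s) {
        index -= s.length();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (coder == LATIN1) {
                buf[index + i] = (byte) c;
            } else {
                int p = (index + i) << 1;
                buf[p] = (byte) (c >> 8); // high byte first (big-endian)
                buf[p + 1] = (byte) c;
            }
        }
        return index;
    }

    static String concat(Object... args) {
        String[] parts = new String[args.length];
        int len = 0;
        byte coder = LATIN1;
        for (int i = 0; i < args.length; i++) {
            parts[i] = String.valueOf(args[i]); // the "stringifier" step
            len = mixLen(len, parts[i]);
            coder = mixCoder(coder, parts[i]);
        }
        byte[] buf = new byte[len << coder];    // exact size, allocated once
        int index = len;
        for (int i = parts.length - 1; i >= 0; i--) {
            index = prepend(index, buf, coder, parts[i]); // fill from the back
        }
        return new String(buf, coder == LATIN1
                ? StandardCharsets.ISO_8859_1 : StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println(concat("a", 1, "b")); // prints "a1b"
    }
}
```

Because the length and coder are known before the allocation, the buffer is never resized, which is the property the strategy name refers to.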


All these operations (and many others I have skipped for simplicity) are handled via emitting a MethodHandle that will be invoked when the appending actually happens.


