When I have a String that I need to concatenate a single char to its end, should I prefer s = .... + ']' over s = .... + "]" for any performance reason?
There are actually two questions here:
Q1: Is there a performance difference?
Answer: It depends ...
In some cases, possibly yes, depending on the JVM and/or the bytecode compiler. If the bytecode compiler generates a call to StringBuilder.append(char) rather than StringBuilder.append(String), then you would expect the former to be faster. But the JIT compiler could treat these methods as "intrinsics" and optimize calls to append(String) with a one-character (literal) string.
In short, you would need to benchmark this on your platform to be sure.
In other cases, there is definitely no difference. For example, these two calls will be compiled to identical bytecode sequences because the concatenation is a Constant Expression.
System.out.println("234" + "]");
System.out.println("234" + ']');
This is guaranteed by the JLS.
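As a small illustration of the constant-expression case, here is a minimal sketch (the class, field, and method names are made up for this example). It checks that both forms end up as the very same interned String object, which is what you would expect when both are folded into one constant at compile time:

```java
public class ConstantConcatDemo {

    // Both initializers are compile-time constant expressions, so the
    // compiler folds each of them into the single string constant "234]".
    static final String VIA_STRING = "234" + "]";
    static final String VIA_CHAR = "234" + ']';

    // True if both constants refer to the same interned String object.
    // Reference comparison (==) is intentional here.
    static boolean sameInternedConstant() {
        return VIA_STRING == VIA_CHAR;
    }

    public static void main(String[] args) {
        System.out.println(VIA_STRING);             // prints 234]
        System.out.println(sameInternedConstant()); // prints true
    }
}
```

Note that this only holds when every operand is a constant; as soon as one operand is an ordinary variable, the concatenation happens at run time.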
Q2: Should you prefer one version over the other?
Answer:
In the general sense, this is likely to be a premature optimization. You should only prefer one form over the other for performance reasons if you have profiled your code at the application level and determined that the code snippet has a measurable impact on performance.
If you have profiled the code, then use the answer to Q1 as a guide.
And if it was worth trying to optimize the snippet, then it is essential that you rerun your benchmarking / profiling after optimizing, to see if it made any difference. Your intuition about what is fastest ... and what you have read in some old article on the internet ... could be very wrong.
Besides profiling, there is another way to gain some insight: looking at the generated bytecode. I want to focus on the possible speed differences and not on the things which remove them again.
So let's start with this Test class:
public class Test {
    // Do not optimize this
    public static volatile String A = "A String";

    public static void main(String[] args) throws Exception {
        String a1 = A + "B";
        String a2 = A + 'B';
        a1.equals(a2);
    }
}
I compiled this with javac Test.java (javac -version reports: javac 1.7.0_55)
Using javap -c Test.class we get:
Compiled from "Test.java"
public class Test {
public static volatile java.lang.String A;
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]) throws java.lang.Exception;
Code:
0: new #2 // class java/lang/StringBuilder
3: dup
4: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
7: getstatic #4 // Field A:Ljava/lang/String;
10: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
13: ldc #6 // String B
15: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
18: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
21: astore_1
22: new #2 // class java/lang/StringBuilder
25: dup
26: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
29: getstatic #4 // Field A:Ljava/lang/String;
32: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
35: bipush 66
37: invokevirtual #8 // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
40: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
43: astore_2
44: aload_1
45: aload_2
46: invokevirtual #9 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
49: pop
50: return
static {};
Code:
0: ldc #10 // String A String
2: putstatic #4 // Field A:Ljava/lang/String;
5: return
}
We can see that there are two StringBuilders involved (created at bytecode offsets 0 and 22). So the first thing we discover is that using + to concatenate Strings is effectively the same as using a StringBuilder.
The second thing we can see here is that append is called twice on each StringBuilder: first to append the volatile variable (offsets 10 and 32), and a second time to append the constant part (offsets 15 and 37).
In the case of A + "B", append is called with a Ljava/lang/String; (a String) argument, while in the case of A + 'B' it is called with a C (a char) argument.
So the compiler does not convert the String to a char but leaves it as it is*.
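To make the bytecode easier to read, here is a rough hand-written equivalent of the two concatenations. This is only a sketch mirroring the javac 1.7 output shown above; the class and method names are invented for illustration, and newer compilers (Java 9 and later) compile string concatenation differently, so their bytecode will not look like this:

```java
public class ExplicitBuilderDemo {

    // Roughly what the bytecode above does for: a + "B"
    // (calls StringBuilder.append(String) for the literal)
    static String appendString(String a) {
        return new StringBuilder().append(a).append("B").toString();
    }

    // Roughly what the bytecode above does for: a + 'B'
    // (calls StringBuilder.append(char) for the literal)
    static String appendChar(String a) {
        return new StringBuilder().append(a).append('B').toString();
    }

    public static void main(String[] args) {
        String a = "A String";
        System.out.println(appendString(a));                       // prints A StringB
        System.out.println(appendString(a).equals(appendChar(a))); // prints true
    }
}
```

Both methods produce equal Strings; the only difference is which append overload handles the trailing literal.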
Now, looking at AbstractStringBuilder, which contains the methods used, we have:
public AbstractStringBuilder append(char c) {
    ensureCapacityInternal(count + 1);
    value[count++] = c;
    return this;
}
and
public AbstractStringBuilder append(String str) {
    if (str == null) str = "null";
    int len = str.length();
    ensureCapacityInternal(count + len);
    str.getChars(0, len, value, count);
    count += len;
    return this;
}
as the methods actually called.
The most expensive operation here is certainly ensureCapacityInternal, but only when the capacity limit is reached (it then copies the old StringBuilder's char[] into a new, larger one). This is true for both methods and makes no real difference.
As one can see, there are numerous other operations being done, but the real distinction is between value[count++] = c; and str.getChars(0, len, value, count);
If we look into getChars, we see that it all boils down to one System.arraycopy call, which copies the String's characters into the builder's array, plus some checks and additional method calls, versus one single array access.
So I would say that in theory using A + "B" is much slower than using A + 'B'.
I think in real execution it is slower, too. But to determine this we need to benchmark.
EDIT: Of course, this is all before the JIT does its magic. See Stephen C's answer for that.
EDIT2: I've been looking at the bytecode which Eclipse's compiler generates, and it's nearly identical. So at least these two compilers don't differ in the outcome.
EDIT3: AND NOW THE FUN PART
The benchmarks. These results were generated by running loops of 0..100M iterations for a+'B' and a+"B" a few times after a warmup:
a+"B": 5096 ms
a+'B': 4569 ms
a+'B': 4384 ms
a+"B": 5502 ms
a+"B": 5395 ms
a+'B': 4833 ms
a+'B': 4601 ms
a+"B": 5090 ms
a+"B": 4766 ms
a+'B': 4362 ms
a+'B': 4249 ms
a+"B": 5142 ms
a+"B": 5022 ms
a+'B': 4643 ms
a+'B': 5222 ms
a+"B": 5322 ms
averaging to:
a+'B': 4608ms
a+"B": 5167ms
So even in the real benchmark world of synthetic knowledge (hehe), a+'B' is about 10% faster than a+"B" ...
... at least (disclaimer) on my system, with my compiler and my CPU, and the difference is really not noticeable in real-world programs. Except, of course, if you have a piece of code that you run really often and all of your application's performance depends on it. But then you would probably do things differently in the first place.
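For anyone who wants to check the arithmetic, the averages and the "about 10%" figure can be recomputed from the timings listed above. This is just a sketch; the class and method names are made up for this example:

```java
public class BenchmarkAverages {

    // Timings in milliseconds, copied from the benchmark runs listed above.
    static final long[] CHAR_MS   = {4569, 4384, 4833, 4601, 4362, 4249, 4643, 5222};
    static final long[] STRING_MS = {5096, 5502, 5395, 5090, 4766, 5142, 5022, 5322};

    // Arithmetic mean of the given timings.
    static double average(long[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    // Fraction of time saved by the char variant, relative to the String variant.
    static double relativeSaving() {
        return 1.0 - average(CHAR_MS) / average(STRING_MS);
    }

    public static void main(String[] args) {
        System.out.println("a+'B':  " + Math.round(average(CHAR_MS)) + " ms");
        System.out.println("a+\"B\": " + Math.round(average(STRING_MS)) + " ms");
        System.out.println("saving (percent): " + Math.round(relativeSaving() * 100));
    }
}
```

The rounded averages match the 4608 ms and 5167 ms quoted above, and the relative saving works out to a little under 11%.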
EDIT4:
Thinking about it. This is the loop used to benchmark:
start = System.currentTimeMillis();
for (int i = 0; i < RUNS; i++) {
    a1 = a + 'B';
}
end = System.currentTimeMillis();
System.out.println("a+'B': " + (end - start) + " ms");
So we're really benchmarking not only the one thing we care about but also Java loop performance, object creation performance, and variable assignment performance. So the real speed difference may be even a little bigger.
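If you want to repeat the measurement yourself, a slightly more defensive version of the loop could look like the sketch below. It is not the code that produced the numbers above. The class name, the sink field, and the run count are assumptions made for this example, and a dedicated harness such as JMH would be the more reliable choice for serious measurements:

```java
public class ConcatBenchmarkSketch {

    // Non-final and volatile, so the operand is not a compile-time constant.
    static volatile String a = "A String";

    // Accumulates result lengths, which makes it harder for the JIT
    // to discard the concatenations as dead code.
    static long sink;

    // Times `runs` concatenations of a + 'B' and returns elapsed nanoseconds.
    static long timeCharNanos(int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += (a + 'B').length();
        }
        return System.nanoTime() - start;
    }

    // Times `runs` concatenations of a + "B" and returns elapsed nanoseconds.
    static long timeStringNanos(int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += (a + "B").length();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int runs = 10_000_000; // hypothetical run count, adjust as needed

        // Warmup rounds so the JIT has a chance to compile both loops.
        timeCharNanos(runs);
        timeStringNanos(runs);

        // Alternate the two variants a few times and report milliseconds.
        for (int round = 0; round < 4; round++) {
            System.out.println("a+'B':  " + timeCharNanos(runs) / 1_000_000 + " ms");
            System.out.println("a+\"B\": " + timeStringNanos(runs) / 1_000_000 + " ms");
        }
    }
}
```

Like the original loop, this still measures loop overhead and object creation along with the append call itself, so treat the results as a rough indication only.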