Have you ever thought about the implications of this change in the Java Programming Language?
The String class was conceived as an immutable class (and this decision was
Just a comment about your "StringBuilders and threads" remark: even in multi-threaded programs, it's very rare to want to build up a string across multiple threads. Typically, each thread will have some set of data and create a string from that, often by concatenating multiple strings together. They'll then convert that StringBuilder to a string, and that string can be safely shared among threads.
I don't think I've ever seen a bug due to a StringBuilder being shared between threads.
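To make the pattern above concrete, here is a minimal sketch. The class name, the method names and the "worker" text format are made up for illustration; the point is only that each task keeps its StringBuilder local and hands out the immutable String.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; names and output format are illustrative only.
final class PerThreadBuilderDemo {

    // The StringBuilder is created, used and dropped inside this method,
    // so no other thread can ever observe it.
    static String describe(int workerId, int[] values) {
        StringBuilder sb = new StringBuilder();
        sb.append("worker ").append(workerId).append(':');
        for (int v : values) {
            sb.append(' ').append(v);
        }
        return sb.toString(); // immutable result, safe to share between threads
    }

    // Runs one task per worker and collects the finished strings.
    static List<String> runAll(int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                final int id = i;
                Callable<String> task = () -> describe(id, new int[] {id, id * 2});
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : runAll(3)) {
            System.out.println(line);
        }
    }
}
```

Only the String returned by describe crosses thread boundaries, which is why the unsynchronized builder is enough here.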
Personally I wish StringBuffer didn't exist - it was in the "let's synchronize everything" phase of Java, leading to Vector and Hashtable, which have been almost obsoleted by the unsynchronized ArrayList and HashMap classes from Java 2. It just took a little while longer for the unsynchronized equivalent of StringBuffer to arrive.
So basically:
Use StringBuilder to perform manipulation, usually over a short period.
Avoid StringBuffer unless you really, really need it - and as I say, I can't remember ever seeing a situation where I'd use StringBuffer instead of StringBuilder, when both are available.
StringBuffer was in Java 1.0; it was not any kind of a reaction to slowness or immutability. It's also not in any way faster or better than string concatenation; in fact, the Java compiler compiles
String s1 = s2 + s3;
into something like
String s1 = new StringBuilder(s2).append(s3).toString();
If you don't believe me, try it yourself with a disassembler (javap -c, for example.)
The thing about "StringBuffer is faster than concatenation" refers to repeated concatenation. In that case, explicitly creating your own StringBuffer and using it repeatedly performs better than letting the compiler create many of them.
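A small sketch of the repeated-concatenation case, using StringBuilder rather than StringBuffer since nothing here is shared between threads. The class and method names are hypothetical.

```java
// Hypothetical helper class, for illustration only.
final class RepeatedConcatDemo {

    // Every += produces a new String and copies everything accumulated so far.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String p : parts) {
            result += p;
        }
        return result;
    }

    // One builder is reused for the whole loop; the only String is made at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        // Both approaches give the same text; they differ only in how much copying happens.
        System.out.println(joinWithPlus(parts).equals(joinWithBuilder(parts))); // prints true
    }
}
```

For a handful of parts the difference does not matter; it only shows up when the loop runs many times.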
StringBuilder was introduced in Java 5 for performance reasons, as you say. The reason it makes sense is that StringBuffer/Builder are virtually never shared outside of the method that creates them: 99% of their usage is something like the above, where they're created, used to append a few strings together, then discarded.
Nowadays both StringBuffer and StringBuilder are sort of useless (from a performance point of view). Here is why:
StringBuilder was supposed to be faster than StringBuffer but any sane JVM can optimize away the synchronization. So it was quite a huge miss (and small hit) when it was introduced.
StringBuffer used NOT to copy the char[] when creating the String (in the non-shared variant); however, that was a major source of issues, including leaking a huge char[] for small Strings. In 1.5 they decided that a copy of the char[] must occur every time, and that practically made StringBuffer useless (the sync was there to ensure no thread games could trick the String). The copy conserves memory, though, and ultimately helps the GC (besides the obviously reduced footprint); char[] is usually among the top 3 object types consuming memory.
String.concat was and still is the fastest way to concatenate 2 strings (and 2 only... or possibly 3). Keep that in mind, it does not perform an extra copy of the char[].
Back to the useless part: now any 3rd party code can achieve the same performance as StringBuilder. Even in Java 1.1 I used to have a class named AsycnStringBuffer which did exactly what StringBuilder does now, but it still allocated a larger char[] than StringBuilder. Both StringBuffer and StringBuilder are optimized for small Strings by default, as you can see in the constructor:
StringBuilder(String str) {
    super(str.length() + 16);
    append(str);
}
Thus if the 2nd string is longer than 16 chars, it gets another copy of the underlying char[]. Pretty uncool.
That may be a side effect of an attempt at fitting both the StringBuilder/StringBuffer and the char[] into the same cache line (on x86) on a 32-bit OS... but I don't know for sure.
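The effect of the "length + 16" constructor can be observed through the public capacity() method. This is only a sketch with made-up class and method names; it shows the reallocation happening and one way to sidestep it by pre-sizing the builder.

```java
// Hypothetical demo class; the method names are illustrative.
final class BuilderCapacityDemo {

    // Capacity right after construction from a String (string length plus 16).
    static int initialCapacity(String s) {
        return new StringBuilder(s).capacity();
    }

    // Appends 'second' and reports whether the backing storage had to grow.
    static boolean growsOnAppend(String first, String second) {
        StringBuilder sb = new StringBuilder(first);
        int before = sb.capacity();
        sb.append(second);
        return sb.capacity() > before;
    }

    // Pre-sizing for the final length avoids the intermediate reallocation.
    static String concatPresized(String first, String second) {
        StringBuilder sb = new StringBuilder(first.length() + second.length());
        return sb.append(first).append(second).toString();
    }

    public static void main(String[] args) {
        System.out.println(initialCapacity("abc"));                    // 3 + 16
        System.out.println(growsOnAppend("abc", "0123456789abcdef"));  // 16 chars: still fits
        System.out.println(growsOnAppend("abc", "0123456789abcdefg")); // 17 chars: must grow
        System.out.println(concatPresized("abc", "0123456789abcdefg"));
    }
}
```

Whether the extra copy matters in practice depends on how hot the code path is; the sketch only makes the reallocation visible.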
As for the remark about hours of debugging, etc.: use your judgment. I personally do not recall ever having any issues with string operations, aside from implementing a rope-like structure for the SQL generator of a JDO implementation.
Edit: Below I illustrate what the Java designers didn't do to make String operations faster. Please note that the class is intended for the java.lang package, and it can be put there only by adding it to the bootstrap classpath. However, even if not put there (the difference is a single line of code!), it'd still be faster than StringBuilder. Shocking? The class would have made string1+string2+... a lot better than using StringBuilder, but well...
package java.lang;

public class FastConcat {

    public static String concat(String s1, String s2) {
        s1 = String.valueOf(s1); // null checks
        s2 = String.valueOf(s2);
        return s1.concat(s2);
    }

    public static String concat(String s1, String s2, String s3) {
        s1 = String.valueOf(s1); // null checks
        s2 = String.valueOf(s2);
        s3 = String.valueOf(s3);
        int len = s1.length() + s2.length() + s3.length();
        char[] c = new char[len];
        int idx = 0;
        idx = copy(s1, c, idx);
        idx = copy(s2, c, idx);
        idx = copy(s3, c, idx);
        return newString(c);
    }

    public static String concat(String s1, String s2, String s3, String s4) {
        s1 = String.valueOf(s1); // null checks
        s2 = String.valueOf(s2);
        s3 = String.valueOf(s3);
        s4 = String.valueOf(s4);
        int len = s1.length() + s2.length() + s3.length() + s4.length();
        char[] c = new char[len];
        int idx = 0;
        idx = copy(s1, c, idx);
        idx = copy(s2, c, idx);
        idx = copy(s3, c, idx);
        idx = copy(s4, c, idx);
        return newString(c);
    }

    private static int copy(String s, char[] c, int idx) {
        s.getChars(c, idx);
        return idx + s.length();
    }

    private static String newString(char[] c) {
        return new String(0, c.length, c);
        // return String.copyValueOf(c); // if not in java.lang
    }
}
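For readers who want to try the idea without touching the bootstrap classpath, here is a rough sketch of the three-argument variant using only public String methods. The class name PortableConcat is made up, and unlike the java.lang version it pays for one extra copy when the String is created from the array.

```java
// Hypothetical stand-alone variant; not part of the JDK.
final class PortableConcat {

    static String concat(String s1, String s2, String s3) {
        s1 = String.valueOf(s1); // null becomes "null", like the + operator
        s2 = String.valueOf(s2);
        s3 = String.valueOf(s3);
        // Allocate the exact final length once, so nothing has to grow.
        char[] c = new char[s1.length() + s2.length() + s3.length()];
        int idx = 0;
        idx = copy(s1, c, idx);
        idx = copy(s2, c, idx);
        copy(s3, c, idx);
        return new String(c); // public constructor: copies the array once more
    }

    // Copies all of s into dst starting at idx and returns the next free index.
    private static int copy(String s, char[] dst, int idx) {
        s.getChars(0, s.length(), dst, idx);
        return idx + s.length();
    }

    public static void main(String[] args) {
        System.out.println(concat("foo", null, "bar")); // prints foonullbar
    }
}
```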
I tried the same thing on an XP machine. The StringBuilder IS somewhat faster, but if you reverse the order of the runs, or make several runs, you'll notice that the "almost factor two" in the results turns into something like a 10% advantage:
StringBuffer build & output duration= 4282,000000 µs
StringBuilder build & output duration= 4226,000000 µs
StringBuffer build & output duration= 4439,000000 µs
StringBuilder build & output duration= 3961,000000 µs
StringBuffer build & output duration= 4801,000000 µs
StringBuilder build & output duration= 4210,000000 µs
For your kind of test the JVM will NOT help out. I had to limit the number of runs and elements just to get ANY result from a "String only" test.
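Since the order of the runs clearly influences the numbers, a harness that warms up first and then alternates which variant goes first gives a fairer picture. The following is only a sketch: the class name, the round count and the run count are arbitrary assumptions, and a dedicated tool such as JMH would be the more rigorous choice.

```java
import java.util.function.Supplier;

// Hypothetical harness; all names and constants are illustrative.
final class AlternatingBench {

    static String sink; // keeps results reachable so the work is not optimized away

    static String buildWithBuilder(int elements) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<data/>");
        }
        return sb.append("</root>").toString();
    }

    static String buildWithBuffer(int elements) {
        StringBuffer sb = new StringBuffer("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<data/>");
        }
        return sb.append("</root>").toString();
    }

    // Total elapsed nanoseconds for 'runs' invocations of the task.
    static long timeNanos(Supplier<String> task, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink = task.get();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        final int elements = 50_000;
        final int runs = 200;
        Supplier<String> builder = () -> buildWithBuilder(elements);
        Supplier<String> buffer = () -> buildWithBuffer(elements);

        // Warm-up: give the JIT a chance to compile both variants before measuring.
        timeNanos(builder, runs);
        timeNanos(buffer, runs);

        for (int round = 0; round < 4; round++) {
            boolean builderFirst = (round % 2 == 0); // alternate the order each round
            long first = timeNanos(builderFirst ? builder : buffer, runs);
            long second = timeNanos(builderFirst ? buffer : builder, runs);
            long builderNs = builderFirst ? first : second;
            long bufferNs = builderFirst ? second : first;
            System.out.printf("round %d: StringBuilder %d us/run, StringBuffer %d us/run%n",
                    round, builderNs / runs / 1000, bufferNs / runs / 1000);
        }
    }
}
```

The absolute numbers will differ per machine and JVM; what matters is whether the gap stays stable across rounds.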
I decided to put the options to the test with a simple XML composition exercise. Testing was done on a 2.7 GHz i5 with 16 GB DDR3 RAM, for those wishing to replicate the results.
Code:
private int testcount = 1000;
private int elementCount = 50000;
private String output; // holds the last result so the built string is actually used

public void testStringBuilder() {
    long total = 0;
    int counter = 0;
    while (counter++ < testcount) {
        total += doStringBuilder();
    }
    // integer division: the average is truncated to whole microseconds
    float f = (total/testcount)/1000;
    System.out.printf("StringBuilder build & output duration= %f µs%n%n", f);
}

private long doStringBuilder() {
    long start = System.nanoTime();
    StringBuilder buffer = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    buffer.append("<root>");
    for (int i = 0; i < elementCount; i++) {
        buffer.append("<data/>");
    }
    buffer.append("</root>");
    //System.out.println(buffer.toString());
    output = buffer.toString();
    long end = System.nanoTime();
    return end - start;
}

public void testStringBuffer() {
    long total = 0;
    int counter = 0;
    while (counter++ < testcount) {
        total += doStringBuffer();
    }
    // integer division: the average is truncated to whole microseconds
    float f = (total/testcount)/1000;
    System.out.printf("StringBuffer build & output duration= %f µs%n%n", f);
}

private long doStringBuffer() {
    long start = System.nanoTime();
    StringBuffer buffer = new StringBuffer("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    buffer.append("<root>");
    for (int i = 0; i < elementCount; i++) {
        buffer.append("<data/>");
    }
    buffer.append("</root>");
    //System.out.println(buffer.toString());
    output = buffer.toString();
    long end = System.nanoTime();
    return end - start;
}
Results:
On OSX machine:
StringBuilder build & output duration= 1047.000000 µs
StringBuffer build & output duration= 1844.000000 µs
On Win7 machine:
StringBuilder build & output duration= 1869.000000 µs
StringBuffer build & output duration= 2122.000000 µs
So it looks like the performance enhancement might be platform specific, dependent on how the JVM implements synchronisation.
References:
Use of System.nanoTime() has been covered here -> Is System.nanoTime() completely useless? and here -> How do I time a method's execution in Java?.
Source for StringBuilder & StringBuffer here -> http://www.java2s.com/Open-Source/Java-Document/6.0-JDK-Core/lang/java.lang.htm
Good overview of synchronising here -> http://www.javaworld.com/javaworld/jw-07-1997/jw-07-hood.html?page=1