I have a StringBuilder object that needs to be trimmed (i.e. all whitespace chars /u0020 and below removed from either end).
I can\'t seem to find a method in string
I've used Zaven's analysis approach and StringBuilder's delete(start, end) method which performs far better than the deleteCharAt(index) approach, but slightly worse than the substring() approach. This method also uses the array copy, but array copy is called far fewer times (only twice in the worst case). In addition, this avoids creating multiple instances of intermediate Strings in case trim() is called repeatedly on the same StringBuilder object.
public class Main {
public static String trimOriginal(StringBuilder sb) {
return sb.toString().trim();
}
public static String trimDeleteRange(StringBuilder sb) {
int first, last;
for (first = 0; first < sb.length(); first++)
if (!Character.isWhitespace(sb.charAt(first)))
break;
for (last = sb.length(); last > first; last--)
if (!Character.isWhitespace(sb.charAt(last - 1)))
break;
if (first == last) {
sb.delete(0, sb.length());
} else {
if (last < sb.length()) {
sb.delete(last, sb.length());
}
if (first > 0) {
sb.delete(0, first);
}
}
return sb.toString();
}
public static String trimSubstring(StringBuilder sb) {
int first, last;
for (first = 0; first < sb.length(); first++)
if (!Character.isWhitespace(sb.charAt(first)))
break;
for (last = sb.length(); last > first; last--)
if (!Character.isWhitespace(sb.charAt(last - 1)))
break;
return sb.substring(first, last);
}
public static void main(String[] args) {
runAnalysis(1000);
runAnalysis(10000);
runAnalysis(100000);
runAnalysis(200000);
runAnalysis(500000);
runAnalysis(1000000);
}
private static void runAnalysis(int stringLength) {
System.out.println("Main:runAnalysis(string-length=" + stringLength + ")");
long originalTime = 0;
long deleteTime = 0;
long substringTime = 0;
for (int i = 0; i < 200; i++) {
StringBuilder temp = new StringBuilder();
char[] options = {' ', ' ', ' ', ' ', 'a', 'b', 'c', 'd'};
for (int j = 0; j < stringLength; j++) {
temp.append(options[(int) ((Math.random() * 1000)) % options.length]);
}
String testStr = temp.toString();
StringBuilder sb1 = new StringBuilder(testStr);
StringBuilder sb2 = new StringBuilder(testStr);
StringBuilder sb3 = new StringBuilder(testStr);
long timer1 = System.currentTimeMillis();
trimOriginal(sb1);
originalTime += System.currentTimeMillis() - timer1;
long timer2 = System.currentTimeMillis();
trimDeleteRange(sb2);
deleteTime += System.currentTimeMillis() - timer2;
long timer3 = System.currentTimeMillis();
trimSubstring(sb3);
substringTime += System.currentTimeMillis() - timer3;
}
System.out.println(" original: " + originalTime + " ms");
System.out.println(" delete-range: " + deleteTime + " ms");
System.out.println(" substring: " + substringTime + " ms");
}
}
Output:
Main:runAnalysis(string-length=1000)
original: 0 ms
delete-range: 4 ms
substring: 0 ms
Main:runAnalysis(string-length=10000)
original: 4 ms
delete-range: 9 ms
substring: 4 ms
Main:runAnalysis(string-length=100000)
original: 22 ms
delete-range: 33 ms
substring: 43 ms
Main:runAnalysis(string-length=200000)
original: 57 ms
delete-range: 93 ms
substring: 110 ms
Main:runAnalysis(string-length=500000)
original: 266 ms
delete-range: 220 ms
substring: 191 ms
Main:runAnalysis(string-length=1000000)
original: 479 ms
delete-range: 467 ms
substring: 426 ms
You get two strings, but I'd expect the data to be only allocated once. Since Strings in Java are immutable, I'd expect the trim implementation to give you an object that shares the same character data, but with different start- and end indices. At least that's what the substr method does. So, anything you try to optimise this most certainly will have the opposite effect, since you add overhead that is not needed.
Just step through the trim() method with your debugger.