I have an old piece of code that performs find and replace of tokens within a string.
It receives a map of from and to pairs, iterates over
When it comes to replaceAll("[,. ]*", ""), it's not that big of a surprise, since it relies on regular expressions. The regex engine creates an automaton which it runs over the input. Some overhead is expected.
The second approach (replace(",", "")...) also uses regular expressions internally. Here, however, the given pattern is compiled using Pattern.LITERAL, so the regular expression overhead should be negligible. In this case the slowdown is probably due to the fact that Strings are immutable (however small a change you make, you create a new string) and thus not as efficient as StringBuffers, which manipulate the string in place.
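To illustrate the in-place idea, here is a minimal sketch of a single-pass version, assuming the goal is to drop commas, periods and spaces as in the question. The class and method names (CharFilter, stripPunctuation) are made up for illustration, and it uses StringBuilder, the unsynchronized sibling of StringBuffer.

```java
public class CharFilter {
    // Removes ',', '.' and ' ' in one pass. Kept characters are appended
    // to a single StringBuilder, so no intermediate Strings are created.
    static String stripPunctuation(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != ',' && c != '.' && c != ' ') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripPunctuation("a, b. c")); // prints "abc"
    }
}
```

Chaining three replace() calls would instead build up to three intermediate Strings, which is the cost described above.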
As I noted in a comment, [,. ]* matches the empty String "". So every "gap" between characters matches the pattern. You only notice it in the performance, because you are replacing a lot of "" with "".
Try doing this:
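Pattern p = Pattern.compile("[,. ]*");
// quoteReplacement is needed because a bare $ is a group reference in a replacement string
System.out.println(p.matcher("Hello World!").replaceAll(Matcher.quoteReplacement("$$$")));
It returns:
$$$H$$$e$$$l$$$l$$$o$$$$$$W$$$o$$$r$$$l$$$d$$$!$$$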
No wonder it is slower than doing it "by hand"! You should try with [,. ]+
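To make the difference between the two quantifiers concrete, here is a small sketch that counts how many matches each pattern finds in "Hello World!". The class and method names (MatchCount, countMatches) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchCount {
    // Counts how many times the regex matches in the input,
    // including empty matches.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String input = "Hello World!";
        // '*' also matches the empty string at every position
        System.out.println(countMatches("[,. ]*", input)); // prints 13
        // '+' only matches the single space
        System.out.println(countMatches("[,. ]+", input)); // prints 1
    }
}
```

Each of those matches triggers a replacement, so with * the engine does thirteen replacements where one would do.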
While using regular expressions has some performance cost, it should not be this bad.
Note that using String.replaceAll() will compile the regular expression each time you call it.
You can avoid that by explicitly using a Pattern object:
Pattern p = Pattern.compile("[,. ]+");
// repeat only the following part:
String output = p.matcher(input).replaceAll("");
Note also that using + instead of * avoids replacing empty strings and therefore might also speed up the process.
replace and replaceAll use regex internally, which in most cases has a serious performance impact compared to e.g. StringUtils.replace(..).
String.replaceAll():
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
String.replace() also uses Pattern.compile underneath, with the LITERAL flag:
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL)
            .matcher(this)
            .replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
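For comparison, a regex-free literal replace can be sketched with indexOf and a StringBuilder. This is not the Commons Lang source, only an illustration of the general idea; the class and method names (LiteralReplace, replaceLiteral) are hypothetical.

```java
public class LiteralReplace {
    // Replaces every occurrence of target with replacement,
    // using plain substring search instead of a compiled Pattern.
    static String replaceLiteral(String text, String target, String replacement) {
        if (target.isEmpty()) {
            return text; // assumption: an empty target means "nothing to replace"
        }
        int idx = text.indexOf(target);
        if (idx < 0) {
            return text; // no match, so no new String is built
        }
        StringBuilder sb = new StringBuilder(text.length());
        int start = 0;
        while (idx >= 0) {
            // copy the untouched part, then the replacement
            sb.append(text, start, idx).append(replacement);
            start = idx + target.length();
            idx = text.indexOf(target, start);
        }
        sb.append(text, start, text.length()); // copy the tail
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteral("a.b.c", ".", ""));  // prints "abc"
        System.out.println(replaceLiteral("$1 + $1", "$1", "x")); // prints "x + x"
    }
}
```

Because nothing here is interpreted as a pattern, characters such as . or $ need no quoting in either the target or the replacement.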
Also see Replace all occurrences of substring in a string - which is more efficient in Java?