I need to remove some substrings in strings (in a large dataset). The substrings often contain special characters, like these: ., ^, /,... and replaceAll() would treat them as s
Just use String.replace(String, String), not replaceAll. String.replace doesn't treat its arguments as regexes.
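To make the difference concrete, here is a minimal sketch. The class name LiteralReplaceDemo and the helper removeLiteral are made up for illustration; only String.replace and String.replaceAll come from the JDK.

```java
public class LiteralReplaceDemo {
    // Hypothetical helper: removes every literal occurrence of target.
    // String.replace never interprets target as a regular expression.
    public static String removeLiteral(String input, String target) {
        return input.replace(target, "");
    }

    public static void main(String[] args) {
        String s = "a.b.c";
        // Literal removal: only the dots disappear.
        System.out.println(removeLiteral(s, "."));   // prints "abc"
        // Regex removal: "." matches any character, so everything is removed.
        System.out.println(s.replaceAll(".", ""));   // prints an empty line
    }
}
```

For a large dataset you would call the helper once per string; there is no pattern to escape or compile.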
You can match literally. For instance, if we want to match "<.]}^", we can do:
Pattern pat = Pattern.compile("<.]}^", Pattern.LITERAL);
and use that pattern.
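Using that pattern could look roughly like this. It is only a sketch: the class name LiteralPatternDemo and the method strip are invented for the example, and it assumes you want to delete the match rather than replace it with something else.

```java
import java.util.regex.Pattern;

public class LiteralPatternDemo {
    // Compiled once and reused; Pattern.LITERAL makes every character
    // in the pattern string match itself.
    private static final Pattern TARGET = Pattern.compile("<.]}^", Pattern.LITERAL);

    // Hypothetical helper: deletes every occurrence of the literal "<.]}^".
    public static String strip(String input) {
        return TARGET.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("keep<.]}^this"));  // prints "keepthis"
    }
}
```

Compiling the pattern once and reusing it may help when the same substring has to be removed from many strings.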
You can also use backslashes to escape it. Note that the string literal itself needs backslashes, so escaping a single dot will take two backslashes, as follows:
Pattern pat=Pattern.compile("\\.");
The Java compiler turns the two backslashes in the string literal into a single backslash, and the regex parser then reads that backslash as escaping the dot.
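A small sketch of the escaped version in use. EscapedDotDemo and removeDots are names I made up; the pattern string is the one from above.

```java
import java.util.regex.Pattern;

public class EscapedDotDemo {
    // In Java source, "\\." is a two-character string: a backslash and a dot.
    // As a regex, that backslash escapes the dot, so it matches only '.'.
    private static final Pattern DOT = Pattern.compile("\\.");

    // Hypothetical helper: removes every literal dot from the input.
    public static String removeDots(String input) {
        return DOT.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println("\\.".length());       // prints 2
        System.out.println(removeDots("1.2.3"));  // prints "123"
    }
}
```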
There are 2 methods named replace in the String class that perform replacement without treating their parameters as regular expressions.
One replace method replaces one char with another char.
The other replace method replaces a CharSequence (usually a String) with another CharSequence.
Quoting the Javadocs from the second replace method:
Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence.
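Both overloads side by side, as a rough sketch. The class and method names (ReplaceOverloadsDemo, slashesToDashes, dropSequence) are hypothetical; the StringBuilder argument is just there to show that any CharSequence is accepted.

```java
public class ReplaceOverloadsDemo {
    // replace(char, char): swaps one character for another.
    public static String slashesToDashes(String input) {
        return input.replace('/', '-');
    }

    // replace(CharSequence, CharSequence): the target is matched literally
    // and may be any CharSequence, not only a String.
    public static String dropSequence(String input, CharSequence target) {
        return input.replace(target, "");
    }

    public static void main(String[] args) {
        System.out.println(slashesToDashes("2020/01/02"));                    // prints "2020-01-02"
        System.out.println(dropSequence("a[.]b", new StringBuilder("[.]")));  // prints "ab"
    }
}
```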
Is there other functions to do the "replace"
Yes, it is called replace :) The main difference between it and replaceAll is that it treats regex special characters literally, as if they were escaped.
BTW, if you want to escape regex special characters in a string, you can use yourString = Pattern.quote(yourString), or surround it with "\\Q" and "\\E". To escape only some special characters, you can add "\\" before them, like \\., or surround them with "[" and "]", like [.].
Just use String.replace(). It does the same job, but it handles the special characters internally, so you don't have to worry about regex.
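If you do need to stay with replaceAll, quoting the target first could look like this. It is a sketch only; QuoteDemo and removeWithQuote are invented names, while Pattern.quote is the real JDK method.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: quotes the target so replaceAll matches it literally.
    public static String removeWithQuote(String input, String target) {
        return input.replaceAll(Pattern.quote(target), "");
    }

    public static void main(String[] args) {
        // Pattern.quote wraps the text in \Q ... \E.
        System.out.println(Pattern.quote("a.b"));            // prints \Qa.b\E
        System.out.println(removeWithQuote("x.^/y", ".^/")); // prints "xy"
        // A character class also turns the dot into a literal.
        System.out.println("a.b".replaceAll("[.]", ""));     // prints "ab"
    }
}
```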