I am thinking about using String.replaceAll() to remove certain characters in my string. It is unclear which characters are going to be removed (i.e. which char
The example code at your link is good enough, and you can add other valid characters of your choice to it. You can shorten the code with a regular expression, though. Take a look at Abdullah's code, or see link1, link2, link3 for more.
I think you are looking for code like this, which solves your problem without any explicit looping:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class StripChars {
        public static void main(String[] args) {
            // prints: Just to clarify I will have strings of varying lengths
            System.out.println(
                replace("Just to clarify, I will have strings of varying lengths.",
                        ",."));
            // prints: Solution to my problem on stackoverflow will cost me 0
            System.out.println(
                replace("Solution to my problem on stackoverflow will cost me $0.",
                        ".$"));
        }

        static String replace(String line, String charsToBeReplaced) {
            // Match each character of charsToBeReplaced on its own.
            Pattern p = Pattern.compile("(.{1})");
            Matcher m = p.matcher(charsToBeReplaced);
            // Turn each character c into \c| so the result is a regex alternation.
            // Only safe for non-alphanumeric characters: a backslash before a
            // letter or digit has a different meaning in a regex.
            return line.replaceAll(m.replaceAll("\\\\$1\\|"), "");
        }
    }
To take care of special regex characters (metacharacters) in the input, the replace method first puts a \ (backslash) before each character and a | (pipe) after each character. So an input of ",." becomes the regex \,|\.| (written "\\,|\\.|" as a Java string literal). Once that is done, the replacement is simple: every matching character is replaced by an empty string.
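The escaping step described above relies on a backslash being harmless in front of punctuation. In front of a letter or digit it is not (for example, \d means "any digit"), so that approach should only be fed non-alphanumeric characters. Here is a minimal sketch of one alternative that lets Pattern.quote do the escaping, so letters and digits are also safe. The class name QuotedStrip and the method name stripQuoted are made up for illustration, and the code assumes Java 8 or later.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class QuotedStrip {
    // Hypothetical helper: removes every character of charsToRemove from line.
    static String stripQuoted(String line, String charsToRemove) {
        if (charsToRemove.isEmpty()) {
            return line; // nothing to remove
        }
        // Quote each code point separately, then join the pieces with '|'.
        String regex = charsToRemove.codePoints()
                .mapToObj(cp -> Pattern.quote(new String(Character.toChars(cp))))
                .collect(Collectors.joining("|"));
        return line.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        // prints: Just to clarify I will
        System.out.println(stripQuoted("Just to clarify, I will.", ",."));
        // prints: a12c3
        System.out.println(stripQuoted("a1-b2-c3", "b-"));
    }
}
```

The second call shows the case the backslash trick cannot handle: the letter b is removed as a literal character instead of being read as an escape sequence.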
Not used in this solution, but here is a pattern that detects the presence of a special regex character in Java:
    Pattern metachars = Pattern.compile(
        "^.*?(\\(|\\[|\\{|\\^|\\-|\\$|\\||\\]|\\}|\\)|\\?|\\*|\\+|\\.).*?$");
The Guava method is interesting, though I'm not sure why they use the "spread" variable. Since they use that, a subtraction operation is needed for each shift. I benchmarked a few versions (including a simple hand-coded shifter), and you can find the writeup here:
http://thushw.blogspot.com/2013/06/java-remove-specified-characters-from.html
You might want to start by specifying which characters you WANT to keep. Try something like:
    "mystring".replaceAll("[^a-zA-Z]", "")
to keep only letters.
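As a rough sketch of that keep-list idea: the first method below uses the exact pattern from this answer, which keeps ASCII letters only. The second is an assumption on my part, not something the answer proposes: it swaps in \p{L} so that accented and non-Latin letters survive as well. The class and method names are made up for illustration.

```java
class KeepLetters {
    // Keeps only the ASCII letters a-z and A-Z.
    static String keepAsciiLetters(String s) {
        return s.replaceAll("[^a-zA-Z]", "");
    }

    // Variant (assumption): keeps any Unicode letter by using the \p{L} category.
    static String keepUnicodeLetters(String s) {
        return s.replaceAll("[^\\p{L}]", "");
    }

    public static void main(String[] args) {
        // prints: HelloWorld
        System.out.println(keepAsciiLetters("Hello, World! 123"));
        // prints: HelloWorld
        System.out.println(keepUnicodeLetters("Hello, World! 123"));
    }
}
```

Which one fits depends on whether the input can contain letters outside a-z; with the ASCII pattern, a character such as é is stripped along with the punctuation.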
This is one of those cases where regular expressions are probably not a good idea. You're going to end up writing more special code to get around regex than if you just take the simple approach and iterate over the characters. You also risk overlooking some cases that might surface as a bug later.
If you're concerned about performance, regex is likely to be much slower. If you look through the code or profile it, you'll see that regex has to parse and compile a pattern, run through the matching logic, and then apply your replacement. All of that creates a lot of objects, which can be expensive if you do this frequently enough.
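If you do stay with regex, part of that cost can be avoided. String.replaceAll compiles its pattern on every call, so one common mitigation is to compile the Pattern once and reuse it. The sketch below assumes a fixed set of characters to strip (comma, period, dollar sign); the class and field names are made up for illustration. It does not settle the performance question, which only a benchmark on your own data can answer.

```java
import java.util.regex.Pattern;

class PrecompiledStrip {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    // Inside a character class, '.' and '$' are plain literals.
    private static final Pattern UNWANTED = Pattern.compile("[,.$]");

    static String strip(String input) {
        // The matcher is still created per call, but the pattern is not recompiled.
        return UNWANTED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: will cost me 0
        System.out.println(strip("will cost me $0."));
    }
}
```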
I'd implement what you found at that link a little differently, though. Building the result in a StringBuilder saves unnecessary String allocations without adding any complexity:
    public static String stripChars(String input, String strip) {
        StringBuilder result = new StringBuilder();
        for (char c : input.toCharArray()) {
            // Keep the character only if it is not in the strip list.
            if (strip.indexOf(c) == -1) {
                result.append(c);
            }
        }
        return result.toString();
    }
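Here is a small variation on the same loop, offered as a sketch and not as a measured improvement. It presizes the StringBuilder and reads characters with charAt instead of copying the input through toCharArray(). The wrapper class name StripCharsLoop is made up so that the example compiles on its own.

```java
class StripCharsLoop {
    static String stripChars(String input, String strip) {
        // The result can never be longer than the input, so size the buffer up front.
        StringBuilder result = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Keep the character only if it is not in the strip list.
            if (strip.indexOf(c) < 0) {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // prints: Just to clarify I will
        System.out.println(stripChars("Just to clarify, I will.", ",."));
        // prints: will cost me 0
        System.out.println(stripChars("will cost me $0.", ".$"));
    }
}
```

Because nothing here is interpreted as a regex, characters such as $ or . need no escaping at all.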
I think this can be done with regular expressions. First, we know that [a-zA-Z0-9] and $%! are valid characters in the string, so we use the regex "[^a-zA-Z0-9$%!]" to strip out all other, invalid characters.
See http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html for details on Java's Pattern class.
Next, we can use mystring.replaceAll(String regex, String replacement).
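Putting those two steps together, a minimal sketch might look like this. The class name KeepValid and the method name keepValid are made up for illustration; the regex is the one given in this answer.

```java
class KeepValid {
    // Removes every character that is not a letter, a digit, or one of $ % !
    // Inside a character class, '$' is a literal and needs no escaping.
    static String keepValid(String s) {
        return s.replaceAll("[^a-zA-Z0-9$%!]", "");
    }

    public static void main(String[] args) {
        // prints: pa$$w0rd!1100%
        System.out.println(keepValid("pa$$w0rd! #1 (100%)"));
    }
}
```

Note that spaces are not on the keep-list, so they are removed too; add a space inside the brackets if you want to preserve them.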
P.S. RegexPlanet has an online regular expression test page.