Remove all characters from string which are not on whitelist

早过忘川 submitted on 2019-12-09 02:44:37

Question


I am trying to write Java code that removes all unwanted characters from a string and keeps only the whitelisted ones.

Example:

String[] whitelist = {"a", "b", "c"..."z", "0"..."9", "[", "]",...}

I want to keep only letters (lower and upper case) and digits, plus a few other characters I would add later. My idea was to run a for() loop over every character in the string and replace it with an empty string if it isn't on the whitelist.

But that doesn't seem like a good solution. Maybe it could be done with a pattern (regex)? Thanks.


Answer 1:


Yes, you can use String.replaceAll which takes a regex:

String input = "BAD good {} []";
String output = input.replaceAll("[^a-z0-9\\[\\]]", "");
System.out.println(output); // good[]

Or in Guava you could use a CharMatcher:

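import com.google.common.base.CharMatcher;

// Matches any character that is on the whitelist
CharMatcher matcher = CharMatcher.inRange('a', 'z')
                          .or(CharMatcher.inRange('0', '9'))
                          .or(CharMatcher.anyOf("[]"));
String input = "BAD good {} []";
String output = matcher.retainFrom(input); // good[]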

Both examples handle only lower case, to keep the demonstration simple. To include upper-case letters, use "[^A-Za-z0-9\\[\\]]" in the regex (plus any other symbols you want); for the CharMatcher, or it with CharMatcher.inRange('A', 'Z').
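
As a minimal sketch of the upper-case variant described above, the regex can be compiled once and reused, which avoids recompiling the pattern on every replaceAll call. The class name WhitelistRegexDemo and the method keepAllowed are hypothetical names chosen for illustration, and the whitelist (A-Z, a-z, 0-9, '[' and ']') is only an assumed example:

```java
import java.util.regex.Pattern;

final class WhitelistRegexDemo {
    // Negated character class: matches any single character NOT on the whitelist.
    // Assumed whitelist for illustration: A-Z, a-z, 0-9, '[' and ']'.
    private static final Pattern NOT_ALLOWED = Pattern.compile("[^A-Za-z0-9\\[\\]]");

    // Hypothetical helper: removes every character that is not on the whitelist.
    static String keepAllowed(String input) {
        return NOT_ALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(keepAllowed("BAD good {} []")); // BADgood[]
    }
}
```

To allow more characters, add them inside the brackets; characters with a special meaning in a character class, such as ']', '\', '^' or '-', may need escaping depending on their position.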




Answer 2:


You could try and match everything that is not in your whitelist and replace it with an empty string:

String in = "asng $%& 123";
// This assumes your whitelist is word characters (\w: letters, digits, underscore) and whitespace (\s); adapt as needed.
System.out.println(in.replaceAll("[^\\w\\s]+", ""));
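
One detail worth checking before relying on the pattern in this answer: by default, Java's \w matches ASCII letters, digits and the underscore, and \s matches tabs and newlines as well as spaces. The sketch below contrasts it with a stricter, explicitly listed whitelist. The class and method names are hypothetical, chosen only for illustration:

```java
final class WordClassDemo {
    // Same pattern as in the answer: keeps word characters and whitespace.
    static String keepWordAndSpace(String input) {
        return input.replaceAll("[^\\w\\s]+", "");
    }

    // Stricter variant (an assumption about what a whitelist might need):
    // keeps ASCII letters and digits only, so '_' and whitespace are removed.
    static String keepAlphanumeric(String input) {
        return input.replaceAll("[^A-Za-z0-9]+", "");
    }

    public static void main(String[] args) {
        String in = "a_b $%& 1\t2";
        System.out.println(keepWordAndSpace(in));  // underscore, spaces and tab survive
        System.out.println(keepAlphanumeric(in));  // ab12
    }
}
```

If the underscore or tab characters are not wanted, spelling out the allowed ranges explicitly, as in the second method, is probably the safer choice.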


Source: https://stackoverflow.com/questions/15249047/remove-all-characters-from-string-which-are-not-on-whitelist
