Regex for special characters in Java

Asked by 说谎, 2021-01-06 10:09
public static final String specialChars1= "\\W\\S";
String str2 = str1.replaceAll(specialChars1, "").replace(" ", "+");

public static final String speci         


        
6 Answers
  • 2021-01-06 10:25

    This worked for me:

    String result = str.replaceAll("[^\\dA-Za-z ]", "").replaceAll("\\s+", "+");

    For this input string:

    /-+!@#$%^&())";:[]{}\ |wetyk 678dfgh

    It yielded this result:

    +wetyk+678dfgh

  • 2021-01-06 10:25

    You can use a regex like this:

    [<#![CDATA[¢<(+|!$*);¬/¦,%_>?:#="~{@}\]]]#>]`

    Remove the "#" at the start and at the end of the expression.

    Regards

  • 2021-01-06 10:25

    @npinti

    Using "\w" is almost the same as "\dA-Za-z"; the only difference is that "\w" also matches the underscore.

    This worked for me:

    String result = str.replaceAll("[^\\w ]", "").replaceAll("\\s+", "+");
    
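The difference between the two character classes in the answers above shows up on input containing an underscore. A minimal sketch, where the class and method names are made up for illustration:

```java
class WordClassDemo {
    // Explicit class: keeps digits, ASCII letters, and the space character.
    static String stripExplicit(String s) {
        return s.replaceAll("[^\\dA-Za-z ]", "");
    }

    // \w is [a-zA-Z_0-9] by default, so the underscore is kept as well.
    static String stripWord(String s) {
        return s.replaceAll("[^\\w ]", "");
    }

    public static void main(String[] args) {
        String input = "snake_case name!";
        System.out.println(stripExplicit(input)); // snakecase name
        System.out.println(stripWord(input));     // snake_case name
    }
}
```

If underscores count as special characters for your purposes, the explicit class is probably the one you want.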
  • 2021-01-06 10:40

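    replaceAll expects a regex, so list the characters inside a character class. In Java an unescaped "[" inside a class opens a nested class, so "[", "]" and "\" all need escaping:

    public static final String specialChars2 = "[`~!@#$%^&*()_+\\[\\]\\\\;',./{}|:\"<>?]";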
    
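Hand-escaping a long list of symbols is easy to get wrong, so one option is to build the character class from a plain string. This is only a sketch: charClassOf is a hypothetical helper, not a JDK method, and it assumes a non-empty string of ASCII characters. It relies on the documented rule that java.util.regex accepts a backslash before any non-alphabetic character.

```java
class SpecialCharClass {
    // Hypothetical helper: builds a regex character class that matches
    // exactly the characters in `chars` (assumed non-empty, ASCII only).
    static String charClassOf(String chars) {
        StringBuilder sb = new StringBuilder("[");
        for (char c : chars.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                // Letters and digits must not be escaped (\d, \w, \1 have meanings).
                sb.append(c);
            } else {
                // A backslash before a non-alphabetic character is always a literal.
                sb.append('\\').append(c);
            }
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        String specials = "`~!@#$%^&*()_+[]\\;',./{}|:\"<>?";
        String regex = charClassOf(specials);
        System.out.println("a[b]c\\d!".replaceAll(regex, "")); // abcd
    }
}
```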
  • 2021-01-06 10:47

    The problem with your first regex is that "\W\S" matches a sequence of two characters: a non-word character (anything other than a letter, digit, or underscore) followed by a non-whitespace character.

    What you mean is "[^\w\s]", which matches a single character that is neither a word character nor whitespace. (We can't use "[\W\S]", because that matches a character that is not a word character OR is not whitespace, which is every character.)

    The second regex is a problem because you are using reserved characters without escaping them. You can enclose them in [], where most (but not all) characters lose their special meaning, but the result looks messy and you have to check that you haven't missed any punctuation.

    Example:

    String sequence = "qwe 123 :@~ ";
    
    String withoutSpecialChars = sequence.replaceAll("[^\\w\\s]", "");
    
    String spacesAsPluses = withoutSpecialChars.replaceAll("\\s", "+");
    
    System.out.println("without special chars: '"+withoutSpecialChars+ '\'');
    System.out.println("spaces as pluses: '"+spacesAsPluses+'\'');
    

    This outputs:

    without special chars: 'qwe 123  '
    spaces as pluses: 'qwe+123++'
    

    If you want to collapse runs of whitespace into a single +, use "\s+" as your regex instead (remember to escape the backslash in the Java string).

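To see the three patterns discussed in the answer above side by side, here is a small sketch that runs each one on the same input. The class and method names are illustrative only.

```java
class NegationDemo {
    // Removes every match of `regex` from `input`.
    static String apply(String regex, String input) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String input = "qwe 123 :@~ ";
        // \W\S: a non-word character immediately followed by a non-whitespace one.
        System.out.println("'" + apply("\\W\\S", input) + "'");     // 'qwe23 '
        // [^\w\s]: one character that is neither a word character nor whitespace.
        System.out.println("'" + apply("[^\\w\\s]", input) + "'");  // 'qwe 123  '
        // [\W\S]: one character that is non-word OR non-whitespace, i.e. anything.
        System.out.println("'" + apply("[\\W\\S]", input) + "'");   // ''
    }
}
```

Note how the first pattern eats the digit "1" together with the space in front of it, which is why the original attempt mangled the text.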
  • 2021-01-06 10:47

    I had a similar problem to solve and I used the following method:

    text.replaceAll("\\p{Punct}+", "").replaceAll("\\s+", "+");
    

    Code with a timing benchmark:

    public static String cleanPunctuations(String text) {
        return text.replaceAll("\\p{Punct}+", "").replaceAll("\\s+", "+");
    }
    
    public static void test(String in){
        long t1 = System.currentTimeMillis();
        String out = cleanPunctuations(in);
        long t2 = System.currentTimeMillis();
        System.out.println("In=" + in + "\nOut="+ out + "\nTime=" + (t2 - t1)+ "ms");
    
    }
    
    public static void main(String[] args) {
        String s1 = "My text with 212354 digits spaces and \n newline \t tab " +
                "[`~!@#$%^&*()_+[\\\\]\\\\\\\\;\\',./{}|:\\\"<>?] special chars";
        test(s1);
        String s2 = "\"Sample Text=\"  with - minimal \t punctuation's";
        test(s2);
    }
    

    Sample Output

    In=My text with 212354 digits spaces and 
     newline     tab [`~!@#$%^&*()_+[\\]\\\\;\',./{}|:\"<>?] special chars
    Out=My+text+with+212354+digits+spaces+and+newline+tab+special+chars
    Time=4ms
    In="Sample Text="  with - minimal    punctuation's
    Out=Sample+Text+with+minimal+punctuations
    Time=0ms
    
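If the cleanup in the answer above runs many times, it may be worth compiling the patterns once, since String.replaceAll compiles its regex on every call. A sketch under that assumption, with PunctuationCleaner and clean as made-up names:

```java
import java.util.regex.Pattern;

class PunctuationCleaner {
    // Compiled once and reused across calls.
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}+");
    private static final Pattern SPACES = Pattern.compile("\\s+");

    static String clean(String text) {
        // Drop runs of punctuation, then turn each whitespace run into one '+'.
        String noPunct = PUNCT.matcher(text).replaceAll("");
        return SPACES.matcher(noPunct).replaceAll("+");
    }

    public static void main(String[] args) {
        System.out.println(clean("\"Sample Text=\"  with - minimal \t punctuation's"));
        // Sample+Text+with+minimal+punctuations
    }
}
```

Keep in mind that \p{Punct} covers only ASCII punctuation by default, so characters such as curly quotes would pass through unchanged.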