Replacing double backslashes with single backslash

终归单人心 2020-12-18 04:21

I have a string "\\u003c", which belongs to the UTF-8 charset. I am unable to decode it to Unicode because of the double backslashes. How do I get "\u003c" fr

7 answers
  •  囚心锁ツ
    2020-12-18 04:45

    Regarding the problem of "replacing double backslashes with single backslashes" or, more generally, "replacing a simple string containing \ with a different simple string containing \" (which is not the whole of the OP's problem, but part of it):

    Most of the answers in this thread mention replaceAll, which is the wrong tool for the job here. The easier tool is replace. Confusingly, the OP states that replace("\\\\", "\\") doesn't work for him, which is perhaps why all the answers focus on replaceAll.

    Important note for people with JavaScript background: Note that replace(CharSequence, CharSequence) in Java does replace ALL occurrences of a substring - unlike in JavaScript, where it only replaces the first one!

    From the Javadoc for replace: "Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence."
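
    As a minimal sketch of that point (the class and method names are illustrative, not from the original answer), a single call to replace collapses every doubled backslash in the input, not just the first one:

```java
class ReplaceLiteralDemo {
    // Collapses every pair of backslashes into one, using literal (non-regex) replacement.
    static String collapseBackslashes(String s) {
        // "\\\\" is two backslash characters, "\\" is one (string-literal escaping only).
        return s.replace("\\\\", "\\");
    }

    public static void main(String[] args) {
        String input = "a\\\\b\\\\c"; // two doubled backslashes in the actual string
        System.out.println(input);
        System.out.println(collapseBackslashes(input));
    }
}
```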

    On the other hand, replaceAll(String regex, String replacement) treats neither parameter as a plain string. The first is a regular expression, and the second has escaping rules of its own. From its Javadoc:

    Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string.

    (This is because \ and $ are used for backreferences to captured regex groups, so if you want to use them literally, you need to escape them.)
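
    If replaceAll has to be used anyway, the standard library can do both levels of escaping: Pattern.quote for the first argument and Matcher.quoteReplacement for the second. A small sketch, where replaceLiterally is an illustrative helper name and not part of the JDK:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementDemo {
    // Replaces every literal occurrence of target with replacement via replaceAll.
    // Both arguments are quoted, so regex syntax and backreferences are not interpreted.
    static String replaceLiterally(String s, String target, String replacement) {
        return s.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // "$1" is inserted as plain text, not as a reference to capture group 1.
        System.out.println(replaceLiterally("price: X", "X", "$1"));
        // A lone backslash is a valid replacement once it has been quoted.
        System.out.println(replaceLiterally("a\\\\b", "\\\\", "\\"));
    }
}
```

    For plain substring replacement this ends up equivalent to calling replace directly, which is why replace is the simpler choice.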

    In other words, the two parameters behave differently in replace and in replaceAll. For replace, you double each \ in both parameters (the standard escaping of a backslash in a string literal), whereas for replaceAll you need to quadruple it (string-literal escaping plus the function-specific escaping).

    To sum up, for simple replacements, stick to replace("\\\\", "\\"), which needs only one level of escaping, not two.

    https://ideone.com/ANeMpw

    System.out.println("a\\\\b\\\\c");                                 // "a\\b\\c"
    System.out.println("a\\\\b\\\\c".replaceAll("\\\\\\\\", "\\\\"));  // "a\b\c"
    //System.out.println("a\\\\b\\\\c".replaceAll("\\\\\\\\", "\\"));  // runtime error
    System.out.println("a\\\\b\\\\c".replace("\\\\", "\\"));           // "a\b\c"
    

    https://www.ideone.com/Fj4RCO

    String str = "\\\\u003c";
    System.out.println(str);                                // "\\u003c"
    System.out.println(str.replaceAll("\\\\\\\\", "\\\\")); // "\u003c"
    System.out.println(str.replace("\\\\", "\\"));          // "\u003c"
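
    Collapsing the backslashes leaves the six characters \u003c in the string; it does not by itself produce the character <. Assuming that decoding is what the question is ultimately after, one possible way to do it by hand is sketched below. decodeUnicodeEscapes is a hypothetical helper, not a JDK method, and it handles only the simple form of a backslash, a u, and exactly four hex digits:

```java
class UnicodeEscapeDemo {
    // Hypothetical helper: replaces each backslash-u escape followed by four hex digits
    // with the character it names. Anything else is copied through unchanged.
    static String decodeUnicodeEscapes(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("\\u", i) && i + 6 <= s.length()) {
                int code = 0;
                boolean valid = true;
                for (int k = i + 2; k < i + 6; k++) {
                    int digit = Character.digit(s.charAt(k), 16);
                    if (digit < 0) {
                        valid = false; // not a hex digit, so this is not an escape
                        break;
                    }
                    code = code * 16 + digit;
                }
                if (valid) {
                    out.append((char) code);
                    i += 6;
                    continue;
                }
            }
            out.append(s.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "\\\\u003c";                     // doubled backslash, as in the question
        String single = raw.replace("\\\\", "\\");    // now a single backslash
        System.out.println(decodeUnicodeEscapes(single));
    }
}
```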
    
