Escaping double-slashes with regular expressions in Java

Posted by 感情迁移 on 2019-12-23 12:55:02

Question


I have this unit test:

public void testDeEscapeResponse() {
    final String[] inputs = new String[] {"peque\\\\u0f1o", "peque\\u0f1o"};
    final String[] expected = new String[] {"peque\\u0f1o", "peque\\u0f1o"};
    for (int i = 0; i < inputs.length; i++) {
        final String input = inputs[i];
        final String actual = QTIResultParser.deEscapeResponse(input);
        Assert.assertEquals(
            "deEscapeResponse did not work correctly", expected[i], actual);
    }
}

I have this method:

static String deEscapeResponse(String str) {
    return str.replaceAll("\\\\", "\\");
}

The unit test is failing with this error:

java.lang.StringIndexOutOfBoundsException: String index out of range: 1
    at java.lang.String.charAt(String.java:686)
    at java.util.regex.Matcher.appendReplacement(Matcher.java:703)
    at java.util.regex.Matcher.replaceAll(Matcher.java:813)
    at java.lang.String.replaceAll(String.java:2189)
    at com.acme.MyClass.deEscapeResponse
    at com.acme.MyClassTest.testDeEscapeResponse

Why?


Answer 1:


Use String.replace, which does a literal replacement, instead of String.replaceAll, which uses regular expressions.

Example:

"peque\\\\u0f1o".replace("\\\\", "\\")    //  gives peque, a single backslash, then u0f1o

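String.replaceAll takes a regular expression, so the Java literal "\\\\" (the two-character string \\) is interpreted as a regex that matches a single \. The replacement string also treats \ specially: the literal "\\" is a lone backslash, which the replacement syntax reads as an escape with no character after it. That dangling escape is what triggers the exception.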
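The two layers of escaping described above can be checked directly. The following is only a minimal sketch: the class name EscapeLayersDemo and the helper originalCallThrows are hypothetical names introduced for illustration. It catches RuntimeException broadly on the assumption that the concrete exception type may differ between JDK versions (the trace in the question shows StringIndexOutOfBoundsException).

```java
class EscapeLayersDemo {

    // Hypothetical helper: runs the call from the question and reports
    // whether it failed for the given input.
    static boolean originalCallThrows(String input) {
        try {
            input.replaceAll("\\\\", "\\");
            return false;
        } catch (RuntimeException e) {
            // The question's trace shows StringIndexOutOfBoundsException;
            // the exact type on other JDK versions is not assumed here.
            return true;
        }
    }

    public static void main(String[] args) {
        // After the compiler processes the literal, the regex is two characters long.
        System.out.println("\\\\".length()); // prints 2
        // The replacement is a single backslash, i.e. a dangling escape.
        System.out.println("\\".length());   // prints 1

        // The replacement is only parsed when the regex matches something.
        System.out.println(originalCallThrows("peque\\\\u0f1o"));     // prints true
        System.out.println(originalCallThrows("no backslash here")); // prints false
    }
}
```

Note that the failure only shows up when the input contains a backslash, because the replacement string is parsed only once a match has been found.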

To make String.replaceAll work as you expect here, you would need to do

"peque\\\\u0f1o".replaceAll("\\\\\\\\", "\\\\")

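For comparison, here is a small sketch of deEscapeResponse written both ways and run against the inputs from the question's unit test. The class name DeEscapeDemo and the two method names are hypothetical, chosen only for this example; the calls themselves are the ones given in this answer.

```java
class DeEscapeDemo {

    // Literal replacement: two backslashes become one. No regex is involved.
    static String deEscapeWithReplace(String str) {
        return str.replace("\\\\", "\\");
    }

    // Regex replacement: the pattern needs four backslashes in the regex
    // (eight in the Java literal) to match two, and the replacement needs
    // two (four in the literal) to emit one.
    static String deEscapeWithReplaceAll(String str) {
        return str.replaceAll("\\\\\\\\", "\\\\");
    }

    public static void main(String[] args) {
        String[] inputs = {"peque\\\\u0f1o", "peque\\u0f1o"};
        for (String input : inputs) {
            String viaReplace = deEscapeWithReplace(input);
            String viaReplaceAll = deEscapeWithReplaceAll(input);
            // Both variants should agree on every input.
            System.out.println(viaReplace + " | " + viaReplaceAll
                    + " | same: " + viaReplace.equals(viaReplaceAll));
        }
    }
}
```

Both variants leave the second input unchanged, since it contains only a single backslash and neither pattern matches it.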


Answer 2:


I think the problem is that you're using replaceAll() instead of replace(). replaceAll expects a regular expression as its first argument, but you're just trying to match a literal string.




Answer 3:


See javadoc for Matcher:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

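Thus, with replaceAll, a bare backslash cannot appear in the replacement string; it has to be escaped, or the replacement has to be passed through Matcher.quoteReplacement. A really crazy workaround for your case would be str.replaceAll("\\\\(\\\\)", "$1"), which matches two backslashes, captures the second one, and substitutes it back via $1.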
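If replaceAll has to be kept for some reason, the standard library can do the escaping for both arguments. The sketch below uses Pattern.quote and Matcher.quoteReplacement, shown next to the capture-group workaround. The class name QuotedReplaceDemo and its method names are hypothetical, and this is one possible approach rather than a recommendation over plain String.replace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedReplaceDemo {

    // Let the library escape both sides: Pattern.quote turns the target into
    // a literal pattern, and Matcher.quoteReplacement escapes the backslash
    // in the replacement.
    static String deEscapeQuoted(String str) {
        return str.replaceAll(Pattern.quote("\\\\"), Matcher.quoteReplacement("\\"));
    }

    // Capture-group workaround: match two backslashes, capture the second,
    // and write the captured one back with $1.
    static String deEscapeWithGroup(String str) {
        return str.replaceAll("\\\\(\\\\)", "$1");
    }

    public static void main(String[] args) {
        String input = "peque\\\\u0f1o";
        System.out.println(deEscapeQuoted(input));
        System.out.println(deEscapeWithGroup(input));
    }
}
```

Both methods collapse the doubled backslash in the sample input to a single one and leave a lone backslash untouched.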



Source: https://stackoverflow.com/questions/6348847/escaping-double-slashes-with-regular-expressions-in-java
