How to split a string, but also keep the delimiters?

我在风中等你 2020-11-21 06:32

I have a multiline string which is delimited by a set of different delimiters:

(Text1)(DelimiterA)(Text2)(DelimiterC)(Text3)(DelimiterB)(Text4)
23 answers
  •  时光取名叫无心
    2020-11-21 07:06

    Here is a simple, clean implementation that is consistent with Pattern#split and works with variable-length patterns, which lookbehind cannot support. It is also easier to use. It is similar to the solution provided by @cletus.

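    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public final class Split {
        public static String[] split(CharSequence input, String pattern) {
            return split(input, Pattern.compile(pattern));
        }

        public static String[] split(CharSequence input, Pattern pattern) {
            Matcher matcher = pattern.matcher(input);
            int start = 0;
            // Typed list: a raw List would make toArray return Object[] and fail to compile.
            List<String> result = new ArrayList<>();
            while (matcher.find()) {
                // Text between the previous match and this one, then the delimiter itself.
                result.add(input.subSequence(start, matcher.start()).toString());
                result.add(matcher.group());
                start = matcher.end();
            }
            // Append the tail only when it is non-empty, so no trailing "" is produced.
            if (start != input.length()) {
                result.add(input.subSequence(start, input.length()).toString());
            }
            return result.toArray(new String[0]);
        }
    }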
    

    I don't do null checks here; Pattern#split doesn't either, so why should I? I don't like the if at the end, but it is required for consistency with Pattern#split. Otherwise I would append unconditionally, which would leave an empty string as the last element of the result whenever the input string ends with the pattern.
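
As a quick illustration of the behaviour being matched here, the sketch below calls Pattern#split directly on one of the test inputs. The class name TrailingEmptyDemo and its two helper methods are made up for this example; only Pattern#split and its limit parameter come from the JDK.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class TrailingEmptyDemo {
    // Default split (limit 0): trailing empty strings are removed from the result.
    static String[] defaultSplit(String input, String regex) {
        return Pattern.compile(regex).split(input);
    }

    // Negative limit: trailing empty strings are kept.
    static String[] keepTrailing(String input, String regex) {
        return Pattern.compile(regex).split(input, -1);
    }

    public static void main(String[] args) {
        // The input ends with the delimiter "$bar".
        System.out.println(Arrays.toString(defaultSplit("/foo/$bar", "\\$\\w+"))); // [/foo/]
        System.out.println(Arrays.toString(keepTrailing("/foo/$bar", "\\$\\w+"))); // [/foo/, ]
    }
}
```

With the default limit there is no empty element after the final delimiter, which is the behaviour the conditional append reproduces.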

    I convert the result to String[] for consistency with Pattern#split, and I use new String[0] rather than new String[result.size()]; see here for why.
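
A minimal sketch of that conversion, assuming a List<String> as the intermediate collection. The class name ToArrayDemo and the method toStringArray are hypothetical. The zero-length array only tells toArray which component type to produce; because the list does not fit into it, toArray allocates a new String[] of the right size.

```java
import java.util.ArrayList;
import java.util.List;

public class ToArrayDemo {
    // Converts a typed list to String[] using the zero-length array idiom.
    static String[] toStringArray(List<String> items) {
        return items.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        parts.add("/foo/");
        parts.add("$bar");
        String[] array = toStringArray(parts);
        System.out.println(array.length); // 2
    }
}
```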

    Here are my tests:

    @Test
    public void splitsVariableLengthPattern() {
        String[] result = Split.split("/foo/$bar/bas", "\\$\\w+");
        Assert.assertArrayEquals(new String[] { "/foo/", "$bar", "/bas" }, result);
    }
    
    @Test
    public void splitsEndingWithPattern() {
        String[] result = Split.split("/foo/$bar", "\\$\\w+");
        Assert.assertArrayEquals(new String[] { "/foo/", "$bar" }, result);
    }
    
    @Test
    public void splitsStartingWithPattern() {
        String[] result = Split.split("$foo/bar", "\\$\\w+");
        Assert.assertArrayEquals(new String[] { "", "$foo", "/bar" }, result);
    }
    
    @Test
    public void splitsNoMatchesPattern() {
        String[] result = Split.split("/foo/bar", "\\$\\w+");
        Assert.assertArrayEquals(new String[] { "/foo/bar" }, result);
    }
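
For comparison, when every delimiter is a single character, a lookaround-based split may be enough. The sketch below is not part of the implementation above; the class name LookaroundSplitDemo and the delimiter set [;,] are assumptions chosen for illustration. It splits at the zero-width positions just before and just after each delimiter, which is why it relies on the delimiter having a known, bounded length.

```java
import java.util.Arrays;

public class LookaroundSplitDemo {
    // Splits before and after each ';' or ',' so the delimiters stay in the result.
    static String[] splitKeeping(String input) {
        return input.split("(?<=[;,])|(?=[;,])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeeping("a;b,c"))); // [a, ;, b, ,, c]
    }
}
```

A pattern such as \$\w+ has no fixed length, so it cannot be dropped into the lookbehind half of this regex in the same way; that is the case the Matcher-based split is meant to cover.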
    
