How to split a string, but also keep the delimiters?

前端未结

关注

 23  2370

I have a multiline string which is delimited by a set of different delimiters:

(Text1)(DelimiterA)(Text2)(DelimiterC)(Text3)(DelimiterB)(Text4)

相关标签:

23条回答

野的像风

2020-11-21 06:57
You can use Lookahead and Lookbehind. Like this:
```
System.out.println(Arrays.toString("a;b;c;d".split("(?<=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("(?=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("((?<=;)|(?=;))")));
```
And you will get:
```
[a;, b;, c;, d]
[a, ;b, ;c, ;d]
[a, ;, b, ;, c, ;, d]
```
The last one is what you want.

((?<=;)|(?=;)) equals to select an empty character before ; or after ;.

Hope this helps.

EDIT Fabian Steeg comments on Readability is valid. Readability is always the problem for RegEx. One thing, I do to help easing this is to create a variable whose name represent what the regex does and use Java String format to help that. Like this:
```
static public final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";
...
public void someMethod() {
...
final String[] aEach = "a;b;c;d".split(String.format(WITH_DELIMITER, ";"));
...
}
...
```
This helps a little bit. :-D
0 讨论(0)
发布评论:

提交评论
- 加载中...

无人及你

2020-11-21 06:57

I had a look at the above answers and honestly none of them I find satisfactory. What you want to do is essentially mimic the Perl split functionality. Why Java doesn't allow this and have a join() method somewhere is beyond me but I digress. You don't even need a class for this really. Its just a function. Run this sample program:

Some of the earlier answers have excessive null-checking, which I recently wrote a response to a question here:

https://stackoverflow.com/users/18393/cletus

Anyway, the code:

public class Split {
    public static List<String> split(String s, String pattern) {
        assert s != null;
        assert pattern != null;
        return split(s, Pattern.compile(pattern));
    }

    public static List<String> split(String s, Pattern pattern) {
        assert s != null;
        assert pattern != null;
        Matcher m = pattern.matcher(s);
        List<String> ret = new ArrayList<String>();
        int start = 0;
        while (m.find()) {
            ret.add(s.substring(start, m.start()));
            ret.add(m.group());
            start = m.end();
        }
        ret.add(start >= s.length() ? "" : s.substring(start));
        return ret;
    }

    private static void testSplit(String s, String pattern) {
        System.out.printf("Splitting '%s' with pattern '%s'%n", s, pattern);
        List<String> tokens = split(s, pattern);
        System.out.printf("Found %d matches%n", tokens.size());
        int i = 0;
        for (String token : tokens) {
            System.out.printf("  %d/%d: '%s'%n", ++i, tokens.size(), token);
        }
        System.out.println();
    }

    public static void main(String args[]) {
        testSplit("abcdefghij", "z"); // "abcdefghij"
        testSplit("abcdefghij", "f"); // "abcde", "f", "ghi"
        testSplit("abcdefghij", "j"); // "abcdefghi", "j", ""
        testSplit("abcdefghij", "a"); // "", "a", "bcdefghij"
        testSplit("abcdefghij", "[bdfh]"); // "a", "b", "c", "d", "e", "f", "g", "h", "ij"
    }
}

0 讨论(0)

醉梦人生

2020-11-21 07:02
If you can afford, use Java's replace(CharSequence target, CharSequence replacement) method and fill in another delimiter to split with. Example: I want to split the string "boo:and:foo" and keep ':' at its righthand String.
```
String str = "boo:and:foo";
str = str.replace(":","newdelimiter:");
String[] tokens = str.split("newdelimiter");
```
Important note: This only works if you have no further "newdelimiter" in your String! Thus, it is not a general solution. But if you know a CharSequence of which you can be sure that it will never appear in the String, this is a very simple solution.
0 讨论(0)
发布评论:

提交评论
- 加载中...

时光取名叫无心

2020-11-21 07:06

Here is a simple clean implementation which is consistent with Pattern#split and works with variable length patterns, which look behind cannot support, and it is easier to use. It is similar to the solution provided by @cletus.

public static String[] split(CharSequence input, String pattern) {
    return split(input, Pattern.compile(pattern));
}

public static String[] split(CharSequence input, Pattern pattern) {
    Matcher matcher = pattern.matcher(input);
    int start = 0;
    List<String> result = new ArrayList<>();
    while (matcher.find()) {
        result.add(input.subSequence(start, matcher.start()).toString());
        result.add(matcher.group());
        start = matcher.end();
    }
    if (start != input.length()) result.add(input.subSequence(start, input.length()).toString());
    return result.toArray(new String[0]);
}

I don't do null checks here, Pattern#split doesn't, why should I. I don't like the if at the end but it is required for consistency with the Pattern#split . Otherwise I would unconditionally append, resulting in an empty string as the last element of the result if the input string ends with the pattern.

I convert to String[] for consistency with Pattern#split, I use new String[0] rather than new String[result.size()], see here for why.

Here are my tests:

@Test
public void splitsVariableLengthPattern() {
    String[] result = Split.split("/foo/$bar/bas", "\\$\\w+");
    Assert.assertArrayEquals(new String[] { "/foo/", "$bar", "/bas" }, result);
}

@Test
public void splitsEndingWithPattern() {
    String[] result = Split.split("/foo/$bar", "\\$\\w+");
    Assert.assertArrayEquals(new String[] { "/foo/", "$bar" }, result);
}

@Test
public void splitsStartingWithPattern() {
    String[] result = Split.split("$foo/bar", "\\$\\w+");
    Assert.assertArrayEquals(new String[] { "", "$foo", "/bar" }, result);
}

@Test
public void splitsNoMatchesPattern() {
    String[] result = Split.split("/foo/bar", "\\$\\w+");
    Assert.assertArrayEquals(new String[] { "/foo/bar" }, result);
}

0 讨论(0)

粉色の甜心

2020-11-21 07:09

I suggest using Pattern and Matcher, which will almost certainly achieve what you want. Your regular expression will need to be somewhat more complicated than what you are using in String.split.

0 讨论(0)
发布评论:

提交评论
- 加载中...

上一页 1 2 3 4