How can I extract overlapping matches from an input using String.split()?
For example, if trying to find matches to "aba":
I would use indexOf.
for (int i = text.indexOf(find); i >= 0; i = text.indexOf(find, i + 1))
    System.out.println(find + " found at " + i);
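For completeness, here is the same indexOf loop wrapped into a self-contained sketch. The class name IndexOfOverlaps and the helper findAll are made up for illustration, and the empty-string guard is my own addition (without it the loop would never terminate, because indexOf("", n) always succeeds).

```java
import java.util.ArrayList;
import java.util.List;

public class IndexOfOverlaps {
    // Hypothetical helper: returns every start index of `find` in `text`,
    // overlapping occurrences included.
    static List<Integer> findAll(String text, String find) {
        List<Integer> positions = new ArrayList<>();
        if (find.isEmpty()) {
            return positions; // assumption: treat an empty needle as "no matches"
        }
        // Restart the search one character past the previous hit,
        // so an occurrence that overlaps the previous one is not skipped.
        for (int i = text.indexOf(find); i >= 0; i = text.indexOf(find, i + 1)) {
            positions.add(i);
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(findAll("abababa", "aba")); // prints [0, 2, 4]
    }
}
```

This only works for a literal search string; if you need a real regular expression, the Pattern/Matcher approach is the one to use.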
String#split will not give you overlapping matches, because each part of the string ends up in exactly one element of the resulting array, never in two.
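A small sketch to illustrate the point about split. The class name SplitDemo and the helper pieces are just for this example. The delimiter matches are consumed and thrown away, and only the text between them is returned:

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper that simply delegates to String#split.
    static String[] pieces(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // "aba" is matched at index 0 and again at index 4; the match that
        // would start at index 2 overlaps the first one and is never seen.
        System.out.println(Arrays.toString(pieces("abababa", "aba"))); // prints [, b]
    }
}
```

So what comes back is the leftovers around two non-overlapping matches, not the matches themselves.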
You should use the Pattern and Matcher classes here.
You can use this regex:
Pattern pattern = Pattern.compile("(?=(aba))");
Then use the Matcher#find method to get all the overlapping matches, and print group(1) for each one.
The regex above matches every empty string that is followed by aba; you then read the first captured group. Because a look-ahead is a zero-width assertion, it does not consume the text it matches, so the following search resumes just one character further on. Hence you get all the overlapping matches.
String input = "abababa";
String patternToFind = "aba";
Pattern pattern = Pattern.compile("(?=" + patternToFind + ")");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println(patternToFind + " found at index: " + matcher.start());
}
Output:
aba found at index: 0
aba found at index: 2
aba found at index: 4
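The loop above only reports positions, which is enough when the search text is fixed. If the pattern can match different text at each position, the capturing group inside the look-ahead becomes useful, as described earlier for group(1). A minimal sketch, assuming a class LookaheadGroups and a helper overlapping that are my own names, and using the regex a.a purely as an illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadGroups {
    // Hypothetical helper: wraps `regex` in a capturing group inside a look-ahead
    // and returns "matchedText@startIndex" for every position where it matches.
    static List<String> overlapping(String input, String regex) {
        Pattern pattern = Pattern.compile("(?=(" + regex + "))");
        Matcher matcher = pattern.matcher(input);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            // The overall match is empty; the text of interest is in group 1.
            found.add(matcher.group(1) + "@" + matcher.start());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(overlapping("abacada", "a.a")); // prints [aba@0, aca@2, ada@4]
    }
}
```

Note that matcher.group() would return an empty string here, since the look-ahead itself consumes nothing.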
This is not a correct use of split(). From the javadocs:
Splits this string around matches of the given regular expression.
It seems to me that you are not trying to split the string, but to find all matches of your regular expression in it. For this you would have to use a Matcher, plus some extra code that loops on the Matcher to find all the matches and then builds the array.
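Roughly, that extra code could look like the sketch below. The names MatchArray and allMatches are invented for the example. To pick up overlapping matches as well, it restarts the search with Matcher#find(int) one character after the start of the previous match; that restart is my own choice, and a plain find() loop would return only non-overlapping matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchArray {
    // Hypothetical helper: loops on a Matcher and collects every match,
    // overlapping ones included, into a String array.
    static String[] allMatches(String input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        int from = 0;
        // find(int) resets the matcher and searches from the given index;
        // the bounds check avoids its IndexOutOfBoundsException past the end.
        while (from <= input.length() && matcher.find(from)) {
            matches.add(matcher.group());
            from = matcher.start() + 1; // step one character, not past the whole match
        }
        return matches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] result = allMatches("abababa", "aba");
        System.out.println(result.length); // prints 3
    }
}
```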