I'm trying to get the last result of a match without having to cycle through .find()
Here's my code:
String in = "num 123 num 1 num 698 num 19238
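To get the last match this also works, as long as you keep calling find(). A single find() returns only the first match, and m.group(m.groupCount()) picks the last capturing group of that match, not the last match:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num ([0-9]+)");
Matcher m = p.matcher(in);
String last = null;
while (m.find()) {
    // each successful find() overwrites the previous value
    last = m.group(1);
}
// last is now "2134", or null if nothing matched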
This seems like an equally plausible approach.
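import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastMatchTest {
    public static void main(String[] args) throws Exception {
        String target = "num 123 num 1 num 698 num 19238 num 2134";
        // The repeated non-capturing group walks across every "num ... digits" occurrence
        Pattern regex = Pattern.compile("(?:.*?num.*?(\\d+))+");
        Matcher regexMatcher = regex.matcher(target);
        if (regexMatcher.find()) {
            // group 1 holds the digits from the last repetition
            System.out.println(regexMatcher.group(1));
        }
    }
}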
The .*? is a reluctant match, so it won't gobble up everything. The ?: makes the outer group non-capturing, so the inner group is group 1. Repeating the outer group greedily makes it match across the entire string until all matches are exhausted, leaving group 1 with the value of your last match.
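As a smaller illustration of that last point, here is a minimal sketch showing that a capturing group inside a repeated group keeps only the text from its final repetition. The class name RepeatedGroupDemo, the helper lastRepetition and the sample input are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Hypothetical helper: returns group 1 once the repeated group has matched
    // the whole input, or null if the input does not match.
    static String lastRepetition(String input) {
        // One digit optionally followed by a comma, repeated one or more times
        Matcher m = Pattern.compile("(?:(\\d),?)+").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Each repetition overwrites group 1, so only the last digit survives
        System.out.println(lastRepetition("1,2,3")); // prints 3
    }
}
```

The same overwrite-on-each-repetition behavior is what leaves the last number in group 1 in the answer above.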
Java does not provide such a mechanism. The only thing I can suggest would be a binary search for the last index.
It would be something like this:
int n = haystack.length();
if (matcher.find(n / 2)) {
    // a match starts at or after the midpoint: recursively try the right side
} else {
    // no match from the midpoint on: recursively try the left side
}
And here's code that does it since I found it to be an interesting problem:
import org.junit.Test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import static org.junit.Assert.assertEquals;

public class RecursiveFind {
    @Test
    public void testFindLastIndexOf() {
        assertEquals(0, findLastIndexOf("abcffffdffffd", "abc"));
        assertEquals(1, findLastIndexOf("dabcffffdffffd", "abc"));
        assertEquals(4, findLastIndexOf("aaaaabc", "abc"));
        assertEquals(4, findLastIndexOf("aaaaabc", "a+b"));
        assertEquals(6, findLastIndexOf("aabcaaabc", "a+b"));
        assertEquals(2, findLastIndexOf("abcde", "c"));
        assertEquals(2, findLastIndexOf("abcdef", "c"));
        assertEquals(2, findLastIndexOf("abcd", "c"));
    }

    public static int findLastIndexOf(String haystack, String needle) {
        return findLastIndexOf(0, haystack.length(), Pattern.compile(needle).matcher(haystack));
    }

    private static int findLastIndexOf(int start, int end, Matcher m) {
        if ( start > end ) {
            return -1;
        }
        int pivot = ((end-start) / 2) + start;
        if ( m.find(pivot) ) {
            //recurse on right side
            return findLastIndexOfRecurse(end, m);
        } else if (m.find(start)) {
            //recurse on left side
            return findLastIndexOfRecurse(pivot, m);
        } else {
            //not found at all between start and end
            return -1;
        }
    }

    private static int findLastIndexOfRecurse(int end, Matcher m) {
        int foundIndex = m.start();
        int recurseIndex = findLastIndexOf(foundIndex + 1, end, m);
        if ( recurseIndex == -1 ) {
            return foundIndex;
        } else {
            return recurseIndex;
        }
    }
}
I haven't found a breaking test case yet.
You could prepend .* to your regex, which will greedily consume all characters up to the last match:
import java.util.regex.*;

class Test {
    public static void main (String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        Pattern p = Pattern.compile(".*num ([0-9]+)");
        Matcher m = p.matcher(in);
        if(m.find()) {
            System.out.println(m.group(1));
        }
    }
}
Prints:
2134
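One caveat worth checking, sketched below: by default . does not match line terminators, so on multi-line input the leading .* cannot reach past a line break, and find() returns the last match on the first line that has one. Passing Pattern.DOTALL is one way around that. The class name DotAllDemo, the helper lastNum and the two-line input are assumptions for illustration, not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotAllDemo {
    // Hypothetical helper: applies the ".*" prefix trick with the given flags
    static String lastNum(String in, int flags) {
        Matcher m = Pattern.compile(".*num ([0-9]+)", flags).matcher(in);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String in = "num 123 num 1\nnum 698 num 2134";
        // Without DOTALL, ".*" stops at the line break: last match on line one
        System.out.println(lastNum(in, 0));              // prints 1
        // With DOTALL, ".*" spans both lines and backtracks to the final match
        System.out.println(lastNum(in, Pattern.DOTALL)); // prints 2134
    }
}
```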
You could also reverse the string and change your regex to match the reversed text instead:
import java.util.regex.*;

class Test {
    public static void main (String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        Pattern p = Pattern.compile("([0-9]+) mun");
        Matcher m = p.matcher(new StringBuilder(in).reverse());
        if(m.find()) {
            System.out.println(new StringBuilder(m.group(1)).reverse());
        }
    }
}
But neither solution is better than just looping through all matches using while (m.find()), IMO.
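For comparison, iterating over all matches and keeping the last one can be sketched like this. The version below assumes Java 9 or later, where Matcher.results() exposes the same iteration as a stream; on older versions a while (m.find()) loop that overwrites a variable does the same job. The class name LastMatchLoop and the helper lastGroup are made up for this example.

```java
import java.util.Optional;
import java.util.regex.Pattern;

class LastMatchLoop {
    // Hypothetical helper: returns group 1 of the last match, if there is one
    static Optional<String> lastGroup(Pattern p, CharSequence in) {
        return p.matcher(in)
                .results()                           // every match, in order (Java 9+)
                .map(r -> r.group(1))                // pull out the captured digits
                .reduce((earlier, later) -> later);  // keep only the latest one
    }

    public static void main(String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        Pattern p = Pattern.compile("num ([0-9]+)");
        System.out.println(lastGroup(p, in).orElse("no match")); // prints 2134
    }
}
```

Returning an Optional keeps the "no match" case explicit instead of relying on null.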
Why not keep it simple?
in.replaceAll(".*[^\\d](\\d+).*", "$1")
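A quick sketch of how that one-liner behaves, including the case it does not cover: replaceAll returns the input unchanged when the pattern does not match, so the caller cannot tell a number from a miss without an extra check. The class name ReplaceAllDemo, the helper lastDigits and the second sample string are assumptions for illustration.

```java
class ReplaceAllDemo {
    // Hypothetical helper wrapping the one-liner
    static String lastDigits(String in) {
        return in.replaceAll(".*[^\\d](\\d+).*", "$1");
    }

    public static void main(String[] args) {
        // The greedy ".*" pushes the captured digits as far right as possible
        System.out.println(lastDigits("num 123 num 1 num 698 num 19238 num 2134")); // prints 2134
        // No digits at all: nothing is replaced and the input comes back as-is
        System.out.println(lastDigits("no numbers here")); // prints no numbers here
    }
}
```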