Find the last match with Java regex matcher

情书的邮戳 2020-12-01 14:27

I'm trying to get the last result of a match without having to cycle through .find()

Here's my code:

String in = "num 123 num 1 num 698 num 19238


        
11 Answers
  • 2020-12-01 14:58

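    To get the last match, this also works: keep calling find() until it fails and read the last capturing group each time:

    String in = "num 123 num 1 num 698 num 19238 num 2134";
    Pattern p = Pattern.compile("num ([0-9]+)");
    Matcher m = p.matcher(in);
    // Each successful find() moves to the next match, so the value
    // assigned last comes from the final match in the input.
    while (m.find()) {
      // groupCount() is 1 for this pattern, so this reads its only capturing group
      in = m.group(m.groupCount());
    }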
    
  • 2020-12-01 14:58

    This seems like an equally plausible approach.

        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        public class LastMatchTest {
            public static void main(String[] args) throws Exception {
                String target = "num 123 num 1 num 698 num 19238 num 2134";
                // The outer group repeats, so group 1 ends up holding its final iteration.
                Pattern regex = Pattern.compile("(?:.*?num.*?(\\d+))+");
                Matcher regexMatcher = regex.matcher(target);

                if (regexMatcher.find()) {
                    System.out.println(regexMatcher.group(1)); // prints 2134
                }
            }
        }
    

    The .*? is a reluctant quantifier, so it consumes as little as possible instead of gobbling up everything. The ?: makes the outer group non-capturing, so the inner group is group 1. The outer + is greedy and repeats the group across the entire string until no further repetition is possible, which leaves group 1 holding the value of your last match.
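    The behaviour this relies on, that a capturing group inside a repeated group keeps the text from its final iteration, can be checked with a small sketch. The class name RepeatedGroupDemo and the helper lastCaptured are made up for illustration; only the pattern comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Hypothetical helper: returns what group 1 holds once the repeated
    // outer group has matched, or null when the pattern does not match.
    static String lastCaptured(String input) {
        Matcher m = Pattern.compile("(?:.*?num.*?(\\d+))+").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Group 1 is overwritten on every repetition of the outer group,
        // so only the value from the final repetition survives.
        System.out.println(lastCaptured("num 123 num 1 num 698 num 19238 num 2134")); // 2134
        System.out.println(lastCaptured("nothing here")); // null
    }
}
```

    Note that the pattern only requires some digits after a "num", not directly after it, so it is looser than "num ([0-9]+)".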

  • 2020-12-01 15:00

    Java does not provide such a mechanism. The only thing I can suggest would be a binary search for the last index.

    It would be something like this:

    N = haystack.length();
    if ( matcher.find(N/2) ) {
        // recursively try right side
    } else {
        // recursively try left side
    }
    

    Edit

    And here's code that does it since I found it to be an interesting problem:

    import org.junit.Test;
    
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    import static org.junit.Assert.assertEquals;
    
    public class RecursiveFind {
        @Test
        public void testFindLastIndexOf() {
            assertEquals(0, findLastIndexOf("abcffffdffffd", "abc"));
            assertEquals(1, findLastIndexOf("dabcffffdffffd", "abc"));
            assertEquals(4, findLastIndexOf("aaaaabc", "abc"));
            assertEquals(4, findLastIndexOf("aaaaabc", "a+b"));
            assertEquals(6, findLastIndexOf("aabcaaabc", "a+b"));
            assertEquals(2, findLastIndexOf("abcde", "c"));
            assertEquals(2, findLastIndexOf("abcdef", "c"));
            assertEquals(2, findLastIndexOf("abcd", "c"));
        }
    
        public static int findLastIndexOf(String haystack, String needle) {
            return findLastIndexOf(0, haystack.length(), Pattern.compile(needle).matcher(haystack));
        }
    
        private static int findLastIndexOf(int start, int end, Matcher m) {
            if ( start > end ) {
                return -1;
            }
    
            int pivot = ((end-start) / 2) + start;
            if ( m.find(pivot) ) {
                //recurse on right side
                return findLastIndexOfRecurse(end, m);
            } else if (m.find(start)) {
                //recurse on left side
                return findLastIndexOfRecurse(pivot, m);
            } else {
                //not found at all between start and end
                return -1;
            }
        }
    
        private static int findLastIndexOfRecurse(int end, Matcher m) {
            int foundIndex = m.start();
            int recurseIndex = findLastIndexOf(foundIndex + 1, end, m);
            if ( recurseIndex == -1 ) {
                return foundIndex;
            } else {
                return recurseIndex;
            }
        }
    
    }
    

    I haven't found a breaking test case yet.

  • 2020-12-01 15:03

    You could prepend .* to your regex, which will greedily consume all characters up to the last match:

    import java.util.regex.*;
    
    class Test {
      public static void main (String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        Pattern p = Pattern.compile(".*num ([0-9]+)");
        Matcher m = p.matcher(in);
        if(m.find()) {
          System.out.println(m.group(1));
        }
      }
    }
    

    Prints:

    2134
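    One caveat with the .* prefix: by default . does not match line terminators, so on multi-line input the greedy prefix stops at the end of the first line. A minimal sketch of the difference, assuming the input may contain newlines (the class GreedyPrefixDemo and the helper lastNum are illustrative names, not part of the answer):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyPrefixDemo {
    // Hypothetical helper: applies the ".*"-prefixed pattern with the given
    // flags and returns group 1, or null when nothing matches.
    static String lastNum(String input, int flags) {
        Matcher m = Pattern.compile(".*num ([0-9]+)", flags).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String twoLines = "num 1 num 2\nnum 3 num 4";
        // Without DOTALL the prefix cannot cross the newline, so the first
        // find() reports the last match of the first line only.
        System.out.println(lastNum(twoLines, 0));              // 2
        // With DOTALL the prefix can span lines and reaches the overall last match.
        System.out.println(lastNum(twoLines, Pattern.DOTALL)); // 4
    }
}
```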
    

    You could also reverse the string and change your regex to match the reversed text instead:

    import java.util.regex.*;
    
    class Test {
      public static void main (String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        Pattern p = Pattern.compile("([0-9]+) mun");
        Matcher m = p.matcher(new StringBuilder(in).reverse());
        if(m.find()) {
          System.out.println(new StringBuilder(m.group(1)).reverse());
        }
      }
    }
    

    But neither solution is better than just looping through all matches using while (m.find()), IMO.
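    For comparison, the plain loop could look like the sketch below. The class LastMatchLoop and the method lastMatch are hypothetical names; the pattern is the one used in the examples above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastMatchLoop {
    // Hypothetical helper: walks every match and remembers group 1 of the
    // most recent one; returns null when the pattern never matches.
    static String lastMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        String last = null;
        while (m.find()) {
            last = m.group(1);
        }
        return last;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("num ([0-9]+)");
        System.out.println(lastMatch(p, "num 123 num 1 num 698 num 19238 num 2134")); // 2134
    }
}
```

    The loop leaves the original pattern untouched and makes the no-match case explicit.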

  • 2020-12-01 15:03

    Why not keep it simple?

    in.replaceAll(".*[^\\d](\\d+).*", "$1")
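    Two things are worth knowing before relying on this one-liner: it ignores the "num" prefix and simply keeps the last run of digits that follows a non-digit, and replaceAll returns the input unchanged when the pattern does not match. A small sketch, with ReplaceAllDemo and lastDigits as illustrative names:

```java
class ReplaceAllDemo {
    // Hypothetical wrapper around the one-liner from the answer.
    static String lastDigits(String in) {
        return in.replaceAll(".*[^\\d](\\d+).*", "$1");
    }

    public static void main(String[] args) {
        // The greedy ".*" backtracks to the last non-digit that is followed by digits.
        System.out.println(lastDigits("num 123 num 1 num 698 num 19238 num 2134")); // 2134
        // No non-digit followed by digits: nothing is replaced, the input comes back as-is.
        System.out.println(lastDigits("no digits")); // no digits
        System.out.println(lastDigits("7 apples"));  // 7 apples
    }
}
```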
    