a regular expression generator for number ranges

长情又很酷 2021-02-04 05:56

I checked the Stack Exchange description, and algorithm questions are one of the allowed topics. So here goes.

Given an input of a range, where begin and ending number

9 answers
  •  难免孤独
    2021-02-04 06:09

    [Hint: somehow the idea of applying recursion as presented in my first answer (using Python) did not reach the OP, so here it is again in Java. Note that for a recursive solution it is often easier to prove correctness.]

    The key observation that makes recursion work is that a range starting at a number ending in 0 and ending at a number ending in 9 is covered by digit patterns that all end in [0-9].

    20-239 is covered by [2-9][0-9], 1[0-9][0-9], 2[0-3][0-9]
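
This claim can be sanity-checked by brute force with java.util.regex. The following is only a sketch: the class name CoverageCheck, the method coversExactly and the scanLimit parameter are made up for this illustration and are not part of the original answer.

```java
import java.util.List;
import java.util.regex.Pattern;

class CoverageCheck {
    // Returns true if the alternation of the given digit patterns matches
    // exactly the integers lo..hi among all integers 0..scanLimit.
    static boolean coversExactly(List<String> digitPatterns, int lo, int hi, int scanLimit) {
        Pattern p = Pattern.compile(String.join("|", digitPatterns));
        for (int n = 0; n <= scanLimit; n++) {
            boolean matched = p.matcher(Integer.toString(n)).matches(); // whole-string match
            boolean expected = (lo <= n && n <= hi);
            if (matched != expected) return false;
        }
        return true;
    }
}
```

For the three patterns listed above, coversExactly(Arrays.asList("[2-9][0-9]", "1[0-9][0-9]", "2[0-3][0-9]"), 20, 239, 1000) should return true.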
    

    When the last digit is taken off the start and the end of the range, the resulting range is covered by the same digit patterns, minus the trailing [0-9]:

    20-239 is covered by [2-9][0-9], 1[0-9][0-9], 2[0-3][0-9]
    2 -23  is covered by [2-9],      1[0-9],      2[0-3]
    

    So when we are looking for the digit patterns that cover a range (e.g. 13-247), we split off a range before the first number ending in 0 and a range after the last number ending in 9 (note that these split-off ranges can be empty), e.g.

    13-247 = 13-19, 20-239, 240-247
    20-247 =        20-239, 240-247
    13-239 = 13-19, 20-239
    20-239 =        20-239
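
The split can be sketched with integer arithmetic. The class RangeSplit and its split method are hypothetical names for this illustration. The sketch assumes 1 <= start and that the range contains at least one number ending in 0, and it computes the last number ending in 9 as 10*((end+1)/10)-1, a formula chosen here so that the results agree with the table above.

```java
class RangeSplit {
    // Returns {headStart, headEnd, midStart, midEnd, tailStart, tailEnd}.
    // A part whose start is greater than its end is empty.
    static int[] split(int start, int end) {
        int firstEndingWith0 = 10 * ((start + 9) / 10); // round start up to a multiple of 10
        int lastEndingWith9 = 10 * ((end + 1) / 10) - 1; // largest ...9 that is <= end
        return new int[] {
            start, firstEndingWith0 - 1,          // range before the first number ending in 0
            firstEndingWith0, lastEndingWith9,    // middle range, handled recursively
            lastEndingWith9 + 1, end              // range after the last number ending in 9
        };
    }
}
```

For example, split(13, 247) yields {13, 19, 20, 239, 240, 247}, and split(20, 239) yields an empty head (20..19) and an empty tail (240..239).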
    

    The remaining range is handled recursively by taking off the last digits and appending [0-9] to all digit patterns of the reduced range.
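
As an illustration of this recursion, here is a variant that produces the digit patterns directly as strings instead of start,end pairs. It is a sketch under assumptions, not the code from the answer: the names DigitPatterns, patterns and lastDigitPattern are invented here, start is assumed to be at least 1 (a start of 0 would need separate handling because of leading zeros), and the last number ending in 9 is computed as 10*((end+1)/10)-1.

```java
import java.util.ArrayList;
import java.util.List;

class DigitPatterns {
    // Pattern for a range whose numbers differ only in the last digit,
    // e.g. 13..19 gives "1[3-9]" and 7..7 gives "7".
    static String lastDigitPattern(int start, int end) {
        String prefix = (start / 10 == 0) ? "" : Integer.toString(start / 10);
        int lo = start % 10;
        int hi = end % 10;
        return (lo == hi) ? prefix + lo : prefix + "[" + lo + "-" + hi + "]";
    }

    // Digit patterns that together cover start..end (assumption: start >= 1).
    static List<String> patterns(int start, int end) {
        if (start < 1) throw new IllegalArgumentException("sketch assumes start >= 1");
        List<String> result = new ArrayList<>();
        if (start > end) return result;                       // empty range
        int firstEndingWith0 = 10 * ((start + 9) / 10);
        if (firstEndingWith0 > end) {                         // only the last digit varies
            result.add(lastDigitPattern(start, end));
            return result;
        }
        if (start < firstEndingWith0) {                       // range before the first ...0
            result.add(lastDigitPattern(start, firstEndingWith0 - 1));
        }
        int lastEndingWith9 = 10 * ((end + 1) / 10) - 1;
        if (firstEndingWith0 <= lastEndingWith9) {            // middle range is not empty
            // take off the last digits, recurse, then append [0-9]
            for (String p : patterns(firstEndingWith0 / 10, lastEndingWith9 / 10)) {
                result.add(p + "[0-9]");
            }
        }
        if (lastEndingWith9 < end) {                          // range after the last ...9
            result.add(lastDigitPattern(lastEndingWith9 + 1, end));
        }
        return result;
    }
}
```

For 13-247 this gives 1[3-9], [2-9][0-9], 1[0-9][0-9], 2[0-3][0-9], 24[0-7]; joining the patterns with | and anchoring the result yields one regular expression for the range.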

    When generating pairs start,end for the subranges that can be covered by one digit pattern (as done by bezmax and the OP), the subranges of the reduced range have to be "blown up" correspondingly.
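
The "blow up" step can be sketched on its own. The class PairBlowUp and the method blowUp are hypothetical names; the flat list layout [start0, end0, start1, end1, ...] is assumed here to mirror what getRegexPairs returns.

```java
import java.util.ArrayList;
import java.util.List;

class PairBlowUp {
    // reducedPairs holds the pairs of the reduced range as a flat list:
    // [start0, end0, start1, end1, ...].
    static List<Integer> blowUp(List<Integer> reducedPairs) {
        List<Integer> pairs = new ArrayList<>();
        for (int i = 0; i + 1 < reducedPairs.size(); i += 2) {
            pairs.add(reducedPairs.get(i) * 10);         // append digit 0 to the start
            pairs.add(reducedPairs.get(i + 1) * 10 + 9); // append digit 9 to the end
        }
        return pairs;
    }
}
```

Blowing up the pairs 2-9, 10-19, 20-23 of the reduced range 2-23 gives 20-99, 100-199, 200-239, which are the subranges of 20-239.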

    The special cases where the range contains no number ending in 0, or no number ending in 9, can only occur if start and end differ only in the last digit; in that case the whole range is covered by one digit pattern.
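
Turning such a pair into its digit pattern can be done digit by digit. This is a minimal sketch with invented names (PairToPattern, toPattern). It assumes that start and end have the same number of digits and that the pair really is coverable by one digit pattern, as is the case for the last-digit-only ranges described above and for their blown-up versions.

```java
class PairToPattern {
    // Builds one digit pattern from a pair, e.g. (13, 17) gives "1[3-7]".
    static String toPattern(int start, int end) {
        String s = Integer.toString(start);
        String e = Integer.toString(end);
        if (s.length() != e.length()) {
            throw new IllegalArgumentException("start and end must have the same number of digits");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char a = s.charAt(i);
            char b = e.charAt(i);
            if (a == b) {
                sb.append(a);                                            // fixed digit
            } else {
                sb.append('[').append(a).append('-').append(b).append(']'); // digit class
            }
        }
        return sb.toString();
    }
}
```

For example, toPattern(13, 17) returns "1[3-7]" and toPattern(200, 239) returns "2[0-3][0-9]".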

    So here is an alternative implementation of getRegexPairs based on this recursion principle:

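    // requires java.util.ArrayList and java.util.List; assumes 1 <= start
    private static List<Integer> getRegexPairs(int start, int end)
    {
      List<Integer> pairs = new ArrayList<>();
      if (start > end) return pairs; // empty range
      int firstEndingWith0 = 10*((start+9)/10); // first number ending with 0
      if (firstEndingWith0 > end) // not in range?
      {
        // start and end differ only at last digit
        pairs.add(start);
        pairs.add(end);
        return pairs;
      }

      if (start < firstEndingWith0) // start is not ending in 0
      {
        pairs.add(start);
        pairs.add(firstEndingWith0-1);
      }

      // last number in range ending with 9 (end+1 keeps an end that itself ends in 9)
      int lastEndingWith9 = 10*((end+1)/10)-1;
      // all regex for the range [firstEndingWith0,lastEndingWith9] end with [0-9]
      List<Integer> pairsMiddle = getRegexPairs(firstEndingWith0/10, lastEndingWith9/10);
      // NOTE: the source breaks off inside the following loop header; the rest
      // is reconstructed from the description above and is an assumption.
      for (int i=0; i<pairsMiddle.size(); i+=2)
      {
        // blow up each pair by appending all possible last digits
        pairs.add(pairsMiddle.get(i)*10);
        pairs.add(pairsMiddle.get(i+1)*10+9);
      }

      if (lastEndingWith9 < end) // end is not ending in 9
      {
        pairs.add(lastEndingWith9+1);
        pairs.add(end);
      }
      return pairs;
    }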
