A regular expression generator for number ranges

长情又很酷 2021-02-04 05:56

I checked the Stack Exchange description, and algorithm questions are among the allowed topics, so here goes.

Given an input of a range, where begin and ending number

9 answers
  •  伪装坚强ぢ
    2021-02-04 06:09

    Bezmax's answer is close but doesn't quite solve the problem correctly; I believe a few details are wrong. I have fixed the issues and written the algorithm in C++. The main problem in Bezmax's algorithm is as follows:

    The prev function should produce the following: 387 -> 380, 379 -> 300, 299 -> 100, 99 -> 10, 9 -> 0. Bezmax instead had: 387 -> 380, 379 -> 300, 299 -> 0.

    Bezmax had 299 "weakening" to 0, which could leave part of the range out in certain circumstances. Basically, you want to weaken to the lowest number you can without ever changing the number of digits. The full solution is too much code to post here, but here are the important parts. Hope this helps someone.

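        #include <cassert>
        #include <cstdlib>   // std::div
        #include <utility>   // std::pair
        #include <vector>
    
        // Find the next number that is advantageous for regular expressions.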
        //
        // Starting at the right most decimal digit convert all zeros to nines. Upon
        // encountering the first non-zero convert it to a nine and stop. The output
        // always has the same number of digits as the input.
        // examples: 100->999, 0->9, 5->9, 9->9, 14->19, 120->199, 10010->10099
        static int Next(int val)
        {
           assert(val >= 0);
    
           // keep track of how many nines to add to val.
           int addNines = 0;
    
           do {
              auto res = std::div(val, 10);
              val = res.quot;
              ++addNines;
              if (res.rem != 0) {
                 break;
              }
           } while (val != 0);
    
           // add the nines
           for (int i = 0; i < addNines; ++i) {
              val = val * 10 + 9;
           }
    
           return val;
        }
    
        // Find the previous number that is advantageous for regular expressions.
        //
        // If the number is a single digit number convert it to zero and stop. Else...
        // Starting at the right most decimal digit convert all trailing 9's to 0's
        // unless the digit is the most significant digit - change that 9 to a 1. Upon
        // encounter with first non-nine digit convert it to a zero (or 1 if most
        // significant digit) and stop. The output always has the same number of digits
        // as the input.
        // examples: 0->0, 1->0, 29->10, 999->100, 10199->10000, 10->10, 399->100
        static int Prev(int val)
        {
           assert(val >= 0);
    
           // special case all single digit numbers reduce to 0
           if (val < 10) {
              return 0;
           }
    
           // keep track of how many zeros to add to val.
           int addZeros = 0;
    
           for (;;) {
              auto res = std::div(val, 10);
              val = res.quot;
              ++addZeros;
              if (res.rem != 9) {
                 break;
              }
    
              if (val < 10) {
                 val = 1;
                 break;
              }
           }
    
           // add the zeros
           for (int i = 0; i < addZeros; ++i) {
              val *= 10;
           }
    
           return val;
        }
    
        // Create a vector of ranges that covers [start, end] that is advantageous for
        // regular expression creation. Must satisfy end>=start>=0.
        static std::vector<std::pair<int, int>> MakeRegexRangeVector(const int start,
                                                                     const int end)
        {
           assert(start <= end);
           assert(start >= 0);
    
           // keep track of the remaining portion of the range not yet placed into
           // the forward and reverse vectors.
           int remainingStart = start;
           int remainingEnd = end;
    
           std::vector<std::pair<int, int>> forward;
           while (remainingStart <= remainingEnd) {
              auto nextNum = Next(remainingStart);
              // is the next number within the range still needed.
              if (nextNum <= remainingEnd) {
                 forward.emplace_back(remainingStart, nextNum);
                 // increase remainingStart as portions of the numeric range are
                 // transferred to the forward vector.
                 remainingStart = nextNum + 1;
              } else {
                 break;
              }
           }
           std::vector<std::pair<int, int>> reverse;
           while (remainingEnd >= remainingStart) {
              auto prevNum = Prev(remainingEnd);
              // is the previous number within the range still needed.
              if (prevNum >= remainingStart) {
                 reverse.emplace_back(prevNum, remainingEnd);
                 // reduce remainingEnd as portions of the numeric range are
                 // transferred to the reverse vector.
                 remainingEnd = prevNum - 1;
              } else {
                 break;
              }
           }
    
           // is there any part of the range not accounted for in the forward and
           // reverse vectors?
           if (remainingStart <= remainingEnd) {
              // add the unaccounted for part - this is guaranteed to be expressible
              // as a single regex substring.
              forward.emplace_back(remainingStart, remainingEnd);
           }
    
           // Concatenate, in reverse order, the reverse vector to forward.
           forward.insert(forward.end(), reverse.rbegin(), reverse.rend());
    
           // Some sanity checks.
           // size must be non zero.
           assert(forward.size() > 0);
    
           // verify starting and ending points of the range
           assert(forward.front().first == start);
           assert(forward.back().second == end);
    
           return forward;
        }
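
The answer stops at the list of (low, high) pairs and leaves out the step that turns each pair into a regex substring. The following is only a minimal sketch of how that step might look, not the author's code. The names `PairToRegex` and `RangesToRegex` are hypothetical, and the sketch assumes each pair has the shape described above: both ends have the same number of digits, and after the first digit that differs, the low end is all zeros and the high end is all nines.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the original answer): turn one (lo, hi)
// pair into a regex fragment, one literal digit or character class per
// decimal position.
// Assumes 0 <= lo <= hi, equal digit counts, and that every position after
// the first differing one is 0 in lo and 9 in hi.
static std::string PairToRegex(int lo, int hi)
{
   assert(0 <= lo && lo <= hi);
   const std::string a = std::to_string(lo);
   const std::string b = std::to_string(hi);
   assert(a.size() == b.size());

   std::string out;
   bool differed = false;
   for (std::size_t i = 0; i < a.size(); ++i) {
      if (differed) {
         // past the first differing digit the pair must span 0..9
         assert(a[i] == '0' && b[i] == '9');
      }
      if (a[i] == b[i]) {
         out += a[i];   // identical digit: emit it literally
      } else {
         differed = true;
         out += '[';
         out += a[i];
         out += '-';
         out += b[i];
         out += ']';
      }
   }
   return out;
}

// Hypothetical helper: join the fragments into one non-capturing alternation.
// The caller is expected to add anchors (for example ^ and $) as needed.
static std::string RangesToRegex(const std::vector<std::pair<int, int>>& ranges)
{
   std::string out = "(?:";
   for (std::size_t i = 0; i < ranges.size(); ++i) {
      if (i != 0) {
         out += '|';
      }
      out += PairToRegex(ranges[i].first, ranges[i].second);
   }
   out += ')';
   return out;
}
```

For example, tracing `MakeRegexRangeVector(387, 1234)` by hand gives the pairs (387, 389), (390, 399), (400, 999), (1000, 1199), (1200, 1229), (1230, 1234), which this sketch would join as `(?:38[7-9]|39[0-9]|[4-9][0-9][0-9]|1[0-1][0-9][0-9]|12[0-2][0-9]|123[0-4])`. The `(?:...)` group syntax is an assumption about the target regex flavor; plain parentheses work where non-capturing groups are unavailable.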
    
