A regular expression generator for number ranges

长情又很酷 2021-02-04 05:56

I checked the Stack Exchange description, and algorithm questions are one of the allowed topics. So here goes.

Given an input of a range, where begin and ending number

9 answers
  •  说谎 (original poster)
    2021-02-04 06:05

    One option (for a range [n, m]) is to generate the regexp n|n+1|...|m-1|m. However, I think you're after something more optimised. You can still do essentially the same thing: generate an FSM that matches each number using a distinct path through the state machine, apply any of the well-known FSM minimisation algorithms to get a smaller machine, then turn that into a more condensed regular expression (since "regular expressions" without the Perl extensions are equivalent in expressive power to finite state machines).
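
A minimal Python sketch of the unoptimised option, for illustration only. The function name `naive_range_regex` is made up here and is not part of the original answer; it simply joins every number in the range with `|`.

```python
import re


def naive_range_regex(n: int, m: int) -> str:
    """Return the alternation n|n+1|...|m as a regex string (hypothetical helper)."""
    return "|".join(str(i) for i in range(n, m + 1))


pattern = naive_range_regex(107, 112)
# fullmatch anchors the whole alternation, so "1070" is rejected even though it starts with "107".
matches = [s for s in ("106", "107", "112", "113", "1070") if re.fullmatch(pattern, s)]
print(pattern)
print(matches)
```

The pattern grows linearly with the size of the range, which is why a more condensed form is usually preferable.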

    Let's say we are looking at the range [107, 112]:

    state1:
      1 -> state2
      * -> NotOK
    state2:
      0 -> state2.0
      1 -> state2.1
      * -> NotOK
    state2.0:
      7 -> OK
      8 -> OK
      9 -> OK
      * -> NotOK
    state2.1:
      0 -> OK
      1 -> OK
      2 -> OK
      * -> NotOK
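
The state listing above can be transcribed into a small transition table, sketched here in Python. This is an illustration under two assumptions that the listing does not spell out: a missing entry stands for the `* -> NotOK` rule, and the OK state has no outgoing edges, so extra digits after acceptance are rejected. The names `TRANSITIONS` and `accepts` are invented for this sketch.

```python
# Transition table mirroring the states above; a missing digit means NotOK.
TRANSITIONS = {
    "state1": {"1": "state2"},
    "state2": {"0": "state2.0", "1": "state2.1"},
    "state2.0": {"7": "OK", "8": "OK", "9": "OK"},
    "state2.1": {"0": "OK", "1": "OK", "2": "OK"},
    "OK": {},  # assumption: no outgoing edges, so trailing digits are rejected
}


def accepts(text: str) -> bool:
    """Run the machine over text and report whether it ends in the OK state."""
    state = "state1"
    for ch in text:
        state = TRANSITIONS[state].get(ch)
        if state is None:  # fell into NotOK
            return False
    return state == "OK"


accepted = [n for n in range(0, 2000) if accepts(str(n))]
print(accepted)
```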
    

    We can't really reduce this machine any further. We can see that state2.0 corresponds to the RE [789] and state2.1 corresponds to [012]. It follows that state2 is (0[789])|(1[012]) and the whole is 1((0[789])|(1[012])); the outer parentheses are needed because alternation binds more loosely than concatenation.
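
As a rough sketch of how this derivation could be automated, the Python below builds a digit trie for the range and emits a regex, grouping digits whose subtrees produce the same sub-expression into one character class. This is a simplified stand-in for full DFA minimisation, not the algorithm itself, and the names `build_trie`, `to_regex` and `range_regex` are hypothetical. It uses non-capturing groups `(?:...)` only as grouping syntax.

```python
import re


def build_trie(numbers):
    """Build a digit trie: each node maps a digit to a child node."""
    root = {}
    for number in numbers:
        node = root
        for digit in str(number):
            node = node.setdefault(digit, {})
        node[""] = {}  # empty key marks "a number may end here"
    return root


def to_regex(node):
    """Return a regex matching every digit suffix reachable from this node."""
    accepts_here = "" in node
    by_suffix = {}  # sub-regex -> digits leading to an equivalent subtree
    for digit in sorted(node):
        if digit:
            by_suffix.setdefault(to_regex(node[digit]), []).append(digit)
    alternatives = []
    for suffix, digits in by_suffix.items():
        head = digits[0] if len(digits) == 1 else "[" + "".join(digits) + "]"
        alternatives.append(head + suffix)
    if not alternatives:
        return ""
    if accepts_here:
        # The number may stop here, so everything that follows is optional.
        return "(?:" + "|".join(alternatives) + ")?"
    if len(alternatives) == 1:
        return alternatives[0]
    return "(?:" + "|".join(alternatives) + ")"


def range_regex(n, m):
    """Regex matching exactly the decimal integers in [n, m]."""
    return to_regex(build_trie(range(n, m + 1)))


pattern = range_regex(107, 112)
print(pattern)
```

For [107, 112] this yields the same shape as the hand-derived expression, with non-capturing groups in place of plain parentheses. It should be checked with `re.fullmatch`, since the generated pattern carries no anchors of its own.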

    Further reading on DFA minimization can be found on Wikipedia (and pages linked from there).
