Partial matching a string against a regex

梦毁少年i 2021-02-04 01:02

Suppose that I have this regular expression: /abcd/. Suppose that I want to check the user input against that regex and disallow entering invalid characters in the input. When user

6 answers
  •  梦谈多话
    2021-02-04 01:49

    I strongly suspect (although I'm not 100% sure) that the general case of this problem has no solution, in the same way as Turing's famous halting problem (see Undecidable problem). And even if there is a solution, it most probably will not be what users actually want, so depending on how strict you are it will result in bad-to-horrible UX.

    Example:

    Assume the "target RegEx" is [a,b]*c[a,b]*. Also assume that you produced a "test RegEx" that looks reasonable at first glance, [a,b]*c?[a,b]* (obviously two c's in the string are invalid, yeah?), and that the current user input is aabcbb, but it contains a typo: what the user actually wanted is aacbbb. There are many possible ways to fix this typo:

    • remove c and add it before the first b - will work OK
    • remove the first b and add it after c - will work OK
    • add c before the first b and then remove the old one - Oops, we reject this input as invalid, and the user will go crazy because no normal human can understand such logic.
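
To make the third path concrete, here is a minimal sketch that replays it against both regexes. It assumes the patterns are anchored with ^ and $ so that the whole input must match; the variable names are illustrative and not from the question.

```javascript
// Anchored versions of the two regexes from the example (anchoring is an assumption).
const targetRegex = /^[a,b]*c[a,b]*$/; // what the final input must match
const testRegex = /^[a,b]*c?[a,b]*$/;  // what every intermediate input must match

// Third fix path: type a new "c" before the first "b", then delete the old "c".
const editSteps = ["aabcbb", "aacbcbb", "aacbbb"];

const report = editSteps.map((value) => ({
  value,
  allowedWhileTyping: testRegex.test(value),
  validFinal: targetRegex.test(value),
}));

for (const row of report) {
  console.log(row.value, row.allowedWhileTyping, row.validFinal);
}
```

The middle step, aacbcbb, briefly contains two c's, so the test regex rejects it even though the user is one keystroke away from a valid value.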

    Note also that your hitEnd will have the same problem here unless you prohibit the user from entering characters in the middle of the input box, which would be another way to make a horrible UI.

    In real life there would be many far more complicated examples that none of your smart heuristics will be able to account for properly, and they will upset users.

    So what to do? I think the only thing you can do and still get reasonable UX is also the simplest: just analyze your "target RegEx" for the set of allowed characters and make your "test RegEx" [set of allowed chars]*. And yes, if the "target RegEx" contains the . wildcard, you will not be able to do any reasonable filtering at all.
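
As a rough sketch of that idea, the hypothetical helper below (allowedCharsFilter is an invented name, not a library function) collects the literal characters of a very simple pattern and builds the [set of allowed chars]* filter from them. It assumes the target pattern uses only literals, bracket classes without ranges or negation, and the * + ? quantifiers; for anything else, including the . wildcard, it gives up and returns null.

```javascript
// Hypothetical helper: builds /^[allowed chars]*$/ from a *simple* pattern source.
// Returns null when the pattern contains "." or syntax this sketch does not handle.
function allowedCharsFilter(targetSource) {
  const chars = new Set();
  let inClass = false;
  for (const ch of targetSource) {
    if (inClass) {
      if (ch === "]") inClass = false;
      else if (ch === "^" || ch === "-" || ch === "\\") return null; // ranges, negation, escapes
      else chars.add(ch);
    } else if (ch === "[") {
      inClass = true;
    } else if (ch === ".") {
      return null; // wildcard: every character is allowed, nothing to filter
    } else if ("*+?".includes(ch)) {
      continue; // quantifiers contribute no characters
    } else if ("\\(){}|^$]".includes(ch)) {
      return null; // outside the scope of this sketch
    } else {
      chars.add(ch);
    }
  }
  if (inClass) return null; // unterminated bracket class
  // Escape the characters that are special inside a bracket class.
  const body = [...chars].map((c) => c.replace(/[\\\]^-]/g, "\\$&")).join("");
  return new RegExp(`^[${body}]*$`);
}

const filter = allowedCharsFilter("[a,b]*c[a,b]*");
console.log(filter.test("aacbcbb")); // intermediate state with two c's is accepted
console.log(filter.test("aaxb"));    // "x" never appears in the target, so it is rejected
console.log(allowedCharsFilter("a.*b")); // null: the wildcard defeats the filter
```

This filter only blocks characters that can never appear; the full target regex still has to be checked when the user submits the value.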
