Regex only allow letters and some characters

孤街浪徒 2021-02-14 18:04

I am attempting to create a regex that only allows letters, upper or lowercase, and the characters space, '-', ',', '.', '(', and ')'. This is what I have so far but

4 Answers
  • 2021-02-14 18:30

    I tried that with JavaScript and it works fine. The others are correct, though. If you are using JavaScript, check that the rest of the script runs without errors; otherwise the validation will not run at all.

  • 2021-02-14 18:34

    Well, one issue is that the sequence space, -, comma is being interpreted as a range, like a-z, allowing all characters from space to comma. Escape the - and at least some of the bugs should be fixed.

    ^[a-zA-Z \-,.()]*$
    

    You could also escape the . and (), since those have special meaning elsewhere in regular expressions. Inside a [] character class they are treated literally (the JavaScript regex engine, where I was testing, does so), so the escapes are optional, but being explicit does no harm.

    ^[a-zA-Z \-,\.\(\)]*$
    

    However, even the unescaped version should not allow the digits 0-9, so the code that uses this regular expression probably has an issue as well.
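A quick way to see the difference is to run both patterns over a few inputs. This is only a sketch: the question text is cut off, so the unescaped pattern below is an assumption reconstructed from the answers, and the sample strings are made up for illustration.

```javascript
// Assumed original pattern (the question is truncated, so this is a reconstruction).
const unescapedPattern = /^[a-zA-Z -,.()]*$/;
// Escaped version suggested in the answer.
const escapedPattern = /^[a-zA-Z \-,.()]*$/;

// Hypothetical sample inputs.
const sampleInputs = ["Smith, J. (Jr.)", "Anne-Marie", "a#b", "abc123"];

// Each row: [input, accepted by unescaped pattern, accepted by escaped pattern].
const comparison = sampleInputs.map((s) => [
  s,
  unescapedPattern.test(s),
  escapedPattern.test(s),
]);

for (const [input, before, after] of comparison) {
  console.log(JSON.stringify(input), "unescaped:", before, "escaped:", after);
}
```

The unescaped pattern accepts "a#b" (because # falls inside the space-to-comma range) and rejects "Anne-Marie" (because the hyphen itself is no longer in the class). Neither pattern accepts "abc123".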

  • 2021-02-14 18:51

    The sequence space, -, comma in [a-zA-Z -,.()] describes a range from the space character (0x20) to , (0x2C). That is equivalent to [ !"#$%&'()*+,]. You should either escape the - or place it somewhere it is not interpreted as a range indicator.

    But that's not why digits would be accepted, as the digits run from 0x30 to 0x39, which lies outside that range.
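The contents of that range can be checked by enumerating the code points directly. A minimal sketch (the variable names are just for illustration):

```javascript
// List every character in the range from " " (0x20) to "," (0x2C).
const rangeStart = " ".charCodeAt(0); // 0x20
const rangeEnd = ",".charCodeAt(0);   // 0x2C

let rangeChars = "";
for (let code = rangeStart; code <= rangeEnd; code++) {
  rangeChars += String.fromCharCode(code);
}
console.log(JSON.stringify(rangeChars));

// The range on its own, as a single-character test.
const spaceToComma = /^[ -,]$/;
const hyphenInRange = spaceToComma.test("-"); // "-" is 0x2D, one past the end of the range
const digitInRange = spaceToComma.test("5");  // digits start at 0x30
console.log(hyphenInRange, digitInRange);
```

The loop yields the thirteen characters from space through comma, and neither the hyphen nor a digit matches the range.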

  • 2021-02-14 18:53

    - is special in a character class. It is used to define a range, as you've done with a-z.

    To match a literal - you need to either escape it or place it so that it does not function as a range operator:

    ^[a-zA-Z \-,.()]*$
             ^^ escaping \ 
    

    or

    ^[-a-zA-Z ,.()]*$
      ^ placing it at the beginning.
    

    or

    ^[a-zA-Z ,.()-]*$
                 ^ placing it at the end.
    

    and interestingly

    ^[a-z-A-Z ,.()]*$
         ^ placing in the middle of two ranges.
    

    In the final case the - is placed between a-z and A-Z. Since both characters surrounding it (z and A) are already endpoints of ranges, this - cannot start a new range and is treated literally again.

    Of all the mentioned methods, escaping is recommended as it makes your code easier to read and understand. Anyone seeing the \ will expect that an escape is intended. Placing the - at the beginning (or end) will create problems if you later add a character before (or after) it in the character class without escaping the -, thus forming a range.
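The sketch below compares four ways of writing a literal hyphen in the class and then shows the maintenance pitfall just described. The later edit that appends "_" is a hypothetical change invented for illustration, as are the sample inputs.

```javascript
// Four ways of writing a literal "-" in the class; all should behave the same.
const hyphenForms = [
  /^[a-zA-Z \-,.()]*$/, // escaped
  /^[-a-zA-Z ,.()]*$/,  // first in the class
  /^[a-zA-Z ,.()-]*$/,  // last in the class
  /^[a-z-A-Z ,.()]*$/,  // between two ranges
];
const probeInputs = ["Anne-Marie", "a#b", "abc123"];
const formResults = hyphenForms.map((re) => probeInputs.map((s) => re.test(s)));
console.log(formResults);

// Hypothetical later edit: "_" is appended after a trailing, unescaped "-".
// ")-_" now forms a range from 0x29 to 0x5F, which includes the digits.
const brokenByEdit = /^[a-zA-Z ,.()-_]*$/;
// The same edit applied to the escaped form keeps "-" literal.
const safeAfterEdit = /^[a-zA-Z ,.()\-_]*$/;

const brokenAcceptsDigits = brokenByEdit.test("abc123");
const safeAcceptsDigits = safeAfterEdit.test("abc123");
const safeAcceptsUnderscore = safeAfterEdit.test("a_b-c");
console.log(brokenAcceptsDigits, safeAcceptsDigits, safeAcceptsUnderscore);
```

All four forms accept "Anne-Marie" and reject the other two inputs. After the hypothetical edit, the unescaped version starts accepting digits, while the escaped version does not.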
