I am attempting to create a regex that only allows letters, upper or lowercase, and the characters space, '-', ',', '.', '(', and ')'. This is what I have so far but
I tried that with JavaScript and it works fine. The others are correct, though. If you are using JavaScript, check that the rest of the script runs without errors, or else the check will not happen at all.
Well, there is an issue in that the sequence  -, (space, hyphen, comma) is being interpreted as a range, like a-z, allowing all characters from space to comma. Escape the hyphen and at least some of the bugs should be fixed.
^[a-zA-Z \-,.()]*$
Strictly speaking, you should probably also escape the . and the ( and ), since those have special meaning in regular expressions outside a character class. The JavaScript regex engine (where I was testing) seems to interpret them literally within a [] context anyway, but it's always far better to be explicit.
^[a-zA-Z \-,\.\(\)]*$
However, even the original pattern shouldn't be allowing the digits 0-9, so the actual code that uses this regular expression probably has an issue as well.
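As a quick check of the points above, here is a minimal sketch, assuming a JavaScript environment such as Node.js or a browser console. The variable names and sample strings are made up for illustration. It compares the unescaped class from the question with the escaped one.

```javascript
// Pattern from the question: " -," is parsed as a range from space (0x20) to comma (0x2C).
const unescapedClass = /^[a-zA-Z -,.()]*$/;
// Corrected pattern: the hyphen is escaped, so it is a literal character.
const escapedClass = /^[a-zA-Z \-,.()]*$/;

// Illustrative inputs: a valid name, a string with "!", and a string with digits.
const sampleInputs = ["Smith-Jones, A. (Jr)", "wrong!", "abc123"];

for (const s of sampleInputs) {
  console.log(JSON.stringify(s), unescapedClass.test(s), escapedClass.test(s));
}
```

The unescaped class rejects the hyphenated name (the hyphen itself, 0x2D, lies just outside the range) and accepts "wrong!", while the escaped class does the opposite. Both reject "abc123", which is consistent with the digit problem lying in the surrounding code rather than in the pattern.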
The sequence  -, (space, hyphen, comma) in [a-zA-Z -,.()] describes a range from the space character (0x20) to , (0x2C). That is equivalent to [ !"#$%&'()*+,]. You should either escape the - or place it somewhere else where it is not interpreted as a range indicator.
But that’s not the cause of this issue as the digits are from 0x30 to 0x39.
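To see exactly which characters the accidental range admits, a short sketch (JavaScript assumed; the variable names are illustrative) can enumerate code points 0x20 through 0x2C and probe one character inside the range and one digit above it:

```javascript
// Build the string of every character from 0x20 (space) through 0x2C (comma).
const rangeChars = Array.from({ length: 0x2c - 0x20 + 1 }, (_, i) =>
  String.fromCharCode(0x20 + i)
).join("");
console.log(JSON.stringify(rangeChars));

// The class as written in the question, with the unintended range.
const questionPattern = /^[a-zA-Z -,.()]*$/;
console.log(questionPattern.test("&")); // "&" is 0x26, inside the range
console.log(questionPattern.test("7")); // "7" is 0x37, above the end of the range
```

The range covers 13 characters, and the digits (0x30 to 0x39) start after it ends, so the range alone does not let digits through.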
- is special in a character class. It is used to define a range, as you've done with a-z.
To match a literal - you need to either escape it or place it where it will not function as a range operator:
^[a-zA-Z \-,.()]*$
         ^^ escaping with \
or
^[-a-zA-Z ,.()]*$
  ^ placing it at the beginning.
or
^[a-zA-Z ,.()-]*$
             ^ placing it at the end.
and interestingly
^[a-z-A-Z ,.()]*$
     ^ placing it in the middle of two ranges.
In the final case, - is placed between a-z and A-Z. Since both of the characters surrounding the - (the one we want to treat literally), that is z and A, are already involved in ranges, the - is treated literally again.
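The four placements can be checked side by side. This is a minimal sketch assuming JavaScript regex literals without the v flag; behavior in other engines is not verified here, and the names and test strings are invented for illustration. Each class is written so that the only hyphen outside a-z and A-Z is the literal one.

```javascript
// Four ways to include a literal hyphen in the character class.
const variants = {
  escaped: /^[a-zA-Z \-,.()]*$/,
  atStart: /^[-a-zA-Z ,.()]*$/,
  atEnd: /^[a-zA-Z ,.()-]*$/,
  betweenRanges: /^[a-z-A-Z ,.()]*$/,
};

const shouldMatch = "Smith-Jones, A. (Jr)";
const shouldFail = ["abc123", "wrong!", "a+b"];

// For each variant, record whether it accepts the valid string and rejects the rest.
const results = {};
for (const [name, re] of Object.entries(variants)) {
  results[name] = re.test(shouldMatch) && shouldFail.every((s) => !re.test(s));
  console.log(name, results[name]);
}
```

All four variants behave the same on these inputs, so the choice between them comes down to readability.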
Of all the mentioned methods, escaping is recommended as it makes your code easier to read and understand. Anyone seeing the \ would expect that an escape is intended. Placing the - at the beginning (or end) will create problems if you later add a character before (or after) it in the character class without escaping the -, thus forming a range.
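The maintenance hazard can be illustrated with a small sketch. It assumes JavaScript, and the later edit (appending an underscore) is a hypothetical change chosen only for demonstration.

```javascript
// Hyphen placed last: literal, because nothing follows it inside the class.
const hyphenLast = /^[a-zA-Z ,.()-]*$/;

// Hypothetical later edit: "_" is appended after the hyphen without escaping it.
// ")-_" is now a range from 0x29 to 0x5F, which includes the digits and more.
const afterEdit = /^[a-zA-Z ,.()-_]*$/;

// Same edit with the hyphen escaped: "_" is simply one more literal character.
const escapedEdit = /^[a-zA-Z ,.()\-_]*$/;

for (const s of ["a-b", "abc123", "a_b"]) {
  console.log(JSON.stringify(s), hyphenLast.test(s), afterEdit.test(s), escapedEdit.test(s));
}
```

With the unescaped hyphen, the edited class silently starts accepting "abc123". The escaped version accepts only the new underscore.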