I'm looking for some regex code that I can use to check for a valid username.
I would like for the username to have letters (both upper case and lower case), number
So it looks like you want your username to have a "word" part (sequence of letters or numbers), interspersed with some "separator" part.
The regex will look something like this:
^[a-z0-9]+(?:[ _.-][a-z0-9]+)*$
Here's a schematic breakdown:
           _____sep-word…____
          /                  \
^[a-z0-9]+(?:[ _.-][a-z0-9]+)*$             i.e. "word ( sep word )*"
|\_______/   \____/\_______/  |
| "word"     "sep"  "word"    |
|                             |
from beginning of string...   till the end of string
So essentially we want to match things like word, word-sep-word, word-sep-word-sep-word, etc. In other words, there is never a sep without a word in between, and the first and last characters are always part of a word (i.e. not a sep char).
Note that in [ _.-], the - comes last so that it's not treated as a range definition metacharacter. The (?:…) is what is called a non-capturing group. We need the brackets for grouping for the repetition (i.e. (…)*), but since we don't need the capture, we can use (?:…)* instead.
To allow uppercase or various Unicode letters, just expand the character class or use more flags (a case-insensitive flag, for example) as necessary.
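As a rough illustration of the pattern above with a case-insensitive flag, here is a minimal Python sketch. The function name is_valid_username is made up for this example, and using Python's re module is an assumption, since the question doesn't name a language.

```python
import re

# Same pattern as above. re.IGNORECASE lets [a-z] also match A-Z, and
# re.ASCII keeps the case-insensitive matching restricted to ASCII letters.
USERNAME_RE = re.compile(
    r"^[a-z0-9]+(?:[ _.-][a-z0-9]+)*$",
    re.IGNORECASE | re.ASCII,
)

def is_valid_username(name: str) -> bool:
    """Return True if name is a run of words joined by single separators."""
    # fullmatch requires the whole string to match, so a trailing newline
    # cannot slip past the $ anchor.
    return USERNAME_RE.fullmatch(name) is not None

for candidate in ["john", "John.Doe", "a-b_c 9", "john..doe", "-john", "john-"]:
    print(repr(candidate), is_valid_username(candidate))
```

The first three candidates are accepted; the last three are rejected because of a doubled, leading, or trailing separator.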
Although I'm sure someone will shortly post a 1 million lines regex to do exactly what you want, I don't think in this case a regex is a good solution.
Why don't you write a good old fashioned parser? It will take about as long as writing the regex that does everything you mentioned, but it's going to be much easier to maintain and read.
In particular, this is the tricky part:
it should also not allow for any of the special characters listed above to be repeated more than once in succession
Alternatively you can always do a hybrid of the two: a regex for the other checks ([a-zA-Z0-9][a-zA-Z0-9 _.-]*[a-zA-Z0-9]) and a non-regex method for the no-repeat requirement.
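For comparison, the fully hand-written approach suggested earlier in this answer might look something like the sketch below. It assumes the allowed special characters are space, underscore, dot and hyphen (taken from the regexes in this thread), and it reads the no-repeat rule as "no two special characters next to each other". The name check_username is hypothetical.

```python
import string

WORD_CHARS = set(string.ascii_letters + string.digits)
SEPARATORS = set(" _.-")  # assumed set of allowed special characters

def check_username(name: str) -> bool:
    """Validate a username with a plain loop instead of a regex."""
    if not name:
        return False
    # The first and last characters must be a letter or a digit.
    if name[0] not in WORD_CHARS or name[-1] not in WORD_CHARS:
        return False
    previous_was_separator = False
    for ch in name:
        if ch in WORD_CHARS:
            previous_was_separator = False
        elif ch in SEPARATORS:
            # Reject two special characters in a row.
            if previous_was_separator:
                return False
            previous_was_separator = True
        else:
            return False  # character outside the allowed set
    return True
```

Each rule sits in its own branch, so adding or relaxing a rule later is a local change rather than a rewrite of one long pattern.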
You don't have to use a regex for everything. I find that requirements like the "no two consecutive characters" usually make the regexes so ugly that it's better to do that bit with a simple procedural loop.
I'd just use something like ^[A-Za-z0-9][A-Za-z0-9 \.\-_]*[A-Za-z0-9]$ (or an equivalent using a POSIX class such as [:alnum:], if your regex engine supports it) and then just check every character in a loop to make sure the next character isn't the same.
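A rough Python sketch of that regex-plus-loop combination could look like this. Here the repeat check is applied only to the special characters, which is an assumption about what the question wants; the names SHAPE_RE, SPECIALS and is_valid_username are invented for the example.

```python
import re

# Coarse shape check, as in the pattern above: alphanumeric at both ends,
# allowed characters in between. Note that it needs at least two characters.
SHAPE_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9 .\-_]*[A-Za-z0-9]")
SPECIALS = " .-_"

def is_valid_username(name: str) -> bool:
    """Regex for the overall shape, then a loop for the no-repeat rule."""
    if SHAPE_RE.fullmatch(name) is None:
        return False
    # Compare each character with the one that follows it.
    for current, following in zip(name, name[1:]):
        if current in SPECIALS and current == following:
            return False
    return True
```

With this reading, "john.doe" and "john.-doe" pass while "john..doe" fails. To forbid any two adjacent special characters instead, the loop condition would test whether both characters are in SPECIALS.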
By doing it procedurally, you can check all the other rules you're likely to want at some point without resorting to what I call "regex gymnastics", things like:
and so forth.