This seems to match the rules I have defined, but I only started learning regex tonight, so I am wondering if it is correct.
Rules:
The specs in the question aren't very clear, so I'll just assume the string can contain only ASCII letters and digits, with hyphens, underscores and spaces as internal separators. The meat of the problem is ensuring that the first and last characters are not separators, and that there's never more than one separator in a row (that part seems clear, anyway). Here's the simplest way:
/^[A-Za-z0-9]+(?:[ _-][A-Za-z0-9]+)*$/
After matching one or more alphanumeric characters, if there's a separator it must be followed by one or more alphanumerics; repeat as needed.
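If you want to check that pattern against a few names, here is a minimal sketch in Python. The is_valid helper and the sample names are made up for illustration. It uses fullmatch in place of ^ and $, since Python's $ also matches just before a trailing newline.

```python
import re

# The pattern from above: runs of alphanumerics joined by single separators.
NAME_RE = re.compile(r"[A-Za-z0-9]+(?:[ _-][A-Za-z0-9]+)*")

def is_valid(name: str) -> bool:
    # fullmatch anchors the pattern at both ends of the string
    return NAME_RE.fullmatch(name) is not None

samples = ["a", "john_doe", "a-b c", "_john", "john-", "jo__hn", "jo -hn", ""]
for s in samples:
    print(repr(s), is_valid(s))
```

The first three samples should be accepted and the rest rejected: a leading separator, a trailing separator, two separators in a row (same or mixed), and the empty string.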
Let's look at regexes from some of the other answers.
/^[[:alnum:]]+(?:[-_ ]?[[:alnum:]]+)*$/
This is effectively the same (assuming your regex flavor supports the POSIX character-class notation), but why make the separator optional? The only reason you'd be in that part of the regex in the first place is if there's a separator or some other, invalid character.
/^[a-zA-Z0-9]+([_\s\-]?[a-zA-Z0-9])*$/
On the other hand, this only works because the separator is optional. After the first separator, it can only match one alphanumeric at a time. To match more, it has to keep repeating the whole group: zero separators followed by one alphanumeric, over and over. If the second [a-zA-Z0-9] were followed by a plus sign, it could find a match by a much more direct route.
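As a rough check that the plus sign changes only the route and not the result, this sketch runs the pattern as posted and the same pattern with a plus sign after the second character class over a few made-up samples. It uses Python's re module, with fullmatch standing in for ^ and $.

```python
import re

# Pattern as posted: one alphanumeric per repetition of the group.
one_at_a_time = re.compile(r"[a-zA-Z0-9]+([_\s\-]?[a-zA-Z0-9])*")
# Same pattern with a plus sign after the second character class.
with_plus = re.compile(r"[a-zA-Z0-9]+([_\s\-]?[a-zA-Z0-9]+)*")

samples = ["abc", "ab-cd", "a_b c", "ab--cd", "-ab", "ab-"]
results = [
    (s, bool(one_at_a_time.fullmatch(s)), bool(with_plus.fullmatch(s)))
    for s in samples
]
for s, a, b in results:
    print(f"{s!r:10} {a} {b}")
```

Both columns should agree on every sample. Note that \s also lets tabs and newlines through as separators, which may or may not be what you want.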
/^[a-zA-Z0-9][a-zA-Z0-9_\s\-]*[a-zA-Z0-9](?<![_\s\-]{2,}.*)$/
This uses unbounded lookbehind, which is a very rare feature, but you can use a lookahead to the same effect:
/^(?!.*[_\s-]{2,})[a-zA-Z0-9][a-zA-Z0-9_\s\-]*[a-zA-Z0-9]$/
This performs essentially a separate search for two consecutive separators, and fails the match if it finds one. The main body then only needs to make sure all the characters are alphanumerics or separators, with the first and last being alphanumerics. Since those two are required, the name must be at least two characters long.
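Here is a small sketch of the lookahead version in Python, mainly to show the two-character minimum. The accepts helper and the samples are hypothetical. The anchors are dropped in favour of fullmatch.

```python
import re

# Negative lookahead first: reject if two separators occur back to back
# anywhere in the string. Then the body: alphanumeric at both ends.
LOOKAHEAD_RE = re.compile(
    r"(?!.*[_\s-]{2,})[a-zA-Z0-9][a-zA-Z0-9_\s\-]*[a-zA-Z0-9]"
)

def accepts(name: str) -> bool:
    return LOOKAHEAD_RE.fullmatch(name) is not None

for s in ["ab", "a-b", "a", "a--b", "a- b", "-ab", "ab_"]:
    print(repr(s), accepts(s))
```

The single-character name "a" is rejected even though it breaks none of the separator rules, because the body demands two alphanumerics.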
/^[a-zA-Z0-9]+([a-zA-Z0-9](_|-| )[a-zA-Z0-9])*[a-zA-Z0-9]+$/
This is your own regex. It requires at least two alphanumeric characters before the first separator and at least two after the last one, and if there are two separators within the string, there have to be exactly two alphanumerics between them. So ab, ab-cd and ab-cd-ef will match, but a, a-b and a-b-c won't.
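You can confirm those cases with a quick sketch in Python (fullmatch stands in for ^ and $; the extra sample ab-cde-fg is my own addition, to show the "exactly two" restriction):

```python
import re

# The regex from the question, unchanged apart from the dropped anchors.
ASKER_RE = re.compile(
    r"[a-zA-Z0-9]+([a-zA-Z0-9](_|-| )[a-zA-Z0-9])*[a-zA-Z0-9]+"
)

samples = ["ab", "ab-cd", "ab-cd-ef", "a", "a-b", "a-b-c", "ab-cde-fg"]
outcome = {s: bool(ASKER_RE.fullmatch(s)) for s in samples}
for s, ok in outcome.items():
    print(s, ok)
```

The first three should match and the last four should not. ab-cde-fg fails because there are three alphanumerics between its two separators.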
Also, as some of the commenters have pointed out, the (_|-| ) in your regex should be [-_ ]. That part's not incorrect, but if you have a choice between an alternation and a character class, you should always go with the character class: they're more efficient as well as more readable.
Again, I'm not worried about whether "alphanumeric" is supposed to include non-ASCII characters, or the exact meaning of "space", just how to enforce a policy of non-contiguous internal separators with a regex.
([a-zA-Z0-9](_|-| )[a-zA-Z0-9])* is zero or more repetitions of alphanum, dashspace, alphanum (by "dashspace" I mean an underscore, hyphen or space).
So it would match
a_aa_aa_a
but not
aaaaa
The complete regexp can't match a_aaaaaaaaa_a, for example.
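A quick sketch in Python to try the group on its own and then the complete regexp. The names GROUP_RE and FULL_RE are made up here, and fullmatch is used so the whole string has to be consumed.

```python
import re

# Just the repeated group from the question.
GROUP_RE = re.compile(r"([a-zA-Z0-9](_|-| )[a-zA-Z0-9])*")
# The complete regexp from the question (anchors replaced by fullmatch).
FULL_RE = re.compile(
    r"[a-zA-Z0-9]+([a-zA-Z0-9](_|-| )[a-zA-Z0-9])*[a-zA-Z0-9]+"
)

group_on_triples = bool(GROUP_RE.fullmatch("a_aa_aa_a"))  # three a_a units
group_on_plain = bool(GROUP_RE.fullmatch("aaaaa"))        # no separators
full_on_long = bool(FULL_RE.fullmatch("a_aaaaaaaaa_a"))

print(group_on_triples, group_on_plain, full_on_long)
```

Only the first check should succeed. The last one fails because every separator has to sit in the middle of an alphanum, dashspace, alphanum unit, and those units have to follow each other directly.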
Let's look back at what you want:
* Usernames can consist of lowercase and capital letters
* Usernames can consist of alphanumeric characters
* Usernames can consist of underscore and hyphens and spaces
* Cannot be two underscores, two hyphens or two spaces in a row
* Cannot have an underscore, hyphen or space at the start or end
The beginning is simple: just match an alphanum, then (ignoring the two-in-a-row rule) an (alphanum or dashspace)*, and at the end an alphanum again.
To prevent the two dashspaces in a row you probably need to understand lookahead/lookbehind.
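To make that concrete, here is one possible sketch in Python. It reads the rule literally, so only the same separator twice in a row is forbidden. That reading is an assumption, and so are the BODY and NO_DOUBLE names. The lookahead uses a backreference to catch an immediately repeated separator.

```python
import re

# Body only: alphanum, then (alphanum or dashspace)*, then alphanum.
BODY = r"[a-zA-Z0-9][a-zA-Z0-9_ -]*[a-zA-Z0-9]"
# Negative lookahead: fail if some separator is directly followed by itself.
NO_DOUBLE = r"(?!.*([_ -])\1)"

body_only = re.compile(BODY)
with_rule = re.compile(NO_DOUBLE + BODY)

for s in ["a_b", "a__b", "a_-b", "_ab"]:
    print(repr(s), bool(body_only.fullmatch(s)), bool(with_rule.fullmatch(s)))
```

Without the lookahead, a__b slips through. With it, a__b is rejected while a_-b is still allowed. If mixed pairs should be rejected too, replace the lookahead with (?!.*[_ -]{2}).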
Oh, and regarding the other answer: Please download Espresso, it REALLY helps you understand those things.
Using the POSIX character class for alphanumeric characters to make it work for accented and other foreign alphabetic characters:
/^[[:alnum:]]+([-_ ]?[[:alnum:]])*$/
More efficient (prevents captures):
/^[[:alnum:]]+(?:[-_ ]?[[:alnum:]]+)*$/
These also prevent sequences of more than one space/hyphen/underscore in combination. It doesn't follow from your specification whether that is desirable, but your own regex seems to indicate this is what you want.
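Not every flavor has the POSIX notation. Python's re module, for one, does not support [[:alnum:]]. A rough stand-in there is [^\W_], meaning "a word character that is not an underscore", which covers Unicode letters and digits for str patterns. The sketch below makes that substitution; it is an approximation of the POSIX class, not an exact equivalent, and the sample names are made up.

```python
import re

# [^\W_] stands in for [[:alnum:]]: any word character except underscore.
ALNUM = r"[^\W_]"
UNICODE_NAME_RE = re.compile(rf"{ALNUM}+(?:[-_ ]?{ALNUM}+)*")

def accepts(name: str) -> bool:
    return UNICODE_NAME_RE.fullmatch(name) is not None

# "Jos\u00e9 Mar\u00eda" is "José María" written with escapes so the
# accented letters are single precomposed code points.
for s in ["Jos\u00e9 Mar\u00eda", "Zo\u00eb_1", "Jose--Maria", "_Jose"]:
    print(repr(s), accepts(s))
```

The two accented names should be accepted, and the double hyphen and the leading underscore rejected.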
Your regex doesn't work. The hard part is the check for consecutive spaces/hyphens. You could use this one, which uses look-behind:
/^[a-zA-Z0-9][a-zA-Z0-9_\s\-]*[a-zA-Z0-9](?<![_\s\-]{2,}.*)$/
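Be aware that this depends on your regex flavor allowing variable-length look-behind. As one data point, Python's re module only accepts fixed-width look-behind, so it refuses to compile this pattern, as the sketch below shows (the variable names are mine).

```python
import re

# The look-behind pattern from above, verbatim.
LOOKBEHIND = r"^[a-zA-Z0-9][a-zA-Z0-9_\s\-]*[a-zA-Z0-9](?<![_\s\-]{2,}.*)$"

try:
    re.compile(LOOKBEHIND)
    supported = True
except re.error as exc:
    # Python's re rejects look-behind whose width is not fixed.
    supported = False
    print("rejected:", exc)

print("supported:", supported)
```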