Regular expression for valid subdomain in Ruby

Question

I'm attempting to validate a string of user input that will be used as a subdomain. The rules are as follows:

1. Between 1 and 63 characters in length (I take 63 from the number of characters Google Chrome appears to allow in a subdomain, not sure if it's actually a server directive. If you have better advice on valid max length, I'm interested in hearing it)
2. May contain a-zA-Z0-9, hyphen, underscore
3. May not begin or end with a hyphen or underscore

EDIT: From input below, I've added the following:

4. Should not contain consecutive hyphens or underscores.

Examples:

a => valid
0 => valid
- => not valid
_ => not valid
a- => not valid
-a => not valid
a_ => not valid
_a => not valid
aa => valid
aaa => valid
a-a-a => valid
0-a => valid
a&a => not valid
a-_0 => not valid
a--a => not valid
aaa- => not valid

My issue is I'm not sure how to specify with a RegEx that the string is allowed to be only one character, while also specifying that it may not begin or end with a hyphen or underscore.

Thanks!

Answer 1:

You can have underscores in subdomains, but do you need them? After trimming your input, do a simple string length check, then test with this:

/^[a-z\d]+(-[a-z\d]+)*$/i

With the above, you won't get consecutive - characters, e.g. a-bbb-ccc passes and a--d fails.

/^[a-z\d]+([-_][a-z\d]+)*$/i

This variant allows non-consecutive underscores as well.
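As a rough sketch, the trim, length check, and pattern test described above could be combined like this. The method name `valid_subdomain?` and the constant name are hypothetical, and the 63-character cap is taken from the question rather than from this answer. The anchors are swapped to `\A` and `\z` because Ruby's `^` and `$` match at line boundaries, not only at the ends of the string.

```ruby
# Hypothetical helper; the names and the 63-character cap are assumptions.
SUBDOMAIN_RE = /\A[a-z\d]+([-_][a-z\d]+)*\z/i

def valid_subdomain?(input)
  s = input.to_s.strip
  # Length is checked separately, as the answer suggests.
  return false unless (1..63).cover?(s.length)
  SUBDOMAIN_RE.match?(s)
end

# Inputs the question lists as valid.
%w[a 0 aa aaa a-a-a 0-a].each { |s| puts "#{s} => #{valid_subdomain?(s)}" }
# Inputs the question lists as not valid.
%w[- _ a- -a a_ _a a&a a-_0 a--a aaa-].each { |s| puts "#{s} => #{valid_subdomain?(s)}" }
```

Because every separator must be followed by at least one letter or digit, leading, trailing, and consecutive separators are all rejected by the same rule.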

Update: you'll find that, in practice, underscores are disallowed and all subdomains must start with a letter. The solution above does not allow internationalised subdomains (punycode). You're better off using this:

/\A([a-z][a-z\d]*(-[a-z\d]+)*|xn--[\-a-z\d]+)\z/i
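A small sketch of how this last pattern behaves in Ruby, including why it uses `\A` and `\z` instead of `^` and `$`. The constant names are hypothetical, and `LINE_ANCHORED_RE` is included only for comparison.

```ruby
# The answer's final pattern; the constant name is hypothetical.
LABEL_RE = /\A([a-z][a-z\d]*(-[a-z\d]+)*|xn--[\-a-z\d]+)\z/i
# Same idea with line anchors, for comparison only.
LINE_ANCHORED_RE = /^[a-z][a-z\d]*(-[a-z\d]+)*$/i

puts LABEL_RE.match?("a-b")             # letter first, single hyphens
puts LABEL_RE.match?("xn--bcher-kva")   # punycode label, second alternative
puts LABEL_RE.match?("0-a")             # starts with a digit
puts LABEL_RE.match?("ok\n<script>")    # \A and \z must span the whole string
puts LINE_ANCHORED_RE.match?("ok\n<script>")  # ^ and $ only need one matching line
```

The last two lines show the practical difference: with `^` and `$`, a multi-line input can pass as long as any one line matches, which is usually not what a validator wants. Note that this pattern rejects `0` and `0-a`, which the question lists as valid.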

Answer 2:

I'm not familiar with Ruby regex syntax, but I'll assume it's like, say, Perl. Sounds like you want:

/^(?![-_])[-a-z\d_]{1,63}(?<![-_])$/i

Or if Ruby doesn't use the i flag, just replace [-a-z\d_] with [-a-zA-Z\d_].

The reason I'm using [-a-zA-Z\d_] instead of the shorter [-\w] is that, while nearly equivalent, \w will allow special characters such as ä rather than just ASCII-type characters. That behavior can be optionally turned off in most languages, or you can allow it if you like.
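For what it's worth, Ruby (1.9 and later) accepts both the `i` flag and lookbehind, so the pattern can be tried more or less as written. The sketch below only swaps `^` and `$` for `\A` and `\z`; the constant name is hypothetical.

```ruby
# Lookahead rejects a leading separator, lookbehind rejects a trailing one,
# and the {1,63} quantifier enforces the length in the same pass.
LOOKAROUND_RE = /\A(?![-_])[-a-z\d_]{1,63}(?<![-_])\z/i

%w[a 0 a-a-a 0-a].each { |s| puts "#{s} => #{LOOKAROUND_RE.match?(s)}" }
%w[- _ a- -a a_ _a a&a].each { |s| puts "#{s} => #{LOOKAROUND_RE.match?(s)}" }

# Consecutive separators are not checked by this pattern.
%w[a--a a-_0].each { |s| puts "#{s} => #{LOOKAROUND_RE.match?(s)}" }
```

This covers rules 1 to 3 from the question. Rule 4 (no consecutive hyphens or underscores) would need an extra check, since `a--a` and `a-_0` still match here.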

Some more information on character classes, quantifiers, and lookarounds

Answer 3:

/^([a-z0-9][a-z0-9\-\_]{0,61}[a-z0-9]|[a-z0-9])$/i

I took it as a challenge to create a regex that matches only strings with non-repeating hyphens or underscores and also checks the proper length for you:

/^([a-z0-9]([_\-](?![_\-])|[a-z0-9]){0,61}[a-z0-9]|[a-z0-9])$/i

The middle part uses a lookaround to verify that.
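A quick way to check that pattern against the question's examples, sketched in Ruby. The constant name is hypothetical, and `^`/`$` are swapped for `\A`/`\z` so that multi-line input cannot slip through.

```ruby
# A separator in the middle is only accepted when the next character
# is not another separator (negative lookahead).
NO_REPEAT_RE = /\A([a-z0-9]([_\-](?![_\-])|[a-z0-9]){0,61}[a-z0-9]|[a-z0-9])\z/i

%w[a 0 aa aaa a-a-a 0-a].each { |s| puts "#{s} => #{NO_REPEAT_RE.match?(s)}" }
%w[- _ a- -a a_ _a a&a a-_0 a--a aaa-].each { |s| puts "#{s} => #{NO_REPEAT_RE.match?(s)}" }

# Length boundary: first + up to 61 middle + last = 63 characters at most.
puts NO_REPEAT_RE.match?("a" * 63)
puts NO_REPEAT_RE.match?("a" * 64)
```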

Answer 4:

^[a-zA-Z]([-a-zA-Z\d]*[a-zA-Z\d])?$

This simply enforces the standard in an efficient way without backtracking. It does not check the length, but regex is inefficient at things like that. Just check the string length separately (1 to 63 chars, since a DNS label is limited to 63 characters).
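A minimal Ruby sketch of that split between a plain length check and the pattern. The method and constant names are hypothetical, the cap used is the 63 characters from the question, and `\A`/`\z` replace `^`/`$`.

```ruby
# Letter first, letters/digits/hyphens in the middle, letter or digit last.
RFC_STYLE_RE = /\A[a-zA-Z]([-a-zA-Z\d]*[a-zA-Z\d])?\z/

def rfc_style_label?(s)
  # Cheap length check first, then the pattern.
  s.length.between?(1, 63) && RFC_STYLE_RE.match?(s)
end

%w[a aa a-a-a a--a].each { |s| puts "#{s} => #{rfc_style_label?(s)}" }
%w[0 0-a - a- -a a_a a&a].each { |s| puts "#{s} => #{rfc_style_label?(s)}" }
```

Compared with the question's rules, this is stricter in two places (no leading digit, no underscores) and looser in one (`a--a` is accepted).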

Answer 5:

/[^\W\_](.+?)[^\W\_]$/i should work for ya (try out http://rubular.com/ to test regular expressions)

EDIT: actually, this doesn't check single/double letter/numbers. try /([^\W\_](.+?)[^\W\_])|([a-z0-9]{1,2})/i instead, and tinker with it in rubular until you get exactly what ya want (if this doesn't take care of it already).
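If Rubular isn't handy, the same tinkering can be done in plain Ruby. The sketch below runs the second pattern against a few of the question's examples; the constant name is hypothetical. Because the pattern is unanchored and `.+?` accepts any character, it appears to match several inputs the question marks as not valid, so anchors and a narrower middle class are likely still needed.

```ruby
# Pattern copied from the EDIT above; the constant name is hypothetical.
ANSWER5_RE = /([^\W\_](.+?)[^\W\_])|([a-z0-9]{1,2})/i

# Listed as valid in the question.
%w[a aa a-a-a].each { |s| puts "#{s} => #{ANSWER5_RE.match?(s)}" }
# Listed as not valid, yet a match is still found somewhere in the string.
%w[a- -a a&a a--a].each { |s| puts "#{s} => #{ANSWER5_RE.match?(s)}" }
# No letter or digit at all, so nothing can match.
%w[- _].each { |s| puts "#{s} => #{ANSWER5_RE.match?(s)}" }
```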

Source: https://stackoverflow.com/questions/5196640/regular-expression-for-valid-subdomain-in-ruby

Tags

ruby-on-rails

ruby

regex

ruby-on-rails-3

subdomain