Regex: Match a string containing numbers and letters but not a string of just numbers

若如初见. Submitted on 2019-12-23 18:16:29

Question

I would like to be able to use a single regex (if possible) to require that a string fits [A-Za-z0-9_] but doesn't allow:

  • Strings containing just numbers and/or symbols
  • Strings starting or ending with symbols
  • Multiple symbols next to each other

Valid

  • test_0123
  • t0e1s2t3
  • 0123_test
  • te0_s1t23
  • t_t

Invalid

  • t__t
  • ____
  • 01230123
  • _0123
  • _test
  • _test123
  • test_
  • test123_

Reasons for the Rules

The purpose of this is to filter usernames for a website I'm working on. I've arrived at the rules for specific reasons.

  • Usernames with only numbers and/or symbols could cause problems with routing and database lookups. The route for /users/#{id} allows id to be either the user's id or user's name. So names and ids shouldn't be able to collide.

  • _test looks weird, and I don't believe it's a valid subdomain, i.e. _test.example.com

  • I don't like the look of t__t as a subdomain. i.e. t__t.example.com


Answer 1:


This matches exactly what you want:

/\A(?!_)(?:[a-z0-9]_?)*[a-z](?:_?[a-z0-9])*(?<!_)\z/i
  1. At least one alphabetic character (the [a-z] in the middle).
  2. Does not begin or end with an underscore (the (?!_) and (?<!_) at the beginning and end).
  3. May have any number of numbers, letters, or underscores before and after the alphabetic character, but every underscore must be separated by at least one number or letter (the rest).

Edit: In fact, you probably don't even need the lookahead/lookbehinds due to how the rest of the regex works - the first ?: parenthetical won't allow an underscore until after an alphanumeric, and the second ?: parenthetical won't allow an underscore unless it's before an alphanumeric:

/\A(?:[a-z0-9]_?)*[a-z](?:_?[a-z0-9])*\z/i

Should work fine.
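
A quick way to check the simplified pattern is to run it against the valid and invalid examples from the question. The Ruby sketch below does that; the names USERNAME, VALID, INVALID, and username? are illustrative and not part of the answer.

```ruby
# Simplified pattern from the answer; \A and \z anchor it to the whole string.
USERNAME = /\A(?:[a-z0-9]_?)*[a-z](?:_?[a-z0-9])*\z/i

# Example lists copied from the question.
VALID   = %w[test_0123 t0e1s2t3 0123_test te0_s1t23 t_t]
INVALID = %w[t__t ____ 01230123 _0123 _test _test123 test_ test123_]

# Hypothetical helper: true when the whole name matches the pattern.
def username?(name)
  USERNAME.match?(name)
end

VALID.each   { |name| puts "#{name}: #{username?(name) ? 'accepted' : 'UNEXPECTED reject'}" }
INVALID.each { |name| puts "#{name}: #{username?(name) ? 'UNEXPECTED accept' : 'rejected'}" }
```

Every name in VALID should print "accepted" and every name in INVALID should print "rejected".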




Answer 2:


I'm sure that you could put all this into one regular expression, but it won't be simple, and I'm not sure why you insist on it being one regex. Why not use multiple passes during validation? If the validation checks are done when users create a new account, there really isn't any reason to try to cram it into one regex. (That is, you will only be dealing with one item at a time, not hundreds or thousands or more. A few passes over a normal-sized username should take very little time, I would think.)

First reject the name if it contains any character other than a letter, number, or underscore; then reject it if it doesn't contain at least one letter; then check that the start and end are correct; etc. Each of those passes could be a simple-to-read and easy-to-maintain regular expression.
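
The passes described above could look roughly like this in Ruby. This is only a sketch: the method name valid_username? is hypothetical, and the individual checks follow the rules stated in the question.

```ruby
# Hypothetical multi-pass validator: each rule gets its own small check.
def valid_username?(name)
  return false unless name.match?(/\A[A-Za-z0-9_]+\z/)          # only letters, digits, underscores
  return false unless name.match?(/[A-Za-z]/)                   # at least one letter
  return false if name.start_with?("_") || name.end_with?("_")  # no underscore at either end
  return false if name.include?("__")                           # no two underscores in a row
  true
end

puts valid_username?("test_0123") # prints true
puts valid_username?("01230123")  # prints false
```

Each check can be read, tested, and changed on its own, which is the maintenance argument the answer is making.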




Answer 3:


What about:

/^(?=[^_])([A-Za-z0-9]+_?)*[A-Za-z](_?[A-Za-z0-9]+)*$/

It doesn't use a back reference.

Edit:

It succeeds for all your test cases and is Ruby-compatible.
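
One thing to keep in mind when using this in Ruby: ^ and $ match at line boundaries, not only at the start and end of the string, so input containing a newline can slip through. The sketch below compares the pattern as posted with a variant that uses \A and \z; the variant, the constant names, and the sample input are assumptions added for illustration.

```ruby
# Pattern as posted in the answer, with line anchors (^ and $).
LINE_ANCHORED = /^(?=[^_])([A-Za-z0-9]+_?)*[A-Za-z](_?[A-Za-z0-9]+)*$/

# Same body with string anchors (\A and \z). This variant is an assumption,
# not part of the answer.
STRING_ANCHORED = /\A(?=[^_])([A-Za-z0-9]+_?)*[A-Za-z](_?[A-Za-z0-9]+)*\z/

# Hypothetical input: a valid first line followed by a second, invalid line.
multiline = "t_t\n__"

puts LINE_ANCHORED.match?(multiline)
puts STRING_ANCHORED.match?(multiline)
```

The first call prints true because the first line alone satisfies the pattern; the second prints false because \z only matches at the very end of the string.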




Answer 4:


This doesn't block "__", but it does get the rest:

([A-Za-z]|[0-9][0-9_]*)([A-Za-z0-9]|_[A-Za-z0-9])*

And here's the longer form that gets all your rules:

([A-Za-z]|([0-9]+(_[0-9]+)*([A-Za-z]|_[A-Za-z])))([A-Za-z0-9]|_[A-Za-z0-9])*

Dang, that's ugly. I'll agree with Telemachus that you probably shouldn't do this with one regex, even though it's technically possible. Regex is often a pain for maintenance.




Answer 5:


The question asks for a single regexp, and implies that it should be a regexp that matches, which is fine, and answered by others. For interest, though, I note that these rules are rather easier to state directly as a regexp that should not match. I.e.:

x !~ /[^A-Za-z0-9_]|^_|_$|__|^[\d_]+$/
  • no other characters than letters, numbers and _
  • can't start with a _
  • can't end with a _
  • can't have two _s in a row
  • can't be all digits and _s

You can't use it this way in a Rails validates_format_of, but you could put it in a validate method for the class, and I think you'd have a much better chance of still being able to make sense of what you meant a month or a year from now.
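
As a rough sketch of the "should not match" approach, here is a plain-Ruby helper that could be called from such a validate method. The names BAD_USERNAME and acceptable_username? are hypothetical. Two details are assumptions on my part: the pattern uses \A and \z so that it applies to the whole string, and the last alternative rejects names made only of digits and underscores, following the question's first rule.

```ruby
# Any match against this pattern means the name breaks at least one rule.
BAD_USERNAME = /
  [^A-Za-z0-9_]   # a character other than a letter, digit, or underscore
  | \A_           # starts with an underscore
  | _\z           # ends with an underscore
  | __            # two underscores in a row
  | \A[\d_]+\z    # only digits and underscores, so no letter at all
/x

# Hypothetical helper: true when none of the "bad" alternatives match.
def acceptable_username?(name)
  name !~ BAD_USERNAME
end

puts acceptable_username?("0123_test") # prints true
puts acceptable_username?("t__t")      # prints false
```

Note that this form accepts the empty string, since nothing in it matches a "bad" alternative; a separate presence or length check would be needed for that.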




Answer 6:


Here you go:

^(([a-zA-Z]([^a-zA-Z0-9]?[a-zA-Z0-9])*)|([0-9]([^a-zA-Z0-9]?[a-zA-Z0-9])*[a-zA-Z]+([^a-zA-Z0-9]?[a-zA-Z0-9])*))$

If you want to restrict which symbols are accepted, simply replace every [^a-zA-Z0-9] with a [] class containing all the allowed symbols.
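
For example, to allow only the underscore, the substitution could be done like this in Ruby. The constant names are hypothetical; the pattern is otherwise copied from the answer.

```ruby
# Hypothetical: the symbols to allow, written as a character class.
ALLOWED_SYMBOLS = "[_]"

# The answer's pattern with every [^a-zA-Z0-9] swapped for ALLOWED_SYMBOLS.
RESTRICTED = /^(([a-zA-Z](#{ALLOWED_SYMBOLS}?[a-zA-Z0-9])*)|([0-9](#{ALLOWED_SYMBOLS}?[a-zA-Z0-9])*[a-zA-Z]+(#{ALLOWED_SYMBOLS}?[a-zA-Z0-9])*))$/

puts RESTRICTED.match?("te0_s1t23") # underscore is allowed
puts RESTRICTED.match?("te0-s1t23") # hyphen is not
```

This prints true and then false. Adding more symbols only means extending the class in ALLOWED_SYMBOLS, for instance "[_.]".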




Answer 7:


(?=.*[a-zA-Z].*)^[A-Za-z0-9](_?[A-Za-z0-9]+)*$

This one works.

Look ahead to make sure there's at least one letter in the string, then start consuming input. Every time there is an underscore, there must be a number or a letter before the next underscore.




Answer 8:


/^(?![\d_]+$)[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*$/

Your question is essentially the same as this one, with the added requirement that at least one of the characters has to be a letter. The negative lookahead - (?![\d_]+$) - takes care of that part, and is much easier (both to read and write) than incorporating it into the basic regex as some others have tried to do.
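
To make the two parts easier to see, here is the same idea written as a Ruby regex in extended mode with one comment per piece. The constant name USERNAME_RE is hypothetical, and using \A and \z in place of ^ and $ is an assumption made so that the pattern applies to the whole string.

```ruby
USERNAME_RE = /
  \A
  (?![\d_]+\z)          # negative lookahead: not only digits and underscores
  [A-Za-z0-9]+          # a first run of letters or digits
  (?:_[A-Za-z0-9]+)*    # then zero or more runs, each led by one underscore
  \z
/x

%w[0123_test t_t 01230123 t__t test_].each do |name|
  puts "#{name}: #{USERNAME_RE.match?(name)}"
end
```

The first two names should print true and the last three false.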




Answer 9:


[A-Za-z][A-Za-z0-9_]*[A-Za-z]

That would work for your first two rules (since it requires a letter at the beginning and end for the second rule, it automatically requires letters).

I'm not sure the third rule is possible using regexes.



Source: https://stackoverflow.com/questions/1240674/regex-match-a-string-containing-numbers-and-letters-but-not-a-string-of-just-nu
