Regular Expression for alphanumeric and underscores

北荒 2020-11-22 10:01

I would like to have a regular expression that checks if a string contains only upper and lowercase letters, numbers, and underscores.

20 answers
  •  逝去的感伤
    2020-11-22 10:51

    Although it's more verbose than \w, I personally appreciate the readability of the full POSIX character class names (http://www.zytrax.com/tech/web/regex.htm#special), so I'd say:

    ^[[:alnum:]_]+$
    

    However, while the documentation at the above link states that \w will "Match any character in the range 0 - 9, A - Z and a - z (equivalent of POSIX [:alnum:])", I have not found this to be true, at least not with grep -P: there, \w also matches the underscore. So you need to include the underscore explicitly if you use [:alnum:], but not if you use \w. For short and sweet, you can't beat the following:

    ^\w+$
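
    As a rough illustration of the underscore point, here is a minimal Python sketch. Python's re module has no POSIX bracket classes, so [A-Za-z0-9] stands in for [[:alnum:]] here, and the helper names is_word_only and is_alnum_only are made up for this example. re.ASCII restricts \w to [a-zA-Z0-9_], and re.fullmatch is used instead of ^...$ because in Python $ also matches just before a trailing newline.

```python
import re

# \w restricted to ASCII: letters, digits, and the underscore.
WORD_ONLY = re.compile(r"\w+", re.ASCII)
# Stand-in for [[:alnum:]] (assumption: ASCII letters and digits only).
ALNUM_ONLY = re.compile(r"[A-Za-z0-9]+")


def is_word_only(text: str) -> bool:
    """True if text is non-empty and holds only letters, digits, underscores."""
    return WORD_ONLY.fullmatch(text) is not None


def is_alnum_only(text: str) -> bool:
    """True if text is non-empty and holds only ASCII letters and digits."""
    return ALNUM_ONLY.fullmatch(text) is not None


if __name__ == "__main__":
    # Print how each sample fares against both patterns.
    for sample in ["abc_123", "abc123", "abc-123", "", "abc\n"]:
        print(repr(sample), is_word_only(sample), is_alnum_only(sample))
```

    The only sample on which the two helpers disagree is the one containing an underscore, which is the difference described above.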
    

    Along with readability, using the POSIX character classes (http://www.regular-expressions.info/posixbrackets.html) means that your regex can work on non-ASCII strings. Range-based regexes such as [A-Za-z0-9_] won't, because they rely on the underlying ordering of the ASCII characters, which may differ in other character sets. They will therefore exclude some non-ASCII characters (letters such as œ) that you might want to capture.
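
    To see the non-ASCII point in practice, the following Python sketch compares an explicit ASCII range with \w. It is only an illustration under assumptions: Python's re module has no POSIX classes, so its Unicode-aware \w (the default for str patterns in Python 3) stands in for a locale-aware class, and the function names are invented for the example.

```python
import re

# Explicit ASCII ranges: only A-Z, a-z, 0-9 and the underscore.
ASCII_RANGE = re.compile(r"[A-Za-z0-9_]+")
# Unicode-aware \w (the default for str patterns in Python 3).
UNICODE_WORD = re.compile(r"\w+")


def matches_ascii_range(text: str) -> bool:
    """True if the whole string fits the explicit ASCII ranges."""
    return ASCII_RANGE.fullmatch(text) is not None


def matches_unicode_word(text: str) -> bool:
    """True if the whole string consists of Unicode word characters."""
    return UNICODE_WORD.fullmatch(text) is not None


if __name__ == "__main__":
    # Print how each sample fares against both patterns.
    for sample in ["oeuvre_1", "œuvre_1", "œuvre 1"]:
        print(sample, matches_ascii_range(sample), matches_unicode_word(sample))
```

    Keep in mind that the Unicode-aware \w accepts letters and digits from any script, which may be broader than you want if the input is supposed to be plain ASCII.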
