Quite often, questions (especially those tagged regex) ask for ways to validate passwords. It seems users typically seek password validation methods that consist of
Our very own Jeff Atwood (blogger of Coding Horror and co-founder of Stack Overflow and Stack Exchange) wrote a blog post about password rules back in March of 2017 titled Password Rules are Bullshit. If you haven't read it, I would urge you to do so, as it closely mirrors the intent of this post.
If you have never heard of NIST (National Institute of Standards and Technology), then you're likely not using correct cybersecurity methods for your projects. In that case please take a look at their Digital Identity Guidelines. You should also stay up to date on best practices for cybersecurity. NIST Special Publication 800-63B (Revision 3) mentions the following about password rules:
Verifiers SHOULD NOT impose other composition rules (e.g. requiring mixtures of different character types or prohibiting consecutively repeated characters) for memorized secrets.
Even Mozilla's documentation on Form data validation pokes fun at password rules (page archive here):
"Your password needs to be between 8 and 30 characters long, and contain one uppercase letter, one symbol, and a number" (seriously?)
What happens if you impose composition rules for your passwords? You're limiting the number of potential passwords and removing password permutations that don't match your rules. This allows hackers to ensure their attacks do the same! "Ya but there's like a quadrillion (1,000,000,000,000,000 or 1x10^15) password permutations": 25-GPU cluster cracks every standard Windows password in <6 hours (95^8 = 6,634,204,312,890,625 ~ 6.6x10^15 passwords).
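The arithmetic behind that headline can be checked with a short sketch. The pool of 95 printable ASCII characters and the length of 8 come from the figure quoted above; the guess rate of 350 billion per second is an assumption (roughly the rate reported for that cluster against NTLM hashes), and the function name is purely illustrative.

```python
def hours_to_exhaust(pool_size: int, length: int, guesses_per_second: float) -> float:
    """Hours needed to try every password of exactly `length` characters."""
    keyspace = pool_size ** length  # number of candidate passwords
    return keyspace / guesses_per_second / 3600


# 95 printable ASCII characters, 8-character passwords.
keyspace = 95 ** 8

# ASSUMPTION: 350 billion guesses per second; substitute your own figure.
hours = hours_to_exhaust(95, 8, 350e9)

print(f"{keyspace:,} candidates, roughly {hours:.1f} hours to exhaust")
```

Every composition rule that removes candidates from that keyspace shortens the attacker's job further.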
This StackExchange Security post extends the XKCD comic above.
Stop requiring passwords altogether, and let people log in with Google, Facebook, Twitter, Yahoo, or any other valid form of Internet driver's license that you're comfortable with. The best password is one you don't have to store.
Source: Your Password is Too Damn Short by Jeff Atwood.
If you really must create your own authentication methods, at least follow proven cybersecurity methods. The following two sections (2.1 and 2.2) are taken from the current NIST publication, section 5.1.1.2 Memorized Secret Verifiers.
NIST states that you SHOULD:
Repetitive or sequential characters (e.g. aaaaaa, 1234abcd)
The same publication also states that you SHOULD NOT:
There are a plethora of websites out there explaining how to create "proper" password validation forms. The majority of these are outdated and should not be used.
Before you continue to read this section, please note that this section's intent is not to give you the tools necessary to roll out your own security scheme, but instead to give you information about how current security methods validate passwords. If you're considering creating your own security scheme, you should really think thrice and read this article from StackExchange's Security community.
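At the most basic level, password entropy can be calculated using the following formula:
$E = \log_2(R^L)$
In the above formula:
$E$ represents the password entropy in bits
$R$ is the number of characters in the pool of unique characters
$L$ is the number of characters in the password
This means that $R^L$ represents the number of possible passwords; or, in terms of entropy, the number of attempts required to exhaust all possibilities.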
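As a minimal sketch of that basic calculation, assuming the standard form E = log2(R^L), where R is the size of the character pool and L is the password length (the function name here is illustrative, not taken from any library):

```python
import math


def password_entropy_bits(pool_size: int, length: int) -> float:
    """Return E = log2(R^L), computed as L * log2(R) to avoid huge integers."""
    if pool_size < 1 or length < 0:
        raise ValueError("pool_size must be >= 1 and length must be >= 0")
    return length * math.log2(pool_size)


# 8 characters drawn from the 95 printable ASCII characters.
bits = password_entropy_bits(95, 8)
print(f"{bits:.2f} bits")
```

This only describes a password whose characters are chosen uniformly at random; a human-chosen password usually has far less entropy than the formula suggests.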
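Unfortunately, this formula doesn't account for things such as:
Common passwords (e.g. Password1, admin)
Names (e.g. John, Mary)
Commonly used words (e.g. the, I)
Reversed words (e.g. drowssap, which is password backwards)
Letter substitutions (e.g. P@$$w0rd)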
Adding logic for these additional considerations presents a large challenge. See 3.2 for existing packages that you can add to your projects.
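To give a feel for why this is hard, here is a deliberately naive sketch that only handles case, reversal, and a handful of letter substitutions. The blocklist and the substitution table are tiny, hypothetical stand-ins; real estimators rely on large dictionaries and pattern matching, which is why an existing package is the better choice.

```python
# HYPOTHETICAL blocklist; a real one would hold millions of entries.
COMMON_WORDS = {"password", "password1", "admin", "john", "mary"}

# HYPOTHETICAL and incomplete substitution table, for illustration only.
SUBSTITUTIONS = str.maketrans({"@": "a", "$": "s", "0": "o", "3": "e", "!": "i"})


def looks_weak(candidate: str) -> bool:
    """Return True if the candidate reduces to a blocklisted word."""
    lowered = candidate.lower()
    reversed_form = lowered[::-1]
    variants = {
        lowered,
        reversed_form,
        lowered.translate(SUBSTITUTIONS),
        reversed_form.translate(SUBSTITUTIONS),
    }
    return any(variant in COMMON_WORDS for variant in variants)


for sample in ("Password1", "drowssap", "P@$$w0rd", "correct horse battery staple"):
    print(sample, looks_weak(sample))
```

Even this small example misses combined tricks, other languages, keyboard patterns, and dates, which illustrates how quickly the rule set grows.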
At the time of writing this, the best known existing library for estimating password strength is zxcvbn by Dropbox (an open-source project on GitHub). It's been adapted to support .NET, AngularJS, C, C#, C++, Go, Java, JavaScript, Objective-C, OCaml, PHP, Python, REST, Ruby, Rust, and Scala.
I understand, however, that everyone has different requirements and that sometimes people want to do things the wrong way. For those of you that fit this criterion (or don't have a choice and have presented everything above this section and more to your manager but they refuse to update their methods), at least allow Unicode characters. The moment you limit the password characters to a specific set of characters (e.g. ensuring a lowercase ASCII character a-z exists, or specifying characters that the user can or cannot enter, such as !@#$%^&*()), you're just asking for trouble!
P.S. Never trust client-side validation, as it can very easily be disabled. That means for those of you trying to validate passwords using JavaScript: STOP. See JavaScript: client-side vs. server-side validation for more information.
The following regular expression pattern does not work in all programming languages, but it does in many of the major ones (Java, .NET, PHP, Perl, Ruby). Please note that the following regex may not work in your language (or even language version) and you may need to use alternatives (e.g. Python: see Python regex matching Unicode properties). Some programming languages even have better methods to check this sort of thing (e.g. using the Password Validation Plugin for MySQL) instead of reinventing the wheel. In Node.js, the following is valid if you use the XRegExp addon or some other conversion tool for Unicode classes, as discussed in Javascript + Unicode regexes.
If you need to prevent control characters from being entered, you can prompt the user when a regex match occurs using the pattern [^\P{C}\s]. This will ONLY match control characters that are not also whitespace characters, so whitespace such as horizontal tab, line feed, and vertical tab is not matched.
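Python's built-in re module does not support \p{...} classes, so the following is only an approximation of [^\P{C}\s] built on the standard unicodedata module. It assumes that "control character" means any code point whose Unicode general category starts with C, and it uses str.isspace() in place of \s, which may differ slightly from a given regex engine's definition of whitespace. The function name is illustrative.

```python
import unicodedata


def find_disallowed_characters(password: str) -> list:
    """Return characters in general category C* that are not whitespace."""
    return [
        ch
        for ch in password
        # Category "C*" covers control, format, surrogate, private-use
        # and unassigned code points.
        if unicodedata.category(ch).startswith("C") and not ch.isspace()
    ]


print(find_disallowed_characters("tab\tis fine"))
print(find_disallowed_characters("null\x00byte"))
```

If the returned list is non-empty, you can prompt the user instead of silently stripping the characters.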
The following regex ensures that at least one lowercase letter, one uppercase letter, one number, and one symbol exist in a password of 8 or more characters:
^(?=\P{Ll}*\p{Ll})(?=\P{Lu}*\p{Lu})(?=\P{N}*\p{N})(?=[\p{L}\p{N}]*[^\p{L}\p{N}])[\s\S]{8,}$
^: Assert position at the start of the line.
(?=\P{Ll}*\p{Ll}): Ensure at least one lowercase letter (in any script) exists.
(?=\P{Lu}*\p{Lu}): Ensure at least one uppercase letter (in any script) exists.
(?=\P{N}*\p{N}): Ensure at least one number character (in any script) exists.
(?=[\p{L}\p{N}]*[^\p{L}\p{N}]): Ensure at least one of any character (in any script) that isn't a letter or digit exists.
[\s\S]{8,}: Matches any character 8 or more times.
$: Assert position at the end of the line.
Please use the above regular expression at your own discretion. You have been warned!
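For languages without \p{...} support, the same composition checks can be approximated without a regex. The sketch below uses Python's standard unicodedata module and mirrors the lookaheads one by one; the function name and the min_length parameter are illustrative. Note that it counts Unicode code points, whereas some regex engines count UTF-16 code units, so length results can differ for characters outside the Basic Multilingual Plane.

```python
import unicodedata


def meets_composition_rules(password: str, min_length: int = 8) -> bool:
    """Approximate the regex: lower, upper, number, other, and minimum length."""
    categories = [unicodedata.category(ch) for ch in password]
    has_lower = "Ll" in categories                            # \p{Ll}
    has_upper = "Lu" in categories                            # \p{Lu}
    has_number = any(c.startswith("N") for c in categories)   # \p{N}
    has_other = any(c[0] not in "LN" for c in categories)     # [^\p{L}\p{N}]
    return (
        len(password) >= min_length
        and has_lower
        and has_upper
        and has_number
        and has_other
    )


for sample in ("Passw0rd!", "password", "Пароль1!", "Ab1!"):
    print(sample, meets_composition_rules(sample))
```

As with the regex, this belongs on the server side and carries the same caveats about composition rules discussed throughout this post.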