What data type to use for hashed password field and what length?

后端 未结 10 1840
孤街浪徒
孤街浪徒 2020-11-22 15:47

I\'m not sure how password hashing works (will be implementing it later), but need to create database schema now.

I\'m thinking of limiting passwords to 4-20 charact

相关标签:
10条回答
  • 2020-11-22 16:04

    I've always tested to find the MAX string length of an encrypted string and set that as the character length of a VARCHAR type. Depending on how many records you're going to have, it could really help the database size.

    0 讨论(0)
  • 2020-11-22 16:05

    Always use a password hashing algorithm: Argon2, scrypt, bcrypt or PBKDF2.

    Argon2 won the 2015 password hashing competition. Scrypt, bcrypt and PBKDF2 are older algorithms that are considered less preferred now, but still fundamentally sound, so if your platform doesn't support Argon2 yet, it's ok to use another algorithm for now.

    Never store a password directly in a database. Don't encrypt it, either: otherwise, if your site gets breached, the attacker gets the decryption key and so can obtain all passwords. Passwords MUST be hashed.

    A password hash has different properties from a hash table hash or a cryptographic hash. Never use an ordinary cryptographic hash such as MD5, SHA-256 or SHA-512 on a password. A password hashing algorithm uses a salt, which is unique (not used for any other user or in anybody else's database). The salt is necessary so that attackers can't just pre-calculate the hashes of common passwords: with a salt, they have to restart the calculation for every account. A password hashing algorithm is intrinsically slow — as slow as you can afford. Slowness hurts the attacker a lot more than you because the attacker has to try many different passwords. For more information, see How to securely hash passwords.

    A password hash encodes four pieces of information:

    • An indicator of which algorithm is used. This is necessary for agility: cryptographic recommendations change over time. You need to be able to transition to a new algorithm.
    • A difficulty or hardness indicator. The higher this value, the more computation is needed to calculate the hash. This should be a constant or a global configuration value in the password change function, but it should increase over time as computers get faster, so you need to remember the value for each account. Some algorithms have a single numerical value, others have more parameters there (for example to tune CPU usage and RAM usage separately).
    • The salt. Since the salt must be globally unique, it has to be stored for each account. The salt should be generated randomly on each password change.
    • The hash proper, i.e. the output of the mathematical calculation in the hashing algorithm.

    Many libraries include a pair functions that conveniently packages this information as a single string: one that takes the algorithm indicator, the hardness indicator and the password, generates a random salt and returns the full hash string; and one that takes a password and the full hash string as input and returns a boolean indicating whether the password was correct. There's no universal standard, but a common encoding is

    $algorithm$parameters$salt$output

    where algorithm is a number or a short alphanumeric string encoding the choice of algorithm, parameters is a printable string, and salt and output are encoded in Base64 without terminating =.

    16 bytes are enough for the salt and the output. (See e.g. recommendations for Argon2.) Encoded in Base64, that's 21 characters each. The other two parts depend on the algorithm and parameters, but 20–40 characters are typical. That's a total of about 82 ASCII characters (CHAR(82), and no need for Unicode), to which you should add a safety margin if you think it's going to be difficult to enlarge the field later.

    If you encode the hash in a binary format, you can get it down to 1 byte for the algorithm, 1–4 bytes for the hardness (if you hard-code some of the parameters), and 16 bytes each for the salt and output, for a total of 37 bytes. Say 40 bytes (BINARY(40)) to have at least a couple of spare bytes. Note that these are 8-bit bytes, not printable characters, in particular the field can include null bytes.

    Note that the length of the hash is completely unrelated to the length of the password.

    0 讨论(0)
  • 2020-11-22 16:07

    You might find this Wikipedia article on salting worthwhile. The idea is to add a set bit of data to randomize your hash value; this will protect your passwords from dictionary attacks if someone gets unauthorized access to the password hashes.

    0 讨论(0)
  • 2020-11-22 16:08

    It really depends on the hashing algorithm you're using. The length of the password has little to do with the length of the hash, if I remember correctly. Look up the specs on the hashing algorithm you are using, run a few tests, and truncate just above that.

    0 讨论(0)
  • You can actually use CHAR(length of hash) to define your datatype for MySQL because each hashing algorithm will always evaluate out to the same number of characters. For example, SHA1 always returns a 40-character hexadecimal number.

    0 讨论(0)
  • 2020-11-22 16:15

    for md5 vARCHAR(32) is appropriate. For those using AES better to use varbinary.

    0 讨论(0)
提交回复
热议问题