I want to create a token generator that generates tokens that cannot be guessed by the user and that are still unique (to be used for password resets and confirmation codes)
rand() is a security hazard and should never be used to generate a security token: rand() vs mt_rand() (Look at the "static" like images). But neither of these methods of generating random numbers is cryptographically secure. To generate secure secerts an application will needs to access a CSPRNG provided by the platform, operating system or hardware module.
In a web application a good source for secure secrets is non-blocking access to an entropy pool such as /dev/urandom
. As of PHP 5.3, PHP applications can use openssl_random_pseudo_bytes()
, and the Openssl library will choose the best entropy source based on your operating system, under Linux this means the application will use /dev/urandom
. This code snip from Scott is pretty good:
function crypto_rand_secure($min, $max) {
$range = $max - $min;
if ($range < 0) return $min; // not so random...
$log = log($range, 2);
$bytes = (int) ($log / 8) + 1; // length in bytes
$bits = (int) $log + 1; // length in bits
$filter = (int) (1 << $bits) - 1; // set all lower bits to 1
do {
$rnd = hexdec(bin2hex(openssl_random_pseudo_bytes($bytes)));
$rnd = $rnd & $filter; // discard irrelevant bits
} while ($rnd >= $range);
return $min + $rnd;
}
function getToken($length=32){
$token = "";
$codeAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$codeAlphabet.= "abcdefghijklmnopqrstuvwxyz";
$codeAlphabet.= "0123456789";
for($i=0;$i<$length;$i++){
$token .= $codeAlphabet[crypto_rand_secure(0,strlen($codeAlphabet))];
}
return $token;
}
First, the scope of this kind of procedure is to create a key/hash/code, that will be unique for one given database. It is impossible to create something unique for the whole world at a given moment.
That being said, you should create a plain, visible string, using a custom alphabet, and checking the created code against your database (table).
If that string is unique, then you apply a md5()
to it and that can't be guessed by anyone or any script.
I know that if you dig deep into the theory of cryptographic generation you can find a lot of explanation about this kind of code generation, but when you put it to real usage it's really not that complicated.
Here's the code I use to generate a simple 10 digit unique code.
$alphabet = "aA1!bB2@cC3#dD5%eE6^fF7&gG8*hH9(iI0)jJ4-kK=+lL[mM]nN{oO}pP\qQ/rR,sS.tT?uUvV>xX~yY|zZ`wW$";
$code = '';
$alplhaLenght = strlen($alphabet )-1;
for ($i = 1; $i <= 10; $i++) {
$n = rand(1, $alplhaLenght );
$code .= $alphabet [$n];
}
And here are some generated codes, although you can run it yourself to see it work:
SpQ0T0tyO%
Uwn[MU][.
D|[ROt+Cd@
O6I|w38TRe
Of course, there can be a lot of "improvements" that can be applied to it, to make it more "complicated", but if you apply a md5()
to this, it'll become, let's say "unguessable" . :)
I know the question is old, but it shows up in Google, so...
As others said, rand()
, mt_rand()
or uniqid()
will not guarantee you uniqueness... even openssl_random_pseudo_bytes()
should not be used, since it uses deprecated features of OpenSSL.
What you should use to generate random hash (same as md5) is random_bytes() (introduced in PHP7). To generate hash with same length as MD5:
bin2hex(random_bytes(16));
If you are using PHP 5.x you can get this function by including random_compat library.
Define "unique". If you mean that two tokens cannot have the same value, then hashing isn't enough - it should be backed with a uniqueness test. The fact that you supply the hash algorithm with unique inputs does not guarantee unique outputs.
I ran into an interesting idea a couple of years ago.
Storing two hash values in the datebase, one generated with md5($a) and the other with sha($a). Then chek if both the values are corect. Point is, if the attacker broke your md5(), he cannot break your md5 AND sha in the near future.
Problem is: how can that concept be used with the token generating needed for your problem?
MD5 is a decent algorithm for producing data dependent IDs. But in case you have more than one item which has the same bitstream (content), you will be producing two similar MD5 "ids".
So if you are just applying it to a rand()
function, which is guaranteed not to create the same number twice, you are quite safe.
But for a stronger distribution of keys, I'd personally use SHA1 or SHAx etc'... but you will still have the problem of similar data leads to similar keys.