PHP short unique ID generation using auto_increment?

强颜欢笑 提交于 2019-12-03 00:25:24

You'll need something that's correct by construction, i.e. a permutation function: this is a function that does a one-to-one, reversible mapping of one integer (your sequential counter) to another. Some examples (any combination of these should also work):

  • inverting some of the bits (f.i. using an XOR, ^ in PHP)
  • swapping the places of bits (($i & 0xc) >> 2 | ($i & 0x3) << 2), or just reversing the order of all bits
  • adding a constant value modulo your maximum range (must be a factor of two, if you're combining this with the ones above)

Example: this function will convert 0, 1, 2, 3, 5, .. into 13, 4, 12, 7, 15, .. for numbers up to 15:

$i=($input+97) & 0xf;
$result=((($i&0x1) << 3) + (($i&0xe) >> 1)) ^ 0x5;

EDIT

An easier way would to use a linear congruential generator (LCG, which is usually used for generating random numbers), which is defined by a formula of the form:

X_n+1 = (a * X_n + c) mod m

For good values of a, c and m, the sequence of X_0, X_1 .. X_m-1 will contain all numbers between 0 and m-1 exactly once. Now you can start from a linearly increasing index, and use the next value in the LCG sequence as your "secret" key.

EDIT2

Implementation: You can design your own LCG parameters, but if you get it wrong it won't cover the full range (and thus have duplicates) so I'll use a published and tried set of parameters here from this paper:

a = 16807, c = 0, m = 2147483647

This gives you a range of 2**31. With pack() you can get the resulting integer as a string, base64_encode() makes it a readable string (of up to 6 significant characters, 6 bits per byte) so this could be your function:

substr(base64_encode(pack("l", (16807 * $index) % 2147483647)), 0, 6)

You could probably generate a MD5 hash of the current datetime/random number and truncate it to the length you need (5-8 characters) and store it as the id field.

If you are using storing this information in a database, you don't need to use a for loop to do the collision check, but you could just do a select statement - something like

SELECT count(1) c FROM Table WHERE id = :id

where :id would be the newly generated id. If c is greater than 0 then you know it already exists.

EDIT

This may may not be the best way to go about it. But I'll give it a shot, so I guess what you need is someway of converting a numbers into a unique short string and that is not in sequence.

I guess as you said, base64 encoding already does the number to short string conversion. To avoid the sequence problem you could have some mapping between your auto-generated id's to some "random" value (unique mapping). Then you can base64 encode this unique value.

You could generate this mapping as follows. Have a temporary table store values from 1 - 10,000,000. Sort it in random order and store it into you Map table.

INSERT INTO MappingTable (mappedId) SELECT values FROM TemporaryTable ORDER BY RAND()

Where MappingTable would have the 2 fields id (your auto-generated id would look up against this) and mappedId (which is what you would generate the base64 encoding for).

As you get closer to 10,000,000 you could rerun the above code again and change the values in the temporary table with 10,000,001-20,000,000 or something like that.

you can use a bitwise XOR to scramble some of the bits:

select thefield ^ 377 from thetable;

+-----+---------+
| a   | a ^ 377 |
+-----+---------+
| 154 |     483 |
| 152 |     481 |
|  69 |     316 |
|  35 |     346 |
|  72 |     305 |
| 139 |     498 |
|  96 |     281 |
|  31 |     358 |
|  11 |     370 |
| 127 |     262 |
+-----+---------+

I think this will never be really secure, as you only need to find the encryption method behind the short unique string to hijack an ID. Is checking for collisions in a loop really that problematic in your setting?

An MD5 of an incrementing number should be fine, but I worry that if you're truncating your MD5 (which is normally 128 bits) down to 5-8 characters, you will almost certainly be damaging it's capability to act as a unique signature...

Completely true. Especially if you reach your 80% collision chance a truncated MD5 will be as good as any random number to guarantee uniqueness by itself, i.e. worthless.

But since you're using a database anyway, why not just use a UNIQUE INDEX ? This way the uniquness check is done (in a much more efficient way than using a loop) by MySQL itself. Just try to do the INSERT with your MD5-generated key, and if it fails, try again...

If you cannot use an auto increment field, and want an absolutely unique value, use UUID. If you decide to use anything else (besides auto increment), you would be silly to NOT check for collisions.

An MD5 of an incrementing number should be fine, but I worry that if you're truncating your MD5 (which is normally 128 bits) down to 5-8 characters, you will almost certainly be damaging it's capability to act as a unique signature...

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!