Question
Background
I have written a simple media client/server, and I want to generate a non-obvious time value to send with each command from the client to the server. The timestamps will carry a fair bit of data (nanosecond resolution, even if it is not truly accurate because of the limits of timer sampling in modern operating systems).
What I'm trying to do (on Linux, in C) is to generate a one-to-one sequence of n-bit values (let's assume the data is stored in 128-bit array-of-int elements for now) with no overlapping or colliding values. I would then take a pseudo-random 128-bit value as a "salt", apply it to the timestamp, and then start sending commands to the server, incrementing the pre-salted/pre-hashed value.
The timestamp is so large because it may have to accommodate a very long duration of time.
Question
How could I produce such a non-colliding sequence from an initial salt value? The approach closest to my goal is from this post, which notes:
If option 1 isn't "random" enough for you, use the CRC-32 hash of said global (32-bit) counter. There is a 1-to-1 mapping (bijection) between N-bit integers and their CRC-N so uniqueness will still be guaranteed.
However, I do not know:
- Whether that can (efficiently) be extended to 128-bit data.
- Whether adding or multiplying by a salt value to provide the initial seed for the sequence would disrupt it or introduce collisions.
Follow-up
I realize that I could use a 128-bit random hash from libssl or something similar, but I want the remote server, using the same salt value, to be able to convert the hashed timestamps back into their true values.
Thank you.
Answer 1:
You could use a linear congruential generator. With the right parameters, it is guaranteed to produce a non-repeating (unique) sequence with a full period, i.e. no collisions.
This is what random(3) uses in TYPE_0 mode. I adapted it for the full unsigned int range, and the seed can be any unsigned int (see my sample code below).
I believe it can be extended to 64 or 128 bits. I'd have a look at https://en.wikipedia.org/wiki/Linear_congruential_generator for the constraints on the parameters that prevent collisions and give good randomness.
Following the wiki page guidelines, you could produce a generator that takes any 128-bit value as the seed and will not repeat until all possible 128-bit numbers have been generated.
You may need to write a program to generate suitable parameter pairs and then test them for the "best" randomness. This would be a one time operation.
Once you've got them, just plug these parameters into your equation in your actual application.
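As a rough sketch of such a parameter test: for a power-of-two modulus, the Hull-Dobell theorem says the generator x -> a*x + c has a full period exactly when c is odd and a - 1 is divisible by 4. The helper names hulldobell_ok and lcg16_full_period are hypothetical, and the brute-force check runs on a 16-bit state only because walking all 2^128 values is not feasible; the same two conditions apply to the low bits of 128-bit parameters.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hull-Dobell conditions for a modulus of 2^k (k >= 2): the increment c
 * must be odd and the multiplier a must satisfy a % 4 == 1. Only the low
 * bits matter, so wider parameters can be checked through their low word. */
static bool hulldobell_ok(uint64_t a, uint64_t c)
{
    return (c & 1u) == 1u && (a & 3u) == 1u;
}

/* Brute-force confirmation on a 16-bit state: walk 65536 steps from seed
 * and check that every value is visited exactly once. */
static bool lcg16_full_period(uint16_t a, uint16_t c, uint16_t seed)
{
    static unsigned char seen[65536];
    uint16_t x = seed;

    memset(seen, 0, sizeof seen);
    for (uint32_t i = 0; i < 65536u; i++) {
        if (seen[x])
            return false;               /* repeated before the period ended */
        seen[x] = 1;
        /* widen to 32 bits first so the multiply cannot overflow a signed int */
        x = (uint16_t)((uint32_t)a * x + c);
    }
    return x == seed;                   /* back at the start after 2^16 steps */
}
```

The multiplier 1103515245 and increment 12345 pass hulldobell_ok, which is consistent with the unmasked variant in the sample code visiting every unsigned int once. Passing this check says nothing about statistical quality, which still has to be tested separately.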
Here's some code of mine that I had been playing with when I was looking for something similar:
    // ASSUMPTION: the original post does not show u32 or prng_p; these
    // definitions are a guess that makes the active branch compile.
    typedef unsigned int u32;
    typedef struct prng { u32 prng_state; } *prng_p;

    // _prngstd -- get random number
    static inline u32
    _prngstd(prng_p prng)
    {
        long rhs;
        u32 lhs;

        // NOTE: random is faster and has a _long_ period, but it _only_ produces
        // positive integers, whereas jrand48 produces positive _and_ negative
    #if 0
        rhs = jrand48(btc->btc_seed);
        lhs = rhs;
    #endif

        // this has collisions
    #if 0
        rhs = rand();
        PRNG_FLIP;
    #endif

        // this has collisions because it defaults to TYPE_3
    #if 0
        rhs = random();
        PRNG_FLIP;
    #endif

        // this is random in TYPE_0 (linear congruential) mode
    #if 0
        prng->prng_state = ((prng->prng_state * 1103515245) + 12345) & 0x7fffffff;
        rhs = prng->prng_state;
        PRNG_FLIP;
    #endif

        // this is random in TYPE_0 (linear congruential) mode with the mask
        // removed to get full range numbers
        // this does _not_ produce overlaps
    #if 1
        prng->prng_state = ((prng->prng_state * 1103515245) + 12345);
        rhs = prng->prng_state;
        lhs = rhs;
    #endif

        return lhs;
    }
Answer 2:
The short answer is encryption. Feed your set of 128-bit values into AES and you get a different set of 128-bit values out. Because encryption is reversible, the outputs are guaranteed to be unique for unique inputs under a fixed key.
Encryption is a reversible one-to-one mapping of the input values to the output values; each set is a full permutation of the other.
Since you are presumably not repeating your inputs, ECB mode is probably sufficient unless you want a greater degree of security. ECB mode is vulnerable if used repeatedly with identical inputs, which does not appear to be the case here.
For inputs shorter than 128 bits, use a fixed padding method to bring them up to the right length. As long as the uniqueness of the inputs is not affected, the padding can be reasonably flexible. Zero padding at either end (or at the beginning of internal fields) may well be sufficient.
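The zero-padding idea might look like the sketch below, assuming a 64-bit nanosecond timestamp that has to fill one 16-byte AES block. The function names pack_block and unpack_block and the big-endian layout are assumptions for illustration; the encryption call itself is left to whichever AES library is in use.

```c
#include <stdint.h>
#include <string.h>

/* Zero-pad a 64-bit timestamp into a 16-byte block: bytes 0..7 are zero,
 * bytes 8..15 hold the timestamp in big-endian order. Distinct timestamps
 * always give distinct blocks, so uniqueness survives the padding. */
static void pack_block(uint64_t ts, unsigned char block[16])
{
    memset(block, 0, 8);
    for (int i = 0; i < 8; i++)
        block[8 + i] = (unsigned char)(ts >> (56 - 8 * i));
}

/* Recover the timestamp from a decrypted block. Returns 0 on success, or
 * -1 if the padding bytes are not all zero (wrong key or corrupted data). */
static int unpack_block(const unsigned char block[16], uint64_t *ts)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        if (block[i] != 0)
            return -1;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | block[8 + i];
    *ts = v;
    return 0;
}
```

A side benefit of fixed padding is that the receiver can treat non-zero padding bytes after decryption as a sign that something went wrong, although this is not a substitute for a real integrity check.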
I do not know your detailed requirements, so feel free to modify my advice.
Answer 3:
Somewhere between linear congruential generators and encryption functions there are hashes that can convert linear counts into passable pseudorandom numbers.
If you happen to have 128-bit integer types handy (e.g., __int128 in GCC when building for a 64-bit target), or are willing to implement such long multiplies by hand, then you could extend the construction used in SplitMix64. I did a fairly superficial search and came up with the following parameters:
    uint128_t mix(uint128_t x) {
        uint128_t m0 = (uint128_t)0xecfb1b9bc1f0564f << 64
                     | 0xc68dd22b9302d18d;
        uint128_t m1 = (uint128_t)0x4a4cf0348b717188 << 64
                     | 0xe2aead7d60f8a0df;
        x ^= x >> 59;
        x *= m0;
        x ^= x >> 60;
        x *= m1;
        x ^= x >> 84;
        return x;
    }
and its inverse:
    uint128_t unmix(uint128_t x) {
        uint128_t im0 = (uint128_t)0x367ce11aef44b547 << 64
                      | 0x424b0c012b51d945;
        uint128_t im1 = (uint128_t)0xef0323293e8f059d << 64
                      | 0x351690f213b31b1f;
        x ^= x >> 84;
        x *= im1;
        x ^= x >> 60 ^ x >> (2 * 60);
        x *= im0;
        x ^= x >> 59 ^ x >> (2 * 59);
        return x;
    }
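To see why each step of the mixer can be undone, here is a small sketch that inverts the two building blocks separately: the xor-shift and the multiplication by an odd constant. The names u128, xorshift, unxorshift and odd_inverse are hypothetical, and the inverse multiplier is computed at run time by Newton iteration rather than taken from the constants above.

```c
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension on 64-bit targets */

/* Forward step: x ^= x >> s. */
static u128 xorshift(u128 x, unsigned s)
{
    return x ^ (x >> s);
}

/* Inverse of xorshift for 43 <= s < 128. The full inverse is
 * y ^ y>>s ^ y>>2s ^ y>>3s ..., but with s >= 43 every term from
 * 3s onward shifts all 128 bits out, so at most two terms remain. */
static u128 unxorshift(u128 y, unsigned s)
{
    u128 x = y ^ (y >> s);
    if (2 * s < 128)              /* shifting by >= 128 would be undefined */
        x ^= y >> (2 * s);
    return x;
}

/* Multiplicative inverse of an odd m modulo 2^128. Starting from m itself
 * gives 3 correct bits (m*m == 1 mod 8); each Newton step doubles them. */
static u128 odd_inverse(u128 m)
{
    u128 i = m;
    while (i * m != 1)
        i *= 2 - i * m;
    return i;
}
```

For the shifts used here, 59 and 60 need the second term while 84 does not, which matches the shape of unmix. Multiplication is only invertible because both multipliers are odd; an even multiplier would discard the top bit.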
I'm not sure whether you wanted just a random sequence or a way to obfuscate an arbitrary timestamp (since you said you wanted to decode the values, they must be more interesting than a linear counter), but one derives from the other simply enough:
    uint128_t encode(uint128_t time, uint128_t salt) {
        return mix((time + 1) * salt);
    }

    uint128_t generate(uint128_t salt) {
        static uint128_t t = 0;
        return encode(t++, salt);
    }

    /* multiplicative inverse modulo 2^128; d must be odd or this never ends */
    static uint128_t inv(uint128_t d) {
        uint128_t i = d;
        while (i * d != 1) {
            i *= 2 - i * d;
        }
        return i;
    }

    uint128_t decode(uint128_t etime, uint128_t salt) {
        return unmix(etime) * inv(salt) - 1;
    }
Note that salt chooses one of 2^127 sequences of non-repeating 128-bit values (we lose one bit because salt must be odd), but there are (2^128)! possible sequences that could have been generated. Elsewhere I'm looking at extending the parameterisation so that more of these sequences can be visited, but I started goofing around with the above method for increasing the randomness of the sequence to hide any problems where the parameters could pick not-so-random (but provably distinct) sequences.
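The odd-salt requirement can be illustrated with the multiplication on its own, leaving the mixer out. This is only a sketch: salt_layer and make_salt are hypothetical names, and setting the low bit is just one possible way to turn arbitrary random bits into a usable salt.

```c
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension on 64-bit targets */

/* The salt step on its own: the (time + 1) * salt part, before mixing. */
static u128 salt_layer(u128 time, u128 salt)
{
    return (time + 1) * salt;     /* wraps modulo 2^128 */
}

/* Force arbitrary random bits into an odd, and therefore invertible, salt. */
static u128 make_salt(u128 random_bits)
{
    return random_bits | 1;
}
```

With an even salt such as 2, the times 0 and 2^127 both map to 2, so two distinct timestamps collide and decoding becomes impossible. With an odd salt such as 3 the same two times map to 3 and 2^127 + 3, which stay distinct. Because the mixer is a bijection, a collision at this step would carry through to the final output.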
Obviously uint128_t isn't a standard type, and so my answer is not C, but you can use either a bignumber library or a compiler extension to make the arithmetic work. For clarity I relied on the compiler extension. All the operations rely on C-like unsigned overflow behaviour (take the low-order bits of the arbitrary-precision result).
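One way to get the type with the compiler extension is sketched below, assuming GCC or Clang on a 64-bit target. The helpers u128_from_halves and u128_to_hex are hypothetical; they are included because C has no syntax for 128-bit literals and printf has no conversion for the type.

```c
#include <stdint.h>
#include <stdio.h>

typedef unsigned __int128 uint128_t;  /* GCC/Clang extension, 64-bit targets */

/* Build a 128-bit value from two 64-bit halves. */
static uint128_t u128_from_halves(uint64_t hi, uint64_t lo)
{
    return ((uint128_t)hi << 64) | lo;
}

/* Format as 32 hex digits by printing the two halves separately.
 * out must have room for 32 digits plus the terminating NUL. */
static void u128_to_hex(uint128_t v, char out[33])
{
    snprintf(out, 33, "%016llx%016llx",
             (unsigned long long)(v >> 64),
             (unsigned long long)v);
}
```

Arithmetic on this type wraps modulo 2^128 just as unsigned arithmetic does for the standard widths, which is the overflow behaviour the functions above depend on.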
Source: https://stackoverflow.com/questions/39215891/generating-very-large-non-repeating-integer-sequence-without-pre-shuffling