Create a uniform random number based on a hash

Submitted by 不羁的心 on 2019-12-03 08:58:25

You said "I should get the same random number when I query using the same key and also, I should get a very different number if I query using a slightly different key". If I understand your question correctly, you do not want a random number, but rather something like a cryptographic hash code.

You should look at passing whatever data you have through a hash function such as SHA or MD5. The result looks random with respect to the input, but it is always the same for the same input, and it varies wildly even if the input changes only slightly.
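To make the "varies wildly" part concrete, here is a minimal sketch that hashes two keys differing in a single character and counts how many of the 256 output bits differ. The class name, the helper names and the key strings are made up for illustration; SHA-256 over the UTF-8 bytes of the key is just one reasonable choice.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashAvalancheDemo {

    // Returns the SHA-256 digest of the UTF-8 bytes of the key.
    static byte[] sha256(String key) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Counts the bit positions in which two equal-length digests differ.
    static int differingBits(byte[] a, byte[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical keys that differ only in the last digit.
        byte[] first = sha256("key-1339343005585");
        byte[] second = sha256("key-1339343005586");
        System.out.println("differing bits out of 256: " + differingBits(first, second));
    }
}
```

For a good hash you would expect roughly half of the bits to differ, even though the inputs are almost identical.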

EDIT: To consistently obtain double values try something like this (pseudo-code):

SHAHashValue v = ComputeSHA( yourObject);
Random r = new Random(v);
the_random_value = r.getNext();

The idea here is to use the SHA hash value as the seed to initialize your random generator. This is pretty much what you have, but I don't know what your HashBuilder produces in terms of different values. So using SHA hashes instead might improve the situation.
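A runnable version of that pseudo-code might look like the sketch below. It assumes the key can be turned into a string, uses SHA-256, and takes the first 8 bytes of the digest as the seed; the class and method names are hypothetical. Note that java.util.Random only uses 48 bits of the seed internally.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;

final class HashSeededRandom {

    // Derives a 64-bit seed from the first 8 bytes of the SHA-256 digest of the key.
    static long seedFor(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Same key -> same seed -> same double in [0, 1).
    static double randomFor(String key) {
        return new Random(seedFor(key)).nextDouble();
    }

    public static void main(String[] args) {
        // Hypothetical keys built as time + ":" + str.
        System.out.println(randomFor("1339343005585:abc"));
        System.out.println(randomFor("1339343005586:abc"));
    }
}
```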

You should also consider that "very different" values for doubles between 0 and 1 might not be immediately apparent.

I would just use the key's hash itself as the "random" number. Assuming a sensible hash implementation, it will have all the properties you've mentioned.
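If you need a double between 0 and 1 rather than a raw hash, one way to sketch this is to take 53 bits of the hash and scale them, skipping the random generator entirely. This assumes a well-mixed hash (SHA-256 here, not a plain hashCode()); the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashToUnitInterval {

    // Maps a key to a double in [0, 1) using the top 53 bits of its SHA-256 digest.
    static double unitInterval(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long bits = ByteBuffer.wrap(digest).getLong();
            // A double has a 53-bit significand, so keep 53 bits and scale by 2^-53.
            return (bits >>> 11) * 0x1.0p-53;
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical keys built as time + ":" + str.
        System.out.println(unitInterval("1339343005585:abc"));
        System.out.println(unitInterval("1339343005586:abc"));
    }
}
```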

That is a somewhat surprising result. I would have thought that a small difference in the seed should lead to a large difference in the stream of random numbers. On reflection, I don't know why I thought that.

Still, it's easily fixed!

The simplest thing, perhaps, is simply to let the random number generator warm up a bit before using it. The bitstreams produced by your different seeds start off similar, but diverge quite quickly, so simply throwing away the early parts of the bitstreams should do the job. Immediately after the line where you create the Random, add this:

rGen.nextLong();

Or, for more divergence:

for (int j = 0; j < 10; ++j) rGen.nextLong();

A quick test shows that this gets a much wider variety of numbers.
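A self-contained way to try this yourself is sketched below. It seeds java.util.Random directly with consecutive long values (the seeds are just example timestamps, and the class and method names are made up) and prints the first double with and without the warm-up.

```java
import java.util.Random;

final class WarmedUpRandom {

    // Creates a Random from the seed and discards its first `discard` long outputs.
    static Random warmedUp(long seed, int discard) {
        Random rGen = new Random(seed);
        for (int j = 0; j < discard; ++j) {
            rGen.nextLong();
        }
        return rGen;
    }

    public static void main(String[] args) {
        // Consecutive seeds: compare the first double with and without warm-up.
        for (long seed = 1339343005585L; seed < 1339343005590L; seed++) {
            double raw = new Random(seed).nextDouble();
            double warmed = warmedUp(seed, 10).nextDouble();
            System.out.println(seed + " raw=" + raw + " warmed=" + warmed);
        }
    }
}
```

The result is still fully determined by the seed, so the same key keeps giving the same number.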

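Another option would be to use a java.security.SecureRandom as the random number generator. This does a better job of generating different outputs from similar inputs. You seed it with a byte array; you could produce one by saying something like (str + time).getBytes(). Be aware, though, that depending on the platform's default provider a seeded SecureRandom may still mix in system entropy, so check that the same seed really reproduces the same output before relying on it.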

A further option would be to take your seed, then hash it using a cryptographic hash such as SHA-256, then use some portion of that as the seed. The hashing would take very similar inputs and produce very different outputs, which would then give you suitably different random bitstreams.

What I understand is:

  • Your object has two instance variables - A long time and a string str which need to be taken into account to calculate the random number
  • You want the random number to be very sensitive to the time part.
  • The same time+str combination should produce the same random number.
  • It is OK if two different time+str combinations produce the same random number.

From the code you posted, it seems the HashCodeBuilder() is not as sensitive as you want it to be to the time.

Apart from what others have suggested, one idea could be to change the time itself in a consistent way.

You could take the last digit of time (the long part of the key) and move it to somewhere in the middle of the number. For example, your hashCode() could be:

@Override
public int hashCode() {
    return (new org.apache.commons.lang.builder.HashCodeBuilder()
            .append(time+((time%10)*100000000)).append(str).toHashCode());
}

(The code is not exactly moving the last digit to the middle but is doing something similar in the context of the question)

But this would be somewhat slow, so you could rewrite it using bit operators:

@Override
public int hashCode() {
    return (new org.apache.commons.lang.builder.HashCodeBuilder()
            .append(time+((time & 63L) << 57)).append(str).toHashCode());
}

This extracts the last 6 bits of time (time & 63L) and puts those bits way in the front (the 57 is somewhat arbitrary; I just want to move those bits to more significant positions). This does not match the "move digit to somewhere in the middle" analogy exactly, but is conceptually similar.

You will get more variance if you extract only the last 5 bits (time & 31L). You could try different values. For the code posted in the question, the time & 63L version returns the following output:

1339343005559:-1084202043:0.339762681480
1339343005585:1801482883:0.323979029483
1339343005586:559968862:0.786162684846
1339343005587:-681545159:0.241820545267
1339343005588:-580881900:0.692788956755
1339343005590:1231057354:0.624686671170
1339343005591:-10456667:0.530394885899
1339343005592:1700819920:0.894868466104
1339343005593:459305899:0.149584882259
1339343005595:-2023722143:0.289584988289

which, as expected, shows much more variance for small changes in the long part of the key.
