Why multiply by a prime before xoring in many GetHashCode Implementations?

前端 未结 4 2122
既然无缘
既然无缘 2021-01-06 10:33

I understand that multiplication by a large number before xoring should help with badly distributed operands but why should the multiplier be a prime?

相关标签:
4条回答
  • 2021-01-06 11:02

    There's a good article on the Computing Life blog that discusses this topic in detail. It was originally posted as a response to the Java hashCode() question I linked to in the question. According to the article:

    Primes are unique numbers. They are unique in that, the product of a prime with any other number has the best chance of being unique (not as unique as the prime itself of-course) due to the fact that a prime is used to compose it. This property is used in hashing functions.

    Given a string “Samuel”, you can generate a unique hash by multiply each of the constituent digits or letters with a prime number and adding them up. This is why primes are used.

    However using primes is an old technique. The key here to understand that as long as you can generate a sufficiently unique key you can move to other hashing techniques too. Go here for more on this topic about hashes without primes.

    0 讨论(0)
  • 2021-01-06 11:06

    I'm not sure exactly which algorithm you're talking about, but typically the constants in such algorithms need to be relatively prime. Otherwise, you get cycles and not all the possible values show up in the result.

    The number probably doesn't need to be prime in your case, only relatively prime to some other numbers, but making it prime guarantees that. It also covers the cases where the other magic numbers change.

    For example, if you are talking about taking the last bits of some number, then the multiplier needs to not be a multiple of 2. So, 9 would work even though it's not prime.

    0 讨论(0)
  • 2021-01-06 11:19

    Multiplying by a non-prime has a cyclic repeating pattern much smaller than the number. If you use a prime then the cyclic repeating pattern is guaranteeed to be at least as large as the prime number.

    0 讨论(0)
  • 2021-01-06 11:23

    Consider the simplest multiplication: x2.

    It is equivalent to a left-bitshift. In other words, it really didn't "randomize" the data, it just shifted it over.

    Same with x4, or any power of two. The original data is intact, just shifted.

    Now, multiplication by other numbers (non-powers of two) are not as obvious, but still have the same problem, more or less. The original data is intact, or trivially transformed. (eg. x5 is the same as left-bitshift two places, then add on the original data).

    The point of GetHashCode is to essentially distribute the data as randomly as possible. Multiplying by a prime number guarantees that the answer won't be a simpler transform like bit-shifting or adding a number to itself.

    0 讨论(0)
提交回复
热议问题