Why does Java's hashCode() in String use 31 as a multiplier?

前端未结


Per the Java documentation, the hash code for a String object is computed as:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
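As a minimal sketch of the formula above (not the JDK's actual source), the polynomial can be evaluated with Horner's rule in plain `int` arithmetic. The class and method names here are made up for illustration; the only library call is `String.hashCode()`, used for comparison.

```java
class StringHashSketch {
    // Evaluates s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] using Horner's rule.
    // Overflow simply wraps around, as int arithmetic does in Java.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String[] samples = {"", "a", "ab", "hello", "Why 31?"};
        for (String s : samples) {
            System.out.println("\"" + s + "\": sketch=" + polynomialHash(s)
                    + ", hashCode()=" + s.hashCode());
        }
    }
}
```

Because the documented formula is what `hashCode()` is specified to return, the two columns should agree for every sample.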

13 Answers

遇见更好的自我

2020-11-22 02:26

By multiplying, bits are shifted to the left. This uses more of the available space of hash codes, reducing collisions.

By not using a power of two, the lower-order, rightmost bits are populated as well, to be mixed with the next piece of data going into the hash.
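To illustrate the flip side of this point, here is a small sketch (the `hash` helper and class name are hypothetical, not JDK code) that runs the same loop with a multiplier of 32. Since 32 is 2^5, each step is a pure 5-bit shift, so in an 8-character string the first character is shifted by 35 bits and falls out of the 32-bit `int` entirely.

```java
class PowerOfTwoMultiplierSketch {
    // Same polynomial hash as String.hashCode(), but with a configurable multiplier.
    static int hash(String s, int multiplier) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = multiplier * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // Two 8-character strings that differ only in their first character.
        String a = "Xbcdefgh";
        String b = "Ybcdefgh";

        // With 32, the first character no longer influences the result.
        System.out.println("multiplier 32 equal? " + (hash(a, 32) == hash(b, 32)));

        // With 31 (odd), every character still contributes to the low-order bits.
        System.out.println("multiplier 31 equal? " + (hash(a, 31) == hash(b, 31)));
    }
}
```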

The expression n * 31 is equivalent to (n << 5) - n.

既然无缘

2020-11-22 02:27
As of JDK 12, 31 is still used: https://docs.oracle.com/en/java/javase/12/docs/api/java.base/java/lang/String.html#hashCode()

A string hash should be
- well distributed, so that different strings rarely produce the same value (note that ^ in the documented formula denotes exponentiation, not XOR)
- cheap to compute
31 is an odd prime, and since 31 = 2^5 - 1 it is the largest value that fits in 5 bits.

Multiplying by 31 is a left shift by 5 followed by subtracting the original value, so it needs very little work.
陌清茗

2020-11-22 02:29

I'm not sure, but I would guess they tested some sample of prime numbers and found that 31 gave the best distribution over some sample of possible Strings.

感动是毒

2020-11-22 02:29
This is because 31 has a nice property: multiplication by it can be replaced by a bitwise shift and a subtraction, which can be faster than a standard multiplication:
```
31 * i == (i << 5) - i
```
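A quick way to convince yourself that the identity holds for every `int`, including values where the result overflows, is a sketch like the following. The method name `timesThirtyOneByShift` is made up for illustration; both sides wrap around modulo 2^32 in the same way, so they stay equal.

```java
class ShiftIdentitySketch {
    // Computes 31 * i as a shift by 5 (i * 32) minus i.
    static int timesThirtyOneByShift(int i) {
        return (i << 5) - i;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 97, 1 << 27, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            boolean same = 31 * i == timesThirtyOneByShift(i);
            System.out.println(i + ": identity holds? " + same);
        }
    }
}
```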
灰色年华

2020-11-22 02:32

Goodrich and Tamassia computed from over 50,000 English words (formed as the union of the word lists provided in two variants of Unix) that using the constants 31, 33, 37, 39, and 41 will produce fewer than 7 collisions in each case. This may be the reason that so many Java implementations choose such constants.

See section 9.2 Hash Tables (page 522) of Data Structures and Algorithms in Java.
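For anyone who wants to repeat this kind of measurement on their own word list, a rough sketch could look like the one below. It is not the book's code, and the four words in `main` are placeholders chosen for illustration, not the 50,000-word data set; the helper names `hash` and `countCollisions` are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CollisionCountSketch {
    // Polynomial hash in int arithmetic with a configurable multiplier.
    static int hash(String s, int multiplier) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = multiplier * h + s.charAt(i);
        }
        return h;
    }

    // Counts distinct words whose hash was already produced by another distinct word.
    static int countCollisions(List<String> words, int multiplier) {
        Set<String> distinctWords = new HashSet<>(words);
        Set<Integer> seenHashes = new HashSet<>();
        int collisions = 0;
        for (String word : distinctWords) {
            if (!seenHashes.add(hash(word, multiplier))) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Placeholder list; substitute a real dictionary to get meaningful numbers.
        List<String> words = Arrays.asList("Aa", "BB", "hash", "code");
        for (int multiplier : new int[] {31, 33, 37, 39, 41}) {
            System.out.println("multiplier " + multiplier + ": "
                    + countCollisions(words, multiplier) + " collision(s)");
        }
    }
}
```

The placeholder list deliberately contains "Aa" and "BB", a well-known pair that collides under multiplier 31, so a tiny sample says nothing about which constant is better overall.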

梦如初夏

2020-11-22 02:32

Neil Coffey explains why 31 is used under "Ironing out the bias".

Basically using 31 gives you a more even set-bit probability distribution for the hash function.
