Java HashMap array size

你。 提交于 2020-04-11 05:45:13

问题


I am reading the implementation details of Java 8 HashMap, can anyone let me know why Java HashMap initial array size is 16 specifically? What is so special about 16? And why is it the power of two always? Thanks


回答1:


The reason why powers of 2 appear everywhere is because when expressing numbers in binary (as they are in circuits), certain math operations on powers of 2 are simpler and faster to perform (just think about how easy math with powers of 10 are with the decimal system we use). For example, multication is not a very efficient process in computers - circuits use a method similar to the one you use when multiplying two numbers each with multiple digits. Multiplying or dividing by a power of 2 requires the computer to just move bits to the left for multiplying or the right for dividing.

And as for why 16 for HashMap? 10 is a commonly used default for dynamically growing structures (arbitrarily chosen), and 16 is not far off - but is a power of 2.

You can do modulus very efficiently for a power of 2. n % d = n & (d-1) when d is a power of 2, and modulus is used to determine which index an item maps to in the internal array - which means it occurs very often in a Java HashMap. Modulus requires division, which is also much less efficient than using the bitwise and operator. You can convince yourself of this by reading a book on Digital Logic.

The reason why bitwise and works this way for powers of two is because every power of 2 is expressed as a single bit set to 1. Let's say that bit is t. When you subtract 1 from a power of 2, you set every bit below t to 1, and every bit above t (as well as t) to 0. Bitwise and therefore saves the values of all bits below position t from the number n (as expressed above), and sets the rest to 0.

But how does that help us? Remember that when dividing by a power of 10, you can count the number of zeroes following the 1, and take that number of digits starting from the least significant of the dividend in order to find the remainder. Example: 637989 % 1000 = 989. A similar property applies to binary numbers with only one bit set to 1, and the rest set to 0. Example: 100101 % 001000 = 000101




回答2:


There's one more thing about choosing the hash & (n - 1) versus modulo and that is negative hashes. hashcode is of type int, which of course can be negative. modulo on a negative number (in Java) is negative also, while & is not.




回答3:


Another reason is that you want all of the slots in the array to be equally likely to be used. Since hash() is evenly distributed over 32 bits, if the array size didn't divide into the hash space, then there would be a remainder causing lower indexes to have a slightly higher chance of being used. Ideally, not just the hash, but (hash() % array_size) is random and evenly distributed.

But this only really matters for data with a small hash range (like a byte or character).



来源:https://stackoverflow.com/questions/42941286/java-hashmap-array-size

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!