Question
I know the difference between Hashtable and HashMap. However, both these classes seemingly are using a hash function to get the job done. Is there a difference between the hash function used in Hashtable, and the hash function used in HashMap?
In particular, is there a difference between the hashing algorithm they use? What is the formula used to hash in these two classes?
In other words, is the way for calculating index (hash value) different?
Answer 1:
In particular, is there a difference between the hashing algorithm they use? What is the formula used to hash in these two classes?
The primary hash function used when you use an object as a hash table key is the object's hashCode() method. It is up to the key class to implement a decent hash function.
The Hashtable and HashMap classes take the key's hashcode value and convert it to an index in the primary hashtable array-of-chains. However, there are differences in how this happens between Hashtable and HashMap.
For Hashtable (Java 8) the code is this:
    hash = key.hashCode();
    index = (hash & 0x7FFFFFFF) % tab.length;
For HashMap (Java 8) the code is (effectively) this:
    // (I have restructured the code for ease of comparison.)
    int h;
    hash = (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    index = (tab.length - 1) & hash;
As you can see, HashMap is scrambling the hashcode value returned by the key's hashCode() method. This is explained in the source code as follows:
[This method] computes key.hashCode() and spreads (XORs) higher bits of hash to lower. Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide. (Among known examples are sets of Float keys holding consecutive whole numbers in small tables.) So we apply a transform that spreads the impact of higher bits downward. There is a tradeoff between speed, utility, and quality of bit-spreading. Because many common sets of hashes are already reasonably distributed (so don't benefit from spreading), and because we use trees to handle large sets of collisions in bins, we just XOR some shifted bits in the cheapest possible way to reduce systematic lossage, as well as to incorporate impact of the highest bits that would otherwise never be used in index calculations because of table bounds.
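To make the two formulas concrete, here is a minimal sketch that reimplements them as standalone methods. The class name IndexDemo and the method names are made up for illustration and are not part of the JDK; the table lengths (11 and 16) are simply the default initial capacities of Hashtable and HashMap. The maskOnlyIndex method is a hypothetical variant that skips the XOR step, included only to show what the spreading buys.

```java
class IndexDemo {
    // Hashtable (Java 8) style: clear the sign bit, then take the remainder.
    static int hashtableIndex(Object key, int tableLength) {
        int hash = key.hashCode();
        return (hash & 0x7FFFFFFF) % tableLength;
    }

    // HashMap (Java 8) style: XOR the upper 16 bits into the lower 16 bits,
    // then mask. tableLength is assumed to be a power of two.
    static int hashMapIndex(Object key, int tableLength) {
        int h;
        int hash = (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
        return (tableLength - 1) & hash;
    }

    // Hypothetical variant for comparison only: masking without spreading.
    static int maskOnlyIndex(Object key, int tableLength) {
        return (tableLength - 1) & key.hashCode();
    }

    public static void main(String[] args) {
        // Integer.hashCode() is the int value itself, so these keys
        // differ only in bits above the 16-entry mask.
        int[] keys = {0x10000, 0x20000, 0x30000};
        for (int k : keys) {
            System.out.printf("key=0x%X hashtable=%d hashMap=%d maskOnly=%d%n",
                    k, hashtableIndex(k, 11), hashMapIndex(k, 16), maskOnlyIndex(k, 16));
        }
    }
}
```

For these three keys, masking alone puts everything in bucket 0, while the HashMap-style spreading yields buckets 1, 2 and 3, and the Hashtable-style remainder yields 9, 7 and 5.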
Notes:
- The & versus % difference is because in Hashtable the hash array size need not be a power of 2 (the default is 11, and it grows to 2n + 1), but in HashMap (Java 8) the size is always a power of 2.
- In Java 8 HashMap, the implementation will turn a long hash chain into a balanced tree; this works best if the key class implements Comparable.
- HashMap handles null keys, but Hashtable doesn't.
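The null-key difference is easy to check directly. The following is a small sketch; the class NullKeyDemo and its helper method are hypothetical names chosen for this example, and only the standard java.util classes are used.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Returns true if the map accepts a null key, false if put() rejects it
    // with a NullPointerException.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap accepts null key:   "
                + acceptsNullKey(new HashMap<>()));
        System.out.println("Hashtable accepts null key: "
                + acceptsNullKey(new Hashtable<>()));
    }
}
```

This matches the formulas shown earlier: HashMap maps a null key to hash 0, whereas Hashtable calls key.hashCode() unconditionally.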
However, all of this extra complexity in HashMap only comes into play if your key class has a poorly designed / implemented hashCode() method ... or if someone is deliberately trying to engineer hash collisions.
In other words, if your key class is well designed, the differences should not matter.
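As an illustration of the "poorly designed hashCode()" case, here is a sketch with a deliberately bad key class. BadKey and CollidingKeyDemo are hypothetical names invented for this example. Every key returns the same hash code, so all entries land in one bucket; both map classes still return correct results, and only lookup cost is affected.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class CollidingKeyDemo {
    // Deliberately poor key: every instance has the same hash code.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; }

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        // Comparable lets Java 8 HashMap order the keys inside a treeified bin.
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Inserts n colliding keys, then checks that every one can be read back.
    static boolean roundTrips(Map<BadKey, Integer> map, int n) {
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        for (int i = 0; i < n; i++) {
            Integer v = map.get(new BadKey(i));
            if (v == null || v != i) {
                return false;
            }
        }
        return map.size() == n;
    }

    public static void main(String[] args) {
        System.out.println("HashMap ok:   " + roundTrips(new HashMap<>(), 1000));
        System.out.println("Hashtable ok: " + roundTrips(new Hashtable<>(), 1000));
    }
}
```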
Answer 2:
java.util.Hashtable<K,V> is like java.util.Vector<T>. It's a class added to the SDK very early in development which has been superseded by HashMap<K,V> (as ArrayList<T> superseded Vector<T>).
So you simply shouldn't use it unless you require implicit synchronization of all the operations, which comes by default with Hashtable, but you can still use Collections.synchronizedMap for that purpose or a ConcurrentHashMap<K,V>.
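A minimal sketch of the two alternatives mentioned above follows. The class SynchronizedAlternatives, the countConcurrently helper, and the "hits" key are hypothetical names used only for this example; the thread and iteration counts are arbitrary.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SynchronizedAlternatives {
    // Several threads increment the same counter through Map.merge, which
    // both the synchronized wrapper and ConcurrentHashMap perform atomically.
    static int countConcurrently(Map<String, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return map.get("hits");
    }

    public static void main(String[] args) {
        // Option 1: wrap a plain HashMap so every call takes a single lock.
        Map<String, Integer> wrapped = Collections.synchronizedMap(new HashMap<>());
        // Option 2: a map designed for concurrent access.
        Map<String, Integer> concurrent = new ConcurrentHashMap<>();

        System.out.println(countConcurrently(wrapped, 4, 1000));
        System.out.println(countConcurrently(concurrent, 4, 1000));
    }
}
```

Note that a compound sequence such as get followed by put is not atomic on either map unless it goes through a single method like merge, or, for the synchronized wrapper, is placed in a block synchronized on the map.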
As stated in Javadoc:
As of the Java 2 platform v1.2, this class was retrofitted to implement the Map interface, making it a member of the Java Collections Framework. Unlike the new collection implementations, Hashtable is synchronized. If a thread-safe implementation is not needed, it is recommended to use HashMap in place of Hashtable. If a thread-safe highly-concurrent implementation is desired, then it is recommended to use ConcurrentHashMap in place of Hashtable.
The hashing of the two classes should be the same, as they'll both use int Object::hashCode for their purpose.
Source: https://stackoverflow.com/questions/38287366/hashing-function-in-hashtable-vs-hashmap