As per my understanding I think:
As it is said, a picture is worth 1000 words. I say: some code is better than 1000 words. Here's the source code of HashMap. Get method:
/**
* Implements Map.get and related methods
*
* @param hash hash for key
* @param key the key
* @return the node, or null if none
*/
final Node<K,V> getNode(int hash, Object key) {
Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
if ((tab = table) != null && (n = tab.length) > 0 &&
(first = tab[(n - 1) & hash]) != null) {
if (first.hash == hash && // always check first node
((k = first.key) == key || (key != null && key.equals(k))))
return first;
if ((e = first.next) != null) {
if (first instanceof TreeNode)
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
do {
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
} while ((e = e.next) != null);
}
}
return null;
}
So it becomes clear that hash is used to find the "bucket" and the first element is always checked in that bucket. If not, then equals
of the key is used to find the actual element in the linked list.
Let's see the put()
method:
/**
* Implements Map.put and related methods
*
* @param hash hash for key
* @param key the key
* @param value the value to put
* @param onlyIfAbsent if true, don't change existing value
* @param evict if false, the table is in creation mode.
* @return previous value, or null if none
*/
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}
It's slightly more complicated, but it becomes clear that the new element is put in the tab at the position calculated based on hash:
i = (n - 1) & hash
here i
is the index where the new element will be put (or it is the "bucket"). n
is the size of the tab
array (array of "buckets").
First, it is tried to be put as the first element of in that "bucket". If there is already an element, then append a new node to the list.
import java.util.HashMap;
public class Students {
String name;
int age;
Students(String name, int age ){
this.name = name;
this.age=age;
}
@Override
public int hashCode() {
System.out.println("__hash__");
final int prime = 31;
int result = 1;
result = prime * result + age;
result = prime * result + ((name == null) ? 0 : name.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
System.out.println("__eq__");
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Students other = (Students) obj;
if (age != other.age)
return false;
if (name == null) {
if (other.name != null)
return false;
} else if (!name.equals(other.name))
return false;
return true;
}
public static void main(String[] args) {
Students S1 = new Students("taj",22);
Students S2 = new Students("taj",21);
System.out.println(S1.hashCode());
System.out.println(S2.hashCode());
HashMap<Students,String > HM = new HashMap<Students,String > ();
HM.put(S1, "tajinder");
HM.put(S2, "tajinder");
System.out.println(HM.size());
}
}
Output:
__ hash __
116232
__ hash __
116201
__ hash __
__ hash __
2
So here we see that if both the objects S1 and S2 have different content, then we are pretty sure that our overridden Hashcode method will generate different Hashcode(116232,11601) for both objects. NOW since there are different hash codes, so it won't even bother to call EQUALS method. Because a different Hashcode GUARANTEES DIFFERENT content in an object.
public static void main(String[] args) {
Students S1 = new Students("taj",21);
Students S2 = new Students("taj",21);
System.out.println(S1.hashCode());
System.out.println(S2.hashCode());
HashMap<Students,String > HM = new HashMap<Students,String > ();
HM.put(S1, "tajinder");
HM.put(S2, "tajinder");
System.out.println(HM.size());
}
}
Now lets change out main method a little bit. Output after this change is
__ hash __
116201
__ hash __
116201
__ hash __
__ hash __
__ eq __
1
We can clearly see that equal method is called. Here is print statement __eq__, since we have same hashcode, then content of objects MAY or MAY not be similar. So program internally calls Equal method to verify this.
Conclusion
If hashcode is different , equal method will not get called.
if hashcode is same, equal method will get called.
Thanks , hope it helps.
two objects are equal, implies that they have same hashcode, but not vice versa.
2 equal objects ------> they have same hashcode
2 objects have same hashcode ----xxxxx--> they are NOT equal
Java 8 update in HashMap-
you do this operation in your code -
myHashmap.put("old","old-value");
myHashMap.put("very-old","very-old-value");
so, suppose your hashcode returned for both keys "old"
and "very-old"
is same. Then what will happen.
myHashMap
is a HashMap, and suppose that initially you didn't specify its capacity. So default capacity as per java is 16. So now as soon as you initialised hashmap using the new keyword, it created 16 buckets. now when you executed first statement-
myHashmap.put("old","old-value");
then hashcode for "old"
is calculated, and because the hashcode could be very large integer too, so, java internally did this - (hash is hashcode here and >>> is right shift)
hash XOR hash >>> 16
so to give as a bigger picture, it will return some index, which would be between 0 to 15. Now your key value pair "old"
and "old-value"
would be converted to Entry object's key and value instance variable. and then this entry object will be stored in the bucket, or you can say that at a particular index, this entry object would be stored.
FYI- Entry is a class in Map interface- Map.Entry, with these signature/definition
class Entry{
final Key k;
value v;
final int hash;
Entry next;
}
now when you execute next statement -
myHashmap.put("very-old","very-old-value");
and "very-old"
gives same hashcode as "old"
, so this new key value pair is again sent to the same index or the same bucket. But since this bucket is not empty, then the next
variable of the Entry object is used to store this new key value pair.
and this will be stored as linked list for every object which have the same hashcode, but a TRIEFY_THRESHOLD is specified with value 6. so after this reaches, linked list is converted to the balanced tree(red-black tree) with first element as the root.
The hashcode determines which bucket for the hashmap to check. If there is more than one object in the bucket then a linear search is done to find which item in the bucket equals the desired item (using the equals()
) method.
In other words, if you have a perfect hashcode then hashmap access is constant, you will never have to iterate through a bucket (technically you would also have to have MAX_INT buckets, the Java implementation may share a few hash codes in the same bucket to cut down on space requirements). If you have the worst hashcode (always returns the same number) then your hashmap access becomes linear since you have to search through every item in the map (they're all in the same bucket) to get what you want.
Most of the time a well written hashcode isn't perfect but is unique enough to give you more or less constant access.
Hash map works on the principle of hashing
HashMap get(Key k) method calls hashCode method on the key object and applies returned hashValue to its own static hash function to find a bucket location(backing array) where keys and values are stored in form of a nested class called Entry (Map.Entry) . So you have concluded that from the previous line that Both key and value is stored in the bucket as a form of Entry object . So thinking that Only value is stored in the bucket is not correct and will not give a good impression on the interviewer .
If key is null , then Null keys always map to hash 0, thus index 0.
If key is not null then , it will call hashfunction on the key object , see line 4 in above method i.e. key.hashCode() ,so after key.hashCode() returns hashValue , line 4 looks like
int hash = hash(hashValue)
and now ,it applies returned hashValue into its own hashing function .
We might wonder why we are calculating the hashvalue again using hash(hashValue). Answer is It defends against poor quality hash functions.
Now final hashvalue is used to find the bucket location at which the Entry object is stored . Entry object stores in the bucket like this (hash,key,value,bucketindex)
You can find excellent information at http://javarevisited.blogspot.com/2011/02/how-hashmap-works-in-java.html
To Summarize:
HashMap works on the principle of hashing
put(key, value): HashMap stores both key and value object as Map.Entry. Hashmap applies hashcode(key) to get the bucket. if there is collision ,HashMap uses LinkedList to store object.
get(key): HashMap uses Key Object's hashcode to find out bucket location and then call keys.equals() method to identify correct node in LinkedList and return associated value object for that key in Java HashMap.