Please consider the below piece of code:
HashSet<String> hs = new HashSet<>();
hs.add("hi"); // (1)
hs.add("hi"); // (2)
hs.size();
You need to look at the put method of HashMap first, because HashSet is backed by a HashMap.
To say it differently: when you insert a key-value pair into a HashMap where the key already exists (in the sense that hashCode() gives the same value and equals() is true, even though the two objects can still differ in other ways), the key isn't replaced but the value is overwritten. The key is only used to compute the hashCode() and find the matching entry in the table. Since HashSet stores its elements as the keys of a HashMap and sets arbitrary values that don't really matter to the user, the elements of the set aren't replaced either.
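To make the "value is overwritten, key is kept" point concrete, here is a minimal sketch. TaggedKey is a hypothetical class invented for this example: its tag field is deliberately left out of equals() and hashCode(), so two keys can be equal and still be told apart. Keeping the original key object is what I would expect from OpenJDK's HashMap; treat it as observed implementation behavior rather than something the Map contract spells out.

```java
import java.util.HashMap;
import java.util.Map;

class KeyRetentionDemo {
    // Hypothetical key type: 'tag' is ignored by equals/hashCode, so two keys
    // can be equal and still be distinguishable objects.
    static final class TaggedKey {
        final String name;
        final String tag;

        TaggedKey(String name, String tag) {
            this.name = name;
            this.tag = tag;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof TaggedKey && ((TaggedKey) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Puts two equal-but-distinct keys, then reports "<tag of stored key>=<stored value>".
    static String putTwice() {
        Map<TaggedKey, String> map = new HashMap<>();
        map.put(new TaggedKey("hi", "first"), "old value");
        map.put(new TaggedKey("hi", "second"), "new value"); // equal key, different object
        Map.Entry<TaggedKey, String> entry = map.entrySet().iterator().next();
        return entry.getKey().tag + "=" + entry.getValue();
    }

    public static void main(String[] args) {
        // Expected on OpenJDK: first=new value (old key kept, value overwritten)
        System.out.println(putTwice());
    }
}
```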
The first thing you need to know is that HashSet acts like a Set, which means you add your object directly to the HashSet and it cannot contain duplicates.
However, HashMap is a Map type. That means every time you add an entry, you add a key-value pair.
In HashMap you can have duplicate values, but not duplicate keys. If you put an entry whose key already exists, the new value replaces the old one, so the most recent value is the one left in the HashMap.
Understanding the link between HashMap and HashSet:
Remember, HashMap cannot have duplicate keys. Behind the scenes, HashSet uses a HashMap.
When you attempt to add any object into a HashSet, this entry is actually stored as a key in the HashMap, the same HashMap that is used behind the scenes of the HashSet. Since this underlying HashMap needs a key-value pair, a dummy value is supplied for us.
Now when you try to insert another duplicate object into the same HashSet, it will again attempt to insert it as a key in the underlying HashMap. However, HashMap does not support duplicate keys. Hence, the HashSet will still contain only one element equal to that object. As a side note, the value stored for every HashSet entry is the same dummy object, so for a duplicate the existing key is not replaced at all; only the dummy value is written again with the same dummy. Removing the key and adding the same key back would not make any sense.
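The mechanism described above can be sketched as a tiny set built on a HashMap. MapBackedSet is a hypothetical, simplified class written for illustration, not the actual java.util.HashSet source; it only assumes the documented behavior that HashMap.put returns the previous value, or null when the key had no mapping.

```java
import java.util.HashMap;

// Hypothetical, simplified set for illustration; not the real HashSet source.
class MapBackedSet<E> {
    // One shared dummy object used as the value for every key.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // Returns true only if the element was not already present.
    boolean add(E e) {
        // put returns the previous value, or null if the key had no mapping.
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        System.out.println(set.add("hi")); // true: "hi" stored as a new key
        System.out.println(set.add("hi")); // false: key already there
        System.out.println(set.size());    // 1
    }
}
```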
Summary:
HashMap allows duplicate values, but not duplicate keys.
HashSet cannot contain duplicates.
To check whether an object was actually added, look at the boolean value returned by .add(). If it returns true, the object was inserted; if it returns false, an equal element was already present and the set is unchanged.
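A quick way to try this out, as a minimal sketch (the class and method names are made up for the example):

```java
import java.util.HashSet;
import java.util.Set;

class AddResultDemo {
    // Adds "hi" twice and reports both return values plus the final size.
    static String addTwice() {
        Set<String> hs = new HashSet<>();
        boolean first = hs.add("hi");  // element is new
        boolean second = hs.add("hi"); // equal element already present
        return first + " " + second + " " + hs.size();
    }

    public static void main(String[] args) {
        System.out.println(addTwice()); // true false 1
    }
}
```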
The docs are pretty clear on this: HashSet.add doesn't replace:
Adds the specified element to this set if it is not already present. More formally, adds the specified element e to this set if this set contains no element e2 such that (e==null ? e2==null : e.equals(e2)). If this set already contains the element, the call leaves the set unchanged and returns false.
But HashMap.put will replace:
If the map previously contained a mapping for the key, the old value is replaced.
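The put side can be checked the same way. This is only a small sketch with made-up names; it relies on the documented return value of HashMap.put, which is the previous value for the key or null if there was none.

```java
import java.util.HashMap;
import java.util.Map;

class PutReplaceDemo {
    // Puts the same key twice and reports
    // "<first return> <second return> <final value> <size>".
    static String putTwice() {
        Map<String, Integer> map = new HashMap<>();
        Integer firstReturn = map.put("hi", 1);  // no previous mapping
        Integer secondReturn = map.put("hi", 2); // previous value is handed back
        return firstReturn + " " + secondReturn + " " + map.get("hi") + " " + map.size();
    }

    public static void main(String[] args) {
        System.out.println(putTwice()); // null 1 2 1
    }
}
```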
HashMap basically contains Entry objects, each of which holds a key (Object) and a value (Object). Internally a HashSet is a HashMap, and a HashMap does replace values, as some of you have already pointed out. But does it really replace the keys? No, and that is the trick here. HashSet keeps its elements as keys in the underlying HashMap, and the value is just a dummy object. So if you try to reinsert the same element into a HashSet (a key in the underlying map), it just replaces the dummy value and not the key (the HashSet element).
Look at the code below from the HashSet class:
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
Here e is the element of the HashSet but the key in the underlying map, and the key is never replaced. I hope this clears up the confusion.
Correct me if I'm wrong, but what you're getting at is that with strings, comparing "Hi" and "Hi" with == doesn't always come out true (because two strings with the same content are not necessarily the same object).
That is not why you get an answer of 1, though. HashSet and HashMap compare elements with hashCode() and equals(), not with ==, so two distinct String objects with the content "Hi" still count as the same element. In this case the two literals do refer to one shared String object, but the second add doesn't overwrite anything: it finds an equal element already present and leaves the set unchanged.
So the size of 1 follows from the Set contract and does not depend on any JVM optimization.
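As a quick check of whether the result depends on string reuse, here is a sketch that uses new String("hi") to force two distinct objects with the same content (the class and method names are made up for the example):

```java
import java.util.HashSet;
import java.util.Set;

class DistinctStringsDemo {
    // Adds two distinct String objects with equal content; returns the set size.
    static int sizeAfterAddingEqualStrings() {
        String a = new String("hi"); // new String(...) always creates a fresh object
        String b = new String("hi");
        Set<String> hs = new HashSet<>();
        hs.add(a);
        hs.add(b); // b.equals(a) is true, so the set is left unchanged
        return hs.size();
    }

    // Reference comparison of two freshly created, equal strings.
    static boolean sameObject() {
        String a = new String("hi");
        String b = new String("hi");
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameObject());                  // false
        System.out.println(sizeAfterAddingEqualStrings()); // 1
    }
}
```

The set still ends up with one element even though the two strings are different objects, which suggests the size comes from equals() and hashCode() rather than from object identity.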