Question
I have a program for my Java class where I want to use HashSets to compare a directory of text documents. Essentially, my plan is to create a HashSet of strings for each paper, then add two of the papers' HashSets together into one HashSet and find the number of identical 6-word sequences.
My question is, do I have to manually check for, and handle, collisions, or does Java do that for me?
Answer 1:
Java's hash-based maps and sets (HashMap, HashSet) handle hash collisions automatically. This is why it is important to override both the equals and the hashCode methods: a set uses hashCode to pick a bucket and then equals to decide whether an entry is a duplicate or a new, unique element.
It is also important to note that hash collisions have a performance impact, since multiple objects end up in the same bucket and have to be told apart with equals.
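import java.util.Objects;

public class MyObject {
    private String name;

    // getters and setters
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    @Override
    public int hashCode() {
        // Derive the hash from the same field that equals() compares
        return Objects.hashCode(name);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof MyObject)) return false;
        // Null-safe comparison of the names
        return Objects.equals(name, ((MyObject) obj).getName());
    }
}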
Note: Standard Java classes such as String already implement hashCode and equals, so you only have to do this for your own data classes.
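As a small illustration of the points above, the sketch below puts two different strings that happen to share a hash code into one HashSet. The class name CollisionDemo and the helper sizeAfterAdding are made up for this example; "Aa" and "BB" are a well-known colliding pair for String.hashCode().

```java
import java.util.HashSet;
import java.util.Set;

class CollisionDemo {
    // Returns how many entries a fresh set holds after adding both strings.
    static int sizeAfterAdding(String first, String second) {
        Set<String> set = new HashSet<>();
        set.add(first);
        set.add(second);
        return set.size();
    }

    public static void main(String[] args) {
        // Same hash code, different content: a genuine collision.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        // The set still keeps both, because equals() tells them apart.
        System.out.println(sizeAfterAdding("Aa", "BB")); // 2
        // An equal string is rejected as a duplicate.
        System.out.println(sizeAfterAdding("Aa", "Aa")); // 1
    }
}
```

So for a HashSet of Strings there is nothing to handle by hand: a collision only costs an extra equals call inside the set.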
Answer 2:
I think you were not really asking about hash collisions, right? The question is what happens when HashSet a and HashSet b are combined into a single set, e.g. by a.addAll(b).
The answer is that a will contain all elements and no duplicates. In the case of Strings, this means you can count the number of equal Strings in the two sets as (a.size() before the add) + b.size() - (a.size() after the add).
It does not even matter if some of the Strings have the same hash code but are not equal.
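The counting trick described above might look like this in code. This is only a sketch: the class name SharedSequences and the method countShared are invented here, and it works on a copy so that neither input set is modified.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SharedSequences {
    // Counts the strings present in both sets using size arithmetic:
    // size of a + size of b - size of their union.
    static int countShared(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a); // copy, so a stays untouched
        union.addAll(b);                      // duplicates are dropped here
        return a.size() + b.size() - union.size();
    }

    public static void main(String[] args) {
        Set<String> paperA = new HashSet<>(Arrays.asList(
                "the quick brown fox jumps over",
                "quick brown fox jumps over the"));
        Set<String> paperB = new HashSet<>(Arrays.asList(
                "quick brown fox jumps over the",
                "brown fox jumps over the lazy"));
        // Number of 6-word sequences the two papers have in common
        System.out.println(countShared(paperA, paperB));
    }
}
```

If you need the shared strings themselves rather than just the count, an alternative is to copy a and call retainAll(b) on the copy.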
Source: https://stackoverflow.com/questions/12909325/hashset-collisions-in-java