Why do we need to override hashCode and equals?

Question

By default, hashCode and equals work fine. I have used objects as keys in hash tables such as HashMap without overriding these methods, and it was fine. For example:

import java.util.HashMap;
import java.util.Map;

public class Main {
    public static void main(String[] args) {
        Map<Object, String> map = new HashMap<>();
        Object key = new Main();
        map.put(key, "2");
        Object key2 = new Main();
        map.put(key2, "3");
        System.out.println(map.get(key));   // prints 2
        System.out.println(map.get(key2));  // prints 3
    }
}

This code works fine. By default, hashCode returns the memory address of the object, and equals checks whether two references point to the same object. So what is the problem with using the default implementations of these methods?

Answer 1:

In your example, whenever you want to retrieve something from your HashMap, you need to have key and key2 at hand, because their equals() is the same as object identity. This makes the HashMap completely useless, because you cannot retrieve anything from it without holding these two keys. Passing the keys around doesn't make sense, because you could just as well pass the values around; it would be equally awkward.
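A minimal sketch of that limitation, assuming a hypothetical PlainKey class that inherits equals() and hashCode() from Object. The class and method names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type that does not override equals() or hashCode().
class PlainKey {
}

class IdentityKeyDemo {
    // Stores a value under `stored`, then looks it up with `probe`.
    static String lookup(PlainKey stored, PlainKey probe) {
        Map<PlainKey, String> map = new HashMap<>();
        map.put(stored, "2");
        return map.get(probe);
    }

    public static void main(String[] args) {
        PlainKey key = new PlainKey();
        // Same reference: the entry is found.
        System.out.println(lookup(key, key));            // prints 2
        // A fresh instance is never equal to `key` under Object.equals.
        System.out.println(lookup(key, new PlainKey())); // prints null
    }
}
```

The second lookup fails even though both keys are "empty" PlainKey objects, because the inherited equals() only accepts the very same reference.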

Now try to imagine a use case where a HashMap actually makes sense. For example, suppose that you receive String-valued requests from the outside and want to return, say, IP addresses. The keys that come from the outside obviously cannot be the same objects as the keys you used to set up your map. Therefore you need some method that compares requests from the outside to the keys you used during the initialization phase. This is exactly what equals is good for: it defines an equivalence relation on objects that are not identical in the sense of being represented by the same bits in physical memory. hashCode is a coarser version of equals, which is necessary to retrieve values from HashMaps quickly.
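The request scenario above could look roughly like this. It is only a sketch: the host names and addresses are invented, and String is used as the key type because it already overrides equals and hashCode based on its characters.

```java
import java.util.HashMap;
import java.util.Map;

class HostLookupDemo {
    // Hypothetical table built during initialization; all entries are made up.
    static Map<String, String> buildTable() {
        Map<String, String> ipByHost = new HashMap<>();
        ipByHost.put("alpha.example", "192.0.2.1");
        ipByHost.put("beta.example", "192.0.2.2");
        return ipByHost;
    }

    public static void main(String[] args) {
        Map<String, String> ipByHost = buildTable();
        // Simulates a request from outside: a distinct String object
        // that holds the same characters as one of the stored keys.
        String request = new String("alpha.example");
        System.out.println(ipByHost.get(request)); // prints 192.0.2.1
    }
}
```

The lookup succeeds although request is not the object that was used in put, because String.equals compares contents and String.hashCode is computed from the same contents.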

Answer 2:

Note this example from an old pdf I have:

This code

import java.util.HashSet;
import java.util.Set;

public class Name {
    private String first, last;

    public Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    // equals is overridden, but hashCode is not.
    public boolean equals(Object o) {
        if (!(o instanceof Name)) return false;
        Name n = (Name) o;
        return n.first.equals(first) && n.last.equals(last);
    }

    public static void main(String[] args) {
        Set<Name> s = new HashSet<>();
        s.add(new Name("Donald", "Duck"));
        System.out.println(s.contains(new Name("Donald", "Duck")));
    }
}

...will almost always print false because, as the PDF states:

Donald is in the set, but the set can’t find him. The Name class violates the hashCode contract

Because the object's state consists of two strings, the hash code should also be computed from those two fields.

To fix this code we should add a hashCode method:

public int hashCode() {
    return 31 * first.hashCode() + last.hashCode();
}

The PDF's discussion of this question concludes that we should

override hashCode when overriding equals
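For reference, here is one way the complete class might look once both methods are overridden. This sketch swaps the hand-written 31 * ... formula for java.util.Objects.hash and uses Objects.equals for null-safety; those are my choices, not the PDF's. NameDemo and containsEqualName are hypothetical names added for the demonstration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class Name {
    private final String first;
    private final String last;

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Name)) return false;
        Name n = (Name) o;
        return Objects.equals(first, n.first) && Objects.equals(last, n.last);
    }

    @Override
    public int hashCode() {
        // Computed from exactly the fields that equals() compares.
        return Objects.hash(first, last);
    }
}

class NameDemo {
    // Adds one Name to a HashSet and probes with a separate, equal instance.
    static boolean containsEqualName() {
        Set<Name> s = new HashSet<>();
        s.add(new Name("Donald", "Duck"));
        return s.contains(new Name("Donald", "Duck"));
    }

    public static void main(String[] args) {
        System.out.println(containsEqualName()); // prints true
    }
}
```

With equal objects now producing equal hash codes, the set looks in the right bucket and finds Donald.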

Answer 3:

Your example is not very useful, because plain variables would be simpler: the only way to look up a value in the map is to hold the original key. In that case, you may as well hold the value itself and not have a Map in the first place.

If instead you want to be able to create a new key that is considered equivalent to a key used previously, you have to define how equivalence is determined.

Answer 4:

Since most objects are never asked for their identity hash code, the system does not keep, for most objects, any information that would be sufficient to establish a permanent identity. Instead, a JVM that derives identity hash codes from object addresses can use two bits in the object header to distinguish three states:

The identity hashcode for the object has never been queried.
The identity hashcode has been queried, but the object has not been moved by the GC since then.
The identity hashcode has been queried, and the object has been moved since then.

For objects in the first state, asking for the identity hash code will change the object to the second state and process it as a second-state object.

For objects in the second state, including those which had moments before been in the first state, the identity hash code will be formed from the address.

When an object in the second state is moved by the GC, the GC will allocate an extra 32 bits to the object, which will be used to hold a hash-code derived from its original address. The object will then be assigned to the third state.

Subsequent requests for the hash code from a state-3 object will use the value that was stored when it was moved.

At times when the system knows that no objects within a certain address range are in state 2, it may change the formula used to compute hash codes from addresses in that range.

Although at any given time there may only be one object at any given address, it is entirely possible that an object might be asked for its identity hash code and later moved, and that another object might then be placed either at the same address as the first one, or at an address that would hash to the same value (the system might change the formula used to compute hash values to avoid duplication, but would be unable to eliminate it).
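None of this bookkeeping is visible from Java code, but its observable effect can be checked with System.identityHashCode. The sketch below does not reveal which internal scheme a given JVM uses; it only illustrates that the value stays the same for the lifetime of an object, whether or not the collector moves it. IdentityHashDemo and stableAcrossGc are hypothetical names.

```java
class IdentityHashDemo {
    // Returns true if the identity hash code is unchanged after requesting a GC.
    static boolean stableAcrossGc(Object o) {
        int before = System.identityHashCode(o);
        System.gc(); // only a hint; the object may or may not be moved
        int after = System.identityHashCode(o);
        return before == after;
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(stableAcrossGc(o)); // prints true

        // For a class that does not override hashCode, both calls agree.
        System.out.println(o.hashCode() == System.identityHashCode(o)); // prints true

        // String overrides hashCode, so its hashCode depends on content only;
        // its identity hash code is still available via System.identityHashCode.
        String s = new String("abc");
        System.out.println(s.hashCode() == "abc".hashCode()); // prints true
    }
}
```

Because distinct objects may still share an identity hash code, it should be treated as a stable per-object value, not as a unique identifier.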

Source: https://stackoverflow.com/questions/29736296/why-we-need-to-override-hashcode-and-equals

Tags: java, equals, hashcode