Two unequal objects with the same hashCode

Question

The contract between hashCode() and equals() has two parts:

1) If two objects are equal according to equals(), then calling hashCode() on each of them must produce the same hash code.

and the other one is

2) If two objects are unequal according to equals(), calling hashCode() on each of them is not required to produce distinct values.

I tried the first point and understood it. This is my code for it:

import java.util.HashMap;
import java.util.Map;

public class Test {
    public static void main(String[] args) {
        // Two distinct HashMap instances with identical contents
        Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        map.put(1, 11);
        map.put(4, 11);
        System.out.println(map.hashCode());
        Map<Integer, Integer> map1 = new HashMap<Integer, Integer>();
        map1.put(1, 11);
        map1.put(4, 11);
        System.out.println(map1.hashCode()); // same value as map.hashCode()
        if (map.equals(map1)) {
            System.out.println("equal ");
        }
    }
}

The above program prints the same hash code for two different objects that are equal.

Can someone explain, with an example, how two different objects that are unequal according to equals() can have the same hash code?

Answer 1:

2) If two objects are unequal according to equals(), calling hashCode() on each of them is not required to produce distinct values.

Depending on the hashing function, two different objects can have the same hash code. However, two objects that are equal must produce the same result when hashed (unless someone implemented the hashing function with random numbers, in which case it is useless).

For example, if I am hashing integers and my hashing function is simply (n % 10) then the number 17 and the number 27 will produce the same result. This does not mean that those numbers are the same.
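
The modulo example can be tried directly. This is only a minimal sketch: the class name ModHashDemo and the method hash are made up for illustration and are not part of any library.

```java
class ModHashDemo {
    // Toy hash function from the text: keep only the remainder modulo 10.
    static int hash(int n) {
        return n % 10;
    }

    public static void main(String[] args) {
        Integer a = 17;
        Integer b = 27;
        System.out.println(hash(a));     // 7
        System.out.println(hash(b));     // 7
        System.out.println(a.equals(b)); // false
    }
}
```

Both numbers hash to 7, yet equals() still tells them apart.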

Answer 2:

Example with Strings (all the strings below have a hashcode of 0):

import java.util.Arrays;
import java.util.List;

public class ZeroHashStrings {
    public static void main(String[] args) {
        List<String> list = Arrays.asList("pollinating sandboxes",
                                          "amusement & hemophilias",
                                          "schoolworks = perversive",
                                          "electrolysissweeteners.net",
                                          "constitutionalunstableness.net",
                                          "grinnerslaphappier.org",
                                          "BLEACHINGFEMININELY.NET",
                                          "WWW.BUMRACEGOERS.ORG",
                                          "WWW.RACCOONPRUDENTIALS.NET",
                                          "Microcomputers: the unredeemed lollipop...",
                                          "Incentively, my dear, I don't tessellate a derangement.",
                                          "A person who never yodelled an apology, never preened vocalizing transsexuals.");
        // Print the hash code of each string
        for (String s : list) {
            System.out.println(s.hashCode());
        }
    }
}

(stolen from this post).

Answer 3:

hashCode() returns an int, so it has only 2^32 possible values. Your objects can have many more distinct states than this, so some unequal objects are going to share a hashCode, i.e. you cannot ensure hash codes will be unique.

This is made worse in a hash collection of limited size. The maximum capacity of a HashMap is 1 << 30, or about one billion buckets, so at most 30 bits of the hash code are really used. If your collection doesn't use 16+ GB and has only about one thousand buckets (1 << 10, technically 1024), then really you have only about 1000 possible buckets.
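
To make the bucket limit concrete, here is a rough sketch of how a hash code can be reduced to a bucket index. It assumes a power-of-two table length and a plain bit mask; the real HashMap also mixes the high bits of the hash code in first, so treat BucketIndexDemo and bucketIndex as illustrative names, not as the actual implementation.

```java
class BucketIndexDemo {
    // Simplified assumption: the table length is a power of two and the bucket
    // is chosen by keeping only the low bits of the hash code.
    static int bucketIndex(Object key, int tableLength) {
        return key.hashCode() & (tableLength - 1);
    }

    public static void main(String[] args) {
        int tableLength = 1 << 10; // 1024 buckets
        // Integer.hashCode() is the int value itself, so these hash codes all differ.
        for (int key : new int[] {1, 1025, 2049}) {
            System.out.println(key + " -> bucket " + bucketIndex(key, tableLength));
        }
        // All three keys land in bucket 1.
    }
}
```

Even keys with different hash codes end up sharing a bucket once the table is small.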

Note: on the HotSpot JVM, the default Object.hashCode() is never negative, i.e. it is only 31-bit, though I am not sure why.

If you want to generate lots of objects with the same hashCode look at Long.

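// from java.lang.Long
public int hashCode() {
    return (int)(value ^ (value >>> 32));
}

// Put the same 32 bits in both halves, so the XOR above cancels to 0.
for (long i = Integer.MIN_VALUE; i <= Integer.MAX_VALUE; i++) {
    // The mask stops sign extension of a negative i from corrupting the upper half.
    Long l = (i << 32) | (i & 0xFFFFFFFFL);
    System.out.print(l.hashCode() + " ");
    if (i % 100 == 0)
        System.out.println();
}

This will generate about 4 billion (2^32) distinct Long values, all with a hashCode of 0.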

Answer 4:

It's pretty simple to understand if you know how a HashMap is implemented and its purpose. A HashMap takes a large set of values and splits them into much smaller sets (buckets) for much faster retrieval of elements. Basically, you only need to search the one bucket instead of the full list for your element. The buckets are in an array where the index is derived from the hash code. Each bucket contains a linked list of the elements that landed in it, including elements that have the same hash code but are not equal(). I think in Java 8 they switched to using a balanced tree for a bucket when it becomes large.
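
A small sketch of what this means in practice: two keys that are unequal but share a hash code can both live in the same HashMap, because equals() separates them inside the bucket. The Key class and its deliberately weak hashCode() are hypothetical, written only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeysDemo {
    // Hypothetical key type: equality depends on the name, but the hash code
    // only looks at the name's length, so "cat" and "dog" collide.
    static final class Key {
        final String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.length();
        }
    }

    static Map<Key, String> build() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("cat"), "meow");
        map.put(new Key("dog"), "woof");
        return map;
    }

    public static void main(String[] args) {
        Map<Key, String> map = build();
        System.out.println(new Key("cat").hashCode() == new Key("dog").hashCode()); // true
        System.out.println(new Key("cat").equals(new Key("dog")));                  // false
        System.out.println(map.size());                                             // 2
        System.out.println(map.get(new Key("dog")));                                // woof
    }
}
```

The map stays correct with a weak hash like this one; it only gets slower as more keys pile into the same bucket.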

Answer 5:

The purpose of hashCode is to enable the following axiom and corollary:

1) If one happens to know the hash codes of two objects, and those hash codes don't match, one need not bother examining the objects any further to know that the objects won't match. Even if two arbitrarily chosen non-matching objects had a 10% chance of having matching hash codes, testing hash codes would let one eliminate 90% of the comparisons one would otherwise need. Not as big a win as eliminating 99.99%, but definitely worthwhile nonetheless.

2) Knowledge that none of the objects in a bunch have a particular hash code implies that none of the objects in that bunch will match an object with that hash code. If one partitioned a collection of objects into those whose hash code was an even number and those whose hash code was odd, and one wanted to find whether one had a given item whose hash code happened to be even, there would be no need to examine anything in the collection of odd-hash items. Likewise, there would be no need to look for an odd-hash item in the even-hash collection. Even a two-value hash could thus speed up searches by almost half. If one divides a collection into smaller partitions, one can speed things up even further.

Note that hashCode() will offer the most benefit if every different item returns a different hash, but it can offer substantial benefit even when many items have the same hash value. The difference between a 90% savings and a 99.99% savings is often much greater than the numbers would suggest, and thus if one can reasonably easily improve things to 99%, 99.9%, or better, one should do so, but the difference between having zero false matches and having a few false matches in a collection is pretty slight.
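
The first point can be sketched as a linear search that compares hash codes before calling equals(). This is an illustrative sketch, not how any particular collection is implemented; HashPrefilterDemo and equalsCallsNeeded are made-up names.

```java
import java.util.Arrays;
import java.util.List;

class HashPrefilterDemo {
    // Counts how many equals() calls a linear search needs when candidates
    // with a different hash code are skipped first.
    static int equalsCallsNeeded(List<String> items, String target) {
        int targetHash = target.hashCode();
        int equalsCalls = 0;
        for (String item : items) {
            if (item.hashCode() != targetHash) {
                continue; // different hash: cannot be equal, skip the comparison
            }
            equalsCalls++;
            if (item.equals(target)) {
                break;
            }
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("zz", "Aa", "Ab", "BB");
        // "Aa" and "BB" share a hash code, so "Aa" is a false match that still
        // needs one equals() call; "zz" and "Ab" are rejected by hash alone.
        System.out.println(equalsCallsNeeded(items, "BB") + " equals() calls for "
                + items.size() + " items"); // 2 equals() calls for 4 items
    }
}
```

One false match costs a single extra equals() call, which is why a few collisions are cheap while many are not.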

Answer 6:

It's pretty simple, actually.

First we have to know what a hash code is.

In Java, a hash code is simply a 32-bit signed integer that is somehow derived from the data in question. A simple hash for integer data is (value) mod (some reasonably large prime number); Java's own Integer.hashCode() just returns the value itself.

Let's do a simple hash on integers.
Define:

public int hash(int num){ return num % 19 ; }

In this case, both 19 and 38 will return the hash value of 0.

For string types, the hash is derived from the individual characters and each one's position in the string, reduced modulo a reasonably large number (or, in the case of Java, by ignoring overflow in a 32-bit sum).

Given that there are arbitrarily many strings possible, and there is a limited number of hashcodes (2^32) for a string, the pigeon-hole principle states that there are at least two different strings that result in the same hashcode.
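
A well-known concrete pair is "Aa" and "BB". The sketch below checks it and shows that concatenating colliding blocks of equal length gives further collisions; StringCollisionDemo and collide are illustrative names only.

```java
class StringCollisionDemo {
    // True when the two strings are unequal but share a hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112 and 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("Aa".hashCode());       // 2112
        System.out.println("BB".hashCode());       // 2112
        System.out.println(collide("Aa", "BB"));   // true

        // Concatenating colliding blocks of equal length gives more collisions.
        System.out.println(collide("AaAa", "BBBB")); // true
        System.out.println(collide("AaBB", "BBAa")); // true
    }
}
```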

Answer 7:

Actually, this link explains more clearly what happens when hash codes are equal.

http://www.javamadesoeasy.com/2015/02/hashmap-custom-implementation.html

Answer 8:

I believe this will help you understand...

The hash code of a Java object is simply a number, a 32-bit signed int, that allows the object to be managed by a hash-based data structure. It is often said that the hash code is a unique ID number allocated to an object by the JVM, but actually a hash code is not unique to an object. If two objects are equal, they must return the same hash code. So we have to implement the hashCode() method of a class in such a way that if two objects are equal, i.e. compared by the equals() method of that class, then those two objects return the same hash code. If you override hashCode(), you need to override equals() as well.
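
As a minimal sketch of keeping the two methods consistent, the hypothetical GridPoint class below builds its hash code from exactly the fields that equals() compares. Objects.hash is one convenient way to do this, not the only one.

```java
import java.util.Objects;

class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridPoint)) return false;
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    // Built from the same fields as equals(), so equal points always
    // return the same hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        GridPoint a = new GridPoint(1, 2);
        GridPoint b = new GridPoint(1, 2);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

Unequal points may still share a hash code; the contract only runs in the other direction.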

ref: https://www.java2novice.com/java_interview_questions/hashcode/

Answer 9:

My understanding is that hashCode is a numeric representation of the memory address, but is not the actual address. It can be changed (by overriding hashCode()) without affecting the actual address. Thus, it should be possible to set all objects to the same hashCode, even if they are all entirely different things. Think of everybody on one block all suddenly having the same street address. They are truly different people, but now all share the same street address. Their houses didn't move; a mischievous teen just labeled everybody as "100 N. Main".

I am pretty new to Java, so take my reply with a bit of caution.

Source: https://stackoverflow.com/questions/16400711/two-unequal-objects-with-same-hashcode

Tags

java

hashmap

equals

hashcode