equals and hashCode: Is Objects.hash method broken?

有刺的猬 2021-01-03 04:51

I am using Java 7, and I have the following class below. I implemented equals and hashCode correctly, but the problem is that equals r

4 answers
  • There is no requirement that unequal objects must have different hashCodes. Equal objects are expected to have equal hashCodes, but hash collisions are not forbidden. return 1; would be a perfectly legal implementation of hashCode, if not very useful.

    There are only 32 bits worth of possible hash codes, and an unbounded number of possible objects, after all :) Collisions will happen sometimes.
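
    As a small illustration of such a collision, the sketch below uses the two short strings "Aa" and "BB", which are unequal but share a hash code. The class and method names are illustrative choices only, not anything from the question.

    ```java
    class CollisionDemo {
        // True when the two strings are unequal yet share a hash code.
        static boolean sameHashButUnequal(String a, String b) {
            return !a.equals(b) && a.hashCode() == b.hashCode();
        }

        public static void main(String[] args) {
            // For a two-character string, hashCode is s[0] * 31 + s[1]:
            // 'A'(65) * 31 + 'a'(97) = 2112 and 'B'(66) * 31 + 'B'(66) = 2112.
            System.out.println("Aa".hashCode());                // 2112
            System.out.println("BB".hashCode());                // 2112
            System.out.println(sameHashButUnequal("Aa", "BB")); // true
        }
    }
    ```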

  • 2021-01-03 05:15

    It's not necessary for two unequal objects to have different hashes; what matters is that two equal objects have the same hash.

    I can implement hashCode() like this:

    public int hashCode() {
        return 5;
    }
    

    and it will still be correct (but inefficient, since every object ends up in the same hash bucket).
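
    To see that a constant hash code keeps hash-based collections correct, here is a minimal sketch. The Word class is a hypothetical value type made up for this illustration; it relies only on equals to tell elements apart.

    ```java
    import java.util.HashSet;
    import java.util.Set;

    class ConstantHashDemo {
        // Hypothetical value class used only for this illustration.
        static final class Word {
            private final String text;

            Word(String text) {
                this.text = text;
            }

            @Override
            public boolean equals(Object o) {
                return o instanceof Word && ((Word) o).text.equals(text);
            }

            @Override
            public int hashCode() {
                return 5; // legal: equal objects still get equal hash codes
            }
        }

        // Adds the given words to a HashSet and returns how many distinct ones it holds.
        static int distinctCount(String... texts) {
            Set<Word> set = new HashSet<Word>();
            for (String t : texts) {
                set.add(new Word(t));
            }
            return set.size();
        }

        public static void main(String[] args) {
            // All elements share one bucket, so equals() does all the work.
            System.out.println(distinctCount("a", "b", "a")); // 2
        }
    }
    ```

    The set still reports the right contents; lookups simply degrade because every element has to be compared with equals.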

  • 2021-01-03 05:34

    Since a hash code is a 32-bit int value, collisions (the same hash code for two unequal objects) are always possible, though usually rare. Your example is one such highly coincidental case. Here is the explanation.

    When you call Objects.hash, it internally calls Arrays.hashCode(), which uses the logic below:

    public static int hashCode(Object a[]) {
        if (a == null)
            return 0;
        int result = 1;
        for (Object element : a)
            result = 31 * result + (element == null ? 0 : element.hashCode());
        return result;
    }
    

    For your three-argument hashCode, this expands to:

    31 * (31 * (31 * 1 + hashOfString1) + hashOfString2) + hashOfString3
    

    For your first object, the hash values of the individual Strings are:

    chamorro --> 1140493257
    english --> 1698758127
    notes --> 0

    And for the second object:

    chamorro --> 1140494218
    english --> 1698728336
    notes --> 0

    Notice that the first two values differ between the two objects.

    But when the final hash code is computed as:

      int hashCode1 = 31*(31*(31+1140493257) + 1698758127)+0;
      int hashCode2 = 31*(31*(31+1140494218) + 1698728336)+0;
    

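    both expressions coincidentally yield the same hash code, 1919283673 (the value left after 32-bit int overflow). The differences cancel exactly: the chamorro hashes differ by -961 and the english hashes by 29791, and 31 * (-961) + 29791 = 0.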
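
    The arithmetic can be checked directly. The sketch below plugs in the four hash values quoted above and evaluates the same expression in 64-bit arithmetic, so nothing overflows until the final cast; combineWide is a hypothetical helper name, not a JDK method.

    ```java
    class CancellationDemo {
        // Same shape as 31 * (31 * (31 * 1 + h1) + h2) + h3, but in 64-bit
        // arithmetic so that nothing overflows for these inputs.
        static long combineWide(long h1, long h2, long h3) {
            return 31L * (31L * (31L + h1) + h2) + h3;
        }

        public static void main(String[] args) {
            long wide1 = combineWide(1140493257L, 1698758127L, 0L);
            long wide2 = combineWide(1140494218L, 1698728336L, 0L);
            System.out.println(wide1 == wide2); // true: equal even before truncation
            System.out.println((int) wide1);    // 1919283673

            // The two differences cancel: 31 * (-961) + 29791 == 0.
            int chamorroDiff = 1140493257 - 1140494218; // -961
            int englishDiff = 1698758127 - 1698728336;  // 29791
            System.out.println(31 * chamorroDiff + englishDiff); // 0
        }
    }
    ```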

    You can verify this yourself using the code segment below:

    // Assumes the ChamorroEntry class from the question.
    public static void main(String... args) {
        ChamorroEntry entry1 = new ChamorroEntry("Åguigan",
                "Second island south of Saipan. Åguihan.", "");
        ChamorroEntry entry2 = new ChamorroEntry("Åguihan",
                "Second island south of Saipan. Åguigan.", "");
        System.out.println(entry1.equals(entry2)); // prints false

        // Hash codes of the individual fields
        System.out.println("Åguigan".hashCode());
        System.out.println("Åguihan".hashCode());
        System.out.println("Second island south of Saipan. Åguihan.".hashCode());
        System.out.println("Second island south of Saipan. Åguigan.".hashCode());
        System.out.println("".hashCode()); // notes of entry1
        System.out.println("".hashCode()); // notes of entry2

        // Combined hash codes: from the class, from the copied algorithm, and by hand
        int hashCode1 = 31 * (31 * (31 + 1140493257) + 1698758127) + 0;
        int hashCode2 = 31 * (31 * (31 + 1140494218) + 1698728336) + 0;
        System.out.println(entry1.hashCode() + "\n" + entry2.hashCode());
        System.out.println(getHashCode(
                new String[]{entry1.chamorro, entry1.english, entry1.notes})
                + "\n" + getHashCode(
                new String[]{entry2.chamorro, entry2.english, entry2.notes}));
        System.out.println(hashCode1 + "\n" + hashCode2); // prints the same hash code twice
    }

    // Same logic as Arrays.hashCode(Object[])
    public static int getHashCode(Object[] a) {
        if (a == null)
            return 0;
        int result = 1;
        for (Object element : a)
            result = 31 * result + (element == null ? 0 : element.hashCode());
        return result;
    }
    

    If you use different string parameters, you will most likely get different hash codes.
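
    Since the asker's ChamorroEntry class is not shown in full here, the following self-contained sketch uses a hypothetical stand-in named Entry, assuming three String fields hashed with Objects.hash. It reproduces the collision without needing the original class.

    ```java
    import java.util.Objects;

    class EntryHashDemo {
        // Hypothetical stand-in for the asker's class (an assumption, not the
        // original code): three String fields, hashed with Objects.hash.
        static final class Entry {
            final String chamorro;
            final String english;
            final String notes;

            Entry(String chamorro, String english, String notes) {
                this.chamorro = chamorro;
                this.english = english;
                this.notes = notes;
            }

            @Override
            public boolean equals(Object o) {
                if (!(o instanceof Entry)) {
                    return false;
                }
                Entry e = (Entry) o;
                return chamorro.equals(e.chamorro)
                        && english.equals(e.english)
                        && notes.equals(e.notes);
            }

            @Override
            public int hashCode() {
                return Objects.hash(chamorro, english, notes);
            }
        }

        public static void main(String[] args) {
            // The first letter of each word is written as a Unicode escape
            // (A with ring above) so that the source file stays plain ASCII.
            Entry e1 = new Entry("\u00C5guigan",
                    "Second island south of Saipan. \u00C5guihan.", "");
            Entry e2 = new Entry("\u00C5guihan",
                    "Second island south of Saipan. \u00C5guigan.", "");
            System.out.println(e1.equals(e2));                  // false
            System.out.println(e1.hashCode() == e2.hashCode()); // true
        }
    }
    ```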

  • 2021-01-03 05:36

    Actually, you happened to hit a pure coincidence. :)

    Objects.hash happens to be implemented by successively multiplying the running result by 31 and then adding the hash code of the next object, while String.hashCode does the same with each of its characters. By coincidence, the difference in the "English" strings you used occurs exactly one position further from the end of the string than the corresponding (opposite) difference in the "Chamorro" strings, so everything cancels out perfectly. Congratulations!

    Try with other strings, and you'll probably find that it works as expected. As others have already pointed out, this effect is not actually wrong, strictly speaking, since hash codes may correctly collide even if the objects they represent are unequal. If anything, it might be worthwhile trying to find a more efficient hash, but I hardly think it should be necessary in realistic situations.
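
    The cancellation can be made concrete with a short sketch. It uses shortened ASCII stand-ins for the words in the question (an assumption: the characters before the changed letter are the same in both strings, so they do not affect the difference), and pow31 and hashShift are hypothetical helper names.

    ```java
    class PositionalWeightDemo {
        // 31 raised to the given power, with the same int overflow that
        // String.hashCode relies on.
        static int pow31(int exponent) {
            int result = 1;
            for (int i = 0; i < exponent; i++) {
                result *= 31;
            }
            return result;
        }

        // Hash change caused by replacing one character that has `fromEnd`
        // characters after it: that position carries weight 31^fromEnd.
        static int hashShift(char before, char after, int fromEnd) {
            return (after - before) * pow31(fromEnd);
        }

        public static void main(String[] args) {
            // 'g' -> 'h' with two characters after it ("an").
            int chamorroShift = "Aguihan".hashCode() - "Aguigan".hashCode();
            // 'h' -> 'g' with three characters after it ("an.").
            int englishShift = "Aguigan.".hashCode() - "Aguihan.".hashCode();
            System.out.println(chamorroShift == hashShift('g', 'h', 2)); // true (961)
            System.out.println(englishShift == hashShift('h', 'g', 3));  // true (-29791)

            // With three fields, Objects.hash weights the first field's hash
            // by 31^2 and the second field's hash by 31.
            System.out.println(31 * 31 * chamorroShift + 31 * englishShift); // 0
        }
    }
    ```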
