Using Integer as a key with HashMap in Java

逝去的感伤 2021-01-20 10:49

Recently I was looking for a good implementation of the hashCode() method in the Java API and looked through the Integer source code. Didn't expect that, but th

3 answers
  • 2021-01-20 11:21

    Integer is the worst data type candidate for a key when used with HashMap, as all consecutive keys will be placed in one bin

    No, that statement is wrong.

    In fact, the implementation of Integer's hashCode() is the best possible implementation. It returns the wrapped int value itself, so each Integer value maps to a unique hashCode value, which reduces the chance of different keys being mapped into the same bucket.

    Sometimes a simple implementation is the best.

    From the Javadoc of hashCode() in the Object class:

    It is not required that if two objects are unequal according to the java.lang.Object.equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

    Integer is one of the few classes that actually guarantees that unequal objects will have different hashCode().
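
    To make this concrete, here is a minimal sketch; the class and method names are invented for illustration and are not part of the JDK. It assumes a table of 16 buckets (HashMap's default initial capacity) and picks the bucket by masking the hash code with length - 1. It leaves out the extra bit-spreading step HashMap applies, which does not change values below 65536.

    ```java
    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical demo class, not JDK code.
    final class IntegerBucketDemo {

        // Bucket index for a power-of-two table: keep only the low bits of the hash code.
        static int bucketIndex(Object key, int tableLength) {
            return key.hashCode() & (tableLength - 1);
        }

        // Counts how many different buckets the keys 0 .. keyCount-1 land in.
        static int distinctBuckets(int keyCount, int tableLength) {
            Set<Integer> used = new HashSet<>();
            for (int k = 0; k < keyCount; k++) {
                used.add(bucketIndex(Integer.valueOf(k), tableLength));
            }
            return used.size();
        }

        public static void main(String[] args) {
            System.out.println(Integer.valueOf(42).hashCode()); // prints 42
            System.out.println(distinctBuckets(16, 16));        // prints 16
        }
    }
    ```

    Sixteen consecutive keys occupy sixteen different buckets, which is the opposite of "all consecutive keys in one bin".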

  • 2021-01-20 11:25

    From the docs:

    The general contract of hashCode is:

    Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

    --> Integer#hashCode fulfills this.

    If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

    --> Integer#hashCode fulfills this too.

    It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

    --> Integer#hashCode fulfills this to the maximum extent, i.e. two unequal Integers will never have the same hash code.
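
    The three points can be checked with a small sketch. The class and helper names here are hypothetical; the String pair is only included to show that the contract does allow unequal objects to share a hash code, which Integer never does.

    ```java
    // Hypothetical demo class, not JDK code.
    final class HashContractDemo {

        // Second rule of the contract: equal objects must report the same hash code.
        static boolean equalImpliesSameHash(Object a, Object b) {
            return !a.equals(b) || a.hashCode() == b.hashCode();
        }

        // True when two unequal objects share a hash code (allowed, but costs performance).
        static boolean collides(Object a, Object b) {
            return !a.equals(b) && a.hashCode() == b.hashCode();
        }

        public static void main(String[] args) {
            Integer x = Integer.valueOf(1000);
            Integer y = Integer.valueOf(1000);

            System.out.println(x.hashCode() == x.hashCode());  // consistent: prints true
            System.out.println(equalImpliesSameHash(x, y));    // prints true
            System.out.println(collides(Integer.valueOf(1), Integer.valueOf(2))); // prints false
            System.out.println(collides("Aa", "BB"));          // prints true, both hash to 2112
        }
    }
    ```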

  • 2021-01-20 11:33

    Adding to @Eran's answer, Java's HashMap also has a protection against "bad hash functions" (which Integer.hashCode() isn't, but still).

    /**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower.  Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.)  So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
    

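    So the hash that HashMap actually uses for an Integer can differ from the plain hashCode(): it is unchanged for values from 0 to 65535, whose upper 16 bits are zero, and altered for any value with upper bits set.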
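
    A rough sketch of the effect, assuming a table of 16 buckets: the spread method repeats the XOR transform from the HashMap source for a plain int, while the class name, the index helper and the chosen keys (multiples of 65536) are made up for illustration.

    ```java
    // Hypothetical demo class, not JDK code.
    final class SpreadDemo {

        // Same transform as HashMap.hash(): XOR the upper 16 bits into the lower 16.
        static int spread(int h) {
            return h ^ (h >>> 16);
        }

        // Bucket index in a power-of-two table.
        static int index(int hash, int tableLength) {
            return hash & (tableLength - 1);
        }

        public static void main(String[] args) {
            // Small values pass through unchanged.
            System.out.println(spread(7)); // prints 7

            // Keys that differ only above bit 15 all fall into bucket 0 without spreading.
            for (int i = 1; i <= 3; i++) {
                int key = Integer.valueOf(i << 16).hashCode();
                System.out.println(key + " raw=" + index(key, 16)
                        + " spread=" + index(spread(key), 16));
            }
            // prints:
            // 65536 raw=0 spread=1
            // 131072 raw=0 spread=2
            // 196608 raw=0 spread=3
        }
    }
    ```

    Such keys are the case the spreading step is meant to help with; ordinary consecutive Integer keys are already well distributed without it.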
