HashSet order and difference with JDK 7 / 8

悲&欢浪女 2021-01-13 18:30

This is a two part question:

  1. Does HashSet implement some hidden ordering mechanism, or does it just, to quote the documentation: It makes no guarantees as to the
4 answers
  •  -上瘾入骨i
    2021-01-13 18:50

    To see something more interesting, change your example code to:

    public static void main(String... args) {
      System.out.println(System.getProperty("java.version"));
      List<String> strings = Arrays.asList("qwe", "rtz", "123", "qwea",
          "12334rasefasd", "asdxasd", "arfskt6734", "123121", "", "qwr",
          "rtzz", "1234", "qwes", "1234rasefasd", "asdxasdq", "arfskt6743",
          "123121 ", " ");

      // Same strings, same insertion order; only the initial capacity (2^i) changes.
      for (int i = 5; i < 26; i++) {
          Set<String> stringSet = new HashSet<>(1 << i);
          stringSet.addAll(strings);
          System.out.println(stringSet);
      }
    }

    This still adds the same strings in the same order to a HashSet, but the HashSet has been initialized with different capacities.
    The result is

    1.7.0_51
    [,  , qwea, 123, asdxasdq, qwe, qwr, 123121 , arfskt6743, 1234rasefasd, qwes, rtzz, rtz, 1234, 12334rasefasd, arfskt6734, asdxasd, 123121]
    [, qwr, arfskt6743, rtzz, 12334rasefasd,  , qwea, 123, asdxasdq, qwe, 123121 , 1234rasefasd, qwes, rtz, 1234, arfskt6734, asdxasd, 123121]
    [, qwr, arfskt6743, rtzz, 12334rasefasd,  , 123, rtz, 1234, arfskt6734, asdxasd, 123121, qwea, asdxasdq, qwe, 123121 , 1234rasefasd, qwes]
    [, qwr, arfskt6743, rtzz, 12334rasefasd,  , arfskt6734, asdxasd, 123121, 123121 , 1234rasefasd, 123, rtz, 1234, qwea, asdxasdq, qwe, qwes]
    [, rtzz,  , 123121, 123121 , 1234rasefasd, 123, rtz, qwea, asdxasdq, qwe, qwes, qwr, arfskt6743, 12334rasefasd, arfskt6734, asdxasd, 1234]
    [, rtzz,  , 123121, 123, asdxasdq, 123121 , 1234rasefasd, rtz, qwea, qwe, qwes, qwr, arfskt6743, 12334rasefasd, arfskt6734, asdxasd, 1234]
    [,  , 123121, asdxasdq, 1234rasefasd, rtz, qwea, qwes, arfskt6743, 12334rasefasd, arfskt6734, asdxasd, rtzz, 123, 123121 , qwe, qwr, 1234]
    [,  , 123121, 1234rasefasd, rtz, qwea, qwes, asdxasd, rtzz, 123, 1234, asdxasdq, arfskt6743, 12334rasefasd, arfskt6734, 123121 , qwe, qwr]
    [,  , rtz, asdxasd, rtzz, arfskt6743, 12334rasefasd, arfskt6734, 123121 , qwe, qwr, 123121, 1234rasefasd, qwea, qwes, 123, 1234, asdxasdq]
    [,  , arfskt6743, 12334rasefasd, arfskt6734, qwea, qwes, 1234, asdxasdq, rtz, asdxasd, rtzz, 123121 , qwe, qwr, 123121, 1234rasefasd, 123]
    [,  , qwea, qwes, rtz, asdxasd, rtzz, 123121 , qwe, qwr, 123, arfskt6743, 12334rasefasd, arfskt6734, 1234, asdxasdq, 123121, 1234rasefasd]
    [,  , qwea, qwes, asdxasd, 1234, asdxasdq, 123121, 1234rasefasd, rtz, rtzz, 123121 , qwe, qwr, 123, arfskt6743, 12334rasefasd, arfskt6734]
    [,  , qwea, qwes, asdxasd, 1234, 1234rasefasd, rtzz, 123121 , 123, arfskt6743, arfskt6734, asdxasdq, 123121, rtz, qwe, qwr, 12334rasefasd]
    [,  , 1234rasefasd, 123, asdxasdq, rtz, qwe, qwr, 12334rasefasd, qwea, qwes, asdxasd, 1234, rtzz, 123121 , arfskt6743, arfskt6734, 123121]
    [,  , 123, asdxasdq, rtz, qwe, qwr, 12334rasefasd, 123121 , 123121, 1234rasefasd, qwea, qwes, asdxasd, 1234, rtzz, arfskt6743, arfskt6734]
    [,  , 123, asdxasdq, rtz, qwe, qwr, 12334rasefasd, 1234rasefasd, qwea, qwes, asdxasd, 1234, rtzz, 123121 , 123121, arfskt6743, arfskt6734]
    [,  , 123, rtz, qwe, qwr, 12334rasefasd, asdxasd, asdxasdq, 1234rasefasd, qwea, qwes, 1234, rtzz, 123121 , 123121, arfskt6743, arfskt6734]
    [,  , 123, rtz, qwe, qwr, asdxasd, asdxasdq, 1234, arfskt6743, arfskt6734, 12334rasefasd, 1234rasefasd, qwea, qwes, rtzz, 123121 , 123121]
    [,  , 123, rtz, qwe, qwr, asdxasdq, 1234, 12334rasefasd, 1234rasefasd, qwea, qwes, rtzz, 123121 , 123121, asdxasd, arfskt6743, arfskt6734]
    [,  , 123, rtz, qwe, qwr, 1234, 12334rasefasd, qwea, qwes, rtzz, 123121 , asdxasdq, 1234rasefasd, 123121, asdxasd, arfskt6743, arfskt6734]
    [,  , 123, rtz, qwe, qwr, 1234, qwea, qwes, rtzz, 12334rasefasd, 123121 , asdxasdq, 1234rasefasd, 123121, asdxasd, arfskt6743, arfskt6734]
    
    1.8.0_111
    [,  , qwes, arfskt6743, asdxasdq, 123121, 123121 , arfskt6734, qwr, 123, 1234, qwea, rtzz, rtz, 12334rasefasd, 1234rasefasd, qwe, asdxasd]
    [, 123121, arfskt6734, qwr, 1234, asdxasd,  , qwes, arfskt6743, asdxasdq, 123121 , 123, qwea, rtzz, rtz, 12334rasefasd, 1234rasefasd, qwe]
    [, arfskt6734, qwr,  , arfskt6743, asdxasdq, 123, rtzz, 123121, 1234, asdxasd, qwes, 123121 , qwea, rtz, 12334rasefasd, 1234rasefasd, qwe]
    [, qwr,  , 123, rtzz, 1234, asdxasd, qwes, 123121 , qwea, rtz, 1234rasefasd, arfskt6734, arfskt6743, asdxasdq, 123121, 12334rasefasd, qwe]
    [,  , 123, 1234, rtz, arfskt6734, arfskt6743, asdxasdq, 123121, 12334rasefasd, qwe, qwr, rtzz, asdxasd, qwes, 123121 , qwea, 1234rasefasd]
    [,  , 1234, asdxasdq, 123121, 12334rasefasd, rtzz, asdxasd, qwes, 123121 , qwea, 123, rtz, arfskt6734, arfskt6743, qwe, qwr, 1234rasefasd]
    [,  , 1234, 12334rasefasd, qwes, qwea, rtz, asdxasdq, 123121, rtzz, asdxasd, 123121 , 123, arfskt6734, arfskt6743, qwe, qwr, 1234rasefasd]
    [,  , asdxasdq, rtzz, 123121 , arfskt6734, arfskt6743, qwe, qwr, 1234, 12334rasefasd, qwes, qwea, rtz, 123121, asdxasd, 123, 1234rasefasd]
    [,  , asdxasdq, 123121 , arfskt6734, arfskt6743, 1234, 12334rasefasd, qwes, qwea, 123121, asdxasd, 1234rasefasd, rtzz, qwe, qwr, rtz, 123]
    [,  , asdxasdq, 1234, rtzz, 123121 , arfskt6734, arfskt6743, 12334rasefasd, qwes, qwea, 123121, asdxasd, 1234rasefasd, qwe, qwr, rtz, 123]
    [,  , 1234, rtzz, 123121 , arfskt6734, arfskt6743, 12334rasefasd, qwes, qwea, 123121, asdxasd, qwe, qwr, rtz, 123, asdxasdq, 1234rasefasd]
    [,  , 1234, 123121 , arfskt6734, arfskt6743, qwes, qwea, asdxasd, rtzz, 12334rasefasd, 123121, qwe, qwr, rtz, 123, asdxasdq, 1234rasefasd]
    [,  , arfskt6734, arfskt6743, asdxasd, 123, asdxasdq, 1234rasefasd, 1234, 123121 , qwes, qwea, rtzz, 12334rasefasd, 123121, qwe, qwr, rtz]
    [,  , arfskt6734, arfskt6743, 123, asdxasdq, 123121 , qwes, qwea, rtzz, 12334rasefasd, 123121, qwe, qwr, rtz, asdxasd, 1234rasefasd, 1234]
    [,  , 123, 123121 , qwe, qwr, rtz, asdxasd, 1234rasefasd, arfskt6734, arfskt6743, asdxasdq, qwes, qwea, rtzz, 12334rasefasd, 123121, 1234]
    [,  , 123, qwe, qwr, rtz, asdxasd, 1234rasefasd, arfskt6734, arfskt6743, qwes, qwea, rtzz, 123121, 1234, 123121 , asdxasdq, 12334rasefasd]
    [,  , 123, qwe, qwr, rtz, arfskt6734, arfskt6743, 123121 , asdxasdq, asdxasd, 1234rasefasd, qwes, qwea, rtzz, 123121, 1234, 12334rasefasd]
    [,  , 123, qwe, qwr, rtz, arfskt6734, arfskt6743, 123121 , 1234, 12334rasefasd, asdxasdq, asdxasd, 1234rasefasd, qwes, qwea, rtzz, 123121]
    [,  , 123, qwe, qwr, rtz, arfskt6734, arfskt6743, 1234, asdxasdq, asdxasd, 1234rasefasd, qwes, qwea, rtzz, 123121 , 12334rasefasd, 123121]
    [,  , 123, qwe, qwr, rtz, 1234, asdxasdq, asdxasd, qwes, qwea, rtzz, 123121 , 12334rasefasd, 123121, arfskt6734, arfskt6743, 1234rasefasd]
    [,  , 123, qwe, qwr, rtz, 1234, qwes, qwea, rtzz, 123121 , 123121, arfskt6734, arfskt6743, asdxasdq, asdxasd, 12334rasefasd, 1234rasefasd]
    

    This demonstrates that the iteration order depends not only on the particular implementation, but also on the history of the HashSet. A higher capacity might also be the result of the set having been bigger previously and having had elements removed since.
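The effect of a set's history can be sketched as follows. This is a minimal illustration, not part of the original answer: the class name `HashSetHistoryDemo` and the chosen values (16, 1, and the temporary range 100..299) are assumptions picked so that the effect is visible with `Integer` keys. On OpenJDK 8 and later the two sets are expected to print in different orders, but nothing in the specification guarantees any particular order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; names and values are illustrative only.
class HashSetHistoryDemo {

    // A set that only ever held the two elements (default initial capacity).
    static Set<Integer> freshSet() {
        Set<Integer> set = new HashSet<>();
        set.add(16);
        set.add(1);
        return set;
    }

    // The same two elements, but the set temporarily held 200 more,
    // so its internal table has grown in the meantime.
    static Set<Integer> grownThenShrunkSet() {
        Set<Integer> set = freshSet();
        for (int i = 100; i < 300; i++) {
            set.add(i);      // forces the internal table to be resized
        }
        for (int i = 100; i < 300; i++) {
            set.remove(i);   // removing elements does not shrink the table
        }
        return set;
    }

    public static void main(String[] args) {
        Set<Integer> fresh = freshSet();
        Set<Integer> grown = grownThenShrunkSet();
        System.out.println(fresh.equals(grown));     // true: same elements
        System.out.println(new ArrayList<>(fresh));  // iteration order of the fresh set
        System.out.println(new ArrayList<>(grown));  // iteration order after grow/shrink
    }
}
```

Both sets are equal according to `Set.equals`, yet code that relies on their iteration order can behave differently depending on what happened to the set earlier.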

    While the hash code determines which array position an element goes to, collisions can cause several elements to share one bucket. Within a bucket, the collision is resolved either through a linked list, in which case the order inside the bucket reflects the insertion order and therefore also depends on the set’s history, or, since Java 8, through a balanced tree, which reflects the order of either the hash codes or the elements’ natural order, depending on whether this is a true hash collision or just a bucket collision.
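The linked-list case can be observed with two strings that have the same hash code. The sketch below is an illustration under assumptions: the class name `BucketCollisionDemo` and the helper `iterationOrder` are made up for this example, and `"Aa"` and `"BB"` are used because both hash to 2112 under `String.hashCode()`. With only two colliding elements the bucket stays a linked list, so the order within it follows the insertion order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; names are illustrative only.
class BucketCollisionDemo {

    // Adds the two strings in the given order and returns the iteration order.
    static List<String> iterationOrder(String first, String second) {
        Set<String> set = new HashSet<>();
        set.add(first);
        set.add(second);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // Equal hash codes, so both strings end up in the same bucket.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(iterationOrder("Aa", "BB"));
        System.out.println(iterationOrder("BB", "Aa"));
    }
}
```

The two sets contain the same elements, but their iteration order is expected to differ because the colliding entries are chained in the order they were added.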

    But Java 8’s HashSet only uses the tree once a bucket holds a certain number of colliding elements; below that, it still uses a linked list. To avoid switching back and forth between the two forms, it uses different thresholds for converting to a tree and for converting back to a linked list. So if the number of collisions lies between these thresholds, the form of the bucket, and hence its order, again depends on the set’s history, i.e. on whether there were more elements previously.


    Note that Java 7’s “alternative String hash function” was disabled by default and the collision resolution addresses a corner case. Still, as you can see from the output, there is almost always a notable difference in the iteration order.

    The reason is that now that collisions are handled more efficiently, the attempts to avoid collisions have been reduced. In Java 7, hash codes underwent the following transformation before getting mapped to array positions:

     h ^= (h >>> 20) ^ (h >>> 12);
     return h ^ (h >>> 7) ^ (h >>> 4);
    

    In contrast, Java 8 uses the following transformation:

    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    

    This has an immediate impact on the iteration order, even if no collisions occurred.
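To make the difference concrete, the two transformations quoted above can be applied to the same hash code and mapped to a bucket index. This is a sketch under assumptions: the class and method names (`HashSpreadDemo`, `spreadJava7`, `spreadJava8`, `bucketIndex`) are invented for illustration, the alternative String hashing of Java 7 is ignored, and the index is computed as `hash & (length - 1)` for a power-of-two table length.

```java
// Hypothetical demo class; method names are illustrative only.
class HashSpreadDemo {

    // Java 7 style transformation, as quoted above.
    static int spreadJava7(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Java 8 style transformation, as quoted above (non-null key assumed).
    static int spreadJava8(int h) {
        return h ^ (h >>> 16);
    }

    // Maps a transformed hash to an array position; tableLength must be a power of two.
    static int bucketIndex(int spreadHash, int tableLength) {
        return spreadHash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int h = Integer.valueOf(16).hashCode(); // an Integer's hash code is its value
        System.out.println(bucketIndex(spreadJava7(h), 16)); // 1
        System.out.println(bucketIndex(spreadJava8(h), 16)); // 0
    }
}
```

The same key lands in bucket 1 under the Java 7 transformation and in bucket 0 under the Java 8 one, so its position during iteration changes even though no other element is involved.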
