Why is HashSet maintaining natural/ alphabetical order? [duplicate]

孤街浪徒 提交于 2019-12-13 16:28:59

问题


When I run below code it is always giving the o/p in natural/ alphabetical order. As per I know HashSet doesn't sort the entries. I know that HashSet is backed by HashMap and not LinkedHashMap. I tried to explore the source code of HashSet and HashMap but couldn't find the code for this behavior.

From the source code there is below constructor in HashSet class:

HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = new LinkedHashMap<>(initialCapacity, loadFactor);
    }

which uses LinkedHashMap. If I had used this constructor I would think this is the reason for this behavior but I'm not using this constructor.

Could someone please explain the reason/ code for this behavior?

Here's my simple code:

Set<String> mySet = new HashSet<>();

        mySet.add("D");
        mySet.add("B");
        mySet.add("1");
        mySet.add("E");
        mySet.add("A");
        mySet.add("F");

        mySet.stream().forEach(x -> System.out.println(x));

OP:

1
A
B
D
E
F

回答1:


This is coincidence because the default HashSet is bigger than the range of the hashes and there are no collisions, and the hashes for the Strings end up being in alphabetical order.

This is the code for String.hashCode:

   public int hashCode() {
        int h = hash;
        if (h == 0) {
            int off = offset;
            char val[] = value;
            int len = count;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }

As you can see, the hash for one-character string ends up just being the character value.

The default capacity of HashSet is 16, which means all your values end up in the bucket char value % 16 which turns out to be alphabetical order for your example. Try with "2" instead of "1", for example, this should end up after "A". Even if you swap "A" and "1" this should swap them in the output too. See Ascii table.




回答2:


From Java 8 Docs

This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.

So in other words, you cannot rely on the order of the elements in a HashSet.




回答3:


The hashCode of Strings of length 1 hash just the sole char, and its hash code is its own numeric value. Voilà, all is ordered.

This phenomenon partly also can be found for Strings with the same prefix, of the same length and is relevant for security exploits. (I believe MD5 needs an artifical seed.)




回答4:


This is just a coincidence for your testing/working data set which is generating ordered hashes. I have added some more element to your set. Try to run the below code, I think you will get your answer.

Set<String> mySet = new HashSet<>();

mySet.add("D");
mySet.add("B");
mySet.add("1");
mySet.add("E");
mySet.add("A");
mySet.add("F");
mySet.add("C");
mySet.add("Z");
mySet.add("M");
mySet.add("Q");


mySet.stream().forEach(x -> System.out.println(x));

Here is my output (which is not in natural order) : 1 A Q B C D E F Z M




回答5:


Use the following code and you would see that the hashcode of the added elements are in a ascending order :

Set<String> mySet = new HashSet<>();

mySet.add("D");
mySet.add("B");
mySet.add("1");
mySet.add("E");
mySet.add("A");
mySet.add("F");

mySet.stream()
     .forEach(x -> System.out.println(x + " : " + x.hashCode()));

System.out.println(mySet);

1 : 49

A : 65

B : 66

D : 68

E : 69

F : 70

[1, A, B, D, E, F]

Here you used a very particular example : you added only Strings containing a single character (letter or number).
As hashcode of these correspond to their ASCII code you get a predictable order that respects the ASCCI order.

Distinct hashcode values ares physically represented by distinct elements of an array in the HashMap implementation :

transient Node<K,V>[] table;

And Iterator of HashMap iterates array elements index by index.
Whereas the result.

Now the ASCII order that the Map uses to iterate that looks like a natural order for numeric and alphabetical character is right only for very simple cases where added Strings are composed of only 1 letter or 1 digit.

Add Strings that contain more than one of character and you would have a unpredictable order :

Set<String> mySet = new HashSet<>();
mySet.add("Dad");
mySet.add("Mum");
mySet.add("15454");
mySet.add("90000");

mySet.stream()
     .forEach(x -> System.out.println(x + " : " + x.hashCode()));

System.out.println(mySet);

90000 : 54118329

Mum : 77733

15454 : 46883119

Dad : 68455

[90000, Mum, 15454, Dad]



来源:https://stackoverflow.com/questions/48560736/why-is-hashset-maintaining-natural-alphabetical-order

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!