Adding arrays with same values to HashSet results in duplicate items

核能气质少年 提交于 2019-12-17 20:07:38

问题


I'm trying to create a set of arrays of ints, the thing is that if I try to do:

HashSet<int[]> s = new HashSet<int[]>();
int a1[] = {1,2,3};
int a2[] = {1,2,3};
s.add(a1);
s.add(a2)
System.out.println(s.size());

Then s has two objects, but there should be only one. Note: it doesn't matter if it is HashSet< Integer[]>. It just doesn't work.

Now If I try to do this with an ArrayList< Integer>, something like:

HashSet<ArrayList<Integer>> s = new HashSet<ArrayList<Integer>>();
ArrayList<Integer> a1 = new ArrayList<Integer>();
ArrayList<Integer> a2 = new ArrayList<Integer>();
a1.add(1);
a1.add(2);
a1.add(3);

a2.add(1);
a2.add(2);
a2.add(3);

s.add(a1);
s.add(a2)
System.out.println(s.size());

Then s has one object.

I though a way to avoid the error in the first code and was storing hashcodes of each array in a hashset as follows:

int a1[] = {0,10083,10084,1,0,1,10083,0,0,0,0};
int a2[] = {1 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,1 ,0,2112};
HashSet<Integer> s= new HashSet<Integer>();//hashcodes of each array
s.add(Arrays.hashCode(a1));
s.add(Arrays.hashCode(a2));
System.out.println(Arrays.hashCode(a1));
System.out.println(Arrays.hashCode(a2));
System.out.println(s.size());

It works for the first case(1,2,3) but in cases where there are collisions it doesn't work, so I would have to manage collisions. So, I think that what I am doing is implementing a HashSet by myself.

With HashSet< ArrayList< Integer>> it works perfectly. I suppose that java manage collisions in that case.

My question is why java does not allow to manage a HashSet< int[]> or HashSet< Integer[]> if hashcodes generated are the same as in ArrayList< Integer> and hashcodes of arrays can be calculated simply by calling Arrays.hashCode(...).

And finally, if I want to do a HashSet< int[]>(or HashSet< Integer[]>) I would have to implement it by myself? Or there is a better way to do it?

Thanks.

UPDATE: Ok, finally I think I have came to a complete answer. As @ZiyaoWei and @user1676075 commented it doesn't work because equals returns false and hashcodes are differents. But, why java does not override this methods(with Arrays.equals(), Arrays.hashCode()) so one can do something like HashSet< int[]>? The answer is because an array is a mutable object, and hashcode can not depend on mutable values(each element of array is a mutable value) according to the general contract of hashcode. Mutable objects and hashCode

Here nice explanations of using mutable fields in hashCode http://blog.mgm-tp.com/2012/03/hashset-java-puzzler/ and mutable keys in hashmaps Are mutable hashmap keys a dangerous practice?

My answer is, if you want to use a HashSet< int[]> you have to create a class that has an array and if you want that hashcode and equals to depend on values, override methods equals() and hashCode() with Arrays.equals() and Arrays.hashCode(). If you don't want to violate the contract just make the array final.

Thanks everyone!


回答1:


It has nothing to do with collision at the end of the day:

a1.equals(a2) == false

Since they are not equal, a Set will treat them as different.

Note Array in Java does not override the equals method from Object.

And since add in Set is defined as

More formally, adds the specified element e to this set if the set contains no element e2 such that (e==null ? e2==null : e.equals(e2))

is seems to be impossible to properly implement a Set that might meet your requirement (compare elements with Arrays.equals) without violating some contracts.




回答2:


The reason the HashSet> works is because the HashSet will use .equals() comparison to decide if you're inserting the same object twice. In the case of List, two lists of the same base type (e.g. ArrayList) with the same content, in the same order, will compare as equal. Thus you're telling the HashSet to insert the same object twice. It only takes a single instance once.

When you try to do that same operation with an array. See this post: equals vs Arrays.equals in Java for more details about array comparisons in Java. When you insert two arrays, the default .equals() tests if they are the same object, which they are not. Thus it fails.



来源:https://stackoverflow.com/questions/16657905/adding-arrays-with-same-values-to-hashset-results-in-duplicate-items

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!