HashSet vs. ArrayList

[亡魂溺海] submitted on 2019-12-18 04:44:17

Question


I have a custom class, Class, that holds a set of another custom class, Student. It looks something like this:

public class Class {
    private Set<Student> students;

    // other methods
}

Now I will be adding many students to the set students and removing many from it, and I will also be changing many of the private fields of students already in the set.

QUESTION: What data structure should I use to best implement this? Since I will be changing properties of the Student objects in the set students (thereby changing their hash codes), should I use an ArrayList instead?


Answer 1:


What data structure should I use to best implement this? Since I will be changing properties of the Student objects in the set students (thereby changing their hash codes), should I use an ArrayList instead?

If the hashcodes for the set elements are liable to change, then you should NOT be using a HashSet. (If you do, the data structure will break, and elements in the set are liable to go missing.)
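A minimal sketch of how an element can "go missing", assuming a hypothetical MutableStudent class whose equals and hashCode both depend on a mutable name field (the class and field are illustrative, not from the question):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical student whose equals/hashCode depend on a mutable field.
final class MutableStudent {
    private String name;

    MutableStudent(String name) { this.name = name; }

    void setName(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableStudent
                && Objects.equals(name, ((MutableStudent) o).name);
    }

    @Override
    public int hashCode() { return Objects.hashCode(name); }
}

final class LostElementDemo {
    // Reports whether the set still finds the student after its name changed.
    static boolean foundAfterMutation() {
        Set<MutableStudent> students = new HashSet<>();
        MutableStudent s = new MutableStudent("Alice");
        students.add(s);
        // The hash code changes, but the entry stays filed under the old hash.
        s.setName("Bob");
        return students.contains(s);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());
    }
}
```

The object is still physically in the set (size() stays 1), but a lookup computes the new hash and no longer matches the stored entry.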

But I doubt you should be using ArrayList either, because if hashCode() is sensitive to changes to the object, then equals(Object) most likely is too. That means contains(...) and similar methods won't be able to find objects.

I think you should be using a Map type, and using a "student identifier" as the key.

(You could also override hashCode and equals so that equality means that two objects have the same id. But that makes equals(Object) useless for other purposes.)
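The Map suggestion could look roughly like this. It is only a sketch: the names Student, Roster, studentsById and the String id are assumptions made for illustration, and the id is assumed never to change after construction.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Student: the id is fixed, every other field may change freely.
final class Student {
    private final String id;
    private String name;

    Student(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String getId() { return id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

// Hypothetical stand-in for the question's Class, keyed by student id.
final class Roster {
    private final Map<String, Student> studentsById = new HashMap<>();

    void add(Student s) { studentsById.put(s.getId(), s); }

    Student find(String id) { return studentsById.get(id); }

    boolean remove(String id) { return studentsById.remove(id) != null; }

    int size() { return studentsById.size(); }
}
```

Because only the String key is hashed, mutating a Student's other fields cannot disturb the map, and Student does not need to override equals or hashCode at all.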




Answer 2:


When it comes to behavior, ArrayList and HashSet are completely different classes.

ArrayList

  • ArrayList does not check for duplicates.
  • get() is O(1).
  • contains() is O(n), but you have full control over the order of the entries.

                          get  add  contains next remove(0) iterator.remove
    ArrayList             O(1) O(1) O(n)     O(1) O(n)      O(n)
    
  • Not thread safe; to make it thread safe, wrap it with Collections.synchronizedList(...).

HashSet

  • HashSet ensures there are no duplicates.
  • Gives you an O(1) contains() method but doesn't preserve order.

                          add      contains next     notes
    HashSet               O(1)     O(1)     O(h/n)   h is the table capacity
    
  • Not thread safe; to make it thread safe, wrap it with Collections.synchronizedSet(...).
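As a rough illustration of the two wrappers mentioned above (the class and method names here are made up for the example): individual calls such as add() are synchronized by the wrapper, but the Collections javadoc requires the caller to hold the wrapper's lock while iterating.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SynchronizedWrappers {
    // Counts elements while holding the wrapper's lock, which the
    // Collections javadoc requires for iteration over a synchronized view.
    static int countSafely(Set<String> syncSet) {
        int count = 0;
        synchronized (syncSet) {
            for (String ignored : syncSet) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        Set<String> set = Collections.synchronizedSet(new HashSet<>());

        // Two threads add the same 1000 values to both collections.
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                list.add("item" + i);
                set.add("item" + i);
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        System.out.println(list.size());       // 2000: the list keeps duplicates
        System.out.println(countSafely(set));  // 1000: the set does not
    }
}
```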



Answer 3:


If your data can contain duplicates, use an ArrayList; otherwise you can use a HashSet, as shown below. If your code doesn't need duplicate values, prefer a Set over a List: the Set will perform much better (O(n) versus O(n^2) for the List when collecting n distinct values), which is expected, because avoiding duplicates is the very purpose of a set.

ArrayList

import java.util.ArrayList;
import java.util.List;

public class ArrayListDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("Hello");
        list.add("is");
        list.add("Hello");
        // ArrayList allows duplicates, so both "Hello" entries are kept.
        System.out.println(list);  // prints [Hello, is, Hello]
    }
}

HashSet

import java.util.HashSet;
import java.util.Set;

public class HashSetDemo {
    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        set.add("Hello");
        set.add("is");
        set.add("Hello");  // already present, so this add is ignored
        // HashSet rejects duplicates, so "Hello" appears only once.
        // Iteration order is not guaranteed.
        System.out.println(set);
    }
}
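The O(n) versus O(n^2) comparison in this answer is easiest to see when removing duplicates from n inputs. The sketch below is one way to read that claim; the class and method names are illustrative, and LinkedHashSet is swapped in for HashSet only so that both methods return elements in the same order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class Dedup {
    // List-only approach: every contains() scans the result so far,
    // so n inputs cost O(n^2) comparisons in the worst case.
    static List<String> withList(List<String> input) {
        List<String> result = new ArrayList<>();
        for (String s : input) {
            if (!result.contains(s)) {
                result.add(s);
            }
        }
        return result;
    }

    // Set-based approach: each insertion is expected O(1),
    // so n inputs cost O(n) overall.
    static List<String> withSet(List<String> input) {
        Set<String> seen = new LinkedHashSet<>(input);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Hello", "is", "Hello");
        System.out.println(withList(words));  // prints [Hello, is]
        System.out.println(withSet(words));   // prints [Hello, is]
    }
}
```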




Answer 4:


It depends. Since you are talking about students, there must be something like an id or roll number that is unique. If so, override the hashCode method and compute the hash code from that id. Changing any other property of a student then has no effect on its hash code.

Whether to choose a Set or a List depends entirely on your requirements. Read this link; it clarifies the difference between Set and List:
What is the difference between Set and List?

And if you are storing objects in a Set, override both hashCode and equals so that control of uniqueness is in your hands.
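A sketch of this approach, assuming a hypothetical EnrolledStudent class with an int rollNo that is unique and never changes (both names are invented for the example):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical student whose identity is its roll number only.
final class EnrolledStudent {
    private final int rollNo;  // assumed unique and immutable
    private String name;       // free to change at any time

    EnrolledStudent(int rollNo, String name) {
        this.rollNo = rollNo;
        this.name = name;
    }

    void setName(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EnrolledStudent)) return false;
        return rollNo == ((EnrolledStudent) o).rollNo;
    }

    @Override
    public int hashCode() { return Integer.hashCode(rollNo); }
}

final class RollNoEqualityDemo {
    // Changing the name does not affect hashCode, so the lookup still works.
    static boolean stillFoundAfterRename() {
        Set<EnrolledStudent> students = new HashSet<>();
        EnrolledStudent s = new EnrolledStudent(42, "Alice");
        students.add(s);
        s.setName("Bob");
        return students.contains(s);
    }

    // A second object with the same roll number counts as a duplicate.
    static boolean duplicateRejected() {
        Set<EnrolledStudent> students = new HashSet<>();
        students.add(new EnrolledStudent(42, "Alice"));
        return !students.add(new EnrolledStudent(42, "Someone else"));
    }
}
```

The trade-off is that two students with the same roll number are considered equal even if every other field differs, which may or may not be what the rest of the code expects.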




Answer 5:


Given your requirements, I think the best structure is a Map. A HashSet is actually backed by a Map internally, and you also need to take care to override the equals method so that lookups work properly. With a Set or an ArrayList, finding a target object requires some search algorithm, so it is not as efficient as you might expect (especially for very large collections). A Map does waste some space, but if your ID is a primitive type, you could consider the primitive-keyed map implementations in the Trove library.




Answer 6:


QUESTION: What data structure should I use to best implement this? Since I will be changing properties of the Student objects in the set students (thereby changing their hash codes), should I use an ArrayList instead?

If you are going to change values used by hashCode or equals, you definitely cannot use HashMap or HashSet.

You say that you want to add and remove a lot. The question is whether you want to do it sequentially or randomly (based on index). If you add and remove sequentially, then the best choice is definitely LinkedList. If you access objects randomly, then ArrayList is much more efficient.
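To make the sequential case concrete, here is a small sketch (names are illustrative) of removing elements during a single pass with an Iterator. The same code works on both list types; the difference is cost, since Iterator.remove() unlinks a node in O(1) on a LinkedList but shifts the rest of the backing array on an ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

final class SequentialRemoval {
    // Removes every name starting with the given prefix in one sequential pass
    // and returns how many elements were removed.
    static int removeByPrefix(List<String> names, String prefix) {
        int removed = 0;
        for (Iterator<String> it = names.iterator(); it.hasNext(); ) {
            if (it.next().startsWith(prefix)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<String> linked = new LinkedList<>(Arrays.asList("Ann", "Bob", "Amy"));
        System.out.println(removeByPrefix(linked, "A"));  // prints 2
        System.out.println(linked);                       // prints [Bob]

        // Random (index-based) access is where ArrayList is the better fit.
        List<String> array = new ArrayList<>(Arrays.asList("Ann", "Bob", "Amy"));
        System.out.println(array.get(1));                 // prints Bob
    }
}
```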




Answer 7:


You should not use a Set when the results of objects' equals methods will change. If you're identifying students by a stable unique ID number, and equals just checks that ID, then using a Set is fine.

Note that HashSet uses hashCode to pick the bucket and equals to compare elements within it, so hashCode should be computed from the same fields that equals uses.




Answer 8:


For a hashed collection such as HashSet, the key should be immutable. HashSet uses hashing internally to decide which bucket stores the object, and when retrieving the object it uses the hash again to find that bucket. If you change the object after storing it, its hash code may change and the Set may not be able to retrieve the correct object. If you need to change the object even after adding it to the collection, a hashed collection is not a good choice. Go for an ArrayList instead, but note that with ArrayList you lose the ability to retrieve the desired Student quickly, as you could with a Set.




Answer 9:


The javadoc for Set says

Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.

So if you are going to use a HashSet, base hashCode() and equals() on immutable fields and you won't have this problem, for example by using a unique studentID for each instance.



Source: https://stackoverflow.com/questions/17985029/hashset-vs-arraylist
