HashSet and ArrayList performance

迷失自我 2020-11-30 02:27

I have implemented a method that simply loops over a set of CSV files containing data on a number of different modules. It then adds each 'moduleName' to a HashSet.
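
For readers following along, here is a minimal sketch of what such a loop might look like. The CSV layout is not shown in the question, so the assumption that the module name is the first comma-separated field, the class name `ModuleNameCollector`, and the sample rows are all hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ModuleNameCollector {

    // Assumption: each CSV line looks like "moduleName,otherField,..." with no
    // quoted commas; the real file layout is not shown in the question.
    static Set<String> collectModuleNames(List<String> csvLines) {
        Set<String> moduleNames = new HashSet<>();
        for (String line : csvLines) {
            if (line.isBlank()) {
                continue; // skip empty lines
            }
            String moduleName = line.split(",", 2)[0].trim();
            moduleNames.add(moduleName); // add() leaves the set unchanged if the name is already present
        }
        return moduleNames;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows standing in for the contents of the CSV files.
        List<String> lines = List.of(
                "Algorithms,2020,72",
                "Databases,2020,65",
                "Algorithms,2021,80");
        System.out.println(collectModuleNames(lines).size()); // prints 2
    }
}
```

Because `add()` silently ignores a name that is already in the set, no explicit duplicate check is needed in the loop.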

4 answers
  • 2020-11-30 02:55

    It depends upon the usage of the data structure.

    You are storing the data in a HashSet, and for storage in your case a HashSet is better than an ArrayList, since you do not want duplicate entries. But storage alone is rarely the whole purpose.

    It also depends on how you want to read and process the stored data. If you need sequential or index-based random access, an ArrayList is better; if ordering does not matter, a HashSet is better.

    If ordering matters but you make a lot of modifications (additions and deletions), then a LinkedList may be the better choice.

    For checking whether a particular element is present, a HashSet has O(1) time complexity, whereas an ArrayList would be O(N): as you pointed out yourself, you would have to iterate through the list to see whether the element is already there.
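
    To make the difference concrete, here is a small sketch of both ways to collect unique names. The class and method names (`UniqueNames`, `uniqueWithList`, `uniqueWithSet`) are made up for illustration and are not from the question.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UniqueNames {

    // List version: contains() scans the list, so every insert costs O(n).
    static List<String> uniqueWithList(List<String> names) {
        List<String> unique = new ArrayList<>();
        for (String name : names) {
            if (!unique.contains(name)) {
                unique.add(name);
            }
        }
        return unique;
    }

    // Set version: the set hashes each name and rejects duplicates itself,
    // which is O(1) per element on average.
    static Set<String> uniqueWithSet(List<String> names) {
        return new HashSet<>(names);
    }

    public static void main(String[] args) {
        List<String> names = List.of("a", "b", "a");
        System.out.println(uniqueWithList(names));        // prints [a, b]
        System.out.println(uniqueWithSet(names).size());  // prints 2
    }
}
```

    The list version keeps first-seen order but does quadratic work overall; the set version does linear work overall but gives no ordering guarantee.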

  • 2020-11-30 02:56

    My experiment shows that a HashSet is faster than an ArrayList for collections of 3 or more elements.

    Complete results table:

    | Boost (HashSet over ArrayList) | Collection size  |
    |   2x                           |       3 elements |
    |   3x                           |      10 elements |
    |   6x                           |      50 elements |
    |  12x                           |     200 elements |
    | 532x                           |  10,000 elements |

    Between the last two rows the boost grows about 44-fold (12x to 532x) while the collection grows 50-fold (200 to 10,000 elements), which shows the linear lookup growth for the ArrayList.
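
    The code behind these numbers is not shown, so the following is only a rough sketch of how such a comparison could be timed. The workload size, the choice to probe only elements that are present, and the names `ContainsTiming` and `countHits` are all assumptions. Hand-rolled timing like this is noisy; a harness such as JMH is the usual choice for numbers you want to rely on.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ContainsTiming {

    // Calls contains() for every probe, `rounds` times over, and returns the hit count.
    // Using the returned count helps keep the JIT from treating the lookups as dead code.
    static int countHits(Collection<Integer> collection, List<Integer> probes, int rounds) {
        int hits = 0;
        for (int r = 0; r < rounds; r++) {
            for (Integer probe : probes) {
                if (collection.contains(probe)) {
                    hits++;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int totalLookups = 100_000; // assumption: arbitrary workload size
        for (int size : new int[] {3, 10, 50, 200, 10_000}) {
            List<Integer> list = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                list.add(i);
            }
            Set<Integer> set = new HashSet<>(list);
            int rounds = totalLookups / size;

            // Warm-up pass so the hot loop is likely compiled before timing starts.
            countHits(list, list, rounds);
            countHits(set, list, rounds);

            long t0 = System.nanoTime();
            int listHits = countHits(list, list, rounds);
            long t1 = System.nanoTime();
            int setHits = countHits(set, list, rounds);
            long t2 = System.nanoTime();

            System.out.printf("size=%d  list=%d us  set=%d us  (hits %d/%d)%n",
                    size, (t1 - t0) / 1_000, (t2 - t1) / 1_000, listHits, setHits);
        }
    }
}
```

    The absolute times will differ from machine to machine; only the trend across sizes is meaningful.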
    
  • 2020-11-30 02:59

    I believe using the hash set has a better performance than an array list. Am I correct in stating that?

    With many entries (whatever "many" means), yes. With small data sizes, though, a raw linear search could be faster than hashing. Where exactly the break-even point lies is something you just have to measure. My gut feeling is that with fewer than 10 elements a linear look-up is probably faster, and with more than 100 elements hashing is probably faster, but that's just my feeling...

    Lookup from a HashSet is constant time, O(1), provided that the hashCode implementation of the elements is sane. Linear look-up from a list is linear time, O(n).
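
    To illustrate why the "sane hashCode" condition matters, here is a sketch that counts how many equals() calls a failed lookup needs. The `Key` class and its `constantHash` switch are hypothetical, made up purely to simulate a poor hashCode(); the counts described below are what I would expect from OpenJDK's HashMap-backed HashSet.

```java
import java.util.HashSet;
import java.util.Set;

final class HashQualityDemo {

    static int equalsCalls = 0; // counts equals() invocations on Key objects

    // Hypothetical key type; `constantHash` simulates a poor hashCode().
    static final class Key {
        final int id;
        final boolean constantHash;

        Key(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof Key && ((Key) other).id == id;
        }

        @Override
        public int hashCode() {
            // A constant hash puts every key into the same bucket.
            return constantHash ? 42 : Integer.hashCode(id);
        }
    }

    // Builds a set of n keys, looks up a key that is absent, and returns
    // how many equals() calls that single lookup needed.
    static int equalsCallsForMiss(int n, boolean constantHash) {
        Set<Key> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new Key(i, constantHash));
        }
        equalsCalls = 0; // ignore comparisons made while filling the set
        set.contains(new Key(-1, constantHash));
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("sane hashCode:     " + equalsCallsForMiss(1000, false));
        System.out.println("constant hashCode: " + equalsCallsForMiss(1000, true));
    }
}
```

    With distinct hash codes the lookup can reject every candidate by hash alone, while with a constant hash code it has to compare against the stored keys one by one, which is the O(n) behaviour the set was supposed to avoid.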

  • 2020-11-30 03:00

    They're completely different classes, so the question is: what kind of behaviour do you want?

    HashSet ensures there are no duplicates and gives you an O(1) contains() method, but it doesn't preserve order.
    ArrayList doesn't ensure there are no duplicates and its contains() is O(n), but you can control the order of the entries.
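
    A short sketch of that behavioural difference, using made-up module names and hypothetical helper methods (`inInsertionOrder`, `distinct`):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DuplicatesAndOrder {

    // ArrayList keeps every element, in the order it was added.
    static List<String> inInsertionOrder(String... names) {
        return new ArrayList<>(Arrays.asList(names));
    }

    // HashSet keeps one copy of each element and makes no promise about order.
    static Set<String> distinct(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        List<String> list = inInsertionOrder("Maths", "Physics", "Maths", "Chemistry");
        Set<String> set = distinct("Maths", "Physics", "Maths", "Chemistry");

        System.out.println(list);        // prints [Maths, Physics, Maths, Chemistry]
        System.out.println(list.get(1)); // prints Physics (index access works on a list)
        System.out.println(set.size());  // prints 3 (the duplicate "Maths" is dropped)
        System.out.println(set);         // iteration order is unspecified
    }
}
```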
