From the JavaDocs of HashSet:
This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the ha
If your concern is the time it takes to iterate over the set, and you are using Java 6 or later, take a look at this beauty:
ConcurrentSkipListSet
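A minimal sketch of what iterating such a set looks like. The class and method names (SkipListSetDemo, iterationOrder) are made up for illustration. Keep in mind that ConcurrentSkipListSet is a sorted set, so its iterator walks the elements in ascending order with no empty buckets to skip, but add, remove and contains have expected log(n) cost rather than constant time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical demo class, not part of any library.
class SkipListSetDemo {
    // Inserts the values into a ConcurrentSkipListSet and returns them
    // in the order the set's iterator yields them (ascending).
    static List<Integer> iterationOrder(int... values) {
        ConcurrentSkipListSet<Integer> set = new ConcurrentSkipListSet<>();
        for (int v : values) {
            set.add(v);
        }
        // Copying into a list preserves the set's iteration order.
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(iterationOrder(42, 7, 19)); // prints [7, 19, 42]
    }
}
```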
Iterating over a LinkedHashSet follows the linked list of entries, so the number of empty buckets doesn't matter. Normally you wouldn't have a HashSet whose capacity is much more than double the number of elements actually stored. Even if you do, scanning a million entries, most of them null, doesn't take much time (milliseconds).
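If you want to see this for yourself, here is a rough sketch that puts 10 elements into a HashSet and a LinkedHashSet, both created with an initial capacity of about a million, and times one pass over each. The class name and helper methods are invented for this example, and a single System.nanoTime measurement without JIT warm-up is only a ballpark figure, so treat the printed numbers as indicative at best.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class; the sizes below are arbitrary assumptions.
class SparseSetIteration {
    // Adds the integers 0..9 to the given set and returns it.
    static Set<Integer> fill(Set<Integer> set) {
        for (int i = 0; i < 10; i++) {
            set.add(i);
        }
        return set;
    }

    // Iterates over the whole set once and returns the sum of its elements.
    static long sumAll(Set<Integer> set) {
        long sum = 0;
        for (int value : set) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int capacity = 1 << 20; // roughly a million buckets, almost all empty

        Set<Integer> sparseHash = fill(new HashSet<>(capacity));
        Set<Integer> sparseLinked = fill(new LinkedHashSet<>(capacity));

        long t0 = System.nanoTime();
        long hashSum = sumAll(sparseHash);     // has to scan the bucket array
        long t1 = System.nanoTime();
        long linkedSum = sumAll(sparseLinked); // follows the entry links only
        long t2 = System.nanoTime();

        System.out.println("HashSet:       sum=" + hashSum + " in " + (t1 - t0) + " ns");
        System.out.println("LinkedHashSet: sum=" + linkedSum + " in " + (t2 - t1) + " ns");
    }
}
```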
HashSet is implemented using a HashMap where the elements are the map keys. Since a map has a defined number of buckets, each of which can hold zero or more elements, iteration needs to check every bucket, whether it contains elements or not.
Why does iteration take time proportional to the sum (number of elements in the set + capacity of the backing map) and not only to the number of elements in the set itself?
The elements are dispersed inside the underlying HashMap, which is backed by an array. It is not known in advance which buckets are occupied (only the total number of elements is known), so to iterate over all elements, every bucket must be checked.
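To make the "size + capacity" cost concrete, here is a deliberately simplified model of a bucket array. It is not the real java.util.HashMap code; the class BucketScanSketch and its methods are invented for illustration, and it only counts the work an iterator would have to do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified hash table model; NOT the actual HashMap implementation.
class BucketScanSketch {
    // Each slot is either null (empty bucket) or the elements that hashed to it.
    private final List<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    BucketScanSketch(int capacity) {
        buckets = new List[capacity];
    }

    // Places the value in the bucket chosen by its hash code; duplicates are ignored.
    void add(int value) {
        int index = Math.floorMod(Integer.hashCode(value), buckets.length);
        if (buckets[index] == null) {
            buckets[index] = new ArrayList<>();
        }
        if (!buckets[index].contains(value)) {
            buckets[index].add(value);
        }
    }

    // Counts one step per bucket inspected plus one step per element visited.
    long iterationSteps() {
        long steps = 0;
        for (List<Integer> bucket : buckets) {
            steps++; // every bucket is inspected, even when it is empty
            if (bucket != null) {
                steps += bucket.size();
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        BucketScanSketch set = new BucketScanSketch(1_000);
        set.add(1);
        set.add(2);
        set.add(3);
        System.out.println(set.iterationSteps()); // prints 1003 (1000 buckets + 3 elements)
    }
}
```

With only 3 elements the walk still costs 1003 steps, because the empty slots cannot be skipped without looking at them.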