I've always loved trees, that nice O(n*log(n))
and the tidiness of them. However, every software engineer I've ever known has asked me pointedly why I would u
1. HashSet allows a null element.
2. TreeSet with natural ordering does not allow null. If you try to add a null value, it throws a NullPointerException.
3. HashSet is much faster than TreeSet.
e.g.
TreeSet<String> ts = new TreeSet<String>();
ts.add(null); // throws NullPointerException
HashSet<String> hs = new HashSet<String>();
hs.add(null); // runs fine
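As a hedged aside, the NullPointerException comes from the natural ordering of String, not from TreeSet as such. The sketch below assumes Java 8 or later and uses a made-up class name; it shows that a TreeSet built with a null-tolerant comparator (Comparator.nullsFirst) accepts null, while the naturally ordered one rejects it.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical demo class; the name is illustrative only.
class NullTolerantTreeSetDemo {

    // Returns true if a naturally ordered TreeSet rejects null.
    static boolean naturalOrderRejectsNull() {
        TreeSet<String> ts = new TreeSet<>();
        try {
            ts.add(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Builds a TreeSet whose comparator sorts null before every non-null string.
    static TreeSet<String> buildWithNull() {
        TreeSet<String> ts =
                new TreeSet<>(Comparator.nullsFirst(Comparator.<String>naturalOrder()));
        ts.add("b");
        ts.add(null);
        ts.add("a");
        return ts;
    }

    public static void main(String[] args) {
        System.out.println(naturalOrderRejectsNull()); // true on Java 7 and later
        System.out.println(buildWithNull());           // [null, a, b]
    }
}
```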
A lot of answers have been given based on technical considerations, especially around performance.
In my view, the choice between TreeSet and HashSet matters.
But I would rather say the choice should be driven by conceptual considerations first.
If a natural ordering does not make sense for the objects you need to manipulate, then do not use TreeSet.
It is a sorted set, since it implements SortedSet. This means you need to implement compareTo (or supply a Comparator), and the result should be consistent with what equals returns. For example, if you have a set of objects of a class called Student, then I do not think a TreeSet would make sense, since there is no natural ordering between students. You can order them by their average grade, okay, but this is not a "natural ordering". compareTo would return 0 not only when two objects represent the same student, but also when two different students have the same grade. In the second case, equals would return false (unless you decide to make it return true when two different students have the same grade, which would give equals a misleading meaning, not to say a wrong one).
Please note that this consistency between equals and compareTo is optional, but strongly recommended. Otherwise the set no longer obeys the general contract of the Set interface, which makes your code misleading to other people and can lead to unexpected behavior.
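To make the Student example concrete, here is a minimal sketch. The Student class, its fields, and the sample names and grades are all hypothetical. Its compareTo orders by average grade while equals compares names, so the two are deliberately inconsistent, and the TreeSet silently drops the second student with the same grade.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; Student and its fields are assumptions for illustration.
class StudentOrderingDemo {

    static final class Student implements Comparable<Student> {
        final String name;
        final double averageGrade;

        Student(String name, double averageGrade) {
            this.name = name;
            this.averageGrade = averageGrade;
        }

        // Identity is the name...
        @Override
        public boolean equals(Object o) {
            return o instanceof Student && ((Student) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }

        // ...but ordering is by grade, which is inconsistent with equals.
        @Override
        public int compareTo(Student other) {
            return Double.compare(averageGrade, other.averageGrade);
        }
    }

    // Adds two different students with the same grade and returns the set size.
    static int sizeAfterAddingTwo(Set<Student> set) {
        set.add(new Student("Alice", 15.0));
        set.add(new Student("Bob", 15.0));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingTwo(new TreeSet<>())); // 1: Bob is treated as a duplicate
        System.out.println(sizeAfterAddingTwo(new HashSet<>())); // 2: equals and hashCode tell them apart
    }
}
```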
This link might be a good source of information regarding this question.
HashSet is much faster than TreeSet (constant-time versus log-time for most operations like add, remove and contains) but offers no ordering guarantees like TreeSet.
TreeSet also offers a few handy methods for working with the ordered set, such as last(), headSet(), and tailSet().
LinkedHashSet sits between HashSet and TreeSet. It is implemented as a hash table with a linked list running through it; however, it provides insertion-ordered iteration, which is not the same as the sorted traversal guaranteed by TreeSet. So the choice depends entirely on your needs, but I feel that even if you need an ordered collection, you should still prefer HashSet to create the Set and then convert it into a TreeSet.
SortedSet<String> s = new TreeSet<String>(hashSet);
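A small sketch of that conversion, together with a few of the SortedSet methods, might look like this. The class name and the sample strings are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class; the name and sample data are illustrative only.
class SortedViewDemo {

    // Copies any set into a TreeSet, sorting by the natural ordering of String.
    static SortedSet<String> sortedCopy(Set<String> source) {
        return new TreeSet<>(source);
    }

    public static void main(String[] args) {
        Set<String> hashSet = new HashSet<>(Arrays.asList("pear", "apple", "fig", "kiwi"));
        SortedSet<String> s = sortedCopy(hashSet);

        System.out.println(s);                 // [apple, fig, kiwi, pear]
        System.out.println(s.first());         // apple
        System.out.println(s.last());          // pear
        System.out.println(s.headSet("kiwi")); // [apple, fig]  (elements strictly before "kiwi")
        System.out.println(s.tailSet("kiwi")); // [kiwi, pear]  (elements from "kiwi" onward)
    }
}
```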
HashSet implementations are, of course, much much faster -- less overhead because there's no ordering. A good analysis of the various Set implementations in Java is provided at http://java.sun.com/docs/books/tutorial/collections/implementations/set.html.
The discussion there also points out an interesting 'middle ground' approach to the Tree vs Hash question. Java provides a LinkedHashSet, which is a HashSet with an "insertion-oriented" linked list running through it, that is, the last element in the linked list is also the most recently inserted into the Hash. This allows you to avoid the unruliness of an unordered hash without incurring the increased cost of a TreeSet.
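To illustrate the difference in iteration order, here is a minimal sketch with a made-up class name and sample words. LinkedHashSet returns elements in the order they were first inserted, TreeSet returns them sorted, and HashSet makes no promise at all.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; the name and sample data are illustrative only.
class IterationOrderDemo {

    // Inserts the same words into the given set and returns its iteration order.
    static List<String> iterate(Set<String> target) {
        for (String word : new String[] {"pear", "apple", "fig", "apple"}) {
            target.add(word); // the second "apple" is a duplicate and is ignored
        }
        return new ArrayList<>(target);
    }

    public static void main(String[] args) {
        System.out.println(iterate(new LinkedHashSet<>())); // [pear, apple, fig]
        System.out.println(iterate(new TreeSet<>()));       // [apple, fig, pear]
        System.out.println(iterate(new HashSet<>()));       // some unspecified order
    }
}
```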
One advantage not yet mentioned of a TreeSet is that it has greater "locality", which is shorthand for saying (1) if two entries are nearby in the order, a TreeSet places them near each other in the data structure, and hence in memory; and (2) this placement takes advantage of the principle of locality, which says that similar data is often accessed by an application with similar frequency.
This is in contrast to a HashSet, which spreads the entries all over memory, no matter what their keys are.
When the latency cost of reading from a hard drive is thousands of times the cost of reading from cache or RAM, and when the data really is accessed with locality, the TreeSet can be a much better choice.
Even after 11 years, nobody thought of mentioning a very important difference.
Do you think that if a HashSet equals a TreeSet, then the opposite is true as well? Take a look at this code:
TreeSet<String> treeSet = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
HashSet<String> hashSet = new HashSet<>();
treeSet.add("a");
hashSet.add("A");
System.out.println(hashSet.equals(treeSet));
System.out.println(treeSet.equals(hashSet));
Try to guess the output before reading on. Ready? Here you go:
false
true
That's right, equals is not symmetric between the two sets when the comparator is inconsistent with equals. The reason is that a TreeSet uses its comparator to determine equivalence, while a HashSet uses equals. Internally they are backed by TreeMap and HashMap respectively, so you should expect this behavior with the mentioned Maps as well.
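A quick sketch of the same effect on the map side, using a made-up class name: a TreeMap built with String.CASE_INSENSITIVE_ORDER finds a key that differs only in case, while a HashMap does not.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the name is illustrative only.
class MapLookupDemo {

    // Stores "a" and reports whether a lookup for "A" succeeds.
    static boolean findsUpperCaseKey(Map<String, Integer> map) {
        map.put("a", 1);
        return map.containsKey("A");
    }

    public static void main(String[] args) {
        // The comparator treats "a" and "A" as the same key.
        System.out.println(findsUpperCaseKey(new TreeMap<>(String.CASE_INSENSITIVE_ORDER))); // true
        // equals and hashCode treat them as different keys.
        System.out.println(findsUpperCaseKey(new HashMap<>())); // false
    }
}
```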