While reading the Java Object Ordering tutorial, I got a little confused by the last section of the article, 'Comparators'. The tutorial defines a class Employee that is itself Comparable by the employee's name, but it doesn't show whether this class overrides the equals method. It then uses a custom Comparator, which orders employees by seniority, to sort a list of employees; that part I understand. Then the tutorial explains why this won't work for a sorted collection such as TreeSet (a SortedSet), and the reason is:
it generates an ordering that is not compatible with equals. This means that this Comparator equates objects that the equals method does not. In particular, any two employees who were hired on the same date will compare as equal. When you're sorting a List, this doesn't matter; but when you're using the Comparator to order a sorted collection, it's fatal. If you use this Comparator to insert multiple employees hired on the same date into a TreeSet, only the first one will be added to the set; the second will be seen as a duplicate element and will be ignored.
Now I'm confused, since I know that List allows duplicate elements while Set doesn't, based on the equals method. So I wonder when the tutorial says the ordering generated by the Comparator is not compatible with equals, what does it mean? It also says 'If you use this Comparator to insert multiple employees hired on the same date into a TreeSet, only the first one will be added to the set; the second will be seen as a duplicate element and will be ignored.' I don't understand how using a Comparator affects the use of the original equals method. I think my question is how the TreeSet will be produced and sorted in this case and when the compare and equals methods are used.
So I wonder when the tutorial says the ordering generated by the Comparator is not compatible with equals, what does it mean?
In this example, the Comparator compares two Employee objects based on their seniority alone. This comparison does not in any way use equals or hashCode. Keeping that in mind, when we pass this Comparator to a TreeSet, the set will treat any result of 0 from the Comparator as equality. Therefore, if any Employees share a start date, only one of them will be added, because the set thinks they are equal.
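To make this concrete, here is a minimal sketch. The Employee class below is a hypothetical stand-in (the tutorial's real class has more fields), and names such as SeniorityDemo and SENIORITY_ORDER are made up for illustration. It sorts a List and fills a TreeSet with the same comparator:

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SeniorityDemo {
    // Hypothetical, stripped-down Employee; equals/hashCode are NOT overridden,
    // so equals is plain reference equality.
    static final class Employee {
        final String name;
        final LocalDate hireDate;

        Employee(String name, LocalDate hireDate) {
            this.name = name;
            this.hireDate = hireDate;
        }
    }

    // Orders by hire date only: returns 0 for any two employees hired on the same day.
    static final Comparator<Employee> SENIORITY_ORDER =
            Comparator.comparing((Employee e) -> e.hireDate);

    static List<Employee> sampleEmployees() {
        LocalDate sameDay = LocalDate.of(2010, 1, 4);
        return Arrays.asList(
                new Employee("Alice", sameDay),
                new Employee("Bob", sameDay), // same hire date as Alice
                new Employee("Carol", LocalDate.of(2012, 6, 1)));
    }

    // Sorting a List only reorders it; no element is ever dropped.
    static int sortedListSize() {
        List<Employee> list = new ArrayList<>(sampleEmployees());
        list.sort(SENIORITY_ORDER);
        return list.size();
    }

    // The TreeSet treats compare(...) == 0 as "already present", so Bob is rejected.
    static int treeSetSize() {
        Set<Employee> set = new TreeSet<>(SENIORITY_ORDER);
        set.addAll(sampleEmployees());
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("List size:    " + sortedListSize()); // 3
        System.out.println("TreeSet size: " + treeSetSize());    // 2
    }
}
```

Alice and Bob are different objects and are not equal according to equals, yet the set keeps only the first of them, which is exactly the situation the tutorial warns about.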
Finally:
I think my question is how the TreeSet will be produced and sorted in this case and when the compare and equals methods are used.
For the TreeSet, if a Comparator is given, it uses the compare method to determine both the equality and the ordering of objects. If no Comparator is given, the set uses the compareTo method of the objects being sorted (they must implement Comparable).
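As a small illustration of the two modes, the following sketch uses plain Strings instead of Employee objects (my choice, to keep it short). The class name OrderingSourceDemo is made up; String.CASE_INSENSITIVE_ORDER is a standard JDK comparator.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class OrderingSourceDemo {
    static final List<String> WORDS = Arrays.asList("apple", "Apple", "banana");

    // No Comparator: the set falls back to String.compareTo (natural ordering).
    static TreeSet<String> withNaturalOrdering() {
        TreeSet<String> set = new TreeSet<>();
        set.addAll(WORDS);
        return set;
    }

    // Comparator supplied: the set uses compare(...) for ordering AND for duplicate detection.
    static TreeSet<String> withComparator() {
        TreeSet<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(WORDS);
        return set;
    }

    public static void main(String[] args) {
        System.out.println(withNaturalOrdering()); // [Apple, apple, banana]
        System.out.println(withComparator());      // [apple, banana]
    }
}
```

In the second set, "Apple" is rejected because the comparator returns 0 when comparing it with "apple", even though "apple".equals("Apple") is false.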
The Java documentation says that the compare/compareTo method being used must be consistent with equals because the Set interface is specified in terms of equals, even though this specific type of Set, the TreeSet, uses comparisons instead.
If you ever receive a Set from some method implementation, you can expect that there are no duplicates of the objects in that Set, as defined by the equals method. Because TreeSet doesn't use this method, however, developers must be careful to ensure that the comparison method results in the same equality as equals does.
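One common way to get there is to add a tie-breaker on a field that equals also uses. The sketch below assumes a hypothetical Employee whose identity is a unique id field (the tutorial uses an employee number for the same purpose); the class and field names are mine.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class ConsistentOrderDemo {
    // Hypothetical Employee whose equality is defined by a unique id.
    static final class Employee {
        final int id;
        final String name;
        final LocalDate hireDate;

        Employee(int id, String name, LocalDate hireDate) {
            this.id = id;
            this.name = name;
            this.hireDate = hireDate;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Employee && ((Employee) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    // Hire date first, then id as a tie-breaker: compare(a, b) == 0 exactly when a.equals(b).
    static final Comparator<Employee> SENIORITY_ORDER =
            Comparator.comparing((Employee e) -> e.hireDate)
                      .thenComparingInt(e -> e.id);

    static Set<Employee> buildSet() {
        LocalDate sameDay = LocalDate.of(2010, 1, 4);
        Set<Employee> set = new TreeSet<>(SENIORITY_ORDER);
        set.add(new Employee(1, "Alice", sameDay));
        set.add(new Employee(2, "Bob", sameDay)); // same date, different id: kept
        set.add(new Employee(3, "Carol", LocalDate.of(2012, 6, 1)));
        return set;
    }

    public static void main(String[] args) {
        Set<Employee> set = buildSet();
        System.out.println(set.size()); // 3
        // A true duplicate (same id, same date) is still rejected.
        System.out.println(set.add(new Employee(1, "Alice", LocalDate.of(2010, 1, 4)))); // false
    }
}
```

With the tie-breaker in place, the comparator only reports 0 for employees that equals also considers the same, so the TreeSet behaves the way the Set contract promises.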
A TreeSet uses only the Comparator to determine if two elements are "equal":
https://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
This means that the Comparator should return 0 if and only if equals returns true, so that TreeSet behaves consistently with other sets, such as HashSet. HashSet does indeed use equals and the hash code to determine if two elements are "equal".
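To see the two kinds of set disagree, here is a short sketch using BigDecimal, a JDK class whose natural ordering is known to be inconsistent with equals (1.0 and 1.00 compare as 0 but are not equal because their scales differ). The class name SetComparisonDemo is made up.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetComparisonDemo {
    static final List<BigDecimal> VALUES =
            Arrays.asList(new BigDecimal("1.0"), new BigDecimal("1.00"));

    // HashSet relies on hashCode and equals: the two values are not equal, so both stay.
    static Set<BigDecimal> asHashSet() {
        return new HashSet<>(VALUES);
    }

    // TreeSet relies on compareTo: the two values compare as 0, so only the first stays.
    static Set<BigDecimal> asTreeSet() {
        return new TreeSet<>(VALUES);
    }

    public static void main(String[] args) {
        System.out.println(asHashSet().size()); // 2
        System.out.println(asTreeSet().size()); // 1
    }
}
```

The same elements produce sets of different sizes, which is the inconsistency the Javadoc paragraph quoted above describes.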