What does comparison being consistent with equals mean ? What can possibly happen if my class doesn't follow this principle?

后端 未结 4 1315
臣服心动
臣服心动 2020-11-30 04:24

From the JavaDoc of TreeMap :

Note that the ordering maintained by a sorted map (whether or not an explicit comparator is provided) must be consiste

相关标签:
4条回答
  • 2020-11-30 04:44

    The contract of the Comparable interface allows for non-consistent behaviour:

    It is strongly recommended (though not required) that natural orderings be consistent with equals.

    So in theory, it is possible that a class in the JDK had a compareTo not consistent with equals. One good example is BigDecimal.

    Below is a contrived example of a comparator that is not consistent with equals (it basically says that all strings are equal).

    Output:

    size: 1
    content: {a=b}

    public static void main(String[] args) {
        Map<String, String> brokenMap = new TreeMap<String, String> (new Comparator<String>() {
    
            @Override
            public int compare(String o1, String o2) {
                return 0;
            }
        });
    
        brokenMap.put("a", "a");
        brokenMap.put("b", "b");
        System.out.println("size: " + brokenMap.size());
        System.out.println("content: " + brokenMap);
    }
    
    0 讨论(0)
  • 2020-11-30 04:45

    Say we have this simple Student class implementing Comparable<Student> but not overriding equals()/hashCode(). Of course equals() is not consistent with compareTo() - two different students with the same age aren't equal:

    class Student implements Comparable<Student> {
    
        private final int age;
    
        Student(int age) {
            this.age = age;
        }
    
        @Override
        public int compareTo(Student o) {
            return this.age - o.age;
        }
    
        @Override
        public String toString() {
            return "Student(" + age + ")";
        }
    }
    

    We can safely use it in TreeMap<Student, String>:

    Map<Student, String> students = new TreeMap<Student, String>();
    students.put(new Student(25), "twenty five");
    students.put(new Student(22), "twenty two");
    students.put(new Student(26), "twenty six");
    for (Map.Entry<Student, String> entry : students.entrySet()) {
        System.out.println(entry);
    }
    System.out.println(students.get(new Student(22)));
    

    The results are easy to predict: students are nicely sorted according to their age (despite being inserted in different order) and fetching student using new Student(22) key works as well and returns "twenty two". This means we can safely use Student class in TreeMap.

    However change students to HashMap and things go bad:

    Map<Student, String> students = new HashMap<Student, String>();
    

    Obviously the enumeration of items returns "random" order due to hashing - that's fine, it doesn't violate any Map contract. But the last statement is completely broken. Because HashMap uses equals()/hashCode() to compare instances, fetching value by new Student(22) key fails and returns null!

    This is what the JavaDoc tries to explain: such classes will work with TreeMap but might fail to work with other Map implementations. Note that Map operations are documented and defined in terms of equals()/hashCode(), e.g. containsKey():

    [...] returns true if and only if this map contains a mapping for a key k such that (key==null ? k==null : key.equals(k))

    Thus I don't believe there are any standard JDK classes that implemente Comparable but fail to implement equals()/hashCode() pair.

    0 讨论(0)
  • 2020-11-30 04:49

    Here's a simple but realistic example of what can happen if a comparison method is inconsistent with equals. In the JDK, BigDecimal implements Comparable but its comparison method is inconsistent with equals. For example:

    > BigDecimal z = new BigDecimal("0.0")
    > BigDecimal zz = new BigDecimal("0.00")
    > z.compareTo(zz)
    0
    > z.equals(zz)
    false
    

    This is because the comparison method of BigDecimal considers only the numeric value, but equals also considers the precision. Since 0.0 and 0.00 have different precisions, they are unequal even though they have the same numeric value.

    Here's an example of what it means for a TreeSet to violate the general contract of Set. (It's the same situation with TreeMap and Map but it's a bit easier to demonstrate using sets.) Let's compare the results of contains to the result of getting the element out of the set and calling equals:

    > TreeSet<BigDecimal> ts = new TreeSet<>()
    > ts.add(z)
    > ts.contains(z)
    true
    > z.equals(ts.iterator().next())
    true
    > ts.contains(zz)
    true
    > zz.equals(ts.iterator().next())
    false
    

    The surprising thing here is that the TreeSet says it contains the object zz, but it's unequal to the element that's actually contained in the set. The reason is that TreeSet uses its comparison method (BigDecimal.compareTo) to determine set membership, not equals.

    Now let's compare TreeSet to HashSet:

    > HashSet<BigDecimal> hs = new HashSet<>(ts)
    > hs.equals(ts)
    true
    > ts.contains(zz)
    true
    > hs.contains(zz)
    false
    

    This is strange. We have two sets that are equal, but one set says that it contains an object but another set says that it doesn't contain the same object. Again, this reflects the fact that TreeSet is using the comparison method whereas HashSet is using equals.

    Now let's add the other object to a HashSet and see what happens:

    > HashSet<BigDecimal> hs2 = new HashSet<>()
    > hs2.add(zz)
    > ts.equals(hs2)
    true
    > hs2.equals(ts)
    false
    

    Now that's weird. One set says it's equal to the other, but the other set says it's not equal to the first! To understand this, you need to understand how equality of sets is determined. Two sets are considered equal if a) they are of the same size, and b) each element in the other set is also contained in this set. That is, if you have

    set1.equals(set2)
    

    then the equality algorithm looks at the sizes and then it iterates over set2, and for each element it checks whether that element is contained in set1. That's where the asymmetry comes in. When we do

    ts.equals(hs2)
    

    both sets are of size 1, so we proceed to the iteration step. We iterate over hs2 and use then call the TreeSet.contains method -- which uses the comparison method. As far as the TreeSet is concerned, it's equal to the HashSet hs2.

    Now when we do

    hs2.equals(ts)
    

    the comparison goes the other way. We iterate over the TreeSet and get its element, and ask hs2 whether it contains that element. Since the HashSet.contains uses equals, it returns false, and the overall result is false.

    0 讨论(0)
  • 2020-11-30 05:00

    Here is another example of when consistency with equals AND total ordering are important to implement.

    Say we have an object MyObject which has two fields: id and quantity. id as its name suggests is the natural key of the object and quantity is just an attribute.

    public class MyObject {
      int id;
      int quantity;
      ...
    }
    

    Let's imagine that we want to use a collections of MyObject sorted by quantity descending. The First comparator we can write is:

    Comparator<MyObject> naiveComp = new Comparator<MyObject>() {
      @Override
      public int compare(MyObject o1, MyObject o2) {
        return o2.quantity - o1.quantity;
      }
    };
    

    Using MyObject instances equipped with this comparator in a TreeMap/TreeSet fails because it comparator is not consistent with equals (see full code below). Let's make it consistent with equals:

    Comparator<MyObject> slightlyBetterComp = new Comparator<MyObject>() {
      @Override
      public int compare(MyObject o1, MyObject o2) {
        if (o1.equals(o2)) {
          return 0;
        }
        if (o1.quantity == o2.quantity) {
          return -1; // never 0
        }
        return o2.quantity - o1.quantity; // never 0
      }
    };
    

    However, this fails again to fit in TreeSet/TreeMap! (see full code below) This is because the ordering relation is not total, i.e. not any two objects can be strictly put in an ordering relationship. In this comparator, when quantity fields are equal, the resulting ordering is undetermined.

    A better comparator would be:

    Comparator<MyObject> betterComp = new Comparator<MyObject>() {
      @Override
      public int compare(MyObject o1, MyObject o2) {
        if (o1.equals(o2)) {
          return 0;
        }
        if (o1.quantity == o2.quantity) {
          return o1.id - o2.id; // never 0
        }
        return o2.quantity - o1.quantity; // never 0
      }
    };
    

    This comparator ensures that:

    • when compareTo returns 0 it implies that two objects are equal (initial check for equality)
    • all items are totally ordered by using id as a discriminant ordering field when quantity are equal

    Full Testing Code:

    package treemap;
    
    import java.util.Comparator;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.TreeMap;
    
    public class MyObject {
      int id;
      int quantity;
    
      public MyObject(int id, int quantity) {
        this.id = id;
        this.quantity = quantity;
      }
    
      @Override
      public int hashCode() {
        int hash = 7;
        hash = 97 * hash + this.id;
        return hash;
      }
    
      @Override
      public boolean equals(Object obj) {
        if (obj == null) {
          return false;
        }
        if (getClass() != obj.getClass()) {
          return false;
        }
        final MyObject other = (MyObject) obj;
        if (this.id != other.id) {
          return false;
        }
        return true;
      }
    
      @Override
      public String toString() {
        return "{" + id + ", " + quantity + "}";
      }
    
      public static void main(String[] args) {
        String format = "%30.30s: %s\n";
        Map<MyObject, Object> map = new HashMap();
        map.put(new MyObject(1, 100), 0);
        map.put(new MyObject(2, 100), 0);
        map.put(new MyObject(3, 200), 0);
        map.put(new MyObject(4, 100), 0);
        map.put(new MyObject(5, 500), 0);
        System.out.printf(format, "Random Order", map.keySet());
    
        // Naive non-consisten-with-equal and non-total comparator
        Comparator<MyObject> naiveComp = new Comparator<MyObject>() {
          @Override
          public int compare(MyObject o1, MyObject o2) {
            return o2.quantity - o1.quantity;
          }
        };
        Map<MyObject, Object> badMap = new TreeMap(naiveComp);
        badMap.putAll(map);
        System.out.printf(format, "Non Consistent and Non Total", badMap.keySet());
    
        // Better consisten-with-equal but non-total comparator
        Comparator<MyObject> slightlyBetterComp = new Comparator<MyObject>() {
          @Override
          public int compare(MyObject o1, MyObject o2) {
            if (o1.equals(o2)) {
              return 0;
            }
            if (o1.quantity == o2.quantity) {
              return -1; // never 0
            }
            return o2.quantity - o1.quantity; // never 0
          }
        };
        Map<MyObject, Object> slightlyBetterMap = new TreeMap(naiveComp);
        slightlyBetterMap.putAll(map);
        System.out.printf(format, "Non Consistent but Total", slightlyBetterMap.keySet());
    
        // Consistent with equal AND total comparator
        Comparator<MyObject> betterComp = new Comparator<MyObject>() {
          @Override
          public int compare(MyObject o1, MyObject o2) {
            if (o1.equals(o2)) {
              return 0;
            }
            if (o1.quantity == o2.quantity) {
              return o1.id - o2.id; // never 0
            }
            return o2.quantity - o1.quantity; // never 0
          }
        };
        Map<MyObject, Object> betterMap = new TreeMap(betterComp);
        betterMap.putAll(map);
        System.out.printf(format, "Consistent and Total", betterMap.keySet());
      }
    }
    

    Output:

                      Random Order: [{5, 500}, {4, 100}, {3, 200}, {2, 100}, {1, 100}]
      Non Consistent and Non Total: [{5, 500}, {3, 200}, {4, 100}]
          Consistent but Not Total: [{5, 500}, {3, 200}, {4, 100}]
              Consistent and Total: [{5, 500}, {3, 200}, {1, 100}, {2, 100}, {4, 100}]
    

    Conclusion:

    Although I think it is very legitimate to segregate identity from ordering conceptually. For instance, in relational database terms:

    select * from MyObjects order by quantity
    

    works perfectly. We don't care about object identity here, nor we want total ordering

    However, due to constraints in tree based collections implementation, one has to ensure that any comparator they write:

    • is consistency with equals
    • provide a total ordering over all possible objects
    0 讨论(0)
提交回复
热议问题