Why not allow an external interface to provide hashCode/equals for a HashMap?

闹比i 2020-12-17 15:28

With a TreeMap it's trivial to provide a custom Comparator, thus overriding the semantics provided by Comparable objects added to the
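
For comparison, the TreeMap side of this can be sketched as follows. It is only an illustration: the class name `CaseInsensitiveTreeMapDemo` and the helper `newMap()` are made up for the example, while `String.CASE_INSENSITIVE_ORDER` is the standard JDK comparator.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, named only for this example.
class CaseInsensitiveTreeMapDemo {
    // Builds a TreeMap whose key ordering (and therefore key equality) ignores case.
    static Map<String, Integer> newMap() {
        return new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    }

    public static void main(String[] args) {
        Map<String, Integer> map = newMap();
        map.put("ABC", 1);
        map.put("abc", 2); // same key under the comparator: the value is replaced
        System.out.println(map.size());     // 1
        System.out.println(map.get("aBc")); // 2
    }
}
```

The keys never need to agree with the comparator through their own equals(); the map consults only the Comparator it was given.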

9 Answers
  • 2020-12-17 16:01

    I suspect this has not been done because it would prevent hashCode caching?

    I attempted to create a generic Map solution in which all keys are silently wrapped. It turned out that each wrapper would have to hold the wrapped object, the cached hashCode, and a reference to the callback interface responsible for equality checks. This is obviously not as efficient as using a dedicated wrapper class, where you'd only have to hold the original key plus one more object (see hazzen's answer).

    (I also bumped into a problem related to generics; the get-method accepts Object as input, so the callback interface responsible for hashing would have to perform an additional instanceof-check. Either that, or the map class would have to know the Class of its keys.)
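
    To make that overhead concrete, here is a rough sketch of the wrapping approach described above. `Equivalence` and `StrategyMap` are hypothetical names invented for this illustration (neither is a JDK type), and this is not the code I actually wrote at the time.

    ```java
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical callback interface; not part of the JDK.
    interface Equivalence<K> {
        int hash(K key);
        boolean equivalent(K a, K b);
    }

    // Hypothetical map facade that silently wraps every key.
    class StrategyMap<K, V> {
        // Each wrapper holds three fields: the key, its cached hash, and the callback.
        private static final class Key<K> {
            final K key;
            final int hash;
            final Equivalence<K> eq;

            Key(K key, Equivalence<K> eq) {
                this.key = key;
                this.eq = eq;
                this.hash = eq.hash(key); // computed once, then cached
            }

            @Override
            public int hashCode() {
                return hash;
            }

            @Override
            public boolean equals(Object o) {
                if (!(o instanceof Key)) {
                    return false;
                }
                @SuppressWarnings("unchecked")
                Key<K> other = (Key<K>) o;
                return hash == other.hash && eq.equivalent(key, other.key);
            }
        }

        private final Map<Key<K>, V> delegate = new HashMap<>();
        private final Class<K> keyType;
        private final Equivalence<K> eq;

        StrategyMap(Class<K> keyType, Equivalence<K> eq) {
            this.keyType = keyType;
            this.eq = eq;
        }

        V put(K key, V value) {
            return delegate.put(new Key<>(key, eq), value);
        }

        // Mirrors Map.get(Object): the Class token is what lets us type-check the argument.
        V get(Object key) {
            if (!keyType.isInstance(key)) {
                return null;
            }
            return delegate.get(new Key<>(keyType.cast(key), eq));
        }

        int size() {
            return delegate.size();
        }
    }
    ```

    Note that every lookup allocates a temporary wrapper as well, which is part of why this design is hard to make as cheap as a purpose-built key class.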

  • 2020-12-17 16:03

    HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashCode.

    public interface HashingStrategy<E>
    {
        int computeHashCode(E object);
        boolean equals(E object1, E object2);
    }
    

    You can't use a HashingStrategy with the built-in HashSet or HashMap. GS Collections includes a java.util.Set called UnifiedSetWithHashingStrategy and a java.util.Map called UnifiedMapWithHashingStrategy.

    Let's look at an example.

    public class Data
    {
        private final int id;
    
        public Data(int id)
        {
            this.id = id;
        }
    
        public int getId()
        {
            return id;
        }
    
        // No equals or hashcode
    }
    

    Here's how you might set up a UnifiedSetWithHashingStrategy and use it.

    java.util.Set<Data> set =
      new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(Data::getId));
    Assert.assertTrue(set.add(new Data(1)));
    
    // contains returns true even without hashcode and equals
    Assert.assertTrue(set.contains(new Data(1)));
    
    // Second call to add() doesn't do anything and returns false
    Assert.assertFalse(set.add(new Data(1)));
    

    Why not just use a Map? UnifiedSetWithHashingStrategy uses half the memory of a UnifiedMap, and one quarter the memory of a HashMap. And sometimes you don't have a convenient key and have to create a synthetic one, like a tuple. That can waste more memory.

    How do we perform lookups? Remember that Sets have contains(), but not get(). UnifiedSetWithHashingStrategy implements Pool in addition to Set, so it also implements a form of get().

    Here's a simple approach to handle case-insensitive Strings.

    UnifiedSetWithHashingStrategy<String> set = 
      new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(String::toLowerCase));
    set.add("ABC");
    Assert.assertTrue(set.contains("ABC"));
    Assert.assertTrue(set.contains("abc"));
    Assert.assertFalse(set.contains("def"));
    Assert.assertEquals("ABC", set.get("aBc"));
    

    This shows off the API, but it's not appropriate for production. The problem is that the HashingStrategy constantly delegates to String.toLowerCase() which creates a bunch of garbage Strings. Here's how you can create an efficient hashing strategy for case-insensitive Strings.

    public static final HashingStrategy<String> CASE_INSENSITIVE =
      new HashingStrategy<String>()
      {
        @Override
        public int computeHashCode(String string)
        {
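          int hashCode = 0;
          for (int i = 0; i < string.length(); i++)
          {
            // Fold through upper case first, as equalsIgnoreCase does, so that
            // strings it treats as equal (e.g. "ı" and "i") also hash alike.
            char folded = Character.toLowerCase(Character.toUpperCase(string.charAt(i)));
            hashCode = 31 * hashCode + folded;
          }
          return hashCode;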
        }
    
        @Override
        public boolean equals(String string1, String string2)
        {
          return string1.equalsIgnoreCase(string2);
        }
      };
    

    Note: I am a developer on GS Collections.

  • 2020-12-17 16:04

    .NET has this via IEqualityComparer (for a type which can compare two objects) and IEquatable (for a type which can compare itself to another instance).

    In fact, I believe it was a mistake to define equality and hashcodes in java.lang.Object or System.Object at all. Equality in particular is hard to define in a way which makes sense with inheritance. I keep meaning to blog about this...

    But yes, basically the idea is sound.
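
    The inheritance problem mentioned above can be illustrated with the well-known point/coloured-point case. This is only a sketch, written in Java since that is what the question is about; `Point` and `ColorPoint` are hypothetical classes made up for the example.

    ```java
    // Hypothetical base class with a conventional instanceof-based equals.
    class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) {
                return false;
            }
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    // Hypothetical subclass that adds state and tries to include it in equality.
    class ColorPoint extends Point {
        final String color;

        ColorPoint(int x, int y, String color) {
            super(x, y);
            this.color = color;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ColorPoint)) {
                return false;
            }
            return super.equals(o) && color.equals(((ColorPoint) o).color);
        }
    }

    class EqualityDemo {
        public static void main(String[] args) {
            Point p = new Point(1, 2);
            ColorPoint cp = new ColorPoint(1, 2, "red");
            System.out.println(p.equals(cp)); // true
            System.out.println(cp.equals(p)); // false: equality is no longer symmetric
        }
    }
    ```

    With an external comparer, the caller chooses which notion of equality applies (coordinates only, or coordinates plus colour) instead of the type hierarchy having to settle on one.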
