HashSet<T>.CreateSetComparer() cannot specify IEqualityComparer<T>, is there an alternative?

广开言路 2021-01-14 14:13

In the reference source there is a constructor, public HashSetEqualityComparer(IEqualityComparer<T> comparer), but it's internal so I can't use it.

3 Answers
  • 2021-01-14 14:40

    Avoid this class if you use custom comparers. It uses its own equality comparer for GetHashCode, but in Equals(Set1, Set2), if Set1 and Set2 have the same equality comparer, HashSetEqualityComparer uses the comparer of the sets. It only uses its own comparer for Equals if Set1 and Set2 have different comparers.
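
    The mismatch described above can be seen in a small sketch. It assumes two sets that share StringComparer.OrdinalIgnoreCase; the class name HashMismatchDemo is made up for illustration. Equals defers to the sets' comparer, while GetHashCode hashes the elements with the default comparer, so the two hash codes will generally differ even though the sets compare equal.

```csharp
using System;
using System.Collections.Generic;

static class HashMismatchDemo
{
    static void Main()
    {
        // Both sets share the same custom (case-insensitive) comparer instance.
        var a = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a" };
        var b = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "A" };

        IEqualityComparer<HashSet<string>> setComparer = HashSet<string>.CreateSetComparer();

        // Same comparer on both sides, so Equals uses the sets' own comparer.
        Console.WriteLine(setComparer.Equals(a, b));

        // GetHashCode ignores the sets' comparer and hashes "a" and "A" as
        // distinct strings, so equal sets can end up with different hash codes.
        Console.WriteLine(setComparer.GetHashCode(a) == setComparer.GetHashCode(b));
    }
}
```

    A comparer whose Equals and GetHashCode disagree like this is unsafe as a Dictionary or HashSet key comparer, because equal keys may land in different buckets.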

    It gets worse. It calls HashSet.HashSetEquals, which has a bug in it (see https://referencesource.microsoft.com/#system.core/System/Collections/Generic/HashSet.cs, line 1489): when the two sets have different comparers, it is missing an if (set1.Count != set2.Count) return false before performing the subset check.

    The bug is illustrated by the following program:

    using System;
    using System.Collections.Generic;
    
    class Program
    {
        private class MyEqualityComparer : EqualityComparer<int>
        {
            public override bool Equals(int x, int y)
            {
                return x == y;
            }
    
            public override int GetHashCode(int obj)
            {
                return obj.GetHashCode();
            }
        }
    
        static void Main(string[] args)
        {
            var comparer = HashSet<int>.CreateSetComparer();
            var set1 = new HashSet<int>(new MyEqualityComparer()) { 1 };
            var set2 = new HashSet<int> { 1, 2 };
    
            Console.WriteLine(comparer.Equals(set1, set2)); // False: 2 is not in set1
            Console.WriteLine(comparer.Equals(set2, set1)); // True, even though set2 has an extra element!
    
            Console.ReadKey();
        }
    }
    

    Regarding other answers to this question (I don't have the rep to comment):

    • Wilhelm Liao: His answer also contains the bug because it's copied from the reference source
    • InBetween: The solution is not symmetric. CustomHashSetEqualityComparer.Equals(A, B) does not always equal CustomHashSetEqualityComparer.Equals(B, A). I would be scared of that.

    I think a robust implementation should throw an exception if it encounters a set which has a different comparer to its own. It could always use its own comparer and ignore the set comparer, but that would give strange and unintuitive behaviour.
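
    A minimal sketch of that idea follows. The name StrictSetEqualityComparer is hypothetical and not part of the framework. It assumes that rejecting sets with a foreign comparer is acceptable to the caller, and it reuses the XOR hash from the reference source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: refuses any set whose element comparer differs from its own.
public sealed class StrictSetEqualityComparer<T> : IEqualityComparer<HashSet<T>>
{
    private readonly IEqualityComparer<T> _elementComparer;

    public StrictSetEqualityComparer(IEqualityComparer<T> elementComparer = null)
    {
        _elementComparer = elementComparer ?? EqualityComparer<T>.Default;
    }

    public bool Equals(HashSet<T> x, HashSet<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        EnsureSameComparer(x);
        EnsureSameComparer(y);
        // Both sets use the same element comparer here, so SetEquals is symmetric.
        return x.SetEquals(y);
    }

    public int GetHashCode(HashSet<T> set)
    {
        if (set == null) return 0;
        EnsureSameComparer(set);
        int hash = 0;
        // XOR makes the result independent of enumeration order.
        foreach (T item in set)
            hash ^= _elementComparer.GetHashCode(item) & 0x7FFFFFFF;
        return hash;
    }

    private void EnsureSameComparer(HashSet<T> set)
    {
        if (!set.Comparer.Equals(_elementComparer))
            throw new ArgumentException("Set uses a different element comparer.");
    }
}

static class Program
{
    static void Main()
    {
        var comparer = new StrictSetEqualityComparer<int>();
        var index = new Dictionary<HashSet<int>, string>(comparer);
        index[new HashSet<int> { 1, 2 }] = "first";

        // Element order does not matter for the lookup.
        Console.WriteLine(index.ContainsKey(new HashSet<int> { 2, 1 })); // True
    }
}
```

    Throwing makes a comparer mismatch visible at the call site instead of producing an asymmetric or inconsistent result.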

  • 2021-01-14 14:51

    I think the best solution is to use SetEquals. It does the job you need in exactly the same way that HashSetEqualityComparer does, but it accounts for any custom comparers defined in the sets it's comparing.

    So, in your specific scenario where you want to use a HashSet<T> as a key of a dictionary, you need to implement an IEqualityComparer<HashSet<T>> that makes use of SetEquals and "borrows" the reference source of HashSetEqualityComparer.GetHashCode():

    public class CustomHashSetEqualityComparer<T>
        : IEqualityComparer<HashSet<T>>
    {
        public bool Equals(HashSet<T> x, HashSet<T> y)
        {
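            // Two nulls (or the same instance) are equal; a single null never is.
            if (ReferenceEquals(x, y))
                return true;
            if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
                return false;
    
            return x.SetEquals(y);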
        }
    
        public int GetHashCode(HashSet<T> set)
        {
            int hashCode = 0;
    
            if (set != null)
            {
                foreach (T t in set)
                {
                    hashCode = hashCode ^ 
                        (set.Comparer.GetHashCode(t) & 0x7FFFFFFF);
                }
            }
    
            return hashCode;
        }
    }
    

    But yes, it's a small pain that there is no way to directly create a SetEqualityComparer that leverages custom comparers. This unfortunate behavior is due, IMHO, more to a bug in the existing implementation than to the lack of the needed overload; there is no reason why CreateSetComparer() can't return an IEqualityComparer that actually uses the comparers of the sets it's comparing, as the code above demonstrates.

    If I had a voice in it, CreateSetComparer() wouldn't be a static method at all. It would then be obvious, or at least predictable, that whatever comparer was returned would be created with the current set's comparer.
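
    As a rough sketch of what such an instance-style API could look like, here is an extension method. The names CreateSetComparerFromThis and ElementwiseSetComparer are invented for illustration. It assumes that copying each set under the captured comparer, at O(n) cost per call, is acceptable in exchange for symmetric results.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetComparerExtensions
{
    // Hypothetical helper: builds a set comparer from this set's own element comparer.
    public static IEqualityComparer<HashSet<T>> CreateSetComparerFromThis<T>(this HashSet<T> set)
    {
        if (set == null) throw new ArgumentNullException(nameof(set));
        return new ElementwiseSetComparer<T>(set.Comparer);
    }

    private sealed class ElementwiseSetComparer<T> : IEqualityComparer<HashSet<T>>
    {
        private readonly IEqualityComparer<T> _elements;

        public ElementwiseSetComparer(IEqualityComparer<T> elements)
        {
            _elements = elements;
        }

        public bool Equals(HashSet<T> x, HashSet<T> y)
        {
            if (ReferenceEquals(x, y)) return true;
            if (x == null || y == null) return false;
            // Re-hash both sides under the captured comparer so the result
            // does not depend on argument order or on the sets' own comparers.
            var left = new HashSet<T>(x, _elements);
            return left.SetEquals(new HashSet<T>(y, _elements));
        }

        public int GetHashCode(HashSet<T> set)
        {
            if (set == null) return 0;
            int hash = 0;
            // De-duplicate under the captured comparer first, to stay consistent with Equals.
            foreach (T item in new HashSet<T>(set, _elements))
                hash ^= _elements.GetHashCode(item) & 0x7FFFFFFF;
            return hash;
        }
    }
}

static class Program
{
    static void Main()
    {
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a" };
        var comparer = names.CreateSetComparerFromThis();

        // The other set uses the default comparer, but is judged case-insensitively.
        Console.WriteLine(comparer.Equals(names, new HashSet<string> { "A" })); // True
    }
}
```

    This variant always applies its own comparer and ignores the comparers of the sets it is given, which is the alternative to throwing on a mismatch.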

  • 2021-01-14 14:51

    I agree with @InBetween: using SetEquals is the best way. Even if the constructor were added, it still would not achieve what you want.

    Please see this code: http://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs,1360

    Here is what I tried:

    class HashSetEqualityComparerWrapper<T> : IEqualityComparer<HashSet<T>>
    {
        private static readonly Type HashSetEqualityComparerType = HashSet<T>.CreateSetComparer().GetType();
        private IEqualityComparer<HashSet<T>> _comparer;
    
        public HashSetEqualityComparerWrapper()
        {
            _comparer = HashSet<T>.CreateSetComparer();
        }
        public HashSetEqualityComparerWrapper(IEqualityComparer<T> comparer)
        {
            _comparer = HashSet<T>.CreateSetComparer();
            if (comparer != null)
            {
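                // Relies on a private field name from the reference source; GetField returns null if the runtime's implementation has no such field.
                FieldInfo m_comparer_field = HashSetEqualityComparerType.GetField("m_comparer", BindingFlags.NonPublic | BindingFlags.Instance);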
                m_comparer_field.SetValue(_comparer, comparer);
            }
        }
    
        public bool Equals(HashSet<T> x, HashSet<T> y)
        {
            return _comparer.Equals(x, y);
        }
        public int GetHashCode(HashSet<T> obj)
        {
            return _comparer.GetHashCode(obj);
        }
    }
    

    UPDATE

    I took 5 minutes to implement another version based on the HashSetEqualityComparer<T> source code, rewriting the bool Equals(HashSet<T> x, HashSet<T> y) method. It is not complex. All the code is copied and pasted from the source; I just revised it a bit.

    class CustomHashSetEqualityComparer<T> : IEqualityComparer<HashSet<T>>
    {
        private IEqualityComparer<T> m_comparer;
    
        public CustomHashSetEqualityComparer()
        {
            m_comparer = EqualityComparer<T>.Default;
        }
    
        public CustomHashSetEqualityComparer(IEqualityComparer<T> comparer)
        {
            if (comparer == null)
            {
                m_comparer = EqualityComparer<T>.Default;
            }
            else
            {
                m_comparer = comparer;
            }
        }
    
        // using m_comparer to keep equals properties intact; don't want to choose one of the comparers
        public bool Equals(HashSet<T> x, HashSet<T> y)
        {
            // http://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs,1360
            // handle null cases first
            if (x == null)
            {
                return (y == null);
            }
            else if (y == null)
            {
                // set1 != null
                return false;
            }
    
            // comparers are the same: a count mismatch rules out equality early
            // (no count check is made when the comparers differ)
            if (AreEqualityComparersEqual(x, y))
            {
                if (x.Count != y.Count)
                {
                    return false;
                }
            }
            // n^2 search because items are hashed according to their respective ECs
            foreach (T set2Item in y)
            {
                bool found = false;
                foreach (T set1Item in x)
                {
                    if (m_comparer.Equals(set2Item, set1Item))
                    {
                        found = true;
                        break;
                    }
                }
                if (!found)
                {
                    return false;
                }
            }
            return true;
        }
    
        public int GetHashCode(HashSet<T> obj)
        {
            int hashCode = 0;
            if (obj != null)
            {
                foreach (T t in obj)
                {
                    hashCode = hashCode ^ (m_comparer.GetHashCode(t) & 0x7FFFFFFF);
                }
            } // else returns hashcode of 0 for null hashsets
            return hashCode;
        }
    
        // Equals method for the comparer itself. 
        public override bool Equals(Object obj)
        {
            CustomHashSetEqualityComparer<T> comparer = obj as CustomHashSetEqualityComparer<T>;
            if (comparer == null)
            {
                return false;
            }
            return (this.m_comparer == comparer.m_comparer);
        }
    
        public override int GetHashCode()
        {
            return m_comparer.GetHashCode();
        }
    
        private static bool AreEqualityComparersEqual(HashSet<T> set1, HashSet<T> set2)
        {
            return set1.Comparer.Equals(set2.Comparer);
        }
    }
    