Comparing two collections for equality irrespective of the order of items in them

我在风中等你 2020-11-22 10:28

I would like to compare two collections (in C#), but I'm not sure of the best way to implement this efficiently.

I've read the other thread about Enumerable.Sequen

19 answers
  • 2020-11-22 11:07

    If comparing for the purpose of Unit Testing Assertions, it may make sense to throw some efficiency out the window and simply convert each list to a string representation (csv) before doing the comparison. That way, the default test Assertion message will display the differences within the error message.

    Usage:

    using Microsoft.VisualStudio.TestTools.UnitTesting;
    
    // define collection1, collection2, ...
    
    // MSTest exposes Assert.AreEqual (Assert.Equal belongs to xUnit)
    Assert.AreEqual(collection1.OrderBy(c=>c).ToCsv(), collection2.OrderBy(c=>c).ToCsv());
    

    Helper Extension Method:

    public static string ToCsv<T>(
        this IEnumerable<T> values,
        Func<T, string> selector = null,
        string joinSeparator = ",")
    {
        if (selector == null)
        {
            if (typeof(T) == typeof(Int16) ||
                typeof(T) == typeof(Int32) ||
                typeof(T) == typeof(Int64))
            {
                selector = (v) => Convert.ToInt64(v).ToString(CultureInfo.InvariantCulture);
            }
            else if (typeof(T) == typeof(decimal))
            {
                selector = (v) => Convert.ToDecimal(v).ToString(CultureInfo.InvariantCulture);
            }
            else if (typeof(T) == typeof(float) ||
                    typeof(T) == typeof(double))
            {
                selector = (v) => Convert.ToDouble(v).ToString(CultureInfo.InvariantCulture);
            }
            else
            {
                selector = (v) => v.ToString();
            }
        }
    
        return String.Join(joinSeparator, values.Select(v => selector(v)));
    }
    
  • 2020-11-22 11:08

    Here's my extension-method variant of ohadsc's answer, in case it's useful to someone:

    static public class EnumerableExtensions 
    {
        static public bool IsEquivalentTo<T>(this IEnumerable<T> first, IEnumerable<T> second)
        {
            if ((first == null) != (second == null))
                return false;
    
            if (!object.ReferenceEquals(first, second) && (first != null))
            {
                if (first.Count() != second.Count())
                    return false;
    
                if ((first.Count() != 0) && HaveMismatchedElement<T>(first, second))
                    return false;
            }
    
            return true;
        }
    
        private static bool HaveMismatchedElement<T>(IEnumerable<T> first, IEnumerable<T> second)
        {
            int firstCount;
            int secondCount;
    
            var firstElementCounts = GetElementCounts<T>(first, out firstCount);
            var secondElementCounts = GetElementCounts<T>(second, out secondCount);
    
            if (firstCount != secondCount)
                return true;
    
            foreach (var kvp in firstElementCounts)
            {
                firstCount = kvp.Value;
                secondElementCounts.TryGetValue(kvp.Key, out secondCount);
    
                if (firstCount != secondCount)
                    return true;
            }
    
            return false;
        }
    
        private static Dictionary<T, int> GetElementCounts<T>(IEnumerable<T> enumerable, out int nullCount)
        {
            var dictionary = new Dictionary<T, int>();
            nullCount = 0;
    
            foreach (T element in enumerable)
            {
                if (element == null)
                {
                    nullCount++;
                }
                else
                {
                    int num;
                    dictionary.TryGetValue(element, out num);
                    num++;
                    dictionary[element] = num;
                }
            }
    
            return dictionary;
        }
    
        static private int GetHashCode<T>(IEnumerable<T> enumerable)
        {
            int hash = 17;
    
            foreach (T val in enumerable.OrderBy(x => x))
                hash = hash * 23 + val.GetHashCode();
    
            return hash;
        }
    }
    
  • 2020-11-22 11:10

    It turns out Microsoft already has this covered in its testing framework: CollectionAssert.AreEquivalent

    Remarks

    Two collections are equivalent if they have the same elements in the same quantity, but in any order. Elements are equal if their values are equal, not if they refer to the same object.

    Using Reflector, I modified the code behind AreEquivalent() to create a corresponding equality comparer. It is more complete than the existing answers, since it takes nulls into account, implements IEqualityComparer, and includes some efficiency and edge-case checks. Plus, it's Microsoft :)

    public class MultiSetComparer<T> : IEqualityComparer<IEnumerable<T>>
    {
        private readonly IEqualityComparer<T> m_comparer;
        public MultiSetComparer(IEqualityComparer<T> comparer = null)
        {
            m_comparer = comparer ?? EqualityComparer<T>.Default;
        }
    
        public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
        {
            if (first == null)
                return second == null;
    
            if (second == null)
                return false;
    
            if (ReferenceEquals(first, second))
                return true;
    
            if (first is ICollection<T> firstCollection && second is ICollection<T> secondCollection)
            {
                if (firstCollection.Count != secondCollection.Count)
                    return false;
    
                if (firstCollection.Count == 0)
                    return true;
            }
    
            return !HaveMismatchedElement(first, second);
        }
    
        private bool HaveMismatchedElement(IEnumerable<T> first, IEnumerable<T> second)
        {
            int firstNullCount;
            int secondNullCount;
    
            var firstElementCounts = GetElementCounts(first, out firstNullCount);
            var secondElementCounts = GetElementCounts(second, out secondNullCount);
    
            if (firstNullCount != secondNullCount || firstElementCounts.Count != secondElementCounts.Count)
                return true;
    
            foreach (var kvp in firstElementCounts)
            {
                var firstElementCount = kvp.Value;
                int secondElementCount;
                secondElementCounts.TryGetValue(kvp.Key, out secondElementCount);
    
                if (firstElementCount != secondElementCount)
                    return true;
            }
    
            return false;
        }
    
        private Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable, out int nullCount)
        {
            var dictionary = new Dictionary<T, int>(m_comparer);
            nullCount = 0;
    
            foreach (T element in enumerable)
            {
                if (element == null)
                {
                    nullCount++;
                }
                else
                {
                    int num;
                    dictionary.TryGetValue(element, out num);
                    num++;
                    dictionary[element] = num;
                }
            }
    
            return dictionary;
        }
    
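        public int GetHashCode(IEnumerable<T> enumerable)
        {
            if (enumerable == null) throw new ArgumentNullException(nameof(enumerable));
    
            // XOR is order-independent, and hashing through m_comparer keeps
            // GetHashCode consistent with Equals when a custom comparer is supplied.
            int hash = 17;
    
            foreach (T val in enumerable)
                hash ^= (val == null ? 42 : m_comparer.GetHashCode(val));
    
            return hash;
        }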
    }
    

    Sample usage:

    var set = new HashSet<IEnumerable<int>>(new[] {new[]{1,2,3}}, new MultiSetComparer<int>());
    Console.WriteLine(set.Contains(new [] {3,2,1})); //true
    Console.WriteLine(set.Contains(new [] {1, 2, 3, 3})); //false
    

    Or if you just want to compare two collections directly:

    var comp = new MultiSetComparer<string>();
    Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","c","b"})); //true
    Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","b"})); //false
    

    Finally, you can use an equality comparer of your choice:

    var strcomp = new MultiSetComparer<string>(StringComparer.OrdinalIgnoreCase);
    Console.WriteLine(strcomp.Equals(new[] {"a", "b"}, new []{"B", "A"})); //true
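
    As a simpler point of comparison for the definition of equivalence quoted above, here is a minimal sketch that sorts both sequences and compares them pairwise. It is not the framework's implementation: the class and method names are hypothetical, it assumes the element type has a default ordering, and it costs O(n log n) rather than the O(n) of the dictionary-based comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EquivalenceSketch
{
    // Hypothetical helper: true when both sequences hold the same elements
    // in the same quantity, regardless of order. Sorting relies on
    // Comparer<T>.Default, so T must be sortable (e.g. int, string).
    public static bool AreEquivalentBySorting<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // Two nulls are treated as equivalent; one null is not.
        if (first == null || second == null)
            return first == null && second == null;

        // After sorting, equivalent multisets become identical sequences.
        return first.OrderBy(x => x).SequenceEqual(second.OrderBy(x => x));
    }
}
```

    For example, { 1, 2, 2 } and { 2, 1, 2 } are equivalent under this check, while { 1, 2, 2 } and { 1, 1, 2 } are not, because the quantities differ.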
    
  • 2020-11-22 11:10

    In the case of no repeats and no order, the following EqualityComparer can be used to allow collections as dictionary keys:

    public class SetComparer<T> : IEqualityComparer<IEnumerable<T>> 
    where T:IComparable<T>
    {
        public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
        {
            if (first == second)
                return true;
            if ((first == null) || (second == null))
                return false;
            return first.ToHashSet().SetEquals(second);
        }
    
        public int GetHashCode(IEnumerable<T> enumerable)
        {
            int hash = 17;
    
            foreach (T val in enumerable.OrderBy(x => x))
                hash = hash * 23 + val.GetHashCode();
    
            return hash;
        }
    }
    

    Here is the ToHashSet() implementation I used. The hash code algorithm comes from Effective Java (by way of Jon Skeet).
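
    The implementation referred to is not reproduced in this thread. A minimal sketch of what such a helper might look like is given below. The name ToHashSetCompat is hypothetical and was chosen to avoid clashing with the built-in Enumerable.ToHashSet that recent .NET versions provide; the optional comparer parameter is likewise an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetSketchExtensions
{
    // Hypothetical stand-in for the ToHashSet() helper used by SetComparer<T>.
    // Copies the sequence into a HashSet<T>, dropping duplicates.
    public static HashSet<T> ToHashSetCompat<T>(
        this IEnumerable<T> source,
        IEqualityComparer<T> comparer = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // HashSet<T> falls back to EqualityComparer<T>.Default when comparer is null.
        return new HashSet<T>(source, comparer);
    }
}
```

    With a helper like this, first.ToHashSetCompat().SetEquals(second) performs the order-insensitive, duplicate-insensitive comparison that SetComparer<T>.Equals relies on.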

  • 2020-11-22 11:10

    Allowing for duplicates in the IEnumerable<T> (if sets are not desirable or possible) and ignoring order, you should be able to use a .GroupBy().

    I'm not an expert on the complexity measurements, but my rudimentary understanding is that this should be O(n). I understand O(n^2) as coming from performing an O(n) operation inside another O(n) operation like ListA.Where(a => ListB.Contains(a)).ToList(). Every item in ListB is evaluated for equality against each item in ListA.

    Like I said, my understanding on complexity is limited, so correct me on this if I'm wrong.

    public static bool IsSameAs<T, TKey>(this IEnumerable<T> source, IEnumerable<T> target, Expression<Func<T, TKey>> keySelectorExpression)
        {
            // check the object
            if (source == null && target == null) return true;
            if (source == null || target == null) return false;
    
            var sourceList = source.ToList();
            var targetList = target.ToList();
    
            // check the list count :: { 1,1,1 } != { 1,1,1,1 }
            if (sourceList.Count != targetList.Count) return false;
    
            var keySelector = keySelectorExpression.Compile();
            var groupedSourceList = sourceList.GroupBy(keySelector).ToList();
            var groupedTargetList = targetList.GroupBy(keySelector).ToList();
    
            // check that the number of groupings match :: { 1,1,2,3,4 } != { 1,1,2,3,4,5 }
            var groupCountIsSame = groupedSourceList.Count == groupedTargetList.Count;
            if (!groupCountIsSame) return false;
    
            // check that the count of each group in source has the same count in target :: for values { 1,1,2,3,4 } & { 1,1,1,2,3,4 }
            // key:count
            // { 1:2, 2:1, 3:1, 4:1 } != { 1:3, 2:1, 3:1, 4:1 }
            // look up target counts by key so the check stays O(n), and a key missing from target returns false instead of throwing
            var targetCounts = groupedTargetList.ToDictionary(g => g.Key, g => g.Count());
            var countsMismatch = groupedSourceList.Any(sourceGroup =>
            {
                int targetCount;
                return !targetCounts.TryGetValue(sourceGroup.Key, out targetCount)
                       || sourceGroup.Count() != targetCount;
            });
            return !countsMismatch;
        }
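
    To illustrate the complexity remark in this answer, the sketch below contrasts the quadratic List.Contains pattern with a HashSet-backed lookup that is linear on average. The class and method names are hypothetical, and both methods only collect the items of listA that also occur in listB; they do not test equality.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComplexitySketch
{
    // O(n * m): List<T>.Contains scans listB once for every item of listA.
    public static List<T> CommonItemsQuadratic<T>(List<T> listA, List<T> listB)
    {
        return listA.Where(a => listB.Contains(a)).ToList();
    }

    // O(n + m) on average: the HashSet<T> is built once,
    // and each Contains call is then O(1) on average.
    public static List<T> CommonItemsLinear<T>(List<T> listA, List<T> listB)
    {
        var lookup = new HashSet<T>(listB);
        return listA.Where(a => lookup.Contains(a)).ToList();
    }
}
```

    Both methods return the same items in the same order; only the cost of each membership test differs.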
    
  • 2020-11-22 11:11

    This is my (heavily influenced by D.Jennings) generic implementation of the comparison method (in C#):

    /// <summary>
    /// Represents a service used to compare two collections for equality.
    /// </summary>
    /// <typeparam name="T">The type of the items in the collections.</typeparam>
    public class CollectionComparer<T>
    {
        /// <summary>
        /// Compares the content of two collections for equality.
        /// </summary>
        /// <param name="foo">The first collection.</param>
        /// <param name="bar">The second collection.</param>
        /// <returns>True if both collections have the same content, false otherwise.</returns>
        public bool Execute(ICollection<T> foo, ICollection<T> bar)
        {
            // Declare a dictionary to count the occurrences of the items in the collection
            Dictionary<T, int> itemCounts = new Dictionary<T,int>();
    
            // Increase the count for each occurrence of the item in the first collection
            foreach (T item in foo)
            {
                if (itemCounts.ContainsKey(item))
                {
                    itemCounts[item]++;
                }
                else
                {
                    itemCounts[item] = 1;
                }
            }
    
            // Decrease the count for each occurrence of the item in the second collection.
            // Dictionary keys are compared with EqualityComparer<T>.Default, i.e. through
            // Equals and GetHashCode, so override both on "T" to define what it means
            // for two "T" objects to be equal.
            foreach (T item in bar)
            {
                int count;
    
                // TryGetValue also works for value types, where a null check on the key would not
                if (itemCounts.TryGetValue(item, out count))
                {
                    itemCounts[item] = count - 1;
                }
                else
                {
                    // There was no occurrence of this item in the first collection, thus the collections are not equal
                    return false;
                }
            }
    
            // The count of each item should be 0 if the contents of the collections are equal
            foreach (int value in itemCounts.Values)
            {
                if (value != 0)
                {
                    return false;
                }
            }
    
            // The collections are equal
            return true;
        }
    }
    