Comparing two collections for equality irrespective of the order of items in them

我在风中等你 2020-11-22 10:28

I would like to compare two collections (in C#), but I'm not sure of the best way to implement this efficiently.

I've read the other thread about Enumerable.Sequen

19 Answers
  • 2020-11-22 10:57
    static bool SetsContainSameElements<T>(IEnumerable<T> set1, IEnumerable<T> set2) {
        var setXOR = new HashSet<T>(set1);
        setXOR.SymmetricExceptWith(set2);
        return (setXOR.Count == 0);
    }
    

    Solution requires .NET 3.5 and the System.Collections.Generic namespace. According to Microsoft, SymmetricExceptWith is an O(n + m) operation, with n representing the number of elements in the first set and m representing the number of elements in the second. You could always add an equality comparer to this function if necessary.
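
    The equality-comparer variant mentioned above could look roughly like the sketch below. The class name SetComparison and the optional comparer parameter are assumptions made for illustration; only the HashSet<T> constructor and SymmetricExceptWith come from the answer.

```csharp
using System.Collections.Generic;

public static class SetComparison
{
    // Hypothetical variant of SetsContainSameElements with an optional comparer.
    public static bool SetsContainSameElements<T>(
        IEnumerable<T> set1,
        IEnumerable<T> set2,
        IEqualityComparer<T> comparer = null)
    {
        // A null comparer makes HashSet<T> fall back to EqualityComparer<T>.Default.
        var setXor = new HashSet<T>(set1, comparer);

        // Afterwards setXor holds only the elements present in exactly one input.
        setXor.SymmetricExceptWith(set2);
        return setXor.Count == 0;
    }
}
```

    Passing StringComparer.OrdinalIgnoreCase, for example, would treat { "a", "B" } and { "b", "A" } as equal. Because a HashSet is used, duplicates are ignored, so { 1, 1, 2 } and { 1, 2 } also compare as equal.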

  • 2020-11-22 10:59

    Why not use .Except()?

    // Create the IEnumerable data sources.
    string[] names1 = System.IO.File.ReadAllLines(@"../../../names1.txt");
    string[] names2 = System.IO.File.ReadAllLines(@"../../../names2.txt");
    // Create the query. Note that method syntax must be used here.
    IEnumerable<string> differenceQuery = names1.Except(names2);
    // Execute the query.
    Console.WriteLine("The following lines are in names1.txt but not names2.txt");
    foreach (string s in differenceQuery)
         Console.WriteLine(s);
    

    http://msdn.microsoft.com/en-us/library/bb397894.aspx
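
    Except only reports what is in the first sequence and missing from the second, so an equality check would need it in both directions. A minimal sketch, assuming duplicates do not matter; the class and method names are made up for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ExceptComparison
{
    // Hypothetical helper: true when both sequences contain the same distinct elements.
    public static bool SameDistinctElements<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // Materialize once so neither source is enumerated twice.
        var a = first.ToList();
        var b = second.ToList();

        // Nothing in a is missing from b, and nothing in b is missing from a.
        return !a.Except(b).Any() && !b.Except(a).Any();
    }
}
```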

  • 2020-11-22 11:00

    If you use Shouldly, you can use ShouldAllBe with Contains.

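    var collection1 = new List<int> {1, 2, 3, 4};
    var collection2 = new List<int> {2, 4, 1, 3};
    
    // Passes. Note this only checks that every item of collection1 is in collection2.
    collection1.ShouldAllBe(item => collection2.Contains(item));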
    

    And finally, you can write an extension.

    public static class ShouldlyIEnumerableExtensions
    {
        public static void ShouldEquivalentTo<T>(this IEnumerable<T> list, IEnumerable<T> equivalent)
        {
            list.ShouldAllBe(l => equivalent.Contains(l));
        }
    }
    

    UPDATE

    An optional parameter exists on the ShouldBe method.

    collection1.ShouldBe(collection2, ignoreOrder: true); // true
    
  • 2020-11-22 11:04

    erickson is almost right: since you want to match on counts of duplicates, you want a Bag. In Java, this looks something like:

    (new HashBag(collection1)).equals(new HashBag(collection2))
    

    I'm sure C# has a built-in Set implementation. I would use that first; if performance is a problem, you could always use a different Set implementation, but use the same Set interface.
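
    As far as I know, the base class library has HashSet<T> but no built-in Bag, so one way to get the same effect in C# is to count occurrences in a Dictionary. A rough sketch, assuming the collections contain no null elements (Dictionary keys cannot be null); the class and method names are hypothetical:

```csharp
using System.Collections.Generic;

public static class BagComparison
{
    // Hypothetical helper: true when both sequences hold the same items
    // with the same number of occurrences, regardless of order.
    public static bool HaveSameElementCounts<T>(
        IEnumerable<T> first,
        IEnumerable<T> second,
        IEqualityComparer<T> comparer = null)
    {
        // Count how often each item occurs in the first sequence.
        var counts = new Dictionary<T, int>(comparer);
        foreach (var item in first)
        {
            counts.TryGetValue(item, out var n); // n is 0 when the key is absent
            counts[item] = n + 1;
        }

        // Cancel each item of the second sequence against those counts.
        foreach (var item in second)
        {
            if (!counts.TryGetValue(item, out var n)) return false; // surplus item
            if (n == 1) counts.Remove(item);
            else counts[item] = n - 1;
        }

        // Anything left over occurred more often in the first sequence.
        return counts.Count == 0;
    }
}
```

    With this, { 1, 1, 2 } and { 2, 1, 1 } compare as equal, while { 1, 1, 2 } and { 2, 2, 1 } do not.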

  • 2020-11-22 11:06

    You could use a HashSet. Look at the SetEquals method.
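
    For illustration, a minimal sketch of that suggestion; the wrapper class and method name are assumptions, and only HashSet<T> and SetEquals are the real API:

```csharp
using System.Collections.Generic;

public static class SetEqualsExample
{
    // Hypothetical helper: compares the distinct elements of two sequences.
    public static bool SameDistinctElements<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // SetEquals accepts any IEnumerable<T> and ignores order and duplicates.
        return new HashSet<T>(first).SetEquals(second);
    }
}
```

    Since duplicates are ignored, { 1, 1, 2 } and { 2, 2, 1 } are treated as equal here.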

  • 2020-11-22 11:06

    EDIT: I realized as soon as I posted that this really only works for sets: it will not properly deal with collections that have duplicate items. For example, { 1, 1, 2 } and { 2, 2, 1 } will be considered equal from this algorithm's perspective. If your collections are sets (or their equality can be measured that way), however, I hope you find the below useful.

    The solution I use is:

    return c1.Count == c2.Count && c1.Intersect(c2).Count() == c1.Count;
    

    LINQ does the dictionary thing under the covers, so this is also O(N). (Note that it is O(1) if the collections aren't the same size, because the Count comparison fails first and Intersect is never evaluated.)

    I did a sanity check using the SetEquals method suggested by Daniel, the OrderBy/SequenceEqual method suggested by Igor, and my suggestion. The results are below, showing O(N*LogN) for Igor and O(N) for mine and Daniel's.

    I think the simplicity of the LINQ Intersect code makes it the preferable solution.

    Test latency (ms)
    N          SetEquals     OrderBy      Intersect
    1024       0             0            0
    2048       0             0            0
    4096       31.2468       0            0
    8192       62.4936       0            0
    16384      156.234       15.6234      0
    32768      312.468       15.6234      46.8702
    65536      640.5594      46.8702      31.2468
    131072     1312.3656     93.7404      203.1042
    262144     3765.2394     187.4808     187.4808
    524288     5718.1644     374.9616     406.2084
    1048576    11420.7054    734.2998     718.6764
    2097152    35090.1564    1515.4698    1484.223
    