Comparing two collections for equality irrespective of the order of items in them

后端未结

关注

 19  1592

我在风中等你

I would like to compare two collections (in C#), but I\'m not sure of the best way to implement this efficiently.

I\'ve read the other thread about Enumerable.Sequen

相关标签:

19条回答

孤独总比滥情好

2020-11-22 10:57
```
static bool SetsContainSameElements<T>(IEnumerable<T> set1, IEnumerable<T> set2) {
    var setXOR = new HashSet<T>(set1);
    setXOR.SymmetricExceptWith(set2);
    return (setXOR.Count == 0);
}
```
Solution requires .NET 3.5 and the System.Collections.Generic namespace. According to Microsoft, SymmetricExceptWith is an O(n + m) operation, with n representing the number of elements in the first set and m representing the number of elements in the second. You could always add an equality comparer to this function if necessary.
0 讨论(0)
发布评论:

提交评论
- 加载中...

鱼传尺愫

2020-11-22 10:59

Why not use .Except()

// Create the IEnumerable data sources.
string[] names1 = System.IO.File.ReadAllLines(@"../../../names1.txt");
string[] names2 = System.IO.File.ReadAllLines(@"../../../names2.txt");
// Create the query. Note that method syntax must be used here.
IEnumerable<string> differenceQuery =   names1.Except(names2);
// Execute the query.
Console.WriteLine("The following lines are in names1.txt but not names2.txt");
foreach (string s in differenceQuery)
     Console.WriteLine(s);

http://msdn.microsoft.com/en-us/library/bb397894.aspx

0 讨论(0)

情歌与酒

2020-11-22 11:00

If you use Shouldly, you can use ShouldAllBe with Contains.

collection1 = {1, 2, 3, 4};
collection2 = {2, 4, 1, 3};

collection1.ShouldAllBe(item=>collection2.Contains(item)); // true

And finally, you can write an extension.

public static class ShouldlyIEnumerableExtensions
{
    public static void ShouldEquivalentTo<T>(this IEnumerable<T> list, IEnumerable<T> equivalent)
    {
        list.ShouldAllBe(l => equivalent.Contains(l));
    }
}

UPDATE

A optional parameter exists on ShouldBe method.

collection1.ShouldBe(collection2, ignoreOrder: true); // true

0 讨论(0)

我寻月下人不归

2020-11-22 11:04
erickson is almost right: since you want to match on counts of duplicates, you want a Bag. In Java, this looks something like:
```
(new HashBag(collection1)).equals(new HashBag(collection2))
```
I'm sure C# has a built-in Set implementation. I would use that first; if performance is a problem, you could always use a different Set implementation, but use the same Set interface.
0 讨论(0)
发布评论:

提交评论
- 加载中...
遇见更好的自我

2020-11-22 11:06

You could use a Hashset. Look at the SetEquals method.

0 讨论(0)
发布评论:

提交评论
- 加载中...
[愿得一人]

2020-11-22 11:06
EDIT: I realized as soon as I posed that this really only works for sets -- it will not properly deal with collections that have duplicate items. For example { 1, 1, 2 } and { 2, 2, 1 } will be considered equal from this algorithm's perspective. If your collections are sets (or their equality can be measured that way), however, I hope you find the below useful.

The solution I use is:
```
return c1.Count == c2.Count && c1.Intersect(c2).Count() == c1.Count;
```
Linq does the dictionary thing under the covers, so this is also O(N). (Note, it's O(1) if the collections aren't the same size).

I did a sanity check using the "SetEqual" method suggested by Daniel, the OrderBy/SequenceEquals method suggested by Igor, and my suggestion. The results are below, showing O(N*LogN) for Igor and O(N) for mine and Daniel's.

I think the simplicity of the Linq intersect code makes it the preferable solution.
```
__Test Latency(ms)__
N, SetEquals, OrderBy, Intersect    
1024, 0, 0, 0    
2048, 0, 0, 0    
4096, 31.2468, 0, 0    
8192, 62.4936, 0, 0    
16384, 156.234, 15.6234, 0    
32768, 312.468, 15.6234, 46.8702    
65536, 640.5594, 46.8702, 31.2468    
131072, 1312.3656, 93.7404, 203.1042    
262144, 3765.2394, 187.4808, 187.4808    
524288, 5718.1644, 374.9616, 406.2084    
1048576, 11420.7054, 734.2998, 718.6764    
2097152, 35090.1564, 1515.4698, 1484.223
```
0 讨论(0)
发布评论:

提交评论
- 加载中...

1 2 3 4 下一页