I need to compare two lists, each containing about 60,000 objects. What would be the most efficient way of doing this? I want to select all the items that are in the so
LINQ has an Except() method for this purpose. You can just use a.Except(b);
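As a rough sketch of what that call does, here is Except() on two plain int lists. The names a and b follow the line above; the ExceptDemo class and OnlyInFirst method are made up for illustration. Except() returns the distinct elements of the first sequence that do not appear in the second, and without an explicit comparer it uses the default equality comparer for the element type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Returns the distinct items of a that are not present in b.
    public static List<int> OnlyInFirst(IEnumerable<int> a, IEnumerable<int> b)
    {
        return a.Except(b).ToList();
    }

    public static void Main()
    {
        var a = new List<int> { 1, 2, 2, 3, 4 };
        var b = new List<int> { 2, 4, 6 };

        // 2 and 4 also occur in b, so only 1 and 3 remain.
        Console.WriteLine(string.Join(", ", OnlyInFirst(a, b))); // prints "1, 3"
    }
}
```

For a custom class, the default comparer falls back to reference equality unless the class overrides Equals and GetHashCode, so two separately created objects with identical field values would not be treated as the same item.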
Use Except(), and read more about set operations with LINQ and set operations with HashSet.
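For the HashSet route, a minimal sketch might look like the following. It assumes the items can be reduced to simple string keys; the HashSetDemo class, the MissingFrom method and the sample file names are hypothetical. HashSet<T>.ExceptWith removes from the set, in place, every element that also occurs in the other collection.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    // Returns the keys in first that do not occur in second.
    public static HashSet<string> MissingFrom(IEnumerable<string> first, IEnumerable<string> second)
    {
        var result = new HashSet<string>(first);   // copy, so the input is left untouched
        result.ExceptWith(second);                 // remove everything that is also in second
        return result;
    }

    public static void Main()
    {
        var first = new List<string> { "a.txt", "b.txt", "c.txt" };
        var second = new List<string> { "b.txt" };

        var missing = MissingFrom(first, second);
        Console.WriteLine(missing.Count); // prints "2"
    }
}
```

Unlike Except(), which is evaluated lazily and returns a new sequence, ExceptWith modifies the set it is called on, which is why the sketch copies the first list into a new set.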
Create an equality comparer for the type, then you can use that to efficiently compare the sets:
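    public class MyFileComparer : IEqualityComparer<MyFile> {

        // Two files are considered equal when name, size and creation date all match.
        public bool Equals(MyFile a, MyFile b) {
            return
                a.compareName == b.compareName &&
                a.size == b.size &&
                a.dateCreated == b.dateCreated;
        }

        // Combines the hash codes of the same three fields that Equals compares.
        public int GetHashCode(MyFile a) {
            // unchecked: the multiplication is expected to overflow and wrap around.
            unchecked {
                return
                    (a.compareName.GetHashCode() * 251 + a.size.GetHashCode()) * 251 +
                    a.dateCreated.GetHashCode();
            }
        }

    }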
Now you can use this with methods like Intersect to get all items that exist in both lists, or Except to get all items that exist in one list but not the other:

    List<MyFile> tempList =
        s.lstFiles.Intersect(d.lstFiles, new MyFileComparer()).ToList();
Because these methods can use the hash code to divide the items into buckets, far fewer comparisons need to be done than with a naive join, which has to compare every item in one list to every item in the other list.
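To show the comparer used with Except rather than Intersect, here is a self-contained sketch. The MyFile class is not shown in this thread, so the definition below is a hypothetical stand-in with the three fields the comparer reads (the field types are assumptions); the CompareDemo class, the OnlyInFirst method and the sample data are also made up. The comparer adds null checks that the version above does not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MyFile; the field types are assumptions.
public class MyFile
{
    public string compareName;
    public long size;
    public DateTime dateCreated;
}

public class MyFileComparer : IEqualityComparer<MyFile>
{
    public bool Equals(MyFile a, MyFile b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        return a.compareName == b.compareName
            && a.size == b.size
            && a.dateCreated == b.dateCreated;
    }

    public int GetHashCode(MyFile a)
    {
        unchecked // let the arithmetic wrap around instead of throwing on overflow
        {
            int hash = a.compareName == null ? 0 : a.compareName.GetHashCode();
            hash = hash * 251 + a.size.GetHashCode();
            return hash * 251 + a.dateCreated.GetHashCode();
        }
    }
}

public static class CompareDemo
{
    // Items that exist in first but have no equal counterpart in second.
    public static List<MyFile> OnlyInFirst(List<MyFile> first, List<MyFile> second)
    {
        return first.Except(second, new MyFileComparer()).ToList();
    }

    public static void Main()
    {
        var stamp = new DateTime(2020, 1, 1);
        var first = new List<MyFile>
        {
            new MyFile { compareName = "a.txt", size = 10, dateCreated = stamp },
            new MyFile { compareName = "b.txt", size = 20, dateCreated = stamp },
        };
        var second = new List<MyFile>
        {
            new MyFile { compareName = "b.txt", size = 20, dateCreated = stamp },
            new MyFile { compareName = "c.txt", size = 30, dateCreated = stamp },
        };

        foreach (var file in OnlyInFirst(first, second))
        {
            Console.WriteLine(file.compareName); // prints "a.txt"
        }
    }
}
```

Except builds a hash set from the second list and then checks each item of the first list against it, so the work grows roughly with the combined size of the two lists rather than with their product.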