How to subtract one huge list from another efficiently in C#


I have a very long list of Ids (integers) that represents all the items that are currently in my database:

var idList = GetAllIds();

I also hav

4 answers

迷失自我

2021-02-07 00:41
LINQ could help:
```
itemsToAdd.Except(idList)
```
Your code is slow because List<T>.Contains is O(n), so the total cost is O(itemsToAdd.Count * idList.Count).

You can turn idList into a HashSet<T>, whose Contains is O(1) on average. Or just use the LINQ Except extension method, which builds a set internally and does this for you.

Note that Except also removes duplicates from the left side: new int[]{1,1,2}.Except(new int[]{2}) yields just {1}, because the second 1 is dropped. I assume that is no problem in your case, since IDs are typically unique.
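To make the duplicate behaviour concrete, here is a minimal sketch that compares Except with a manual HashSet<int> filter. The class and method names (SubtractDemo, SubtractDistinct, SubtractKeepingDuplicates) are made up for illustration and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class SubtractDemo
{
    // Set difference via LINQ: duplicates in `left` are collapsed.
    public static List<int> SubtractDistinct(IEnumerable<int> left, IEnumerable<int> right)
    {
        return left.Except(right).ToList();
    }

    // Filter against a HashSet: duplicates in `left` are kept.
    public static List<int> SubtractKeepingDuplicates(IEnumerable<int> left, IEnumerable<int> right)
    {
        // Build the lookup set once; each Contains call is then O(1) on average.
        var exclude = new HashSet<int>(right);
        return left.Where(x => !exclude.Contains(x)).ToList();
    }
}
```

With left = {1, 1, 2} and right = {2}, SubtractDistinct returns a single 1, while SubtractKeepingDuplicates returns both. If the IDs really are unique, the two approaches give the same result.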
长发绾君心

2021-02-07 00:44
Assuming the following premises are true:
- idList and itemsToAdd do not contain duplicate values
- you are using .NET Framework 4.0
you could use a HashSet<T> this way:
```
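// Copy the candidates into a set (element type int, since the Ids are integers).
var itemsToAddSet = new HashSet<int>(itemsToAdd);
// Remove, in place, every Id that already exists in the database.
itemsToAddSet.ExceptWith(idList);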
```
According to the documentation, the HashSet<T>.ExceptWith method is pretty efficient:

This method is an O(n) operation, where n is the number of elements in the other parameter.

In your case, n is the number of items in idList.
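The in-place approach can be wrapped up roughly as follows. This is only a sketch: the ExceptWithDemo class and the NewIds method are hypothetical names, and it assumes both inputs are sequences of int.

```csharp
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class ExceptWithDemo
{
    // Returns the Ids from itemsToAdd that are not yet present in idList.
    public static HashSet<int> NewIds(IEnumerable<int> itemsToAdd, IEnumerable<int> idList)
    {
        // Copying into a set drops any duplicates in itemsToAdd.
        var remaining = new HashSet<int>(itemsToAdd);

        // Walks idList once and removes each of its elements from the set.
        remaining.ExceptWith(idList);
        return remaining;
    }
}
```

One reason to prefer this shape is that only the smaller collection needs to be copied into a set; the huge idList is just enumerated once.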
逝去的感伤

2021-02-07 00:49
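Temporarily convert idList to a HashSet<T> and use the same method as before, i.e.:
```
var idListHash = new HashSet<int>(idList);
items.RemoveAll(e => idListHash.Contains(e.Id));
```
It should be much faster, since each lookup no longer scans the whole list.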
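For completeness, a self-contained version of that idea might look like the sketch below. The Item class with an integer Id property is an assumption about the shape of the items (the question is truncated here), and RemoveAllDemo and RemoveExisting are hypothetical names.

```csharp
using System.Collections.Generic;

// Assumed item shape: anything with an integer Id would do.
public sealed class Item
{
    public int Id { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class RemoveAllDemo
{
    // Removes every item whose Id already exists in idList.
    // Returns the number of items removed.
    public static int RemoveExisting(List<Item> items, IEnumerable<int> idList)
    {
        // Build the lookup set once, outside the predicate.
        var idListHash = new HashSet<int>(idList);
        return items.RemoveAll(e => idListHash.Contains(e.Id));
    }
}
```

Unlike the set-based answers, this keeps the original List<Item> (and its order) and only drops the entries that are already in the database.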
花落未央

2021-02-07 00:49

You should use two HashSet<int>s.
Note that a HashSet<int> holds only unique values and does not preserve order.
