LINQ Performance for Large Collections

梦毁少年i 2021-02-02 11:48

I have a large collection of strings (up to 1M) alphabetically sorted. I have experimented with LINQ queries against this collection using HashSet, SortedDictionary, and Dictio

6 Answers
  •  野趣味 (OP)
    2021-02-02 12:13

    If you're doing a "starts with" match, you only care about ordinal comparisons, and you can keep the collection sorted (again in ordinal order), then I would suggest storing the values in a list. You can then binary search for the first value which starts with the right prefix, and go down the list linearly, yielding results until you reach the first value which doesn't start with that prefix.
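
    A minimal sketch of that idea, assuming the list was sorted with StringComparer.Ordinal. The class and method names (PrefixScan, StartingWith) are made up for illustration; List<T>.BinarySearch and the ordinal comparers are the standard .NET ones.

```csharp
using System;
using System.Collections.Generic;

static class PrefixScan
{
    // Yields every value that starts with `prefix`.
    // Assumes `sorted` is already in ordinal order.
    public static IEnumerable<string> StartingWith(List<string> sorted, string prefix)
    {
        // BinarySearch returns the index of a match, or the bitwise complement
        // of the index of the first larger element when nothing matches exactly.
        int index = sorted.BinarySearch(prefix, StringComparer.Ordinal);
        if (index < 0)
        {
            index = ~index;
        }
        else
        {
            // With duplicates, BinarySearch may land on any of the equal
            // entries, so step back to the first one.
            while (index > 0 && string.Equals(sorted[index - 1], prefix, StringComparison.Ordinal))
            {
                index--;
            }
        }

        // Walk forward until the prefix stops matching.
        for (; index < sorted.Count; index++)
        {
            if (!sorted[index].StartsWith(prefix, StringComparison.Ordinal))
            {
                yield break;
            }
            yield return sorted[index];
        }
    }
}
```

    The search works because, in ordinal order, the prefix itself sorts before or equal to every string that starts with it, so all matches sit in one contiguous run beginning at the search position.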

    In fact, you could probably do another binary search for the first value which doesn't start with the prefix, so you'd have a start and an end point. Then you just need to apply the length criterion to that matching portion. (I'd hope that if it's sensible data, the prefix matching is going to get rid of most candidate values.) The way to find the first value which doesn't start with the prefix is to search for the lexicographically-first value which doesn't - e.g. with a prefix of "ABC", search for "ABD".
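
    The two-search variant could look roughly like this. It is a sketch under the same assumption (ordinal sort order); PrefixRange, LowerBound and the returned (Start, End) pair are hypothetical names, and the lower-bound search is hand-written so that duplicate entries don't matter.

```csharp
using System;
using System.Collections.Generic;

static class PrefixRange
{
    // Returns the half-open index range [Start, End) of values starting with
    // `prefix`. Assumes `sorted` is in ordinal order.
    public static (int Start, int End) Find(IReadOnlyList<string> sorted, string prefix)
    {
        int start = LowerBound(sorted, prefix);

        // Build the lexicographically-first string that does NOT start with the
        // prefix: drop trailing char.MaxValue characters (they cannot be
        // incremented), then bump the last remaining one. "ABC" becomes "ABD".
        int last = prefix.Length - 1;
        while (last >= 0 && prefix[last] == char.MaxValue)
        {
            last--;
        }
        if (last < 0)
        {
            // Empty prefix (or nothing left to increment): no upper bound exists.
            return (start, sorted.Count);
        }

        string upper = prefix.Substring(0, last) + (char)(prefix[last] + 1);
        return (start, LowerBound(sorted, upper));
    }

    // Index of the first element that is >= value in ordinal order.
    private static int LowerBound(IReadOnlyList<string> sorted, string value)
    {
        int lo = 0, hi = sorted.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (string.CompareOrdinal(sorted[mid], value) < 0)
            {
                lo = mid + 1;
            }
            else
            {
                hi = mid;
            }
        }
        return lo;
    }
}
```

    For the ordinally sorted list "ABB", "ABC", "ABCD", "ABD" and the prefix "ABC", Find returns (1, 3), i.e. "ABC" and "ABCD". Any further filter, such as the length check, then only has to run over indices Start to End - 1.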

    None of this uses LINQ, and it's all very specific to your particular case, but it should work. Let me know if any of this doesn't make sense.
