What is the simplest way to achieve O(n) performance when creating the union of 3 IEnumerables?

心在旅途 2021-02-07 00:51

Say a, b, and c are all List<T> instances and I want to create an unsorted union of them. Although performance isn't super-critical, each list might have 10,000 entries.

3 Answers
  •  借酒劲吻你
    2021-02-07 01:43

    You should use Enumerable.Union because it is as efficient as the HashSet approach. Complexity is O(n+m) because:

    Enumerable.Union

    When the object returned by this method is enumerated, Union enumerates first and second in that order and yields each element that has not already been yielded.

    Source-code here.
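
    To illustrate why a single Union call is linear, here is a minimal sketch of the behaviour quoted above. UnionSketch is a hypothetical helper written for this answer, not the actual framework source, and the sample lists are made up. It assumes a C# version that supports top-level statements.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical sample data, not taken from the question.
    var first = new List<int> { 3, 1, 2, 3 };
    var second = new List<int> { 2, 4, 1, 5 };

    // Simplified stand-in for Enumerable.Union (an illustration, not the real source).
    // One HashSet remembers what was already yielded; HashSet<T>.Add returns false
    // for duplicates, so each element costs O(1) on average and the total is O(n+m).
    static IEnumerable<T> UnionSketch<T>(IEnumerable<T> xs, IEnumerable<T> ys)
    {
        var seen = new HashSet<T>();
        foreach (var x in xs)
            if (seen.Add(x)) yield return x;
        foreach (var y in ys)
            if (seen.Add(y)) yield return y;
    }

    var viaSketch = UnionSketch(first, second).ToList();
    var viaLinq = first.Union(second).ToList();

    Console.WriteLine(string.Join(", ", viaSketch));      // 3, 1, 2, 4, 5
    Console.WriteLine(viaSketch.SequenceEqual(viaLinq));  // True
    ```

    Elements come out in first-seen order, first from the first sequence and then from the second, which matches the documented behaviour.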


    Ivan is right: there is an overhead if you use Enumerable.Union with multiple collections, since a new set must be created for every chained call. So it might be more efficient (in terms of memory consumption) to use one of these approaches:

    1. Concat + Distinct:

      a.Concat(b).Concat(c)...Concat(x).Distinct()
      
    2. Union + Concat

      a.Union(b.Concat(c)...Concat(x))
      
    3. HashSet constructor that takes an IEnumerable<T> (e.g. with int):

      new HashSet<int>(a.Concat(b).Concat(c)...Concat(x))
      

    The difference between the first two might be negligible. The third approach does not use deferred execution; it creates a HashSet<> in memory immediately. It's a good and efficient way if (1) you need this collection type or (2) this is the final operation on the query. But if you need to do further operations on this chained query, you should prefer either Concat + Distinct or Union + Concat.
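
    The three approaches can be put side by side in a short sketch. The sample lists are hypothetical stand-ins for a, b, and c, and int is assumed as the element type. Adding an element after the queries are defined makes the deferred/eager difference visible.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical sample lists standing in for a, b, c from the question.
    var a = new List<int> { 1, 2, 3 };
    var b = new List<int> { 3, 4, 5 };
    var c = new List<int> { 5, 6, 1 };

    // 1. Concat + Distinct: deferred; one set is built when the query is enumerated.
    IEnumerable<int> concatDistinct = a.Concat(b).Concat(c).Distinct();

    // 2. Union + Concat: deferred as well, also with a single set.
    IEnumerable<int> unionConcat = a.Union(b.Concat(c));

    // 3. HashSet constructor: runs immediately and holds the result in memory.
    var hashSet = new HashSet<int>(a.Concat(b).Concat(c));

    // Change a source after the queries were defined.
    a.Add(7);

    Console.WriteLine(concatDistinct.Count()); // 7 (deferred query sees the new element)
    Console.WriteLine(unionConcat.Count());    // 7 (deferred query sees the new element)
    Console.WriteLine(hashSet.Count);          // 6 (snapshot taken before the Add)
    ```

    All three visit each input element once, so for three lists of about 10,000 entries each the work stays linear in the total number of elements.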
