Fastest way to find common items across multiple lists in C#

Given the following:

List<List<Option>> optionLists;

What would be a quick way to determine the subset of Option objects that appear in all of the lists?

11 answers

心在旅途

2021-01-19 02:40
Fastest to write :)
```
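// The accumulator must be typed as IEnumerable<Option>, because Intersect returns IEnumerable<Option>, not List<Option>.
var subset = optionLists.Aggregate<IEnumerable<Option>>((x, y) => x.Intersect(y));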
```
北荒

2021-01-19 02:41
Building on Matt's answer, since we are only interested in options that all lists have in common, we can simply check for any options in the first list that the others share:
```
var sharedOptions =
    from option in optionLists.First( ).Distinct( )
    where optionLists.Skip( 1 ).All( l => l.Contains( option ) )
    select option;
```
If an option list cannot contain duplicate entries, the Distinct call is unnecessary. If the lists vary greatly in size, it would be better to iterate over the options in the shortest list, rather than whatever list happens to be First. Sorted or hashed collections could be used to improve the lookup time of the Contains call, though it should not make much difference for a moderate number of items.
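The two suggestions above (start from the shortest list, and use hashed lookups) could be combined roughly as follows. This is only a sketch: the class name `SharedOptionsSketch` and the method `FindShared` are made up for illustration, and it assumes the default equality of `T` is the comparison you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SharedOptionsSketch
{
    // Hypothetical helper, not part of the original answer.
    public static List<T> FindShared<T>(List<List<T>> lists)
    {
        if (lists == null || lists.Count == 0)
            return new List<T>();

        // The result can never be larger than the shortest list, so iterate over that one.
        List<T> shortest = lists.OrderBy(l => l.Count).First();

        // Hash every other list once so that each Contains call is O(1) on average.
        List<HashSet<T>> others = lists
            .Where(l => !ReferenceEquals(l, shortest))
            .Select(l => new HashSet<T>(l))
            .ToList();

        return shortest
            .Distinct()
            .Where(item => others.All(set => set.Contains(item)))
            .ToList();
    }
}
```

For a handful of short lists the plain `List<T>.Contains` version is likely just as fast; building the hash sets only pays off once the lists get large.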

夕颜

2021-01-19 02:48

Here's a much more efficient implementation:

```
static SortedDictionary<T,bool>.KeyCollection FindCommon<T> (List<List<T>> items)
{
  SortedDictionary<T, bool>
    current_common = new SortedDictionary<T, bool> (),
    common = new SortedDictionary<T, bool> ();

  foreach (List<T> list in items)
  {
    if (current_common.Count == 0)
    {
      foreach (T item in list)
      {
        common [item] = true;
      }
    }
    else
    {
      foreach (T item in list)
      {
        if (current_common.ContainsKey(item))
          common[item] = true;
        else
          common[item] = false;
      }
    }

    if (common.Count == 0)
    {
      current_common.Clear ();
      break;
    }

    SortedDictionary<T, bool>
      swap = current_common;

    current_common = common;
    common = swap;
    common.Clear ();
  }

  return current_common.Keys;
}
```

It works by keeping a set of the items common to all lists processed so far and comparing each list against that set, building a temporary set of the items shared by the current list and the running set. This is effectively O(n·m), where n is the number of lists and m is the number of items in each list (each SortedDictionary lookup or insert adds an O(log m) factor on top).

An example of using it:

```
static void Main (string [] args)
{
  Random
    random = new Random();

  List<List<int>>
    items = new List<List<int>>();

  for (int i = 0 ; i < 10 ; ++i)
  {
    List<int>
      list = new List<int> ();

    items.Add (list);

    for (int j = 0 ; j < 100 ; ++j)
    {
      list.Add (random.Next (70));
    }
  }

  SortedDictionary<int, bool>.KeyCollection
    common = FindCommon (items);

  foreach (List<int> list in items)
  {
    list.Sort ();
  }

  for (int i = 0 ; i < 100 ; ++i)
  {
    for (int j = 0 ; j < 10 ; ++j)
    {
      System.Diagnostics.Trace.Write (String.Format ("{0,-4:D} ", items [j] [i]));
    }

    System.Diagnostics.Trace.WriteLine ("");
  }

  foreach (int item in common)
  {
    System.Diagnostics.Trace.WriteLine (String.Format ("{0,-4:D} ", item));
  }
}
```


盖世英雄少女心

2021-01-19 02:50
Sort, then do something akin to the merge step of a merge sort.

Basically you would do this:
1. Retrieve the first item from each list.
2. Compare the items; if they are all equal, output the item.
3. If any of the items sort before the others, retrieve a new item from the corresponding list to replace it; otherwise, retrieve new items from all the lists to replace them all.
4. As long as you still have items, go back to step 2.
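The steps above might be sketched like this. It is one possible reading of the idea, not the answerer's own code: the names `SortedMergeSketch` and `IntersectSorted` are invented here, and the sketch assumes the items are non-null and implement `IComparable<T>`. It sorts copies of the lists, keeps one cursor per list, and stops as soon as any list runs out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SortedMergeSketch
{
    // Hypothetical helper; assumes non-null items with a natural ordering.
    public static List<T> IntersectSorted<T>(List<List<T>> lists) where T : IComparable<T>
    {
        var result = new List<T>();
        if (lists == null || lists.Count == 0)
            return result;

        // Sort copies so the caller's lists are left untouched.
        List<List<T>> sorted = lists.Select(l => l.OrderBy(x => x).ToList()).ToList();
        int[] pos = new int[sorted.Count]; // one cursor per list

        while (true)
        {
            // Once any list is exhausted, no further common items are possible.
            for (int i = 0; i < sorted.Count; i++)
                if (pos[i] >= sorted[i].Count)
                    return result;

            // Find the largest of the items currently under the cursors.
            T max = sorted[0][pos[0]];
            for (int i = 1; i < sorted.Count; i++)
                if (sorted[i][pos[i]].CompareTo(max) > 0)
                    max = sorted[i][pos[i]];

            // Advance every cursor that sits on an item sorting before the largest one.
            bool allEqual = true;
            for (int i = 0; i < sorted.Count; i++)
            {
                if (sorted[i][pos[i]].CompareTo(max) < 0)
                {
                    pos[i]++;
                    allEqual = false;
                }
            }

            if (allEqual)
            {
                // All cursors agree: output the item once, then advance every list.
                if (result.Count == 0 || result[result.Count - 1].CompareTo(max) != 0)
                    result.Add(max);
                for (int i = 0; i < sorted.Count; i++)
                    pos[i]++;
            }
        }
    }
}
```

If the lists are already sorted, the `OrderBy` step can be dropped and the merge itself is linear in the total number of items.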

旧时难觅i

2021-01-19 02:54

@Skizz The method is not correct. It also returns items that are not common to all the lists in items. Here is the corrected method:

```
// Requires: using System.Collections.Generic;

/// <summary>
/// The method FindAllCommonItemsInAllTheLists returns a HashSet that contains all the common items in the lists contained in listOfLists,
/// regardless of the order of the items in the various lists.
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="listOfLists"></param>
/// <returns></returns>
public static HashSet<T> FindAllCommonItemsInAllTheLists<T>(List<List<T>> listOfLists)
{
    if (listOfLists == null || listOfLists.Count == 0)
    {
        return null;
    }
    HashSet<T> currentCommon = new HashSet<T>();
    HashSet<T> common = new HashSet<T>();

    foreach (List<T> currentList in listOfLists)
    {
        if (currentCommon.Count == 0)
        {
            // First list: every item is a candidate.
            foreach (T item in currentList)
            {
                common.Add(item);
            }
        }
        else
        {
            // Later lists: keep only items already common to all previous lists.
            foreach (T item in currentList)
            {
                if (currentCommon.Contains(item))
                {
                    common.Add(item);
                }
            }
        }
        if (common.Count == 0)
        {
            currentCommon.Clear();
            break;
        }
        currentCommon.Clear(); // Empty currentCommon for a new iteration.

        // Copy all the items contained in common to currentCommon.
        //     currentCommon = common;
        // does not work, because currentCommon and common would then point at the same object
        // and the next statement, common.Clear(), would also clear currentCommon.
        foreach (T item in common)
        {
            if (!currentCommon.Contains(item))
            {
                currentCommon.Add(item);
            }
        }
        common.Clear();
    }

    return currentCommon;
}
```


后悔当初

2021-01-19 02:57

What about using a HashSet? That way you can do what you want in O(n), where n is the number of items in all the lists combined, and I think that's the fastest way to do it.

You just have to iterate over every list and insert the values you find into the HashSet. When you insert a key that already exists, the Add method returns false; otherwise it returns true.
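One way a hash-set approach could look is sketched below. Note that it swaps in `HashSet<T>.IntersectWith` rather than checking the return value of `Add`, since intersecting handles duplicates within a single list without extra bookkeeping. The names `HashSetSketch` and `FindCommon` are made up for illustration, and the default equality of `T` is assumed to be the comparison you want.

```csharp
using System;
using System.Collections.Generic;

static class HashSetSketch
{
    // Hypothetical helper, not taken from the answer above.
    public static HashSet<T> FindCommon<T>(List<List<T>> lists)
    {
        HashSet<T> common = null;

        foreach (List<T> list in lists)
        {
            if (common == null)
                common = new HashSet<T>(list);   // seed with the first list
            else
                common.IntersectWith(list);      // keep only items also present in this list

            if (common.Count == 0)
                break;                           // nothing can be common any more
        }

        return common ?? new HashSet<T>();
    }
}
```

Each list is visited once, so the work grows roughly linearly with the total number of items across all lists.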
