Remove duplicates from a List in C#

广开言路 2020-11-22 04:41

Anyone have a quick method for de-duplicating a generic List in C#?

27 Answers
  • 2020-11-22 05:32

    Simply initialize a HashSet with a List of the same type:

    var noDupes = new HashSet<T>(withDupes);
    

    Or, if you want a List returned:

    var noDupsList = new HashSet<T>(withDupes).ToList();
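
As a small illustration of the two one-liners above, here is a minimal sketch using a `List<int>`. The sample values are made up for the example, and the final sort is an added assumption: `HashSet<T>` is not documented to enumerate in any particular order, so the sketch sorts before printing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, not from the original answer.
var withDupes = new List<int> { 3, 1, 3, 2, 1 };

// HashSet<T> keeps one copy of each value, as decided by the default equality comparer.
var noDupes = new HashSet<int>(withDupes);

// No enumeration order is promised for a HashSet<T>, so sort if order matters.
List<int> noDupsList = noDupes.OrderBy(x => x).ToList();

Console.WriteLine(string.Join(", ", noDupsList)); // prints "1, 2, 3"
```

If the original insertion order has to survive, this approach is probably not the right fit as written.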
    
  • 2020-11-22 05:33

    This takes the distinct elements (dropping any repeats) and converts the result back into a list:

    List<type> myNoneDuplicateValue = listValueWithDuplicate.Distinct().ToList();
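
`Distinct()` also has an overload that takes an `IEqualityComparer<T>`, which helps when "duplicate" should mean something other than default equality. The sketch below uses the built-in `StringComparer.OrdinalIgnoreCase`; the sample names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: the same names in different casing.
var names = new List<string> { "Ann", "bob", "ANN", "Bob", "Cy" };

// Treat strings that differ only by case as duplicates.
List<string> distinctNames = names
    .Distinct(StringComparer.OrdinalIgnoreCase)
    .ToList();

Console.WriteLine(distinctNames.Count); // prints 3
```

Which casing of each name is kept depends on the order in which `Distinct()` yields elements, so it is safer not to rely on that unless you have checked it for your runtime.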
    
  • 2020-11-22 05:33

    All the other answers copy the list, create a new one, or rely on slow functions.

    To my understanding, this is the fastest and cheapest method I know (it is also backed by a very experienced programmer who specializes in real-time physics optimization). The code below is written for a List<int> and works in place.

    // Sorting places equal items next to each other: O(n log n)
    list.Sort();

    // Last distinct value seen so far. A flag records whether it has been set,
    // so no sentinel value (such as -1) can be mistaken for real data.
    int lastItem = 0;
    bool hasLastItem = false;

    int size = list.Count;

    // Index of the last item we want to keep in the list
    int last = size - 1;

    // Walk the items from last to first: O(n)
    for (int i = last; i >= 0; --i)
    {
        int currItem = list[i];

        // If this item equals the previously seen value, we don't want it
        if (hasLastItem && currItem == lastItem)
        {
            // Overwrite it with the item at 'last'. It would be a swap, but we don't need the duplicate
            list[i] = list[last];

            // Reduce the last index, we don't want that slot anymore
            last--;
        }
        // A new value: store it and continue
        else
        {
            lastItem = currItem;
            hasLastItem = true;
        }
    }

    // We now have an unsorted list with the leftover slots at the end.

    // Remove the tail in a single call
    list.RemoveRange(last + 1, size - last - 1);

    // Sort again: O(n log n)
    list.Sort();
    

    Final cost is:

    n log n + n + n log n = n + 2n log n = O(n log n), which is pretty nice.

    Note about RemoveRange: since we cannot set the count of the list directly, and we want to avoid calling the Remove functions repeatedly, this is the only removal call used. I don't know the exact speed of this operation, but I guess it is the fastest way.
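
For comparison, here is a sketch of a related variant, not the method from the answer above: after sorting, unique items are compacted toward the front with a separate write index, so the list stays sorted and the second sort is not needed. The helper name `SortAndDedupe` is hypothetical, and the sketch assumes that `Comparer<T>.Default` and `EqualityComparer<T>.Default` agree on which items are equal (true for `int` and `string`, not guaranteed for every custom type).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: sorts the list, then removes adjacent duplicates in place.
static void SortAndDedupe<T>(List<T> list)
{
    list.Sort(); // uses Comparer<T>.Default
    if (list.Count < 2) return;

    var equality = EqualityComparer<T>.Default;

    // 'write' is the index where the next unique item will be stored.
    int write = 1;
    for (int read = 1; read < list.Count; read++)
    {
        // Keep the item only if it differs from the last unique item kept.
        if (!equality.Equals(list[read], list[write - 1]))
        {
            list[write] = list[read];
            write++;
        }
    }

    // Drop the leftover tail in one call.
    list.RemoveRange(write, list.Count - write);
}

var list = new List<int> { -1, 5, -1, 3, 5 };
SortAndDedupe(list);
Console.WriteLine(string.Join(", ", list)); // prints "-1, 3, 5"
```

Like the original, this changes the order of the list to sorted order, so it only suits cases where the original ordering does not matter.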
