Remove duplicates from a list of objects

悲&欢浪女 2021-01-20 11:52

I have a class MyObject with the fields id, a, b, c, e, f, and a List with 500,000 items. How can I remove all duplicate items, where two items count as duplicates if they have the same values for a, c and f?

5 Answers
  • 2021-01-20 12:17

    You can always use LINQ's Distinct() like this:

    var matches = list.Distinct(new Comparer()).ToList();
    

    But for Distinct() to work this way, you need to implement an IEqualityComparer for your class:

    class Comparer : IEqualityComparer<MyObject>
    {
        public bool Equals(MyObject x, MyObject y)
        {
            return x.a == y.a && x.c == y.c && x.f == y.f;
        }
    
        public int GetHashCode(MyObject obj)
        {
            return (obj.a + obj.c + obj.f).GetHashCode();
        }
    }
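
    The hash in the comparer above is valid, but it can collide more often than necessary: if the fields are strings, for example, ("ab", "c") and ("a", "bc") concatenate to the same text. A minimal sketch of an alternative that hashes the three fields as a value tuple follows. The string field types and the name AcfComparer are assumptions made for illustration, not part of the question.

```csharp
using System.Collections.Generic;

// Hypothetical shape of MyObject; the string field types are an assumption.
public class MyObject
{
    public int id;
    public string a, b, c, e, f;
}

// Compares objects by (a, c, f) only and hashes the same three fields.
public sealed class AcfComparer : IEqualityComparer<MyObject>
{
    public bool Equals(MyObject x, MyObject y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return (x.a, x.c, x.f).Equals((y.a, y.c, y.f));
    }

    // A value tuple combines the hash codes of its elements, so the field
    // boundaries are not lost the way they are with concatenation.
    public int GetHashCode(MyObject obj) => (obj.a, obj.c, obj.f).GetHashCode();
}
```

    It plugs into the same call as before: list.Distinct(new AcfComparer()).ToList().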
    
  • 2021-01-20 12:23

    If what you are looking for is speed, and you don't mind using some extra memory, I would recommend a HashSet. If you need a custom comparison, you can write an IEqualityComparer<T> and pass it to the set, something like this:

    var original = new List<YourClass>(); // whatever your original collection is
    var unique = new HashSet<YourClass>(new MyCustomEqualityComparer());
    
    foreach (var item in original)
    {
        // Add returns false and leaves the set unchanged if an equal item is already present
        unique.Add(item);
    }
    
    return unique;
    

    The drawback is that you may end up using up to twice the original memory.

    Update:

    I did some more research, and I think you can achieve what you want by simply doing:

    var original = new List<YourClass>(); // your original data
    var unique = new HashSet<YourClass>(original, new MyCustomEqualityComparer());
    

    That should take care of removing the duplicated data, since a HashSet does not allow duplicates. I'd recommend that you also take a look at this question about GetHashCode implementation guidelines.
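
    If the duplicates should be removed from the existing list rather than copied into a new collection, the same idea can be applied in place. This is only a sketch: MyObject with string fields and the helper name RemoveDuplicatesInPlace are assumptions for illustration. It stores just the (a, c, f) keys in the set, so the extra memory is limited to the keys.

```csharp
using System.Collections.Generic;

// Hypothetical shape of the class; the string field types are an assumption.
public class MyObject
{
    public int id;
    public string a, b, c, e, f;
}

public static class Dedup
{
    // Removes repeated (a, c, f) combinations from the list itself
    // and returns how many items were removed.
    public static int RemoveDuplicatesInPlace(List<MyObject> items)
    {
        var seen = new HashSet<(string, string, string)>();
        // HashSet.Add returns false when the key is already present,
        // so the predicate is true exactly for repeated keys.
        return items.RemoveAll(item => !seen.Add((item.a, item.c, item.f)));
    }
}
```

    In practice RemoveAll visits the items in list order, so the first occurrence of each combination is the one that stays.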

    If you want to know some more about the HashSet class follow these links:

    About HashSet
    About IEqualityComparer constructor
    IEqualityComparer documentation

    hope this helps

  • 2021-01-20 12:23

    One efficient method would be to first do a quicksort (or a similar O(n log n) sort) keyed on a hash of (a, c, f). You can then iterate through the sorted list, picking one item every time the value of (a, c, f) changes.

    That gives an O(n log n) solution, which is probably the best you can do with a sort-based approach.
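
    A minimal sketch of the sort-then-scan idea, under some assumptions: the fields are strings, and the sort key is (a, c, f) itself rather than a hash of it, because two different combinations can share a hash and would then interleave in the sorted list. The names MyObject and DistinctBySorting are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the class; the string field types are an assumption.
public class MyObject
{
    public int id;
    public string a, b, c, e, f;
}

public static class SortDedup
{
    public static List<MyObject> DistinctBySorting(IEnumerable<MyObject> items)
    {
        // Sort by the key fields so that equal (a, c, f) values become adjacent.
        var sorted = items
            .OrderBy(o => o.a, StringComparer.Ordinal)
            .ThenBy(o => o.c, StringComparer.Ordinal)
            .ThenBy(o => o.f, StringComparer.Ordinal)
            .ToList();

        var result = new List<MyObject>();
        foreach (var item in sorted)
        {
            // Keep an item only when its key differs from the last kept item.
            var last = result.Count > 0 ? result[result.Count - 1] : null;
            if (last == null || last.a != item.a || last.c != item.c || last.f != item.f)
                result.Add(item);
        }
        return result;
    }
}
```

    Note that the result comes back in key order, not in the original list order.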

  • 2021-01-20 12:26

    Drakko, you can use the Distinct() method to get only the items that have different values for the properties you specify.
    You could do something like this:

    List<MyObj> list = new List<MyObj>();
    
    //Run the code that is going to populate your list.
    var result = list.Select(myObj => new { myObj.a, myObj.c, myObj.f})
                     .Distinct().ToList();
    
    //result holds one anonymous { a, c, f } object per distinct combination, not the original MyObj instances.
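
    If the original objects are needed rather than the projected values, one option is Enumerable.DistinctBy, which is available from .NET 6 onward. This is a sketch under assumptions: MyObj is given hypothetical string fields, and the helper name DistinctByAcf is made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of MyObj; the string field types are an assumption.
public class MyObj
{
    public int id;
    public string a, b, c, e, f;
}

public static class DistinctExample
{
    // Returns one whole MyObj per distinct (a, c, f) combination.
    // Requires .NET 6 or later for Enumerable.DistinctBy.
    public static List<MyObj> DistinctByAcf(IEnumerable<MyObj> list) =>
        list.DistinctBy(o => (o.a, o.c, o.f)).ToList();
}
```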
    
  • 2021-01-20 12:39

    Code from this link worked great for me. https://nishantrana.me/2014/08/14/remove-duplicate-objects-in-list-in-c/

    public class MyClass
    {
        public string ID { get; set; }
        public string Value { get; set; }
    }
    
    List<MyClass> myList = new List<MyClass>();
    var xrmOptionSet = new MyClass();
    xrmOptionSet.ID = "1";
    xrmOptionSet.Value = "100";
    var xrmOptionSet1 = new MyClass();
    xrmOptionSet1.ID = "2";
    xrmOptionSet1.Value = "200";
    var xrmOptionSet2 = new MyClass();
    xrmOptionSet2.ID = "1";
    xrmOptionSet2.Value = "100";
    myList.Add(xrmOptionSet);
    myList.Add(xrmOptionSet1);
    myList.Add(xrmOptionSet2);
    
    // here we first group the items by ID and then pick the first item from each group
    var myDistinctList = myList.GroupBy(i => i.ID)
                               .Select(g => g.First()).ToList();
    