Using Distinct with LINQ and Objects

余生分开走 2021-01-03 23:20

Until recently, I was using a Distinct in LINQ to select a distinct category (an enum) from a table. This was working fine.

I now need it to be distinct on a class.

5 Answers
  • 2021-01-03 23:21

    Try an IEqualityComparer<T>:

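    public class MyObjEqualityComparer : IEqualityComparer<MyObj>
    {
        public bool Equals(MyObj x, MyObj y)
        {
            if (ReferenceEquals(x, y)) return true;
            if (x == null || y == null) return false;
            // object.Equals is null-safe, unlike x.Country.Equals(...)
            return object.Equals(x.Category, y.Category) &&
                   object.Equals(x.Country, y.Country);
        }
    
        public int GetHashCode(MyObj obj)
        {
            // Hash the same members that Equals compares. Returning
            // obj.GetHashCode() would use the default reference-based hash,
            // so equal objects would never be compared with Equals.
            if (obj == null) return 0;
            return new { obj.Category, obj.Country }.GetHashCode();
        }
    }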
    

    Then use it like this:

    var comparer = new MyObjEqualityComparer();
    var distinctObjs = myObjs.Where(m => m.SomeProperty == "whatever").Distinct(comparer);
    
  • 2021-01-03 23:29

    I believe this post explains your problem: http://blog.jordanterrell.com/post/LINQ-Distinct()-does-not-work-as-expected.aspx

    The content of the above link can be summed up by saying that the Distinct() method can be replaced by doing the following.

    var distinctItems = items
           .GroupBy(x => x.PropertyToCompare)
           .Select(x => x.First());
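
    When the items should be distinct on more than one property, the same GroupBy approach can use an anonymous type as the key. The following is a minimal sketch, not the linked post's code: the Category enum, the MyObj class, and the GroupByDistinctDemo helper are hypothetical stand-ins, since the question does not show its class.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical types for illustration only.
    public enum Category { Books, Music }

    public class MyObj
    {
        public Category Category { get; set; }
        public string Country { get; set; }
        public string Name { get; set; }
    }

    public static class GroupByDistinctDemo
    {
        // Keeps the first item seen for each (Category, Country) pair.
        public static List<MyObj> DistinctByCategoryAndCountry(IEnumerable<MyObj> items)
        {
            return items
                // Anonymous types implement Equals/GetHashCode by value,
                // so they can serve as a composite grouping key.
                .GroupBy(x => new { x.Category, x.Country })
                .Select(g => g.First())
                .ToList();
        }
    }
    ```

    In LINQ to Objects, groups come back in the order their keys first appear, so the surviving item for each key is the first one in the source sequence.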
    
  • 2021-01-03 23:30

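    You're not doing it wrong. By default, .Distinct() uses the type's default equality comparer, which for a class that does not override Equals() and GetHashCode() means reference equality, so two separate instances are never considered equal.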

    One way to fix it is already shown in the other answers, but there is also a shorter solution available, which has the advantage that you can use it as an extension method easily everywhere without having to tweak the object's hash values.

    Take a look at this:


    **Usage:**
    var myQuery=(from x in Customers select x).MyDistinct(d => d.CustomerID);
    

    Note: This example uses a database query, but it does also work with an enumerable object list.


    Declaration of MyDistinct:

    public static class Extensions
    {
        public static IEnumerable<T> MyDistinct<T, V>(this IEnumerable<T> query, 
                                                        Func<T, V> f)
        {
            return query.GroupBy(f).Select(x=>x.First());
        }
    }
    

    Or, if you want it shorter, here is the same method as a "one-liner" (it still has to live inside a static class):

    public static IEnumerable<T> MyDistinct<T, V>(this IEnumerable<T> query, Func<T, V> f) 
                                 => query.GroupBy(f).Select(x => x.First());
    

    It works for objects as well as entities. If required, you can create a second overloaded extension method for IQueryable<T> by replacing the return type and the first parameter type in the example above, and by changing the selector parameter to Expression<Func<T, V>> so that the query provider receives an expression tree.
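
    A minimal sketch of what such an IQueryable<T> overload could look like. The class name QueryableDistinctExtensions is made up for this example, and whether a given query provider can translate GroupBy followed by First() depends on the provider and its version, so treat this as an assumption to verify against your data source.

    ```csharp
    using System;
    using System.Linq;
    using System.Linq.Expressions;

    public static class QueryableDistinctExtensions
    {
        // IQueryable<T> counterpart of MyDistinct. The selector is an
        // expression tree, so Queryable.GroupBy is chosen and the provider
        // gets a chance to translate the whole query.
        public static IQueryable<T> MyDistinct<T, V>(
            this IQueryable<T> query, Expression<Func<T, V>> f)
        {
            return query.GroupBy(f).Select(g => g.First());
        }
    }
    ```

    Usage is the same as before, for example words.AsQueryable().MyDistinct(s => s.Length) on an in-memory sequence of strings.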

  • 2021-01-03 23:31

    I know this is an old question, but I am not satisfied with any of the answers. I took time to figure this out for myself and I wanted to share my findings.

    First it is important to read and understand these two things:

    1. IEqualityComparer
    2. EqualityComparer

    Long story short: for the .Distinct() extension to know how to determine equality of your objects, you must define an EqualityComparer for your type T. The Microsoft docs literally state:

    We recommend that you derive from the EqualityComparer class instead of implementing the IEqualityComparer interface...

    That settles which one to use: the decision has already been made for you.

    For the .Distinct() extension to work, your objects must be comparable by value. In the case of .Distinct(), the GetHashCode() method is what really matters: elements are placed in a hash-based set, and Equals() is only consulted for elements whose hash codes match.

    You can test this yourself by writing a GetHashCode() implementation that just returns the hash code of the object being passed in. The results are bad, because the default hash code is tied to the object instance rather than to its contents, so two objects with identical values get different hash codes. That makes every object look unique, which is why it is important to write a proper implementation of this method.
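
    To see how the two methods are used, here is a small sketch with a comparer that counts its own calls. CountingComparer and its HashCalls/EqualsCalls fields are hypothetical names invented for this demonstration; the exact call counts are an implementation detail of .Distinct() and may differ between runtime versions.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Case-insensitive string comparer that records how often it is called.
    public sealed class CountingComparer : IEqualityComparer<string>
    {
        public int HashCalls;
        public int EqualsCalls;

        public bool Equals(string x, string y)
        {
            EqualsCalls++;
            return string.Equals(x, y, StringComparison.OrdinalIgnoreCase);
        }

        public int GetHashCode(string s)
        {
            HashCalls++;
            return s == null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(s);
        }
    }

    public static class CountingDemo
    {
        public static List<string> Run(CountingComparer comparer)
        {
            // "a" and "A" hash the same and are equal under this comparer,
            // so only one of them survives.
            return new[] { "a", "A", "b" }.Distinct(comparer).ToList();
        }
    }
    ```

    After CountingDemo.Run(comparer), HashCalls reflects that every element was hashed, while EqualsCalls stays small because Equals() is only needed when two hash codes match.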

    Below is the code sample from the IEqualityComparer<T> documentation page, with test data, a small modification to the GetHashCode() method, and comments added to demonstrate the point.

    //Did this in LinqPad
    void Main()
    {
        var lst = new List<Box>
        {
            new Box(1, 1, 1),
            new Box(1, 1, 1),
            new Box(1, 1, 1),
            new Box(1, 1, 1),
            new Box(1, 1, 1)
        };
    
        //Demonstration that the hash code for each object is fairly 
        //random and won't help you for getting a distinct list
        lst.ForEach(x => Console.WriteLine(x.GetHashCode()));
    
        //Demonstration that if your EqualityComparer is setup correctly
        //then you will get a distinct list
        lst = lst
            .Distinct(new BoxEqualityComparer())
            .ToList();
    
        lst.Dump();
    }
    
    public class Box
    {
        public Box(int h, int l, int w)
        {
            this.Height = h;
            this.Length = l;
            this.Width = w;
        }
    
        public int Height { get; set; }
        public int Length { get; set; }
        public int Width { get; set; }
    
        public override String ToString()
        {
            return String.Format("({0}, {1}, {2})", Height, Length, Width);
        }
    }
    
    public class BoxEqualityComparer 
        : EqualityComparer<Box>
    {
        public override bool Equals(Box b1, Box b2)
        {
            if (b2 == null && b1 == null)
                return true;
            else if (b1 == null || b2 == null)
                return false;
            else if (b1.Height == b2.Height && b1.Length == b2.Length
                                && b1.Width == b2.Width)
                return true;
            else
                return false;
        }
    
        public override int GetHashCode(Box bx)
        {
            #region This works
            //In this example each component of the box object are being XOR'd together
            int hCode = bx.Height ^ bx.Length ^ bx.Width;
    
            //The hashcode of an integer, is that same integer
            return hCode.GetHashCode();
            #endregion
    
            #region This won't work
            //Comment the above lines and uncomment this line below if you want to see Distinct() not work
            //return bx.GetHashCode();
            #endregion
        }
    }
    
  • 2021-01-03 23:38

    For explanation, take a look at other answers. I'm just providing one way to handle this issue.

    You might like this:

    public class LambdaComparer<T>:IEqualityComparer<T>{
      private readonly Func<T,T,bool> _comparer;
      private readonly Func<T,int> _hash;
      public LambdaComparer(Func<T,T,bool> comparer):
        this(comparer,o=>0) {}
      public LambdaComparer(Func<T,T,bool> comparer,Func<T,int> hash){
        if(comparer==null) throw new ArgumentNullException("comparer");
        if(hash==null) throw new ArgumentNullException("hash");
        _comparer=comparer;
        _hash=hash;
      }
      public bool Equals(T x,T y){
        return _comparer(x,y);
      }
      public int GetHashCode(T obj){
        return _hash(obj);
      }
    }
    

    Usage:

    public class Foo{
      public string Fizz{get;set;}
      public BarEnum Bar{get;set;}
    }
    
    public enum BarEnum {One,Two,Three}
    
    var lst=new List<Foo>();
    lst.Distinct(new LambdaComparer<Foo>(
      (x1,x2)=>x1.Fizz==x2.Fizz&&
               x1.Bar==x2.Bar));
    

    You can even wrap it in an extension method to avoid writing the noisy new LambdaComparer<T>(...) each time:

    public static class EnumerableExtensions{
     public static IEnumerable<T> SmartDistinct<T>
      (this IEnumerable<T> lst, Func<T, T, bool> pred){
       return lst.Distinct(new LambdaComparer<T>(pred));
     }
    }
    

    Usage:

    lst.SmartDistinct((x1,x2)=>x1.Fizz==x2.Fizz&&x1.Bar==x2.Bar);
    

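    NB: This works reliably only with LINQ to Objects. Also note that the default hash function (o=>0) puts every element in the same bucket, so each element is compared with Equals against all earlier distinct ones; pass a real hash function to the second constructor for large lists.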
