I am playing with LINQ to learn about it, but I can\'t figure out how to use Distinct
when I do not have a simple list (a simple list of integers is pretty easy
If you don't want to add the MoreLinq library to your project just to get the DistinctBy
functionality then you can get the same end result using the overload of Linq's Distinct
method that takes in an IEqualityComparer
argument.
You begin by creating a generic custom equality comparer class that uses lambda syntax to perform custom comparison of two instances of a generic class:
public class CustomEqualityComparer<T> : IEqualityComparer<T>
{
Func<T, T, bool> _comparison;
Func<T, int> _hashCodeFactory;
public CustomEqualityComparer(Func<T, T, bool> comparison, Func<T, int> hashCodeFactory)
{
_comparison = comparison;
_hashCodeFactory = hashCodeFactory;
}
public bool Equals(T x, T y)
{
return _comparison(x, y);
}
public int GetHashCode(T obj)
{
return _hashCodeFactory(obj);
}
}
Then in your main code you use it like so:
Func<Person, Person, bool> areEqual = (p1, p2) => int.Equals(p1.Id, p2.Id);
Func<Person, int> getHashCode = (p) => p.Id.GetHashCode();
var query = people.Distinct(new CustomEqualityComparer<Person>(areEqual, getHashCode));
Voila! :)
The above assumes the following:
Person.Id
is of type int
people
collection does not contain any null elementsIf the collection could contain nulls then simply rewrite the lambdas to check for null, e.g.:
Func<Person, Person, bool> areEqual = (p1, p2) =>
{
return (p1 != null && p2 != null) ? int.Equals(p1.Id, p2.Id) : false;
};
EDIT
This approach is similar to the one in Vladimir Nesterovsky's answer but simpler.
It is also similar to the one in Joel's answer but allows for complex comparison logic involving multiple properties.
However, if your objects can only ever differ by Id
then another user gave the correct answer that all you need to do is override the default implementations of GetHashCode()
and Equals()
in your Person
class and then just use the out-of-the-box Distinct()
method of Linq to filter out any duplicates.
You can do this with the standard Linq.ToLookup(). This will create a collection of values for each unique key. Just select the first item in the collection
Persons.ToLookup(p => p.Id).Select(coll => coll.First());
When we faced such a task in our project we defined a small API to compose comparators.
So, the use case was like this:
var wordComparer = KeyEqualityComparer.Null<Word>().
ThenBy(item => item.Text).
ThenBy(item => item.LangID);
...
source.Select(...).Distinct(wordComparer);
And API itself looks like this:
using System;
using System.Collections;
using System.Collections.Generic;
public static class KeyEqualityComparer
{
public static IEqualityComparer<T> Null<T>()
{
return null;
}
public static IEqualityComparer<T> EqualityComparerBy<T, K>(
this IEnumerable<T> source,
Func<T, K> keyFunc)
{
return new KeyEqualityComparer<T, K>(keyFunc);
}
public static KeyEqualityComparer<T, K> ThenBy<T, K>(
this IEqualityComparer<T> equalityComparer,
Func<T, K> keyFunc)
{
return new KeyEqualityComparer<T, K>(keyFunc, equalityComparer);
}
}
public struct KeyEqualityComparer<T, K>: IEqualityComparer<T>
{
public KeyEqualityComparer(
Func<T, K> keyFunc,
IEqualityComparer<T> equalityComparer = null)
{
KeyFunc = keyFunc;
EqualityComparer = equalityComparer;
}
public bool Equals(T x, T y)
{
return ((EqualityComparer == null) || EqualityComparer.Equals(x, y)) &&
EqualityComparer<K>.Default.Equals(KeyFunc(x), KeyFunc(y));
}
public int GetHashCode(T obj)
{
var hash = EqualityComparer<K>.Default.GetHashCode(KeyFunc(obj));
if (EqualityComparer != null)
{
var hash2 = EqualityComparer.GetHashCode(obj);
hash ^= (hash2 << 5) + hash2;
}
return hash;
}
public readonly Func<T, K> KeyFunc;
public readonly IEqualityComparer<T> EqualityComparer;
}
More details is on our site: IEqualityComparer in LINQ.
The best way to do this that will be compatible with other .NET versions is to override Equals and GetHash to handle this (see Stack Overflow question This code returns distinct values. However, what I want is to return a strongly typed collection as opposed to an anonymous type), but if you need something that is generic throughout your code, the solutions in this article are great.
You can do it (albeit not lightning-quickly) like so:
people.Where(p => !people.Any(q => (p != q && p.Id == q.Id)));
That is, "select all people where there isn't another different person in the list with the same ID."
Mind you, in your example, that would just select person 3. I'm not sure how to tell which you want, out of the previous two.
I think it is enough:
list.Select(s => s.MyField).Distinct();