I want to get the distinct values in a list, but not by the standard equality comparison.
What I want to do is something like this:
return myList.Dis
Jon, your solution is pretty good. One minor change, though: I don't think we need EqualityComparer.Default in there. Here is my solution (of course, the starting point was Jon Skeet's solution):
public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
{
    //TODO All arg checks
    HashSet<TKey> keys = new HashSet<TKey>();
    foreach (T item in source)
    {
        TKey key = keySelector(item);
        if (!keys.Contains(key))
        {
            keys.Add(key);
            yield return item;
        }
    }
}
But that seems messy.
It's not messy, it's correct.
If you want Distinct programmers by FirstName and there are four Amys, which one do you want?
If you Group programmers By FirstName and take the First one, then it is clear what you want to do in the case of four Amys.
I can only use it here because I have a single key.
You can do a multiple key "distinct" with the same pattern:
return myList
    .GroupBy( x => new { x.Url, x.Age } )
    .Select( g => g.First() );
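To make the "first one wins" behaviour concrete, here is a small self-contained sketch of the same pattern. The Programmer class, its constructor, and the sample data are made up for illustration; only the Url and Age properties come from the snippet above. It relies on two things: anonymous types compare by the values of their properties, and LINQ to Objects GroupBy keeps the elements of each group in source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type, invented for this example.
public sealed class Programmer
{
    public Programmer(string firstName, string url, int age)
    {
        FirstName = firstName;
        Url = url;
        Age = age;
    }

    public string FirstName { get; }
    public string Url { get; }
    public int Age { get; }
}

public static class GroupByDistinctDemo
{
    // Keeps the first element seen for each distinct (Url, Age) pair.
    public static List<Programmer> FirstPerUrlAndAge(IEnumerable<Programmer> source)
    {
        return source
            .GroupBy(x => new { x.Url, x.Age })  // anonymous types use value equality
            .Select(g => g.First())              // first element of each group, in source order
            .ToList();
    }

    // Made-up sample data: "Amy" and "Bob" share the key (a.example, 30).
    public static List<Programmer> SampleData()
    {
        return new List<Programmer>
        {
            new Programmer("Amy", "a.example", 30),
            new Programmer("Amy", "b.example", 30),
            new Programmer("Bob", "a.example", 30),
            new Programmer("Amy", "a.example", 41),
        };
    }
}
```

With the sample data, Bob is dropped because an earlier element already has the same Url and Age; the other three rows have distinct keys and are all kept.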
It's annoying, certainly. It's also part of my "MoreLINQ" project which I must pay some attention to at some point :) There are plenty of other operations which make sense when acting on a projection, but returning the original - MaxBy and MinBy spring to mind.
As you say, it's easy to write - although I prefer the name "DistinctBy" to match OrderBy etc. Here's my implementation if you're interested:
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source,
     Func<TSource, TKey> keySelector)
{
    return source.DistinctBy(keySelector,
                             EqualityComparer<TKey>.Default);
}

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source,
     Func<TSource, TKey> keySelector,
     IEqualityComparer<TKey> comparer)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (keySelector == null)
    {
        throw new ArgumentNullException("keySelector");
    }
    if (comparer == null)
    {
        throw new ArgumentNullException("comparer");
    }
    return DistinctByImpl(source, keySelector, comparer);
}

private static IEnumerable<TSource> DistinctByImpl<TSource, TKey>
    (IEnumerable<TSource> source,
     Func<TSource, TKey> keySelector,
     IEqualityComparer<TKey> comparer)
{
    HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
    foreach (TSource element in source)
    {
        if (knownKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}
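For anyone wondering what the comparer overload and the separate Impl method buy you, here is a rough, self-contained sketch. The method is named DistinctByKey here (a made-up name) so it cannot clash with the Enumerable.DistinctBy that ships in System.Linq from .NET 6 onwards; the class name and sample strings are also invented. The split matters because an iterator block does not run until the first MoveNext, so argument checks written inside it would be deferred; keeping them in a non-iterator wrapper makes them fire at the call site.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctBySketch
{
    // Non-iterator wrapper: argument checks run immediately when called.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
        if (comparer == null) throw new ArgumentNullException(nameof(comparer));
        return DistinctByKeyImpl(source, keySelector, comparer);
    }

    // Iterator block: runs lazily, one element at a time.
    private static IEnumerable<TSource> DistinctByKeyImpl<TSource, TKey>(
        IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer)
    {
        var knownKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            // HashSet.Add returns false when the key is already present.
            if (knownKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

    // Usage: treat names that differ only by case as the same key.
    public static List<string> CaseInsensitiveNames()
    {
        var names = new[] { "amy", "Bob", "AMY", "bob", "Cid" };
        return names.DistinctByKey(n => n, StringComparer.OrdinalIgnoreCase).ToList();
    }
}
```

CaseInsensitiveNames keeps the first spelling seen for each name, so "AMY" and "bob" are skipped.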
Using AmyB's answer, I've written a small DistinctBy extension method that lets you pass in a key selector (called predicate in the code):
/// <summary>
/// Distinct method that accepts a predicate (key selector).
/// </summary>
/// <typeparam name="TSource">The type of the source elements.</typeparam>
/// <typeparam name="TKey">The type of the key.</typeparam>
/// <param name="source">The source.</param>
/// <param name="predicate">The predicate.</param>
/// <returns>IEnumerable{TSource}.</returns>
/// <exception cref="System.ArgumentNullException">source</exception>
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source,
     Func<TSource, TKey> predicate)
{
    if (source == null)
        throw new ArgumentNullException("source");

    return source
        .GroupBy(predicate)
        .Select(x => x.First());
}
You can now pass a predicate to group the list by:
var distinct = myList.DistinctBy(x => x.Id);
Or group by multiple properties:
var distinct = myList.DistinctBy(x => new { x.Id, x.Title });
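One practical difference between this GroupBy-based version and the HashSet-based versions earlier in the thread: GroupBy has to read the whole source before it can hand back the first group, while the HashSet approach yields each new key as soon as it sees it. The sketch below measures that with a counting source. All names in it (StreamingComparison, PulledBeforeFirstResult, ViaGroupBy, ViaHashSet) are made up for the example, and the key is simply the element itself to keep things short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingComparison
{
    // Returns how many source elements were consumed before the first
    // distinct result became available.
    public static int PulledBeforeFirstResult(
        Func<IEnumerable<int>, IEnumerable<int>> distinct)
    {
        int pulled = 0;

        // Ten elements with only three distinct values (0, 1, 2).
        IEnumerable<int> Source()
        {
            for (int i = 0; i < 10; i++)
            {
                pulled++;
                yield return i % 3;
            }
        }

        distinct(Source()).First();
        return pulled;
    }

    // GroupBy-based distinct: builds every group before yielding anything.
    public static IEnumerable<int> ViaGroupBy(IEnumerable<int> source)
    {
        return source.GroupBy(x => x).Select(g => g.First());
    }

    // HashSet-based distinct: yields each unseen value immediately.
    public static IEnumerable<int> ViaHashSet(IEnumerable<int> source)
    {
        var seen = new HashSet<int>();
        foreach (int item in source)
        {
            if (seen.Add(item))
            {
                yield return item;
            }
        }
    }
}
```

Calling PulledBeforeFirstResult(StreamingComparison.ViaGroupBy) should report all ten elements consumed, while the ViaHashSet variant should report one. For small in-memory lists this rarely matters, but it does for large or expensive sources.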