Using LINQ, from a List<int>, how can I retrieve a list that contains entries repeated more than once and their values?
To find only the duplicate values:
var duplicates = list.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key);
E.g. var list = new[] {1,2,3,1,4,2};
GroupBy groups the numbers by value, and each group holds every occurrence of that value, so g.Count() is the number of times the value appears. We then keep only the values that appear more than once, which for this list gives 1 and 2.
To find only the unique values:
var unique = list.GroupBy(x => x).Where(g => g.Count() == 1).Select(g => g.Key);
E.g. var list = new[] {1,2,3,1,4,2};
The grouping works the same way. This time we keep only the values that appear exactly once, which for this list gives 3 and 4.
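As a minimal, self-contained sketch of the two GroupBy queries, assuming a list of int (the names GroupByDemo, FindDuplicates, and FindUnique are made up for illustration and are not part of LINQ):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // Values that occur more than once, each reported a single time.
    public static List<int> FindDuplicates(IEnumerable<int> source) =>
        source.GroupBy(x => x)
              .Where(g => g.Count() > 1)
              .Select(g => g.Key)
              .ToList();

    // Values that occur exactly once.
    public static List<int> FindUnique(IEnumerable<int> source) =>
        source.GroupBy(x => x)
              .Where(g => g.Count() == 1)
              .Select(g => g.Key)
              .ToList();

    public static void Main()
    {
        var list = new[] { 1, 2, 3, 1, 4, 2 };
        Console.WriteLine(string.Join(", ", FindDuplicates(list))); // 1, 2
        Console.WriteLine(string.Join(", ", FindUnique(list)));     // 3, 4
    }
}
```

Using Where plus Select(g => g.Key) returns the values themselves. Any or All in the same position would only return a bool saying whether duplicates exist.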
Another way is to use a HashSet:
var hash = new HashSet<int>();
var duplicates = list.Where(i => !hash.Add(i));
If you want each duplicated value to appear only once in your duplicates list:
var myhash = new HashSet<int>();
var mylist = new List<int>(){1,1,2,2,3,3,3,4,4,4};
var duplicates = mylist.Where(item => !myhash.Add(item)).Distinct().ToList();
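One thing to watch with the HashSet approach is that Where is lazy while the set keeps its state. The sketch below (HashSetDemo and RepeatedOccurrences are illustrative names, not an existing API) shows what the query returns with and without Distinct(), and what happens if the un-materialized query is enumerated twice:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashSetDemo
{
    // Every occurrence after the first, so a value seen three times is returned twice.
    public static List<int> RepeatedOccurrences(IEnumerable<int> source)
    {
        var seen = new HashSet<int>();
        // ToList() runs the query exactly once, starting from an empty set.
        return source.Where(item => !seen.Add(item)).ToList();
    }

    public static void Main()
    {
        var mylist = new List<int> { 1, 1, 2, 2, 3, 3, 3, 4, 4, 4 };
        Console.WriteLine(string.Join(", ", RepeatedOccurrences(mylist)));            // 1, 2, 3, 3, 4, 4
        Console.WriteLine(string.Join(", ", RepeatedOccurrences(mylist).Distinct())); // 1, 2, 3, 4

        // Pitfall: without ToList(), a second enumeration reuses the already-filled set.
        var hash = new HashSet<int>();
        var lazy = mylist.Where(i => !hash.Add(i));
        Console.WriteLine(lazy.Count()); // 6
        Console.WriteLine(lazy.Count()); // 10, because every item is already in 'hash'
    }
}
```

Calling ToList() right away, as the Distinct() example does, avoids the second-enumeration surprise.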
Here is the same solution as a generic extension method:
public static class Extensions
{
    public static IEnumerable<TSource> GetDuplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector, IEqualityComparer<TKey> comparer)
    {
        var hash = new HashSet<TKey>(comparer);
        return source.Where(item => !hash.Add(selector(item))).ToList();
    }
    public static IEnumerable<TSource> GetDuplicates<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
    {
        return source.GetDuplicates(x => x, comparer);
    }
    public static IEnumerable<TSource> GetDuplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        return source.GetDuplicates(selector, null);
    }
    public static IEnumerable<TSource> GetDuplicates<TSource>(this IEnumerable<TSource> source)
    {
        return source.GetDuplicates(x => x, null);
    }
}
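A possible usage sketch for the selector plus comparer overload, finding names that differ only by case. The overload is repeated under a different class name (DuplicateExtensions) so the sample compiles on its own; the sample data and the GetDuplicatesDemo class are invented for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateExtensions
{
    // Same approach as the selector + comparer overload, copied here so the sample is standalone.
    public static IEnumerable<TSource> GetDuplicates<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> selector,
        IEqualityComparer<TKey> comparer)
    {
        var hash = new HashSet<TKey>(comparer);
        return source.Where(item => !hash.Add(selector(item))).ToList();
    }
}

public static class GetDuplicatesDemo
{
    public static void Main()
    {
        var names = new[] { "Ann", "bob", "ANN", "Bob", "Cid" };

        // The comparer decides what counts as "the same" key.
        var dupes = names.GetDuplicates(n => n, StringComparer.OrdinalIgnoreCase);

        Console.WriteLine(string.Join(", ", dupes)); // ANN, Bob
    }
}
```

The result contains the later occurrences ("ANN", "Bob"), not the first ones, because an item is reported only when its key is already in the set.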
You can do this:
var list = new[] {1,2,3,1,4,2};
var duplicateItems = list.Duplicates();
With these extension methods:
public static class Extensions
{
    public static IEnumerable<TSource> Duplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        var grouped = source.GroupBy(selector);
        var moreThan1 = grouped.Where(i => i.IsMultiple());
        return moreThan1.SelectMany(i => i);
    }
    // Only TSource here: TKey could not be inferred from a call like list.Duplicates().
    public static IEnumerable<TSource> Duplicates<TSource>(this IEnumerable<TSource> source)
    {
        return source.Duplicates(i => i);
    }
    // True when the sequence has at least two elements; stops after the second one.
    public static bool IsMultiple<T>(this IEnumerable<T> source)
    {
        using (var enumerator = source.GetEnumerator())
        {
            return enumerator.MoveNext() && enumerator.MoveNext();
        }
    }
}
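IsMultiple() stops after the second element, so it never has to enumerate a whole sequence the way Count() can. For the groups produced by GroupBy the gain is likely small, because GroupBy has already buffered every element by the time the groups are inspected.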