I need some help using a lambda expression to remove duplicate entries in my Entity Framework context. I have a table with the following columns:
Id, DateOfIncide
var query = db.Incidents.Where(x => x.IsAttendanceIncident == "Y")
.GroupBy(x => new { x.Id, x.EmployeeId, x.DateOfIncident, x.IsAttendanceIncident })
.Select(x => x.FirstOrDefault());
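var query2 = db.Incidents
    .Where(x => x.IsAttendanceIncident == "Y" && !query.Any(i => i.Id == x.Id));
query2 should then contain just the duplicates, provided Id is unique and is left out of the GroupBy key above; with Id in the key every group holds a single row, so query2 would come back empty.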
var query = db.Incidents
    .Where(x => x.IsAttendanceIncident == "Y")
    .GroupBy(x => new { x.EmployeeId, x.DateOfIncident, x.IsAttendanceIncident })
    .SelectMany(x => x.OrderBy(y => y.Id).Skip(1)); // order each group so Skip(1) always keeps the lowest Id
You will need to use the Distinct function, and if you want to compare on just some of the fields then you need to create an equality comparer (IEqualityComparer).
Ahh just seen the comment above, check this out for more:
Remove duplicates in the list using linq
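To make the Distinct suggestion concrete, here is a minimal sketch of a comparer that looks only at EmployeeId, DateOfIncident and IsAttendanceIncident. The Incident class and the IncidentKeyComparer name are hypothetical stand-ins, not the asker's real entity, and the sample runs against an in-memory list. As far as I know a custom comparer is not translated to SQL, so against db.Incidents you would probably have to materialize the rows first (for example with AsEnumerable()).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the Incident entity.
public class Incident
{
    public int Id { get; set; }
    public int EmployeeId { get; set; }
    public DateTime DateOfIncident { get; set; }
    public string IsAttendanceIncident { get; set; }
}

// Treats two incidents as equal when employee, date and flag match; Id is ignored.
public class IncidentKeyComparer : IEqualityComparer<Incident>
{
    public bool Equals(Incident a, Incident b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.EmployeeId == b.EmployeeId
            && a.DateOfIncident == b.DateOfIncident
            && a.IsAttendanceIncident == b.IsAttendanceIncident;
    }

    // Must hash exactly the fields that Equals compares.
    public int GetHashCode(Incident x)
    {
        return (x.EmployeeId, x.DateOfIncident, x.IsAttendanceIncident).GetHashCode();
    }
}

public static class DistinctDemo
{
    public static void Main()
    {
        var day = new DateTime(2020, 1, 1);
        var incidents = new List<Incident>
        {
            new Incident { Id = 1, EmployeeId = 10, DateOfIncident = day, IsAttendanceIncident = "Y" },
            new Incident { Id = 2, EmployeeId = 10, DateOfIncident = day, IsAttendanceIncident = "Y" },
            new Incident { Id = 3, EmployeeId = 11, DateOfIncident = day, IsAttendanceIncident = "Y" },
        };

        // Ids 1 and 2 collapse into one entry, so two incidents remain.
        var distinct = incidents.Distinct(new IncidentKeyComparer()).ToList();
        Console.WriteLine(distinct.Count);
    }
}
```

Note that Distinct only gives you the rows to keep; to get the rows to delete you still need something like the GroupBy approach.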
var query = db.Incidents
    .Where(x => x.IsAttendanceIncident == "Y")
    .GroupBy(x => new { x.EmployeeId, x.DateOfIncident, x.IsAttendanceIncident })
Example 1 (continue the query above with):
    .Select(x => x.FirstOrDefault()); // your original code which retrieves entities to not delete
var dupes = db.Incidents
    .Where(x => x.IsAttendanceIncident == "Y") // same filter as the query, so rows with other flags are not treated as dupes
    .Except( query ); // get entities to delete
Example 2 (continue the query above with):
    .SelectMany( x => x.OrderBy( y => y.Id ).Skip(1) ); // gets dupes directly
var dupes = query; // already have what we need
And finally:
foreach( var dupe in dupes )
{
    db.Incidents.Remove( dupe );
}
db.SaveChanges(); // nothing is deleted from the database until this is called
Example SQL generated from a test context I used earlier, where a Person entity has a 1:N relationship with Watches:
C#:
context.Persons.SelectMany(x => x.Watches.OrderBy(y => y.Id).Skip(1))
Generated SQL:
SELECT
    1 AS [C1],
    [Skip1].[Id] AS [Id],
    [Skip1].[Brand] AS [Brand],
    [Skip1].[Person_Id] AS [Person_Id]
FROM [dbo].[Persons] AS [Extent1]
CROSS APPLY (SELECT [Project1].[Id] AS [Id], [Project1].[Brand] AS [Brand], [Project1].[Person_Id] AS [Person_Id]
    FROM ( SELECT [Project1].[Id] AS [Id], [Project1].[Brand] AS [Brand], [Project1].[Person_Id] AS [Person_Id], row_number() OVER (ORDER BY [Project1].[Id] ASC) AS [row_number]
        FROM ( SELECT
            [Extent2].[Id] AS [Id],
            [Extent2].[Brand] AS [Brand],
            [Extent2].[Person_Id] AS [Person_Id]
            FROM [dbo].[Watches] AS [Extent2]
            WHERE [Extent1].[Id] = [Extent2].[Person_Id]
        ) AS [Project1]
    ) AS [Project1]
    WHERE [Project1].[row_number] > 1 ) AS [Skip1]
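If you want to sanity-check the GroupBy / OrderBy / Skip(1) logic before pointing it at the database, a small in-memory sketch like the one below may help. The Incident class, the DupeFinder name and the sample rows are made up for illustration; the query shape is the same as in Example 2, just run with LINQ to Objects instead of Entity Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the Incident entity.
public class Incident
{
    public int Id { get; set; }
    public int EmployeeId { get; set; }
    public DateTime DateOfIncident { get; set; }
    public string IsAttendanceIncident { get; set; }
}

public static class DupeFinder
{
    // Returns the Ids of every row after the first (lowest Id) in each group.
    public static List<int> FindDuplicateIds(IEnumerable<Incident> incidents)
    {
        return incidents
            .Where(x => x.IsAttendanceIncident == "Y")
            .GroupBy(x => new { x.EmployeeId, x.DateOfIncident, x.IsAttendanceIncident })
            .SelectMany(g => g.OrderBy(y => y.Id).Skip(1))
            .Select(x => x.Id)
            .ToList();
    }

    public static void Main()
    {
        var day = new DateTime(2020, 1, 1);
        var incidents = new List<Incident>
        {
            new Incident { Id = 1, EmployeeId = 10, DateOfIncident = day, IsAttendanceIncident = "Y" },
            new Incident { Id = 2, EmployeeId = 10, DateOfIncident = day, IsAttendanceIncident = "Y" },
            new Incident { Id = 3, EmployeeId = 11, DateOfIncident = day, IsAttendanceIncident = "Y" },
            new Incident { Id = 4, EmployeeId = 10, DateOfIncident = day, IsAttendanceIncident = "N" },
        };

        // Only Id 2 repeats an earlier employee/date/flag combination.
        Console.WriteLine(string.Join(",", FindDuplicateIds(incidents))); // prints "2"
    }
}
```

Row 4 is left alone because it is filtered out by the "Y" check before grouping, which is the behaviour you want when only attendance incidents should be de-duplicated.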