Entity Framework v6 GroupBy Losing Original Ordering

慢半拍i 2021-01-20 17:54

I have the following part of a query (it's the end of a larger query; queryBuilder is an IQueryable):

            var results = queryBuilder
               


        
1 answer
  •  情歌与酒
    2021-01-20 18:37

    I expect for it to order the vehicles by RangeId and then by rental

    In a LINQ to Entities query, any ordering before a GroupBy is simply ignored. You won't even see it in the executed SQL. That is because Entity Framework orders the SQL result by the grouping expression instead (in your case x => x.Vehicle.RangeId). Why is that?

    LINQ's GroupBy is seemingly similar to SQL's GROUP BY, but actually it's quite different.

    GROUP BY in SQL is "destructive", by which I mean that any information other than the columns in the GROUP BY is lost (apart from aggregate expressions). If you do ...

    SELECT Brand, COUNT(*) 
    FROM Cars
    GROUP BY Brand
    

    ... you only see each Brand and its count. You don't see the cars in the groups.

    That's not what LINQ's GroupBy does: it produces groups of complete objects. All information in the original data is still there. You'll see cars grouped by their brands.
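    To make the contrast with SQL's GROUP BY concrete, here is a minimal LINQ to Objects sketch. The Car class, the sample data, and the GroupByDemo helper names are assumptions made up for illustration; they are not from the question. It only shows that each group still holds the complete objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration; not part of the original question.
public class Car
{
    public string Brand { get; set; }
    public string Model { get; set; }
}

public static class GroupByDemo
{
    // Made-up sample data.
    public static List<Car> SampleCars()
    {
        return new List<Car>
        {
            new Car { Brand = "Ford", Model = "Focus" },
            new Car { Brand = "Audi", Model = "A4" },
            new Car { Brand = "Ford", Model = "Fiesta" },
        };
    }

    // Each group keeps its complete Car objects, unlike SQL's GROUP BY,
    // which would only return the Brand column plus aggregates.
    public static List<IGrouping<string, Car>> GroupByBrand(IEnumerable<Car> cars)
    {
        return cars.GroupBy(c => c.Brand).ToList();
    }

    public static void Main()
    {
        foreach (var group in GroupByBrand(SampleCars()))
        {
            // group.Key is the brand; enumerating the group yields the cars themselves.
            Console.WriteLine(group.Key + ": " + string.Join(", ", group.Select(c => c.Model)));
        }
        // Prints:
        // Ford: Focus, Fiesta
        // Audi: A4
    }
}
```

    The SQL query above could only return ("Ford", 2) and ("Audi", 1); here the models are still reachable through each group.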

    That means that ORMs that translate GroupBy as GROUP BY give themselves a hard time building the result set. LINQ to SQL does that. It executes a GROUP BY query first and then it needs separate queries (one per group actually) to make up for the "lost" data.

    EF implements GroupBy differently. It gets all data in one query and then it builds the groups in memory. You won't see GROUP BY in the generated SQL. You see an ORDER BY instead. I think EF prefers a sorted SQL query result for more efficient processing in memory. (And I can imagine it combines better with other LINQ statements in the pipeline.)

    So that's why any ordering before GroupBy is ignored. And why you can only apply ordering after the grouping.
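    The query shape this implies (group first, order inside each group, then order the results) can be sketched in memory as follows. This is a LINQ to Objects illustration only: the Quote class and its data are hypothetical stand-ins for the question's entities, and it does not reproduce the SQL that EF generates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened row; the question's real entities are not shown in full.
public class Quote
{
    public int RangeId { get; set; }
    public decimal Rental { get; set; }
}

public static class OrderAfterGroupingDemo
{
    public static List<Quote> CheapestPerRange(IEnumerable<Quote> quotes)
    {
        return quotes
            .GroupBy(q => q.RangeId)
            // Ordering inside each group picks the cheapest quote of that range.
            .Select(g => g.OrderBy(q => q.Rental).First())
            // Ordering after the grouping decides the final sequence.
            .OrderBy(q => q.Rental)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data.
        var quotes = new List<Quote>
        {
            new Quote { RangeId = 1, Rental = 300m },
            new Quote { RangeId = 2, Rental = 150m },
            new Quote { RangeId = 1, Rental = 250m },
            new Quote { RangeId = 2, Rental = 200m },
        };

        foreach (var q in CheapestPerRange(quotes))
        {
            Console.WriteLine(q.RangeId + ": " + q.Rental);
        }
        // Prints:
        // 2: 150
        // 1: 250
    }
}
```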

    the performance is absolutely dire

    It's hard to tell from here why that is. Maybe you can do the ordering in memory:

    var results = queryBuilder
                  .GroupBy(x => x.Vehicle.RangeId)
                  .Select(x => x.OrderBy(o => o.Rate.Rental).FirstOrDefault())
                  .Select(o => new { o.Rate.Rental, o })
                  .AsEnumerable()
                  .OrderBy(x => x.Rental);
    

    But it may also be an indexing issue. If there's no proper index on Rate.Rental, ordering by that column is expensive.
