I have the following part of a query (It\'s the end of a larger query - queryBuilder is an IQueryable)
var results = queryBuilder
I expect for it to order the vehicles by RangeId and then by rental
In an LINQ to Entities query, any ordering before a GroupBy
is simply ignored. You won't even see it in the executed SQL. That is because Entity Framework takes the grouping expression to order by (in your case x => x.Vehicle.RangeId
). Why is that?
LINQ's GroupBy
is seemingly similar to SQL's GROUP BY
, but actually it's quite different.
GROUP BY
in SQL is "destructive", by which I mean that any information other than the columns in the GROUP BY
is lost (apart from aggregate expressions). If you do ...
SELECT Brand, COUNT(*)
FROM Cars
GROUP BY Brand
... you only see Brand
and their counts. You don't see the cars in the groups.
That's exactly what LINQ's GroupBy
does: it produces groups of complete objects. All information in the original data is still there. You'll see cars grouped by their brands.
That means that ORMs that translate GroupBy
as GROUP BY
give themselves a hard time building the result set. LINQ to SQL does that. It executes a GROUP BY
query first and then it needs separate queries (one per group actually) to make up for the "lost" data.
EF implements GroupBy
differently. It gets all data in one query and then it builds the groups in memory. You won't see GROUP BY
in the generated SQL. You see an ORDER BY
instead. I think EF prefers a sorted SQL query result for more efficient processing in memory. (And I can imagine combines better with other LINQ statements in the pipeline).
So that's why any ordering before GroupBy
is ignored. And why you can only apply ordering after the grouping.
the performance is absolutely dire
It's hard to tell from here why that is. Maybe you can do the ordering in memory:
var results = queryBuilder
.GroupBy(x => x.Vehicle.RangeId)
.Select(x => x.OrderBy(o => o.Rate.Rental).FirstOrDefault())
.Select(o => new { o.Rate.Rental, o }
.AsEnumerable()
.OrderBy(x => x.Rental);
But it may also be an indexing issue. If there's no proper index on Rate.Rental
, ordering by that column is expensive.