LINQ generating SQL with duplicate nested selects

前端未结

关注

 2  2042

情深已故 2021-02-07 09:36

I\'m very new to the .NET Entity Framework, and I think it\'s awesome, but somehow I\'m getting this strange issue (sorry for the spanish but my program is in that language, an

2条回答

情深已故 (楼主)

2021-02-07 10:21
To be clear, LINQ to Entities does not generate the SQL. Instead, it generates an ADO.NET canonical command tree, and the ADO.NET provider for your database, presumably SQL Server in this case, generates the SQL.

So why does it generate this derived table (I think "derived table" is the more correct term for the SQL feature in use here)? Because the code which generates the SQL has to generate SQL for a wide variety of LINQ queries, most of which are not nearly as trivial as the one you show. These queries will often be selecting data for multiple types (many of which might be anonymous, rather than named types), and in order to keep the SQL generation relatively sane, they are grouped into extents for each type.

Another question: Why should you care? It's easy to demonstrate that the use of the derived table in this statement is "free" from a performance point of view.

I selected a table at random from a populated database, and run the following query:
```
SELECT [AddressId]
      ,[Address1]
      ,[Address2]
      ,[City]
      ,[State]
      ,[ZIP]
      ,[ZIPExtension]
  FROM [VertexRM].[dbo].[Address]
```
Let's look at the cost:
```
  
  
    
```
Now let's compare that to the query with the derived table:
```
SELECT 
       [Extent1].[AddressId]
      ,[Extent1].[Address1]
      ,[Extent1].[Address2]
      ,[Extent1].[City]
      ,[Extent1].[State]
      ,[Extent1].[ZIP]
      ,[Extent1].[ZIPExtension]
  FROM (SELECT [AddressId]
          ,[Address1]
          ,[Address2]
          ,[City]
          ,[State]
          ,[ZIP]
          ,[ZIPExtension]
  FROM[VertexRM].[dbo].[Address]) AS [Extent1]
```
And the cost:
```
  
  
    
```
In both cases, SQL Server simply scans the clustered index. Not surprisingly, the cost is almost precisely the same.

Let's take a look at a slightly more complicated query. I fired up LINQPad, and entered the following query against the same table, plus one related table:
```
from a in Addresses
select new
{
    Id = a.Id,
    Address1 = a.Address1,
    Address2 = a.Address2,
    City = a.City,
    State = a.State,
    ZIP = a.ZIP,
    ZIPExtension = a.ZIPExtension,
    PersonCount = a.EntityAddresses.Count()
}
```
This generates the following SQL:
```
SELECT 
1 AS [C1], 
[Project1].[AddressId] AS [AddressId], 
[Project1].[Address1] AS [Address1], 
[Project1].[Address2] AS [Address2], 
[Project1].[City] AS [City], 
[Project1].[State] AS [State], 
[Project1].[ZIP] AS [ZIP], 
[Project1].[ZIPExtension] AS [ZIPExtension], 
[Project1].[C1] AS [C2]
FROM ( SELECT 
    [Extent1].[AddressId] AS [AddressId], 
    [Extent1].[Address1] AS [Address1], 
    [Extent1].[Address2] AS [Address2], 
    [Extent1].[City] AS [City], 
    [Extent1].[State] AS [State], 
    [Extent1].[ZIP] AS [ZIP], 
    [Extent1].[ZIPExtension] AS [ZIPExtension], 
    (SELECT 
        COUNT(cast(1 as bit)) AS [A1]
        FROM [dbo].[EntityAddress] AS [Extent2]
        WHERE [Extent1].[AddressId] = [Extent2].[AddressId]) AS [C1]
    FROM [dbo].[Address] AS [Extent1]
)  AS [Project1]
```
Analyzing this, we can see that Project1 is the projection onto the anonymous type. Extent1 is the Address table/entity. And Extent2 is the table for the association. Now there is no derived table for Address, but there is one for the projection.

I don't know if you have ever written a SQL generation system, but it isn't easy. I believe that the general problem of proving that a LINQ to Entities query and a SQL query are equivalent is NP-hard, although certain specific cases are obviously much easier. SQL is intentionally Turing-incomplete, because its designers wanted all SQL queries to execute in bounded time. LINQ, not so.

In short, this is a very difficult problem to solve, and the combination of the Entity Framework and its providers do occasionally sacrifice some readability in favor of consistency over a wide range of queries. But it shouldn't be a performance issue.
0 讨论(0)

查看其它2个回答
发布评论:

提交评论
- 加载中...