Why are CTEs (Common Table Expressions) in some cases slower than temporary tables in SQL Server?

予麋鹿 2021-02-07 09:56

I have several cases where my complex CTEs (Common Table Expressions) are ten times slower than the same queries using temporary tables in SQL Server.

3 Answers
  • 2021-02-07 10:32

    There are different use cases for the two, and different advantages/disadvantages.

    Common Table Expressions

    Common Table Expressions should be viewed as expressions, not tables. As an expression, a CTE does not need to be materialized, so the query optimizer can fold it into the rest of the query and optimize the combination of the CTE and the rest of the query.
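
    As a rough sketch of what "folding" means here, the two queries below run against a hypothetical #DemoSales table (the table and its columns are made up for illustration). The CTE form and the derived-table form are logically equivalent, and SQL Server would normally be expected to compile them to the same plan, with the CustomerId filter free to be applied before the aggregation rather than after it.

    ```sql
    -- Hypothetical demo data; names are illustrative, not from the question.
    CREATE TABLE #DemoSales (CustomerId int NOT NULL, Amount money NOT NULL);
    INSERT INTO #DemoSales (CustomerId, Amount)
    VALUES (1, 10), (1, 20), (2, 5);

    -- CTE form: CustomerTotals is an expression, not a stored table.
    WITH CustomerTotals AS (
        SELECT CustomerId, SUM(Amount) AS Total
        FROM #DemoSales
        GROUP BY CustomerId
    )
    SELECT CustomerId, Total
    FROM CustomerTotals
    WHERE CustomerId = 1;

    -- Derived-table form: the same expression written inline.
    SELECT ct.CustomerId, ct.Total
    FROM (
        SELECT CustomerId, SUM(Amount) AS Total
        FROM #DemoSales
        GROUP BY CustomerId
    ) AS ct
    WHERE ct.CustomerId = 1;

    DROP TABLE #DemoSales;
    ```

    Comparing the actual execution plans of the two SELECT statements is the way to confirm this on a given server version.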

    Temporary Tables

    With temporary tables, the results of the query are stored in a real table in tempdb. The query results can then be reused in multiple queries, unlike a CTE, which, if used in multiple separate queries, would have to be part of the execution plan of each of those queries.

    Also, a temporary table can have indexes, keys, etc. Adding these to a temp table can greatly help in optimizing some queries, and this is not possible with a CTE, though the CTE can use the indexes and keys of the tables underlying it.

    If the underlying tables to a CTE don't support the type of optimizations you need, a temp table may be better.
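
    A minimal sketch of that pattern, assuming a hypothetical #DemoOrders source table (all names here are illustrative): the intermediate aggregate is stored once, indexed, and then reused by two separate queries.

    ```sql
    -- Hypothetical source data standing in for the real underlying tables.
    CREATE TABLE #DemoOrders (OrderId int NOT NULL, CustomerId int NOT NULL, Amount money NOT NULL);
    INSERT INTO #DemoOrders (OrderId, CustomerId, Amount)
    VALUES (1, 1, 10), (2, 1, 20), (3, 2, 5);

    -- Materialize the intermediate result once.
    SELECT CustomerId, SUM(Amount) AS Total
    INTO #CustomerTotals
    FROM #DemoOrders
    GROUP BY CustomerId;

    -- Add an index that the underlying table does not offer.
    CREATE UNIQUE CLUSTERED INDEX IX_CustomerTotals_CustomerId
        ON #CustomerTotals (CustomerId);

    -- Reuse the stored result in separate queries without recomputing it.
    SELECT CustomerId, Total FROM #CustomerTotals WHERE CustomerId = 1;
    SELECT COUNT(*) AS BigCustomers FROM #CustomerTotals WHERE Total > 10;

    DROP TABLE #CustomerTotals;
    DROP TABLE #DemoOrders;
    ```

    The temp table also gets its own statistics, which can give the optimizer better row estimates for the later queries. The trade-off is the cost of writing and reading the intermediate rows.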

  • 2021-02-07 10:35

    There can be several reasons why a temp table performs better than a CTE, and vice versa, depending on the specific query and requirements.

    IMO, in your case neither query is optimized.

    A CTE is evaluated every time it is referenced, so in your case:

    SELECT a.MasterAccountId,
           ISNULL(t.BatchedOrders, 0) BatchedOrders,
           ISNULL(t.PendingOrders, 0) PendingOrders,
           ISNULL(av.AvailableStock, 0) AvailableOrders,
           ISNULL(av.AvailableBeforeCutOff, 0) AvailableCutOff,
           ISNULL(av.OrdersAnyStock, 0) AllOrders
    FROM MasterAccount a
    LEFT OUTER JOIN Available av ON av.MasterAccountId = a.MasterAccountId
    LEFT OUTER JOIN Totals t ON t.MasterAccountId = a.MasterAccountId
    WHERE a.IsActive = 1
    

    This query shows a high cardinality estimate, and the MasterAccount table is evaluated multiple times. That is why it is slow.

    In the case of the temp table:

    SELECT a.MasterAccountId,
           ISNULL(t.BatchedOrders, 0) BatchedOrders,
           ISNULL(t.PendingOrders, 0) PendingOrders,
           ISNULL(av.AvailableStock, 0) AvailableOrders,
           ISNULL(av.AvailableBeforeCutOff, 0) AvailableCutOff,
           ISNULL(av.OrdersAnyStock, 0) AllOrders
    FROM MasterAccount a (NOLOCK)
    LEFT OUTER JOIN #Available av (NOLOCK) ON av.MasterAccountId = a.MasterAccountId
    LEFT OUTER JOIN Totals t (NOLOCK) ON t.MasterAccountId = a.MasterAccountId
    WHERE a.IsActive = 1
    

    Here #Available has already been evaluated and its result is stored in a temp table, so MasterAccount is joined with a smaller result set and the cardinality estimate is lower. The same applies to the #Orders table.

    Both the CTE query and the temp table query can be optimized in your case to improve performance.

    #Orders should be your base temp table, and you should not use MasterAccount again later; use #Orders instead.

    INSERT INTO #Available
    SELECT  ma.MasterAccountId,
            SUM(IIF(ma.IsPartialStock = 1,  CASE WHEN sa.[Status] IN ('Full', 'Partial') THEN 1 ELSE 0 END, 
                                            CASE WHEN sa.[Status] = 'Full' THEN 1 ELSE 0 END)) AS AvailableStock,
            SUM(IIF(sa.[Status] IN ('Full', 'Partial', 'None'), 1, 0))  AS OrdersAnyStock, 
    
            SUM(IIF(sa.RequisitionDate < dbo.TicksToTime(ma.DailyOrderCutOffTime, @toDate),
                    IIF(ma.IsPartialStock = 1,  CASE WHEN sa.[Status] IN ('Full', 'Partial') THEN 1 ELSE 0 END, 
                                                CASE WHEN sa.[Status] = 'Full' THEN 1 ELSE 0 END), 0)) AS AvailableBeforeCutOff                             
    FROM #Orders ma (NOLOCK)
    INNER JOIN #StockAvailability2 sa ON sa.AccountNumber = dbo.fn_RemoveUnitPrefix(ma.TaskAccountId)
    GROUP BY ma.MasterAccountId, ma.IsPartialStock
    

    Here the required columns from the MasterAccount table, such as ma.IsPartialStock, should be incorporated into the #Orders table itself if possible. I hope the idea is clear.

    There is no need for the MasterAccount table in the last query:

    -- MasterAccount is no longer joined, so the key comes from #Available (alias av).
    SELECT av.MasterAccountId,
           ISNULL(t.BatchedOrders, 0) BatchedOrders,
           ISNULL(t.PendingOrders, 0) PendingOrders,
           ISNULL(av.AvailableStock, 0) AvailableOrders,
           ISNULL(av.AvailableBeforeCutOff, 0) AvailableCutOff,
           ISNULL(av.OrdersAnyStock, 0) AllOrders
    FROM  #Available av 
    LEFT OUTER JOIN Totals t  ON t.MasterAccountId = av.MasterAccountId
    --WHERE a.IsActive = 1   -- not applicable here: alias a (MasterAccount) is not in this query
    

    I don't think the NOLOCK hint is needed on temp tables, since a local temp table is visible only to the session that created it.

  • 2021-02-07 10:48

    The answer is simple.

    SQL Server doesn't materialise CTEs. It inlines them, as you can see from the execution plans.
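
    A quick way to observe the inlining, assuming a SQL Server session where you can create temp tables (the names OneRow and #OneRow are illustrative): NEWID() is read through a CTE twice and through a temp table twice. Because each CTE reference is expanded into the plan separately, the first query will typically show two different GUIDs, while the temp table query shows the same stored value in both columns.

    ```sql
    -- CTE: each reference to OneRow is expanded and evaluated on its own.
    WITH OneRow AS (
        SELECT NEWID() AS Id
    )
    SELECT a.Id AS FirstReference, b.Id AS SecondReference
    FROM OneRow AS a
    CROSS JOIN OneRow AS b;

    -- Temp table: the value is computed once and stored, then read twice.
    SELECT NEWID() AS Id
    INTO #OneRow;

    SELECT a.Id AS FirstReference, b.Id AS SecondReference
    FROM #OneRow AS a
    CROSS JOIN #OneRow AS b;

    DROP TABLE #OneRow;
    ```

    The same effect applies to an expensive CTE that is referenced several times: its work may be repeated for each reference.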

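    Other DBMSs may implement it differently. A well-known example is Postgres, which before version 12 always materialised CTEs (it essentially created temporary tables for CTEs under the hood); from version 12 on, materialisation can be requested explicitly with the MATERIALIZED keyword.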

    Whether explicit materialisation of intermediate results in temporary tables is faster depends on the query.

    In complex queries, the overhead of writing intermediate data to temporary tables and reading it back can be offset by the simpler, more efficient execution plans that the optimiser is able to generate.

    On the other hand, in Postgres a materialised CTE (the only behaviour before version 12) is an "optimisation fence": the engine can't push predicates across the CTE boundary.

    Sometimes one way is better, sometimes the other. Once query complexity grows beyond a certain threshold, the optimiser can't analyse all possible ways to process the data and has to settle on something, for example the order in which to join the tables. The number of possible join orders grows factorially with the number of tables. The optimiser has limited time to generate a plan, so it may make a poor choice when all CTEs are inlined. When you manually break a complex query into smaller, simpler ones, you need to understand what you are doing, but the optimiser has a better chance of generating a good plan for each simple query.
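
    Since the outcome depends on the query, it helps to measure both variants side by side. The sketch below is only an illustration: it uses the sys.all_objects catalog view as stand-in data, and the TypeCounts / #TypeCounts names are made up. SET STATISTICS IO and SET STATISTICS TIME print reads and elapsed time per statement on the Messages tab.

    ```sql
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- Variant 1: everything in one statement; the CTE is referenced twice.
    WITH TypeCounts AS (
        SELECT [type], COUNT(*) AS Cnt
        FROM sys.all_objects
        GROUP BY [type]
    )
    SELECT tc.[type], tc.Cnt, mx.MaxCnt
    FROM TypeCounts AS tc
    CROSS JOIN (SELECT MAX(Cnt) AS MaxCnt FROM TypeCounts) AS mx;

    -- Variant 2: materialise the intermediate result, then query it.
    SELECT [type], COUNT(*) AS Cnt
    INTO #TypeCounts
    FROM sys.all_objects
    GROUP BY [type];

    SELECT tc.[type], tc.Cnt, mx.MaxCnt
    FROM #TypeCounts AS tc
    CROSS JOIN (SELECT MAX(Cnt) AS MaxCnt FROM #TypeCounts) AS mx;

    DROP TABLE #TypeCounts;

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    ```

    On data this small the difference is negligible; the point is the method. With the real query, compare the totals for variant 2 (including the SELECT ... INTO step) against variant 1, and look at the actual execution plans as well.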
