Performance of User-Defined Table Types in SQL Server

Question

We have been using User-Defined Table Types to pass a list of integers to our stored procedures.

We then use these to join to other tables in our stored proc queries.

For example:

CREATE PROCEDURE [dbo].[sp_Name]
(
    @Ids [dbo].[OurTableType] READONLY  
)
AS
    SET NOCOUNT ON

    SELECT
        *
    FROM
        SOMETABLE
        INNER JOIN @Ids [OurTableType] ON [OurTableType].Id = SOMETABLE.Id

We have seen very poor performance from this when using larger datasets.

One approach we've used to speed things up is to dump the contents into a temp table and join to that instead.

For example:

CREATE PROCEDURE [dbo].[sp_Name]
(
    @Ids [dbo].[OurTableType] READONLY  
)
AS
    SET NOCOUNT ON
    CREATE TABLE #TempTable(Id INT)
    INSERT INTO #TempTable (Id)
    SELECT Id FROM @Ids

    SELECT
        *
    FROM
        SOMETABLE
        INNER JOIN #TempTable ON #TempTable.Id = SOMETABLE.Id

    DROP TABLE #TempTable

This does improve performance significantly, but I wanted to get some opinions on this approach and on any consequences we haven't considered. An explanation of why this improves performance would also be useful.

N.B. Sometimes we may need to pass in more than just an integer, which is why we don't use a comma-separated list or something similar.

Answer 1:

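This topic has been discussed before. The primary reason for the poor performance of the JOIN is that the Table-Valued Parameter (TVP) is a Table Variable. Table Variables do not keep statistics and appear to the Query Optimizer to have only 1 row, so the plan it picks is sized for a tiny input and degrades as the TVP grows. Hence they are fine for something like INSERT INTO Table (column_list) SELECT column_list FROM @TVP; but not for a JOIN. Temporary tables do have statistics, which is why copying the rows into one improves the estimates and the resulting plan.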
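One way to observe the misestimate is to compare estimated and actual row counts in the execution plan. The following is a minimal sketch, assuming that [dbo].[OurTableType] has a single INT column named Id and that SOMETABLE lives in the dbo schema; the 10,000-row fill from sys.all_objects is only there to generate test data.

```sql
-- Assumption: dbo.OurTableType is defined with a single INT column named Id.
DECLARE @Ids dbo.OurTableType;

-- Generate 10,000 sequential integers as test data.
INSERT INTO @Ids (Id)
SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Return the actual execution plan as XML alongside the results.
SET STATISTICS XML ON;

SELECT t.Id
FROM dbo.SOMETABLE AS t
INNER JOIN @Ids AS i
        ON i.Id = t.Id;

SET STATISTICS XML OFF;
```

In the returned plan, the scan of @Ids will typically show an estimate of 1 row against 10,000 actual rows on versions before SQL Server 2019, or under a database compatibility level below 150.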

There are a few things to try to get around this:

1. Dump everything into a local temporary table (you are already doing this). A technical downside is that you duplicate the data passed in via the TVP within tempdb (where both the TVP and the temp table store their data).
2. Try defining the User-Defined Table Type with a Clustered Primary Key. You can do this inline on the [Id] column:
   ```
   [Id] INT NOT NULL PRIMARY KEY
   ```
   I am not sure how much this helps performance, but it is worth a try.
3. Try adding OPTION (RECOMPILE) to the query. This lets the Query Optimizer see how many rows are in the Table Variable so that it can produce proper estimates.
   ```
   SELECT column_list
   FROM   SOMETABLE
   INNER JOIN @Ids [OurTableType]
           ON [OurTableType].Id = SOMETABLE.Id
   OPTION (RECOMPILE);
   ```
   The downside is that the RECOMPILE takes additional time on every call to the proc, but it might still be a net gain overall.
4. Starting in SQL Server 2014, you can take advantage of In-Memory OLTP and specify WITH (MEMORY_OPTIMIZED = ON) for the User-Defined Table Type. Please see "Scenario: Table variable can be MEMORY_OPTIMIZED=ON" for details. I have heard that this definitely helps. Unfortunately, in SQL Server 2014 and SQL Server 2016 RTM this feature is only available in 64-bit Enterprise Edition. Starting with SQL Server 2016 SP1, however, it was made available in all editions (with the possible exception of SQL Server Express LocalDB).
5. SQL Server 2019 introduces "Table variable deferred compilation":

With table variable deferred compilation, compilation of a statement that references a table variable is deferred until the first actual execution of the statement. This deferred compilation behavior is identical to the behavior of temporary tables. This change results in the use of actual cardinality instead of the original one-row guess.

Please see the linked documentation for details.
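The type-level and database-level options in the list above could look roughly like the following. This is a sketch rather than a tested migration: the type names OurTableType_PK and OurTableType_InMem are hypothetical (SQL Server has no ALTER TYPE, so a changed definition needs a new type or a drop and re-create), and the memory-optimized variant assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup.

```sql
-- Option 2: table type with a clustered primary key on Id.
-- (Hypothetical name; duplicate Ids passed by callers will now raise an error.)
CREATE TYPE dbo.OurTableType_PK AS TABLE
(
    Id INT NOT NULL PRIMARY KEY CLUSTERED
);

-- Option 4: memory-optimized table type (SQL Server 2014 and later).
-- Assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup.
-- A memory-optimized table type needs at least one index, here the
-- nonclustered primary key.
CREATE TYPE dbo.OurTableType_InMem AS TABLE
(
    Id INT NOT NULL PRIMARY KEY NONCLUSTERED
)
WITH (MEMORY_OPTIMIZED = ON);

-- Option 5: table variable deferred compilation applies under
-- compatibility level 150 (SQL Server 2019). This changes optimizer
-- behavior for the whole database, so test before applying it.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150;
```

A procedure would then declare its parameter with the new type, for example `@Ids [dbo].[OurTableType_PK] READONLY`, and callers would need to populate that same type.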

PS. Don't use SELECT *; always specify a column list, unless you are doing something like IF EXISTS (SELECT * FROM ...).
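As a small illustration of that distinction, here is a sketch that assumes [dbo].[OurTableType] has an INT column named Id and that SOMETABLE is in the dbo schema; the three inserted values are placeholders.

```sql
-- Assumption: dbo.OurTableType has a single INT column named Id.
DECLARE @Ids dbo.OurTableType;
INSERT INTO @Ids (Id) VALUES (1), (2), (3);

-- SELECT * is harmless inside EXISTS: only the existence of a row is
-- checked, so no column values are returned.
IF EXISTS (SELECT * FROM @Ids)
BEGIN
    -- In the real query, name the columns you need explicitly.
    SELECT t.Id
    FROM dbo.SOMETABLE AS t
    INNER JOIN @Ids AS i
            ON i.Id = t.Id;
END;
```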

Source: https://stackoverflow.com/questions/35457211/performance-of-user-defined-table-types-in-sql-server

Tags

sql-server

tsql

sql-server-2008-r2

user-defined-types

table-valued-parameters