问题
This is a bit of an open question but I would really like to hear people opinions.
I rarely make use of explicitly declared temporary tables (either table variables or regular #tmp tables) as I believe not doing so leads to more concise, readable and debuggable T-SQL. I also think that SQL can do a better job than I of making use of temporary storage when it's required (such as when you use a derived table in a query).
The only exception is when the database is not a typical relational database but a star or snowflake schema. I understand that it's best to apply filters to the fact table first and then use the resultant temp table to get the values from your dimensions.
Is this the common opinion or does anyone have an opposing view?
回答1:
Temporary tables are most useful for a complex batch process like a report or ETL job. Generally you would expect to use them fairly rarely in a transactional application.
If you're doing complex query with a join involving multiple large tables (perhaps for a report) the query optimiser may not actually be able to optimise this in one hit, so temporary tables become a win here - they decompose the query into a series of simpler ones that give the query optimiser less opportunity to screw up the plan. Sometimes you have an operation that cannot be done in a single SQL statement at all, so multiple steps for processing are necessary to do the job at all. Again, we're talking about more complex manipulations here.
You can also create a tempory table for an intermediate result and then index the table, possibly even putting a clustered index on it to optimise a subsequent query. This might also be a quick and dirty way to optimise a report query on a system where you are not allowed to add indexes to the database schema. SELECT INTO is useful for this type of operation as it is minimally logged (and therefore fast) and doesn't require to align the columns of a select and insert.
Other reasons might include extracting data from XML fields using CROSS APPLY and xpath queries. Generally it's much more efficient to extract this into a temp table and then work on the temp table. They're also much faster than CTE's for some tasks as they materialise the query results rather than re-evaluating the query.
One thing to note is that temporary tables are exactly the same structure that the query engine uses to store intermediate join results, so there is no performance penalty to using them. Temporary tables also allow multi-phase tasks using set operations and make cursors almost (not quite but almost) unnecessary in T-SQL code.
'Code Smell' is an overstatement but if I saw a lot of simple operations involving temporary tables I would be wondering what was going on.
回答2:
It really depends on what you are doing. I generally try to avoid them, but sometimes you need to do something complicated that takes multiple steps. Generally this is way beyond the simple select from table stuff. Like anything else it's a tool that you have to know when to use.
I would agree with you that I normally let the db handle stuff behind the scenes, but there are times when it's optimization is off and you have to go in and do it by hand.
回答3:
I see temp tables as a sort of SQL code smell, to be used only as a last resort. If you are having to cache data before you get a final result set, then it usually indicates bad DB design to me.
回答4:
Temp tables certainly have appropriate uses, they're not a code smell if they're used correctly. One of the nice things about them is that they live in tempdb, which is typically set to Simple recovery model. This means that if you're using temp tables for what they're good for (mostly bulk operations), you're generating a minimal amount of log compared to what the same operation would do on tables in your production db, which probably is in Full recovery model.
If, as another poster suggested, your production db is on good hardware, but your tempdb isn't, ask your DBA to move it. SQL Server itself uses tempdb quite a bit to process your queries, so it's important for tempdb to have a high performance home.
Table variables are a different creature entirely. They live only in memory. One good use for them is if you've got a function that you need to call for each row in your query with CROSS APPLY. If that function is expensive, but the number of different results you can get from it is small, you might get significantly higher performance from precomputing the results of all the possible calls (or perhaps all possible calls for your dataset) and storing that in a table variable, then joining to that table variable instead of using CROSS APPLY.
回答5:
I, too, avoid temporary tables. It is my understanding that temporary tables on MS SQL Server are always in the file group of the master database. What that means is that, while your production application tables are most probably on some expensive, high performance RAID set up, your temporary tables are located wherever MS SQL Server was installed which is most probably on your C: drive under the Program Files directory.
回答6:
Also useful when you have a dataset that needs to be retrieved once and used over and over in subsequent statements.
Makes these long batch processes more readable (sometimes this is more important than performance).
来源:https://stackoverflow.com/questions/447115/temporary-table-usage-in-sql-server