问题
I have the following query
DECLARE @StartDate DATE = '2017-09-22'
DECLARE @EndDate DATE = '2017-09-23'
SELECT a.col1,
a.col2,
b.col1,
b.col2,
b.col3,
a.col3
FROM TableA a
JOIN TableB b
ON b.pred = a.pred
WHERE b.col2 > @StartDate AND b.col2 < @EndDate
When I run this and inspect the actual execution plan, I can see that the most costly operator is a clustered index scan (The index is on a.pred)
However, if I change the query as follows
SELECT a.col1,
a.col2,
b.col1,
b.col2,
b.col3,
a.col3
FROM TableA a
JOIN TableB b
ON b.pred = a.pred
WHERE b.col2 > '2017-09-22' AND b.col2 < '2017-09-23'
The index scan is eliminated and an index seek is used.
Can someone explain why this is? In my mind, this is something to do with the fact that the value in the variable could be anything so SQL doesn't know how plan the execution.
Is there any way I can eliminate the table scan but still have use of the variable? (PS, this will be converted to a stored procedure which takes @StartDate and @EndDate as parameters)
EDIT
col2 is DATETIME, however, if I make my variable DATETIME the problem still persists
回答1:
SQL makes plans reusable for variables.
When you use variables - it compiles query without knowing actual values you'll pass. Even in this sql batch
values are known. But it won't need to recompile query for another set of passing arguments.
So if you hardcode values - DB compiles it chosing the plan optimized for these particular values (e.g., it guesses expected number of rows passed date check). It'd be 'at least not worse' than when you use variables. But DB needs to recompile it for another set of hardcoded values (because text of the query is changed), which takes time and litters compiled query cache
storage superseding the other useful queries.
As of:
Is there any way I can eliminate the table scan but still have use of the variable? (PS, this will be converted to a stored procedure which takes @StartDate and @EndDate as parameters)
I think non-clustered index on b.col2
maybe be the solution. The key of this index may also contain b.pred as a part of surrogate key or as including (with include(pred)
).
回答2:
This query has variables, the advice in this question about sql server not knowing the value of your variables and therefore has to develop a plan based on guessed size of the result set is relevant to your question.
Why is SQL Server using index scan instead of index seek when WHERE clause contains parameterized values
However, you mentioned that you are going to convert this code to a stored procedure. At the time of converting it to a stored procedure the query optimizer should be able to sniff the value of the variables and develop and execution plan off of them. Try converting it to a sproc and executing it. The query plan should improve under those conditions.
来源:https://stackoverflow.com/questions/46852874/date-parameter-causing-clustered-index-scan