Why does SQL Server use a non-clustered index over the clustered PK in a “select *” operation?

心已入冬 提交于 2019-12-12 11:16:26

问题


I've got a very simple table which stores Titles for people ("Mr", "Mrs", etc). Here's a brief version of what I'm doing (using a temporary table in this example, but the results are the same):

create table #titles (
    t_id    tinyint     not null    identity(1, 1),
    title   varchar(20) not null,

    constraint pk_titles primary key clustered (t_id),
    constraint ux_titles unique nonclustered (title)
)
go

insert #titles values ('Mr')
insert #titles values ('Mrs')
insert #titles values ('Miss')

select * from #titles

drop table #titles

Notice that the primary key of the table is clustered (explicitly, for the sake of the example) and there's a non-clustered uniqueness constraint the the title column.

Here's the results from the select operation:

t_id title
---- --------------------
3    Miss
1    Mr
2    Mrs

Looking at the execution plan, SQL uses the non-clustered index over the clustered primary key. I'm guessing this explains why the results come back in this order, but what I don't know is why it does this.

Any ideas? And more importantly, any way of stopping this behavior? I want the rows to be returned in the order they were inserted.

Thanks!


回答1:


If you want order, you need to specify an explicit ORDER BY - anything else does not produce an order (it's "order" is random and could change). There is no implied ordering in SQL Server - not by anything. If you need order - say so with ORDER BY.

SQL Server probably uses the non-clustered index (if it can - if that index has all the columns your query is asking for) since that it smaller - usually just the index column(s) and the clustering key (again: one or multiple columns). The clustered index on the other hand is the whole data (at the leaf level), so it might require a lot more data to be read, in order to get your answer (not in this over-simplified example, of course - but in the real world).




回答2:


The only way to (absolutely and correctly) guarantee row order is to use ORDER BY -- anything else is an implementation detail and apt to explode, as demonstrated.

As to why the engine chose the unique index: it just didn't matter.

  1. There was no criteria favoring one index over another
  2. The unique index covered the data (title and PK) returned; this is somewhat speculative on my part, but SQL Server is doing what it thinks best.

Try it on a table with an additional column which is not covered -- no bets, but it may make the query planner change its mind.

Happy coding.




回答3:


SQLServer probably chose the non clustered index because all the data you requested (the id and title) could be retrieved from that index.

For such a trivial table it doesn't really matter which access path was chosen as the worse path is still only two IOs.

As someone commented above if you want your data in a particular order you must specificaly request this using the "ORDER BY" clause otherwise its pretty random what you get back.




回答4:


Nonclustered indexes are usually smaller than clustered ones so it is usually faster to scan a nonclustered index rather than a clustered one. That probably explains SQL Server's preference for a nonclustered index, even though in your case the indexes are the same size.

The only way to guarantee the order of rows returned is to specify ORDER BY. If you don't specify ORDER BY then you are implicitly telling the optimizer that it can choose what order to return the rows in.



来源:https://stackoverflow.com/questions/6936891/why-does-sql-server-use-a-non-clustered-index-over-the-clustered-pk-in-a-select

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!