SQL Server 2008: Ordering by datetime is too slow

鱼传尺愫 2020-12-14 01:56

My table (SQL Server 2008) has more than 1 million records. When I order the records by a datetime column, the query takes 1 second, but when I order by ID (int), it takes only about 0.1 second.

7 answers
  • 2020-12-14 02:22

    If your datetime field contains a lot of distinct values and those values rarely change, define a clustered index on the datetime field. This physically orders the table's rows by the datetime value. See http://msdn.microsoft.com/en-us/library/aa933131(SQL.80).aspx for using clustered indexes.

    This will make your int searches slower though, as they will be relegated to using a non-clustered index.
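
    As a rough sketch of that change, assuming the primary key on id currently owns the clustered index (the names dbo.mytable, id, created_at and PK_mytable are hypothetical, not taken from the question):

    ```sql
    -- Hypothetical names: dbo.mytable, its columns id and created_at, and the
    -- constraint PK_mytable. Assumes the primary key is currently clustered.

    -- 1. Release the clustered index by dropping the existing primary key.
    ALTER TABLE dbo.mytable DROP CONSTRAINT PK_mytable;

    -- 2. Cluster the table on the datetime column (not UNIQUE, so duplicates are allowed).
    CREATE CLUSTERED INDEX cix_mytable_created_at
        ON dbo.mytable (created_at);

    -- 3. Restore the primary key on id, this time as a nonclustered index.
    ALTER TABLE dbo.mytable
        ADD CONSTRAINT PK_mytable PRIMARY KEY NONCLUSTERED (id);
    ```

    Step 1 fails if foreign keys reference the primary key, and step 2 rewrites the whole table, so on 1 million+ rows this is better tried on a copy first.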

  • 2020-12-14 02:23

    Put the datetime column in a new index of its own. Adding it to the existing index on id will not help much.
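
    A minimal sketch of the difference, with hypothetical names (dbo.mytable, id, created_at):

    ```sql
    -- Hypothetical names: dbo.mytable, id, created_at.

    -- A separate index keyed on the datetime column alone is ordered by created_at.
    CREATE NONCLUSTERED INDEX ix_mytable_created_at
        ON dbo.mytable (created_at);

    -- By contrast, an index keyed on (id, created_at) is ordered by id first,
    -- so it cannot return rows in created_at order without a sort.
    -- CREATE NONCLUSTERED INDEX ix_mytable_id_created_at
    --     ON dbo.mytable (id, created_at);
    ```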

  • 2020-12-14 02:30

    Maybe store the datetime as an int, although converting it every time you store or read data would take time. (This is a common technique for storing things like IP addresses to get faster seek times.)

    You should first check how your server stores datetime, because if it already stores it as an int or bigint, this will not change anything.
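
    One way to run that check is to compare storage sizes. In SQL Server a datetime value occupies 8 bytes, the same width as a bigint, so the sketch below reports 8 for both. The table name dbo.mytable in the second query is hypothetical:

    ```sql
    -- Both columns of the result are 8: datetime is as wide as bigint.
    SELECT DATALENGTH(CAST('2020-12-14T01:56:00' AS datetime)) AS datetime_bytes,
           DATALENGTH(CAST(0 AS bigint))                        AS bigint_bytes;

    -- Declared storage size of each column of a hypothetical table dbo.mytable.
    SELECT c.name, t.name AS type_name, c.max_length
    FROM sys.columns AS c
    JOIN sys.types   AS t ON t.user_type_id = c.user_type_id
    WHERE c.object_id = OBJECT_ID('dbo.mytable');
    ```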

  • 2020-12-14 02:36

    Ordering by id probably uses a clustered index scan while ordering by datetime uses either sorting or index lookup.

    Both of these methods are slower than a clustered index scan.

    If your table is clustered by id, basically it means it is already sorted. The records are contained in a B+Tree which has a linked list linking the pages in id order. The engine should just traverse the linked list to get the records ordered by id.

    If the ids were inserted in sequential order, the physical order of the rows will match the logical order and the clustered index scan will be even faster.

    If you want your records to be ordered by datetime, there are two options:

    • Take all records from the table and sort them. This is obviously slow.
    • Use the index on datetime. The index is stored separately from the table, so the engine needs to shuttle between the index pages and the table pages in a nested loop. This is slower too.
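
    To see which of these the engine actually picks, one option is to run both orderings with I/O and timing statistics switched on and compare the messages (and the actual execution plans, if you have SQL Server Management Studio). The table and column names here are hypothetical:

    ```sql
    -- Hypothetical names: dbo.mytable, id, created_at.
    SET STATISTICS IO ON;    -- reports logical/physical reads per table
    SET STATISTICS TIME ON;  -- reports CPU and elapsed time per statement

    SELECT * FROM dbo.mytable ORDER BY id;          -- expected: clustered index scan
    SELECT * FROM dbo.mytable ORDER BY created_at;  -- expected: sort or key lookups

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    ```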

    To improve the ordering, you can create a separate covering index on datetime:

    CREATE INDEX ix_mytable_datetime ON mytable (datetime) INCLUDE (field1, field2, …)
    

    Include every column that your query uses in that index.

    This index is like a shadow copy of your table but with data sorted in different order.

    This eliminates the key lookups (since the index contains all the data the query needs), which makes ordering by datetime about as fast as ordering by id.

    Update:

    A fresh blog post on this problem:

    • SQL Server: clustered index and ordering
  • 2020-12-14 02:44

    To honor the ORDER BY the engine has two alternatives:

    • scan the rows using an index that offers the order requested
    • sort the rows

    The first option is fast, the second is slow. The problem is that, in order to be used, the index has to be a covering index, meaning it contains (at a minimum) all the columns in the SELECT projection list and all the columns used in WHERE clauses. If the index is not covering, the engine has to look up the clustered index (i.e. the 'table') for each row in order to retrieve the values of the needed columns. This constant lookup of values is expensive, and there is a tipping point at which the engine will (rightfully) decide it is more efficient to just scan the clustered index and sort the result, in effect ignoring your non-clustered index. For details, see The Tipping Point Query Answers.

    Consider the following three queries:

    -- "table" is a placeholder name; it is a reserved word in T-SQL, hence the brackets
    SELECT dateColumn FROM [table] ORDER BY dateColumn;
    SELECT * FROM [table] ORDER BY dateColumn;
    SELECT someColumn FROM [table] ORDER BY dateColumn;
    

    The first one will use a non-clustered index on dateColumn. The second one will not use an index on dateColumn, and for 1M rows will likely choose a scan and sort instead. The third query, on the other hand, can benefit from an index on Table(dateColumn) INCLUDE (someColumn).
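
    A sketch of that index and how the three queries relate to it. dbo.MyTable and the index name are hypothetical stand-ins for the placeholder "table" above:

    ```sql
    -- dbo.MyTable is a hypothetical stand-in for the answer's "table".
    CREATE NONCLUSTERED INDEX ix_MyTable_dateColumn
        ON dbo.MyTable (dateColumn)
        INCLUDE (someColumn);

    -- Covered: both queries read only columns that the index holds,
    -- so the index can be scanned in dateColumn order with no lookups.
    SELECT dateColumn FROM dbo.MyTable ORDER BY dateColumn;
    SELECT someColumn FROM dbo.MyTable ORDER BY dateColumn;

    -- Typically not covered: * needs every column of the table, so the engine
    -- must either look up each row or scan the table and sort.
    SELECT * FROM dbo.MyTable ORDER BY dateColumn;
    ```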

    This topic is covered at length on MSDN; see Index Design Basics, General Index Design Guidelines, Nonclustered Index Design Guidelines, or How To: Optimize SQL Indexes.

    Ultimately, the most important choice in your table design is the clustered index you use. Almost always the primary key (usually an auto-incremented ID) is left as the clustered index, a decision that benefits only certain OLTP loads.

    And finally, a rather obvious question: Why in the world would you order 1 million rows?? You can't possibly display them, can you? Explaining a little bit more about your use case might help us find a better answer for you.
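
    If the real need is only the first screenful of rows, a query along these lines may be enough. This is a sketch assuming an index on dateColumn, a page size of 50, and the hypothetical table name dbo.MyTable:

    ```sql
    -- Hypothetical names; assumes an index on dateColumn and a page of 50 rows.
    SELECT TOP (50) *
    FROM dbo.MyTable
    ORDER BY dateColumn DESC;
    ```

    With so few rows requested, the engine can read one end of the index and do a handful of key lookups instead of sorting the whole table.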

  • 2020-12-14 02:44

    Have you added the DateTime field to "the" index or to an exclusive index? Are you filtering your selection by another field and the DateTime or only this one?

    You should have an index containing all the fields that you are filtering on, preferably in the same order, to optimize performance.
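
    For instance, if the query filters on one column by equality and then orders by the datetime, a composite index with the filter column first can serve both. The names dbo.mytable, status, created_at and @status are hypothetical:

    ```sql
    -- Hypothetical names: dbo.mytable, status, created_at, @status.
    CREATE NONCLUSTERED INDEX ix_mytable_status_created_at
        ON dbo.mytable (status, created_at);

    DECLARE @status int = 1;

    -- Within a single status value the index entries are already in
    -- created_at order, so no separate sort step is needed.
    SELECT status, created_at
    FROM dbo.mytable
    WHERE status = @status
    ORDER BY created_at;
    ```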
