What are the differences between a clustered
and a non-clustered index
?
Clustered Index
Non Clustered Index
Both types of index will improve performance when select data with fields that use the index but will slow down update and insert operations.
Because of the slower insert and update clustered indexes should be set on a field that is normally incremental ie Id or Timestamp.
SQL Server will normally only use an index if its selectivity is above 95%.
Clustered indexes are stored physically on the table. This means they are the fastest and you can only have one clustered index per table.
Non-clustered indexes are stored separately, and you can have as many as you want.
The best option is to set your clustered index on the most used unique column, usually the PK. You should always have a well selected clustered index in your tables, unless a very compelling reason--can't think of a single one, but hey, it may be out there--for not doing so comes up.
A clustered index actually describes the order in which records are physically stored on the disk, hence the reason you can only have one.
A Non-Clustered Index defines a logical order that does not match the physical order on disk.
An indexed database has two parts: a set of physical records, which are arranged in some arbitrary order, and a set of indexes which identify the sequence in which records should be read to yield a result sorted by some criterion. If there is no correlation between the physical arrangement and the index, then reading out all the records in order may require making lots of independent single-record read operations. Because a database may be able to read dozens of consecutive records in less time than it would take to read two non-consecutive records, performance may be improved if records which are consecutive in the index are also stored consecutively on disk. Specifying that an index is clustered will cause the database to make some effort (different databases differ as to how much) to arrange things so that groups of records which are consecutive in the index will be consecutive on disk.
For example, if one were to start with an empty non-clustered database and add 10,000 records in random sequence, the records would likely be added at the end in the order they were added. Reading out the database in order by the index would require 10,000 one-record reads. If one were to use a clustered database, however, the system might check when adding each record whether the previous record was stored by itself; if it found that to be the case, it might write that record with the new one at the end of the database. It could then look at the physical record before the slots where the moved records used to reside and see if the record that followed that was stored by itself. If it found that to be the case, it could move that record to that spot. Using this sort of approach would cause many records to be grouped together in pairs, thus potentially nearly doubling sequential read speed.
In reality, clustered databases use more sophisticated algorithms than this. A key thing to note, though, is that there is a tradeoff between the time required to update the database and the time required to read it sequentially. Maintaining a clustered database will significantly increase the amount of work required to add, remove, or update records in any way that would affect the sorting sequence. If the database will be read sequentially much more often than it will be updated, clustering can be a big win. If it will be updated often but seldom read out in sequence, clustering can be a big performance drain, especially if the sequence in which items are added to the database is independent of their sort order with regard to the clustered index.
Clustered basically means that the data is in that physical order in the table. This is why you can have only one per table.
Unclustered means it's "only" a logical order.
A clustered index is essentially a sorted copy of the data in the indexed columns.
The main advantage of a clustered index is that when your query (seek) locates the data in the index then no additional IO is needed to retrieve that data.
The overhead of maintaining a clustered index, especially in a frequently updated table, can lead to poor performance and for that reason it may be preferable to create a non-clustered index.