SQL Server heap vs. clustered index


I am using SQL Server 2008. I know that if a table has no clustered index, it is called a heap; otherwise its storage model is a clustered index (B-Tree).

I want to lea


3 answers

一生所求

2021-01-31 04:15

Books Online is the best source!

The whole section "Database Engine - Planning and Architecture - Tables and Index Data Structures Architecture" is a very good introduction to the internals.

From this link you can download a local copy of Books Online (it is free). It is the best (and official) reference for all SQL Server 2008 questions.

忘掉有多难

2021-01-31 04:32
Heap storage has nothing to do with the heap data structure.

Heap just means the records themselves are not ordered (i.e. not linked to one another).

When you insert a record, it just gets inserted into the free space the database finds.

Updating a row in a heap-based table does not affect other records (though it does affect secondary indexes).

If you create a secondary index on a heap table, the RID (a kind of physical pointer to the storage space) is used as the row pointer.
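As a minimal sketch of the heap case, the script below creates a table without a clustered index, adds a secondary index, and looks both up in the `sys.indexes` catalog view. The table and index names (`dbo.heap_demo`, `IX_heap_demo_id`) are hypothetical and used only for illustration.

```sql
-- Hypothetical demo table: no PRIMARY KEY and no clustered index, so it is stored as a heap.
CREATE TABLE dbo.heap_demo
(
    id      INT          NOT NULL,
    payload VARCHAR(100) NOT NULL
);

-- Secondary (nonclustered) index on the heap; its entries locate rows by RID.
CREATE NONCLUSTERED INDEX IX_heap_demo_id ON dbo.heap_demo (id);

-- List the storage structures of the table.
-- The heap itself is reported with index_id = 0 and type_desc = 'HEAP'.
SELECT  name, index_id, type_desc
FROM    sys.indexes
WHERE   object_id = OBJECT_ID('dbo.heap_demo');
```

If the table later gets a clustered index, the `HEAP` row in this listing should be replaced by a `CLUSTERED` one.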

Clustered index means that the records are part of a B-Tree. When you insert a record, the B-Tree needs to be relinked.

Updating a row in a clustered table causes relinking of the B-Tree, i.e. updating internal pointers in other records.

If you create a secondary index on a clustered table, the value of the clustered index key is used as a row pointer.

This means a clustered index should be unique. If a clustered index is not unique, a special hidden column called a uniquifier is appended to the index key to make it unique (and larger in size).
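A small sketch of the difference, assuming a hypothetical table `dbo.orders_demo` (all names here are illustrative): the first index is declared without `UNIQUE`, so duplicate keys get the hidden uniquifier; the second is declared unique on a wider key, so none is needed.

```sql
-- Hypothetical demo table.
CREATE TABLE dbo.orders_demo
(
    order_id    INT NOT NULL,
    customer_id INT NOT NULL
);

-- Non-unique clustered index: many rows may share a customer_id,
-- so SQL Server has to add a hidden uniquifier to tell duplicates apart.
CREATE CLUSTERED INDEX CX_orders_demo ON dbo.orders_demo (customer_id);

-- Replace it with a unique clustered index on a key that is unique by itself;
-- no uniquifier is required in this case.
DROP INDEX CX_orders_demo ON dbo.orders_demo;

CREATE UNIQUE CLUSTERED INDEX CX_orders_demo
    ON dbo.orders_demo (customer_id, order_id);
```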

It is also worth noting that creating a secondary index on a column makes the values of the clustered index's key part of the secondary index's key.

By creating an index on a clustered table, you in fact always get a composite index:
```
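-- The clustered index defines the row locator for every secondary index.
CREATE UNIQUE CLUSTERED INDEX CX_mytable_1234 ON mytable (col1, col2, col3, col4);

-- The secondary index implicitly carries col1..col4 as well.
CREATE INDEX IX_mytable_5678 ON mytable (col5, col6, col7, col8);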
```
Index IX_mytable_5678 is in fact an index on the following columns:
```
col5
col6
col7
col8
col1
col2
col3
col4
```
This has one more side effect:

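A DESC clause in a single-column index on a clustered table makes sense in SQL Server, because the clustered key appended to the index keeps its own sort direction.

This index (assuming the table is clustered on id):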
```
CREATE INDEX IX_mytable ON mytable (col1)
```
can be used in a query like this:
```
SELECT  TOP 100 *
FROM    mytable
ORDER BY
       col1, id
```
while this one:
```
CREATE INDEX IX_mytable ON mytable (col1 DESC)
```
can be used in a query like this:
```
SELECT  TOP 100 *
FROM    mytable
ORDER BY
       col1, id DESC
```
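To make the ordering argument concrete, here is a minimal sketch on a hypothetical table `dbo.desc_demo` clustered on `id` (the names are illustrative and not from the original post). The index is stored in `(col1 DESC, id ASC)` order, so it can return rows either in exactly that order or in the exact reverse; any other combination of directions would need an extra sort. Whether the optimizer actually picks the index depends on the data and statistics.

```sql
-- Hypothetical demo table, clustered on id (ascending).
CREATE TABLE dbo.desc_demo
(
    id   INT NOT NULL,
    col1 INT NOT NULL,
    CONSTRAINT PK_desc_demo PRIMARY KEY CLUSTERED (id)
);

-- Effective key order of this index: col1 DESC, then id ASC.
CREATE INDEX IX_desc_demo_col1 ON dbo.desc_demo (col1 DESC);

-- Matches the stored order, so a forward scan of the index can serve it.
SELECT  TOP 100 col1, id
FROM    dbo.desc_demo
ORDER BY
        col1 DESC, id;

-- Exact reverse of the stored order, so a backward scan can serve it.
SELECT  TOP 100 col1, id
FROM    dbo.desc_demo
ORDER BY
        col1, id DESC;

-- Neither the stored order nor its reverse: this one needs a sort
-- (or a different index, such as one on col1 ASC).
SELECT  TOP 100 col1, id
FROM    dbo.desc_demo
ORDER BY
        col1, id;
```

The queries select only `col1` and `id` so that the secondary index covers them; with `SELECT *` the engine would also have to look up the remaining columns in the clustered index.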
耶瑟儿～

2021-01-31 04:35
Heaps are just tables without a clustering key - without a key that enforces a certain physical order.

I would not really recommend having heaps at any time - except maybe if you use a table temporarily to bulk-load an external file, and then distribute those rows to other tables.

In every other case, I would strongly recommend using a clustering key. SQL Server will use the Primary Key as the clustering key by default - which is a good choice, in most cases. UNLESS you use a GUID (UNIQUEIDENTIFIER) as your primary key, in which case using that as your clustering key is a horrible idea.

See Kimberly Tripp's blog posts "GUIDs as PRIMARY KEYs and/or the clustering key" and "The Clustered Index Debate Continues" for excellent explanations of why you should always have a clustering key, and why a GUID is a horrible clustering key.

My recommendation would be:
- in 99% of all cases, try to use an INT IDENTITY as your primary key and let SQL Server make that the clustering key as well
- exception #1: if you're bulk loading huge amounts of data, you might be fine without a primary / clustering key for your temporary table
- exception #2: if you must use a GUID as your primary key, then set your clustering key to a different column - preferably an INT IDENTITY - and I would even create a separate INT column just for that purpose, if no other column can be used
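The first and third recommendations could look roughly like this. It is a sketch only: the table, column, and constraint names are hypothetical.

```sql
-- Common case: INT IDENTITY primary key.
-- With no other clustered index present, the PRIMARY KEY becomes the clustering key by default.
CREATE TABLE dbo.customer_demo
(
    customer_id INT IDENTITY(1, 1) NOT NULL,
    name        NVARCHAR(100)      NOT NULL,
    CONSTRAINT PK_customer_demo PRIMARY KEY (customer_id)
);

-- Exception #2: the primary key must be a GUID.
-- Keep the primary key NONCLUSTERED and cluster on a separate INT IDENTITY column.
CREATE TABLE dbo.document_demo
(
    document_id UNIQUEIDENTIFIER   NOT NULL
        CONSTRAINT DF_document_demo_id DEFAULT NEWID(),
    cluster_id  INT IDENTITY(1, 1) NOT NULL,
    title       NVARCHAR(200)      NOT NULL,
    CONSTRAINT PK_document_demo PRIMARY KEY NONCLUSTERED (document_id)
);

CREATE UNIQUE CLUSTERED INDEX CX_document_demo
    ON dbo.document_demo (cluster_id);
```

In the second table, lookups by `document_id` still go through the primary key index, while the physical row order follows the narrow, ever-increasing `cluster_id`.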
Marc