SQL Server - Clustered index design for dictionary

Question

I would like some advice on this. I have a table where I want to keep track of an object and a list of keys related to that object. Example:

OBJECTID   ITEMTYPE   ITEMKEY
--------   --------   -------
1          1          THE
1          1          BROWN
1          2          APPLE
1          3          ORANGE
2          2          WINDOW

Both OBJECTID and ITEMKEY have high selectivity (i.e. their values are very varied). I access the table in two ways:

By OBJECTID: Each time an object changes, its list of keys changes, so a key (index) on OBJECTID is needed. Changes happen frequently.
By ITEMKEY: This is for keyword searching, which also happens frequently.

So I probably need two keys, and must choose one of them for the clustered index (the one that is accessed more frequently, or where I want the speed to be; for now let's assume I will prioritize OBJECTID for the clustered index). What I am confused about is how I should design it.

My question is, which is better:

a) A clustered index on (OBJECTID,ITEMTYPE,ITEMKEY), and then an index on (ITEMKEY). My concern is that since the clustered key is so wide (2 ints, 1 string), the other index will be big, because every index entry has to point back to the clustered key.

b) Create a new column with a running identity DIRECTORYID (integer) as primary key and clustered index, and declare two indexes, one on (OBJECTID,ITEMTYPE,ITEMKEY) and one on just (ITEMKEY). This will minimize index space but have higher lookup costs.

c) A clustered index on (OBJECTID,ITEMTYPE,ITEMKEY), and a materialized view of (ITEMKEY,ITEMTYPE,OBJECTID) on top of it. My logic is that this avoids a key lookup and will be no bigger than the index with a lookup in a), at the cost of higher overhead.

d) Err... maybe there is a better way, given the requirements?

Thanks in advance, Andrew

Answer 1:

If at all possible, try to keep your clustered key as small as possible, since it will also be added to all non-clustered indices on your table.

Therefore, I would use an INT if at all possible, or possibly a combination of two INTs - but certainly never a VARCHAR column - especially if that column is potentially wide (> 10 chars) and is bound to change.

So of the options you present, I personally would choose b). Why?

Adding a surrogate DirectoryID will satisfy all crucial criteria for a clustering key:

small
stable
unique
ever-increasing

and your other non-clustered indices will be minimally impacted.

See Kimberly Tripp's outstanding blog post on the main criteria for choosing a good clustering key on your SQL Server tables - very useful and enlightening!

To satisfy your query requirements, I would add two non-clustered indices, one on ObjectID (possibly including other columns frequently needed), and another on ItemKey to search by keyname.
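
As a rough sketch of what option b) with these two indices might look like in T-SQL: the table name, index names, column types, and the VARCHAR length below are all assumptions for illustration (the question does not give them), and the INCLUDE columns are just one plausible choice of "other columns frequently needed".

```sql
-- Hypothetical names and types; adjust to the real schema.
CREATE TABLE dbo.ObjectDirectory
(
    DirectoryID INT IDENTITY(1,1) NOT NULL,  -- surrogate key: small, stable, unique, ever-increasing
    ObjectID    INT               NOT NULL,
    ItemType    INT               NOT NULL,
    ItemKey     VARCHAR(100)      NOT NULL,  -- length is an assumption
    CONSTRAINT PK_ObjectDirectory PRIMARY KEY CLUSTERED (DirectoryID)
);

-- Access path 1: fetch or replace the key list for one object.
-- Including ItemType and ItemKey (an assumption) lets this index cover
-- the query, so no lookup into the clustered index is needed.
CREATE NONCLUSTERED INDEX IX_ObjectDirectory_ObjectID
    ON dbo.ObjectDirectory (ObjectID)
    INCLUDE (ItemType, ItemKey);

-- Access path 2: keyword search by ItemKey.
-- Including ObjectID (also an assumption) covers "which objects have this key".
CREATE NONCLUSTERED INDEX IX_ObjectDirectory_ItemKey
    ON dbo.ObjectDirectory (ItemKey)
    INCLUDE (ObjectID);

-- Example queries that each index is meant to serve:
SELECT ItemType, ItemKey
FROM dbo.ObjectDirectory
WHERE ObjectID = 1;

SELECT ObjectID
FROM dbo.ObjectDirectory
WHERE ItemKey = 'APPLE';
```

Whether the INCLUDE columns are worth the extra index space depends on the actual query mix; without them, each index stays narrower but the queries above would need a key lookup via DirectoryID.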

Source: https://stackoverflow.com/questions/3849068/sql-server-clustered-index-design-for-dictionary

Tags

sql

sql-server

database-design

clustered-index