Question
I would like some advice on this. I have a table in which I want to keep track of an object and a list of keys related to that object. Example:
OBJECTID  ITEMTYPE  ITEMKEY
--------  --------  -------
1         1         THE
1         1         BROWN
1         2         APPLE
1         3         ORANGE
2         2         WINDOW
Both OBJECTID and ITEMKEY have high selectivity (i.e. their values are very varied). I access the table in two ways:
By OBJECTID: Each time an object changes, its list of keys changes, so an index on OBJECTID is needed. Changes happen frequently.
By ITEMKEY: This is for keyword searching and also happens frequently.
So I probably need two indexes, one of which will be the clustered index (the one that is accessed more frequently, or where I want the speed to be; for now, let's assume I will prioritize OBJECTID for the clustered index). What I am confused about is how I should design it.
My question is, which is better:
a) A clustered index on (OBJECTID, ITEMTYPE, ITEMKEY), and then an index on (ITEMKEY). My concern is that since the clustered key is so big (2 ints, 1 string), the other index will be big too, because every index entry has to point back to the clustered key.
b) Create a new column with a running identity, DIRECTORYID (integer), as primary key and clustered index, and declare two indexes on (OBJECTID, ITEMTYPE, ITEMKEY) and on just (ITEMKEY). This will minimize index space but have higher lookup costs.
c) A clustered index on (OBJECTID, ITEMTYPE, ITEMKEY), and a materialized view of (ITEMKEY, ITEMTYPE, OBJECTID) on it. My logic is that this avoids a key lookup and would still be about as big as the index with a lookup in a), at the cost of higher overhead.
d) Err... maybe there is a better way given the requirements?
Thanks in advance, Andrew
Answer 1:
Whenever possible, try to keep your clustered key as small as possible, since it will also be added to all non-clustered indices on your table.
Therefore, I would use an INT if at all possible, or possibly a combination of two INTs, but certainly never a VARCHAR column, especially if that column is potentially wide (> 10 chars) and is bound to change.
So of the options you present, I personally would choose b). Why?
Adding a surrogate DirectoryID will satisfy all crucial criteria for a clustering key:
- small
- stable
- unique
- ever-increasing
and your other non-clustered indices will be minimally impacted.
See Kimberly Tripp's outstanding blog post on the main criteria for choosing a good clustering key on your SQL Server tables - very useful and enlightening!
To satisfy your query requirements, I would add two non-clustered indices: one on ObjectID (possibly including other columns frequently needed), and another on ItemKey to search by key name.
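As a rough illustration, option b) could be sketched in T-SQL along the following lines. The table name dbo.Directory, the VARCHAR(100) width, and the index and constraint names are assumptions made up for this example and do not come from the question; the INCLUDE list is just one guess at which "other columns" might be frequently needed.

```sql
-- Sketch only: table name, column sizes, and index names are hypothetical.
CREATE TABLE dbo.Directory
(
    DirectoryID INT IDENTITY(1,1) NOT NULL,  -- surrogate key: small, stable, unique, ever-increasing
    ObjectID    INT               NOT NULL,
    ItemType    INT               NOT NULL,
    ItemKey     VARCHAR(100)      NOT NULL,  -- assumed width
    CONSTRAINT PK_Directory PRIMARY KEY CLUSTERED (DirectoryID)
);

-- Access path 1: fetch (or replace) all keys of one object.
-- The INCLUDE columns let this index answer the query without a key lookup.
CREATE NONCLUSTERED INDEX IX_Directory_ObjectID
    ON dbo.Directory (ObjectID)
    INCLUDE (ItemType, ItemKey);

-- Access path 2: keyword search by ItemKey.
CREATE NONCLUSTERED INDEX IX_Directory_ItemKey
    ON dbo.Directory (ItemKey);

-- Example queries for the two access paths.
SELECT ItemType, ItemKey
FROM dbo.Directory
WHERE ObjectID = 1;

SELECT DirectoryID
FROM dbo.Directory
WHERE ItemKey = 'APPLE';
```

Because every non-clustered index row carries the clustering key, both indexes here store only the 4-byte DirectoryID as the row locator. If the keyword search usually needs ObjectID back, adding it as an included column on the ItemKey index would avoid the extra lookup, at the cost of a slightly larger index.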
Source: https://stackoverflow.com/questions/3849068/sql-server-clustered-index-design-for-dictionary