SQL Index - Difference Between char and int

Question

I have a table in a SQL Server 2005 database. The primary key field of the table is a code number.

As a standard, the code must contain exactly 4 numeric digits. For example: 1234, 7834, ...

Which field type would you suggest, char(4), int, or numeric(4), in terms of efficient select operations? Would indexing the table on one of these types differ from indexing it on another?

Answer 1:

Integer / identity columns are often used for primary keys in database tables for a number of reasons. Primary key columns must be unique, should not be updatable, and really should be meaningless. This makes an identity column a pretty good choice: the server generates the next value for you, the values are unique, and integers are relatively small and usable (compared to a GUID).

Some database architects will argue that other data types should be used for primary key values and the "meaningless" and "not updatable" criteria can be argued convincingly on both sides. Regardless, integer / identity fields are pretty convenient and many database designers find that they make suitable key values for referential integrity.

The best choice for a primary key is usually an integer data type, since integer values are generally processed faster than character values. Character values (as a primary key) have to be compared according to the column's collation rules rather than as plain numbers.
Fetching a record by primary key will be faster with an integer primary key when the integer is narrower than the character key, because more index records fit on a single page, so the total search time decreases. Joins will be faster as well. This applies when your query uses a clustered index seek rather than a scan and only one table is involved. In the case of a scan, not having an additional column means more rows fit on one data page.
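The "more index records per page" argument can be checked with rough arithmetic. The following is only a back-of-the-envelope sketch: it assumes the 8 KB SQL Server page with a 96-byte header, and the per-entry overhead (`ASSUMED_OVERHEAD`) is a made-up round figure for illustration, not a value taken from the server.

```python
import math

PAGE_BYTES_USABLE = 8192 - 96   # 8 KB page minus the 96-byte page header
ASSUMED_OVERHEAD = 9            # hypothetical per-entry overhead in bytes (assumption)

# Storage width of each candidate key type, in bytes.
KEY_BYTES = {"smallint": 2, "int": 4, "char(4)": 4}

def entries_per_page(key_bytes, overhead=ASSUMED_OVERHEAD):
    """Approximate number of index entries that fit on one page."""
    return PAGE_BYTES_USABLE // (key_bytes + overhead)

def index_levels(row_count, key_bytes):
    """Approximate B-tree depth needed to address row_count keys."""
    fanout = entries_per_page(key_bytes)
    return max(1, math.ceil(math.log(row_count) / math.log(fanout)))

for name, width in KEY_BYTES.items():
    # 10,000 is the maximum number of distinct 4-digit codes.
    print(name, entries_per_page(width), index_levels(10_000, width))
```

Under these assumptions int and char(4) fit the same number of entries per page, because both are 4 bytes wide, and only smallint is denser. With at most 10,000 codes the estimated index depth is the same for all three, so the page-density difference is likely to be small in this particular case.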

Hopefully this will help you!

Answer 2:

I advocate a SMALLINT column, simply because it is the most sensible data type that fits the required range (up to 32,767 in SQL Server, well in excess of 4 digits). Use a check constraint to enforce the 4-digit limitation and a COMPUTED column to return the char(4) value.
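The shape of that design can be sketched as follows. This is an illustration only: it uses SQLite through Python's `sqlite3` module as a stand-in for SQL Server (it needs a SQLite build with generated-column support), the table name `item` and column names `code` and `code_char` are hypothetical, and the range 0 to 9999 assumes that codes with leading zeros such as 0001 are allowed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integer key, CHECK constraint for the 4-digit range, and a generated
# (computed) column that renders the zero-padded 4-character form.
conn.execute("""
    CREATE TABLE item (
        code INTEGER PRIMARY KEY CHECK (code BETWEEN 0 AND 9999),
        code_char TEXT GENERATED ALWAYS AS (printf('%04d', code)) VIRTUAL
    )
""")

def try_insert(code):
    """Insert a code; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO item (code) VALUES (?)", (code,))
        return True
    except sqlite3.IntegrityError:
        return False

# 12345 has five digits, so the CHECK constraint should reject it.
inserted = [try_insert(c) for c in (1, 1234, 12345)]
rows = conn.execute("SELECT code, code_char FROM item ORDER BY code").fetchall()
print(inserted, rows)
```

In SQL Server itself the column would be declared SMALLINT, and the computed column could be written along the lines of `code_char AS RIGHT('0000' + CAST(code AS varchar(4)), 4)`; treat that expression as one possible formulation rather than the answerer's exact code.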

Answer 3:

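If I remember correctly, ints generally take up less storage than chars, so you should go with int (although in SQL Server a char(4) and an int are both 4 bytes, so the saving shows up with wider char columns or with smallint). These two links say the same: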
http://www.eggheadcafe.com/software/aspnet/31759030/varcharschars-vs-intbigint-as-keys.aspx
http://sql-server-performance.com/Community/forums/p/16020/94489.aspx

Answer 4:

"It depends"

In this case, char(4) captures the stored data correctly with no storage overhead (4 bytes each, the same as int). And 0001 is not the same as 1, of course.
You do have some overhead for processing collation etc. if you have non-numeric characters, but it shouldn't matter for reasonably sized databases. And with a 4-digit code you have an upper bound on the number of rows, especially if the code is numeric (10,000).
If your new codes are not strictly increasing, then you get the page split issue associated with GUID clustered keys.
If they are strictly increasing, then use int and add a computed column to add leading zeros.

Source: https://stackoverflow.com/questions/5141737/sql-index-difference-between-char-and-int

Tags

sql-server-2005

indexing

primary-key

numeric