I want to handle a date dimension in a MySQL datawarehouse. (I m a newbie in the DW world)
I made some searches with google and saw a lot of table structures (most of) date dimension where the Primary Key is a simple UNSIGNED INTEGER
.
Why don't use a DATE
field as primary key since with MySQL it is 3 Bytes VS 4 Bytes for INTEGER
?
Ex:
CREATE TABLE dimDate
id INTEGER UNSIGNED NOT NULL PRIMARY AUTOI_NCREMENT,
date DATE NOT NULL,
dayOfWeek
...
VS
CREATE TABLE dimDate
date DATE NOT NULL PRIMARY,
dayOfWeek
...
If you have a table with a column that is of date
type and where no two rows will ever have the same date, then you can surely use this column as PRIMARY KEY
.
You see a lot of examples where the Primary Key is a simple UNSIGNED INTEGER
because there are many cases where there is no perfect candidate for Primary Key. The AUTO_INCREMENT
allows this column to be automatically filled by the database (and be unique) during inserts .
Date dimension is kind of special -- having date (2011-12-07) or date-related integer (20111207) for a primary key is actually preferred. This allows for nice partitioning (by date) of fact tables.
For other type of dimensions, surrogate (integer) keys are recommended.
As a template, each dimension usually has entries for unknown, not entered, error, ...
which are often matched with keys 0, -1, -2, ...
Due to this, it is more common to find integer-formatted date (20111207) as a primary key instead of date -- it is a bit messy to represent unknown, not entered, error, ...
with date-type key.
来源:https://stackoverflow.com/questions/8415715/using-a-date-field-as-primary-key-of-a-date-dimension-with-mysql