surrogate vs natural key: hard numbers on performance differences?

情话喂你 2021-02-03 13:52

There's a healthy debate out there between surrogate and natural keys:

SO Post 1

SO Post 2

My opinion, which seems to be in line with the majority (it\

2 answers

不知归路 (original poster)

2021-02-03 14:25
Natural keys differ from surrogate keys in value, not type.

Any type can be used for a surrogate key, like a VARCHAR for the system-generated slug or something else.

However, the types most commonly used for surrogate keys are INTEGER and RAW(16) (or whatever type your RDBMS uses for GUIDs).

Comparing surrogate integers and natural integers (like SSN) takes exactly the same time.

Comparing VARCHARs may take collation into account, and they are generally longer than integers, which makes them less efficient.
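To illustrate the collation point, here is a minimal sketch in MySQL. The `_utf8mb4` introducers and the two collation names are just my choice for the example, so that it does not depend on the connection character set:

```sql
-- Case-insensitive collation: the comparison has to apply collation rules,
-- so 'abc' and 'ABC' are treated as equal.
SELECT _utf8mb4'abc' COLLATE utf8mb4_general_ci = _utf8mb4'ABC' AS ci_equal;   -- 1

-- Binary collation: the comparison works on code points, so they differ.
SELECT _utf8mb4'abc' COLLATE utf8mb4_bin = _utf8mb4'ABC' AS bin_equal;         -- 0
```

An integer comparison has no such rules to apply, which is one reason it is cheaper.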

Comparing a pair of INTEGERs is probably also less efficient than comparing a single INTEGER.
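For reference, a pair comparison in MySQL can be written as a row comparison, which is equivalent to comparing each column and combining the results with AND. A small sketch with literal values:

```sql
-- Row comparison: true only when every position matches.
SELECT (1, 2) = (1, 2) AS same_pair,       -- 1
       (1, 2) = (1, 3) AS different_pair;  -- 0

-- The same test spelled out column by column.
SELECT (1 = 1 AND 2 = 2) AS same_pair,       -- 1
       (1 = 1 AND 2 = 3) AS different_pair;  -- 0
```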

On small datatypes this difference is probably a tiny fraction of the time required to fetch pages, traverse indexes, acquire database latches, etc.

And here are the numbers (in MySQL):
```
CREATE TABLE aint (id INT NOT NULL PRIMARY KEY, value VARCHAR(100));
CREATE TABLE adouble (id1 INT NOT NULL, id2 INT NOT NULL, value VARCHAR(100), PRIMARY KEY (id1, id2));
CREATE TABLE bint (id INT NOT NULL PRIMARY KEY, aid INT NOT NULL);
CREATE TABLE bdouble (id INT NOT NULL PRIMARY KEY, aid1 INT NOT NULL, aid2 INT NOT NULL);

INSERT
INTO    aint
SELECT  id, RPAD('', FLOOR(RAND(20090804) * 100), '*')
FROM    t_source;

INSERT
INTO    bint
SELECT  id, id
FROM    aint;

INSERT
INTO    adouble
SELECT  id, id, value
FROM    aint;

INSERT
INTO    bdouble
SELECT  id, id, id
FROM    aint;

SELECT  SUM(LENGTH(value))
FROM    bint b
JOIN    aint a
ON      a.id = b.aid;

SELECT  SUM(LENGTH(value))
FROM    bdouble b
JOIN    adouble a
ON      (a.id1, a.id2) = (b.aid1, b.aid2);
```
t_source is just a dummy table with 1,000,000 rows.
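If you want to reproduce this, one possible way to build such a table is sketched below. This is an assumption on my part, not the script used for the numbers here: it needs MySQL 8.0 or later for the recursive CTE, and the single-column layout of `t_source` is hypothetical (the test only needs an `id` column).

```sql
-- Hypothetical layout: the benchmark only reads t_source.id.
CREATE TABLE t_source (id INT NOT NULL PRIMARY KEY);

-- The default recursion limit is 1000, so raise it with some headroom.
SET SESSION cte_max_recursion_depth = 2000000;

-- Generate the integers 1 .. 1000000.
INSERT INTO t_source (id)
WITH RECURSIVE seq (id) AS (
    SELECT 1
    UNION ALL
    SELECT id + 1 FROM seq WHERE id < 1000000
)
SELECT id FROM seq;
```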

aint and adouble, and likewise bint and bdouble, contain exactly the same data, except that aint has a single integer as its PRIMARY KEY, while adouble has a pair of identical integers.

On my machine, both queries run for 14.5 seconds, ± 0.1 second.

The performance difference, if any, is within the range of fluctuation.
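If you repeat the test, it may be worth checking that both joins are resolved through the primary key, so that the timings compare like with like. A sketch using `EXPLAIN` on the two queries above (the plan you get depends on your MySQL version and settings, so I am not showing output):

```sql
-- Plan for the single-integer key join.
EXPLAIN
SELECT  SUM(LENGTH(value))
FROM    bint b
JOIN    aint a
ON      a.id = b.aid;

-- Plan for the two-integer key join.
EXPLAIN
SELECT  SUM(LENGTH(value))
FROM    bdouble b
JOIN    adouble a
ON      (a.id1, a.id2) = (b.aid1, b.aid2);
```

In each plan, look at the `key` column for table `a`: if it shows `PRIMARY` in both cases, the two queries are doing the same kind of lookup and differ only in key width.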