I came across a comment on the following blog post that recommends against using MEDIUMINT:
Don’t use [the 24bit INT], even in MySQL. It’s dumb,
In the grand scheme of things, fetching a row is the big cost. Simple functions and expressions, and even more so the data formats, are insignificant in how long a query takes.
On the other hand, if your dataset is too large to stay cached, the overhead of I/O to fetch rows becomes even more significant. A crude rule of thumb says that a non-cached row takes 10 times as long to fetch as a cached one. Hence, shrinking the dataset (such as by using a smaller *INT) may give you a huge performance benefit.
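As a rough illustration (the table and column names below are hypothetical), an INT UNSIGNED takes 4 bytes and tops out at 4,294,967,295, while a MEDIUMINT UNSIGNED takes 3 bytes and tops out at 16,777,215; the single byte saved shows up once per row plus once per secondary index that contains the column:

    -- Hypothetical sketch: the same id column declared two ways.
    CREATE TABLE page_hits_int (
        user_id  INT UNSIGNED NOT NULL,        -- 4 bytes per value
        hit_day  DATE NOT NULL,
        PRIMARY KEY (user_id, hit_day)
    );

    CREATE TABLE page_hits_medium (
        user_id  MEDIUMINT UNSIGNED NOT NULL,  -- 3 bytes per value
        hit_day  DATE NOT NULL,
        PRIMARY KEY (user_id, hit_day)
    );
    -- At 100M rows, the 1-byte difference is roughly 100MB of data (repeated in
    -- every index that includes user_id), which matters only if it helps the
    -- working set fit in the buffer pool.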
This argument applies to ...INT, FLOAT vs DOUBLE, DECIMAL(m,n), DATETIME(n), etc. (A different discussion is needed for [VAR]CHAR/BINARY(...) and TEXT/BLOB.)
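A hedged sketch of the same idea across those families (hypothetical table; byte counts per MySQL's documented storage requirements): pick the narrower variant whenever the range and precision you actually need allow it.

    -- Hypothetical sketch: smaller variants chosen where the data permits.
    CREATE TABLE sensor_reading (
        sensor_id  SMALLINT UNSIGNED NOT NULL,  -- 2 bytes vs 4 for INT
        reading    FLOAT NOT NULL,              -- 4 bytes vs 8 for DOUBLE
        price      DECIMAL(7,2) NOT NULL,       -- 4 bytes; wider precision costs more
        taken_at   DATETIME(0) NOT NULL,        -- 5 bytes vs 8 for DATETIME(6)
        PRIMARY KEY (sensor_id, taken_at)
    );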
For those with a background in Assembly language: keep in mind that MySQL packs the columns within a row with no alignment to word boundaries. Hence, the only sane way to write the code is to work at the byte level, and to ignore register size and assume all values are mis-aligned.
For Optimization, in order of importance:
Rule of Thumb: If a tentative optimization does not (via a back-of-the-envelope calculation) yield at least a 10% improvement, don't waste your time on it; instead look for some bigger improvement. For example, indexes and Summary tables often provide 10x (not just 10%).
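One hedged way to do that back-of-the-envelope check in MySQL itself (the table name 'page_hits' is hypothetical) is to see how big the table and its indexes already are, then compare the estimated bytes saved per row against that total:

    -- Approximate current size of the table and its indexes.
    SELECT  table_name,
            table_rows,                              -- estimated row count
            data_length  / 1024 / 1024 AS data_mb,   -- clustered index (the rows)
            index_length / 1024 / 1024 AS index_mb   -- secondary indexes
    FROM    information_schema.tables
    WHERE   table_schema = DATABASE()
      AND   table_name   = 'page_hits';
    -- e.g. 50M rows * 1 byte saved = ~50MB; against a 5GB table that is ~1%,
    -- so the rule of thumb says to look for a bigger win (an index, a Summary table).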