How is it possible to calculate the index size for specific column types in InnoDB?
A secondary index record consists of the indexed columns plus the primary key columns. Sum those column sizes, multiply by the number of rows, and add some overhead; the result approximates the index size.
In InnoDB, the PRIMARY KEY is embedded with the data, so you can think of it as taking no space.
For a secondary key, take the MyISAM formula, but include the columns of both the secondary key and the PRIMARY KEY. Then multiply by 3. (There is a lot of overhead.) Still, the answer can be off by a factor of 2 in either direction.
Note that if you have a lot of secondary keys, the size of the PK makes a big difference in the overall space for the table+indexes.
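To make the PK effect concrete, here is a rough sketch with made-up numbers (5 secondary indexes, 2.7M rows; neither figure comes from a real table). It counts only the raw bytes of the PK copies stored in the secondary indexes, before any of the overhead mentioned above:

```sql
-- Hypothetical values, for illustration only.
SET @n_rows = 2700000, @n_secondary = 5;

-- Every secondary index row carries a copy of the PK columns.
SELECT @n_secondary * @n_rows * 4  AS pk_copy_bytes_with_4_byte_pk,
       @n_secondary * @n_rows * 14 AS pk_copy_bytes_with_14_byte_pk;
```

Under these assumptions the PK copies alone come to about 54MB with a 4-byte PK versus about 189MB with a 14-byte PK.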
Example
SET @db = 'world', @tbl = 'cities';
-- The *_size columns are page counts; 16384 assumes the default 16KB innodb_page_size.
SELECT  n_rows AS 'Approx Rows',
        'Data & PK' AS 'Type',
        clustered_index_size * 16384 AS Bytes,
        ROUND(clustered_index_size * 16384 / n_rows) AS 'Bytes/row',
        clustered_index_size AS Pages,
        ROUND(n_rows / clustered_index_size) AS 'Rows/page'
    FROM mysql.innodb_table_stats
    WHERE database_name = @db
      AND table_name = @tbl
UNION
SELECT  n_rows,
        'Secondary Indexes' AS 'Type',
        sum_of_other_index_sizes * 16384 AS Bytes,
        ROUND(sum_of_other_index_sizes * 16384 / n_rows) AS 'Bytes/row',
        sum_of_other_index_sizes AS Pages,
        ROUND(n_rows / sum_of_other_index_sizes) AS 'Rows/page'
    FROM mysql.innodb_table_stats
    WHERE database_name = @db
      AND table_name = @tbl
      AND sum_of_other_index_sizes > 0
;
-- (Percona has a different way.)
Output:
+-------------+-------------------+-----------+-----------+-------+-----------+
| Approx Rows | Type              | Bytes     | Bytes/row | Pages | Rows/page |
+-------------+-------------------+-----------+-----------+-------+-----------+
|     2637973 | Data & PK         | 179077120 |        68 | 10930 |       241 |
|     2637973 | Secondary Indexes | 232341504 |        88 | 14181 |       186 |
+-------------+-------------------+-----------+-----------+-------+-----------+
The table has a PRIMARY KEY and two secondary indexes:
PRIMARY KEY(...) -- 14 bytes
INDEX(state, population)
INDEX(state, city)
state CHAR(2) CHARACTER SET ascii -- 2 bytes
population INT UNSIGNED -- 4 bytes
city -- AVG(LENGTH(city)) = 1+9.07 bytes
COUNT(*): 2,699,354 (the InnoDB estimate was not too far from this)
First index: 20 bytes * 2.7M rows = 54MB
Second index: 26.07 bytes * 2.7M rows = 70MB
Total: 124MB
Actual: 232MB
Ratio: 1.9x (note: I skipped the "/0.67")
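The same arithmetic can be written as a query, which makes it easy to plug in other column sizes. This is only a sketch: the variable names are mine, the column sizes are the ones listed above, and the "actual" figure is the Secondary Indexes byte count from the stats output.

```sql
-- Inputs copied from the figures above.
SET @n_rows = 2699354,     -- COUNT(*)
    @pk     = 14,          -- PRIMARY KEY bytes
    @actual = 232341504;   -- "Secondary Indexes" Bytes from the stats output

SELECT ROUND((2 + 4 + @pk) * @n_rows / 1e6)     AS state_population_MB,
       ROUND((2 + 10.07 + @pk) * @n_rows / 1e6) AS state_city_MB,
       ROUND(@actual / ((2 + 4 + @pk + 2 + 10.07 + @pk) * @n_rows), 1)
                                                AS actual_vs_estimate;
```

The three values should land close to the 54MB, 70MB and 1.9x figures worked out by hand.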
Just to prove another point, I tried OPTIMIZE TABLE. The stats afterward were essentially the same:
+-------------+-------------------+-----------+-----------+-------+-----------+
| Approx Rows | Type              | Bytes     | Bytes/row | Pages | Rows/page |
+-------------+-------------------+-----------+-----------+-------+-----------+
|     2685828 | Data & PK         | 179077120 |        67 | 10930 |       246 |
|     2685828 | Secondary Indexes | 232341504 |        87 | 14181 |       189 |
+-------------+-------------------+-----------+-----------+-------+-----------+
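The query against mysql.innodb_table_stats only gives the combined size of all secondary indexes. If you want to see how that total splits between individual indexes, a sketch along the following lines should work, assuming InnoDB persistent statistics are enabled so that mysql.innodb_index_stats is populated. It uses @@innodb_page_size rather than a hard-coded 16384, and the same database and table names as the example above.

```sql
SET @db = 'world', @tbl = 'cities';

-- One row per index; stat_name = 'size' is the number of pages in that index.
-- The PRIMARY row is the clustered index, i.e. the data itself.
SELECT index_name,
       stat_value AS Pages,
       stat_value * @@innodb_page_size AS Bytes
    FROM mysql.innodb_index_stats
    WHERE database_name = @db
      AND table_name = @tbl
      AND stat_name = 'size'
    ORDER BY index_name;
```

These are sampled statistics, so treat the numbers as estimates; running ANALYZE TABLE first refreshes them.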