Is there a formula to estimate index size in InnoDB?

感动是毒 2021-02-11 09:42

How can I calculate index size for specific column types in InnoDB, for example:

  • VARCHAR
  • CHAR
  • TIMESTAMP
  • SMALLINT

I\'v

2 answers
  •  星月不相逢
    2021-02-11 10:35

    In InnoDB, the PRIMARY KEY is stored together with the data (it is the clustered index), so you can think of it as taking no extra space.

    For a secondary key, take the MyISAM formula, but include the columns of both the secondary key and the PRIMARY KEY, then multiply by 3. (There is a lot of overhead.) Even so, the answer can be off by a factor of 2 in either direction.
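    The rule of thumb above can be sketched in a few lines of Python. The MyISAM formula itself is not reproduced in this answer, so this sketch assumes the per-row cost is simply the summed byte widths of the key columns plus the PRIMARY KEY columns, which is what the worked example further down uses. The function name, its parameters, and the sample table are hypothetical.

```python
def estimate_secondary_index_bytes(key_bytes, pk_bytes, rows, overhead=3.0):
    """Rule-of-thumb size of one InnoDB secondary index, in bytes.

    key_bytes: average bytes per row for the indexed columns
    pk_bytes:  average bytes per row for the PRIMARY KEY columns, which
               InnoDB stores in every secondary index entry
    overhead:  the "multiply by 3" factor from the answer (an approximation)

    Returns (low, estimate, high), where low and high reflect the
    factor-of-2 uncertainty in either direction.
    """
    estimate = (key_bytes + pk_bytes) * rows * overhead
    return estimate / 2, estimate, estimate * 2


# Hypothetical table: INDEX on a 4-byte INT, 8-byte BIGINT primary key, 1M rows.
low, mid, high = estimate_secondary_index_bytes(4, 8, 1_000_000)
print(f"{low / 1e6:.0f} MB .. {mid / 1e6:.0f} MB .. {high / 1e6:.0f} MB")
```

    Returning a range rather than a single number keeps the uncertainty visible; treat the middle value as a planning figure, not a measurement.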

    Note that if you have a lot of secondary keys, the size of the PK makes a big difference in the overall space for the table+indexes.

    Example

    SET @db = 'world', @tbl = 'cities';
        SELECT      n_rows AS 'Approx Rows',
                    'Data & PK' AS 'Type',
                    clustered_index_size * 16384 AS Bytes,
                    ROUND(clustered_index_size * 16384 / n_rows) AS 'Bytes/row',
                    clustered_index_size AS Pages,
                    ROUND(n_rows / clustered_index_size) AS 'Rows/page'
            FROM mysql.innodb_table_stats
            WHERE database_name = @db
              AND table_name = @tbl
        UNION
            SELECT  n_rows,
                    'Secondary Indexes' AS 'Type',
                    sum_of_other_index_sizes * 16384 AS Bytes,
                    ROUND(sum_of_other_index_sizes * 16384 / n_rows) AS 'Bytes/row',
                    sum_of_other_index_sizes AS Pages,
                    ROUND(n_rows / sum_of_other_index_sizes) AS 'Rows/page'
            FROM mysql.innodb_table_stats
            WHERE database_name = @db
              AND table_name = @tbl
              AND sum_of_other_index_sizes > 0
              ;
    -- (Percona has a different way.)
    

    Output:

    +-------------+-------------------+-----------+-----------+-------+-----------+
    | Approx Rows | Type              | Bytes     | Bytes/row | Pages | Rows/page |
    +-------------+-------------------+-----------+-----------+-------+-----------+
    |     2637973 | Data & PK         | 179077120 |        68 | 10930 |       241 |
    |     2637973 | Secondary Indexes | 232341504 |        88 | 14181 |       186 |
    +-------------+-------------------+-----------+-----------+-------+-----------+
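    The derived columns in this output follow directly from the page counts. As a small check, assuming the 16384-byte page size that the query hard-codes (InnoDB's default), the arithmetic can be replayed in Python; the helper name `page_stats` is made up for illustration.

```python
PAGE_SIZE = 16384  # bytes per InnoDB page, the value hard-coded in the query


def page_stats(n_rows, pages, page_size=PAGE_SIZE):
    """Return (bytes, bytes_per_row, rows_per_page) for one row of the output."""
    total_bytes = pages * page_size
    return total_bytes, round(total_bytes / n_rows), round(n_rows / pages)


print(page_stats(2637973, 10930))  # Data & PK
print(page_stats(2637973, 14181))  # Secondary Indexes
```

    If the server runs with a non-default innodb_page_size, the multiplier has to change accordingly.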
    

    The table has a PRIMARY KEY and two secondary indexes:

    PRIMARY KEY(...)  -- 14 bytes
    INDEX(state, population)
    INDEX(state, city)
      state CHAR(2) CHARACTER SET ascii -- 2 bytes
      population INT UNSIGNED -- 4 bytes
      city  -- AVG(LENGTH(city)) = 1+9.07 bytes
    
    COUNT(*): 2,699,354  (the InnoDB estimate was not too far from this)
    
    First index:  20    bytes * 2.7M rows = 54MB
    Second index: 26.07 bytes * 2.7M rows = 70MB
    Total:  124MB
    Actual: 232MB
    Ratio: 1.9x  (note: I skipped the "/0.67")
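    The same estimate can be replayed in Python. This is only a sketch of the arithmetic above: it uses the exact COUNT(*) instead of the rounded 2.7M, so the megabyte figures differ slightly from the hand calculation, and the variable names are illustrative.

```python
ROWS = 2_699_354            # COUNT(*) reported in the answer
PK_BYTES = 14               # PRIMARY KEY width, carried in every secondary index entry
ACTUAL_BYTES = 232_341_504  # secondary index bytes from the stats output

# Per-row bytes of the indexed columns only (the PK is added below).
indexes = {
    "INDEX(state, population)": 2 + 4,         # CHAR(2) ascii + INT UNSIGNED
    "INDEX(state, city)": 2 + (1 + 9.07),      # CHAR(2) ascii + length byte + avg city length
}

total = 0.0
for name, key_bytes in indexes.items():
    size = (key_bytes + PK_BYTES) * ROWS
    total += size
    print(f"{name}: {size / 1e6:.1f} MB")

ratio = ACTUAL_BYTES / total
print(f"Estimated total: {total / 1e6:.1f} MB, actual/estimate: {ratio:.2f}x")
```

    The ratio lands a little under 2, which is why the raw column-width sum has to be scaled up by an overhead factor before it is useful for planning.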
    

    Just to prove another point, I tried OPTIMIZE TABLE. The stats afterwards were essentially the same:

    +-------------+-------------------+-----------+-----------+-------+-----------+
    | Approx Rows | Type              | Bytes     | Bytes/row | Pages | Rows/page |
    +-------------+-------------------+-----------+-----------+-------+-----------+
    |     2685828 | Data & PK         | 179077120 |        67 | 10930 |       246 |
    |     2685828 | Secondary Indexes | 232341504 |        87 | 14181 |       189 |
    +-------------+-------------------+-----------+-----------+-------+-----------+
    
