How do I estimate SQL Server index sizes?

孤街醉人 submitted on 2019-12-04 10:21:05

Each index leaf row has a preamble identifying the data page (7 bytes, plus some directory information for variable-length columns, if any) plus a copy of the key value(s), which take up the same space as those columns do in the table. There is one leaf row for each row in the table. The upper levels of the index are much smaller, usually less than 1% of the leaf level, unless you are indexing a very wide key.

The fill factor leaves some space free on each leaf page so that updates and inserts do not cause excessive page splits.
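
As a rough illustration of the arithmetic above, the following Python sketch turns a row count, key width, and fill factor into a leaf page count and a total size. The 7-byte row overhead comes from the description above; the 8096 usable bytes per 8 KB page, the 2-byte slot entry per row, and the 1% allowance for the upper levels are assumptions made for illustration, and the function names are hypothetical.

```python
import math

PAGE_KB = 8            # SQL Server pages are 8 KB
USABLE_BYTES = 8096    # assumption: bytes left for rows after the page header
ROW_OVERHEAD = 7       # per-row preamble quoted in the text above
SLOT_BYTES = 2         # assumption: one slot-array entry per row


def estimate_leaf_pages(row_count, key_bytes, fill_factor=100):
    """Estimate the number of leaf pages for an index.

    key_bytes is the (average) size of the key columns in one row;
    fill_factor is the percentage of each leaf page that gets filled.
    """
    row_bytes = key_bytes + ROW_OVERHEAD + SLOT_BYTES
    usable = USABLE_BYTES * fill_factor // 100
    rows_per_page = max(1, usable // row_bytes)
    return math.ceil(row_count / rows_per_page)


def estimate_index_kb(row_count, key_bytes, fill_factor=100, upper_share=0.01):
    """Estimate total index size in KB, adding a small share for upper levels."""
    leaf_pages = estimate_leaf_pages(row_count, key_bytes, fill_factor)
    total_pages = math.ceil(leaf_pages * (1 + upper_share))
    return total_pages * PAGE_KB
```

Under these assumptions, 1,000,000 rows with an 8-byte key come out at 2101 leaf pages with a fill factor of 100 and 2632 leaf pages with a fill factor of 80. Treat the result as a ballpark figure, not as what SQL Server will actually allocate.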

EDIT: This MSDN link describes the page-level structures, although it's a bit light on the format of the individual index rows. This presentation goes into the physical format of disk log entries and data pages to some extent. This one goes into more detail and includes the index data structures. Numeric and fixed-length columns have the size it says on the box; you would have to estimate the average size of varchar columns.
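
A minimal sketch of that per-row key estimate, in Python: fixed-length columns contribute their declared size, and each varchar column contributes its estimated average length. The type sizes in the lookup table are the standard SQL Server storage sizes; the 2 bytes per variable-length column plus 2 bytes for the column count is an assumption about the "directory information" mentioned above, and the function name is hypothetical.

```python
import math

# Storage size in bytes for a few common fixed-length types.
FIXED_BYTES = {
    "tinyint": 1,
    "smallint": 2,
    "int": 4,
    "bigint": 8,
    "datetime": 8,
    "uniqueidentifier": 16,
}


def estimate_key_bytes(fixed_types, avg_varchar_lengths=()):
    """Estimate the average key size of one index row in bytes.

    fixed_types: names of the fixed-length key columns.
    avg_varchar_lengths: estimated average length of each varchar key column.
    """
    fixed = sum(FIXED_BYTES[t] for t in fixed_types)
    if not avg_varchar_lengths:
        return fixed
    # Assumption: 2 bytes for the variable-column count, plus a
    # 2-byte offset entry for each variable-length column.
    variable = 2 + sum(2 + n for n in avg_varchar_lengths)
    return fixed + math.ceil(variable)
```

For example, a key on an int column plus a varchar column averaging 20 characters would be estimated at 4 + 2 + 22 = 28 bytes under these assumptions.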

For reference, some documents on Oracle's block format can be found Here and Here.

When possible, I generally take 1000 records from the original table, insert them into a table of my own, and use the script below to get a sample to play with.

OK, it is not accurate, but it gives me a starting point.

--Find out the disk size of an index:
--USE [DB NAME HERE]
go
SELECT
OBJECT_NAME(I.OBJECT_ID) AS TableName,
I.name AS IndexName,   
8 * SUM(AU.used_pages) AS 'Index size (KB)',
CAST(8 * SUM(AU.used_pages) / 1024.0 AS DECIMAL(18,2)) AS 'Index size (MB)'
FROM
sys.indexes I
JOIN sys.partitions P ON P.OBJECT_ID = I.OBJECT_ID AND P.index_id = I.index_id
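-- container_id matches hobt_id for in-row and row-overflow units (types 1 and 3)
-- and partition_id for LOB units (type 2)
JOIN sys.allocation_units AU ON (AU.type IN (1, 3) AND AU.container_id = P.hobt_id)
                             OR (AU.type = 2 AND AU.container_id = P.partition_id)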
--WHERE 
--    OBJECT_NAME(I.OBJECT_ID) = '<TableName>'    
GROUP BY
I.OBJECT_ID,    
I.name
ORDER BY
TableName

--========================================================================================

--http://msdn.microsoft.com/en-us/library/fooec9de780-68fd-4551-b70b-2d3ab3709b3e.aspx

--I believe that keeping the GROUP BY
--is the best option in this case,
--because sys.allocation_units
--can hold 4 types of allocation unit
--for the same index, as listed below:

--type tinyint
--Type of allocation unit.
--0 = Dropped
--1 = In-row data (all data types, except LOB data types)
--2 = Large object (LOB) data (text, ntext, image, xml, large value types, and CLR user-defined types)
--3 = Row-overflow data

--marcelo miorelli 8-NOV-2013
--========================================================================================
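
To turn the figure measured on the 1000-row sample into a starting point for the full table, a simple linear extrapolation can be sketched in Python as below. The function name and parameters are hypothetical, and the linear scaling is itself an assumption: a small sample carries proportionally more per-index overhead and may not reflect the real distribution of variable-length values.

```python
def extrapolate_index_kb(sample_kb, sample_rows, total_rows):
    """Scale a sampled index size (KB) up to the full row count.

    The result is rounded up to whole 8 KB pages.
    """
    if sample_rows <= 0:
        raise ValueError("sample_rows must be positive")
    # Ceiling division in integers: pages needed for the scaled size.
    pages = -(-(sample_kb * total_rows) // (sample_rows * 8))
    return pages * 8
```

For example, if the sample index reports 24 KB for 1000 rows and the real table holds 5,000,000 rows, the sketch returns 120000 KB, roughly 117 MB.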