I've never really understood the difference between these two indexes. Can someone please explain what the difference is (performance-wise, how the index structure will look li
The internal storage of indexes uses a B-Tree structure and consists of "index pages" (the root and all intermediate pages) and "index data pages" (the leaf pages only).
Note: do not confuse "index data pages" with the "data pages" (leaf pages of clustered indexes), which store most of the columns of actual data.
By placing some columns in the INCLUDE section, less data per index key is stored on each page.
When an index is used, the index key is used to navigate through the index pages to the correct index data page.
If the index has INCLUDE columns, that data is immediately available should the query need it.
If the query needs columns that are in neither the index keys nor the INCLUDE columns, then an additional "bookmark lookup" is required to the correct row in the clustered index (or heap if no clustered index is defined).
Some things to note that hopefully address some of your confusion:
... INCLUDE columns).
... INCLUDE columns as well.)
It's worth noting that before INCLUDE columns were added as a feature, the usual way to avoid bookmark lookups was to add the extra columns to the index key itself. INCLUDE columns basically allow the same benefit more efficiently.
NB Something very important to point out. You generally get zero benefit out of
INCLUDE columns in your indexes if you're in the lazy habit of always writing your queries as SELECT * ... By returning all columns you're basically ensuring a bookmark lookup is required in any case.
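To make the SELECT * point concrete, here is a minimal T-SQL sketch. It assumes the AdventureWorks Person.Address table used elsewhere in this thread; the index name is hypothetical and the postal code is just a sample value. Compare the logical reads and the execution plans of the two queries rather than trusting any particular numbers.

```sql
-- Hypothetical index name: PostalCode is the only key column,
-- the other four columns are carried at the leaf level only.
CREATE NONCLUSTERED INDEX IX_Address_PostalCode_Incl
    ON Person.Address (PostalCode)
    INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID);

-- Report logical reads per statement so the two queries can be compared.
SET STATISTICS IO ON;

-- Covered: every referenced column is either the key or an INCLUDE column,
-- so the index leaf pages can answer the query without a lookup.
SELECT AddressLine1, AddressLine2, City, StateProvinceID
FROM Person.Address
WHERE PostalCode = 'A1234';

-- Not covered: SELECT * asks for columns the index does not carry, so the
-- plan needs a lookup per matching row or a scan of the clustered index.
SELECT *
FROM Person.Address
WHERE PostalCode = 'A1234';

SET STATISTICS IO OFF;
```

In the actual execution plan, the first query should show an Index Seek on its own, while the second typically shows an Index Seek plus a Key Lookup (or a clustered index scan, depending on what the optimizer estimates).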
In the first index, only PostalCode is the key column in the index pages; AddressLine1, AddressLine2, City, StateProvinceID are part of the leaf nodes, to avoid a key/RID lookup.
I would prefer the first index when the table is always filtered on PostalCode and any of the columns AddressLine1, AddressLine2, City, StateProvinceID appear only in the select list, not in the filter:
select AddressLine1, AddressLine2, City, StateProvinceID
from Person.Address
Where PostalCode=
In the second index, the index pages hold five key columns: PostalCode, AddressLine1, AddressLine2, City, StateProvinceID.
I would prefer the second index when I may need to filter the data like
Where PostalCode = And AddressLine1 =
or
Where PostalCode = And AddressLine2 =
or
Where PostalCode = And AddressLine1 = and AddressLine2 =
and so on.
In any case, the first column of the index should be part of the filter for the index to be used for a seek.
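The leading-column rule can be tried out with a sketch like the following. It again assumes AdventureWorks; the index name and the literal values are hypothetical, chosen only for illustration.

```sql
-- Hypothetical index name: all five columns are key columns here,
-- so they are stored at every level of the B-tree.
CREATE NONCLUSTERED INDEX IX_Address_PostalCode_AllKeys
    ON Person.Address (PostalCode, AddressLine1, AddressLine2, City, StateProvinceID);

-- Leading column plus the next key column: both predicates can be used
-- to seek into the index.
SELECT AddressLine1, AddressLine2, City, StateProvinceID
FROM Person.Address
WHERE PostalCode = 'A1234' AND AddressLine1 = 'street A';

-- Leading column present, but a key column is skipped: the seek uses
-- PostalCode only, and AddressLine2 is checked on the rows that seek returns.
SELECT AddressLine1, AddressLine2, City, StateProvinceID
FROM Person.Address
WHERE PostalCode = 'A1234' AND AddressLine2 = 'StreetB';

-- Leading column missing from the filter: this index cannot be seeked on
-- City alone, so expect a scan (of this index or another one).
SELECT AddressLine1, AddressLine2, City, StateProvinceID
FROM Person.Address
WHERE City = 'London';
```

Checking the Seek Predicates and Predicate properties of the index operator in each plan shows which conditions actually narrowed the seek and which were applied afterwards.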
In the first example, only the index column, PostalCode, is stored in the index tree, with all the other columns stored at the leaf level of the index. This makes the index smaller and is useful if you won't be using the other columns in a WHERE, JOIN, or GROUP BY, but only PostalCode.
In the second index, the data for all the columns is stored in the index tree. This makes the index much bigger, but it is useful if you will be using any of the columns in WHERE/JOIN/GROUP BY/ORDER BY clauses.
INCLUDE columns make it faster to retrieve the data when they are specified in the select list.
For example if you are running:
SELECT PostalCode, AddressLine1, AddressLine2, City, StateProvinceID
FROM Person.Address
Where PostalCode= 'A1234'
This will benefit from creating an index on PostalCode and including all the other columns.
On the other hand, if you are running:
SELECT PostalCode, AddressLine1, AddressLine2, City, StateProvinceID
FROM Person.Address
Where PostalCode= 'A1234' or City = 'London' or StateProvinceID = 1 or AddressLine1 = 'street A' or AddressLine2 = 'StreetB'
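This would benefit more from having all the columns in the index key. Keep in mind that only predicates on the leading key columns can drive a seek, so conditions combined with OR across different columns are typically resolved by scanning the index.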
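If you want to check the size difference between the two kinds of index yourself, a query along these lines is one way to do it. It is only a sketch: it assumes you are connected to the AdventureWorks database and have already created both indexes on Person.Address, and it reports whatever nonclustered indexes exist on that table.

```sql
-- Compare depth and leaf-level page count of the nonclustered indexes
-- on Person.Address (assumes the current database is AdventureWorks).
SELECT i.name          AS index_name,
       ps.index_depth,             -- number of levels in the B-tree
       ps.page_count               -- pages at the leaf level (LIMITED mode reports only that level)
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'Person.Address'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE i.type_desc = N'NONCLUSTERED'
ORDER BY ps.page_count DESC;
```

Both indexes carry the same five columns at the leaf level, so the leaf page counts should be similar; the difference is in the upper levels, which LIMITED mode does not break out. Passing 'DETAILED' instead returns one row per B-tree level, which makes the smaller intermediate levels of the INCLUDE variant visible.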
Have a look at the links below; these might help more with your query:
Index with Included Column: https://msdn.microsoft.com/en-us/library/ms190806(v=sql.105).aspx
Table and Index Organization: https://msdn.microsoft.com/en-us/library/ms189051(v=sql.105).aspx