What is the definition of secondary index in postgresql?

不羁的心 提交于 2019-12-10 22:12:29

问题


From https://www.postgresql.org/docs/9.6/static/indexes-index-only-scans.html:

All indexes in PostgreSQL are secondary indexes, meaning that each index is stored separately from the table's main data area (which is called the table's heap in PostgreSQL terminology).

In postgresql, is a secondary index defined as an index which is stored separately from the table's main data area?

If not, what is its definition, and why does the quote mention the one which is not a definition?

Similarly, what is the definition of a primary index?

Are the concepts in postgresql the same as in Oracle database?

Thanks.


回答1:


There is some lack of precision in the definitions of primary & secondary indices.

Using two popular university texts as reference:

Fundamentals of Database Systems, Elmasri & Navathe defines them as:

A primary index as an index on an ordered file where the search key is the same as the sort key

A secondary index provides a secondary means of accessing a data file for which some primary access already exists. The data file records could be ordered, unordered, or hashed.

Database Systems: The Complete Book, Garcia-Molina et. al defines them as:

A primary index determines the location of the records of the data file

The secondary index is distinguished from the primary index in that a secondary index does not determine the placement of records in the data file. Rather, the secondary index tells us the current locations of records; that location may have been decided by a primary index on some other field

Some properties that hold true for either definition above:

  • primary keys can be primary indices
  • there can be at most 1 primary index per table
  • primary indices uniquely determine where a record is kept in physical storage.
  • All other indices are classified as secondary.

However, if the placement of records in the data file is not determined by any field, then a primary index cannot be constructed.

Thus for sorted files, it makes sense to talk about the primary index (which would be the list of fields upon which the sorting is based). I can't find other examples of physical file structures where a primary index can be constructed.

Postgresql utilizes a heap structure for the physical storage for records. Heaps are not sorted (pun alert: they're sorta sorted). Therefore, even the primary keys are implemented using secondary indices, and as such all indices in Postgresql are secondary.

Other RDBMS Systems do implement storage formats that support primary indices:

  • Mysql's InnoDB calls this clustered index
  • MSSQL also refer to the primay index as a clustered index
  • Oracle calls this index-organized tables

The language in the Postgres Documentation is imprecise.

All indexes in PostgreSQL are secondary indexes

This is true.

meaning that each index is stored separately from the table's main data area

This is not why all indices are secondary in Postgresql. Primary indices may also be stored separately from the table's main data area.




回答2:


It seems that the term secondary index does not have a formal definition. In the quoted passage, the author wanted to emphasize that the index is stored separately from the data table (physically, these are two files). Usually the secondary index is understood as an index other than the primary one. However, in Postgres a table's primary key is based on an index stored separately from heap as well. These meanings are different, I do not think one should pay too much attention to this.



来源:https://stackoverflow.com/questions/51083881/what-is-the-definition-of-secondary-index-in-postgresql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!