Difference between partition and index in Hive

轻奢々 2021-02-14 11:03

I am new to Hadoop and Hive, and I would like to know the difference between an index and a partition in Hive. When should I use an index, and when a partition?

Thank you!

2 answers
  •  心在旅途
    2021-02-14 11:35

    Indexes were an evolving feature in Hive releases before 3.0 (Hive 3.0 removed index support). Where they exist, an index is limited to a single table and cannot be used with external tables. Creating an index creates a separate table that holds the index data. An index can be partitioned to match the partitions of its base table. Indexes are used to speed up searches for data within a table.
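    As a rough illustration of the point about an index being a separate table, here is a minimal HiveQL sketch. It assumes a Hive release older than 3.0, where the `CREATE INDEX` statement is available; the table `users` and the index `idx_users_id` are hypothetical names chosen for the example.

```sql
-- Hypothetical managed (non-external) table used only for this example.
CREATE TABLE users (
  user_id BIGINT,
  name    STRING
);

-- Create a compact index on user_id. Hive stores the index data in a
-- separate table. WITH DEFERRED REBUILD means the index starts out empty.
CREATE INDEX idx_users_id
ON TABLE users (user_id)
AS 'COMPACT'
WITH DEFERRED REBUILD;

-- Populate (or refresh) the index after data has been loaded.
ALTER INDEX idx_users_id ON users REBUILD;

-- List the indexes on the table, including the name of the index table.
SHOW FORMATTED INDEX ON users;
```

    The index is not updated automatically when the base table changes, so the `REBUILD` step has to be repeated after new data is loaded.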

    Partitions segregate the data at the HDFS level by creating a sub-directory for each partition. Partitioning lets a query limit the number of files it reads and the amount of data it scans. For this to happen, however, the partition columns must be specified in the query's WHERE clause.
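    To make the partitioning behaviour concrete, here is a minimal HiveQL sketch. The table `page_views`, its columns, and the sample rows are assumptions made up for the example, and `INSERT ... VALUES` assumes Hive 0.14 or later.

```sql
-- Hypothetical table partitioned by date. The partition column is declared
-- in PARTITIONED BY, not in the regular column list.
CREATE TABLE page_views (
  user_id BIGINT,
  url     STRING
)
PARTITIONED BY (view_date STRING);

-- Writing into a partition creates a sub-directory under the table's
-- location, e.g. .../page_views/view_date=2021-02-14/
INSERT INTO TABLE page_views PARTITION (view_date = '2021-02-14')
VALUES (1, '/home'), (2, '/about');

-- The filter on the partition column lets Hive read only the matching
-- sub-directory instead of scanning the whole table.
SELECT url, COUNT(*) AS hits
FROM page_views
WHERE view_date = '2021-02-14'
GROUP BY url;

-- Without a predicate on view_date, every partition is scanned.
SELECT COUNT(*) FROM page_views WHERE user_id = 1;
```

    Running `SHOW PARTITIONS page_views;` lists the partitions that exist, which is a quick way to check that the layout matches what the queries filter on.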

    While building your data model you can determine the best use of indexes and/or partitions based on the size of your data and your expected use patterns.
