Data Partitioning in Hive

First, what is a partition?

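A partition in Hive is simply a subdirectory: a large dataset is split into smaller datasets according to business needs, and each partition's data is stored in its own directory.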

1. How to define and create partitions

hive> create table test(name string, sex int) partitioned by (birth string, age string);
Time taken: 0.044 seconds
hive> alter table test add partition (birth='1980', age='30');
Time taken: 0.079 seconds
hive> alter table test add partition (birth='1981', age='29');
Time taken: 0.052 seconds
hive> alter table test add partition (birth='1982', age='28');
Time taken: 0.056 seconds
hive> show partitions test;
birth=1980/age=30
birth=1981/age=29
birth=1982/age=28
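
Because each partition is a directory, the layout can be inspected from the Hive CLI, and a query that filters on the partition columns only needs to read the matching directory. The following is a minimal sketch, assuming the table `test` is in the default database and the warehouse directory is the default `/user/hive/warehouse` (both are assumptions, adjust the path for your cluster):

```sql
-- Sketch only: assumes table `test` lives in the default database and that
-- hive.metastore.warehouse.dir is the default /user/hive/warehouse.

-- List the first-level partition directories (birth=1980, birth=1981, ...).
dfs -ls /user/hive/warehouse/test;

-- List the second-level directory under one birth value (age=30).
dfs -ls /user/hive/warehouse/test/birth=1980;

-- Filtering on both partition columns restricts the scan to one partition.
select name, sex from test where birth = '1980' and age = '30';
```

The partition columns `birth` and `age` are not stored in the data files; Hive derives their values from the directory names.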

2. How to drop a partition

hive> alter table test drop partition (birth='1980',age='30');
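
A slightly more defensive variant, shown here as a sketch against the same `test` table, uses `if exists` so the statement does not fail when the partition is already gone, and then checks the result:

```sql
-- Drop the partition only if it is present; no error if it is missing.
alter table test drop if exists partition (birth='1980', age='30');

-- Confirm that birth=1980/age=30 no longer appears in the list.
show partitions test;
```

Note that for a managed (non-external) table, dropping a partition removes its data directory as well as its metadata; for an external table only the metadata is removed.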

3. Loading data into a specific partition

hive> load data local inpath '/home/hadoop/data.log' overwrite into table test partition(birth='1980-01-01', age='30');
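
After the load, it can help to verify where the data landed. The sketch below assumes the load above succeeded and that `/home/hadoop/data.log` matches the table's field layout (with the table definition in section 1, that means Hive's default field delimiter, Ctrl-A):

```sql
-- The target partition should now be listed, since LOAD DATA is expected
-- to create the partition when it does not already exist.
show partitions test;

-- Peek at a few rows from the partition that was just loaded.
select name, sex, birth, age
from test
where birth = '1980-01-01' and age = '30'
limit 10;
```

Because the statement uses `overwrite`, any files previously in that partition are replaced rather than appended to.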

Guiding principle for creating partitions: keep partition granularity to the minimum that is actually needed.
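
One possible reading of this principle is to partition on as few columns as the queries require, since every distinct combination of partition values becomes its own directory. As a hypothetical illustration (the table name `test_by_birth` is not from the original article), `age` could be kept as an ordinary column so that only `birth` drives the directory layout:

```sql
-- Hypothetical alternative to `test`: a single partition column.
-- `age` becomes a regular column stored in the data files.
create table test_by_birth(name string, sex int, age string)
partitioned by (birth string);

-- One directory per birth value instead of one per (birth, age) pair.
alter table test_by_birth add partition (birth='1980');
alter table test_by_birth add partition (birth='1981');
```

Whether this is the right trade-off depends on how the data is queried; if most queries filter on both columns, the two-level layout from section 1 may still be preferable.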

http://biansutao.blog.163.com/blog/static/6702418820115332453560/

Reposted from: https://my.oschina.net/u/580135/blog/612329

Source: 51CTO

Author: chuteng3602

Link: https://blog.csdn.net/chuteng3602/article/details/100772531
