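Hive is a data warehouse analysis system built on Hadoop. It provides a rich SQL query interface for analyzing data stored in the Hadoop Distributed File System. Within the Hadoop architecture it acts as the SQL parsing layer: it exposes an entry point that accepts user statements, parses them, compiles them into an execution plan made up of MapReduce programs, generates the corresponding MapReduce jobs from that plan, submits them to the Hadoop cluster, and returns the final result. Metadata, such as table schemas, is stored in a database called the metastore.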
System environment
    192.168.186.128 hadoop-master
    192.168.186.129 hadoop-slave
    MySQL is installed on the master machine; the Hive server is also installed on master
Downloading Hive
Download the binary package; the latest version is available from the official site.
    [hadoop@hadoop-master ~]$ wget http://mirrors.cnnic.cn/apache/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz
    [hadoop@hadoop-master ~]$ tar -zxf apache-hive-1.2.1-bin.tar.gz
    [hadoop@hadoop-master ~]$ ls
    apache-hive-1.2.1-bin apache-hive-1.2.1-bin.tar.gz dfs hadoop-2.7.1 Hsource tmp
Configuring environment variables
    [root@hadoop-master hadoop]# vi /etc/profile
    HIVE_HOME=/home/hadoop/apache-hive-1.2.1-bin
    PATH=$PATH:$HIVE_HOME/bin
    export HIVE_HOME PATH
    [root@hadoop-master hadoop]# source /etc/profile
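A quick way to confirm that the profile change took effect is to check that a child process can see the variable, since only exported variables are inherited. The following is a minimal sketch that assumes the install path used in this walkthrough; the variable name child_view is introduced only for this example.

```shell
# Assumed install location, taken from the steps above
HIVE_HOME=/home/hadoop/apache-hive-1.2.1-bin
PATH=$PATH:$HIVE_HOME/bin
export HIVE_HOME PATH

# A child shell only sees variables that were exported
child_view=$(sh -c 'printf "%s" "$HIVE_HOME"')
if [ "$child_view" = "$HIVE_HOME" ]; then
  echo "HIVE_HOME is exported: $child_view"
else
  echo "HIVE_HOME is not visible to child processes" >&2
fi
```

If the second message appears, the export line in /etc/profile most likely names the wrong variable.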
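Metastore
The metastore is the central repository for Hive metadata. It consists of two parts: a service and the backing data store. There are three ways to configure it: embedded metastore, local metastore, and remote metastore.
This setup uses a remote metastore backed by MySQL, deployed on the hadoop-master node. The Hive server is also installed on hadoop-master, and the Hive client, hadoop-slave, accesses the Hive server.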
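The three modes differ in where the metastore service runs and how it reaches the backing database: embedded keeps the service and a Derby database inside the Hive process, local runs the service inside the Hive process but connects to an external database over JDBC, and remote runs the service as a separate process that clients reach over Thrift. As a rough illustration, the sketch below guesses the mode from a hive-site.xml file. The function name metastore_mode and the heuristic itself are assumptions made for this example and are not part of Hive; hive.metastore.uris and javax.jdo.option.ConnectionURL are the actual property names.

```shell
# Heuristic sketch (not a Hive tool): guess the metastore mode from a config file.
metastore_mode() {
  conf=$1
  if grep -q '<name>hive.metastore.uris</name>' "$conf"; then
    # Clients are pointed at a separate metastore service over Thrift
    echo remote
  elif grep -q 'javax.jdo.option.ConnectionURL' "$conf" && ! grep -q 'jdbc:derby' "$conf"; then
    # In-process service talking to an external database such as MySQL
    echo local
  else
    # No overrides: Hive falls back to its embedded Derby defaults
    echo embedded
  fi
}

# Demo with a client-style configuration like the one used later in this guide
demo_dir=$(mktemp -d)
cat > "$demo_dir/client-site.xml" <<'EOF'
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://hadoop-master:9083</value>
  </property>
</configuration>
EOF
mode=$(metastore_mode "$demo_dir/client-site.xml")
echo "client-site.xml looks like a $mode metastore setup"
```

The check is deliberately simple: it looks only at which properties are present, not at their values.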
Configuring Hive
Editing the configuration file
Go to Hive's configuration directory, find hive-default.xml.template, and copy it to hive-site.xml.
Then edit hive-site.xml so that it contains the following parameters
    [hadoop@hadoop-master conf]$ cp hive-default.xml.template hive-site.xml
    [hadoop@hadoop-master conf]$ pwd
    /home/hadoop/apache-hive-1.2.1-bin/conf
    [hadoop@hadoop-master conf]$ vi hive-site.xml
    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop-master:3306/hive?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>username to use against metastore database</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
        <description>password to use against metastore database</description>
      </property>
    </configuration>
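The configuration above assumes that a MySQL account named hive with password hive already exists and may create and use the hive database. The original steps do not show how that account was set up, so the following is only a sketch of one possible way: it writes the statements to a temporary file for review rather than running them. The '%' host pattern and the variable name sql_file are assumptions for this example.

```shell
# Write the (assumed) bootstrap SQL to a file so it can be reviewed before use.
sql_file=$(mktemp)
cat > "$sql_file" <<'EOF'
-- Assumed account: user "hive" with password "hive", matching hive-site.xml
CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
-- Database-level privileges also let createDatabaseIfNotExist=true create "hive"
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
FLUSH PRIVILEGES;
EOF

echo "Review $sql_file, then run: mysql -u root -p < $sql_file"
```

On a real cluster a stronger password and a narrower host pattern than '%' would be preferable.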
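Downloading the JDBC driver
Download the MySQL JDBC driver, extract it, and copy the driver jar into Hive's lib directory. The jar version must match the archive that was downloaded.
    [hadoop@hadoop-master ~]$ wget http://cdn.mysql.com/Downloads/Connector-J/mysql-connector-java-5.1.36.tar.gz
    [hadoop@hadoop-master ~]$ ls
    apache-hive-1.2.1-bin dfs hadoop-2.7.1 Hsource tmp
    [hadoop@hadoop-master ~]$ tar -zxf mysql-connector-java-5.1.36.tar.gz
    [hadoop@hadoop-master ~]$ cp mysql-connector-java-5.1.36/mysql-connector-java-5.1.36-bin.jar apache-hive-1.2.1-bin/lib/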
Configuring the Hive client
    [hadoop@hadoop-master ~]$ scp -r apache-hive-1.2.1-bin/ hadoop@hadoop-slave:/home/hadoop
    [hadoop@hadoop-slave conf]$ vi hive-site.xml
    <configuration>
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://hadoop-master:9083</value>
      </property>
    </configuration>
Starting Hive
The metastore service must be started first
    [hadoop@hadoop-master ~]$ hive --service metastore &
    [hadoop@hadoop-master ~]$ jps
    10288 RunJar    # the new process
    9365 NameNode
    9670 SecondaryNameNode
    11096 Jps
    9944 NodeManager
    9838 ResourceManager
    9471 DataNode
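Besides looking for the RunJar process, it can help to confirm that the Thrift port named in hive.metastore.uris is actually reachable from the client. The sketch below splits the URI into host and port and defines a probe helper. The function name probe_metastore is hypothetical, and the probe assumes a netcat build that supports the -z (connect without sending data) and -w (timeout) options.

```shell
# URI as configured in the client's hive-site.xml
metastore_uri="thrift://hadoop-master:9083"

# Split the URI into host and port with POSIX parameter expansion
hostport=${metastore_uri#thrift://}
metastore_host=${hostport%%:*}
metastore_port=${hostport##*:}

# Hypothetical helper: succeeds if a TCP connection can be opened within 2 seconds
probe_metastore() {
  nc -z -w 2 "$1" "$2"
}

echo "metastore expected at $metastore_host port $metastore_port"
```

On hadoop-slave, running `probe_metastore "$metastore_host" "$metastore_port" && echo reachable` would then indicate whether the client can connect.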
Accessing Hive from the server
    [hadoop@hadoop-master ~]$ hive
    Logging initialized using configuration in jar:file:/home/hadoop/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
    hive> show databases;
    OK
    default
    src
    Time taken: 1.332 seconds, Fetched: 2 row(s)
    hive> use src;
    OK
    Time taken: 0.037 seconds
    hive> create table test1(id int);
    OK
    Time taken: 0.572 seconds
    hive> show tables;
    OK
    abc
    test
    test1
    Time taken: 0.057 seconds, Fetched: 3 row(s)
    hive>
Accessing Hive from the client
    [hadoop@hadoop-slave conf]$ hive
    Logging initialized using configuration in jar:file:/home/hadoop/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
    hive> show databases;
    OK
    default
    src
    Time taken: 1.022 seconds, Fetched: 2 row(s)
    hive> use src;
    OK
    Time taken: 0.057 seconds
    hive> show tables;
    OK
    abc
    test
    test1
    Time taken: 0.218 seconds, Fetched: 3 row(s)
    hive> create table test2(id int ,name string);
    OK
    Time taken: 5.518 seconds
    hive> show tables;
    OK
    abc
    test
    test1
    test2
    Time taken: 0.102 seconds, Fetched: 4 row(s)
    hive>
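That completes the test: the table created on the server is visible from the client and vice versa, so the installation works.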
Troubleshooting the installation
Hive database character set problem
Symptom: after entering hive, databases can be created, but tables cannot
    hive> create table table_test(id string,name string);
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.MetaException(message:javax.jdo.JDODataStoreException: An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
Fix: log in to MySQL and change the character set of the hive database
    mysql> alter database hive character set latin1;
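The reason the character set matters is that the limit in the error message is counted in bytes, while column lengths are declared in characters. MySQL's utf8 character set reserves up to 3 bytes per character and latin1 uses 1, so the same indexed column can exceed the limit under one and fit under the other. The arithmetic can be sketched as follows; the 256-character column length is a hypothetical value chosen for illustration, not taken from the Hive schema.

```shell
max_key_bytes=767   # limit quoted in the error message
col_chars=256       # hypothetical indexed VARCHAR length, for illustration only

# Worst-case index key size = characters * maximum bytes per character
key_bytes() { echo $(( $1 * $2 )); }

utf8_bytes=$(key_bytes "$col_chars" 3)     # MySQL utf8: up to 3 bytes per character
latin1_bytes=$(key_bytes "$col_chars" 1)   # latin1: 1 byte per character

for pair in "utf8:$utf8_bytes" "latin1:$latin1_bytes"; do
  name=${pair%%:*}
  bytes=${pair##*:}
  if [ "$bytes" -gt "$max_key_bytes" ]; then
    echo "$name: $bytes bytes exceeds the $max_key_bytes byte limit"
  else
    echo "$name: $bytes bytes fits within the $max_key_bytes byte limit"
  fi
done
```

Note that altering the database default only affects tables created afterwards, so metastore tables that already exist keep their original character set.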
http://yanliu.org/2015/08/13/Hadoop%E9%9B%86%E7%BE%A4%E4%B9%8BHive%E5%AE%89%E8%A3%85%E9%85%8D%E7%BD%AE/
Source: oschina
Link: https://my.oschina.net/u/934205/blog/745384