1. Environment preparation
JDK 1.8
2. Cluster planning
IP address | Hostname | Roles |
192.168.1.101 | palo101 | hadoop namenode, hadoop datanode, yarn nodeManager, zookeeper, hive, hbase master, hbase region server |
192.168.1.102 | palo102 | |
192.168.1.103 | palo103 | hadoop namenode, hadoop datanode, yarn nodeManager, zookeeper, hive, hbase region server, mysql |
3. Download Kylin 2.6
wget http://mirrors.tuna.tsinghua.edu.cn/apache/kylin/apache-kylin-2.6.0/apache-kylin-2.6.0-bin-hbase1x.tar.gz   # download the Kylin 2.6.0 binary package
tar -xzvf apache-kylin-2.6.0-bin-hbase1x.tar.gz    # unpack the Kylin 2.6.0 tarball
mv apache-kylin-2.6.0-bin apache-kylin-2.6.0       # rename the extracted directory (drop the trailing "bin")
mkdir /usr/local/kylin/                            # create the target directory
mv apache-kylin-2.6.0 /usr/local/kylin/            # move the Kylin 2.6.0 directory under /usr/local/kylin
4. Add system environment variables
vim /etc/profile
Append at the end of the file:
#kylin
export KYLIN_HOME=/usr/local/kylin/apache-kylin-2.6.0
export KYLIN_CONF_HOME=$KYLIN_HOME/conf
export PATH=$PATH:$KYLIN_HOME/bin:$CATALINA_HOME/bin
export tomcat_root=$KYLIN_HOME/tomcat   # note: lowercase variable name
export hive_dependency=$HIVE_HOME/conf:$HIVE_HOME/lib/*:$HCAT_HOME/share/hcatalog/hive-hcatalog-core-2.3.4.jar   # note: lowercase variable name
Save and quit with :wq, then run source /etc/profile to make the variables take effect.
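As a quick sanity check, you can source the exports in a scratch file and confirm they resolve. This is only a sketch; the install path is the one this tutorial assumes.

```shell
# Sketch: source the new exports from a scratch file and confirm they resolve.
# /usr/local/kylin/apache-kylin-2.6.0 is the install path assumed above.
cat > /tmp/kylin_profile_check.sh <<'EOF'
export KYLIN_HOME=/usr/local/kylin/apache-kylin-2.6.0
export KYLIN_CONF_HOME=$KYLIN_HOME/conf
export PATH=$PATH:$KYLIN_HOME/bin
EOF
. /tmp/kylin_profile_check.sh
echo "KYLIN_CONF_HOME=$KYLIN_CONF_HOME"
```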
5. Configure Kylin
5.1 Configure $KYLIN_HOME/bin/kylin.sh
vim $KYLIN_HOME/bin/kylin.sh
Add at the top of the file:
export HBASE_CLASSPATH_PREFIX=${tomcat_root}/bin/bootstrap.jar:${tomcat_root}/bin/tomcat-juli.jar:${tomcat_root}/lib/*:$hive_dependency:$HBASE_CLASSPATH_PREFIX
This puts $hive_dependency on the classpath and prevents two later failures, both caused by the missing Hive dependency:
a) loading Hive tables from the Kylin web UI fails
b) step 2 of a cube build fails with an org/apache/hadoop/hive/conf/HiveConf class-not-found error
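A rough way to confirm the Hive entries actually land on the HBase classpath is to rebuild the prefix the same way kylin.sh will see it. This is a sketch; all paths are assumed example values from this tutorial.

```shell
# Sketch: rebuild the classpath prefix the way kylin.sh will see it and check
# that the Hive conf directory is on it. All paths are assumed example values.
HIVE_HOME=/usr/local/apache-hive-2.3.4-bin
tomcat_root=/usr/local/kylin/apache-kylin-2.6.0/tomcat
hive_dependency="$HIVE_HOME/conf:$HIVE_HOME/lib/*"
HBASE_CLASSPATH_PREFIX="$tomcat_root/bin/bootstrap.jar:$tomcat_root/bin/tomcat-juli.jar:$tomcat_root/lib/*:$hive_dependency:$HBASE_CLASSPATH_PREFIX"
case ":$HBASE_CLASSPATH_PREFIX:" in
  *":$HIVE_HOME/conf:"*) echo "hive conf on classpath" ;;
  *) echo "hive conf missing" ;;
esac
```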
5.2 Hadoop compression configuration
Regarding snappy compression: supporting it requires recompiling the Hadoop source beforehand so that the native libraries include snappy. Snappy achieves a reasonable compression ratio, so both the intermediate and the final results of the computation take up less storage.
The Hadoop in this example does not support snappy compression, which would cause later cube builds to fail, so compression is disabled below.
vim $KYLIN_HOME/conf/kylin_job_conf.xml
Edit the config file and set mapreduce.map.output.compress and mapreduce.output.fileoutputformat.compress to false:
<property>
    <name>mapreduce.map.output.compress</name>
    <value>false</value>
    <description>Compress map outputs</description>
</property>
<property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>false</value>
    <description>Compress the output of a MapReduce job</description>
</property>
One more compression setting needs to change:
vim $KYLIN_HOME/conf/kylin.properties
Set kylin.storage.hbase.compression-codec to none, or comment it out:
#kylin.storage.hbase.compression-codec=none
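The same change can be scripted with sed. A minimal sketch, run here against a scratch copy; point it at $KYLIN_HOME/conf/kylin.properties in practice:

```shell
# Sketch: comment out the codec line with sed; a scratch file stands in for
# $KYLIN_HOME/conf/kylin.properties here.
printf 'kylin.storage.hbase.compression-codec=snappy\n' > /tmp/kylin.properties
sed -i 's|^kylin\.storage\.hbase\.compression-codec=|#&|' /tmp/kylin.properties
cat /tmp/kylin.properties
```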
5.3 Main configuration: $KYLIN_HOME/conf/kylin.properties
vim $KYLIN_HOME/conf/kylin.properties
Set:
kylin.metadata.url=kylin_metadata@hbase    ### Kylin metadata is stored in HBase
kylin.env.hdfs-working-dir=/kylin          ### Kylin working directory on HDFS
kylin.env=DEV
kylin.env.zookeeper-base-path=/kylin
kylin.server.mode=all                      ### mode of the Kylin master node; the other nodes use query, the only setting that differs
kylin.rest.servers=192.168.1.101:7070,192.168.1.102:7070,192.168.1.103:7070   ### cluster nodes for state synchronization
kylin.web.timezone=GMT+8                   ### set to China time
kylin.job.retry=3
kylin.job.mapreduce.default.reduce.input.mb=500
kylin.job.concurrent.max.limit=10
kylin.job.yarn.app.rest.check.interval.seconds=10
kylin.job.hive.database.for.intermediatetable=kylin_flat_db   ### Hive database holding the intermediate tables produced by cube builds
kylin.hbase.default.compression.codec=none ### no compression
kylin.job.cubing.inmem.sampling.percent=100
kylin.hbase.region.cut=5
kylin.hbase.hfile.size.gb=2
### job.jar for Kylin MR jobs and the HBase coprocessor jar, used to improve performance (added entries)
kylin.job.jar=/usr/local/kylin/apache-kylin-2.6.0/lib/kylin-job-2.6.0.jar
kylin.coprocessor.local.jar=/usr/local/kylin/apache-kylin-2.6.0/lib/kylin-coprocessor-2.6.0.jar
5.4 Copy the configured Kylin to the other two machines
scp -r /usr/local/kylin/ 192.168.1.102:/usr/local
scp -r /usr/local/kylin/ 192.168.1.103:/usr/local
5.5 On 192.168.1.102 and 192.168.1.103, change kylin.server.mode to query
vim $KYLIN_HOME/conf/kylin.properties
Change:
kylin.server.mode=query   ### the master node runs in all mode; the other nodes run in query mode, the only setting that differs
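A quick grep confirms the role of each node after editing. Sketch only; a scratch file stands in for $KYLIN_HOME/conf/kylin.properties on 192.168.1.102/103.

```shell
# Sketch: verify the node role after editing; a scratch file stands in for
# $KYLIN_HOME/conf/kylin.properties on the query nodes.
printf 'kylin.server.mode=query\n' > /tmp/kylin_mode_check.properties
grep '^kylin.server.mode' /tmp/kylin_mode_check.properties
```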
6. Start Kylin
6.1 Prerequisite: start the dependent services first
a) Start ZooKeeper: run zkServer.sh start on every node
b) Start Hadoop: run start-all.sh on the master node
c) Start the JobHistoryServer: on the YARN master node run $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
d) Start the HBase cluster: run start-hbase.sh on the master node
Processes after startup:
192.168.1.101
[root@palo101 apache-kylin-2.6.0]# jps
62403 NameNode          #hdfs NameNode
31013 NodeManager       #yarn NodeManager
22325 Kafka
54217 QuorumPeerMain    #zookeeper
7274 Jps
62589 DataNode          #hadoop datanode
28895 HRegionServer     #hbase region server
8440 HMaster            #hbase master
192.168.1.102
[root@palo102 ~]# jps
47474 QuorumPeerMain    #zookeeper
15203 NodeManager       #yarn NodeManager
15061 ResourceManager   #yarn ResourceManager
49877 Jps
6694 HRegionServer      #hbase region server
7673 Kafka
37517 SecondaryNameNode #hdfs SecondaryNameNode
37359 DataNode          #hadoop datanode
192.168.1.103
[root@palo103 ~]# jps
1185 RunJar             #hive metastore
62404 NodeManager       #yarn NodeManager
47365 HRegionServer     #hbase region server
62342 QuorumPeerMain    #zookeeper
20952 ManagerBootStrap
52440 Kafka
31801 RunJar            #hive thrift server
47901 DataNode          #hadoop datanode
36494 Jps
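Rather than eyeballing the jps listings, you can filter them for the daemons Kylin depends on. Sketch only; a captured sample stands in here for running jps live on each node.

```shell
# Sketch: confirm the daemons Kylin depends on appear in jps output.
# A captured sample stands in for a live `jps` run on each node.
jps_output='62403 NameNode
54217 QuorumPeerMain
28895 HRegionServer
8440 HMaster'
for daemon in NameNode QuorumPeerMain HMaster HRegionServer; do
  echo "$jps_output" | grep -q " $daemon$" && echo "$daemon running"
done
```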
6.2 Check that the configuration is correct
$KYLIN_HOME/bin/check-env.sh
[root@palo101 bin]# $KYLIN_HOME/bin/check-env.sh
Retrieving hadoop conf dir...
KYLIN_HOME is set to /usr/local/kylin/apache-kylin-2.6.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
(the same SLF4J multiple-bindings warning is printed again for each of the remaining checks)
Check the Hive dependencies with find-hive-dependency.sh
Check the HBase dependencies with find-hbase-dependency.sh
All dependency checks can be run at once with check-env.sh
6.3 Run the following command on all nodes to start Kylin
$KYLIN_HOME/bin/kylin.sh start
If startup fails with:
Failed to find metadata store by url: kylin_metadata@hbase
the fix is:
1) Make the hbase.rootdir property in $HBASE_HOME/conf/hbase-site.xml consistent with the fs.defaultFS property in $HADOOP_HOME/etc/hadoop/core-site.xml
2) Open zkCli from ZooKeeper's bin directory, delete /hbase, then restart HBase
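The consistency check in step 1 boils down to hbase.rootdir sitting under fs.defaultFS. A minimal sketch; both values below are assumed examples, so read the real ones from core-site.xml and hbase-site.xml:

```shell
# Sketch: check that hbase.rootdir sits under fs.defaultFS. Both values are
# assumed examples; read the real ones from core-site.xml and hbase-site.xml.
fs_defaultFS='hdfs://palo101:9000'
hbase_rootdir='hdfs://palo101:9000/hbase'
case "$hbase_rootdir" in
  "$fs_defaultFS"/*) echo "rootdir consistent with fs.defaultFS" ;;
  *) echo "mismatch: fix hbase.rootdir in hbase-site.xml" ;;
esac
```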