Prepare four virtual machines
VM installation
1. Create a new virtual machine
2. Choose the "Typical (recommended)" installation
3. Select Chinese, then choose to partition the disk yourself
# Partition layout (used by JD)
/boot 200M
swap 512M # physical memory is tight, so add swap
/ # root partition
4. Configure the remaining options (the original screenshots are not reproduced here)
Update packages with yum
yum update -y
IP addresses of the four hosts
One master, three workers
172.20.10.9 password: hadoop01 VM: hadoop01
172.20.10.10 password: hadoop02 VM: hadoop02
172.20.10.11 password: hadoop03 VM: hadoop03
172.20.10.12 password: hadoop04 VM: hadoop04
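If the VMs need static addresses instead of DHCP, here is a minimal sketch for hadoop01, assuming CentOS 7, an interface named ens33 and a 172.20.10.0/28 network with gateway 172.20.10.1 (all of these are assumptions; adjust to your own network):
vim /etc/sysconfig/network-scripts/ifcfg-ens33
# set these values (interface name, netmask, gateway and DNS are assumptions)
BOOTPROTO=static
ONBOOT=yes
IPADDR=172.20.10.9
NETMASK=255.255.255.240
GATEWAY=172.20.10.1
DNS1=172.20.10.1
# apply the change
systemctl restart network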
# reset the root password
passwd root
Hadoop installation
https://www.cnblogs.com/shireenlee4testing/p/10472018.html
Configure hostname resolution (the hosts file)
Do this on every node
vim /etc/hosts
172.20.10.9 hadoop01
172.20.10.10 hadoop02
172.20.10.11 hadoop03
172.20.10.12 hadoop04
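As an optional sanity check, ping every hostname from any node to confirm the /etc/hosts entries resolve:
for h in hadoop01 hadoop02 hadoop03 hadoop04; do ping -c 1 $h; done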
Disable the firewall
# stop the firewall
systemctl stop firewalld
# disable it at boot
systemctl disable firewalld
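Optional check that firewalld is really stopped:
systemctl is-active firewalld # should print "inactive"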
Configure passwordless SSH login
https://www.cnblogs.com/shireenlee4testing/p/10366061.html
Generate SSH keys
# generate an SSH key pair (each node needs its own key pair)
ssh-keygen -t rsa
cd /root/.ssh
ls
# on the master node (hadoop01), copy the public key into the file authorized_keys
cp id_rsa.pub authorized_keys
# copy authorized_keys to hadoop02
scp authorized_keys root@hadoop02:/root/.ssh/
# log in to hadoop02 and append its own public key
cd .ssh/
cat id_rsa.pub >> authorized_keys
# then copy authorized_keys on to hadoop03
scp authorized_keys root@hadoop03:/root/.ssh/
# log in to hadoop03 and append its own public key
cd .ssh/
cat id_rsa.pub >> authorized_keys
# then copy authorized_keys on to hadoop04
scp authorized_keys root@hadoop04:/root/.ssh/
# log in to hadoop04 and append its own public key
cd .ssh/
cat id_rsa.pub >> authorized_keys
# copy the final authorized_keys back to hadoop01, hadoop02 and hadoop03
scp authorized_keys root@hadoop01:/root/.ssh/
scp authorized_keys root@hadoop02:/root/.ssh/
scp authorized_keys root@hadoop03:/root/.ssh/
# verify passwordless login
Use ssh user@hostname (or ssh ip-address) to verify that login no longer asks for a password
ssh root@hadoop02
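Once the shared authorized_keys is on every host, this optional loop verifies passwordless login from the current node; each command should print the remote hostname without asking for a password:
# answer "yes" to any first-connection host key prompts
for h in hadoop01 hadoop02 hadoop03 hadoop04; do ssh root@$h hostname; done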
Download JDK 8 with wget
https://blog.csdn.net/u014700139/article/details/89960494
# copy the downloaded JDK archive to hadoop02, hadoop03 and hadoop04
scp -r -P 22 jdk.tar.gz root@hadoop02:~/
scp -r -P 22 jdk.tar.gz root@hadoop03:~/
scp -r -P 22 jdk.tar.gz root@hadoop04:~/
Configure the JDK
tar -zxvf jdk.tar.gz
mv jdk1.8.0_241 /opt/
# create a symlink
ln -s /opt/jdk1.8.0_241 /opt/jdk
# configure the Java environment variables
vim /etc/profile
# Java
export JAVA_HOME=/opt/jdk
export CLASSPATH=$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin
# make the environment variables take effect
source /etc/profile
# verify the Java installation
java -version
Build a fully distributed Hadoop cluster
Hadoop release downloads
http://mirror.bit.edu.cn/apache/hadoop/common/
Download Hadoop
wget http://us.mirrors.quenda.co/apache/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
1. Configure the Hadoop environment variables (on every node)
# extract into /opt
tar -zxvf hadoop-3.2.0.tar.gz -C /opt/
vim /etc/profile
# hadoop
export HADOOP_HOME=/opt/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
# after saving, make the profile take effect
source /etc/profile
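Optional check that the Hadoop binaries are now on the PATH:
hadoop version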
2. Set the JAVA_HOME parameter in the Hadoop environment scripts
cd /opt/hadoop-3.2.0/etc/hadoop
# add or modify the following line in each of hadoop-env.sh, mapred-env.sh and yarn-env.sh
vim hadoop-env.sh
vim mapred-env.sh
vim yarn-env.sh
export JAVA_HOME="/opt/jdk"
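Instead of editing the three files by hand, the same line can be appended in one go (optional sketch; run it from /opt/hadoop-3.2.0/etc/hadoop as above):
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do echo 'export JAVA_HOME="/opt/jdk"' >> $f; done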
3. Modify the Hadoop configuration files
cd /opt/hadoop-3.2.0/etc/hadoop
In the etc/hadoop directory under the Hadoop installation, the files core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml and workers need to be modified; adjust the values to your own environment.
Create the directory for temporary files
mkdir -p /opt/hadoop/tmp
core-site.xml (common component properties)
<configuration>
<property>
<!-- HDFS address -->
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:9000</value>
</property>
<property>
<!-- directory for temporary files; create /opt/hadoop/tmp first -->
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
</property>
</configuration>
hdfs-site.xml (HDFS component properties)
<configuration>
<property>
<!-- NameNode HTTP (web UI) address -->
<name>dfs.namenode.http-address</name>
<value>hadoop01:50070</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/hadoop/dfs/data</value>
</property>
<property>
<!-- replication factor, default value 3 -->
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<description>When set to false, files can be written to DFS without permission checks. Convenient, but guard against accidental deletion.</description>
</property>
</configuration>
mapred-site.xml (MapReduce component properties)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<!-- run MapReduce on YARN -->
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop01:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop01:19888</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
yarn-site.xml (resource scheduling properties)
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<!-- hostname of the YARN ResourceManager; if unset, the web UI always shows 0 active nodes -->
<value>hadoop01</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<!-- how reducers fetch data -->
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop01:8088</value>
<description>For external access, replace with the real IP; otherwise it defaults to localhost:8088.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
<description>Maximum memory a single container may be allocated, in MB (default 8192 MB).</description>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Skip the virtual-memory check. Very useful when running inside VMs; with it set, later steps are less likely to fail.</description>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
workers
vim workers
# add the following
hadoop02
hadoop03
hadoop04
4. Copy the configured directories to the worker nodes (the workers file lists hadoop02, hadoop03 and hadoop04, so copy to all three)
scp -r /opt/hadoop-3.2.0 root@hadoop02:/opt/
scp -r /opt/hadoop-3.2.0 root@hadoop03:/opt/
scp -r /opt/hadoop-3.2.0 root@hadoop04:/opt/
scp -r /opt/hadoop root@hadoop02:/opt/
scp -r /opt/hadoop root@hadoop03:/opt/
scp -r /opt/hadoop root@hadoop04:/opt/
5. Configure the start/stop scripts: add the HDFS and YARN user variables
# Add the HDFS variables: edit the scripts below and add the following lines near the top (e.g. on the blank second line)
cd /opt/hadoop-3.2.0/sbin
vim start-dfs.sh
vim stop-dfs.sh
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
# Add the YARN variables: edit the scripts below and add the following lines near the top (e.g. on the blank second line)
cd /opt/hadoop-3.2.0/sbin
vim start-yarn.sh
vim stop-yarn.sh
YARN_RESOURCEMANAGER_USER=root
HDFS_DATANODE_SECURE_USER=yarn
YARN_NODEMANAGER_USER=root
6. Initialize & start
cd /opt/hadoop-3.2.0
# initialize: format the NameNode (run once, on hadoop01)
bin/hdfs namenode -format wmqhadoop
# start
sbin/start-dfs.sh
sbin/start-yarn.sh
# afterwards, everything can be started at once with
sbin/start-all.sh
# stop
sbin/stop-all.sh
7. Verify that Hadoop started successfully
jps
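With this layout, jps should show NameNode, SecondaryNameNode and ResourceManager on hadoop01, and DataNode plus NodeManager on the workers. An optional sketch to check all four nodes from hadoop01 (assumes the JDK is at /opt/jdk on every node):
for h in hadoop01 hadoop02 hadoop03 hadoop04; do echo "== $h =="; ssh root@$h /opt/jdk/bin/jps; done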
Open http://hadoop01:8088 in a browser for the ResourceManager page
Open http://hadoop01:50070 in a browser for the Hadoop NameNode page
MySQL 5.7 installation
Download
wget http://repo.mysql.com/yum/mysql-5.7-community/el/7/x86_64/mysql57-community-release-el7-10.noarch.rpm
rpm -ivh mysql57-community-release-el7-10.noarch.rpm
Install with yum
1. Install:
yum -y install mysql-community-server
2. Start MySQL:
systemctl start mysqld # start MySQL
3. Get the temporary password generated during installation (use it for the first login):
grep 'temporary password' /var/log/mysqld.log
# example output (the author's temporary passwords; yours will differ)
sGpt=V+8f,qv
Auftbt8Mht,x
4. Enable MySQL at boot
systemctl enable mysqld
Log in
mysql -uroot -p
# enter the temporary password from above
Change the password
ALTER USER 'root'@'localhost' IDENTIFIED BY 'Mysql123!';
Allow remote login
1. Run the grant statement
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'Mysql123!' WITH GRANT OPTION;
2. Exit the MySQL console
exit
3. Open port 3306
Start the firewall
sudo systemctl start firewalld.service
Permanently open port 3306
sudo firewall-cmd --add-port=3306/tcp --permanent
Reload the firewall rules
sudo firewall-cmd --reload
Stop the firewall
sudo systemctl stop firewalld.service
Set the default character set to utf8
Check the MySQL character set before the change
show variables like '%chara%';
Edit /etc/my.cnf and add the following two lines (under the [mysqld] section)
vim /etc/my.cnf
character_set_server=utf8
init_connect='SET NAMES utf8'
After the change, restart MySQL
sudo systemctl restart mysqld
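Optional re-check after the restart (it will prompt for the MySQL password); character_set_server should now report utf8:
mysql -uroot -p -e "show variables like '%chara%';"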
Hive installation
https://blog.csdn.net/qq_39315740/article/details/98626518 # recommended
https://blog.csdn.net/weixin_43207025/article/details/101073351
Hive download
http://mirror.bit.edu.cn/apache/hive/
Install Hive
tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/
Configure environment variables
vim /etc/profile
# hive
export HIVE_HOME=/opt/apache-hive-3.1.2-bin
export PATH=$PATH:$HIVE_HOME/bin
source /etc/profile
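Optional check that the variables are visible in the current shell:
echo $HIVE_HOME
which hive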
Create the hive-site.xml file
cd /opt/apache-hive-3.1.2-bin/conf
cp hive-default.xml.template hive-site.xml
Because hive-site.xml contains the following HDFS-related settings, we first need to create the corresponding directories in HDFS.
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/<username> is created, with ${hive.scratch.dir.permission}.</description>
</property>
Create the HDFS directories
hadoop fs -mkdir -p /user/hive/warehouse # create the directory
hadoop fs -mkdir -p /tmp/hive # create the directory
hadoop fs -chmod -R 777 /user/hive/warehouse # grant permissions
hadoop fs -chmod -R 777 /tmp/hive # grant permissions
# check that they were created
hadoop fs -ls /
Hive-related configuration
In hive-site.xml, replace ${system:java.io.tmpdir} with a local Hive temp directory and ${system:user.name} with the user name.
Create the temp directory
cd /opt/apache-hive-3.1.2-bin
mkdir temp
chmod -R 777 temp
# replace ${system:java.io.tmpdir} with /opt/apache-hive-3.1.2-bin/temp
# replace ${system:user.name} with root
vim conf/hive-site.xml
%s/${system:java.io.tmpdir}/\/opt\/apache-hive-3.1.2-bin\/temp/g
%s/${system:user.name}/root/g
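Equivalently, the two replacements can be done non-interactively with sed (optional sketch; paths as configured above):
sed -i 's#\${system:java.io.tmpdir}#/opt/apache-hive-3.1.2-bin/temp#g' /opt/apache-hive-3.1.2-bin/conf/hive-site.xml
sed -i 's#\${system:user.name}#root#g' /opt/apache-hive-3.1.2-bin/conf/hive-site.xml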
Database-related configuration
<!-- JDBC URL of the metastore database; change localhost in the value to the host IP if MySQL runs on another machine -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8</value>
</property>
</property>
<!-- driver class name: com.mysql.cj.jdbc.Driver for the 8.x driver, com.mysql.jdbc.Driver for the 5.x driver -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- database user name -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>Mysql123!</value> <!-- change to your own MySQL password -->
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
Configure hive-log4j2.properties
cd /opt/apache-hive-3.1.2-bin/conf
cp hive-log4j2.properties.template hive-log4j2.properties
vim hive-log4j2.properties
# change this line
property.hive.log.dir = /opt/apache-hive-3.1.2-bin/temp/root
Configure hive-env.sh
cd /opt/apache-hive-3.1.2-bin/conf
cp hive-env.sh.template hive-env.sh
vim hive-env.sh
Add the following:
export JAVA_HOME=/opt/jdk
export HADOOP_HOME=/opt/hadoop-3.2.0
export HIVE_CONF_DIR=/opt/apache-hive-3.1.2-bin/conf
export HIVE_AUX_JARS_PATH=/opt/apache-hive-3.1.2-bin/lib
Starting Hive
Download the MySQL 5.7 JDBC driver
https://blog.csdn.net/qq_41950447/article/details/90085170
Driver download
wget https://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-5.1.48.tar.gz
# extract the archive, then copy the JDBC driver jar into Hive's lib directory
tar -zxvf mysql-connector-java-5.1.48.tar.gz
cp mysql-connector-java-5.1.48/mysql-connector-java-5.1.48-bin.jar /opt/apache-hive-3.1.2-bin/lib
Initialize the metastore schema
schematool -dbType mysql -initSchema
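If the initialization succeeds, the metastore tables should exist in the hive database in MySQL. Optional check (prompts for the MySQL password):
mysql -uroot -p -e "use hive; show tables;"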
Troubleshooting
# the hive schema initialization fails
http://www.lzhpo.com/article/98
# compare the guava jar versions shipped with Hadoop and with Hive
cd /opt/hadoop-3.2.0/share/hadoop/common/lib
ls -l | grep guava
cd /opt/apache-hive-3.1.2-bin/lib
ls -l | grep guava
Replace Hive's lower-version guava-19.0.jar with Hadoop's higher-version guava-27.0-jre.jar, as shown in the sketch below
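A concrete sketch of the swap, assuming the jar names match what ls showed on your system (guava-27.0-jre.jar in Hadoop 3.2.0, guava-19.0.jar in Hive 3.1.2):
rm /opt/apache-hive-3.1.2-bin/lib/guava-19.0.jar
cp /opt/hadoop-3.2.0/share/hadoop/common/lib/guava-27.0-jre.jar /opt/apache-hive-3.1.2-bin/lib/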
# for other problems, see
https://blog.csdn.net/qq_39315740/article/details/98626518
Hadoop 3.1.2 + Hive 3.1.1 installation
https://www.cnblogs.com/weavepub/p/11130869.html
Other
Change the vim comment color
Create a .vimrc config file in the user's home directory (~)
vim ~/.vimrc
# add this line and save
hi Comment ctermfg=blue
vim search & replace
%s/${system:java.io.tmpdir}/\/opt\/apache-hive-3.1.2-bin\/temp/g
%s/${system:user.name}/root/g
Source: CSDN
Author: ZbyFt
Link: https://blog.csdn.net/weixin_42149982/article/details/104130453