Note: a working Hadoop cluster must already be set up.
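1. Introduction
Sqoop is an Apache tool for transferring data between Hadoop and relational database servers.
Import: bring data from MySQL, Oracle, and other relational databases into Hadoop storage systems such as HDFS, Hive, and HBase.
Export: move data from the Hadoop file system into a relational database.
2. Installation
2.1 Download Sqoop 1:
sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz (download the package whose name contains bin__hadoop-2.6.0; otherwise later installation steps will report errors)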
[hadoop@hadoop01 ~]$ tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz   # extract the archive
[hadoop@hadoop01 ~]$ cd sqoop-1.4.7.bin__hadoop-2.6.0
[hadoop@hadoop01 sqoop-1.4.7.bin__hadoop-2.6.0]$ ls -l   # list the directory contents
total 2020
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 19 2017 bin
-rw-rw-r--. 1 hadoop hadoop 55089 Dec 19 2017 build.xml
-rw-rw-r--. 1 hadoop hadoop 47426 Dec 19 2017 CHANGELOG.txt
-rw-rw-r--. 1 hadoop hadoop 9880 Dec 19 2017 COMPILING.txt
drwxr-xr-x. 2 hadoop hadoop 150 Dec 19 2017 conf
drwxr-xr-x. 5 hadoop hadoop 169 Dec 19 2017 docs
drwxr-xr-x. 2 hadoop hadoop 96 Dec 19 2017 ivy
-rw-rw-r--. 1 hadoop hadoop 11163 Dec 19 2017 ivy.xml
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 19 2017 lib
-rw-rw-r--. 1 hadoop hadoop 15419 Dec 19 2017 LICENSE.txt
-rw-rw-r--. 1 hadoop hadoop 505 Dec 19 2017 NOTICE.txt
-rw-rw-r--. 1 hadoop hadoop 18772 Dec 19 2017 pom-old.xml
-rw-rw-r--. 1 hadoop hadoop 1096 Dec 19 2017 README.txt
-rw-rw-r--. 1 hadoop hadoop 1108073 Dec 19 2017 sqoop-1.4.7.jar
-rw-rw-r--. 1 hadoop hadoop 6554 Dec 19 2017 sqoop-patch-review.py
-rw-rw-r--. 1 hadoop hadoop 765184 Dec 19 2017 sqoop-test-1.4.7.jar
drwxr-xr-x. 7 hadoop hadoop 73 Dec 19 2017 src
drwxr-xr-x. 4 hadoop hadoop 114 Dec 19 2017 testdata
2.2 Configure the MySQL connector for Sqoop:
Download mysql-connector-java-8.0.16.jar and copy it into the lib folder under the Sqoop installation directory.
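The copy step can be sketched as a small shell function. This is only an illustration: `install_jdbc_driver` is a hypothetical helper name, and the paths in the usage note assume the jar was downloaded to the home directory of the hadoop user.

```shell
# Hypothetical helper (not part of Sqoop): copy a JDBC driver jar
# into the lib directory of a Sqoop installation.
install_jdbc_driver() {
    jar="$1"         # path to the downloaded connector jar
    sqoop_home="$2"  # Sqoop installation directory

    if [ ! -f "$jar" ]; then
        echo "driver jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$sqoop_home/lib" ]; then
        echo "no lib directory under: $sqoop_home" >&2
        return 1
    fi
    cp "$jar" "$sqoop_home/lib/" && echo "installed $(basename "$jar")"
}
```

For the layout used in this post the call would be `install_jdbc_driver ~/mysql-connector-java-8.0.16.jar ~/sqoop-1.4.7.bin__hadoop-2.6.0`.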
2.3 Configure the Sqoop environment:
[hadoop@hadoop01 sqoop-1.4.7.bin__hadoop-2.6.0]$ cd conf
[hadoop@hadoop01 conf]$ ls -ll
total 28
-rw-rw-r--. 1 hadoop hadoop 3895 Dec 19 2017 oraoop-site-template.xml
-rw-rw-r--. 1 hadoop hadoop 1404 Dec 19 2017 sqoop-env-template.cmd
-rwxr-xr-x. 1 hadoop hadoop 1345 Dec 19 2017 sqoop-env-template.sh
-rw-rw-r--. 1 hadoop hadoop 6044 Dec 19 2017 sqoop-site-template.xml
-rw-rw-r--. 1 hadoop hadoop 6044 Dec 19 2017 sqoop-site.xml
2.3.1 Copy the sqoop-env.sh template and set the installation directories of Hadoop, HBase, Hive, and ZooKeeper (note: skip any component that is not installed)
[hadoop@hadoop01 conf]$ cp sqoop-env-template.sh sqoop-env.sh
[hadoop@hadoop01 conf]$ gedit sqoop-env.sh
Lines changed:
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-3.2.0
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-3.2.0
#set the path to where bin/hbase is available
export HBASE_HOME=/home/hadoop/hbase-2.2.1
#Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/apache-hive-3.1.2-bin
#Set the path for where zookeper config dir is
export ZOOCFGDIR=/home/hadoop/apache-zookeeper-3.5.5
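A typo in one of these paths tends to surface only later as a confusing runtime error, so it can help to check them up front. The following is a minimal sketch, assuming every setting is written as a plain unquoted `export NAME=/path` line as above; `check_env_paths` is a hypothetical helper, not something Sqoop provides.

```shell
# Hypothetical helper: report which "export NAME=/path" lines in a
# sqoop-env.sh file point at directories that do not exist.
check_env_paths() {
    env_file="$1"
    status=0
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            "export "*=*) ;;   # an active export line: check it
            *) continue ;;     # comments and anything else: skip
        esac
        assignment=${line#export }
        name=${assignment%%=*}
        path=${assignment#*=}
        if [ -d "$path" ]; then
            echo "ok $name=$path"
        else
            echo "missing $name=$path"
            status=1
        fi
    done < "$env_file"
    return "$status"
}
```

Running `check_env_paths conf/sqoop-env.sh` from the Sqoop directory prints one line per export and returns a non-zero status if any directory is missing.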
2.3.2 Configure the Linux environment variables
sudo vi /etc/profile
export SQOOP_HOME=/home/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0
export PATH=$PATH:$SQOOP_HOME/bin
--------------------------------------------------
source /etc/profile
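The two export lines can also be appended by a script instead of an editor. The sketch below is one possible way to do that without adding the lines twice; `append_sqoop_env` is a hypothetical helper, and the target file is passed as an argument so it can be tried on a scratch file first.

```shell
# Hypothetical helper: append the SQOOP_HOME and PATH exports to a
# profile file, unless SQOOP_HOME is already exported there.
append_sqoop_env() {
    profile="$1"     # file to modify, e.g. /etc/profile
    sqoop_home="$2"  # Sqoop installation directory

    if grep -q '^export SQOOP_HOME=' "$profile" 2>/dev/null; then
        echo "SQOOP_HOME already set in $profile"
        return 0
    fi
    {
        echo "export SQOOP_HOME=$sqoop_home"
        # Single quotes keep $PATH and $SQOOP_HOME unexpanded in the file.
        echo 'export PATH=$PATH:$SQOOP_HOME/bin'
    } >> "$profile"
    echo "added SQOOP_HOME to $profile"
}
```

Writing to /etc/profile needs root privileges; a per-user file such as ~/.bash_profile is an alternative target that does not. After sourcing the file, `command -v sqoop` should print the path of the sqoop launcher.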
2.4 Verify that Sqoop is installed correctly
[hadoop@hadoop01 sqoop-1.4.7.bin__hadoop-2.6.0]$ bin/sqoop help   # run this command; output like the following means the installation works
Warning: /home/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
Error: Could not find or load main class org.apache.hadoop.hbase.util.GetJavaProperty
2019-09-29 23:38:28,571 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]
Available commands:
codegen Generate code to interact with database records
create-hive-table Import a table definition into Hive
eval Evaluate a SQL statement and display the results
export Export an HDFS directory to a database table
help List available commands
import Import a table from a database to HDFS
import-all-tables Import tables from a database to HDFS
import-mainframe Import datasets from a mainframe server to HDFS
job Work with saved jobs
list-databases List available databases on a server
list-tables List available tables in a database
merge Merge results of incremental imports
metastore Run a standalone Sqoop metastore
version Display version information
See 'sqoop help COMMAND' for information on a specific command.
[hadoop@hadoop01 sqoop-1.4.7.bin__hadoop-2.6.0]$
2.5 Test the connection between Sqoop and MySQL
sqoop list-tables --username root -P --connect jdbc:mysql://localhost:3306/test_db
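The `--connect` value is a JDBC URL made of host, port, and database name, and `-P` makes Sqoop prompt for the password instead of taking it on the command line. As a small sketch, the URL can be assembled by a helper so the same pieces are reused across commands; `mysql_jdbc_url` is a hypothetical name introduced here for illustration.

```shell
# Hypothetical helper: build a MySQL JDBC URL from host, port and database.
mysql_jdbc_url() {
    host="$1"
    port="$2"
    database="$3"
    printf 'jdbc:mysql://%s:%s/%s\n' "$host" "$port" "$database"
}

# Usage (needs a running MySQL server and sqoop on the PATH):
#   sqoop list-tables --username root -P \
#       --connect "$(mysql_jdbc_url localhost 3306 test_db)"
```

If the command lists the tables of test_db, both the JDBC driver and the credentials are working.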
3. Importing into HDFS with Sqoop
3.1 Preparation before the import:
USE test_db;
CREATE TABLE hello
(
id INT(11),
name VARCHAR(25),
deptId INT(11),
salary FLOAT
);
insert into hello values(01,'gopal',2,0.3);
insert into hello values(02,'manis',3,0.3);
insert into hello values(03,'khali',4,0.3);
insert into hello values(04,'prasa',5,0.3);
insert into hello values(05,'krant',6,0.3);
3.2 Import:
sqoop import \
--connect jdbc:mysql://192.168.195.131:3306/test_db \
--username root \
--password root \
--table hello \
-m 1
3.3 Result:
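One way to inspect the imported data is to read the files Sqoop wrote to HDFS. The sketch below assumes the import was run without `--target-dir`, in which case Sqoop writes to a directory named after the table inside the HDFS home directory of the user who ran the job, and it assumes home directories live under /user. `default_import_dir` is a hypothetical helper.

```shell
# Hypothetical helper: default HDFS directory of a table import,
# assuming HDFS home directories are /user/<name> and no --target-dir
# was given to sqoop import.
default_import_dir() {
    user="$1"   # OS user that ran the import, e.g. hadoop
    table="$2"  # imported table, e.g. hello
    printf '/user/%s/%s\n' "$user" "$table"
}

# Usage (needs a running cluster and a finished import):
#   hdfs dfs -ls  "$(default_import_dir hadoop hello)"
#   hdfs dfs -cat "$(default_import_dir hadoop hello)/part-m-00000"
```

With a single mapper the rows end up in one file, part-m-00000, one text line per row.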
Reference:
https://blog.csdn.net/yumingzhu1/article/details/80678525
Source: CSDN
Author: Lawrence_121
Link: https://blog.csdn.net/m0_37806112/article/details/103646205