Using Sqoop to import data from MySQL to Hive

攒了一身酷 2021-02-06 10:12

I am using Sqoop (version 1.4.4) to import data from MySQL into Hive. The data will be a subset of one of the tables, i.e. a few columns from a table. Is it necessary to create table in

6 answers
  •  臣服心动
    2021-02-06 10:45

    First, you don't have to create an EXTERNAL table; a regular table works as well. Second, the solutions given above are a bit complex.

    Suppose the MySQL schema looks like this:

    mysql> describe emp;
    +--------+-------------+------+-----+---------+-------+
    | Field  | Type        | Null | Key | Default | Extra |
    +--------+-------------+------+-----+---------+-------+
    | id     | int(11)     | YES  |     | NULL    |       |
    | name   | varchar(20) | YES  |     | NULL    |       |
    | deg    | varchar(20) | YES  |     | NULL    |       |
    | salary | int(11)     | YES  |     | NULL    |       |
    | dept   | varchar(20) | YES  |     | NULL    |       |
    +--------+-------------+------+-----+---------+-------+
    

    Then create the Hive table. I used the database userdb and the table emp:

    hive>
    CREATE TABLE userdb.emp (
    id  INT,
    name  VARCHAR(20),
    deg  VARCHAR(20),
    salary INT,
    dept  VARCHAR(20))
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE;
    

    Now it is just a matter of running the Sqoop script (I had to quit the Hive prompt first). Since I am not using hive2, I had to run the script from the directory where metastore_db exists, i.e. the same working directory from which I ran hive. Some workaround can probably mitigate this. The Sqoop script is:

    sqoop import \
    --connect jdbc:mysql://localhost/userdb \
    --username root --password root \
    --table emp --fields-terminated-by ',' \
    --split-by id \
    --hive-import --hive-table userdb.emp \
    --target-dir /emp
    

    The target directory, i.e. /emp, gets deleted once the command succeeds. I specified the Hive table explicitly as userdb.emp.
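
    The question is about importing only a few columns, which Sqoop supports through its --columns option. The following is a minimal sketch under stated assumptions, not a tested recipe for this setup: the column list id,name,salary, the Hive table name userdb.emp_subset and the target directory /emp_subset are all hypothetical. A separate Hive table is used because a subset of columns would not line up with the five-column userdb.emp definition. Sqoop's Hive import normally creates the Hive table itself when it does not exist yet, so creating it by hand should be optional.

```shell
#!/bin/sh
# Sketch only: import a subset of the columns of userdb.emp into Hive.
# IMPORT_COLUMNS, userdb.emp_subset and /emp_subset are illustrative assumptions.
set -eu

IMPORT_COLUMNS="id,name,salary"

# SQOOP defaults to "echo sqoop", so running this file as-is only prints the
# command. Run it with SQOOP=sqoop in the environment to perform the import.
SQOOP="${SQOOP:-echo sqoop}"

import_subset() {
  # Nothing may follow a trailing backslash, not even a space.
  # -P prompts for the password instead of putting it on the command line.
  $SQOOP import \
    --connect jdbc:mysql://localhost/userdb \
    --username root -P \
    --table emp \
    --columns "$IMPORT_COLUMNS" \
    --fields-terminated-by ',' \
    --split-by id \
    --hive-import --hive-table userdb.emp_subset \
    --target-dir /emp_subset
}

import_subset
```

    Printing the command first makes it easy to check the arguments before touching the cluster. If the Hive table is created by hand instead, its columns should match IMPORT_COLUMNS in number and order.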

    My HDFS directory structure:

    drwxr-xr-x   - ubuntu supergroup          0 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp
    -rwxr-xr-x   3 ubuntu supergroup         28 2016-12-18 13:19 /user/hive/warehouse/userdb.db/emp/part-m-00000
    -rwxr-xr-x   3 ubuntu supergroup         35 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp/part-m-00001
    -rwxr-xr-x   3 ubuntu supergroup         29 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp/part-m-00002
    -rwxr-xr-x   3 ubuntu supergroup         31 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp/part-m-00003
    -rwxr-xr-x   3 ubuntu supergroup         28 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp/part-m-00004
    
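    Each part-m-NNNNN file in such a listing is written by one map task of the import. As a quick sanity check, the listing can be summarized with a small filter. This is a sketch that assumes the usual eight-column layout of `hdfs dfs -ls` output (size in column 5, path in column 8); the function name summarize_parts is made up for illustration, and the sample input is the listing shown above.

```shell
#!/bin/sh
# Sketch: count the part files in `hdfs dfs -ls` output and total their sizes.
# Assumes column 5 is the size in bytes and column 8 is the path.
summarize_parts() {
  awk '$8 ~ /\/part-m-[0-9]+$/ { n++; bytes += $5 }
       END { printf "%d part files, %d bytes\n", n, bytes }'
}

# Sample input copied from the listing above. On a cluster, pipe in the output of
#   hdfs dfs -ls /user/hive/warehouse/userdb.db/emp
summarize_parts <<'EOF'
drwxr-xr-x   - ubuntu supergroup          0 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp
-rwxr-xr-x   3 ubuntu supergroup         28 2016-12-18 13:19 /user/hive/warehouse/userdb.db/emp/part-m-00000
-rwxr-xr-x   3 ubuntu supergroup         35 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp/part-m-00001
-rwxr-xr-x   3 ubuntu supergroup         29 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp/part-m-00002
-rwxr-xr-x   3 ubuntu supergroup         31 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp/part-m-00003
-rwxr-xr-x   3 ubuntu supergroup         28 2016-12-18 13:20 /user/hive/warehouse/userdb.db/emp/part-m-00004
EOF
```

    For the sample listing this reports 5 part files and 151 bytes. The directory entry itself is skipped because its path does not end in a part-m name.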
