Is it possible to import data into a Hive table without copying the data?

花落未央 2021-02-14 00:49

I have log files stored as text in HDFS. When I load the log files into a Hive table, all the files are copied.

Can I avoid having all my text data stored twice?

4 Answers
  •  执念已碎
    2021-02-14 01:14

    Use an external table:

    CREATE EXTERNAL TABLE sandbox.test (id BIGINT, name STRING)
    ROW FORMAT DELIMITED
        FIELDS TERMINATED BY ','
        LINES TERMINATED BY '\n'
    STORED AS TEXTFILE
    LOCATION '/user/logs/';
    

    If you want to use partitioning with an external table, you are responsible for managing the partition directories yourself. The location specified must be an HDFS directory.
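
    As a rough sketch of what managing the partition directories can look like: the table name `sandbox.test_partitioned`, the partition column `dt`, and the `dt=2021-02-13` directory are all made up for illustration, and the layout of your log directories may well differ.

    ```sql
    -- Hypothetical partitioned variant of the table above; the data stays in /user/logs/.
    CREATE EXTERNAL TABLE sandbox.test_partitioned (id BIGINT, name STRING)
    PARTITIONED BY (dt STRING)
    ROW FORMAT DELIMITED
        FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/user/logs/';

    -- Register one existing directory as a partition (assumed path).
    -- This only updates the metastore; no files are moved or copied.
    ALTER TABLE sandbox.test_partitioned
        ADD IF NOT EXISTS PARTITION (dt = '2021-02-13')
        LOCATION '/user/logs/dt=2021-02-13';

    -- If the directories already follow the dt=<value> naming scheme,
    -- this should pick up the partitions that are not yet registered.
    MSCK REPAIR TABLE sandbox.test_partitioned;
    ```

    Until a directory is registered one of these two ways, queries on the table will not see the files in it.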

    If you drop an external table, Hive will NOT delete the source data. If you want to manage your raw files yourself, use external tables. If you want Hive to manage them, then let Hive store the data inside its warehouse path.
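
    If you are unsure which kind of table you have, something along these lines should tell you before you drop anything. It assumes the `sandbox.test` table from above; the exact output layout depends on the Hive version.

    ```sql
    -- Look for the "Table Type" row in the output:
    -- EXTERNAL_TABLE for external tables, MANAGED_TABLE for warehouse-managed ones.
    DESCRIBE FORMATTED sandbox.test;

    -- For an external table this removes only the metastore entry;
    -- the files under /user/logs/ are left in place.
    DROP TABLE IF EXISTS sandbox.test;
    ```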
