Is it possible to import data into a Hive table without copying the data?

花落未央 2021-02-14 00:49

I have log files stored as text in HDFS. When I load the log files into a Hive table, all the files are copied.

Can I avoid having all my text data stored twice?

4 Answers
  •  余生分开走
    2021-02-14 01:23

    Hive (at least when running in true cluster mode) cannot refer to external files in the local file system. Hive can automatically import the files during table creation or a load operation. The likely reason is that Hive runs MapReduce jobs internally to extract the data. MapReduce reads from HDFS, writes back to HDFS, and runs in distributed mode. So if a file is stored in the local file system, the distributed infrastructure cannot use it.
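
For files that already sit in HDFS, as in the question, the usual way to avoid a second copy is an external table whose LOCATION points at the existing directory. The sketch below only builds the HiveQL text so the two approaches can be compared side by side. The helper names, the table name `logs`, the column `line`, and the path `/data/logs` are all hypothetical, and the statements would still have to be run through a Hive client of your choice.

```python
def external_table_ddl(table, columns, location, delimiter="\\t"):
    """Build HiveQL that maps a table onto files already in HDFS.

    An EXTERNAL table with LOCATION reads the files where they are;
    Hive does not copy or move them, and DROP TABLE leaves them in place.
    """
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


def load_statement(path, table, local):
    """Build a LOAD DATA statement.

    With LOCAL, Hive copies the file from the client's local file system.
    Without LOCAL, the path is an HDFS path and the file is moved into
    the table's directory rather than duplicated.
    """
    keyword = "LOCAL " if local else ""
    return f"LOAD DATA {keyword}INPATH '{path}' INTO TABLE {table}"


if __name__ == "__main__":
    # Hypothetical table, column, and paths, for illustration only.
    print(external_table_ddl("logs", [("line", "STRING")], "/data/logs"))
    print(load_statement("/tmp/app.log", "logs", local=True))
```

The external-table statement is the one that addresses the double-storage concern: the log directory stays where it is and Hive only records its location in the metastore.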
