Question:
How do you load a text file to an Apache Kudu table?
Does the source file need to be in HDFS space first?
If it doesn't share the same HDFS space as other Hadoop ecosystem programs (i.e. Hive, Impala), is there an Apache Kudu equivalent of:
hdfs dfs -put /path/to/file
before I try to load the file?
Answer 1:
The file does not need to be in HDFS first; it can be taken from an edge node or local machine. Kudu is similar to HBase: it is a real-time store that supports key-indexed record lookup and mutation, but it cannot store a text file directly the way HDFS can. For Kudu to store the contents of a text file, the file needs to be parsed and tokenized. For that, you can use the Spark execution engine or the Java API, optionally together with Apache NiFi (or Apache Gobblin), to do the processing and then write the result into a Kudu table.
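As a minimal sketch of the Spark route, assuming a comma-delimited input file at /path/to/file, an existing Kudu table named impala::default.my_table with columns (id BIGINT, name STRING), a Kudu master at kudu-master-host:7051, and the kudu-spark connector jar on the classpath (all of these names are placeholders, not from the question):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object LoadTextToKudu {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("text-to-kudu").getOrCreate()

    // Parse the raw text file (comma-delimited here) into a typed DataFrame.
    // Path, column names, and types are placeholders.
    val df = spark.read
      .csv("/path/to/file")
      .toDF("id", "name")
      .select(col("id").cast("long"), col("name"))

    // Append the rows to an existing Kudu table via the kudu-spark connector.
    // Kudu's DataFrame sink only supports append mode.
    df.write
      .options(Map(
        "kudu.master" -> "kudu-master-host:7051",     // placeholder master address
        "kudu.table"  -> "impala::default.my_table")) // placeholder table name
      .mode("append")
      .format("kudu")
      .save()

    spark.stop()
  }
}
```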
Alternatively:
You can integrate Kudu with Impala, which lets you insert, query, update, and delete data in Kudu tablets using Impala's SQL syntax, as an alternative to building a custom Kudu application against the Kudu APIs. Below are the steps:
- Import the file into HDFS.
- Create an external Impala table over it.
- Insert the data into that table.
- Create a Kudu table using the keywords STORED AS KUDU and AS SELECT to copy the contents from Impala into Kudu (see the sketch after this list).
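A sketch of those steps in impala-shell, assuming a comma-delimited file, hypothetical table names staging_tbl and kudu_tbl, and an id column that can serve as the Kudu primary key (all placeholders):

```sql
-- Step 1 (shell, not SQL): hdfs dfs -put /path/to/file /user/demo/staging/

-- Steps 2-3: external Impala table over the raw text file in HDFS
CREATE EXTERNAL TABLE staging_tbl (
  id   BIGINT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/demo/staging/';

-- Step 4: create the Kudu table and copy the data in one statement
CREATE TABLE kudu_tbl
PRIMARY KEY (id)
PARTITION BY HASH (id) PARTITIONS 4
STORED AS KUDU
AS SELECT id, name FROM staging_tbl;
```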
For more info, refer to this link: https://kudu.apache.org/docs/quickstart.html
Source: https://stackoverflow.com/questions/45361525/load-a-text-file-into-apache-kudu-table