Is it possible to load parquet table directly from file?


If I have a binary data file (it can be converted to CSV format), is there any way to load a Parquet table directly from it? Many tutorials show loading a CSV file into a text table, and


1 Answer

不知归路

2021-01-22 04:30

Unfortunately, Impala cannot read a custom binary format. Convert your files to CSV, create an external table over the existing CSV files to act as a temporary staging table, and then insert into the final Parquet table by selecting from that staging table. The Impala Parquet documentation has a lot more information and some related examples. See the section about compacting small files, which follows a similar pattern.
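To make the three steps concrete, here is a rough sketch that simply builds the Impala statements as strings. The table names (`csv_staging`, `readings_parquet`), the two columns, and the HDFS path are all made up for illustration, so adjust them to your data. You would run the resulting statements through impala-shell or whatever client you normally use.

```python
from typing import Dict, List


def column_list(columns: Dict[str, str]) -> str:
    """Render {'id': 'INT'} as 'id INT' for use in a CREATE TABLE statement."""
    return ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())


def build_statements(
    staging: str = "csv_staging",                  # hypothetical table name
    final: str = "readings_parquet",               # hypothetical table name
    location: str = "/user/example/csv_staging",   # hypothetical HDFS directory
) -> List[str]:
    # Hypothetical schema; it must match the columns in your CSV files.
    columns = {"id": "INT", "reading": "DOUBLE"}
    cols = column_list(columns)
    return [
        # 1. External text table over the CSV files already in `location`.
        f"CREATE EXTERNAL TABLE {staging} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{location}'",
        # 2. Final table stored as Parquet with the same columns.
        f"CREATE TABLE {final} ({cols}) STORED AS PARQUET",
        # 3. Copy the rows; Impala writes them out as Parquet files.
        f"INSERT INTO {final} SELECT * FROM {staging}",
    ]


for statement in build_statements():
    print(statement + ";")
```

Once the insert has finished and you have checked the Parquet table, the staging table can be dropped. Because it is external, dropping it leaves the CSV files in place.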

I don't know how you convert your file format to CSV, but you might consider writing a program that converts your binary format straight to Parquet instead. For example, you can write a MapReduce job that writes Parquet files. Here's an example that reads and writes Parquet: https://github.com/cloudera/parquet-examples/blob/master/MapReduce/TestReadWriteParquet.java
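Since the actual binary layout isn't given, here is only a minimal sketch of the conversion step, using the Python standard library and writing CSV rather than Parquet. It assumes a purely hypothetical layout of fixed-width records, each a little-endian 32-bit integer followed by a 64-bit float. The same read loop could feed a Parquet writer instead of `csv.writer`.

```python
import csv
import struct

# Hypothetical record layout: little-endian int32 id, then float64 reading.
RECORD = struct.Struct("<id")


def binary_to_csv(src_path: str, dst_path: str) -> int:
    """Convert fixed-width binary records to CSV rows and return the row count."""
    rows = 0
    with open(src_path, "rb") as src, open(dst_path, "w", newline="") as dst:
        # No header row, and plain "\n" line endings, so a text table over
        # this file does not pick up a header line or stray "\r" characters.
        writer = csv.writer(dst, lineterminator="\n")
        while True:
            chunk = src.read(RECORD.size)
            if not chunk:
                break
            if len(chunk) != RECORD.size:
                raise ValueError("truncated record at end of file")
            writer.writerow(RECORD.unpack(chunk))
            rows += 1
    return rows
```

Call it as `binary_to_csv("input.bin", "output.csv")` (both file names are placeholders), then upload the CSV to the directory your external table points at.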
