Is it possible to load parquet table directly from file?

鱼传尺愫 2021-01-22 04:12

If I have a binary data file (which can be converted to CSV format), is there any way to load a Parquet table directly from it? Many tutorials show loading a CSV file into a text table, and

1 answer
  • 2021-01-22 04:30

    Unfortunately, Impala cannot read a custom binary format. Convert your files to CSV, create an external table over the existing CSV files to act as a temporary staging table, and then insert into the final Parquet table by selecting from that staging table. The Impala Parquet documentation has much more information and some related examples. See the section about compacting small files, which uses a similar approach.
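The CSV staging workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (a little-endian int32 `id` followed by a float64 `value`), the table names `staging_csv` and `final_parquet`, and the HDFS directory are all hypothetical, since the question does not describe the real binary format. The script converts the binary records to CSV and builds the three Impala statements as strings; running them (for example through impala-shell) is left out.

```python
import csv
import struct

# Hypothetical record layout (an assumption, not the asker's real format):
# little-endian int32 id followed by a float64 value, 12 bytes per record.
RECORD = struct.Struct("<id")


def binary_to_csv(src_path, dst_path):
    """Convert fixed-width binary records to CSV rows; return the row count."""
    rows = 0
    with open(src_path, "rb") as src, open(dst_path, "w", newline="") as dst:
        # Use "\n" line endings so no stray "\r" ends up in the last column.
        writer = csv.writer(dst, lineterminator="\n")
        while True:
            chunk = src.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break  # end of file (a trailing partial record is ignored)
            writer.writerow(RECORD.unpack(chunk))
            rows += 1
    return rows


def impala_statements(csv_dir, staging="staging_csv", final="final_parquet"):
    """Build the three Impala statements for the staging-table workflow."""
    columns = "id INT, value DOUBLE"  # must match the hypothetical layout above
    return [
        # 1. External text table over the directory holding the CSV files.
        f"CREATE EXTERNAL TABLE {staging} ({columns}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{csv_dir}'",
        # 2. Final table stored as Parquet.
        f"CREATE TABLE {final} ({columns}) STORED AS PARQUET",
        # 3. Copy the rows; Impala writes them out as Parquet files.
        f"INSERT INTO {final} SELECT * FROM {staging}",
    ]


if __name__ == "__main__":
    for statement in impala_statements("/user/impala/staging_csv"):
        print(statement + ";")
```

The CSV files produced by `binary_to_csv` would still have to be copied into the HDFS directory named in `LOCATION` before the `INSERT` is run. Plain delimited text in Impala does not interpret CSV quoting, so this sketch is only suitable for fields that never contain commas, such as the numeric columns assumed here.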

    I don't know how you convert your file format to CSV, but you might consider writing a program that converts your binary format directly to Parquet. For example, you can write a MapReduce job that writes Parquet files. Here is an example that reads and writes Parquet: https://github.com/cloudera/parquet-examples/blob/master/MapReduce/TestReadWriteParquet.java
