Parquet without Hadoop?


北海茫月 2021-01-01 12:21

I want to use Parquet in one of my projects as columnar storage, but I don't want to depend on the Hadoop/HDFS libraries. Is it possible to use Parquet outside of HDFS? Or what is th

6 answers

生来不讨喜 (OP)

2021-01-01 12:28
What type of data do you have in Parquet? You don't need HDFS to read Parquet files; it is definitely not a prerequisite. We use Parquet files at Incorta for our staging tables. We do not ship with a dependency on HDFS, although you can store the files on HDFS if you want. We at Incorta read directly from the Parquet files, but you can also connect with Apache Drill, using file:/// as the connection instead of hdfs:///. See below for an example.

To read or write Parquet data, you need to include the Parquet format in the storage plugin format definitions. The dfs plugin definition includes the Parquet format.
```
{
  "type" : "file",
  "enabled" : true,
  "connection" : "file:///",
  "workspaces" : {
    "json_files" : {
      "location" : "/incorta/tenants/demo//drill/json/",
      "writable" : false,
      "defaultInputFormat" : "json"
    }
  }
}
```
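As a rough sketch of how the same idea could be applied to Parquet specifically, the Python snippet below builds a storage plugin definition that points at the local file system and registers the Parquet format. The function name `build_local_parquet_plugin`, the workspace name `parquet_files`, and the path `/data/parquet` are all made up for illustration. The `formats` entry follows the shape used by Drill's stock dfs plugin, but check it against your own Drill version before relying on it.

```python
import json


def build_local_parquet_plugin(location: str, workspace: str = "parquet_files") -> dict:
    """Build a Drill file-system plugin definition for Parquet files on local disk.

    `location` and `workspace` are hypothetical values chosen by the caller.
    """
    return {
        "type": "file",
        "enabled": True,
        # file:/// points Drill at the local file system instead of hdfs:///
        "connection": "file:///",
        "workspaces": {
            workspace: {
                "location": location,
                "writable": False,
                "defaultInputFormat": "parquet",
            }
        },
        # Register the Parquet format in the plugin's format definitions.
        "formats": {"parquet": {"type": "parquet"}},
    }


if __name__ == "__main__":
    # "/data/parquet" is a placeholder directory, not a path from the answer above.
    print(json.dumps(build_local_parquet_plugin("/data/parquet"), indent=2))
```

The printed JSON can be pasted into the storage plugin editor in the Drill web UI. Assuming the plugin is registered under the name dfs, a file in that directory could then be queried with something like SELECT * FROM dfs.parquet_files.`example.parquet`, where example.parquet is again a placeholder file name.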