Updating values in an Apache Parquet file

醉话见心 2021-02-07 08:54

I have a fairly large Parquet file in which I need to change the values of one column. One way to do this would be to update those values in the source text files and recreate parqu

4 answers
  •  攒了一身酷
    2021-02-07 08:59

    You have to re-create the file; that is the Hadoop way. Parquet files are not designed to be updated in place, and that holds all the more when the file is compressed.

    Another approach, very common in big data, is to write the updates to a separate Parquet (or ORC) file and then JOIN / UNION it with the original at query time.
