Overwrite Hive partitions using Spark

北荒 2021-02-05 21:44

I am working with AWS and I have workflows that use Spark and Hive. My data is partitioned by date, so every day I have a new partition in my S3 storage. My problem is when …

4 Answers
    名媛妹妹 2021-02-05 22:33

    If you are on Spark 2.3.0 or later, try setting spark.sql.sources.partitionOverwriteMode to dynamic. The data must be written into a partitioned table, and the write mode must be overwrite.

    // Overwrite only the partitions present in the incoming data,
    // instead of truncating the whole table first
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
    data.write.mode("overwrite").insertInto("partitioned_table")
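
    For context, here is a minimal end-to-end sketch in Scala of how a single daily partition can be replaced without touching earlier dates. The table name `sales`, the partition column `dt`, and the S3 path are hypothetical placeholders; the table is assumed to already exist as a Hive table partitioned by `dt`, since insertInto writes by column position into an existing table.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.lit

    val spark = SparkSession.builder()
      .appName("dynamic-partition-overwrite")
      .enableHiveSupport()
      .getOrCreate()

    // Only the partitions present in the written data are replaced.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    // Hypothetical daily load; the partition column must be the last column
    // so that it lines up with the table schema for insertInto.
    val todays = spark.read.parquet("s3://my-bucket/incoming/2021-02-05/")
      .withColumn("dt", lit("2021-02-05"))

    // Replaces only the dt=2021-02-05 partition; other date partitions stay intact.
    todays.write.mode("overwrite").insertInto("sales")

    Without partitionOverwriteMode set to dynamic (the default is static), the same overwrite call would drop all existing partitions of the target table, which is exactly what you want to avoid for daily incremental loads.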
    
