Insert overwrite on partitioned table is not deleting the existing data

独厮守ぢ 2021-01-21 11:06

I am running INSERT OVERWRITE on a partitioned table. The SELECT query of the INSERT OVERWRITE omits one partition completely, and the existing data in that partition is not deleted. Is this the expected behavior?

T

1 answer
  • 2021-01-21 11:59

    Yes, this is expected behavior.

    INSERT OVERWRITE TABLE ... PARTITION ... SELECT ... overwrites only those partitions that are present in the dataset returned by the SELECT.

    In your example, partition state=UP contains only records with city='NOIDA'. The filter city != 'NOIDA' removes the entire state=UP partition from the returned dataset, which is why that partition is not rewritten.

    The filter city != 'Mumbai' does not remove an entire partition. The affected partition is still partially returned, so it is overwritten with the filtered data.
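
    The behavior described above can be reproduced with a small sketch. The table names `cities_src` and `cities_part`, the extra city 'Pune', and the state value 'MH' are hypothetical and only serve as an illustration; the original question's schema is not shown here.

    ```sql
    -- Hypothetical tables, for illustration only.
    CREATE TABLE cities_src (city STRING, state STRING);
    CREATE TABLE cities_part (city STRING) PARTITIONED BY (state STRING);

    -- Allow dynamic partitioning without a static partition column.
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    INSERT INTO cities_src VALUES ('NOIDA', 'UP'), ('Mumbai', 'MH'), ('Pune', 'MH');

    -- Initial load: creates partitions state=UP and state=MH.
    INSERT OVERWRITE TABLE cities_part PARTITION (state)
    SELECT city, state FROM cities_src;

    -- The SELECT returns no rows for state=UP, so that partition is not
    -- part of the result and keeps its old data (NOIDA). Only state=MH
    -- is rewritten.
    INSERT OVERWRITE TABLE cities_part PARTITION (state)
    SELECT city, state FROM cities_src WHERE city != 'NOIDA';

    -- The SELECT still returns a row for state=MH (Pune), so that
    -- partition is overwritten and Mumbai disappears from it.
    INSERT OVERWRITE TABLE cities_part PARTITION (state)
    SELECT city, state FROM cities_src WHERE city != 'Mumbai';
    ```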

    This works as designed. Consider the scenario where you need to overwrite only selected partitions, which is quite normal for an incremental partition load. You do not need to touch the other partitions in that case. The unchanged partitions, which can be very expensive to recover, are left alone.

    If you still want to drop partitions as well as modify data in existing partitions, you can drop and re-create the table (you may need one more intermediate table for this) and then load the partitions into it. Alternatively, work out separately which partitions need to be dropped and execute ALTER TABLE DROP PARTITION for them.
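
    Both options can be sketched roughly as follows. The table names `cities_part` and `cities_part_new` and the partition value are hypothetical, and the rename step assumes a managed table; adapt it to your own schema before use.

    ```sql
    -- Option 1: explicitly drop the partition that the SELECT no longer returns.
    ALTER TABLE cities_part DROP IF EXISTS PARTITION (state = 'UP');

    -- Option 2: rebuild through an intermediate table (hypothetical name),
    -- so that only partitions returned by the SELECT exist afterwards.
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    CREATE TABLE cities_part_new LIKE cities_part;

    INSERT OVERWRITE TABLE cities_part_new PARTITION (state)
    SELECT city, state FROM cities_part WHERE city != 'NOIDA';

    -- Swap the rebuilt table in place of the original one.
    DROP TABLE cities_part;
    ALTER TABLE cities_part_new RENAME TO cities_part;
    ```

    Option 1 is cheaper when only a few partitions have to go, while option 2 rewrites the whole table.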
