Question
How do I force ClickHouse to merge only one partition at a time when I run optimize table **** final, without having to specify partition 201304, then 201305, and so on, and run the statements sequentially?
I am using a CollapsingMergeTree. It uses a lot of RAM merging many partitions at the same time, which kills the service/machine.
Answer 1:
The main problem with optimize final (for a table or for a partition, it does not matter) is that it fully rewrites/re-merges a partition even if that partition has only one part, which is excessive in 99.9999% of cases. It re-merges old data that was already fully merged.
This behavior exists because sometimes one needs to collapse rows (duplicates) that were inserted with a single insert into a partition that will only ever have a single part. That is a very rare necessity.
So I recommend running optimize final only against partitions that have more than one part. You can generate the statements with something like this:
-- Emit one "optimize ... final" statement per partition that still has more than one active part
select concat('optimize table ', database, '.', '\`', table, '\` partition ', partition, ' final;')
from system.parts
where active and (engine like '%ReplacingMergeTree' or engine like '%CollapsingMergeTree')
group by database, table, partition
having count() > 1
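To actually run the generated statements one partition at a time, a small client-side loop is enough. The following is a minimal sketch, not a definitive implementation: the function names are made up for illustration, `execute` is a hypothetical callable that sends one SQL string to the server through whatever client you use, and it assumes each call blocks until that merge has finished.

```python
from typing import Callable, Iterable, List, Tuple

# (database, table, partition, active part count), e.g. the result of grouping
# system.parts by database, table, partition with count() as the last column.
PartRow = Tuple[str, str, str, int]


def build_optimize_statements(rows: Iterable[PartRow]) -> List[str]:
    """Return one OPTIMIZE ... FINAL statement per partition with more than one part."""
    statements = []
    for database, table, partition, part_count in rows:
        if part_count > 1:  # single-part partitions are skipped, as recommended above
            statements.append(
                f"OPTIMIZE TABLE {database}.`{table}` PARTITION {partition} FINAL"
            )
    return statements


def run_sequentially(statements: Iterable[str], execute: Callable[[str], object]) -> int:
    """Run the statements one after another and return how many were executed.

    `execute` is a hypothetical callable (an assumption, not a ClickHouse API);
    it is assumed to block until the statement has completed.
    """
    executed = 0
    for statement in statements:
        execute(statement)
        executed += 1
    return executed
```

Because the loop only moves on once `execute` returns, at most one partition should be merging on behalf of this script at any time, provided the client call is synchronous.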
PS: If you use GraphiteMergeTree, that is a different story, and there are simpler solutions.
Source: https://stackoverflow.com/questions/60151852/clickhouse-merge-one-partition-at-a-time