How to delete many rows from a frequently accessed table

Posted by 為{幸葍}努か on 2019-12-01 17:11:19

Question


I need to delete the majority (say, 90%) of a very large table (say, 5 million rows). The other 10% of this table is frequently read, but not written to.

From "Best way to delete millions of rows by ID", I gather that I should remove any index on the 90% I'm deleting, to speed up the process (except an index I'm using to select the rows for deletion).

From "PostgreSQL locking mode", I see that this operation will acquire a ROW EXCLUSIVE lock on the entire table. But since I'm only reading the other 10%, this ought not matter.

So, is it safe to delete everything in one command (i.e. DELETE FROM table WHERE delete_flag='t')? I'm worried that if the deletion of one row fails, triggering an enormous rollback, then it will affect my ability to read from the table. Would it be wiser to delete in batches?


Answer 1:


  1. Indexes are completely useless for operations on 90% of all rows. Sequential scans will be faster either way.

  2. If you need to allow concurrent reads, you cannot take an exclusive lock on the table. DROP INDEX takes such a lock, so you also cannot drop any indexes in the same transaction as the DELETE.

  3. You could drop indexes in separate transactions to keep the duration of the exclusive lock to a minimum, and later use CREATE INDEX CONCURRENTLY to rebuild them in the background, which takes only a very brief exclusive lock.
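The sequence in point 3 could look roughly like the sketch below. The names tbl, tbl_other_col_idx and other_col are hypothetical placeholders, not taken from the question, and each statement is assumed to run as its own transaction (autocommit), not inside one BEGIN ... COMMIT block.

```sql
-- Hypothetical names: tbl, tbl_other_col_idx, other_col.

-- Step 1: drop the index that only slows the DELETE down.
-- Run on its own so the exclusive lock is released as soon as it commits.
DROP INDEX tbl_other_col_idx;

-- Step 2: the big DELETE, in a separate transaction.
-- Assumes delete_flag is a boolean column.
DELETE FROM tbl WHERE delete_flag;

-- Step 3: rebuild the index in the background.
-- CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY tbl_other_col_idx ON tbl (other_col);
```

If a concurrent build fails partway, it can leave an invalid index behind that has to be dropped before retrying, so it is worth checking the result.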

If you have a stable condition to identify the 10% of rows that stay, I would strongly suggest a partial index on just those rows to get the best of both worlds:

  • Reading queries can access the table quickly (using the partial index) at all times.
  • The big DELETE is not going to modify the partial index at all, since none of the rows in the index are involved in the DELETE.

CREATE INDEX foo ON tbl (some_id) WHERE delete_flag = FALSE;

This assumes delete_flag is boolean; tbl is a placeholder for your table name. You have to include the same predicate in your queries (even if it seems logically redundant) to make sure Postgres understands it can use the partial index.
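As a rough illustration of repeating the predicate, a read query against the surviving rows might look like the sketch below. The table name tbl and the literal 123 are placeholders chosen for this example; some_id and delete_flag are the columns from the index definition above.

```sql
-- Hypothetical read query; tbl and the value 123 are placeholders.
-- The "delete_flag = FALSE" condition matches the index predicate,
-- which lets the planner consider the partial index.
SELECT *
FROM   tbl
WHERE  some_id = 123
AND    delete_flag = FALSE;

-- To check which plan is actually chosen on your data:
EXPLAIN
SELECT *
FROM   tbl
WHERE  some_id = 123
AND    delete_flag = FALSE;
```

Whether the planner picks the partial index still depends on table statistics, so the EXPLAIN output is the thing to verify.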



Source: https://stackoverflow.com/questions/35188911/how-to-delete-many-rows-from-frequently-accessed-table
