PostgreSQL ERROR: canceling statement due to conflict with recovery

后端未结

关注

 8  806

I\'m getting the following error when running a query on a PostgreSQL db in standby mode. The query that causes the error works fine for 1 month but when you query for more

相关标签:

8条回答

粉色の甜心

2020-12-02 05:12
There's no need to start idle transactions on the master. In postgresql-9.1 the most direct way to solve this problem is by setting
```
hot_standby_feedback = on
```
This will make the master aware of long-running queries. From the docs:

The first option is to set the parameter hot_standby_feedback, which prevents VACUUM from removing recently-dead rows and so cleanup conflicts do not occur.

Why isn't this the default? This parameter was added after the initial implementation and it's the only way that a standby can affect a master.
0 讨论(0)
发布评论:

提交评论
- 加载中...
醉话见心

2020-12-02 05:13
I'm going to add some updated info and references to @max-malysh's excellent answer above.

In short, if you do something on the master, it needs to be replicated on the slave. Postgres uses WAL records for this, which are sent after every logged action on the master to the slave. The slave then executes the action and the two are again in sync. In one of several scenarios, you can be in conflict on the slave with what's coming in from the master in a WAL action. In most of them, there's a transaction happening on the slave which conflicts with what the WAL action wants to change. In that case, you have two options:
1. Delay the application of the WAL action for a bit, allowing the slave to finish its conflicting transaction, then apply the action.
2. Cancel the conflicting query on the slave.
We're concerned with #1, and two values:
- max_standby_archive_delay - this is the delay used after a long disconnection between the master and slave, when the data is being read from a WAL archive, which is not current data.
- max_standby_streaming_delay - delay used for cancelling queries when WAL entries are received via streaming replication.
Generally, if your server is meant for high availability replication, you want to keep these numbers short. The default setting of 30000 (milliseconds if no units given) is sufficient for this. If, however, you want to set up something like an archive, reporting- or read-replica that might have very long-running queries, then you'll want to set this to something higher to avoid cancelled queries. The recommended 900s setting above seems like a good starting point. I disagree with the official docs on setting an infinite value -1 as being a good idea--that could mask some buggy code and cause lots of issues.

The one caveat about long-running queries and setting these values higher is that other queries running on the slave in parallel with the long-running one which is causing the WAL action to be delayed will see old data until the long query has completed. Developers will need to understand this and serialize queries which shouldn't run simultaneously.

For the full explanation of how max_standby_archive_delay and max_standby_streaming_delay work and why, go here.
0 讨论(0)
发布评论:

提交评论
- 加载中...

上一页 1 2