I'm getting the following error when running a query on a PostgreSQL db in standby mode. The query that causes the error works fine for 1 month but when you query for more
There's no need to start idle transactions on the master. In PostgreSQL 9.1, the most direct way to solve this problem is to set:
hot_standby_feedback = on
This makes the master aware of long-running queries on the standby. From the docs:
The first option is to set the parameter hot_standby_feedback, which prevents VACUUM from removing recently-dead rows and so cleanup conflicts do not occur.
Why isn't this the default? This parameter was added after the initial implementation and it's the only way that a standby can affect a master.
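As a minimal sketch of where this setting goes, assuming a streaming-replication standby that already accepts read-only queries (the location of postgresql.conf and the reload method vary by installation):

```conf
# postgresql.conf on the STANDBY, not on the master
# Reports the standby's running queries back to the master so that
# VACUUM does not remove rows those queries can still see.
hot_standby_feedback = on
```

The setting should take effect on a configuration reload (for example pg_ctl reload), without a restart. The trade-off is that the master holds back cleanup of dead rows while standby queries are running, so very long queries on the standby can cause table bloat on the master.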
I'm going to add some updated info and references to @max-malysh's excellent answer above.
In short, anything you do on the master needs to be replicated on the slave. Postgres uses WAL records for this, which are sent to the slave after every logged action on the master. The slave then executes the action and the two are in sync again.
In several scenarios, what is coming in from the master in a WAL action can conflict with what is happening on the slave. In most of them, a transaction running on the slave conflicts with what the WAL action wants to change. In that case, you have two options:
1. Delay applying the WAL action for a while, giving the transaction on the slave time to finish.
2. Cancel the conflicting query on the slave.
We're concerned with #1, and two values:
- max_standby_archive_delay: this is the delay used after a long disconnection between the master and slave, when the data is being read from a WAL archive, which is not current data.
- max_standby_streaming_delay: delay used for cancelling queries when WAL entries are received via streaming replication.
Generally, if your server is meant for high availability replication, you want to keep these numbers short. The default setting of 30000 (milliseconds if no units given) is sufficient for this. If, however, you want to set up something like an archive, reporting- or read-replica that might have very long-running queries, then you'll want to set this to something higher to avoid cancelled queries. The recommended 900s setting above seems like a good starting point. I disagree with the official docs on setting an infinite value -1 as being a good idea, since that could mask some buggy code and cause lots of issues.
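For a reporting replica, the delay settings discussed above might look like the following. This is a sketch only: 900s is the starting point mentioned in this thread, not a tuned value, and both lines are assumed to go in postgresql.conf on the standby.

```conf
# postgresql.conf on the STANDBY
# Values without a unit are read as milliseconds; the default is 30s.
max_standby_streaming_delay = 900s   # applies to WAL received via streaming replication
max_standby_archive_delay = 900s     # applies to WAL read from the archive
# -1 would wait forever for conflicting queries; see the objection above.
```

Both parameters can be changed with a configuration reload. On a standby kept for high availability, leaving them at the default is the simpler choice.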
One caveat about raising these values for long-running queries: while a long query is holding up a WAL action, other queries running on the slave at the same time will see old data until the long query completes. Developers need to understand this and serialize queries that shouldn't run simultaneously.
For the full explanation of how max_standby_archive_delay and max_standby_streaming_delay work and why, go here.