I would like to measure the time between inserting data into the master table and its arrival in the slave table using streaming replication in PostgreSQL 9.3. For this I create table test_time
If your database has frequent writes, then the query below gives a close approximation of the slave lag:
select now() - pg_last_xact_replay_timestamp() AS replication_delay;
Below is a more accurate query for calculating replication lag on databases with very few writes. If the master doesn't send any writes down to the slave, then pg_last_xact_replay_timestamp() stays constant, so the query above may not accurately reflect the slave lag.
SELECT CASE
         WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0
         ELSE EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())
       END AS log_delay;
A slightly different version of the correct answer:
postgres=# SELECT
  pg_last_xlog_receive_location() receive,
  pg_last_xlog_replay_location() replay,
  (
    extract(epoch FROM now()) -
    extract(epoch FROM pg_last_xact_replay_timestamp())
  )::int lag;
  receive   |   replay   | lag
------------+------------+------
 1/AB861728 | 1/AB861728 | 2027
The lag only matters when "receive" is not equal to "replay". Execute the query on the replica.
Alf162 mentioned a good solution in the comments to Craig Ringer's answer; so I'm adding this to clarify.
PostgreSQL has an administrative function, pg_last_xact_replay_timestamp(), which returns the timestamp of the last transaction replayed during recovery. This is the time at which the commit or abort WAL record for that transaction was generated on the primary.
So this query, run on a replica:
select now() - pg_last_xact_replay_timestamp() as replication_lag;
will return a duration representing the difference in time between the current clock and the timestamp of the last WAL record applied from the replication stream.
Note that if the master is not receiving new mutations, there will be no WAL records to stream and the lag calculated this way will grow without actually being a signal of delays in replication. If the master is under more or less continuous mutation, it will be continuously streaming WALs and the above query is a fine approximation of the time delay for changes on the master to materialize on the slave. Accuracy will obviously be affected by how rigorously synchronized the system clocks on the two hosts are.
You can get the delay in bytes from the master side quite easily using pg_xlog_location_diff to compare the master's pg_current_xlog_insert_location with the replay_location for that backend's pg_stat_replication entry.
This only works when run on the master. You can't do it from the replica because the replica has no idea how far ahead the master is.
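The comparison described above might look like the following on the master. This is a minimal sketch using the PostgreSQL 9.x function and column names mentioned in this answer; the column alias replay_lag_bytes is made up for illustration.

```sql
-- Run on the master (PostgreSQL 9.x names).
-- One row is returned per connected standby.
SELECT client_addr,
       application_name,
       pg_xlog_location_diff(pg_current_xlog_insert_location(),
                             replay_location) AS replay_lag_bytes
FROM pg_stat_replication;
```

From PostgreSQL 10 onward these were renamed: pg_wal_lsn_diff, pg_current_wal_insert_lsn and replay_lsn.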
Additionally this won't tell you the lag in seconds. In current (as of 9.4 at least) PostgreSQL versions there's no timestamp associated with a commit or a WAL record. So there's no way to tell how long ago a given LSN (xlog position) was.
The only way to get the replica lag in seconds on a current PostgreSQL version is to have an external process commit an update to a dedicated timestamp table periodically. You can then compare current_timestamp on the replica to the timestamp of the most recent entry in that table visible on the replica to see how far the replica is behind. This creates additional WAL traffic that will then have to be kept in your archived WAL for PITR (PgBarman or whatever), so you should balance the increased data use with the granularity of lag detection you require.
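A rough sketch of that timestamp-table approach is shown here. The table name replication_heartbeat, its columns, and the choice of a single-row layout are assumptions for illustration, and the periodic update is assumed to be driven by something external such as cron.

```sql
-- On the master, once: a single-row heartbeat table (hypothetical name).
CREATE TABLE replication_heartbeat (
    id         integer PRIMARY KEY,
    updated_at timestamptz NOT NULL
);
INSERT INTO replication_heartbeat (id, updated_at) VALUES (1, now());

-- On the master, run periodically by the external process:
UPDATE replication_heartbeat SET updated_at = now() WHERE id = 1;

-- On the replica: age of the newest heartbeat that has been replayed.
SELECT current_timestamp - updated_at AS approx_lag
FROM replication_heartbeat
WHERE id = 1;
```

The value read on the replica includes up to one update interval plus any clock skew between the two hosts, so a shorter interval gives finer granularity at the cost of more WAL.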
PostgreSQL 9.5 may add commit timestamps that will hopefully let you find out how long ago a given commit happened and therefore how far a replica is behind in wall-clock seconds.
On the master, you can run select * from pg_stat_replication;
This will give you (among other columns):
| sent_lsn | write_lsn | flush_lsn | replay_lsn
-+-------------+-------------+-------------+-------------
| 8D/2DA48000 | 8D/2DA48000 | 8D/2DA48000 | 89/56A0D500
Those can tell you where your offsets are. As you can see from this example, replay on the replica is behind.
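To turn those offsets into a byte count rather than comparing hex values by eye, something like the following could be used on the master. It is a sketch assuming PostgreSQL 10 or later, where these _lsn column names and pg_wal_lsn_diff exist; the output aliases are made up.

```sql
-- Run on the master (PostgreSQL 10+ names).
SELECT application_name,
       pg_wal_lsn_diff(sent_lsn, replay_lsn)             AS replay_behind_sent_bytes,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_behind_master_bytes
FROM pg_stat_replication;

-- The same function also accepts literal LSNs, e.g. the values shown above:
SELECT pg_wal_lsn_diff('8D/2DA48000', '89/56A0D500') AS bytes_behind;
```

For the sample row above, the gap between sent_lsn and replay_lsn works out to roughly 15 GiB.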
As of the PostgreSQL 10 release:
https://www.postgresql.org/docs/10/static/monitoring-stats.html#pg-stat-replication-view
write_lag (interval): Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written it (but not yet flushed it or applied it). This can be used to gauge the delay that synchronous_commit level remote_write incurred while committing if this server was configured as a synchronous standby.
flush_lag (interval): Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written and flushed it (but not yet applied it). This can be used to gauge the delay that synchronous_commit level remote_flush incurred while committing if this server was configured as a synchronous standby.
replay_lag (interval): Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written, flushed and applied it. This can be used to gauge the delay that synchronous_commit level remote_apply incurred while committing if this server was configured as a synchronous standby.
(formatting mine)
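The columns quoted above can be read with a plain query on the master. This is only a sketch, assuming PostgreSQL 10 or later; the sync_state column is included so you can see whether each standby is synchronous when interpreting the values.

```sql
-- Run on the master (PostgreSQL 10+).
SELECT application_name,
       state,
       sync_state,
       write_lag,
       flush_lag,
       replay_lag
FROM pg_stat_replication;
```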
Alas, the new columns seem to suit only synchronous replication (otherwise the master would not know the exact lag), so the check for asynchronous replication delay seems to remain now() - pg_last_xact_replay_timestamp()
...