The configuration files and scripts used in this test are available at: https://github.com/lxgithub/repmgr_conf_scripts
1. System
IP               HOSTNAME  PG VERSION  DIR         OS
192.168.100.146  node1     9.3.4       /opt/pgsql  CentOS6.5_x64
192.168.100.150  node2     9.3.4       /opt/pgsql  CentOS6.5_x64
# cat /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m
# uname -a
Linux barman 2.6.32-431.11.2.el6.x86_64 #1 SMP Tue Mar 25 19:59:55 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 node1
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 node1
192.168.100.146 node1
192.168.100.150 node2
2. Installation
2.1 Overview
PostgreSQL 9+ allows us to have replicated Hot Standby servers which we can query and/or use for high availability.
While the main components of the feature are included with PostgreSQL, the user is expected to manage the high availability part of it.
repmgr allows you to monitor and manage your replicated PostgreSQL databases as a single cluster. repmgr includes two components:
repmgr: command program that performs tasks and then exits
repmgrd: management and monitoring daemon that watches the cluster and can automate remote actions.
2.2 Requirements
PostgreSQL version >= 9.0
UNIX-like OS
gcc/gmake
rsync/pg_config/pg_ctl in PATH
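A quick way to confirm these prerequisites are in PATH before building (a minimal check; on CentOS, gmake is provided simply as make):
$ which gcc make rsync pg_config pg_ctl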
2.3 Installing PostgreSQL
Installation steps omitted.
Note: initialize a database cluster on node1; on node2 only install the database software.
2.4 Building and installing repmgr
# unzip repmgr-master.zip
# mv repmgr-master postgresql-9.3.4/contrib/repmgr
# cd postgresql-9.3.4/contrib/repmgr/
# make && make install
You can also build and install it as the PostgreSQL administrative OS user, e.g.:
[root@node1 ~]# su - postgres
[postgres@node1 ~]$ unzip repmgr-master.zip
[postgres@node1 ~]$ cd repmgr-master
[postgres@node1 repmgr-master]$ make USE_PGXS=1
[postgres@node1 repmgr-master]$ make USE_PGXS=1 install
Verify the installation:
[postgres@node1 ~]$ repmgr --version
repmgr 2.1dev (PostgreSQL 9.3.4)
[postgres@node1 ~]$ repmgrd --version
repmgrd 2.1dev (PostgreSQL 9.3.4)
After a successful install, the .so and .sql files are copied into the corresponding installation directories:
$ ls /opt/pgsql/share/contrib/
repmgr_funcs.sql repmgr.sql uninstall_repmgr_funcs.sql uninstall_repmgr.sql
$ ls /opt/pgsql/lib/repmgr*
/opt/pgsql/lib/repmgr_funcs.so
$ ls /opt/pgsql/bin/repmgr*
/opt/pgsql/bin/repmgr /opt/pgsql/bin/repmgrd
postgres=# \df
List of functions
Schema | Name | Result data type | Argument data types | Type
-------------+----------------------------------+--------------------------+---------------------+--------
repmgr_test | repmgr_get_last_standby_location | text | | normal
repmgr_test | repmgr_get_last_updated | timestamp with time zone | | normal
repmgr_test | repmgr_update_last_updated | timestamp with time zone | | normal
repmgr_test | repmgr_update_standby_location | boolean | text | normal
(4 rows)
3. Configuration
3.1 Setting up passwordless SSH
[postgres@node1 ~]$ ssh-keygen -t rsa
[postgres@node1 ~]$ ssh-copy-id -i .ssh/id_rsa.pub postgres@node2
[postgres@node1 ~]$ ssh node2 date
Tue Apr 15 01:17:20 CST 2014
[postgres@node2 ~]$ ssh-keygen -t rsa
[postgres@node2 ~]$ ssh-copy-id -i .ssh/id_rsa.pub postgres@node1
[postgres@node2 ~]$ ssh node1 date
Tue Apr 15 01:18:13 CST 2014
3.2 Configuring the database on the master node
$ vi postgresql.conf
listen_addresses = '*'
port = 5432
archive_mode = on
archive_command = 'cd .'        # 'exit 0' etc. also works; the point is a command that does nothing
wal_level = hot_standby
max_wal_senders = 10
wal_keep_segments = 5000        # set this fairly large
hot_standby = on
$ vi pg_hba.conf
host all all 192.168.100.0/24 trust
host replication all 192.168.100.0/24 trust
Start the database:
[postgres@node1 ~]$ pg_ctl start
server starting
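You can verify that the replication-related settings took effect with a quick pg_settings query:
[postgres@node1 ~]$ psql -c "select name, setting from pg_settings where name in ('wal_level', 'archive_mode', 'max_wal_senders', 'wal_keep_segments')"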
3.3 Creating the user
Create the repmgr user on node1:
[postgres@node1 ~]$ createuser --login --superuser repmgr
Test from node2:
[postgres@node2 ~]$ psql -h node1 -U repmgr -d postgres -c "select version()"
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit
(1 row)
3.4 Cloning the standby database
[postgres@node2 ~]$ repmgr -D $PGDATA -d postgres -p 5432 -U repmgr -R postgres --verbose standby clone node1
-D  the directory the files are copied into
-U  the database user to connect as
-R  the OS user that runs the rsync synchronization
After the sync completes, a recovery.conf file has been created automatically:
[postgres@node2 data]$ cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'port=5432 host=node1 user=repmgr'
3.5 Registration
3.5.1 Creating the configuration files
On node1:
[postgres@node1 ~]$ vi /opt/pgsql/repmgr/repmgr.conf
cluster=test
node=1
node_name=master
conninfo='host=node1 user=repmgr dbname=postgres'
pg_bindir=/opt/pgsql/bin
On node2:
[postgres@node2 ~]$ vi /opt/pgsql/repmgr/repmgr.conf
cluster=test
node=2
node_name=slave
conninfo='host=node2 user=repmgr dbname=postgres'
pg_bindir=/opt/pgsql/bin
3.5.2 Registering the nodes
Register the master on node1:
[postgres@node1 ~]$ repmgr -f /opt/pgsql/repmgr/repmgr.conf --verbose master register
Opening configuration file: /opt/pgsql/repmgr.conf
[2014-04-15 02:58:44] [INFO] repmgr connecting to master database
[2014-04-15 02:58:44] [INFO] repmgr connected to master, checking its state
[2014-04-15 02:58:44] [INFO] master register: creating database objects inside the repmgr_test schema
[2014-04-15 02:58:44] [NOTICE] Master node correctly registered for cluster test with id 1 (conninfo: host=node1 user=repmgr dbname=postgres)
postgres=# set search_path to repmgr_test;
SET
postgres=# \d
List of relations
Schema | Name | Type | Owner
-------------+--------------+-------+--------
repmgr_test | repl_monitor | table | repmgr
repmgr_test | repl_nodes | table | repmgr
repmgr_test | repl_status | view | repmgr
(3 rows)
repl_monitor  stores each monitoring sample
repl_nodes    stores node connection information
repl_status   a view of the current replication status
postgres=# select * from repmgr_test.repl_nodes ;
id | cluster | name | conninfo | priority | witness
----+---------+--------+----------------------------------------+----------+---------
1 | test | master | host=node1 user=repmgr dbname=postgres | 0 | f
(1 row)
Start the standby database:
[postgres@node2 ~]$ pg_ctl start
server starting
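Before registering the standby, you can confirm that streaming replication is up by querying pg_stat_replication on the master (column names as of PostgreSQL 9.3):
[postgres@node1 ~]$ psql -c "select client_addr, state, sent_location, replay_location from pg_stat_replication"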
Register the standby on node2:
[postgres@node2 ~]$ repmgr -f /opt/pgsql/repmgr/repmgr.conf --verbose standby register
Opening configuration file: /opt/pgsql/repmgr.conf
[2014-04-15 03:05:37] [INFO] repmgr connecting to standby database
[2014-04-15 03:05:37] [INFO] repmgr connected to standby, checking its state
[2014-04-15 03:05:37] [INFO] repmgr connecting to master database
[2014-04-15 03:05:37] [INFO] finding node list for cluster 'test'
[2014-04-15 03:05:37] [INFO] checking role of cluster node 'host=node1 user=repmgr dbname=postgres'
[2014-04-15 03:05:37] [INFO] repmgr connected to master, checking its state
[2014-04-15 03:05:37] [INFO] repmgr registering the standby
[2014-04-15 03:05:37] [INFO] repmgr registering the standby complete
[2014-04-15 03:05:37] [NOTICE] Standby node correctly registered for cluster test with id 2 (conninfo: host=node2 user=repmgr dbname=postgres)
postgres=# select * from repmgr_test.repl_nodes ;
id | cluster | name | conninfo | priority | witness
----+---------+--------+----------------------------------------+----------+---------
1 | test | master | host=node1 user=repmgr dbname=postgres | 0 | f
2 | test | slave | host=node2 user=repmgr dbname=postgres | 0 | f
(2 rows)
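The cluster state can also be checked from either node with repmgr's built-in status command (the same command used later, in section 5.10):
[postgres@node2 ~]$ repmgr cluster show -f /opt/pgsql/repmgr/repmgr.conf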
3.6 Monitoring test
[postgres@node2 ~]$ repmgrd -f /opt/pgsql/repmgr/repmgr.conf --verbose --monitoring-history > /opt/pgsql/repmgr/repmgr.log 2>&1 &
[postgres@node2 ~]$ tail -f /opt/pgsql/repmgr/repmgr.log
[2014-04-15 05:24:42] [INFO] repmgrd Connecting to database 'host=node2 user=repmgr dbname=postgres'
[2014-04-15 05:24:42] [INFO] repmgrd Connected to database, checking its state
[2014-04-15 05:24:42] [INFO] repmgrd Connecting to primary for cluster 'test'
[2014-04-15 05:24:42] [INFO] finding node list for cluster 'test'
[2014-04-15 05:24:42] [INFO] checking role of cluster node 'host=node1 user=repmgr dbname=postgres'
[2014-04-15 05:24:42] [INFO] repmgrd Checking cluster configuration with schema 'repmgr_test'
[2014-04-15 05:24:42] [INFO] repmgrd Checking node 2 in cluster 'test'
[2014-04-15 05:24:42] [INFO] Reloading configuration file and updating repmgr tables
[2014-04-15 05:24:42] [INFO] repmgrd Starting continuous standby node monitoring
Once the monitoring daemon starts on the standby, it opens a connection back to the master, and that connection inserts real-time monitoring data into repl_monitor on the master. You can see the backend on the master server (e.g. with ps -ef | grep repmgr):
postgres 5541 670 0 03:23 ? 00:00:00 postgres: repmgr repmgr 192.168.100.146(56388) idle
postgres=# \d repl_monitor
Table "repmgr_test.repl_monitor"
Column | Type | Modifiers
---------------------------+--------------------------+-----------
primary_node | integer | not null
standby_node | integer | not null
last_monitor_time | timestamp with time zone | not null
last_apply_time | timestamp with time zone |
last_wal_primary_location | text | not null
last_wal_standby_location | text |
replication_lag | bigint | not null
apply_lag | bigint | not null
Indexes:
"idx_repl_status_sort" btree (last_monitor_time, standby_node)
postgres=# \d repl_nodes
Table "repmgr_test.repl_nodes"
Column | Type | Modifiers
----------+---------+------------------------
id | integer | not null
cluster | text | not null
name | text | not null
conninfo | text | not null
priority | integer | not null
witness | boolean | not null default false
Indexes:
"repl_nodes_pkey" PRIMARY KEY, btree (id)
postgres=# \d repl_status
View "repmgr_test.repl_status"
Column | Type | Modifiers
---------------------------+--------------------------+-----------
primary_node | integer |
standby_node | integer |
standby_name | text |
last_monitor_time | timestamp with time zone |
last_wal_primary_location | text |
last_wal_standby_location | text |
replication_lag | text |
replication_time_lag | interval |
apply_lag | text |
communication_time_lag | interval |
Look at the definition of the repl_status view:
postgres=# select definition from pg_views where viewname = 'repl_status';
definition
---------------------------------------------------------------------------------------------------------------
SELECT repl_monitor.primary_node, +
repl_monitor.standby_node, +
repl_nodes.name AS standby_name, +
repl_monitor.last_monitor_time, +
repl_monitor.last_wal_primary_location, +
repl_monitor.last_wal_standby_location, +
pg_size_pretty(repl_monitor.replication_lag) AS replication_lag, +
age(now(), repl_monitor.last_apply_time) AS replication_time_lag, +
pg_size_pretty(repl_monitor.apply_lag) AS apply_lag, +
age(now(), +
CASE +
WHEN pg_is_in_recovery() THEN repmgr_get_last_updated() +
ELSE repl_monitor.last_monitor_time +
END) AS communication_time_lag +
FROM (repl_monitor +
JOIN repl_nodes ON ((repl_monitor.standby_node = repl_nodes.id))) +
WHERE ((repl_monitor.standby_node, repl_monitor.last_monitor_time) IN ( SELECT repl_monitor_1.standby_node,+
max(repl_monitor_1.last_monitor_time) AS max +
FROM repl_monitor repl_monitor_1 +
GROUP BY repl_monitor_1.standby_node));
(1 row)
Use the view to check the current replication status:
postgres=# select * from repl_status ;
-[ RECORD 1 ]-------------+-----------------------------
primary_node | 1
standby_node | 2
standby_name | slave
last_monitor_time | 2014-04-15 05:28:32.53065+08
last_wal_primary_location | 0/3052FF0
last_wal_standby_location | 0/3052FF0
replication_lag | 0 bytes
replication_time_lag | 00:00:03.27349
apply_lag | 0 bytes
communication_time_lag | 00:00:00.013697
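If you want to turn this view into a simple lag check, a sketch (the one-minute threshold is arbitrary; adjust it to your environment):
postgres=# select standby_name, replication_time_lag from repmgr_test.repl_status where replication_time_lag > interval '1 minute';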
Insert data into the master:
[postgres@node1 ~]$ pgbench -i -s 10 pgbench
Check the replication status while it runs:
postgres=# select * from repmgr_test.repl_status ;
-[ RECORD 1 ]-------------+------------------------------
primary_node | 1
standby_node | 2
standby_name | slave
last_monitor_time | 2014-04-15 05:43:53.368038+08
last_wal_primary_location | 0/48CC000
last_wal_standby_location | 0/4000000
replication_lag | 9008 kB
replication_time_lag | 00:00:04.031926
apply_lag | 336 bytes
communication_time_lag |
4. Failover simulation
4.1 Simulating a master failure
[postgres@node1 ~]$ pg_ctl stop -m f
Check the replication status on node2:
postgres=# select * from repmgr_test.repl_status ;
-[ RECORD 1 ]-------------+------------------------------
primary_node | 1
standby_node | 2
standby_name | slave
last_monitor_time | 2014-04-15 05:50:26.687504+08
last_wal_primary_location | 0/ADE9668
last_wal_standby_location | 0/ADE9668
replication_lag | 0 bytes
replication_time_lag | 00:01:12.366403
apply_lag | 0 bytes
communication_time_lag |
(the lag time keeps increasing)
The monitoring log outputs:
[2014-04-15 05:50:28] [WARNING] wait_connection_availability: could not receive data from connection.
[2014-04-15 05:50:28] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 60 seconds before failover decision
[2014-04-15 05:50:38] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 50 seconds before failover decision
[2014-04-15 05:50:48] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 40 seconds before failover decision
[2014-04-15 05:50:58] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 30 seconds before failover decision
[2014-04-15 05:51:08] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 20 seconds before failover decision
[2014-04-15 05:51:18] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 10 seconds before failover decision
[2014-04-15 05:51:28] [ERROR] repmgrd: We couldn't reconnect for long enough, exiting...
[2014-04-15 05:51:28] [ERROR] We couldn't reconnect to master. Now checking if another node has been promoted.
[2014-04-15 05:51:28] [INFO] finding node list for cluster 'test'
[2014-04-15 05:51:28] [INFO] checking role of cluster node 'host=node1 user=repmgr dbname=postgres'
[2014-04-15 05:51:28] [ERROR] Connection to database failed: could not connect to server: Connection refused
Is the server running on host "node1" (192.168.100.146) and accepting
TCP/IP connections on port 5432?
[2014-04-15 05:51:28] [INFO] checking role of cluster node 'host=node2 user=repmgr dbname=postgres'
[2014-04-15 05:51:28] [ERROR] We haven't found a new master, waiting before retry...
4.2 Promoting the standby
[postgres@node2 ~]$ repmgr -f /opt/pgsql/repmgr/repmgr.conf --verbose standby promote
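Confirm that node2 has left recovery; pg_is_in_recovery() should now return f (the same check used elsewhere in this article):
[postgres@node2 ~]$ psql -c "select pg_is_in_recovery()"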
4.3 Restoring node1 as a standby
[postgres@node1 ~]$ repmgr -D $PGDATA -d postgres -p 5432 -U repmgr -R postgres --verbose --force standby clone node2
Start it:
[postgres@node1 ~]$ pg_ctl start
server starting
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
4.4 Restoring node1 as master
Stop node2:
[postgres@node2 ~]$ pg_ctl stop -m f
Promote node1:
[postgres@node1 ~]$ repmgr -f /opt/pgsql/repmgr/repmgr.conf --verbose standby promote
Restore node2 as a standby:
[postgres@node2 ~]$ repmgr -D $PGDATA -d postgres -p 5432 -U repmgr -R postgres --verbose --force standby clone node1
Start the database on node2:
[postgres@node2 ~]$ pg_ctl start
server starting
postgres=# select * from repmgr_test.repl_status ;
-[ RECORD 1 ]-------------+------------------------------
primary_node | 1
standby_node | 2
standby_name | slave
last_monitor_time | 2014-04-15 06:19:58.531949+08
last_wal_primary_location | 0/10003070
last_wal_standby_location | 0/10003070
replication_lag | 0 bytes
replication_time_lag | 00:00:03.967245
apply_lag | 0 bytes
communication_time_lag |
5. Building a cluster with automatic failover
IP               HOSTNAME  PG VERSION  DIR         OS             ROLE
192.168.100.146  node1     9.3.4       /opt/pgsql  CentOS6.5_x64  master
192.168.100.150  node2     9.3.4       /opt/pgsql  CentOS6.5_x64  standby
192.168.100.190  witness   9.3.4       /opt/pgsql  CentOS6.5_x64  witness
# vi /etc/hosts
192.168.100.190 witness
192.168.100.146 node1
192.168.100.150 node2
5.1 Installing PostgreSQL
Install PostgreSQL on all three nodes; steps omitted.
5.2 Installing repmgr
Install repmgr on all three nodes; see section 2.4 for the procedure.
5.3 Setting up passwordless SSH
Make sure the postgres user on each of the three nodes can reach the others without a password; see section 3.1 for the procedure.
5.4 Configuring the master
$ vi postgresql.conf
listen_addresses = '*'
port = 5432
archive_mode = on
archive_command = 'cd .'        # 'exit 0' etc. also works; the point is a command that does nothing
wal_level = hot_standby
max_wal_senders = 10
wal_keep_segments = 5000        # must be >= 5000
hot_standby = on
shared_preload_libraries = 'repmgr_funcs'
$ vi pg_hba.conf
host all all 192.168.100.0/24 trust
host replication all 192.168.100.0/24 trust
Start the database:
[postgres@node1 ~]$ pg_ctl start
server starting
Create the repmgr user on node1:
[postgres@node1 ~]$ createuser -s repmgr
Create the repmgr database on node1:
[postgres@node1 ~]$ createdb -O repmgr repmgr
Test from node2:
[postgres@node2 ~]$ psql -h node1 -U repmgr -d postgres -c "select version()"
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit
(1 row)
5.5 Cloning the standby database
[postgres@node2 ~]$ repmgr -D $PGDATA -d repmgr -p 5432 -U repmgr -R postgres --verbose standby clone node1
Start the database:
[postgres@node2 ~]$ pg_ctl start
server starting
5.6 Configuring repmgr
node1:
[postgres@node1 ~]$ vi /opt/pgsql/repmgr/repmgr.conf
cluster=my_cluster
node=1
node_name=node1
conninfo='host=node1 dbname=repmgr user=repmgr'
master_response_timeout=60
reconnect_attempts=6
reconnect_interval=10
failover=automatic
promote_command='repmgr standby promote -f /opt/pgsql/repmgr/repmgr.conf'
pg_bindir=/opt/pgsql/bin
node2:
[postgres@node2 ~]$ vi /opt/pgsql/repmgr/repmgr.conf
cluster=my_cluster
node=2
node_name=node2
conninfo='host=node2 dbname=repmgr user=repmgr'
master_response_timeout=60
reconnect_attempts=6
reconnect_interval=10
failover=automatic
promote_command='repmgr standby promote -f /opt/pgsql/repmgr/repmgr.conf'
pg_bindir=/opt/pgsql/bin
witness:
[postgres@witness ~]$ vi /opt/pgsql/repmgr/repmgr.conf
cluster=my_cluster
node=3
node_name=witness
conninfo='host=witness dbname=postgres user=postgres port=5499'
master_response_timeout=60
reconnect_attempts=6
reconnect_interval=10
failover=automatic
promote_command='repmgr standby promote -f /opt/pgsql/repmgr/repmgr.conf'
pg_bindir=/opt/pgsql/bin
5.7 Registering the master and standby
[postgres@node1 ~]$ repmgr -f /opt/pgsql/repmgr/repmgr.conf --verbose master register
[postgres@node2 ~]$ repmgr -f /opt/pgsql/repmgr/repmgr.conf --verbose standby register
5.8 Initializing the witness
[postgres@witness ~]$ repmgr -d repmgr -U repmgr -h node1 -D $PGDATA -f /opt/pgsql/repmgr/repmgr.conf witness create
After initialization the witness database is started automatically.
Connect to the postgres database on the witness and look at the node information:
postgres=# select * from repmgr_my_cluster.repl_nodes ;
id | cluster | name | conninfo | priority | witness
----+------------+-------+--------------------------------------+----------+---------
1 | my_cluster | node1 | host=node1 dbname=repmgr user=repmgr | 0 | f
(1 row)
Connect to the repmgr database on node1 and look at the node information:
repmgr=# select * from repmgr_my_cluster.repl_nodes ;
id | cluster | name | conninfo | priority | witness
----+------------+---------+--------------------------------------------------+----------+---------
1 | my_cluster | node1 | host=node1 dbname=repmgr user=repmgr | 0 | f
2 | my_cluster | node2 | host=node2 dbname=repmgr user=repmgr | 0 | f
3 | my_cluster | witness | host=witness dbname=repmgr user=repmgr port=5499 | 0 | t
(3 rows)
5.9 Starting monitoring
Start the monitoring daemon on node2:
repmgrd -f /opt/pgsql/repmgr/repmgr.conf --verbose --monitoring-history > /opt/pgsql/repmgr/repmgr.log 2>&1 &
5.10 Simulating a master failure
Query on node2:
[postgres@node2 ~]$ psql -c "select pg_is_in_recovery()"
pg_is_in_recovery
-------------------
t
(1 row)
{this shows node2 is in recovery}
Stop the master:
[postgres@node1 ~]$ pg_ctl stop -m f
The repmgr log on node2 then shows:
FATAL: terminating connection due to administrator command
[2014-04-15 08:57:07] [WARNING] Can't stop current query: PQcancel() -- connect() failed: Connection refused
[2014-04-15 08:57:07] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 60 seconds before failover decision
[2014-04-15 08:57:17] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 50 seconds before failover decision
[2014-04-15 08:57:27] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 40 seconds before failover decision
[2014-04-15 08:57:37] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 30 seconds before failover decision
[2014-04-15 08:57:47] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 20 seconds before failover decision
[2014-04-15 08:57:57] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 10 seconds before failover decision
[2014-04-15 08:58:07] [ERROR] repmgrd: We couldn't reconnect for long enough, exiting...
[2014-04-15 08:58:08] [ERROR] Connection to database failed: could not connect to server: Connection refused
Is the server running on host "node1" (192.168.100.146) and accepting
TCP/IP connections on port 5432?
[2014-04-15 08:58:13] [INFO] repmgrd: This node is the best candidate to be the new primary, promoting...
[2014-04-15 08:58:13] [ERROR] Connection to database failed: could not connect to server: Connection refused
Is the server running on host "node1" (192.168.100.146) and accepting
TCP/IP connections on port 5432?
[2014-04-15 08:58:13] [NOTICE] repmgr: Promoting standby
[2014-04-15 08:58:13] [NOTICE] repmgr: restarting server using /opt/pgsql/bin/pg_ctl
[2014-04-15 08:58:15] [ERROR] repmgr: STANDBY PROMOTE successful. You should REINDEX any hash indexes you have.
[2014-04-15 08:58:17] [INFO] repmgrd Checking cluster configuration with schema 'repmgr_my_cluster'
[2014-04-15 08:58:17] [INFO] repmgrd Checking node 2 in cluster 'my_cluster'
[2014-04-15 08:58:17] [INFO] Reloading configuration file and updating repmgr tables
[2014-04-15 08:58:17] [INFO] repmgrd Starting continuous primary connection check
The log output above indicates that the failover has happened; confirm the state on node2:
[postgres@node2 repmgr]$ psql -c "select pg_is_in_recovery()"
pg_is_in_recovery
-------------------
f
(1 row)
Check the node information on node2:
[postgres@node2 ~]$ repmgr cluster show -f /opt/pgsql/repmgr/repmgr.conf
Role | Connection String
[2014-04-15 09:01:59] [ERROR] Connection to database failed: could not connect to server: Connection refused
Is the server running on host "node1" (192.168.100.146) and accepting
TCP/IP connections on port 5432?
FAILED | host=node1 dbname=repmgr user=repmgr
witness | host=witness dbname=postgres user=postgres port=5499
* master | host=node2 dbname=repmgr user=repmgr
5.11 Restoring node1 as master
Kill the monitoring daemon started earlier on node2:
[postgres@node2 repmgr]$ kill -9 `pidof repmgrd`
Restore node1 as a standby:
[postgres@node1 ~]$ repmgr -D $PGDATA -d repmgr -p 5432 -U repmgr -R postgres --verbose --force standby clone node2
Start the database on node1:
[postgres@node1 ~]$ pg_ctl start
server starting
[postgres@node1 ~]$ psql -c "select pg_is_in_recovery()"
pg_is_in_recovery
-------------------
t
(1 row)
Start the monitoring daemon on node1:
[postgres@node1 ~]$ repmgrd -f /opt/pgsql/repmgr/repmgr.conf --verbose --monitoring-history > /opt/pgsql/repmgr/repmgr.log 2>&1 &
Stop the database on node2:
[postgres@node2 ~]$ pg_ctl stop -m f
waiting for server to shut down.... done
server stopped
At this point another failover is triggered and the master role moves from node2 back to node1. Alternatively, instead of waiting for the automatic failover, you can run the promote by hand, as shown below.
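The manual promote is the same command used in section 4.2:
[postgres@node1 ~]$ repmgr -f /opt/pgsql/repmgr/repmgr.conf --verbose standby promote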
Restore node2 as a standby:
[postgres@node2 ~]$ repmgr -D $PGDATA -d repmgr -p 5432 -U repmgr -R postgres --verbose --force standby clone node1
Start the database on node2:
[postgres@node2 ~]$ pg_ctl start
server starting
[postgres@node2 ~]$ psql -c "select pg_is_in_recovery()"
pg_is_in_recovery
-------------------
t
(1 row)
Restart the monitoring daemon on node2:
[postgres@node2 ~]$ repmgrd -f /opt/pgsql/repmgr/repmgr.conf --verbose --monitoring-history > /opt/pgsql/repmgr/repmgr.log 2>&1 &
Check the replication status:
repmgr=# select * from repmgr_my_cluster.repl_status ;
-[ RECORD 1 ]-------------+------------------------------
primary_node | 1
standby_node | 2
standby_name | node2
last_monitor_time | 2014-04-15 09:22:36.838084+08
last_wal_primary_location | 0/C003088
last_wal_standby_location | 0/C003088
replication_lag | 0 bytes
replication_time_lag | 00:00:04.119423
apply_lag | 0 bytes
communication_time_lag | 00:00:00.06189
-[ RECORD 2 ]-------------+------------------------------
primary_node | 2
standby_node | 1
standby_name | node1
last_monitor_time | 2014-04-15 09:15:39.482578+08
last_wal_primary_location | 0/9008848
last_wal_standby_location | 0/9008848
replication_lag | 0 bytes
replication_time_lag | 00:06:58.398443
apply_lag | 0 bytes
communication_time_lag | 00:06:57.417396
Note: to keep stale monitoring data from getting in the way, you can run a cleanup (repmgr cluster cleanup -f repmgr.conf). Cleanup also works as a routine maintenance task, so that monitoring data does not eat too much disk space; add the -k option to specify how many days of history to keep.
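For example (a sketch; -k 7 keeps seven days of monitoring history):
[postgres@node2 ~]$ repmgr cluster cleanup -f /opt/pgsql/repmgr/repmgr.conf -k 7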
6. Configuration parameters (repmgr.conf)
cluster: a name for the managed cluster, e.g. test.
node: the node ID.
node_name: the node's name, e.g. master, standby, node1, ...
conninfo: the database connection info, including host, user, dbname, port, etc.
rsync_options: options passed to the rsync command; the default is --archive --checksum --compress --progress --rsh="ssh -o \"StrictHostKeyChecking no\"".
ssh_options: options for the ssh connection.
master_response_timeout: maximum time to wait for a response from the master, default 60s.
reconnect_attempts: number of reconnection attempts once the master stops responding, default 6.
reconnect_interval: interval between reconnection attempts, default 10s. (With the defaults, reconnect_attempts × reconnect_interval gives the 60-second countdown visible in the repmgrd logs above.)
failover: whether failover is manual or automatic; manual (the default) or automatic.
priority: with multiple standbys, the priority for promotion to master. The default -1 disables promotion; with a single standby there is no need to set it.
promote_command: the promote command executed when failover is triggered, e.g. 'repmgr standby promote -f /path/to/repmgr.conf'. It must return 0 on success. It can also point to a script, for more complex behavior.
follow_command: the follow command executed when failover is triggered, e.g. 'repmgr standby follow -f /path/to/repmgr.conf -W'. With multiple standbys, the ones not promoted run this to re-attach to the new master; with a single standby there is no need to set it. It must return 0 on success. It can also point to a script, for more complex behavior.
loglevel: the log level: DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG; default NOTICE.
logfacility: the log output facility; default STDERR (standard error).
pg_bindir: the path to PostgreSQL's pg_ctl binary.
pg_ctl_options: extra options for the pg_ctl command, e.g. '-s'.
logfile: the log file, e.g. '/var/log/repmgr.log'.
monitor_interval_secs: the monitoring interval; default every 2s.
retry_promote_interval_secs: if promotion fails when the master goes down, wait this many seconds before retrying, up to 6 attempts in total, then give up. Default 300, i.e. five minutes between retries.
7. Summary and notes
1. repmgr makes creating standbys and restoring a failed master simple; each takes a single command. File synchronization is done with rsync, so rsync must be installed on the servers;
2. During a standby promote, recovery.conf is first renamed to recovery.done and the standby is restarted (pg_ctl -D $PGDATA -m fast restart); after startup it is the new master;
3. For high availability, repmgr is simpler to install and configure than hot-standby HA built with pgpool-II or linux-ha, but it does require an additional witness node;
4. After a failover, the remaining standbys can automatically re-attach to the new master, which is what lets repmgr support multiple standbys;
5. On setting promote_command: with PostgreSQL >= 9.1 you no longer need to configure it the way the repmgr documentation suggests, because 9.1 added pg_ctl promote, which promotes a standby to master without restarting it (see the sketch after this list);
6. Detecting node failure and finding the new master are entirely repmgrd's job, so automatic failover requires the repmgrd daemon to be running before the failure occurs;
7. On IP switching: with only the configuration tested above, automatic failover works, but applications can no longer reach the database because its IP address has changed. On top of this setup you can add the pgbouncer connection pooler and put the pgbouncer reconfiguration into the promote_command script; then applications keep their connection settings unchanged after a failover. (See section 8 for the implementation.)
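A minimal promote_command along the lines of item 5, assuming PostgreSQL >= 9.1 and the paths used in this article:
promote_command='/opt/pgsql/bin/pg_ctl -D /opt/pgsql/data promote'
The failover.sh script attached in section 11 invokes pg_ctl promote the same way.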
8. Automatic IP switching
The experiment below continues from section 5.
8.1 Installing pgbouncer
Install pgbouncer on the witness node.
Download: http://pgfoundry.org/frs/?group_id=1000258
8.1.1 Installing libevent
Download: http://libevent.org/
[root@witness ~]# tar -zxvf libevent-2.0.21-stable.tar.gz
[root@witness ~]# cd libevent-2.0.21-stable
[root@witness libevent-2.0.21-stable]# ./configure --prefix=/opt/libevent
[root@witness libevent-2.0.21-stable]# make
[root@witness libevent-2.0.21-stable]# make install
8.1.2 Installing pgbouncer
[root@witness ~]# tar -zxvf pgbouncer-1.5.4.tar.gz
[root@witness ~]# cd pgbouncer-1.5.4
[root@witness pgbouncer-1.5.4]# ./configure --prefix=/opt/pgbouncer --with-libevent=/opt/libevent/
[root@witness pgbouncer-1.5.4]# make
[root@witness pgbouncer-1.5.4]# make install
Change ownership of the pgbouncer directory:
[root@witness ~]# chown -R postgres:postgres /opt/pgbouncer/
Update the postgres user's environment variables, adding the pgbouncer path and libevent's lib path:
[root@witness ~]# su - postgres
[postgres@witness ~]$ vi .bash_profile
export PATH=/opt/pgbouncer/bin:/opt/pgsql/bin:$PATH:$HOME/bin
export PGDATA=/opt/pgsql/data
export PGUSER=postgres
export PGPORT=5499
export LD_LIBRARY_PATH=/opt/libevent/lib:/opt/pgsql/lib:$LD_LIBRARY_PATH
Apply the changes:
[postgres@witness ~]$ source .bash_profile
Test:
[postgres@witness ~]$ pgbouncer -V
pgbouncer version 1.5.4 (compiled by <root@witness> at 2014-04-16 06:50:09)
8.1.3 Configuring pgbouncer
[postgres@witness ~]$ cp /opt/pgbouncer/share/doc/pgbouncer/pgbouncer.ini /opt/pgbouncer/
[postgres@witness ~]$ vi /opt/pgbouncer/pgbouncer.ini
[databases]
masterdb = host=192.168.100.146 port=5432 dbname=postgres user=postgres
[pgbouncer]
logfile = /opt/pgbouncer/pgbouncer.log
pidfile = /opt/pgbouncer/pgbouncer.pid
listen_addr = *
listen_port = 6432
auth_type = trust
auth_file = /opt/pgbouncer/userlist.txt
admin_users = pgbouncer
pool_mode = session
Set up the userlist:
[postgres@witness ~]$ cp /opt/pgbouncer/share/doc/pgbouncer/userlist.txt /opt/pgbouncer/
[postgres@witness ~]$ vi /opt/pgbouncer/userlist.txt
"postgres" "123456"
"pgbouncer" "123456"
8.1.4 Starting pgbouncer
[postgres@witness ~]$ pgbouncer -d /opt/pgbouncer/pgbouncer.ini
2014-04-16 07:12:48.287 22662 LOG File descriptor limit: 1024 (H:4096), max_client_conn: 100, max fds possible: 130
8.1.5 Testing
Connect from another server to test:
[highgo@lx-pc ~]$ psql -h 192.168.100.190 -p 6432 -U postgres masterdb
psql (9.0.9, server 9.3.4)
WARNING: psql version 9.0, server version 9.3.
Some psql features might not work.
Type "help" for help.
masterdb=# \l
List of databases
Name | Owner | Encoding | Collation | Ctype | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
repmgr | repmgr | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(4 rows)
masterdb=# SELECT pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
Connect to pgbouncer's own admin database:
[highgo@lx-pc ~]$ psql -h 192.168.100.190 -p 6432 -U pgbouncer pgbouncer
psql (9.0.9, server 1.5.4/bouncer)
WARNING: psql version 9.0, server version 1.5.
Some psql features might not work.
Type "help" for help.
pgbouncer=# show help;
NOTICE: Console usage
DETAIL:
SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|VERSION
SHOW STATS|FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM
SHOW DNS_HOSTS|DNS_ZONES
SET key = arg
RELOAD
PAUSE [<db>]
RESUME [<db>]
KILL <db>
SUSPEND
SHUTDOWN
SHOW
pgbouncer=# show clients;
 type |   user    | database  | state  |      addr       | port  |   local_addr    | local_port |    connect_time     |    request_time     |    ptr    | link
------+-----------+-----------+--------+-----------------+-------+-----------------+------------+---------------------+---------------------+-----------+------
 C    | pgbouncer | pgbouncer | active | 192.168.100.108 | 63984 | 192.168.100.190 |       6432 | 2014-04-16 07:27:08 | 2014-04-16 07:27:52 | 0x15f4a30 |
(1 row)
pgbouncer setup is now complete.
8.2 Configuring repmgr
8.2.1 Changing promote_command
In repmgr.conf on all three nodes, change promote_command to:
promote_command='/opt/pgsql/repmgr/failover.sh'
8.2.2 Setting up the failover.sh script
Create the failover.sh script on node1 and node2 and make it executable (see the commands after this list).
The script body is attached at the end of the article. Note that the following setting differs between the two nodes:
node1:
MASTER_IP=192.168.100.146
node2:
MASTER_IP=192.168.100.150
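Granting execute permission (the step mentioned above):
[postgres@node1 ~]$ chmod +x /opt/pgsql/repmgr/failover.sh
[postgres@node2 ~]$ chmod +x /opt/pgsql/repmgr/failover.sh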
8.3 Simulating a failover
First check from a client that connections currently work:
[highgo@lx-pc ~]$ psql -h 192.168.100.190 -p 6432 -U postgres masterdb -c "select pg_is_in_recovery()"
pg_is_in_recovery
-------------------
f
(1 row)
[highgo@lx-pc ~]$ psql -h 192.168.100.190 -p 6432 -U pgbouncer pgbouncer -c "show databases"
name | host | port | database | force_user | pool_size | reserve_pool
-----------+-----------------+------+-----------+------------+-----------+--------------
masterdb | 192.168.100.146 | 5432 | postgres | postgres | 20 | 0
pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0
(2 rows)
Stop the database on node1:
[postgres@node1 repmgr]$ pg_ctl stop -m f
Watch the log on node2; it outputs the following:
FATAL: terminating connection due to administrator command
[2014-04-17 05:01:06] [WARNING] wait_connection_availability: could not receive data from connection.
[2014-04-17 05:01:06] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 60 seconds before failover decision
[2014-04-17 05:01:16] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 50 seconds before failover decision
[2014-04-17 05:01:26] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 40 seconds before failover decision
[2014-04-17 05:01:36] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 30 seconds before failover decision
[2014-04-17 05:01:46] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 20 seconds before failover decision
[2014-04-17 05:01:56] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 10 seconds before failover decision
[2014-04-17 05:02:06] [ERROR] repmgrd: We couldn't reconnect for long enough, exiting...
[2014-04-17 05:02:06] [ERROR] Connection to database failed: could not connect to server: Connection refused
Is the server running on host "node1" (192.168.100.146) and accepting
TCP/IP connections on port 5432?
[2014-04-17 05:02:11] [INFO] repmgrd: This node is the best candidate to be the new primary, promoting...
2014-04-17 05:02:11 server promoting
2014-04-17 05:02:11 FAILOVER-ERROR: the db is still in recovery!
2014-04-17 05:02:12 FAILOVER-INFO: promote successful!
2014-04-17 05:02:13 FAILOVER-INFO: change pgbouncer.ini successful!
2014-04-17 05:02:13.050 32668 LOG File descriptor limit: 1024 (H:4096), max_client_conn: 100, max fds possible: 130
2014-04-17 05:02:13.051 32668 LOG takeover_init: launching connection
2014-04-17 05:02:13.052 32668 LOG S-0xa9ba40: pgbouncer/pgbouncer@unix :6432 new connection to server
2014-04-17 05:02:13.052 32668 LOG S-0xa9ba40: pgbouncer/pgbouncer@unix :6432 Login OK, sending SUSPEND
2014-04-17 05:02:13.053 32668 LOG SUSPEND finished, sending SHOW FDS
2014-04-17 05:02:13.054 32668 LOG got pooler socket: 0.0.0.0@6432
2014-04-17 05:02:13.054 32668 LOG got pooler socket: unix@6432
2014-04-17 05:02:13.054 32668 LOG SHOW FDS finished
2014-04-17 05:02:13.055 32668 LOG disko over, going background
2014-04-17 05:02:13 FAILOVER-INFO: pgbouncer reload successful!
################################# The New Conn_info ####################################
name | host | port | database | force_user | pool_size | reserve_pool
-----------+-----------------+------+-----------+------------+-----------+--------------
masterdb | 192.168.100.150 | 5432 | postgres | postgres | 20 | 0
pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0
(2 rows)
########################################################################################
Check the connection from the client again:
[highgo@lx-pc ~]$ psql -h 192.168.100.190 -p 6432 -U postgres masterdb -c "select pg_is_in_recovery()"
pg_is_in_recovery
-------------------
f
(1 row)
[highgo@lx-pc ~]$ psql -h 192.168.100.190 -p 6432 -U pgbouncer pgbouncer -c "show databases"
name | host | port | database | force_user | pool_size | reserve_pool
-----------+-----------------+------+-----------+------------+-----------+--------------
masterdb | 192.168.100.150 | 5432 | postgres | postgres | 20 | 0
pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0
(2 rows)
{connections have automatically been switched over to node2}
9. References
Official site: http://www.repmgr.org/
Documentation: https://github.com/2ndQuadrant/repmgr
10. License
GPL V3
11. Appendix
The failover.sh script:
#!/bin/bash
# Created by lianshunke@highgo.com.cn 2014/04/16
# Does three things:
# 1. Promote the standby.
# 2. Change pgbouncer.ini on the pgbouncer server.
# 3. Restart pgbouncer on the pgbouncer server.
PGHOME=/opt/pgsql
PGBIN=$PGHOME/bin
PGDATA=$PGHOME/data
PGPORT=5432
PGUSER=postgres
LOG_FILE=/opt/pgsql/repmgr/failover.log
BOUN_SERVER=witness
BOUN_FILE=/opt/pgbouncer/pgbouncer.ini
BOUN_LISTEN_PORT=6432
BOUN_ADMIN_USER=pgbouncer
# STANDBY_IP: sed pattern matching the failed node's host= entry in pgbouncer.ini
STANDBY_IP=192.168.100.*
# MASTER_IP: this node's own IP, which becomes the new master
MASTER_IP=192.168.100.146
CONN_INFO="user=postgres port=5432 dbname=postgres"

TIME=`date '+%Y-%m-%d %H:%M:%S'`
echo -n "$TIME " >> $LOG_FILE
$PGBIN/pg_ctl -D $PGDATA promote >> $LOG_FILE
if [ $? == 0 ]; then
    # Poll until the database has actually left recovery before touching pgbouncer
    IF_RECOVERY=" t"
    while [ "$IF_RECOVERY" = " t" ]; do
        TIME=`date '+%Y-%m-%d %H:%M:%S'`
        IF_RECOVERY=`psql -c "select pg_is_in_recovery()" | sed -n '3,3p'`
        if [ "$IF_RECOVERY" = " f" ]; then
            echo "$TIME FAILOVER-INFO: promote successful!" >> $LOG_FILE
            TIME=`date '+%Y-%m-%d %H:%M:%S'`
            # Rewrite pgbouncer.ini so the masterdb entry points at this node
            echo "sed -i 's/host=$STANDBY_IP/host=$MASTER_IP $CONN_INFO/g' $BOUN_FILE" | ssh $PGUSER@$BOUN_SERVER bash
            if [ $? == 0 ]; then
                echo "$TIME FAILOVER-INFO: change pgbouncer.ini successful!" >> $LOG_FILE
                # $PGBIN/psql -h $BOUN_SERVER -p $BOUN_LISTEN_PORT -U $BOUN_ADMIN_USER pgbouncer -c "reload" > /dev/null
                # TIME=`date '+%Y-%m-%d %H:%M:%S'`
                # echo "$TIME " >> $LOG_FILE
                # pgbouncer -R: online restart, taking over the running instance's sockets
                ssh postgres@witness ". ~/.bash_profile;pgbouncer -R -d $BOUN_FILE" &>> $LOG_FILE
                if [ $? == 0 ]; then
                    TIME=`date '+%Y-%m-%d %H:%M:%S'`
                    echo "$TIME FAILOVER-INFO: pgbouncer reload successful!" >> $LOG_FILE
                    echo "################################# The New Conn_info ####################################" >> $LOG_FILE
                    # Ensure the auth_type = trust
                    $PGBIN/psql -h $BOUN_SERVER -p $BOUN_LISTEN_PORT -U $BOUN_ADMIN_USER pgbouncer -c "show databases" >> $LOG_FILE
                    echo "########################################################################################" >> $LOG_FILE
                else
                    echo "$TIME FAILOVER-ERROR: pgbouncer reload failed!" >> $LOG_FILE
                fi
            else
                echo "$TIME FAILOVER-ERROR: change pgbouncer.ini failed!" >> $LOG_FILE
            fi
        else
            echo "$TIME FAILOVER-ERROR: the db is still in recovery! Sleep 1s and Retry..." >> $LOG_FILE
            sleep 1
        fi
    done
else
    echo "$TIME FAILOVER-ERROR: promote failed!" >> $LOG_FILE
fi
Source: oschina
Link: https://my.oschina.net/u/1011289/blog/223896