实验环境:centos7.4纯净版
postgres版本: 9.6.15
etcd版本:3.3.11
patroni版本:1.6.0
patroni介绍可参考:https://github.com/zalando/patroni
ip规划
192.168.216.130 node1 master
192.168.216.132 node2 slave
192.168.216.134 node3 slave
etcd集群部署请看上一篇文章:https://www.cnblogs.com/caidingyu/p/11408389.html
postgres部署参考文章:https://www.cnblogs.com/virtulreal/p/9921978.html
修改node1中postgresql.conf配置如下
max_connections = '100'
max_wal_senders = '10'
port = '5432'
listen_addresses = '0.0.0.0'
synchronous_commit = on
full_page_writes = on
wal_log_hints = on
synchronous_standby_names = '*'
max_replication_slots = 10
wal_level = replica
修改node1中pg_hba.conf配置如下
[root@localhost data]# more pg_hba.conf|grep -v ^#|grep -v ^$
local all all peer
host all all 127.0.0.1/32 md5
host all postgres 127.0.0.1/32 md5
host all all 192.168.216.0/24 md5
host all all ::1/128 md5
local replication replicator peer
host replication replicator 127.0.0.1/32 md5
host replication replicator ::1/128 md5
host replication replicator 192.168.216.130/32 md5
host replication replicator 192.168.216.132/32 md5
host replication replicator 192.168.216.134/32 md5
node1上创建复制槽,至关重要,patroni 用到了这个玩意
postgres=# create user replicator replication login encrypted password '1qaz2wsx';
postgres=# alter user postgres with password '1qaz2wsx';
postgres=# select * from pg_create_physical_replication_slot('pgsql96_node1');
postgres=# select * from pg_create_physical_replication_slot('pgsql96_node2');
postgres=# select * from pg_create_physical_replication_slot('pgsql96_node3');
node2 配置stream replication
systemctl stop postgresql-9.6
su - postgres
cd /var/lib/pgsql/9.6/data
rm -rf ./*
/usr/pgsql-9.6/bin/pg_basebackup -h 192.168.216.130 -D /var/lib/pgsql/9.6/data -U replicator -v -P -R
vi recovery.conf
recovery_target_timeline = 'latest'
standby_mode = 'on'
primary_conninfo = 'host=192.168.216.130 port=5432 user=replicator password=1qaz2wsx'
primary_slot_name = 'pgsql96_node2'
trigger_file = '/tmp/postgresql.trigger.5432'
执行exit返回root用户
systemctl start postgresql-9.6
node3 配置stream replication
systemctl stop postgresql-9.6
su - postgres
cd /var/lib/pgsql/9.6/data
rm -rf ./*
/usr/pgsql-9.6/bin/pg_basebackup -h 192.168.216.130 -D /var/lib/pgsql/9.6/data -U replicator -v -P -R
vi recovery.conf
recovery_target_timeline = 'latest'
standby_mode = 'on'
primary_conninfo = 'host=192.168.216.130 port=5432 user=replicator password=1qaz2wsx'
primary_slot_name = 'pgsql96_node3'
trigger_file = '/tmp/postgresql.trigger.5432'
执行exit返回root用户
systemctl start postgresql-9.6
在node1上连接数据库,查看复制状态
select client_addr,pg_xlog_location_diff(sent_location, write_location) as write_delay,pg_xlog_location_diff(sent_location, flush_location) as flush_delay,pg_xlog_location_diff(sent_location, replay_location) as replay_delay from pg_stat_replication;
下载、安装 patroni,如遇网络问题可以多执行几次pip install或者切换其他pip源
yum install gcc
yum install python-devel.x86_64
cd /tmp
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py
pip install psycopg2-binary
pip install patroni[etcd,consul]
验证是否安装成功
which patroni
patronictl --help
node1上patroni 配置文件如下,该配置文件需要手动创建
mkdir -p /usr/patroni/conf
cd /usr/patroni/conf/
cat /usr/patroni/conf/patroni_postgresql.yml
scope: pgsql96
namespace: /pgsql/
name: pgsql96_node1
restapi:
listen: 192.168.216.130:8008
connect_address: 192.168.216.130:8008
etcd:
host: 192.168.216.130:2379
bootstrap:
# this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
# and all other cluster members will use it as a `global configuration`
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
master_start_timeout: 300
synchronous_mode: false
postgresql:
use_pg_rewind: true
use_slots: true
parameters:
listen_addresses: "0.0.0.0"
port: 5432
wal_level: logical
hot_standby: "on"
wal_keep_segments: 1000
max_wal_senders: 10
max_replication_slots: 10
wal_log_hints: "on"
# archive_mode: "on"
# archive_timeout: 1800s
# archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz
# recovery_conf:
# restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.216.130:5432
data_dir: /var/lib/pgsql/9.6/data
bin_dir: /usr/pgsql-9.6/bin
# config_dir: /etc/postgresql/9.6/main
authentication:
replication:
username: replicator
password: 1qaz2wsx
superuser:
username: postgres
password: 1qaz2wsx
#watchdog:
# mode: automatic # Allowed values: off, automatic, required
# device: /dev/watchdog
# safety_margin: 5
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
node2上patroni 配置文件如下
[root@localhost etcd]# cat /usr/patroni/conf/patroni_postgresql.yml
scope: pgsql96
namespace: /pgsql/
name: pgsql96_node2
restapi:
listen: 192.168.216.132:8008
connect_address: 192.168.216.132:8008
etcd:
host: 192.168.216.132:2379
bootstrap:
# this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
# and all other cluster members will use it as a `global configuration`
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
master_start_timeout: 300
synchronous_mode: false
postgresql:
use_pg_rewind: true
use_slots: true
parameters:
listen_addresses: "0.0.0.0"
port: 5432
wal_level: logical
hot_standby: "on"
wal_keep_segments: 1000
max_wal_senders: 10
max_replication_slots: 10
wal_log_hints: "on"
# archive_mode: "on"
# archive_timeout: 1800s
# archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz
# recovery_conf:
# restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.216.132:5432
data_dir: /var/lib/pgsql/9.6/data
bin_dir: /usr/pgsql-9.6/bin
# config_dir: /etc/postgresql/9.6/main
authentication:
replication:
username: replicator
password: 1qaz2wsx
superuser:
username: postgres
password: 1qaz2wsx
#watchdog:
# mode: automatic # Allowed values: off, automatic, required
# device: /dev/watchdog
# safety_margin: 5
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
node3上patroni 配置文件如下
scope: pgsql96
namespace: /pgsql/
name: pgsql96_node3
restapi:
listen: 192.168.216.134:8008
connect_address: 192.168.216.134:8008
etcd:
host: 192.168.216.134:2379
bootstrap:
# this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
# and all other cluster members will use it as a `global configuration`
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
master_start_timeout: 300
synchronous_mode: false
postgresql:
use_pg_rewind: true
use_slots: true
parameters:
listen_addresses: "0.0.0.0"
port: 5432
wal_level: logical
hot_standby: "on"
wal_keep_segments: 1000
max_wal_senders: 10
max_replication_slots: 10
wal_log_hints: "on"
# archive_mode: "on"
# archive_timeout: 1800s
# archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz
# recovery_conf:
# restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.216.134:5432
data_dir: /var/lib/pgsql/9.6/data
bin_dir: /usr/pgsql-9.6/bin
# config_dir: /etc/postgresql/9.6/main
authentication:
replication:
username: replicator
password: 1qaz2wsx
superuser:
username: postgres
password: 1qaz2wsx
#watchdog:
# mode: automatic # Allowed values: off, automatic, required
# device: /dev/watchdog
# safety_margin: 5
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
手动启动 patroni,注意这里需要切换到postgres用户下启动
node1、node2、node3 三个节点依次启动
patroni /usr/patroni/conf/patroni_postgresql.yml
查看 patroni 集群状态
克隆一个窗口,执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml list
查看 etcd 的 信息
etcdctl ls /pgsql/pgsql96
etcdctl get /pgsql/pgsql96/members/pgsql96_node1
为了方便开机自启,故配置成 patroni.service,3个node都需要进行配置,配置好patroni.service后就可以直接在root用户下切换Leader以及重启postgres节点等操作
[root@localhost data]# vi /etc/systemd/system/patroni.service
[root@localhost data]# cat /etc/systemd/system/patroni.service
[Unit]
Description=patroni - a high-availability PostgreSQL
Documentation=https://patroni.readthedocs.io/en/latest/index.html
After=syslog.target network.target etcd.target
Wants=network-online.target
[Service]
Type=simple
User=postgres
Group=postgres
PermissionsStartOnly=true
ExecStart=/usr/bin/patroni /usr/patroni/conf/patroni_postgresql.yml
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536
KillMode=process
KillSignal=SIGINT
Restart=on-abnormal
RestartSec=30s
TimeoutSec=0
[Install]
WantedBy=multi-user.target
禁止 postgresql 的自启动,通过 patroni 来管理 postgresql
systemctl status patroni
systemctl start patroni
systemctl enable patroni
systemctl status postgresql
systemctl disable postgresql
systemctl status etcd
systemctl enable etcd
如何手动切换Leader
执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml switchover
多执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml list进行刷新,可以看到Leader由node2切换为node1
可以尝试模拟node节点断电,即手动停止Leader所在节点的patroni服务,此时查询集群状态,可以看到Leader自动切换到其他node节点上
目前可以看到Leader位于132节点上,
在132节点上停止patroni服务
再次查看集群状态,可以看到132被stop,Leader自动被转移到134上
参考:https://blog.csdn.net/ctypyb2002/article/details/81007990
https://www.cnblogs.com/ctypyb2002/p/9792939.html
来源:oschina
链接:https://my.oschina.net/u/4384187/blog/3415541