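Redis Sentinel Cluster Setup
Before setting up the Sentinel cluster, set up Redis master-slave replication first (see the previous post, "Redis master-slave replication: detailed steps").
A Sentinel deployment needs at least 3 Sentinel instances. This one builds on the previous post and is a pseudo-cluster: all three Sentinels run on a single host.
On server 192.168.172.21, create the directories shown below.
Then copy redis-sentinel.conf from the Redis source tree into each port directory and rename it after the port.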
[root@localhost ~]# mkdir -p /opt/redis/redis-sentinel/{logs,sentinel_26379,sentinel_26380,sentinel_26381}
[root@localhost ~]# tree /opt/redis/
/opt/redis/
├── redis-5.0.7
│   ├── bin
│   │   ├── redis-benchmark
│   │   ├── redis-check-aof
│   │   ├── redis-check-rdb
│   │   ├── redis-cli
│   │   ├── redis-sentinel -> redis-server
│   │   └── redis-server
│   ├── conf
│   │   └── redis.conf
│   ├── logs
│   │   └── redis_6379.log
│   └── redis_6379.pid
└── redis-sentinel
    ├── logs
    │   └── sentinel_26379.log
    ├── sentinel_26379
    │   └── sentinel_26379.conf
    ├── sentinel_26379.pid
    ├── sentinel_26380
    │   └── sentinel_26380.conf
    └── sentinel_26381
        └── sentinel_26381.conf
9 directories, 14 files
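Edit the configuration file sentinel_26379.conf
# Run as a daemon
daemonize yes
pidfile /opt/redis/redis-sentinel/sentinel_26379.pid
logfile "/opt/redis/redis-sentinel/logs/sentinel_26379.log"
sentinel monitor mymaster 192.168.172.11 6379 2
# mymaster is the alias given to the monitored master
# It is followed by the master's IP and port
# 2 is the quorum: once 2 Sentinels report the master as subjectively down (+sdown),
# the master is marked objectively down (+odown) and automatic failover can begin.
# Recommended value: a majority of the Sentinels, i.e. more than half of them (2 of 3 here, since 2 > 3/2).
# Master alias and its password. Sentinel uses this one password for every instance in the group, so the master and all slaves should share the same password.
sentinel auth-pass mymaster pwd@123
# Mark the master as subjectively down after 6 seconds without a valid reply
sentinel down-after-milliseconds mymaster 6000
sentinel failover-timeout mymaster 18000
(For the additional Sentinels, start from the configuration above and change only the port, port 26380 / port 26381, plus the log and pid file names and paths; everything else stays the same. The shipped redis-sentinel.conf defaults to port 26379.)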
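Because the three files differ only in port, pid file, and log file, they can be generated from one template. The following is a minimal sketch under stated assumptions, not part of the original setup: `gen_sentinel_conf`, `majority`, and the `BASE` variable are hypothetical names, and `BASE` defaults to a scratch directory so nothing under /opt is touched unless you set it. The quorum is computed as a strict majority of the Sentinel count.

```shell
#!/bin/sh
# Sketch: generate one Sentinel config per port from a shared template.
# BASE, majority and gen_sentinel_conf are illustrative names (assumptions).
BASE="${BASE:-$(mktemp -d)}"
MASTER_IP="192.168.172.11"
MASTER_PORT="6379"
PORTS="26379 26380 26381"

# Strict majority of n Sentinels: floor(n/2) + 1.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# Write $BASE/sentinel_<port>/sentinel_<port>.conf with the given quorum.
gen_sentinel_conf() {
  port="$1"; quorum="$2"
  dir="$BASE/sentinel_$port"
  mkdir -p "$dir" "$BASE/logs"
  cat > "$dir/sentinel_$port.conf" <<EOF
port $port
daemonize yes
pidfile $BASE/sentinel_$port.pid
logfile "$BASE/logs/sentinel_$port.log"
sentinel monitor mymaster $MASTER_IP $MASTER_PORT $quorum
sentinel auth-pass mymaster pwd@123
sentinel down-after-milliseconds mymaster 6000
sentinel failover-timeout mymaster 18000
EOF
}

# Count the ports, derive the quorum, and write one file per port.
set -- $PORTS
quorum=$(majority "$#")
for p in $PORTS; do
  gen_sentinel_conf "$p" "$quorum"
done
echo "wrote configs under $BASE with quorum $quorum"
```

To write into the layout used in this article, run it with BASE=/opt/redis/redis-sentinel.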
Start the Sentinel services
# redis-sentinel /opt/redis/redis-sentinel/sentinel_26379/sentinel_26379.conf
# redis-sentinel /opt/redis/redis-sentinel/sentinel_26380/sentinel_26380.conf
# redis-sentinel /opt/redis/redis-sentinel/sentinel_26381/sentinel_26381.conf
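After startup, the logs show that each Sentinel monitors the configured Redis master and automatically discovers its slaves. The Sentinels also discover one another automatically, using the pub/sub mechanism described earlier: each one publishes and subscribes to messages on the __sentinel__:hello channel of the monitored instances.
Check the Sentinel status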
redis-cli -h 127.0.0.1 -p 26379 -a pwd@123
sentinel master mymaster
sentinel slaves mymaster
sentinel sentinels mymaster
sentinel get-master-addr-by-name mymaster
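For scripting, the last query can also be run non-interactively. When its output is not a terminal, redis-cli prints each element of the reply on its own line, so the two lines can be joined into host:port. This is a small sketch; `join_addr` is a hypothetical helper, and the sample input stands in for real redis-cli output.

```shell
#!/bin/sh
# join_addr: read "host" and "port" on separate stdin lines, print "host:port".
join_addr() {
  paste -s -d: -
}

# Against a live Sentinel (needs redis-cli and a running Sentinel):
#   redis-cli -h 127.0.0.1 -p 26379 sentinel get-master-addr-by-name mymaster | join_addr

# Offline demonstration using the address shown in this article.
printf '192.168.172.11\n6379\n' | join_addr
# prints 192.168.172.11:6379
```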
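Priority for promoting a slave to master: slave-priority (named replica-priority in Redis 5, with slave-priority kept as an alias). The lower the value, the higher the priority; a value of 0 means the slave is never promoted.
Give each slave a distinct slave-priority so the outcome is predictable; when priorities are equal, Sentinel falls back to the replication offset and then to the run ID.
Any slave may be promoted to master, so every instance needs both of these directives:
requirepass: the password this instance requires from its clients
masterauth: the password this instance uses when connecting to its master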
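One way to keep the two directives in step across instances is to set them together. The sketch below is an assumption-laden illustration: `set_auth` is a hypothetical helper, and it is demonstrated on a scratch file rather than the real redis.conf.

```shell
#!/bin/sh
# set_auth CONF PASSWORD: make CONF end with exactly one requirepass and one
# masterauth line, both using PASSWORD. (Hypothetical helper, not a Redis tool.)
set_auth() {
  conf="$1"; pass="$2"
  # Drop existing uncommented directives; grep exits 1 if nothing is left.
  grep -v -E '^(requirepass|masterauth) ' "$conf" > "$conf.tmp" || true
  mv "$conf.tmp" "$conf"
  printf 'requirepass %s\nmasterauth %s\n' "$pass" "$pass" >> "$conf"
}

# Demonstration on a scratch file.
demo_conf="$(mktemp)"
printf 'port 6379\nrequirepass old\n' > "$demo_conf"
set_auth "$demo_conf" 'pwd@123'
cat "$demo_conf"
```

Editing the file only affects the next start of the instance; a running instance has to be restarted or reconfigured separately.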
Master failure test
Check the current master through a Sentinel: sentinel get-master-addr-by-name mymaster
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "192.168.172.11"
2) "6379"
127.0.0.1:26379>
Kill the master process with kill -9 and delete its pid file as well
[root@localhost /]# ps aux|grep redis
root 1074 0.1 0.7 156456 7632 ? Ssl 13:57 0:05 /opt/redis/redis-5.0.7/bin/redis-server *:6379
root 1646 0.1 0.8 156456 8004 ? Ssl 14:33 0:03 redis-server *:6381
root 1657 0.1 0.7 156456 7916 ? Ssl 14:36 0:03 redis-server *:6380
root 1688 0.0 0.0 112728 972 pts/0 R+ 15:27 0:00 grep --color=auto redis
[root@localhost /]# kill -9 1074
Wait at least 6 seconds and query again: the master has switched from 192.168.172.11 to 192.168.172.21
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "192.168.172.11"
2) "6379"
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "192.168.172.21"
2) "6379"
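Rather than repeating the query by hand, a script can poll until the reported address changes. The following is only a sketch: `wait_for_new_master` and `fake_query` are hypothetical names, and `fake_query` stands in for a real query so the example runs without a Sentinel.

```shell
#!/bin/sh
# wait_for_new_master OLD TRIES DELAY CMD...: run CMD repeatedly until it
# prints something other than OLD. Prints the new value and returns 0, or
# returns 1 after TRIES attempts. (Hypothetical helper.)
wait_for_new_master() {
  old="$1"; tries="$2"; delay="$3"
  shift 3
  i=0
  while [ "$i" -lt "$tries" ]; do
    current="$("$@" 2>/dev/null)"
    if [ -n "$current" ] && [ "$current" != "$old" ]; then
      printf '%s\n' "$current"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Stand-in for a real query; prints the promoted address from this article.
fake_query() { echo "192.168.172.21:6379"; }

wait_for_new_master "192.168.172.11:6379" 5 1 fake_query
```

On a live system, `fake_query` would be replaced by a small function that calls redis-cli with sentinel get-master-addr-by-name mymaster and prints the result as host:port.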
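Check the Sentinel log. +sdown should appear first, meaning this Sentinel has detected that the master is down; +odown follows once the number of Sentinels given by quorum agree that the master is down.
(1) Each of the three Sentinel processes marks the master as sdown
(2) Once at least quorum Sentinels report sdown, the state becomes odown
(3) Sentinel 1 is elected as the one that will carry out the failover
(4) Sentinel 1 obtains a new config version for the new master (the slave being promoted)
(5) It attempts the failover
(6) Every Sentinel casts a vote, and a slave is selected to become the new master
(7) The selected slave receives SLAVEOF NO ONE, so it no longer replicates from any node and is promoted to master; the old master is no longer treated as the master
(8) The Sentinels now regard the old master as a slave and the selected slave as the master
(9) The Sentinels probe the old master (now a slave) and mark it sdown
Recovery: restart the old master and check whether the Sentinels automatically reconfigure it as a slave
Result: 192.168.172.11 6379 has been switched to a slave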
1412:X 05 Jan 2020 23:39:26.393 # +sdown master mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:26.448 # +odown master mymaster 192.168.172.11 6379 #quorum 2/2 -- entered the ODOWN state: 2 Sentinels, meeting the quorum of 2, consider the master down
1412:X 05 Jan 2020 23:39:26.448 # +new-epoch 3
1412:X 05 Jan 2020 23:39:26.448 # +try-failover master mymaster 192.168.172.11 6379 -- attempting failover, waiting for the other Sentinels' votes
1412:X 05 Jan 2020 23:39:26.449 # +vote-for-leader 3d9adf1c0bc6ae9a42e410deef85a4fe97de07fd 3 -- casts its vote for a leader
1412:X 05 Jan 2020 23:39:26.452 # 6603bf447268c0242762ce53f0c0851d2f682846 voted for 3d9adf1c0bc6ae9a42e410deef85a4fe97de07fd 3
1412:X 05 Jan 2020 23:39:26.452 # 3014f58f6e74d789e057c33df1501bba25397894 voted for 3d9adf1c0bc6ae9a42e410deef85a4fe97de07fd 3
1412:X 05 Jan 2020 23:39:26.526 # +elected-leader master mymaster 192.168.172.11 6379 -- this Sentinel is elected to carry out the failover
1412:X 05 Jan 2020 23:39:26.526 # +failover-state-select-slave master mymaster 192.168.172.11 6379 -- starting to select a slave to become the new master
1412:X 05 Jan 2020 23:39:26.609 # +selected-slave slave 192.168.172.21:6379 192.168.172.21 6379 @ mymaster 192.168.172.11 6379 -- found a suitable slave, 21:6379, to become the new master
1412:X 05 Jan 2020 23:39:26.609 * +failover-state-send-slaveof-noone slave 192.168.172.21:6379 192.168.172.21 6379 @ mymaster 192.168.172.11 6379 -- switching the role of the slave selected as the new master
1412:X 05 Jan 2020 23:39:26.668 * +failover-state-wait-promotion slave 192.168.172.21:6379 192.168.172.21 6379 @ mymaster 192.168.172.11 6379 -- waiting for 21:6379 to be promoted to the new master
1412:X 05 Jan 2020 23:39:26.805 # +promoted-slave slave 192.168.172.21:6379 192.168.172.21 6379 @ mymaster 192.168.172.11 6379 -- 21:6379 promoted to master
1412:X 05 Jan 2020 23:39:26.805 # +failover-state-reconf-slaves master mymaster 192.168.172.11 6379 -- the failover enters the reconf-slaves state
1412:X 05 Jan 2020 23:39:26.877 * +slave-reconf-sent slave 192.168.172.11:6381 192.168.172.11 6381 @ mymaster 192.168.172.11 6379 -- reconfiguring 6381 as a slave of the new master
1412:X 05 Jan 2020 23:39:27.527 # -odown master mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:27.648 * +slave-reconf-inprog slave 192.168.172.11:6381 192.168.172.11 6381 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:28.721 * +slave-reconf-done slave 192.168.172.11:6381 192.168.172.11 6381 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:28.775 * +slave-reconf-sent slave 192.168.172.11:6380 192.168.172.11 6380 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:29.728 * +slave-reconf-inprog slave 192.168.172.11:6380 192.168.172.11 6380 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:29.728 * +slave-reconf-done slave 192.168.172.11:6380 192.168.172.11 6380 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:29.828 # +failover-end master mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:29.828 # +switch-master mymaster 192.168.172.11 6379 192.168.172.21 6379
1412:X 05 Jan 2020 23:39:29.828 * +slave slave 192.168.172.11:6381 192.168.172.11 6381 @ mymaster 192.168.172.21 6379
1412:X 05 Jan 2020 23:39:29.828 * +slave slave 192.168.172.11:6380 192.168.172.11 6380 @ mymaster 192.168.172.21 6379
1412:X 05 Jan 2020 23:39:29.828 * +slave slave 192.168.172.11:6379 192.168.172.11 6379 @ mymaster 192.168.172.21 6379
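A log like the one above is easier to scan when reduced to its milestone events. The filter below is a rough sketch: `failover_events` is a hypothetical helper, the list of event names is a selection made for this example, and the field positions assume the log line layout shown in this article.

```shell
#!/bin/sh
# failover_events: read a Sentinel log on stdin and print "time event" for
# the main failover milestones only. (Hypothetical helper.)
failover_events() {
  grep -E ' [+-](sdown|odown|try-failover|elected-leader|selected-slave|promoted-slave|failover-end|switch-master) ' |
    awk '{ print $5, $7 }'
}

# Demonstration on three lines copied from the log above.
failover_events <<'EOF'
1412:X 05 Jan 2020 23:39:26.393 # +sdown master mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:26.448 # +new-epoch 3
1412:X 05 Jan 2020 23:39:29.828 # +switch-master mymaster 192.168.172.11 6379 192.168.172.21 6379
EOF
# prints:
# 23:39:26.393 +sdown
# 23:39:29.828 +switch-master
```

Against the real file it would be used as: failover_events < /opt/redis/redis-sentinel/logs/sentinel_26379.log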
Source: CSDN
Author: ting_Cwt
Link: https://blog.csdn.net/weixin_39440438/article/details/103843150