Setting Up a Redis Sentinel Cluster

Submitted by 泪湿孤枕 on 2020-01-17 06:07:20

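Before setting up the Sentinel cluster, first set up the Redis master-slave servers (see the earlier post, "Detailed steps for Redis master-slave replication").
A Sentinel deployment needs at least 3 Sentinel instances. The setup here builds on the previous post and is a pseudo-cluster, with all three Sentinels running on the same host.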

As shown below, create the directories on server 192.168.172.21.
Then copy the sample sentinel.conf from the Redis source tree into each port's directory and rename it.

[root@localhost ~]# mkdir -p /opt/redis/redis-sentinel/{logs,sentinel_26379,sentinel_26380,sentinel_26381}
[root@localhost ~]# tree /opt/redis/
/opt/redis/
├── redis-5.0.7
│   ├── bin
│   │   ├── redis-benchmark
│   │   ├── redis-check-aof
│   │   ├── redis-check-rdb
│   │   ├── redis-cli
│   │   ├── redis-sentinel -> redis-server
│   │   └── redis-server
│   ├── conf
│   │   └── redis.conf
│   ├── logs
│   │   └── redis_6379.log
│   └── redis_6379.pid
└── redis-sentinel
    ├── logs
    │   └── sentinel_26379.log
    ├── sentinel_26379
    │   └── sentinel_26379.conf
    ├── sentinel_26379.pid
    ├── sentinel_26380
    │   └── sentinel_26380.conf
    └── sentinel_26381
        └── sentinel_26381.conf

9 directories, 14 files

Edit the configuration file sentinel_26379.conf

# Run as a daemon
daemonize yes
pidfile /opt/redis/redis-sentinel/sentinel_26379.pid
logfile "/opt/redis/redis-sentinel/logs/sentinel_26379.log"

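sentinel monitor mymaster 192.168.172.11 6379 2
# Format: sentinel monitor <master-name> <ip> <port> <quorum>
# mymaster is an alias for the monitored master; 192.168.172.11 6379 is the master's IP and port.
# The trailing 2 is the quorum: once 2 Sentinels consider the master subjectively down (+sdown),
# the master is marked objectively down (+odown) and automatic failover can begin.
# Recommended quorum: a majority of the Sentinels, that is floor(N/2) + 1. With 3 Sentinels this gives 2.
# Master alias and password. Because any slave may be promoted, every node in the
# master-slave setup should use the same password.
sentinel auth-pass mymaster pwd@123
# Mark the master subjectively down after 6 seconds (6000 ms) without a valid reply
sentinel down-after-milliseconds mymaster 6000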

sentinel failover-timeout mymaster 18000

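(To run the additional instances, start from the configuration above and change only the port number and the names and locations of the log and pid files; everything else stays the same. Use port 26380 and port 26381.)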
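
Since the three files differ only in the port and the file names derived from it, they can be generated from one template. The following is a minimal sketch, not part of the original setup: the helper names render_sentinel_conf and conf_path are hypothetical, and the explicit port line is an assumption (the sample sentinel.conf already ships with port 26379).

```python
from pathlib import PurePosixPath

# Base directory taken from the directory tree shown earlier.
BASE = PurePosixPath("/opt/redis/redis-sentinel")


def render_sentinel_conf(port, master_ip="192.168.172.11", master_port=6379,
                         quorum=2, password="pwd@123"):
    """Return the text of sentinel_<port>.conf (hypothetical helper)."""
    lines = [
        f"port {port}",  # assumption: set explicitly for every instance
        "daemonize yes",
        f"pidfile {BASE}/sentinel_{port}.pid",
        f'logfile "{BASE}/logs/sentinel_{port}.log"',
        f"sentinel monitor mymaster {master_ip} {master_port} {quorum}",
        f"sentinel auth-pass mymaster {password}",
        "sentinel down-after-milliseconds mymaster 6000",
        "sentinel failover-timeout mymaster 18000",
    ]
    return "\n".join(lines) + "\n"


def conf_path(port):
    """Location of the config file for one Sentinel instance."""
    return BASE / f"sentinel_{port}" / f"sentinel_{port}.conf"


if __name__ == "__main__":
    # Print the three configs instead of writing them, so nothing is overwritten.
    for port in (26379, 26380, 26381):
        print(f"# {conf_path(port)}")
        print(render_sentinel_conf(port))
```

Printing rather than writing keeps the sketch harmless; Sentinel rewrites its own config file at runtime, so regenerating a file under a running instance is best avoided.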

Start the Sentinel services

# redis-sentinel /opt/redis/redis-sentinel/sentinel_26379/sentinel_26379.conf 
# redis-sentinel /opt/redis/redis-sentinel/sentinel_26380/sentinel_26380.conf 
# redis-sentinel /opt/redis/redis-sentinel/sentinel_26381/sentinel_26381.conf 

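After startup, the logs show that each Sentinel monitors the configured Redis master and automatically discovers its slaves. The Sentinels also discover one another automatically, using the pub/sub mechanism mentioned earlier: each one publishes to and subscribes to a channel (__sentinel__:hello) on the monitored Redis instances.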

Check the Sentinel status

redis-cli -h 127.0.0.1 -p 26379 -a pwd@123

sentinel master mymaster
sentinel slaves mymaster
sentinel sentinels mymaster

sentinel get-master-addr-by-name mymaster
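
The reply to sentinel master mymaster is a flat list of alternating field names and values. As a rough sketch, assuming that flat list has already been read into Python by some client, it can be folded into a dictionary and checked for the s_down and o_down flags. The helper names and the sample values in the usage note are illustrative only.

```python
def pairs_to_dict(flat):
    """Fold ['name', 'mymaster', 'ip', '192.168.172.11', ...] into a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items (field/value pairs)")
    return dict(zip(flat[0::2], flat[1::2]))


def is_healthy(master_info):
    """True when the 'flags' field contains neither s_down nor o_down."""
    flags = master_info.get("flags", "").split(",")
    return "s_down" not in flags and "o_down" not in flags
```

For example, pairs_to_dict(["name", "mymaster", "flags", "master"]) yields a dictionary for which is_healthy returns True, while a flags value of "master,s_down,o_down" makes it return False.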

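Priority for promoting a slave to master: slave-priority. The lower the value, the higher the priority (a value of 0 means the slave is never promoted).
Avoid giving slaves the same slave-priority where possible, so that the choice of the new master is predictable; when priorities are equal, Sentinel falls back to the replication offset and then the run ID.
Any slave may be promoted to master, so every instance needs both of these directives:
requirepass: the password the instance requires from clients
masterauth: the password the instance uses to connect to its master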
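
The ordering described above can be sketched as a sort key. This is a simplified illustration, not Sentinel's actual code: the Slave class and pick_slave function are hypothetical, and the real selection also discards slaves that are down or have been disconnected from the master for too long.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    addr: str
    priority: int      # slave-priority; 0 means "never promote"
    repl_offset: int   # how much of the master's data has been processed
    run_id: str


def pick_slave(slaves):
    """Pick the promotion candidate: lowest non-zero priority first,
    then the largest replication offset, then the smallest run ID."""
    candidates = [s for s in slaves if s.priority > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.run_id))
```

With equal priorities, the slave with the larger replication offset wins, which is why identical slave-priority values do not leave the outcome undefined.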

Master failure test

Use a Sentinel to check the current master: sentinel get-master-addr-by-name mymaster

127.0.0.1:26379> sentinel  get-master-addr-by-name mymaster
1) "192.168.172.11"
2) "6379"
127.0.0.1:26379> 

Kill the master process with kill -9 and delete its pid file as well

[root@localhost /]# ps aux|grep redis
root       1074  0.1  0.7 156456  7632 ?        Ssl  13:57   0:05 /opt/redis/redis-5.0.7/bin/redis-server *:6379
root       1646  0.1  0.8 156456  8004 ?        Ssl  14:33   0:03 redis-server *:6381
root       1657  0.1  0.7 156456  7916 ?        Ssl  14:36   0:03 redis-server *:6380
root       1688  0.0  0.0 112728   972 pts/0    R+   15:27   0:00 grep --color=auto redis
[root@localhost /]# kill -9 1074

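Wait a little over 6 seconds (the down-after-milliseconds value plus the time needed for the election) and check again: the master has switched from 192.168.172.11 to 192.168.172.21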

127.0.0.1:26379> sentinel  get-master-addr-by-name mymaster
1) "192.168.172.11"
2) "6379"
127.0.0.1:26379> sentinel  get-master-addr-by-name mymaster
1) "192.168.172.21"
2) "6379"
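
Instead of re-running the command by hand, the check can be polled. The sketch below is an assumption-laden illustration: wait_for_switch is a hypothetical helper, and get_addr stands for any callable that returns the current (ip, port) pair, for example a wrapper around sentinel get-master-addr-by-name mymaster. The sleep and clock parameters exist only so the loop can be exercised without real waiting.

```python
import time


def wait_for_switch(get_addr, old_addr, timeout=30.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_addr() until it returns something other than old_addr.

    Returns the new address, or None if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        addr = get_addr()
        if addr != old_addr:
            return addr
        if clock() >= deadline:
            return None
        sleep(interval)
```

A timeout comfortably above down-after-milliseconds is a reasonable starting point, since the switch cannot be reported before the master has been marked down and a leader elected.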

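Check the Sentinel log for +sdown, which shows that the master's failure was detected, followed by +odown, which means the number of Sentinels given by the quorum all consider the master down.

(1) All three Sentinel processes consider the master sdown
(2) Once at least the quorum number of Sentinel processes consider it sdown, it becomes odown
(3) Sentinel 1 is elected as the Sentinel that will carry out the failover
(4) Sentinel 1 obtains a new config version for the new master (the slave being promoted)
(5) It attempts the failover
(6) A slave is chosen to be promoted to master; each Sentinel casts one vote in the election
(7) The slave is sent slaveof no one so that it is no longer a slave of any node; it is promoted to master, and the old master is no longer regarded as the master
(8) The Sentinels automatically treat the previous master as a slave and the chosen slave as the master
(9) The Sentinels probe the previous master (now a slave) and consider it sdown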
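
The two thresholds in the steps above are easy to mix up, so here is a small sketch of the arithmetic. The function names are hypothetical; the rules they encode are that odown needs at least quorum Sentinels reporting sdown, while the Sentinel that leads the failover needs votes from a majority of all Sentinels (and no fewer than the quorum).

```python
def is_odown(sdown_votes, quorum):
    """The master is objectively down once `quorum` Sentinels report sdown."""
    return sdown_votes >= quorum


def majority(total_sentinels):
    """Smallest number of Sentinels that forms a strict majority."""
    return total_sentinels // 2 + 1


def can_failover(sdown_votes, quorum, leader_votes, total_sentinels):
    """A failover proceeds only if the master is odown and the would-be
    leader collected enough votes."""
    needed = max(quorum, majority(total_sentinels))
    return is_odown(sdown_votes, quorum) and leader_votes >= needed
```

With the 3 Sentinels and quorum 2 used here, majority(3) is 2, so two agreeing Sentinels are enough both to reach odown and to elect a leader.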

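Failure recovery: restart the old master and check whether Sentinel automatically turns it into a slave.

The result shows that the old master, 192.168.172.11 6379, has been switched to a slave of 192.168.172.21 6379.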

1412:X 05 Jan 2020 23:39:26.393 # +sdown master mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:26.448 # +odown master mymaster 192.168.172.11 6379 #quorum 2/2    -- Entered the ODOWN state: 2 Sentinels (quorum 2/2) consider the master down
1412:X 05 Jan 2020 23:39:26.448 # +new-epoch 3
1412:X 05 Jan 2020 23:39:26.448 # +try-failover master mymaster 192.168.172.11 6379    -- Attempting failover; waiting to be elected by the other Sentinels
1412:X 05 Jan 2020 23:39:26.449 # +vote-for-leader 3d9adf1c0bc6ae9a42e410deef85a4fe97de07fd 3    -- Voted for a leader
1412:X 05 Jan 2020 23:39:26.452 # 6603bf447268c0242762ce53f0c0851d2f682846 voted for 3d9adf1c0bc6ae9a42e410deef85a4fe97de07fd 3
1412:X 05 Jan 2020 23:39:26.452 # 3014f58f6e74d789e057c33df1501bba25397894 voted for 3d9adf1c0bc6ae9a42e410deef85a4fe97de07fd 3
1412:X 05 Jan 2020 23:39:26.526 # +elected-leader master mymaster 192.168.172.11 6379    -- Elected as the leader that will carry out the failover
1412:X 05 Jan 2020 23:39:26.526 # +failover-state-select-slave master mymaster 192.168.172.11 6379    -- Starting to select a slave to become the new master
1412:X 05 Jan 2020 23:39:26.609 # +selected-slave slave 192.168.172.21:6379 192.168.172.21 6379 @ mymaster 192.168.172.11 6379    -- Found a suitable slave, 21:6379, to become the new master
1412:X 05 Jan 2020 23:39:26.609 * +failover-state-send-slaveof-noone slave 192.168.172.21:6379 192.168.172.21 6379 @ mymaster 192.168.172.11 6379    -- Switching the role of the selected slave (sending slaveof no one)
1412:X 05 Jan 2020 23:39:26.668 * +failover-state-wait-promotion slave 192.168.172.21:6379 192.168.172.21 6379 @ mymaster 192.168.172.11 6379    -- Waiting for 21:6379 to be promoted to the new master
1412:X 05 Jan 2020 23:39:26.805 # +promoted-slave slave 192.168.172.21:6379 192.168.172.21 6379 @ mymaster 192.168.172.11 6379    -- 21:6379 promoted to master
1412:X 05 Jan 2020 23:39:26.805 # +failover-state-reconf-slaves master mymaster 192.168.172.11 6379    -- Failover state changed to reconf-slaves
1412:X 05 Jan 2020 23:39:26.877 * +slave-reconf-sent slave 192.168.172.11:6381 192.168.172.11 6381 @ mymaster 192.168.172.11 6379    -- Reconfiguring 6381 as a slave of the new master
1412:X 05 Jan 2020 23:39:27.527 # -odown master mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:27.648 * +slave-reconf-inprog slave 192.168.172.11:6381 192.168.172.11 6381 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:28.721 * +slave-reconf-done slave 192.168.172.11:6381 192.168.172.11 6381 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:28.775 * +slave-reconf-sent slave 192.168.172.11:6380 192.168.172.11 6380 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:29.728 * +slave-reconf-inprog slave 192.168.172.11:6380 192.168.172.11 6380 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:29.728 * +slave-reconf-done slave 192.168.172.11:6380 192.168.172.11 6380 @ mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:29.828 # +failover-end master mymaster 192.168.172.11 6379
1412:X 05 Jan 2020 23:39:29.828 # +switch-master mymaster 192.168.172.11 6379 192.168.172.21 6379
1412:X 05 Jan 2020 23:39:29.828 * +slave slave 192.168.172.11:6381 192.168.172.11 6381 @ mymaster 192.168.172.21 6379
1412:X 05 Jan 2020 23:39:29.828 * +slave slave 192.168.172.11:6380 192.168.172.11 6380 @ mymaster 192.168.172.21 6379
1412:X 05 Jan 2020 23:39:29.828 * +slave slave 192.168.172.11:6379 192.168.172.11 6379 @ mymaster 192.168.172.21 6379
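
In a log this long, the one line that records the outcome is +switch-master, which carries the master name followed by the old and new address. A minimal sketch for pulling those events out of a Sentinel log, assuming the line layout shown above (the function name find_switches is hypothetical):

```python
import re

# +switch-master <master-name> <old-ip> <old-port> <new-ip> <new-port>
SWITCH_RE = re.compile(
    r"\+switch-master (?P<name>\S+) (?P<old_ip>\S+) (?P<old_port>\d+) "
    r"(?P<new_ip>\S+) (?P<new_port>\d+)"
)


def find_switches(log_lines):
    """Return (master_name, old_addr, new_addr) for every +switch-master line."""
    switches = []
    for line in log_lines:
        match = SWITCH_RE.search(line)
        if match:
            old_addr = f"{match['old_ip']}:{match['old_port']}"
            new_addr = f"{match['new_ip']}:{match['new_port']}"
            switches.append((match["name"], old_addr, new_addr))
    return switches
```

Applied to the log above, this reports a single switch of mymaster from 192.168.172.11:6379 to 192.168.172.21:6379.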
