环境:
192.168.205.37: as MHA server
192.168.205.47: as Master server
192.168.205.57: as Slave1 server
192.168.205.67: as Slave2 server
版本:
OS: centos 7 1810 with mini install
mariadb-server 5.5.60
mha4mysql-manager-0.56-0.el6.noarch.rpm
mha4mysql-node-0.56-0.el6.noarch.rpm
目地:
MHA使用perl脚本写的程序,当主节点down时如果从节点复制没有完成时,当他成为主节点后,它需要尝试将主节点的二进制日志复制到新的主节点,从而保证数据的完整性,所以它需要sshkey验证。我们使用一台做为管理节点,其中三个为数据库节点,一主两从,模拟主节点down机,看从是如何被提升为主节点的。
安装数据库
- 安装mariadb使用如下脚本:
[root@master data]#cat maridb_yum.sh #!/bin/bash ID=`ip a show dev eth0 | sed -r '3!d;s@(.*inet)(.*)(/.*)@\2@' | cut -d. -f4` rpm -q mariadb-server ||yum install -y mariadb-server [ -d /data/mysql ] || mkdir -p /data/mysql [ -d /data/logs ] || mkdir -p /data/logs chown mysql:mysql /data/{mysql,logs} sed -i 's@datadir=/var/lib/mysql@datadir=/data/mysql@' /etc/my.cnf grep "log-bin" /etc/my.cnf || sed -i '/\[mysqld\]/a log-bin=/data/logs/bin' /etc/my.cnf grep "innodb_file_per_table" /etc/my.cnf || sed -i '/\[mysqld\]/a innodb_file_per_table = on' /etc/my.cnf grep "skip_name_resolve" /etc/my.cnf || sed -i '/\[mysqld\]/a skip_name_resolve = on' /etc/my.cnf grep "server-id" /etc/my.cnf || sed -i "/\[mysqld\]/a server-id=$ID" /etc/my.cnf service mariadb restart
- 修改两个从节点的配置文件
[root@slave1 data]#vi /etc/my.cnf [mysqld] read_only relay_log_purge=0 #不删除中继日志
-
建立主节点的复制位置和帐号
MariaDB [(none)]> show master logs; +------------+-----------+ | Log_name | File_size | +------------+-----------+ | bin.000001 | 30373 | | bin.000002 | 1038814 | | bin.000003 | 245 | +------------+-----------+ 3 rows in set (0.00 sec) MariaDB [(none)]> grant replication slave on *.* to repluser@'192.168.205.%' identified by 'centos'; Query OK, 0 rows affected (0.00 sec) MariaDB [(none)]> grant all on *.* to mhauser@'192.168.205.%' identified by 'centos'; Query OK, 0 rows affected (0.00 sec)
-
配置从节点的change master to并启动线程
MariaDB [(none)]> CHANGE MASTER TO -> MASTER_HOST='192.168.205.47', -> MASTER_USER='repluser', -> MASTER_PASSWORD='centos', -> MASTER_PORT=3306, -> MASTER_LOG_FILE='bin.000003', -> MASTER_LOG_POS=245; Query OK, 0 rows affected (0.01 sec) MariaDB [(none)]> start slave; Query OK, 0 rows affected (0.00 sec) MariaDB [(none)]> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.205.47 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: bin.000003 Read_Master_Log_Pos: 401 Relay_Log_File: mariadb-relay-bin.000002 Relay_Log_Pos: 679 Relay_Master_Log_File: bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes
MHA实现
- 安装包,mha依赖epel源,我们要把epel起用
[root@MHA ~]#yum install mha4mysql-manager-0.56-0.el6.noarch.rpm mha4mysql-node-0.56-0.el6.noarch.rpm
- 在所有的sql节点上安装node包
[root@master ~]#yum install mha4mysql-node-0.56-0.el6.noarch.rpm [root@slave1 data]#yum install mha4mysql-node-0.56-0.el6.noarch.rpm [root@slave2 data]#yum install mha4mysql-node-0.56-0.el6.noarch.rpm
- 配置ssh key 验证
[root@MHA ~]#ssh-keygen [root@MHA ~]#ssh-copy-id 192.168.205.37 [root@MHA ~]#cat .ssh/authorized_keys [root@MHA ~]#scp -r .ssh 192.168.205.47:/root/ [root@MHA ~]#scp -r .ssh 192.168.205.57:/root/ [root@MHA ~]#scp -r .ssh 192.168.205.67:/root/
- 生成配置MHA文件
[root@MHA ~]#mkdir /etc/mha [root@MHA ~]#vim /etc/mha/app1.cnf [server default] master_binlog_dir=/data/logs/ user=mhauser password=centos manager_workdir=/data/mastermha/app1/ manager_log=/data/mastermha/app1/manager.log remote_workdir=/data/mastermha/app1/ ssh_user=root repl_user=repluser repl_password=centos ping_interval=1 [server1] hostname=192.168.205.47 candidate_master=1 [server2] hostname=192.168.205.57 [server3] hostname=192.168.205.67 candidate_master=1
- 启动之前进行检查ssh,repl复制是否准备好
[root@MHA ~]#masterha_check_ssh --conf=/etc/mha/app1.cnf [root@MHA ~]#masterha_check_repl --conf=/etc/mha/app1.cnf
- 起动进程,此进程前台运行,当主节点失败时切换完成后它会终止,所以要想持续要重启进程
[root@MHA ~]#masterha_manager --conf=/etc/mha/app1.cnf Mon Aug 12 23:33:22 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Aug 12 23:33:22 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf.. Mon Aug 12 23:33:22 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
测试
-
我们模拟一个正在写入操作时,主节点down, 看服务器能不能正常切换并不能复制丢失的数据
MariaDB [(none)]> use test; Database changed MariaDB [test]> create table testlog (id int auto_increment primary key,name char(10),age int default 20); Query OK, 0 rows affected (0.01 sec) MariaDB [test]> MariaDB [test]> delimiter $$ MariaDB [test]> MariaDB [test]> create procedure sp_testlog() -> begin -> declare i int; -> set i = 1; -> while i <= 100000 -> do insert into testlog(name,age) values (concat('wang',i),i); -> set i = i +1; -> end while; -> end$$ Query OK, 0 rows affected (0.00 sec) MariaDB [test]> MariaDB [test]> delimiter ; MariaDB [test]> call sp_testlog;
- 正在写入数据时我们及时关闭47服务器,MHA检测到节点,并响应,成功提升其它的为主后退出程序。
[root@MHA ~]#masterha_manager --conf=/etc/mha/app1.cnf Mon Aug 12 23:33:22 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Aug 12 23:33:22 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf.. Mon Aug 12 23:33:22 2019 - [info] Reading server configuration from /etc/mha/app1.cnf.. Mon Aug 12 23:45:16 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Aug 12 23:45:16 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf.. Mon Aug 12 23:45:16 2019 - [info] Reading server configuration from /etc/mha/app1.cnf.. [root@MHA ~]#
-
我们可以查看日志看一下发生的情况,slave2提升为主
[root@MHA data]#cat mastermha/app1/manager.log ----- Failover Report ----- app1: MySQL Master failover 192.168.205.47(192.168.205.47:3306) to 192.168.205.67(192.168.205.67:3306) succeeded Master 192.168.205.47(192.168.205.47:3306) is down! Check MHA Manager logs at MHA:/data/mastermha/app1/manager.log for details. Started automated(non-interactive) failover. The latest slave 192.168.205.57(192.168.205.57:3306) has all relay logs for recovery. Selected 192.168.205.67(192.168.205.67:3306) as a new master. 192.168.205.67(192.168.205.67:3306): OK: Applying all logs succeeded. 192.168.205.57(192.168.205.57:3306): This host has the latest relay log events. Generating relay diff files from the latest slave succeeded. 192.168.205.57(192.168.205.57:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.205.67(192.168.205.67:3306) 192.168.205.67(192.168.205.67:3306): Resetting slave info succeeded. Master failover to 192.168.205.67(192.168.205.67:3306) completed successfully.
- 我们来核实一下,在slave2上看一下slave的状态,发现已经没了,因为它是主
MariaDB [(none)]> show slave status\G Empty set (0.00 sec)
- 检查一下slave1,发现主节点指向了slave2
MariaDB [test]> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.205.67 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: bin.000005 Read_Master_Log_Pos: 245 Relay_Log_File: mariadb-relay-bin.000002 Relay_Log_Pos: 523 Relay_Master_Log_File: bin.000005 Slave_IO_Running: Yes Slave_SQL_Running: Yes
- 在新主节点上看一下是否将read_only off掉,发现是,但不要忘记改回配置文件,如果重启的话又会变成read_only
MariaDB [(none)]> show variables like 'read_only'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | read_only | OFF | +---------------+-------+ 1 row in set (0.00 sec)
来源:https://blog.51cto.com/127601/2428895