MHA0.58安装
MHA(Master High Availability)由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。
管理节点
mha4mysql-manager-0.58
mha4mysql-manager-0.58
下载地址:
wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58.tar.gz wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58.tar.gz
MHA Manager
可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave节点上。MHA Manager会定时探测集群中的master节点,当master出现故障时,
它可以自动将最新数据的slave提升为新的master,然后将所有其他的slave重新指向新的master。整个故障转移过程对应用程序完全透明。
MHA Node
运行在每台MySQL服务器上,定时和 MHA Manager交互信息。
在MHA自动故障切换过程中,MHA试图从宕机的主服务器上保存二进制日志,最大程度的保证数据的不丢失,但这并不总是可行的。
例如,如果主服务器硬件故障或无法通过ssh访问,MHA没法保存二进制日志,只进行故障转移而丢失了最新的数据。
使用MySQL 的半同步复制,可以大大降低数据丢失的风险。MHA可以与半同步复制结合起来。如果只有一个slave已经收到了最新的二进制日志,
MHA可以将最新的二进制日志应用于其他所有的slave服务器上,因此可以保证所有节点的数据一致性。
搭建MHA,要求一个复制集群中必须最少有三台数据库服务器,一主二从,即一台充当master,一台充当备用master,另外一台充当从库。
主从架构
master:
10.150.20.90 ed3jrdba90
slave:
10.15.20.97 ed3jrdba97
10.150.20.132 ed3jrdba132
manager:
10.150.20.95 ed3jrdba95
四台机器的系统情况:
OS:CentOS7.3
MySQL:5.7.21
MHA:0.58
网卡名:ens3
1:MHA的各节点os配置ssh免密登录
测试
[root@qa05 ~]#ssh ed3jrdba90
[root@qa05 ~]#ssh ed3jrdba97
[root@qa05 ~]#ssh ed3jrdba132
2:在master-slave节点按照统一规范安装MySQL数据库
3:搭建主从复制一主两从
每个节点的 binlog-do-db 和 replicate-ignore-db 设置必须相同
记录初始化MySQL实例root临时密码,记录在错误日志,登录交互界面进行修改
master修改root@localhost密码
root@(none) 03:15:25>set password for 'root'@'localhost'= password('root');
master 创建监控用户
mysql> create user 'mha_monitor'@'10.150.20.%' identified by 'mha_monitor';
mysql> grant all privileges on *.* to 'mha_monitor'@'10.150.20.%';
mysql> flush privileges;
master创建复制用户
mysql> create user 'replicator'@'10.150.20.%' identified by 'replicator';
mysql> grant replication slave on *.* to 'replicator'@'10.150.20.%';
mysql> flush privileges;
master备库数据库实例,slave同步数据
master:
[root@qa05 ~]# mysqldump -uroot -p --master-data=2 --single-transaction -R --triggers -A > /data/mysql_all.sql
slave:
[root@qa06 ~]# mysql -uroot -p < /data/mysql_all.sql
[root@cgdb ~]#mysql -uroot -p < /data/mysql_all.sql
salve设置为 read only,从库对外提供读服务,不写进配置文件,因为随时slave会提升为master。
mysql> set global read_only=1;
4:安装 mha node
在ed3jrdba90、ed3jrdba95、ed3jrdba97、ed3jrdba132安装mha4mysql-node-0.58
# cd mha4mysql-node-0.58 # perl Makefile.PL # make # make install
# ll /usr/local/bin/
-rwxr-xr-x 1 root root 16377 12月 26 2017 apply_diff_relay_logs
-rwxr-xr-x 1 root root 4803 12月 26 2017 filter_mysqlbinlog
-rwxr-xr-x 1 root root 8257 12月 26 2017 purge_relay_logs
-rwxr-xr-x 1 root root 7521 12月 26 2017 save_binary_log
备注:
apply_diff_relay_logs识别差异的中继日志事件并将其差异的事件应用到其他的slave
filter_mysqlbinlog去除不必要的rollback事件(MHA已不再使用这个工具)
purge_relay_logs清楚中继日志(不会阻塞SQL线程)
save_binary_log保存和复制master的二进制日志
5:安装 mha manager
在ed3jrdba95安装mha4mysql-manager-0.58
# cd mha4mysql-manager-0.58
# perl Makefile.PL
# make
# make install
[root@dev05 ~]# ll /usr/local/bin/
60
-r-xr-xr-x 1 root root 1995 110 11:32 masterha_check_repl
-r-xr-xr-x 1 root root 1779 110 11:32 masterha_check_ssh
-r-xr-xr-x 1 root root 1865 110 11:32 masterha_check_status
-r-xr-xr-x 1 root root 3201 110 11:32 masterha_conf_host
-r-xr-xr-x 1 root root 2517 110 11:32 masterha_manager
-r-xr-xr-x 1 root root 2165 110 11:32 masterha_master_monitor
-r-xr-xr-x 1 root root 2373 110 11:32 masterha_master_switch
-r-xr-xr-x 1 root root 5172 110 11:32 masterha_secondary_check
-r-xr-xr-x 1 root root 1739 110 11:32 masterha_stop
备注:
masterha_check_repl检查MySQL复制状况
masterha_check_ssh检查MHA的SSH配置状况
masterha_check_status检测当前MHA的运行状态
masterha_conf_host添加和删除配置的server信息
masterha_manager启动MHA
masterha_master_monitor检测master是否宕机
masterha_master_switch控制故障转移(自动、手动)
masterha_secondary_check
masterha_stop
cp /opt/source/mha4mysql-manager-0.58/samples/scripts/* /usr/local/bin/
-rwxr-xr-x 1 dev_bj dev_bj 3648 323 2018 master_ip_failover
-rwxr-xr-x 1 dev_bj dev_bj 9870 323 2018 master_ip_online_change
-rwxr-xr-x 1 dev_bj dev_bj 11867 323 2018 power_manager
-rwxr-xr-x 1 dev_bj dev_bj 1360 323 2018 send_report
---------------------
备注
master_ip_failover 自动切换时vip管理的脚本,不是必须
master_ip_online_change 在线切换时vip的管理,不是必须
power_manager 故障发生后关闭主机的脚本,不是必须
send_report 因故障切换后发送报警的脚本,不是必须
---------------------
6:MHA配置
创建mha工作目录
# mkdir -p /etc/mysql_mha/
# cp /opt/source/mha4mysql-manager-0.58/samples/conf/* /etc/masterha/
编辑配置app1.cnf文件:
vi /etc/mysql_mha/app1.cnf
[server default]
#manager工作目录
manager_workdir=/data/mysql_mha/app1
#manager日志路径
manager_log=/data/mysql_mha/app1-manager.log
#master保存binlog的位置,以便MHA可以找到master的日志
master_binlog_dir=/data/mysql_33061/logs
#自动failover时候的切换脚本
#master_ip_failover_script=/usr/local/bin/master_ip_failover
#手动切换时候的切换脚本
#master_ip_online_change_script= /usr/local/bin/master_ip_online_change
#master创建的监控用户
user=mha_monitor
#MySQL创建的监控用户密码
password=mha_monitor
#设置监控主库,发送ping包的时间 间隔,默认为3秒,尝试三次没有回应的时候自动进行failover
ping_interval=5
#远端MySQL在发生切换时,binlog的保存位置
remote_workdir=/data/mysql_mha/app1
#MySQL创建的复制用户密码
repl_password=replicator
repl_user=replicator
#发生切换后发送的报警的脚本
#report_script=/usr/local/send_report
#MHA到ed3jrdba90之间的监控异常,MHA manager将会尝试从ed3jrdba97登录到ed3jrdba90
secondary_check_script=/usr/local/bin/masterha_secondary_check -s ed3jrdba97 -s ed3jrdba90
#故障发生后关闭故障主机脚本
shutdown_script=""
#ssh的登录用户名,需事先配置好ssh免密登录
ssh_user=root
#master信息
[server1]
hostname=10.150.20.90
port=33061
#candidate master信息
[server2]
hostname=10.150.20.97
port=33061
#候选master,如果设置该参数后,发生主从切换时,将该库提升为主库,即使这个主库不是集群中事件最新的SLAVE
candidate_master=1
#默认情况下如果一个slave落后master 100M的relay logs的话,MHA将不会选择该slave作为一个新的master,
#因为对于这个slave的恢复需要花费很长时间,通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,
#这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换的过程中一定是新的master
check_repl_delay=0
#slave 信息
[server3]
hostname=10.150.20.132
port=33061
全局配置文件编辑:
# cat /etc/mysql_mha/masterha_default.cnf
[server default]
user=mha_monitor
password=mha_monitor
ssh_user=root
# for replication
repl_user=replicator
repl_password=replicator
# for monitor
ping_interval=1
shutdown_script=""
#master_ip_failover_script=/usr/local/bin/master_ip_failover_script
#remote_workdir=/data/log/masterha
#secondary_check_script= masterha_secondary_check -s remote_host1 -s remote_host2
#ping_interval=1
# master_ip_failover_script= /script/masterha/master_ip_failover
# shutdown_script= /script/masterha/power_manager
# report_script= /script/masterha/send_report
# master_ip_online_change_script= /script/masterha/master_ip_online_change
relay log的设置
在每个slave上执行,将relay log的自动设置清除为OFF
mysql> set global relay_log_purge=0;
# vi /usr/local/bin/purge_relay_log.sh
1 #!/bin/bash 2 user=root 3 passwd=root 4 port=33061 5 log_dir='/data/mysql_33061/logs' 6 work_dir='/data/mysql_33061' 7 purge='/usr/local/bin/purge_relay_logs' 8 9 if [ ! -d $log_dir ] 10 then 11 mkdir $log_dir -p 12 fi 13 14 $purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs1.log 2>&1
检查MHA Manger到所有MHA Node的SSH连接
# masterha_check_ssh --conf=/etc/mysql_mha/app1.cnf
Mon Dec 10 16:39:49 2018 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Dec 10 16:39:49 2018 - [info] Reading application default configuration from /etc/mysql_mha/app1.cnf..
Mon Dec 10 16:39:49 2018 - [info] Reading server configuration from /etc/mysql_mha/app1.cnf..
Mon Dec 10 16:39:49 2018 - [info] Starting SSH connection tests..
Mon Dec 10 16:39:50 2018 - [debug]
Mon Dec 10 16:39:49 2018 - [debug] Connecting via SSH from root@10.150.20.90(10.150.20.90:22) to root@10.150.20.97(10.150.20.97:22)..
Mon Dec 10 16:39:49 2018 - [debug] ok.
Mon Dec 10 16:39:49 2018 - [debug] Connecting via SSH from root@10.150.20.90(10.150.20.90:22) to root@10.150.20.132(10.150.20.132:22)..
Mon Dec 10 16:39:50 2018 - [debug] ok.
Mon Dec 10 16:39:51 2018 - [debug]
Mon Dec 10 16:39:50 2018 - [debug] Connecting via SSH from root@10.150.20.97(10.150.20.97:22) to root@10.150.20.90(10.150.20.90:22)..
Mon Dec 10 16:39:50 2018 - [debug] ok.
Mon Dec 10 16:39:50 2018 - [debug] Connecting via SSH from root@10.150.20.97(10.150.20.97:22) to root@10.150.20.132(10.150.20.132:22)..
Mon Dec 10 16:39:50 2018 - [debug] ok.
Mon Dec 10 16:39:52 2018 - [debug]
Mon Dec 10 16:39:50 2018 - [debug] Connecting via SSH from root@10.150.20.132(10.150.20.132:22) to root@10.150.20.90(10.150.20.90:22)..
Mon Dec 10 16:39:51 2018 - [debug] ok.
Mon Dec 10 16:39:51 2018 - [debug] Connecting via SSH from root@10.150.20.132(10.150.20.132:22) to root@10.150.20.97(10.150.20.97:22)..
Mon Dec 10 16:39:51 2018 - [debug] ok.
Mon Dec 10 16:39:52 2018 - [info] All SSH connection tests passed successfully.
检查整个复制环境
# vi /etc/mysql_mha/app1.cnf
#先注销掉这几个参数
#master_ip_failover_script= /usr/local/bin/master_ip_failover
#master_ip_online_change_script= /usr/local/bin/master_ip_online_change
#report_script=/usr/local/send_report
/usr/local/bin/masterha_check_repl --conf=/etc/mysql_mha/app1.cnf
/usr/local/bin/masterha_check_repl --conf=/etc/mysql_mha/app1.cnf Wed Dec 12 11:13:53 2018 - [info] Reading default configuration from /etc/masterha_default.cnf.. Wed Dec 12 11:13:53 2018 - [info] Reading application default configuration from /etc/mysql_mha/app1.cnf.. Wed Dec 12 11:13:53 2018 - [info] Reading server configuration from /etc/mysql_mha/app1.cnf.. Wed Dec 12 11:13:53 2018 - [info] MHA::MasterMonitor version 0.58. Wed Dec 12 11:13:54 2018 - [info] GTID failover mode = 0 Wed Dec 12 11:13:54 2018 - [info] Dead Servers: Wed Dec 12 11:13:54 2018 - [info] Alive Servers: Wed Dec 12 11:13:54 2018 - [info] 10.150.20.90(10.150.20.90:33061) Wed Dec 12 11:13:54 2018 - [info] 10.150.20.97(10.150.20.97:33061) Wed Dec 12 11:13:54 2018 - [info] 10.150.20.132(10.150.20.132:33061) Wed Dec 12 11:13:54 2018 - [info] Alive Slaves: Wed Dec 12 11:13:54 2018 - [info] 10.150.20.97(10.150.20.97:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled Wed Dec 12 11:13:54 2018 - [info] Replicating from 10.150.20.90(10.150.20.90:33061) Wed Dec 12 11:13:54 2018 - [info] 10.150.20.132(10.150.20.132:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled Wed Dec 12 11:13:54 2018 - [info] Replicating from 10.150.20.90(10.150.20.90:33061) Wed Dec 12 11:13:54 2018 - [info] Current Alive Master: 10.150.20.90(10.150.20.90:33061) Wed Dec 12 11:13:54 2018 - [info] Checking slave configurations.. Wed Dec 12 11:13:54 2018 - [info] read_only=1 is not set on slave 10.150.20.97(10.150.20.97:33061). Wed Dec 12 11:13:54 2018 - [warning] relay_log_purge=0 is not set on slave 10.150.20.97(10.150.20.97:33061). Wed Dec 12 11:13:54 2018 - [info] read_only=1 is not set on slave 10.150.20.132(10.150.20.132:33061). Wed Dec 12 11:13:54 2018 - [info] Checking replication filtering settings.. Wed Dec 12 11:13:54 2018 - [info] binlog_do_db= , binlog_ignore_db= Wed Dec 12 11:13:54 2018 - [info] Replication filtering check ok. Wed Dec 12 11:13:54 2018 - [info] GTID (with auto-pos) is not supported Wed Dec 12 11:13:54 2018 - [info] Starting SSH connection tests.. Wed Dec 12 11:13:57 2018 - [info] All SSH connection tests passed successfully. Wed Dec 12 11:13:57 2018 - [info] Checking MHA Node version.. Wed Dec 12 11:13:58 2018 - [info] Version check ok. Wed Dec 12 11:13:58 2018 - [info] Checking SSH publickey authentication settings on the current master.. Wed Dec 12 11:13:58 2018 - [info] HealthCheck: SSH to 10.150.20.90 is reachable. Wed Dec 12 11:13:58 2018 - [info] Master MHA Node version is 0.58. Wed Dec 12 11:13:58 2018 - [info] Checking recovery script configurations on 10.150.20.90(10.150.20.90:33061).. Wed Dec 12 11:13:58 2018 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql_33061/logs --output_file=/data/mysql_mha/app1/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000006 Wed Dec 12 11:13:58 2018 - [info] Connecting to root@10.150.20.90(10.150.20.90:22).. Creating /data/mysql_mha/app1 if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /data/mysql_33061/logs, up to mysql-bin.000006 Wed Dec 12 11:13:58 2018 - [info] Binlog setting check done. Wed Dec 12 11:13:58 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Wed Dec 12 11:13:58 2018 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_monitor' --slave_host=10.150.20.97 --slave_ip=10.150.20.97 --slave_port=33061 --workdir=/data/mysql_mha/app1 --target_version=5.7.21-log --manager_version=0.58 --relay_log_info=/data/mysql_33061/logs/relay-log.info --relay_dir=/data/mysql_33061/data/ --slave_pass=xxx Wed Dec 12 11:13:58 2018 - [info] Connecting to root@10.150.20.97(10.150.20.97:22).. Checking slave recovery environment settings.. Opening /data/mysql_33061/logs/relay-log.info ... ok. Relay log found at /data/mysql_33061/logs, up to relaylog.000002 Temporary relay log file is /data/mysql_33061/logs/relaylog.000002 Checking if super_read_only is defined and turned on.. not present or turned off, ignoring. Testing mysql connection and privileges.. mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Wed Dec 12 11:13:58 2018 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_monitor' --slave_host=10.150.20.132 --slave_ip=10.150.20.132 --slave_port=33061 --workdir=/data/mysql_mha/app1 --target_version=5.7.21-log --manager_version=0.58 --relay_log_info=/data/mysql_33061/logs/relay-log.info --relay_dir=/data/mysql_33061/data/ --slave_pass=xxx Wed Dec 12 11:13:58 2018 - [info] Connecting to root@10.150.20.132(10.150.20.132:22).. Checking slave recovery environment settings.. Opening /data/mysql_33061/logs/relay-log.info ... ok. Relay log found at /data/mysql_33061/data, up to cgdb-relay-bin.000002 Temporary relay log file is /data/mysql_33061/data/cgdb-relay-bin.000002 Checking if super_read_only is defined and turned on.. not present or turned off, ignoring. Testing mysql connection and privileges.. mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Wed Dec 12 11:13:59 2018 - [info] Slaves settings check done. Wed Dec 12 11:13:59 2018 - [info] 10.150.20.90(10.150.20.90:33061) (current master) +--10.150.20.97(10.150.20.97:33061) +--10.150.20.132(10.150.20.132:33061) Wed Dec 12 11:13:59 2018 - [info] Checking replication health on 10.150.20.97.. Wed Dec 12 11:13:59 2018 - [info] ok. Wed Dec 12 11:13:59 2018 - [info] Checking replication health on 10.150.20.132.. Wed Dec 12 11:13:59 2018 - [info] ok. Wed Dec 12 11:13:59 2018 - [info] Checking master_ip_failover_script status: Wed Dec 12 11:13:59 2018 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.150.20.90 --orig_master_ip=10.150.20.90 --orig_master_port=33061 Bareword "FIXME_xxx" not allowed while "strict subs" in use at /usr/local/bin/master_ip_failover line 93. Execution of /usr/local/bin/master_ip_failover aborted due to compilation errors. Wed Dec 12 11:13:59 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln229] Failed to get master_ip_failover_script status with return code 255:0. Wed Dec 12 11:13:59 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/local/bin/masterha_check_repl line 48. Wed Dec 12 11:13:59 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers. Wed Dec 12 11:13:59 2018 - [info] Got exit code 1 (Not master dead). MySQL Replication Health is NOT OK!
编辑master_ip_failover
注释掉:# FIXME_xxx
再次,测试#masterha_check_repl --conf=/etc/mysql_mha/app1.cnf
# masterha_check_repl --conf=/etc/mysql_mha/app1.cnf Wed Dec 12 11:23:49 2018 - [info] Reading default configuration from /etc/masterha_default.cnf.. Wed Dec 12 11:23:49 2018 - [info] Reading application default configuration from /etc/mysql_mha/app1.cnf.. Wed Dec 12 11:23:49 2018 - [info] Reading server configuration from /etc/mysql_mha/app1.cnf.. Wed Dec 12 11:23:49 2018 - [info] MHA::MasterMonitor version 0.58. Wed Dec 12 11:23:50 2018 - [info] GTID failover mode = 0 Wed Dec 12 11:23:50 2018 - [info] Dead Servers: Wed Dec 12 11:23:50 2018 - [info] Alive Servers: Wed Dec 12 11:23:50 2018 - [info] 10.150.20.90(10.150.20.90:33061) Wed Dec 12 11:23:50 2018 - [info] 10.150.20.97(10.150.20.97:33061) Wed Dec 12 11:23:50 2018 - [info] 10.150.20.132(10.150.20.132:33061) Wed Dec 12 11:23:50 2018 - [info] Alive Slaves: Wed Dec 12 11:23:50 2018 - [info] 10.150.20.97(10.150.20.97:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled Wed Dec 12 11:23:50 2018 - [info] Replicating from 10.150.20.90(10.150.20.90:33061) Wed Dec 12 11:23:50 2018 - [info] 10.150.20.132(10.150.20.132:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled Wed Dec 12 11:23:50 2018 - [info] Replicating from 10.150.20.90(10.150.20.90:33061) Wed Dec 12 11:23:50 2018 - [info] Current Alive Master: 10.150.20.90(10.150.20.90:33061) Wed Dec 12 11:23:50 2018 - [info] Checking slave configurations.. Wed Dec 12 11:23:50 2018 - [info] read_only=1 is not set on slave 10.150.20.97(10.150.20.97:33061). Wed Dec 12 11:23:50 2018 - [warning] relay_log_purge=0 is not set on slave 10.150.20.97(10.150.20.97:33061). Wed Dec 12 11:23:50 2018 - [info] read_only=1 is not set on slave 10.150.20.132(10.150.20.132:33061). Wed Dec 12 11:23:50 2018 - [info] Checking replication filtering settings.. Wed Dec 12 11:23:50 2018 - [info] binlog_do_db= , binlog_ignore_db= Wed Dec 12 11:23:50 2018 - [info] Replication filtering check ok. Wed Dec 12 11:23:50 2018 - [info] GTID (with auto-pos) is not supported Wed Dec 12 11:23:50 2018 - [info] Starting SSH connection tests.. Wed Dec 12 11:23:53 2018 - [info] All SSH connection tests passed successfully. Wed Dec 12 11:23:53 2018 - [info] Checking MHA Node version.. Wed Dec 12 11:23:53 2018 - [info] Version check ok. Wed Dec 12 11:23:53 2018 - [info] Checking SSH publickey authentication settings on the current master.. Wed Dec 12 11:23:54 2018 - [info] HealthCheck: SSH to 10.150.20.90 is reachable. Wed Dec 12 11:23:54 2018 - [info] Master MHA Node version is 0.58. Wed Dec 12 11:23:54 2018 - [info] Checking recovery script configurations on 10.150.20.90(10.150.20.90:33061).. Wed Dec 12 11:23:54 2018 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql_33061/logs --output_file=/data/mysql_mha/app1/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000006 Wed Dec 12 11:23:54 2018 - [info] Connecting to root@10.150.20.90(10.150.20.90:22).. Creating /data/mysql_mha/app1 if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /data/mysql_33061/logs, up to mysql-bin.000006 Wed Dec 12 11:23:54 2018 - [info] Binlog setting check done. Wed Dec 12 11:23:54 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Wed Dec 12 11:23:54 2018 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_monitor' --slave_host=10.150.20.97 --slave_ip=10.150.20.97 --slave_port=33061 --workdir=/data/mysql_mha/app1 --target_version=5.7.21-log --manager_version=0.58 --relay_log_info=/data/mysql_33061/logs/relay-log.info --relay_dir=/data/mysql_33061/data/ --slave_pass=xxx Wed Dec 12 11:23:54 2018 - [info] Connecting to root@10.150.20.97(10.150.20.97:22).. Checking slave recovery environment settings.. Opening /data/mysql_33061/logs/relay-log.info ... ok. Relay log found at /data/mysql_33061/logs, up to relaylog.000002 Temporary relay log file is /data/mysql_33061/logs/relaylog.000002 Checking if super_read_only is defined and turned on.. not present or turned off, ignoring. Testing mysql connection and privileges.. mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Wed Dec 12 11:23:54 2018 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_monitor' --slave_host=10.150.20.132 --slave_ip=10.150.20.132 --slave_port=33061 --workdir=/data/mysql_mha/app1 --target_version=5.7.21-log --manager_version=0.58 --relay_log_info=/data/mysql_33061/logs/relay-log.info --relay_dir=/data/mysql_33061/data/ --slave_pass=xxx Wed Dec 12 11:23:54 2018 - [info] Connecting to root@10.150.20.132(10.150.20.132:22).. Checking slave recovery environment settings.. Opening /data/mysql_33061/logs/relay-log.info ... ok. Relay log found at /data/mysql_33061/data, up to cgdb-relay-bin.000002 Temporary relay log file is /data/mysql_33061/data/cgdb-relay-bin.000002 Checking if super_read_only is defined and turned on.. not present or turned off, ignoring. Testing mysql connection and privileges.. mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Wed Dec 12 11:23:55 2018 - [info] Slaves settings check done. Wed Dec 12 11:23:55 2018 - [info] 10.150.20.90(10.150.20.90:33061) (current master) +--10.150.20.97(10.150.20.97:33061) +--10.150.20.132(10.150.20.132:33061) Wed Dec 12 11:23:55 2018 - [info] Checking replication health on 10.150.20.97.. Wed Dec 12 11:23:55 2018 - [info] ok. Wed Dec 12 11:23:55 2018 - [info] Checking replication health on 10.150.20.132.. Wed Dec 12 11:23:55 2018 - [info] ok. Wed Dec 12 11:23:55 2018 - [info] Checking master_ip_failover_script status: Wed Dec 12 11:23:55 2018 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.150.20.90 --orig_master_ip=10.150.20.90 --orig_master_port=33061 Wed Dec 12 11:23:55 2018 - [info] OK. Wed Dec 12 11:23:55 2018 - [warning] shutdown_script is not defined. Wed Dec 12 11:23:55 2018 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK.
启动 mha manager
#nohup masterha_manager --conf=/etc/mysql_mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /data/mysql_mha/app1-manager.log 2>&1 &
查看manater status
# masterha_check_status --conf=/etc/mysql_mha/app1.cnf
app1 (pid:16203) is running(0:PING_OK), master:10.150.20.90
观察manager log
# tail -100f app1-manager.log
Mon Dec 10 16:54:02 2018 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Dec 10 16:54:02 2018 - [info] Reading application default configuration from /etc/mysql_mha/app1.cnf.. Mon Dec 10 16:54:02 2018 - [info] Reading server configuration from /etc/mysql_mha/app1.cnf.. Mon Dec 10 16:54:02 2018 - [info] MHA::MasterMonitor version 0.58. Mon Dec 10 16:54:03 2018 - [info] GTID failover mode = 0 Mon Dec 10 16:54:03 2018 - [info] Dead Servers: Mon Dec 10 16:54:03 2018 - [info] Alive Servers: Mon Dec 10 16:54:03 2018 - [info] 10.150.20.90(10.150.20.90:33061) Mon Dec 10 16:54:03 2018 - [info] 10.150.20.97(10.150.20.97:33061) Mon Dec 10 16:54:03 2018 - [info] 10.150.20.132(10.150.20.132:33061) Mon Dec 10 16:54:03 2018 - [info] Alive Slaves: Mon Dec 10 16:54:03 2018 - [info] 10.150.20.97(10.150.20.97:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled Mon Dec 10 16:54:03 2018 - [info] Replicating from 10.150.20.90(10.150.20.90:33061) Mon Dec 10 16:54:03 2018 - [info] Primary candidate for the new Master (candidate_master is set) Mon Dec 10 16:54:03 2018 - [info] 10.150.20.132(10.150.20.132:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled Mon Dec 10 16:54:03 2018 - [info] Replicating from 10.150.20.90(10.150.20.90:33061) Mon Dec 10 16:54:03 2018 - [info] Current Alive Master: 10.150.20.90(10.150.20.90:33061) Mon Dec 10 16:54:03 2018 - [info] Checking slave configurations.. Mon Dec 10 16:54:03 2018 - [info] Checking replication filtering settings.. Mon Dec 10 16:54:03 2018 - [info] binlog_do_db= , binlog_ignore_db= Mon Dec 10 16:54:03 2018 - [info] Replication filtering check ok. Mon Dec 10 16:54:03 2018 - [info] GTID (with auto-pos) is not supported Mon Dec 10 16:54:03 2018 - [info] Starting SSH connection tests.. Mon Dec 10 16:54:06 2018 - [info] All SSH connection tests passed successfully. Mon Dec 10 16:54:06 2018 - [info] Checking MHA Node version.. Mon Dec 10 16:54:07 2018 - [info] Version check ok. Mon Dec 10 16:54:07 2018 - [info] Checking SSH publickey authentication settings on the current master.. Mon Dec 10 16:54:07 2018 - [info] HealthCheck: SSH to 10.150.20.90 is reachable. Mon Dec 10 16:54:07 2018 - [info] Master MHA Node version is 0.58. Mon Dec 10 16:54:07 2018 - [info] Checking recovery script configurations on 10.150.20.90(10.150.20.90:33061).. Mon Dec 10 16:54:07 2018 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql_33061/logs --output_file=/data/mysql_mha/app1/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000005 Mon Dec 10 16:54:07 2018 - [info] Connecting to root@10.150.20.90(10.150.20.90:22).. Creating /data/mysql_mha/app1 if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /data/mysql_33061/logs, up to mysql-bin.000005 Mon Dec 10 16:54:07 2018 - [info] Binlog setting check done. Mon Dec 10 16:54:07 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Mon Dec 10 16:54:07 2018 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_monitor' --slave_host=10.150.20.97 --slave_ip=10.150.20.97 --slave_port=33061 --workdir=/data/mysql_mha/app1 --target_version=5.7.21-log --manager_version=0.58 --relay_log_info=/data/mysql_33061/logs/relay-log.info --relay_dir=/data/mysql_33061/data/ --slave_pass=xxx Mon Dec 10 16:54:07 2018 - [info] Connecting to root@10.150.20.97(10.150.20.97:22).. Checking slave recovery environment settings.. Opening /data/mysql_33061/logs/relay-log.info ... ok. Relay log found at /data/mysql_33061/logs, up to relaylog.000002 Temporary relay log file is /data/mysql_33061/logs/relaylog.000002 Checking if super_read_only is defined and turned on..Disabling super_read_only, enabling read_only, so that the applying can be done on the slave Testing mysql connection and privileges.. mysql: [Warning] Using a password on the command line interface can be insecure. done. Enabling super_read_only again..Enabling super_read_only again after applying Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Dec 10 16:54:07 2018 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_monitor' --slave_host=10.150.20.132 --slave_ip=10.150.20.132 --slave_port=33061 --workdir=/data/mysql_mha/app1 --target_version=5.7.21-log --manager_version=0.58 --relay_log_info=/data/mysql_33061/logs/relay-log.info --relay_dir=/data/mysql_33061/data/ --slave_pass=xxx Mon Dec 10 16:54:07 2018 - [info] Connecting to root@10.150.20.132(10.150.20.132:22).. Checking slave recovery environment settings.. Opening /data/mysql_33061/logs/relay-log.info ... ok. Relay log found at /data/mysql_33061/data, up to cgdb-relay-bin.000002 Temporary relay log file is /data/mysql_33061/data/cgdb-relay-bin.000002 Checking if super_read_only is defined and turned on..Disabling super_read_only, enabling read_only, so that the applying can be done on the slave Testing mysql connection and privileges.. mysql: [Warning] Using a password on the command line interface can be insecure. done. Enabling super_read_only again..Enabling super_read_only again after applying Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Dec 10 16:54:08 2018 - [info] Slaves settings check done. Mon Dec 10 16:54:08 2018 - [info] 10.150.20.90(10.150.20.90:33061) (current master) +--10.150.20.97(10.150.20.97:33061) +--10.150.20.132(10.150.20.132:33061) Mon Dec 10 16:54:08 2018 - [info] Checking master_ip_failover_script status: Mon Dec 10 16:54:08 2018 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.150.20.90 --orig_master_ip=10.150.20.90 --orig_master_port=33061 IN SCRIPT TEST====systemctl stop keepalived==systemctl start keepalived=== Checking the Status of the script.. OK Mon Dec 10 16:54:08 2018 - [info] OK. Mon Dec 10 16:54:08 2018 - [warning] shutdown_script is not defined. Mon Dec 10 16:54:08 2018 - [info] Set master ping interval 5 seconds. Mon Dec 10 16:54:08 2018 - [info] Set secondary check script: /usr/local/bin/masterha_secondary_check -s ed3jrdba97 -s ed3jrdba90 Mon Dec 10 16:54:08 2018 - [info] Starting ping health check on 10.150.20.90(10.150.20.90:33061).. Mon Dec 10 16:54:08 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
关闭 mha manager
# masterha_stop --conf=/etc/mysql_mha/app1.cnf
Stopped app1 successfully.
参考:
https://blog.csdn.net/ctypyb2002/article/details/81478098
MHA实战篇:https://yq.aliyun.com/articles/58920