  • MHA Setup and Failure Maintenance

    MHA is a simple, convenient, and reliable high-availability architecture for MySQL. I won't go into a detailed introduction here; what follows is a tutorial I found online, with some modifications of my own:

    Outline

    (1) Environment
    (2) Set up passwordless SSH login among the four hosts with ssh-keygen
    (3) Install mha4mysql-node, mha4mysql-manager and the required Perl packages
    (4) Set up master/slave replication among master, slave1 and slave2
    (5) Configure the MHA file on the manager host
    (6) Verify SSH trust login with the masterha_check_ssh tool
    (7) Verify MySQL replication with the masterha_check_repl tool
    (8) Start the MHA manager and monitor the log file
    (9) Test whether an automatic switchover happens after the master goes down

    (1) Environment

    1. Host layout

    Change the hostnames on CentOS 7:

    hostnamectl set-hostname master
    
    192.168.56.121    master
    192.168.56.122    slave1    # standby master
    192.168.56.123    slave2    
    192.168.56.124    manager 

    Map the IPs and hostnames in /etc/hosts on every host.

    Allow the MySQL port through the firewall on each host:

    iptables -I INPUT -s 0/0 -p tcp --dport 3306 -j ACCEPT 

    This rule accepts inbound TCP traffic to port 3306 from any source address. Verify it:

    iptables -L -n|grep 3306
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:3306 

    (2) Set up passwordless SSH login among the four hosts with ssh-keygen

    1. Generate keys

    [master,slave1,slave2,manager]

    ssh-keygen -t rsa 

    [slave1,slave2,manager]

    scp .ssh/id_rsa.pub master:/root/.ssh/slave1.pub     # run on slave1
    scp .ssh/id_rsa.pub master:/root/.ssh/slave2.pub     # run on slave2
    scp .ssh/id_rsa.pub master:/root/.ssh/manager.pub    # run on manager

    2. On master, append all public keys to /root/.ssh/authorized_keys with cat, then push the merged file back to the other hosts

    [master]

    cat ~/.ssh/*.pub>>~/.ssh/authorized_keys
    scp ~/.ssh/authorized_keys slave1:/root/.ssh/authorized_keys 
    scp ~/.ssh/authorized_keys slave2:/root/.ssh/authorized_keys 
    scp ~/.ssh/authorized_keys manager:/root/.ssh/authorized_keys 
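    The merge in step 2 can be exercised locally. This toy sketch runs entirely in a temp directory, so every file name and key string in it is a hypothetical stand-in; it also uses sort -u so that re-running the merge does not duplicate entries:

    ```shell
    #!/bin/sh
    # Toy illustration of the authorized_keys merge step above, run in a
    # temp directory -- every file and key string is a hypothetical stand-in.
    set -e
    workdir=$(mktemp -d)
    cd "$workdir"

    # Stand-ins for the public keys copied over from slave1, slave2 and
    # manager, plus master's own id_rsa.pub.
    echo "ssh-rsa AAAA...slave1 root@slave1"   > slave1.pub
    echo "ssh-rsa AAAA...slave2 root@slave2"   > slave2.pub
    echo "ssh-rsa AAAA...manager root@manager" > manager.pub
    echo "ssh-rsa AAAA...master root@master"   > id_rsa.pub

    # Merge every public key into authorized_keys; sort -u keeps re-runs
    # from appending duplicate entries (a plain >> would).
    cat ./*.pub | sort -u > authorized_keys
    chmod 600 authorized_keys

    grep -c "ssh-rsa" authorized_keys   # prints 4
    ```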

    (3) Install the mha4mysql-node and mha4mysql-manager packages

    1. Install mha4mysql-node

    [manager,master,slave1,slave2]
    
    yum -y install perl-DBD-MySQL
    yum -y install perl-Config-Tiny
    yum -y install perl-Log-Dispatch
    yum -y install perl-Parallel-ForkManager
    rpm -ivh mha4mysql-node-0.55-0.el6.noarch.rpm 

    2.安装mha4mysql-manager

    [manager]
    
     yum -y install perl
     yum -y install cpan
     rpm -ivh mha4mysql-manager-0.55-0.el6.noarch.rpm 

    If anything else is missing, just yum install it.

    (4) Set up master/slave replication among master, slave1 and slave2
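    The original post gives no commands for this step. Below is a hedged sketch of a typical MySQL 5.6-era setup for this topology; the binlog file name and position are placeholders, and note that the post's app1.cnf says repl_user=rep while its later change master statements use repl — whichever account you create, the names must actually match:

    ```sql
    -- Hedged sketch only: the original post omits this section's commands.
    -- On master (192.168.56.121): create the replication account.
    -- User/password must match repl_user/repl_password in app1.cnf (step 5).
    GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.56.%' IDENTIFIED BY 'repl';
    FLUSH PRIVILEGES;
    SHOW MASTER STATUS;   -- note the File and Position values

    -- On slave1 and slave2: point at master and start replicating.
    -- The log file name and position below are placeholders taken from
    -- SHOW MASTER STATUS on the master.
    CHANGE MASTER TO
      MASTER_HOST='192.168.56.121',
      MASTER_USER='repl',
      MASTER_PASSWORD='repl',
      MASTER_LOG_FILE='mysql-bin.000001',
      MASTER_LOG_POS=120;
    START SLAVE;
    SHOW SLAVE STATUS\G   -- Slave_IO_Running / Slave_SQL_Running should both be Yes
    ```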

    (5) Configure the MHA file on the manager host

    [manager] 

    1. Create the working directories and the config file

    mkdir -p /masterha/app1
    mkdir /etc/masterha
    vi /etc/masterha/app1.cnf
    
    [server default]
    user=root
    password=root
    manager_workdir=/masterha/app1
    manager_log=/masterha/app1/manager.log
    remote_workdir=/masterha/app1
    ssh_user=root
    repl_user=rep
    repl_password=repl
    ping_interval=1
    [server1]
    hostname=192.168.56.122
    master_binlog_dir=/var/lib/mysql
    candidate_master=1
    #relay_log_purge=0
    [server2]
    hostname=192.168.56.121
    master_binlog_dir=/var/lib/mysql
    candidate_master=1
    [server3]
    hostname=192.168.56.123
    master_binlog_dir=/var/lib/mysql
    no_master=1
    #relay_log_purge=0 
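    For ad-hoc scripting (ping or SSH sweeps over the cluster) it can help to pull the host list out of this file. A small sketch follows; it writes an abridged copy of the app1.cnf above to a temp file purely so the example is self-contained:

    ```shell
    #!/bin/sh
    # Sketch: extract the hostname= entries from an MHA config file.
    # An abridged copy of the app1.cnf above is used for illustration.
    set -e
    cnf=$(mktemp)
    cat > "$cnf" <<'EOF'
    [server default]
    user=root
    ping_interval=1
    [server1]
    hostname=192.168.56.122
    candidate_master=1
    [server2]
    hostname=192.168.56.121
    candidate_master=1
    [server3]
    hostname=192.168.56.123
    no_master=1
    EOF

    # Print every configured host, one per line.
    awk -F= '$1 == "hostname" { print $2 }' "$cnf"
    ```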

    (6) Verify SSH trust login with the masterha_check_ssh tool

    [manager]
    
    masterha_check_ssh --conf=/etc/masterha/app1.cnf
    [root@manager ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
    Thu Feb 23 12:00:24 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Thu Feb 23 12:00:24 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
    Thu Feb 23 12:00:24 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf..
    Thu Feb 23 12:00:24 2017 - [info] Starting SSH connection tests..
    Thu Feb 23 12:00:25 2017 - [debug] 
    Thu Feb 23 12:00:24 2017 - [debug]  Connecting via SSH from root@192.168.56.122(192.168.56.122:22) to root@192.168.56.121(192.168.56.121:22)..
    Thu Feb 23 12:00:25 2017 - [debug]   ok.
    Thu Feb 23 12:00:25 2017 - [debug]  Connecting via SSH from root@192.168.56.122(192.168.56.122:22) to root@192.168.56.123(192.168.56.123:22)..
    Thu Feb 23 12:00:25 2017 - [debug]   ok.
    Thu Feb 23 12:00:25 2017 - [debug] 
    Thu Feb 23 12:00:25 2017 - [debug]  Connecting via SSH from root@192.168.56.121(192.168.56.121:22) to root@192.168.56.122(192.168.56.122:22)..
    Warning: Permanently added '192.168.56.121' (ECDSA) to the list of known hosts.
    Thu Feb 23 12:00:25 2017 - [debug]   ok.
    Thu Feb 23 12:00:25 2017 - [debug]  Connecting via SSH from root@192.168.56.121(192.168.56.121:22) to root@192.168.56.123(192.168.56.123:22)..
    Thu Feb 23 12:00:25 2017 - [debug]   ok.
    Thu Feb 23 12:00:26 2017 - [debug] 
    Thu Feb 23 12:00:25 2017 - [debug]  Connecting via SSH from root@192.168.56.123(192.168.56.123:22) to root@192.168.56.122(192.168.56.122:22)..
    Warning: Permanently added '192.168.56.123' (ECDSA) to the list of known hosts.
    Thu Feb 23 12:00:26 2017 - [debug]   ok.
    Thu Feb 23 12:00:26 2017 - [debug]  Connecting via SSH from root@192.168.56.123(192.168.56.123:22) to root@192.168.56.121(192.168.56.121:22)..
    Thu Feb 23 12:00:26 2017 - [debug]   ok.
    Thu Feb 23 12:00:26 2017 - [info] All SSH connection tests passed successfully.
    [root@manager ~]# 

    (7) Verify MySQL replication with the masterha_check_repl tool

    [manager]
    
    masterha_check_repl --conf=/etc/masterha/app1.cnf
    
    [root@manager mysql]# masterha_check_repl --conf=/etc/masterha/app1.cnf
    Thu Feb 23 14:37:05 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Thu Feb 23 14:37:05 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
    Thu Feb 23 14:37:05 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf..
    Thu Feb 23 14:37:05 2017 - [info] MHA::MasterMonitor version 0.55.
    Thu Feb 23 14:37:05 2017 - [info] Dead Servers:
    Thu Feb 23 14:37:05 2017 - [info] Alive Servers:
    Thu Feb 23 14:37:05 2017 - [info]   master(192.168.56.121:3306)
    Thu Feb 23 14:37:05 2017 - [info]   slave1(192.168.56.122:3306)
    Thu Feb 23 14:37:05 2017 - [info]   slave2(192.168.56.123:3306)
    Thu Feb 23 14:37:05 2017 - [info] Alive Slaves:
    
    ....... (output truncated)
    
    Thu Feb 23 14:37:08 2017 - [info]   Connecting to root@192.168.56.123(slave2:22).. 
    Creating directory /masterha/app1.. done.
      Checking slave recovery environment settings..
        Opening /var/lib/mysql/relay-log.info ... ok.
        Relay log found at /tmp, up to mysql-relay-bin.000004
        Temporary relay log file is /tmp/mysql-relay-bin.000004
        Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
     done.
        Testing mysqlbinlog output.. done.
        Cleaning up test file(s).. done.
    Thu Feb 23 14:37:08 2017 - [info] Slaves settings check done.
    Thu Feb 23 14:37:08 2017 - [info] 
    master (current master)
     +--slave1
     +--slave2
    
    Thu Feb 23 14:37:08 2017 - [info] Checking replication health on slave1..
    Thu Feb 23 14:37:08 2017 - [info]  ok.
    Thu Feb 23 14:37:08 2017 - [info] Checking replication health on slave2..
    Thu Feb 23 14:37:08 2017 - [info]  ok.
    Thu Feb 23 14:37:08 2017 - [warning] master_ip_failover_script is not defined.
    Thu Feb 23 14:37:08 2017 - [warning] shutdown_script is not defined.
    Thu Feb 23 14:37:08 2017 - [info] Got exit code 0 (Not master dead).
    
    MySQL Replication Health is OK. 

    (8) Start the MHA manager and monitor the log file

    [manager]
    
    masterha_manager --conf=/etc/masterha/app1.cnf 
    tail -f /masterha/app1/manager.log 

    (9) Test whether an automatic switchover happens after the master goes down

    1. Stop the MySQL service on master

    [master]
    
    [root@master ~]# service mysql stop
    Shutting down MySQL..... SUCCESS! 
    [root@master ~]# 
    [manager] 

    2. After the master goes down, the log file /masterha/app1/manager.log shows:

    tail -f /masterha/app1/manager.log 

    ----- Failover Report -----
    
    app1: MySQL Master failover master to slave1 succeeded
    
    Master master is down!
    
    Check MHA Manager logs at manager:/masterha/app1/manager.log for details.
    
    Started automated(non-interactive) failover.
    The latest slave slave1(192.168.56.122:3306) has all relay logs for recovery.
    Selected slave1 as a new master.
    slave1: OK: Applying all logs succeeded.
    slave2: This host has the latest relay log events.
    Generating relay diff files from the latest slave succeeded.
    slave2: OK: Applying all logs succeeded. Slave started, replicating from slave1.
    slave1: Resetting slave info succeeded.
    Master failover to slave1(192.168.56.122:3306) completed successfully. 
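    Scripts (alerting hooks, for instance) can detect a finished failover by scanning the manager log for the report's closing line. A sketch follows; the sample log content is copied from the report above into a temp file so the example is self-contained:

    ```shell
    #!/bin/sh
    # Sketch: detect a completed failover by scanning the manager log.
    # The sample content is copied from the Failover Report above.
    set -e
    log=$(mktemp)
    cat > "$log" <<'EOF'
    ----- Failover Report -----
    app1: MySQL Master failover master to slave1 succeeded
    Master master is down!
    Master failover to slave1(192.168.56.122:3306) completed successfully.
    EOF

    if grep -q "completed successfully" "$log"; then
        echo "failover completed"
    fi
    # prints: failover completed
    ```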

    The output above shows the master was switched over to slave1 successfully.

    Points to watch during a switchover

    1. The switchover automatically turns read_only off on the new master.

    2. After a switchover you must manually delete /masterha/app1/app1.failover.complete before a second failover test can run.

    3. Once a switchover happens the manager process exits, so no further test is possible until the failed database is brought back into the MHA setup.

    4. When the original master rejoins MHA it can only come back as a slave. Before running

    change master to master_host='192.168.56.122',
    master_user='repl',
    master_password='repl',
    master_log_file='mysql-bin.000010',
    master_log_pos=120; 

    you must first run reset slave.

    5. There are several ways to take over the IP address. Here MHA manages an IP alias (VIP) itself, which keeps the database state and the business IP consistent during a switchover: once the manager starts, the VIP is automatically aliased onto the current master. Keepalived, by contrast, can only health-check port 3306; it cannot check the Slave_SQL and Slave_IO replication threads, so it is prone to false switchover decisions.

    6. Note: second-tier slaves must have log_slave_updates enabled.

    7. A manual switchover requires the master_ip_online_change_script to be defined first, otherwise only MySQL is switched and the VIP is not rebound; the script can be adapted from the bundled template.

    8. Setting no_master=1 keeps a node from ever being chosen as the new master.

    Restoring the cluster

    ① On the manager, delete the app1.failover.complete file

    cd /masterha/app1
    rm -f app1.failover.complete 

    ② Start the MySQL service on the original master

    service mysql start 

    ③ On the manager node, check replication; it reports an error:

    masterha_check_repl --conf=/etc/masterha/app1.cnf
    Thu Feb 23 15:00:56 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln604] There are 2 non-slave servers! MHA manages at most one non-slave server. Check configurations. 

    ④ Check the status on the new master, slave1

    mysql> show master status\G
    *************************** 1. row ***************************
                 File: mysql-bin.000010
             Position: 120
         Binlog_Do_DB: 
     Binlog_Ignore_DB: 
    Executed_Gtid_Set: 
    1 row in set (0.00 sec) 

    ⑤ Configure the original master node (192.168.56.121) as a new slave and start replication

    change master to master_host='192.168.56.122',
    master_user='repl',
    master_password='repl',
    master_log_file='mysql-bin.000010',
    master_log_pos=120;
    mysql> start slave; 

    Check replication again on the manager node; it now succeeds:

    masterha_check_repl --conf=/etc/masterha/app1.cnf 

    Note: after the steps above, node 121 has rejoined the cluster as a slave, but the data written to 122 and 123 while it was down is missing on 121, so you should first restore a fresh backup from the new master and only then start replication.

    ⑥ Start MHA
    nohup masterha_manager --conf=/etc/masterha/app1.cnf > /masterha/app1/mha_manager.log 2>&1 &

    Switching back:
    By the same logic, if the configuration above is correct, stopping the MySQL process on the current master lets MHA switch the master straight back to the original node.

     Source: http://blog.itpub.net/24742969/viewspace-2134542/

  • Original post: https://www.cnblogs.com/fyc119/p/7529897.html