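  • MySQL High-Availability Architecture: MHA

    A simple MHA setup for MySQL high availability

    Reference: http://www.cnblogs.com/rayment/p/7355093.html (if there is any copyright concern, please contact me and this will be removed)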

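    Introduction:

    MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. It was developed by Yoshinori Matsunobu at DeNA in Japan (he now works at Facebook) and handles failover and slave promotion in a MySQL high-availability environment. When the master fails, MHA can complete the database failover automatically within 0 to 30 seconds, and it preserves data consistency as far as possible while doing so.

    The software has two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage several master-slave clusters, or on one of the slaves. MHA Node runs on every MySQL server. MHA Manager probes the master at regular intervals; when the master fails, it automatically promotes the slave with the most recent data to be the new master and repoints all other slaves to it. The whole failover is transparent to the application.

    During automatic failover MHA tries to save the binary logs from the crashed master so that as little data as possible is lost, but this is not always possible. If the master has a hardware failure or cannot be reached over ssh, MHA cannot save the binary logs; it fails over anyway and the most recent data is lost. Semi-synchronous replication, available since MySQL 5.5, greatly reduces this risk, and MHA can be combined with it. As long as one slave has received the latest binary log events, MHA can apply them to all other slaves and so keep every node consistent.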

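    MHA currently targets a one-master, multi-slave architecture. A replication cluster needs at least three database servers to run MHA: one master, one candidate master and one further slave. Because three servers are a cost concern, Taobao built a modified version on top of MHA, and Taobao's TMHA now supports one master with one slave. For a quicker setup, see "MHA Quick Setup" (MHA快速搭建).

    You can also run MHA yourself with one master and one slave, but then failover and binlog recovery are not possible if the master host itself goes down. If only the mysqld process on the master crashes, failover still succeeds and the binlog can still be recovered.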

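    Official introduction: https://code.google.com/p/mysql-master-ha/

    Figure 01 shows how MHA Manager manages several master-slave replication groups. The way MHA works can be summarized as follows:

    ( Figure 01: MHA Manager managing multiple master-slave replication groups )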

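    (1) Save the binary log events (binlog events) from the crashed master;

    (2) Identify the slave that holds the latest updates;

    (3) Apply the differential relay log events to the other slaves;

    (4) Apply the binary log events saved from the master;

    (5) Promote one slave to be the new master;

    (6) Point the other slaves at the new master and resume replication.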
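Step (2) comes down to comparing how far each slave has read in the master's binary log. The following is only a rough sketch of that comparison, not MHA's own code (MHA does this internally in Perl). The function name `latest_slave` and the three-column input format are assumptions made for this example; the values would come from Master_Log_File and Read_Master_Log_Pos in SHOW SLAVE STATUS on each slave.

```shell
# Hypothetical helper: reads lines of "host master_log_file read_master_log_pos"
# on stdin and prints the host that has received the most binlog data.
latest_slave() {
    awk '
        # A later binlog file wins; within the same file the higher position wins.
        NR == 1 || $2 > file || ($2 == file && $3 + 0 > pos) {
            host = $1; file = $2; pos = $3 + 0
        }
        END { print host }
    '
}

# Sample values only; real ones come from SHOW SLAVE STATUS on each slave.
printf '%s\n' \
    '192.168.134.2   mariadb-bin.000002 245' \
    '192.168.134.194 mariadb-bin.000002 531' | latest_slave
```

The file names are compared as strings, which works because binlog file names carry a zero-padded numeric suffix.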

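    MHA consists of two packages, the Manager toolkit and the Node toolkit, described below.

    The Manager toolkit mainly contains these tools:

    masterha_check_ssh              checks the SSH configuration used by MHA
    masterha_check_repl             checks the state of MySQL replication
    masterha_manager                starts MHA
    masterha_check_status           reports the current running state of MHA
    masterha_master_monitor         detects whether the master is down
    masterha_master_switch          controls failover (automatic or manual)
    masterha_conf_host              adds or removes server entries in the configuration

    The Node toolkit (these tools are normally triggered by MHA Manager scripts and need no manual operation) mainly contains these tools:

    save_binary_logs                saves and copies the master's binary logs
    apply_diff_relay_logs           identifies differential relay log events and applies them to the other slaves
    filter_mysqlbinlog              strips unnecessary ROLLBACK events (no longer used by MHA)
    purge_relay_logs                purges relay logs (without blocking the SQL thread)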

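    Note:

    To lose as little data as possible when the master goes down with a hardware failure, it is recommended (but not required) to configure MySQL 5.5 semi-synchronous replication together with MHA. Please look up how semi-synchronous replication works yourself.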

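    1. Deploy MHA

    Next, deploy MHA. The environment is listed below. All hosts run CentOS 7 (1804), although that is not required, and slave1 and slave2 replicate from the master. Setting up replication is shown briefly later on; securing replication is not covered in detail, so see the earlier article "Issues to watch for in MySQL Replication" if you need it.

    Role                    IP address                   server_id        Purpose
    Monitor host            192.168.134.191              -                monitors the replication group
    master                  192.168.134.193              1                writes
    slave1                  192.168.134.2                2                reads
    slave2                  192.168.134.194              3                reads

    The master serves writes and the slaves serve reads. If the master goes down, a candidate master (slave1 or slave2) is promoted to be the new master and the other slave is repointed to it.

    The RPM packages were downloaded in advance, so install them from the directory that contains them: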

    yum install mha4mysql-*

    List what the manager package provides:

    [root@c1 ~]# rpm -ql mha4mysql-manager
    /usr/bin/masterha_check_repl
    /usr/bin/masterha_check_ssh
    /usr/bin/masterha_check_status
    /usr/bin/masterha_conf_host
    /usr/bin/masterha_manager
    /usr/bin/masterha_master_monitor
    /usr/bin/masterha_master_switch
    /usr/bin/masterha_secondary_check
    /usr/bin/masterha_stop
    /usr/share/man/man1/masterha_check_repl.1.gz
    /usr/share/man/man1/masterha_check_ssh.1.gz
    /usr/share/man/man1/masterha_check_status.1.gz
    /usr/share/man/man1/masterha_conf_host.1.gz
    /usr/share/man/man1/masterha_manager.1.gz
    /usr/share/man/man1/masterha_master_monitor.1.gz
    /usr/share/man/man1/masterha_master_switch.1.gz
    /usr/share/man/man1/masterha_secondary_check.1.gz
    /usr/share/man/man1/masterha_stop.1.gz
    /usr/share/perl5/vendor_perl/MHA/Config.pm
    /usr/share/perl5/vendor_perl/MHA/DBHelper.pm
    /usr/share/perl5/vendor_perl/MHA/FileStatus.pm
    /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm
    /usr/share/perl5/vendor_perl/MHA/ManagerAdmin.pm
    /usr/share/perl5/vendor_perl/MHA/ManagerAdminWrapper.pm
    /usr/share/perl5/vendor_perl/MHA/ManagerConst.pm
    /usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm
    /usr/share/perl5/vendor_perl/MHA/MasterFailover.pm
    /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm
    /usr/share/perl5/vendor_perl/MHA/MasterRotate.pm
    /usr/share/perl5/vendor_perl/MHA/SSHCheck.pm
    /usr/share/perl5/vendor_perl/MHA/Server.pm
    /usr/share/perl5/vendor_perl/MHA/ServerManager.pm

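    2. Configure passwordless SSH login (key-based login, as commonly used at work). My test environment already uses key-based login, so the servers need no password between each other, and I will not repeat how to set that up. One point to watch: do not disable password login, otherwise errors occur.

    3. Set up master-slave replication

    Note: the binlog-do-db and replicate-ignore-db settings must be identical on all servers. MHA checks the replication filtering rules at startup; if they differ between servers, MHA does not start monitoring or failover.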

    (1) Take a backup on the master (192.168.134.193):

     mysqldump -A -B --master-data=2 --events > all.sql

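    --master-data[=#]   requires binary logging to be enabled.
                        1: a live (uncommented) CHANGE MASTER TO statement is written before the backed-up data; this is the default when no value is given.
                        2: the CHANGE MASTER TO statement is written as a comment.
    --events            also backs up all event scheduler events.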
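Because the dump was taken with --master-data=2, the binlog coordinates are recorded in it as a commented CHANGE MASTER TO line. As a small sketch, they could be pulled out like this; the function name `dump_coords` is made up for this example, and the sample line mirrors the usual mysqldump header rather than output captured from this environment.

```shell
# Hypothetical helper: reads a dump on stdin and prints "<binlog file> <position>"
# taken from the first commented CHANGE MASTER TO line.
dump_coords() {
    grep -m 1 '^-- CHANGE MASTER TO' |
        sed -E "s/.*MASTER_LOG_FILE='([^']+)', *MASTER_LOG_POS=([0-9]+);.*/\1 \2/"
}

# Against the real dump this would be: dump_coords < all.sql
# Demonstrated here on a sample header:
printf '%s\n' \
    '-- MySQL dump' \
    "-- CHANGE MASTER TO MASTER_LOG_FILE='mariadb-bin.000002', MASTER_LOG_POS=245;" |
    dump_coords
```

The two values are the ones to pass as MASTER_LOG_FILE and MASTER_LOG_POS when the slaves are configured.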

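    (2) Create the monitoring user and the replication user (on the master, so that the grants replicate to the slaves):

    mysql> grant all on *.* to mhauser@'192.168.134.%' identified by 'centos';  -- monitoring user
    mysql> grant replication slave on *.* to repluser@'192.168.134.%' identified by 'centos';  -- replication user

    (3) Look up the binlog file name and position on the master at backup time; these become MASTER_LOG_FILE and MASTER_LOG_POS: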

    MariaDB [(none)]> show master logs;
    +--------------------+-----------+
    | Log_name           | File_size |
    +--------------------+-----------+
    | mariadb-bin.000001 |      8749 |
    | mariadb-bin.000002 |       245 |
    +--------------------+-----------+

    (4) Copy the backup to slave1 and slave2:

    scp all.sql 192.168.134.2:
    scp all.sql 192.168.134.194:

    (5) Import all.sql on each slave and point the slave at the master:

    mysql < all.sql
    mysql> CHANGE MASTER TO
      MASTER_HOST='192.168.134.193',
      MASTER_USER='repluser',
      MASTER_PASSWORD='centos',
      MASTER_PORT=3306,
      MASTER_LOG_FILE='mariadb-bin.000002',
      MASTER_LOG_POS=245;
    mysql> START SLAVE;
    Check the status:
    MariaDB [(none)]> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: 192.168.134.193
                      Master_User: repluser
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mariadb-bin.000002
              Read_Master_Log_Pos: 245
                   Relay_Log_File: mariadb-relay-bin.000002
                    Relay_Log_Pos: 531
            Relay_Master_Log_File: mariadb-bin.000002
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
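The two fields that matter in this output are Slave_IO_Running and Slave_SQL_Running; both must be Yes. A minimal sketch of checking that in a script follows. The function name `slave_threads_ok` is an assumption for this example, and it only inspects text on stdin, so the sample lines stand in for real output.

```shell
# Hypothetical helper: reads SHOW SLAVE STATUS\G output on stdin and
# succeeds only if both replication threads report "Yes".
slave_threads_ok() {
    awk -F': *' '
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io ~ /^Yes/ && sql ~ /^Yes/) }
    '
}

# On a slave this would be: mysql -e 'SHOW SLAVE STATUS\G' | slave_threads_ok
# Demonstrated here on sample lines:
printf '%s\n' \
    '             Slave_IO_Running: Yes' \
    '            Slave_SQL_Running: Yes' | slave_threads_ok && echo "replication threads running"
```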
    

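    (6) Set read_only on both slaves, since they only serve reads. Because a slave can be promoted to master at any time, read_only is often set at runtime rather than in the configuration file; the configuration used here is:

    vim /etc/my.cnf
    [mysqld]
    # disable host name resolution (recommended)
    skip_name_resolve
    # use one tablespace file per InnoDB table
    innodb_file_per_table
    read_only
    # keep relay logs instead of deleting them automatically; clean them up periodically with purge_relay_logs
    relay_log_purge=0
    # 3 on slave2, 2 on slave1
    server_id=3
    log_bin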
    4. Configure MHA

    (1) Create the MHA working directory and the configuration file, here /etc/mastermha/app1.cnf (the unpacked software package contains sample configuration files).

    [server default]
    user=mhauser
    password=centos
    # working directory of the manager
    manager_workdir=/data/mastermha/app1/
    # log file of the manager
    manager_log=/data/mastermha/app1/manager.log
    # working directory on each MySQL server where MHA Node writes its files
    remote_workdir=/data/mastermha/app1/
    # SSH login user
    ssh_user=root
    # replication user and password
    repl_user=repluser
    repl_password=centos
    # seconds between pings to the master (default 3); after three attempts without a reply, failover starts automatically
    ping_interval=1
    [server1]
    hostname=192.168.134.193
    candidate_master=1
    [server2]
    hostname=192.168.134.2
    # candidate master: on failover this slave is promoted even if it is not the slave with the most recent events
    candidate_master=1
    [server3]
    hostname=192.168.134.194
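Every server in the replication group needs its own [serverN] entry; MHA reports an error when a slave replicates from a host that is not listed. As a quick sanity check, the configured hosts can be listed and compared with the environment table. This is only a sketch, and the function name `conf_hosts` is made up for this example.

```shell
# Hypothetical helper: prints the value of every hostname= line
# of an MHA application configuration read from stdin.
conf_hosts() {
    sed -n 's/^[[:space:]]*hostname[[:space:]]*=[[:space:]]*//p'
}

# Against the real file this would be: conf_hosts < /etc/mastermha/app1.cnf
# Demonstrated here on an inline copy of the server sections:
conf_hosts <<'EOF'
[server1]
hostname=192.168.134.193
candidate_master=1
[server2]
hostname=192.168.134.2
candidate_master=1
[server3]
hostname=192.168.134.194
EOF
```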

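    Note:

    During failover MHA relies on relay log information to recover the slaves, so automatic relay log purging must be set to OFF and the relay logs purged manually. By default a slave deletes its relay logs automatically once the SQL thread has executed them. In an MHA environment those relay logs may still be needed to recover other slaves, so automatic deletion has to be disabled. Periodic purging has to take replication lag into account: on an ext3 file system, deleting a large file takes some time and can cause serious replication delay. To avoid that, a hard link to the relay log is created temporarily, because on Linux removing a large file through a hard link is fast. (The same hard-link approach is commonly used when dropping large tables in MySQL.)

    5. Check the SSH configuration

    Check the SSH connections from MHA Manager to all MHA Nodes: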
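The periodic cleanup is usually done by running purge_relay_logs from cron on each slave. The sketch below only prints a suggested crontab line. The function name `purge_cron_line`, the 04:MM schedule, the log path and the /usr/bin install path are assumptions for this example; the work directory is passed in because it should sit on the same file system as the relay logs so that hard links can be created.

```shell
# Hypothetical helper: prints a crontab line that runs purge_relay_logs daily at 04:MM.
# $1 = minute (use a different one per slave), $2 = work directory for the hard links.
purge_cron_line() {
    local minute=$1 workdir=$2
    printf '%s 4 * * * /usr/bin/purge_relay_logs --user=mhauser --password=centos --disable_relay_log_purge --workdir=%s >> /var/log/purge_relay_logs.log 2>&1\n' \
        "$minute" "$workdir"
}

purge_cron_line 0  /var/lib/mysql    # suggested entry for slave1
purge_cron_line 20 /var/lib/mysql    # suggested entry for slave2
```

Staggering the minute keeps the slaves from purging at the same moment, so at least one of them still holds its relay logs if a failover happens during the cleanup.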

    [root@c1 ~]# masterha_check_ssh --conf=/etc/mastermha/app1.cnf
    Mon Aug  6 18:02:58 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Mon Aug  6 18:02:58 2018 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Mon Aug  6 18:02:58 2018 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
    Mon Aug  6 18:02:58 2018 - [info] Starting SSH connection tests..
    Mon Aug  6 18:03:01 2018 - [debug]
    Mon Aug  6 18:02:59 2018 - [debug]  Connecting via SSH from root@192.168.134.194(192.168.134.194:22) to root@192.168.134.193(192.168.134.193:22)..
    Mon Aug  6 18:03:00 2018 - [debug]   ok.
    Mon Aug  6 18:03:00 2018 - [debug]  Connecting via SSH from root@192.168.134.194(192.168.134.194:22) to root@192.168.134.2(192.168.134.2:22)..
    Mon Aug  6 18:03:01 2018 - [debug]   ok.
    Mon Aug  6 18:03:06 2018 - [debug]
    Mon Aug  6 18:02:58 2018 - [debug]  Connecting via SSH from root@192.168.134.2(192.168.134.2:22) to root@192.168.134.193(192.168.134.193:22)..
    Mon Aug  6 18:03:04 2018 - [debug]   ok.
    Mon Aug  6 18:03:04 2018 - [debug]  Connecting via SSH from root@192.168.134.2(192.168.134.2:22) to root@192.168.134.194(192.168.134.194:22)..
    Warning: Permanently added '192.168.134.194' (ECDSA) to the list of known hosts.
    Mon Aug  6 18:03:05 2018 - [debug]   ok.
    Mon Aug  6 18:03:10 2018 - [debug]
    Mon Aug  6 18:02:58 2018 - [debug]  Connecting via SSH from root@192.168.134.193(192.168.134.193:22) to root@192.168.134.2(192.168.134.2:22)..
    Warning: Permanently added '192.168.134.2' (ECDSA) to the list of known hosts.
    Mon Aug  6 18:03:09 2018 - [debug]   ok.
    Mon Aug  6 18:03:09 2018 - [debug]  Connecting via SSH from root@192.168.134.193(192.168.134.193:22) to root@192.168.134.194(192.168.134.194:22)..
    Warning: Permanently added '192.168.134.194' (ECDSA) to the list of known hosts.
    Mon Aug  6 18:03:10 2018 - [debug]   ok.
    Mon Aug  6 18:03:10 2018 - [info] All SSH connection tests passed successfully.
    

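    6. Check the state of the whole replication environment.

    The first run below fails because 192.168.134.193 is itself still replicating from 192.168.134.191, a server that is not defined in the configuration file. The second run, made after that was corrected, passes.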

    [root@c1 ~]# masterha_check_repl --conf=/etc/mastermha/app1.cnf
    Mon Aug  6 18:11:54 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Mon Aug  6 18:11:54 2018 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Mon Aug  6 18:11:54 2018 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
    Mon Aug  6 18:11:54 2018 - [info] MHA::MasterMonitor version 0.56.
    Mon Aug  6 18:11:55 2018 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln671] Master 192.168.134.191:3306 from which slave 192.168.134.193(192.168.134.193:3306) replicates is not defined in the configuration file!
    Mon Aug  6 18:11:55 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 326.
    Mon Aug  6 18:11:55 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
    Mon Aug  6 18:11:55 2018 - [info] Got exit code 1 (Not master dead).
    
    MySQL Replication Health is NOT OK!
    [root@c1 ~]# masterha_check_repl --conf=/etc/mastermha/app1.cnf
    Mon Aug  6 18:13:27 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Mon Aug  6 18:13:27 2018 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Mon Aug  6 18:13:27 2018 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
    Mon Aug  6 18:13:27 2018 - [info] MHA::MasterMonitor version 0.56.
    Mon Aug  6 18:13:29 2018 - [info] GTID failover mode = 0
    Mon Aug  6 18:13:29 2018 - [info] Dead Servers:
    Mon Aug  6 18:13:29 2018 - [info] Alive Servers:
    Mon Aug  6 18:13:29 2018 - [info]   192.168.134.193(192.168.134.193:3306)
    Mon Aug  6 18:13:29 2018 - [info]   192.168.134.2(192.168.134.2:3306)
    Mon Aug  6 18:13:29 2018 - [info]   192.168.134.194(192.168.134.194:3306)
    Mon Aug  6 18:13:29 2018 - [info] Alive Slaves:
    Mon Aug  6 18:13:29 2018 - [info]   192.168.134.2(192.168.134.2:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled
    Mon Aug  6 18:13:29 2018 - [info]     Replicating from 192.168.134.193(192.168.134.193:3306)
    Mon Aug  6 18:13:29 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
    Mon Aug  6 18:13:29 2018 - [info]   192.168.134.194(192.168.134.194:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled
    Mon Aug  6 18:13:29 2018 - [info]     Replicating from 192.168.134.193(192.168.134.193:3306)
    Mon Aug  6 18:13:29 2018 - [info] Current Alive Master: 192.168.134.193(192.168.134.193:3306)
    Mon Aug  6 18:13:29 2018 - [info] Checking slave configurations..
    Mon Aug  6 18:13:29 2018 - [info] Checking replication filtering settings..
    Mon Aug  6 18:13:29 2018 - [info]  binlog_do_db= , binlog_ignore_db=
    Mon Aug  6 18:13:29 2018 - [info]  Replication filtering check ok.
    Mon Aug  6 18:13:29 2018 - [info] GTID (with auto-pos) is not supported
    Mon Aug  6 18:13:29 2018 - [info] Starting SSH connection tests..
    Mon Aug  6 18:13:47 2018 - [info] All SSH connection tests passed successfully.
    Mon Aug  6 18:13:47 2018 - [info] Checking MHA Node version..
    Mon Aug  6 18:13:49 2018 - [info]  Version check ok.
    Mon Aug  6 18:13:49 2018 - [info] Checking SSH publickey authentication settings on the current master..
    Mon Aug  6 18:13:49 2018 - [info] HealthCheck: SSH to 192.168.134.193 is reachable.
    Mon Aug  6 18:13:50 2018 - [info] Master MHA Node version is 0.56.
    Mon Aug  6 18:13:50 2018 - [info] Checking recovery script configurations on 192.168.134.193(192.168.134.193:3306)..
    Mon Aug  6 18:13:50 2018 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/data/mastermha/app1//save_binary_logs_test --manager_version=0.56 --start_file=mariadb-bin.000001
    Mon Aug  6 18:13:50 2018 - [info]   Connecting to root@192.168.134.193(192.168.134.193:22)..
      Creating /data/mastermha/app1 if not exists.. Creating directory /data/mastermha/app1.. done.
       ok.
      Checking output directory is accessible or not..
       ok.
      Binlog found at /var/lib/mysql, up to mariadb-bin.000001
    Mon Aug  6 18:13:50 2018 - [info] Binlog setting check done.
    Mon Aug  6 18:13:50 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
    Mon Aug  6 18:13:50 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=192.168.134.2 --slave_ip=192.168.134.2 --slave_port=3306 --workdir=/data/mastermha/app1/ --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
    Mon Aug  6 18:13:50 2018 - [info]   Connecting to root@192.168.134.2(192.168.134.2:22)..
    Creating directory /data/mastermha/app1/.. done.
      Checking slave recovery environment settings..
        Opening /var/lib/mysql/relay-log.info ... ok.
        Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000002
        Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000002
        Testing mysql connection and privileges.. done.
        Testing mysqlbinlog output.. done.
        Cleaning up test file(s).. done.
    Mon Aug  6 18:13:56 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=192.168.134.194 --slave_ip=192.168.134.194 --slave_port=3306 --workdir=/data/mastermha/app1/ --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
    Mon Aug  6 18:13:56 2018 - [info]   Connecting to root@192.168.134.194(192.168.134.194:22)..
    Creating directory /data/mastermha/app1/.. done.
      Checking slave recovery environment settings..
        Opening /var/lib/mysql/relay-log.info ... ok.
        Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000002
        Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000002
        Testing mysql connection and privileges.. done.
        Testing mysqlbinlog output.. done.
        Cleaning up test file(s).. done.
    Mon Aug  6 18:13:56 2018 - [info] Slaves settings check done.
    Mon Aug  6 18:13:56 2018 - [info]
    192.168.134.193(192.168.134.193:3306) (current master)
     +--192.168.134.2(192.168.134.2:3306)
     +--192.168.134.194(192.168.134.194:3306)
    
    Mon Aug  6 18:13:56 2018 - [info] Checking replication health on 192.168.134.2..
    Mon Aug  6 18:13:56 2018 - [info]  ok.
    Mon Aug  6 18:13:56 2018 - [info] Checking replication health on 192.168.134.194..
    Mon Aug  6 18:13:56 2018 - [info]  ok.
    Mon Aug  6 18:13:56 2018 - [warning] master_ip_failover_script is not defined.
    Mon Aug  6 18:13:56 2018 - [warning] shutdown_script is not defined.
    Mon Aug  6 18:13:56 2018 - [info] Got exit code 0 (Not master dead).
    
    MySQL Replication Health is OK.
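The last line of the output carries the verdict, so a script can decide from it whether it is safe to start the manager. This is a minimal sketch that works on saved output; the function name `repl_health_ok` is an assumption for this example.

```shell
# Hypothetical helper: reads masterha_check_repl output on stdin and succeeds
# only if the last non-empty line is the "Health is OK" verdict.
repl_health_ok() {
    awk '
        { gsub(/^[ \t]+|[ \t]+$/, "") }    # trim leading and trailing blanks
        NF { last = $0 }
        END { exit (last != "MySQL Replication Health is OK.") }
    '
}

# With MHA installed this would be:
#   masterha_check_repl --conf=/etc/mastermha/app1.cnf 2>&1 | repl_health_ok
# Demonstrated here on the tail of the successful run:
printf '%s\n' \
    'Mon Aug  6 18:13:56 2018 - [info] Got exit code 0 (Not master dead).' \
    '' \
    'MySQL Replication Health is OK.' | repl_health_ok && echo "safe to start the manager"
```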

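    7. Start MHA Manager and check its status:

    While the Manager is running, its state can be checked with the masterha_check_status script. Start the Manager with masterha_manager: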

    [root@c1 ~]# masterha_manager --conf=/etc/mastermha/app1.cnf
    Mon Aug  6 19:10:08 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Mon Aug  6 19:10:08 2018 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Mon Aug  6 19:10:08 2018 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
      Creating /data/mastermha/app1 if not exists..    ok.
      Checking output directory is accessible or not..
       ok.
      Binlog found at /var/lib/mysql, up to mariadb-bin.000001
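masterha_manager stays in the foreground, so it is commonly started under nohup and then queried from another shell. The sketch below wraps the status query. It assumes that masterha_check_status exits with 0 while the manager is running and non-zero otherwise; the function name `manager_state` and the `CHECK_CMD` override are made up for this example, so that the logic can be tried on a machine without MHA.

```shell
CONF=/etc/mastermha/app1.cnf
# Command used to query the manager; overridable for testing without MHA installed.
CHECK_CMD=${CHECK_CMD:-masterha_check_status}

# A typical background start on the manager host would look like:
#   nohup masterha_manager --conf="$CONF" < /dev/null >> /data/mastermha/app1/manager.log 2>&1 &

# Hypothetical helper: prints "running" or "not running" based on the exit code.
manager_state() {
    if $CHECK_CMD --conf="$CONF" > /dev/null 2>&1; then
        echo "running"
    else
        echo "not running"
    fi
}

manager_state
```

To stop a running manager, the manager package also ships masterha_stop, which takes the same --conf option.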
     
  • Original article: https://www.cnblogs.com/OrochWang/p/9432981.html