  • Setting up an MHA + keepalived cluster

    Complete walkthrough: building an MHA + keepalived cluster

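    1.1 Environment
    1.1.1 VMware virtual machines running a CentOS x86_64 minimal install with MySQL 5.7.21. CentOS 6.5 is the version originally stated; note that the systemctl commands and el7 packages used below require CentOS 7.
    1.1.2 SSH listens on the default port 22 on every machine.
    1.1.3 iptables is disabled on every machine.
    1.1.4 SELinux is disabled on every machine.
    1.1.5 Clocks are synchronized on every machine: ntpdate 0.asia.pool.ntp.org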

    1.2 The test uses four machines (one MHA manager and three MySQL nodes), deployed as follows:

    Role                    IP (internal)     Hostname   MHA package                     Purpose
    ----------------------  ----------------  ---------  ------------------------------  ------------------------
    Monitor                 192.168.52.250    db250      mha4mysql-manager-0.56-0.el6    MHA Manager (monitoring)
    Master                  192.168.52.251    db251      mha4mysql-node-0.56-0.el6       writes (keepalived)
    Slave (standby master)  192.168.52.252    db252      mha4mysql-node-0.56-0.el6       reads (keepalived)
    Slave                   192.168.52.253    db253      mha4mysql-node-0.56-0.el6       reads + backups

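    1.3 Overview

    db252 and db253 are slaves of db251; setting up replication is shown briefly below. The master (db251) serves writes. The standby master (db252, in practice a slave) serves reads, and so does db253. If the master goes down, the standby master is promoted to be the new master and the remaining slave is repointed to it.
    MHA Manager runs on db250. It monitors whether the master of the replication cluster is healthy and, if the master fails, automatically switches the master role over to a slave.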

    1.4 Install MySQL

    Install on the master and on both slaves:
    wget http://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm
    yum localinstall mysql57-community-release-el7-8.noarch.rpm
    yum install mysql-community-server -y
    systemctl enable mysqld
    systemctl daemon-reload
    systemctl start mysqld
    systemctl status mysqld
    grep 'temporary password' /var/log/mysqld.log
    mysql -uroot -p
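    If the client cannot connect through the socket, as in the following error, restart mysqld:
    [root@server04 ~]# mysql -u root
    ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
    service mysqld restart

    On 192.168.52.251 (master):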

    vim /etc/my.cnf

    [mysqld]

    datadir=/var/lib/mysql

    socket=/var/lib/mysql/mysql.sock

    character-set-server=utf8

    server-id=1

    log-bin=master-log

    relay-log=relay-log

    innodb_file_per_table = ON

    skip_name_resolve = ON

    # read_only is not set in the master's config file, but remember to enable it on the slaves
    max_connections = 5000

    symbolic-links=0

    log-error=/var/log/mysqld.log

    pid-file=/var/run/mysqld/mysqld.pid

    On 192.168.52.252 (standby master):

    vim /etc/my.cnf

    [mysqld]

    datadir=/var/lib/mysql

    socket=/var/lib/mysql/mysql.sock

    character-set-server=utf8

    server-id=2

    log-bin=master-log

    relay-log=relay-log

    relay_log_purge=0

    read_only=1

    skip_name_resolve=1

    innodb_file_per_table=1

    # Disabling symbolic-links is recommended to prevent assorted security risks

    symbolic-links=0

    log-error=/var/log/mysqld.log

    pid-file=/var/run/mysqld/mysqld.pid

    On 192.168.52.253 (slave):

    vim /etc/my.cnf

    [mysqld]

    datadir=/var/lib/mysql

    socket=/var/lib/mysql/mysql.sock

    character-set-server=utf8

    server-id=3

    log-bin=master-log

    relay-log=relay-log

    relay_log_purge=0

    read_only=1

    skip_name_resolve=1

    innodb_file_per_table=1

    # Disabling symbolic-links is recommended to prevent assorted security risks

    symbolic-links=0

    log-error=/var/log/mysqld.log

    pid-file=/var/run/mysqld/mysqld.pid
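
Replication only works if every node has a distinct server-id, as in the three my.cnf files above (1, 2 and 3). The following is a minimal sketch of a check for that, assuming you have copied each node's my.cnf to a local file; the function names are hypothetical helpers, not part of MySQL or MHA.

```shell
#!/usr/bin/env bash
# Sketch: extract server-id from my.cnf-style files and detect duplicates.

# Print the server-id found in the given config file (empty if absent).
get_server_id() {
    # Accepts both "server-id=1" and "server_id = 1"; commented lines do not match.
    sed -n 's/^[ \t]*server[-_]id[ \t]*=[ \t]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Return 0 if all given files carry distinct, non-empty server-ids, 1 otherwise.
check_unique_server_ids() {
    local seen=" " id f
    for f in "$@"; do
        id=$(get_server_id "$f")
        if [ -z "$id" ]; then
            echo "missing server-id in $f" >&2
            return 1
        fi
        case "$seen" in
            *" $id "*) echo "duplicate server-id $id in $f" >&2; return 1 ;;
        esac
        seen="$seen$id "
    done
    return 0
}
```

For example, `check_unique_server_ids db251.cnf db252.cnf db253.cnf && echo "server-ids are unique"`, where the three file names are placeholders for the copied configs.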

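    Second: reset the root password to 123456 (the replication user repluser, also with password 123456, is created in the next step)

    # Log in with the temporary password found above
    mysql -u root -p
    # Relax the password policy
    set global validate_password_policy=0;
    set global validate_password_length=4;
    # Change the password
    SET PASSWORD = PASSWORD('123456');
    GRANT ALL PRIVILEGES ON *.* TO 'root'@'192.168.52.%' IDENTIFIED BY '123456' WITH GRANT OPTION;
    flush privileges;

    Third: grant replication privileges on the three MySQL nodes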

    Master:

    grant replication slave,replication client on *.* to 'repluser'@'192.168.52.%' identified by '123456' ;

    flush privileges;

    # Grant privileges to the MHA management user mhaadmin

    grant all on *.* to 'mhaadmin'@'192.168.52.%' identified by 'mhapass' ;

    flush privileges;

    mysql> show master status;

    +-------------------+----------+--------------+------------------+-------------------+

    | File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

    +-------------------+----------+--------------+------------------+-------------------+

    | master-log.000005 |      154 |              |                  |                   |

    +-------------------+----------+--------------+------------------+-------------------+

    1 row in set (0.00 sec)

    Slaves (both):

     # Set the replication starting point
     change master to master_host='192.168.52.251',master_user='repluser',master_password='123456',master_log_file='master-log.000005',master_log_pos=154;

     start slave;

     # Check that the slave IO and SQL threads are both running
     show slave status\G

                Slave_IO_Running: Yes

                Slave_SQL_Running: Yes

    mysql> set global read_only=1;

     # Check the replication grants
     show grants for 'repluser'@'192.168.52.%';

    mysql> flush privileges;   # reload privileges

    Query OK, 0 rows affected (0.00 sec)
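
The check above comes down to two fields of SHOW SLAVE STATUS both reading Yes. As a minimal sketch, it can be scripted by piping the \G output into a small filter; `replication_ok` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read "SHOW SLAVE STATUS\G" output on stdin and succeed only if
# both replication threads report "Yes".
replication_ok() {
    awk -F': *' '
        /^[ \t]*Slave_IO_Running:/  { io  = $2; sub(/[ \t\r]+$/, "", io) }
        /^[ \t]*Slave_SQL_Running:/ { sql = $2; sub(/[ \t\r]+$/, "", sql) }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage on a slave (prompts for the password):
#   mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication healthy"
```

The patterns include the trailing colon, so the Slave_SQL_Running_State field is not mistaken for Slave_SQL_Running.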

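    Remove unneeded users. Later steps log in locally with mysql -uroot -p123456, which authenticates as root@'localhost', so drop that account only if you have another way to connect locally: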

    mysql> drop user root@'localhost';

    mysql> select user,host from mysql.user;

    1.5 SSH trust

    Set up passwordless SSH between all the machines (run this on each of them):

         ssh-keygen -t rsa

         ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.52.250

         ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.52.251

         ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.52.252

         ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.52.253

    # Test that it works

        ssh 192.168.52.251 date
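
Testing one host by hand leaves the others unchecked. A minimal sketch that tries a non-interactive login to every host in the deployment table is shown below; `check_ssh_trust` is a hypothetical helper, and masterha_check_ssh (used in section 1.8) remains the authoritative test.

```shell
#!/usr/bin/env bash
# Hosts taken from the deployment table in section 1.2.
HOSTS="192.168.52.250 192.168.52.251 192.168.52.252 192.168.52.253"

# Try a key-based login to every host; print one line per host and
# return non-zero if any login fails.
check_ssh_trust() {
    local host failed=0
    for host in $HOSTS; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true </dev/null; then
            echo "$host ok"
        else
            echo "$host FAILED"
            failed=1
        fi
    done
    return $failed
}

# Usage: run on each machine in turn
#   check_ssh_trust || echo "fix the key distribution before continuing"
```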

    1.6 Install MHA

    Install the MHA node package on all three MySQL nodes (the manager host needs it too, because mha4mysql-manager depends on mha4mysql-node):

        # Install the dependencies first

         wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

         rpm -ivh epel-release-latest-7.noarch.rpm

         yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager -y

        # Download and install the node package

         wget https://qiniu.wsfnk.com/mha4mysql-node-0.58-0.el7.centos.noarch.rpm

         rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm

    Install the MHA manager package on the manager node only (192.168.52.250):

         wget https://qiniu.wsfnk.com/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

         yum install perl-Parallel-ForkManager -y

         rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

         yum install mailx -y   # used to send mail

    Among the tools the packages install:

    -rwxr-xr-x 1 root root  5172 Jan  7 14:09 masterha_secondary_check

    -rwxr-xr-x 1 root root  1739 Jan  7 14:09 masterha_stop

    -rwxr-xr-x 1 root root  8337 Jan  7 14:14 purge_relay_logs

    -rwxr-xr-x 1 root root  7525 Jan  7 14:14 save_binary_logs

    1.7 Configure MHA

    On 192.168.52.250:

    [root@db250 bin]# cd /usr/bin/

    [root@db250 bin]# find ./ -name apply_diff_relay_logs

    ./apply_diff_relay_logs

    [root@db250 bin]# cp /usr/bin/save_binary_logs /usr/local/bin/

    [root@db250 bin]# cp /usr/bin/purge_relay_logs /usr/local/bin/

    [root@db250 bin]# cp /usr/bin/filter_mysqlbinlog /usr/local/bin/

    [root@db250 bin]# cp /usr/bin/apply_diff_relay_logs /usr/local/bin/

    [root@db250 bin]# ln -s /usr/local/mysql/bin/mysql  /usr/bin/mysql

    [root@db250 bin]# ln -s /usr/local/mysql/bin/mysqlbinlog  /usr/bin/mysqlbinlog

    [root@db250 bin]# cd /usr/local/bin/

    [root@db250 bin]# ll

    total 88

    -rwxr-xr-x 1 root root 17639 Jan  7 14:14 apply_diff_relay_logs

    -rwxr-xr-x 1 root root  4807 Jan  7 14:14 filter_mysqlbinlog

    -rwxr-xr-x 1 root root  1995 Jan  7 14:09 masterha_check_repl

    -rwxr-xr-x 1 root root  1779 Jan  7 14:09 masterha_check_ssh

    -rwxr-xr-x 1 root root  1865 Jan  7 14:09 masterha_check_status

    -rwxr-xr-x 1 root root  3201 Jan  7 14:09 masterha_conf_host

    -rwxr-xr-x 1 root root  2517 Jan  7 14:09 masterha_manager

    -rwxr-xr-x 1 root root  2165 Jan  7 14:09 masterha_master_monitor

    -rwxr-xr-x 1 root root  2373 Jan  7 14:09 masterha_master_switch

    The MHA configuration file:

    mkdir -p /etc/masterha
    [root@db250 app1]# cat  /etc/masterha/app1.cnf

    [server default]

    manager_log=/var/log/masterha/app1/manager.log

    manager_workdir=/var/log/masterha/app1.log

    master_binlog_dir=/var/lib/mysql

    master_ip_failover_script=/usr/local/bin/master_ip_failover

    master_ip_online_change_script=/usr/local/bin/master_ip_online_change

    password=123456

    ping_interval=1

    remote_workdir=/tmp

    repl_password=123456

    repl_user=repluser

    secondary_check_script=/usr/local/bin/masterha_secondary_check -s db251 -s db252 --user=root --master_host=db251 --master_ip=192.168.52.251 --master_port=3306

    shutdown_script=""

    ssh_port=22

    ssh_user=root

    [server1]

    hostname=192.168.52.251

    candidate_master=1

    port=3306

    [server2]

    candidate_master=1

    check_repl_delay=0

    hostname=192.168.52.252

    port=3306

    [server3]

    hostname=192.168.52.253

    port=3306
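
Before running the MHA checks, it can help to confirm that the [serverN] blocks list the hosts you expect. This is a minimal sketch that prints the hostname of every [serverN] section in the file above; `list_mha_hosts` is a hypothetical helper, not an MHA tool.

```shell
#!/usr/bin/env bash
# Sketch: print "section hostname" for every [serverN] block of an MHA config.
list_mha_hosts() {
    awk '
        /^[ \t]*\[/ {
            # A new [section] starts: keep only its name, without brackets or blanks.
            section = $0
            gsub(/[^A-Za-z0-9_]/, "", section)
        }
        section ~ /^server[0-9]+$/ && /^[ \t]*hostname[ \t]*=/ {
            value = $0
            sub(/^[^=]*=[ \t]*/, "", value)   # drop "hostname ="
            sub(/[ \t\r]+$/, "", value)       # drop trailing blanks
            print section, value
        }
    ' "$1"
}

# Usage on the manager node:
#   list_mha_hosts /etc/masterha/app1.cnf
```

The [server default] block is skipped because its name does not match server followed by digits.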

    Local hosts resolution (on every node):

    [root@db250 app1]# vim /etc/hosts

    192.168.52.250 db250

    192.168.52.251 db251

    192.168.52.252 db252

    192.168.52.253 db253

    2. Set how relay logs are purged (on each slave):

    On the standby master, 192.168.52.252:

    [root@db252 ~]# mysql -uroot -p123456 -e "set global relay_log_purge=0"

    mysql: [Warning] Using a password on the command line interface can be insecure.

    On the slave, 192.168.52.253:

    [root@db253 ~]# mysql -uroot -p123456 -e "set global relay_log_purge=0"

    mysql: [Warning] Using a password on the command line interface can be insecure.

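    Note:

    During a failover, MHA relies on relay log contents to recover the slaves, so automatic relay log purging is switched OFF here and relay logs are purged manually instead. By default, a slave deletes its relay logs automatically once the SQL thread has executed them. In an MHA environment those relay logs may be needed to recover other slaves, so automatic deletion has to be disabled. Periodic purging must take replication delay into account: on an ext3 file system, deleting a large file takes a while and can cause serious replication delay. To avoid this, hard links to the relay logs are created temporarily, because on Linux removing a large file through a hard link is fast. (The same hard-link approach is commonly used when dropping large tables in MySQL.)

    2.2 Set up a periodic relay log cleanup script (on both slaves):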

    [root@db252 ~]# cat /data/scripts/purge_relay_log.sh

    #!/bin/bash

    user=root

    passwd=123456

    port=3306

    log_dir='/data/masterha/log'

    work_dir='/data'

    purge='/usr/local/bin/purge_relay_logs'

    if [ ! -d $log_dir ]

    then

       mkdir $log_dir -p

    fi

    $purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1

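    Script options:

    --user mysql                      // MySQL user name

    --password mysql                  // MySQL password

    --port                            // port number

    --workdir                         // where the hard links to the relay logs are created; the default is /var/tmp. Hard links cannot be created across partitions, so point this at a directory on the same partition as the relay logs. After the script succeeds, the hard-linked relay log files are removed.

    --disable_relay_log_purge         // by default, if relay_log_purge=1 the script purges nothing and exits. With this option, the script sets relay_log_purge to 0 when it finds it set to 1, purges the relay logs, and leaves the variable OFF afterwards.

    purge_relay_logs does not block the SQL thread while it deletes relay logs. Running it by hand shows what happens:

    [root@db252 ~]# purge_relay_logs --user=root --password=123456 --port=3306 --disable_relay_log_purge --workdir=/data/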

    2018-07-01 11:53:16: purge_relay_logs script started.

     Found relay_log.info: /data/mysql/relay-log.info

     Opening /data/mysql/logs/relay-log/relay-bin.000001 ..

     Opening /data/mysql/logs/relay-log/relay-bin.000002 ..

     Executing SET GLOBAL relay_log_purge=1; FLUSH LOGS; sleeping a few seconds so that SQL thread can delete older relay log files (if it keeps up); SET GLOBAL relay_log_purge=0; .. ok.

    2018-07-01 11:53:20: All relay log purging operations succeeded.
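
The cleanup script is meant to run periodically, but no schedule is given. A minimal sketch that generates a cron.d entry for it follows. The 04:00 default hour and the file name /etc/cron.d/purge_relay_log are assumptions, and `purge_cron_line` is a hypothetical helper; choosing a different hour on each slave avoids purging on both at the same moment.

```shell
#!/usr/bin/env bash
# Sketch: print a cron.d line that runs the cleanup script once a day
# at the given hour (0-23, default 4) as root.
purge_cron_line() {
    local hour=${1:-4}
    # Reject anything that is not a plain number between 0 and 23.
    case "$hour" in
        ''|*[!0-9]*) echo "hour must be a number" >&2; return 1 ;;
    esac
    [ "$hour" -le 23 ] || { echo "hour must be 0-23" >&2; return 1; }
    printf '0 %s * * * root /bin/bash /data/scripts/purge_relay_log.sh\n' "$hour"
}

# Usage (as root, on each slave):
#   purge_cron_line 4 > /etc/cron.d/purge_relay_log
```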

    Master failover script

    [root@db250 app1]#  cat /usr/local/bin/master_ip_failover

    #!/usr/bin/env perl

    #Copyright (C) 2011 DeNA Co.,Ltd.
    #
    #This program is free software; you can redistribute it and/or modify
    #it under the terms of the GNU General Public License as published by
    #the Free Software Foundation; either version 2 of the License, or
    #(at your option) any later version.
    #
    #This program is distributed in the hope that it will be useful,
    #but WITHOUT ANY WARRANTY; without even the implied warranty of
    #MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    #GNU General Public License for more details.
    #
    #You should have received a copy of the GNU General Public License
    #along with this program; if not, write to the Free Software
    #Foundation, Inc.,
    #51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

    ##Note: This is a sample script and is not complete. Modify the script based on your environment.
    ######################################################

    use strict;
    use warnings FATAL => 'all';
    use Getopt::Long;
    use MHA::DBHelper;

    my (
      $command,        $ssh_user,         $orig_master_host,
      $orig_master_ip, $orig_master_port, $new_master_host,
      $new_master_ip,  $new_master_port,  $new_master_user,
      $new_master_password
    );

    my $vip = '192.168.52.199';
    # The VIP is moved by starting and stopping keepalived over ssh.
    my $ssh_start_vip = "systemctl start keepalived";
    my $ssh_stop_vip = "systemctl stop keepalived";

    # Each option must be bound to a reference to its variable.
    GetOptions(
      'command=s'             => \$command,
      'ssh_user=s'            => \$ssh_user,
      'orig_master_host=s'    => \$orig_master_host,
      'orig_master_ip=s'      => \$orig_master_ip,
      'orig_master_port=i'    => \$orig_master_port,
      'new_master_host=s'     => \$new_master_host,
      'new_master_ip=s'       => \$new_master_ip,
      'new_master_port=i'     => \$new_master_port,
      'new_master_user=s'     => \$new_master_user,
      'new_master_password=s' => \$new_master_password,
    );

    exit &main();

    sub main {
      print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

      if ( $command eq "stop" || $command eq "stopssh" ) {
        #$orig_master_host, $orig_master_ip, $orig_master_port are passed.
        #If you manage master ip address at global catalog database,
        #invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
          print "Disabling the VIP on old master: $orig_master_host \n";
          &stop_vip();
          #updating global catalog, etc
          $exit_code = 0;
        };
        if ($@) {
          warn "Got Error: $@\n";
          exit $exit_code;
        }
        exit $exit_code;
      }
      elsif ( $command eq "start" ) {
        #all arguments are passed.
        #If you manage master ip address at global catalog database,
        #activate new_master_ip here.
        #You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
          print "Enabling the VIP - $vip on the new master - $new_master_host \n";
          &start_vip();
          $exit_code = 0;
        };
        if ($@) {
          warn $@;
          #If you want to continue failover, exit 10.
          exit $exit_code;
        }
        exit $exit_code;
      }
      elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        #do nothing
        exit 0;
      }
      else {
        &usage();
        exit 1;
      }
    }

    #Start keepalived on the new master so that it takes over the VIP.
    #The @ must be escaped, otherwise Perl tries to interpolate an array.
    sub start_vip() {
      `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
    }

    #A simple system call that disable the VIP on the old_master
    sub stop_vip() {
      `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
    }

    sub usage {
      print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
    }

    chmod +x /usr/local/bin/master_ip_failover

    1.8 MHA tests

    Test passwordless SSH:

    masterha_check_ssh --conf=/etc/masterha/app1.cnf

    Test MHA replication:

    masterha_check_repl --conf=/etc/masterha/app1.cnf

    If masterha_check_repl is not executable, fix its permissions first:

    chmod +x /usr/local/bin/masterha_check_repl

    Check whether MHA Manager monitoring is running:

    masterha_check_status --conf=/etc/masterha/app1.cnf

    Stop MHA Manager monitoring:

    masterha_stop --conf=/etc/masterha/app1.cnf

    1.9 Start MHA monitoring

    masterha_check_status --conf=/etc/masterha/app1.cnf

    mkdir -p  /var/log/masterha/app1/

    nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

    Confirm that the manager process is running:

    ps -ef|grep perl
    1.10 Configure the VIP for automatic master failover in the MHA setup

    On 192.168.52.251 and 192.168.52.252:

    yum install keepalived -y

    On 192.168.52.251:

    [root@db251 ~]# vim /etc/keepalived/keepalived.conf

    global_defs {

       notification_email {

       305xxx7536@qq.com

       }

       notification_email_from Alexandre.Cassen@firewall.loc

       smtp_server 192.168.52.251

       smtp_connect_timeout 30

       router_id LVS_01

    }

    vrrp_instance VI_1 {

        #state MASTER

        state BACKUP

        interface ens33

        virtual_router_id 51

        priority 100

        advert_int 1

        nopreempt

        authentication {

            auth_type PASS

            auth_pass 1111

        }

        virtual_ipaddress {

        192.168.52.199/24

        }

    }

    systemctl start keepalived

    On 192.168.52.252:

    [root@db252 ~]# vim /etc/keepalived/keepalived.conf

    global_defs {

       notification_email {

       305xxx7536@qq.com

       }

       notification_email_from Alexandre.Cassen@firewall.loc

       smtp_server 192.168.52.252

       smtp_connect_timeout 30

       router_id LVS_01

    }

    vrrp_instance VI_1 {

        #state MASTER

        state BACKUP

        interface ens33

        virtual_router_id 51

        priority 90

        advert_int 1

        nopreempt

        authentication {

            auth_type PASS

            auth_pass 1111

        }

        virtual_ipaddress {

        192.168.52.199/24

        }

    }

    systemctl start keepalived

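    ##### Important!
    Both keepalived instances above are configured in BACKUP state. keepalived can be run in two ways, master->backup and backup->backup, and they behave very differently. In master->backup mode, the virtual IP moves to the slave when the master goes down, but once the master is repaired and keepalived starts again, it takes the virtual IP back, even if nopreempt is set. In backup->backup mode, the virtual IP also moves to the slave when the master goes down, but when the original master recovers and keepalived starts, it does not take the virtual IP back from the new master, even if its priority is higher. To keep VIP moves to a minimum, the repaired master is usually brought back as the new standby.
    ++++ This completes the keepalived installation and configuration for the MHA setup ++++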
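
To see which of the two keepalived nodes currently holds the VIP, you can look for 192.168.52.199 in each node's address list. The sketch below does that over ssh; `holds_vip` and `report_vip_owner` are hypothetical helper names, and it assumes the passwordless root SSH set up in section 1.5.

```shell
#!/usr/bin/env bash
VIP="192.168.52.199"

# Read "ip -o -4 addr show" output on stdin; succeed if the VIP is configured.
holds_vip() {
    # -F matches the address literally, so the dots are not regex wildcards.
    grep -qF "inet ${VIP}/"
}

# Print the address of each keepalived node that currently holds the VIP.
# Exactly one line is expected in a healthy cluster.
report_vip_owner() {
    local host
    for host in 192.168.52.251 192.168.52.252; do
        if ssh -o BatchMode=yes "root@$host" ip -o -4 addr show </dev/null | holds_vip; then
            echo "$host"
        fi
    done
}

# Usage on the manager node:
#   report_vip_owner
```

No output means neither node holds the VIP; two lines mean both do, which is the conflict the note above warns about.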

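    1.11 What changes in the MHA cluster after a failover
    Changes to /etc/masterha/app1.cnf

    1. On the manager node db250 (192.168.52.250), /etc/masterha/app1.cnf no longer contains the [server1] section; it has been removed automatically.

    2. The masterha_manager process exits on its own.

    3. keepalived has been stopped on the original master, 192.168.52.251.

    ####### Important!
    When MySQL on db251 (192.168.52.251) goes down and db252 (192.168.52.252) is promoted to master, keepalived is stopped on 192.168.52.251 and started on 192.168.52.252, and the VIP 192.168.52.199 moves to 192.168.52.252.
    MySQL on 192.168.52.251 then has to be restarted, normally to bring it back as a slave of the new master 192.168.52.252. Do not start keepalived on 192.168.52.251 at that point: it can take the VIP away from 192.168.52.252 and leave applications connecting to the wrong database. For the same reason, do not enable keepalived at boot on either 192.168.52.251 or 192.168.52.252.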

    1.12 Putting the old master back under MHA monitoring

    1. On the failed master, replication may report this error:
    show slave status\G
    Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server UUIDs;
    these UUIDs must be different for replication to work.
    2. Fix:
    rm -rf /var/lib/mysql/auto.cnf
    systemctl restart mysqld
    3. Make the old master replicate from the new master
    # On the new master, read the current binlog coordinates
    show master status;
    # On the old master, set the replication starting point
    change master to master_host='192.168.52.252',master_user='repluser',master_password='123456',master_log_file='master-log.000004',master_log_pos=154; # use the new master's IP address and the coordinates just read
    start slave;
    show slave status\G
    4. After the failover, restore what is missing from /etc/masterha/app1.cnf
    secondary_check_script=/usr/local/bin/masterha_secondary_check -s db251 -s db252 --user=root --master_host=db252 --master_ip=192.168.52.252 --master_port=3306
    Add back the section for the old master:
    [server1]
    hostname=192.168.52.251
    candidate_master=1
    port=3306
    5. Test
    Test SSH:
    masterha_check_ssh --conf=/etc/masterha/app1.cnf
    Test MHA replication:
    masterha_check_repl --conf=/etc/masterha/app1.cnf
    Check whether MHA Manager monitoring is running:
    masterha_check_status --conf=/etc/masterha/app1.cnf
    6. Watch the log
    tailf /var/log/masterha/app1/manager.log
    7. Start MHA Manager monitoring
    nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
    8. Check for the manager process
    ps -ef|grep perl
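
In step 3, copying the binlog file name and position by hand is easy to get wrong. A minimal sketch that builds the CHANGE MASTER TO statement from the new master's SHOW MASTER STATUS output is shown below. `build_change_master` is a hypothetical helper; the repluser account and its password are the ones created earlier in this walkthrough.

```shell
#!/usr/bin/env bash
# Sketch: read one line of "mysql -N -B -e 'SHOW MASTER STATUS'" output on
# stdin (tab-separated: file, position, ...) and print the matching
# CHANGE MASTER TO statement for the master host given as $1.
build_change_master() {
    local master_host=$1 file pos rest
    # Default IFS splits on tabs; binlog file names contain no blanks.
    read -r file pos rest || true
    if [ -z "$file" ] || [ -z "$pos" ]; then
        echo "no binlog coordinates on stdin" >&2
        return 1
    fi
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='repluser', MASTER_PASSWORD='123456', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$master_host" "$file" "$pos"
}

# Usage on the old master, with 192.168.52.252 as the new master:
#   mysql -h 192.168.52.252 -uroot -p -N -B -e 'SHOW MASTER STATUS' \
#     | build_change_master 192.168.52.252
```

Review the printed statement before running it, then follow it with start slave as in step 3.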
    ---------------------
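    Author: guoshaoliang789
    Source: CSDN
    Original article: https://blog.csdn.net/guoshaoliang789/article/details/86086181
    Copyright notice: this is an original article by the blogger; please include a link to it when reposting.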

  • Original URL: https://www.cnblogs.com/sidesky/p/10890882.html