  • Testing MHA with sysbench

    1. Prepare test data with sysbench

    sysbench /usr/share/sysbench/oltp_read_write.lua 
    sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2)
    Initializing worker threads...
    Creating table 'sbtest10'...
    Creating table 'sbtest6'...
    Creating table 'sbtest1'...
    Creating table 'sbtest3'...
    Creating table 'sbtest2'...
    Creating table 'sbtest7'...
    Creating table 'sbtest5'...
    Creating table 'sbtest8'...
    Creating table 'sbtest4'...
    Creating table 'sbtest9'...
    Inserting 500000 records into 'sbtest10'
    Inserting 500000 records into 'sbtest9'
    Inserting 500000 records into 'sbtest6'
    Inserting 500000 records into 'sbtest5'
    Inserting 500000 records into 'sbtest8'
    Inserting 500000 records into 'sbtest4'
    Inserting 500000 records into 'sbtest7'
    Inserting 500000 records into 'sbtest3'
    Inserting 500000 records into 'sbtest2'
    Inserting 500000 records into 'sbtest1'
    Creating a secondary index on 'sbtest5'...
    Creating a secondary index on 'sbtest6'...
    Creating a secondary index on 'sbtest9'...
    Creating a secondary index on 'sbtest10'...
    Creating a secondary index on 'sbtest7'...
    Creating a secondary index on 'sbtest3'...
    Creating a secondary index on 'sbtest8'...
    Creating a secondary index on 'sbtest2'...
    Creating a secondary index on 'sbtest4'...
    Creating a secondary index on 'sbtest1'...

    2. Start the sysbench load test

    sysbench /usr/share/sysbench/oltp_read_write.lua 
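
Both sysbench command lines above are truncated in the source, so their options are unknown. The sketch below shows what complete invocations might look like. The connection settings are hypothetical placeholders, `--tables=10` and `--table-size=500000` are inferred from the ten `sbtest` tables and 500000 rows per table in the prepare output, and `--threads`, `--time` and `--report-interval` are assumed values. The function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: every connection value below is a hypothetical placeholder.
MYSQL_HOST="${MYSQL_HOST:-127.0.0.1}"          # assumption: the master or its VIP
MYSQL_PORT="${MYSQL_PORT:-3306}"
MYSQL_USER="${MYSQL_USER:-sbtest}"             # assumption
MYSQL_PASSWORD="${MYSQL_PASSWORD:-change_me}"  # assumption
MYSQL_DB="${MYSQL_DB:-sbtest}"                 # assumption

# Print a sysbench command line for one phase (prepare, run or cleanup).
# Extra options after the phase name are appended before the phase.
sysbench_cmd() {
  phase="$1"; shift
  printf '%s ' sysbench /usr/share/sysbench/oltp_read_write.lua \
    --mysql-host="$MYSQL_HOST" --mysql-port="$MYSQL_PORT" \
    --mysql-user="$MYSQL_USER" --mysql-password="$MYSQL_PASSWORD" \
    --mysql-db="$MYSQL_DB" \
    --tables=10 --table-size=500000 --threads=10 "$@"
  printf '%s\n' "$phase"
}

# Step 1: load the data.
sysbench_cmd prepare
# Step 2: keep the load running long enough to span the failover,
# printing throughput every second so the stall is visible.
sysbench_cmd run --time=600 --report-interval=1
```

To execute a phase instead of printing it, the output can be passed to `eval`, for example `eval "$(sysbench_cmd prepare)"`.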

    3. Simulate an unexpected crash of the master

    # ps -ef|grep mysqld
    avahi      741     1  0 08:50 ?        00:00:00 avahi-daemon: running [mysqldb1.local]
    mysql     4741  4228  2 11:03 pts/3    00:06:30 mysqld --defaults-file=/etc/my3306.cnf
    root      8346  8288  0 15:34 pts/0    00:00:00 grep --color=auto mysqld
    # kill -9 4741
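
Reading the PID off the `ps` listing by hand works, but `grep mysqld` also matches the avahi line and the `grep` process itself. As a small sketch, the helper below (the name `mysqld_pids` is hypothetical) picks only processes whose executable is `mysqld`. It assumes the default `ps -ef` column layout, where the command is the eighth field.

```shell
#!/bin/sh
# Print the PID of every process whose executable is mysqld,
# given `ps -ef` output on stdin. Assumes the command is field 8.
mysqld_pids() {
  awk '{ cmd = $8; sub(/.*\//, "", cmd); if (cmd == "mysqld") print $2 }'
}

# Demo with the listing shown above; prints 4741.
mysqld_pids <<'EOF'
avahi      741     1  0 08:50 ?        00:00:00 avahi-daemon: running [mysqldb1.local]
mysql     4741  4228  2 11:03 pts/3    00:06:30 mysqld --defaults-file=/etc/my3306.cnf
root      8346  8288  0 15:34 pts/0    00:00:00 grep --color=auto mysqld
EOF

# On the master (destructive, so left commented out; xargs -r is GNU):
#   ps -ef | mysqld_pids | xargs -r kill -9
```

If the server had been started through `mysqld_safe`, that wrapper would normally restart `mysqld` after a `kill -9`, so the crash simulation would need to stop the wrapper as well.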

    4. Watch the MHA status on mysqldb2

    # tail -f /var/log/masterha/app1/manager.log
    Mon Oct 29 15:34:40 2018 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)
    Mon Oct 29 15:34:40 2018 - [info] Executing secondary network check script: /usr/local/bin/masterha_secondary_check -s mysqldb2 -s mysqldb1  --user=root  --master_host=  --master_ip=  --master_port=3306 --master_user=mha_rep --master_password=123456 --ping_type=SELECT
    Mon Oct 29 15:34:40 2018 - [info] Executing SSH check script: exit 0
    Mon Oct 29 15:34:40 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '' (111))
    Mon Oct 29 15:34:40 2018 - [warning] Connection failed 2 time(s)..
    Mon Oct 29 15:34:41 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '' (111))
    Mon Oct 29 15:34:41 2018 - [warning] Connection failed 3 time(s)..
    Mon Oct 29 15:34:42 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '' (111))
    Mon Oct 29 15:34:42 2018 - [warning] Connection failed 4 time(s)..
    Mon Oct 29 15:34:45 2018 - [warning] HealthCheck: Got timeout on checking SSH connection to! at /usr/local/share/perl5/MHA/HealthCheck.pm line 342.
    Monitoring server mysqldb2 is reachable, Master is not reachable from mysqldb2. OK.
    Monitoring server mysqldb1 is reachable, Master is not reachable from mysqldb1. OK.
    Mon Oct 29 15:34:53 2018 - [info] Master is not reachable from all other monitoring servers. Failover should start.
    Mon Oct 29 15:34:53 2018 - [warning] Master is not reachable from health checker!
    Mon Oct 29 15:34:53 2018 - [warning] Master is not reachable!
    Mon Oct 29 15:34:53 2018 - [warning] SSH is NOT reachable.
    Mon Oct 29 15:34:53 2018 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
    Mon Oct 29 15:34:53 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Mon Oct 29 15:34:53 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
    Mon Oct 29 15:34:53 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
    Mon Oct 29 15:34:55 2018 - [info] GTID failover mode = 1
    Mon Oct 29 15:34:55 2018 - [info] Dead Servers:
    Mon Oct 29 15:34:55 2018 - [info]
    Mon Oct 29 15:34:55 2018 - [info] Alive Servers:
    Mon Oct 29 15:34:55 2018 - [info]
    Mon Oct 29 15:34:55 2018 - [info]
    Mon Oct 29 15:34:55 2018 - [info] Alive Slaves:
    Mon Oct 29 15:34:55 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:34:55 2018 - [info]     GTID ON
    Mon Oct 29 15:34:55 2018 - [info]     Replicating from
    Mon Oct 29 15:34:55 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
    Mon Oct 29 15:34:55 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:34:55 2018 - [info]     GTID ON
    Mon Oct 29 15:34:55 2018 - [info]     Replicating from
    Mon Oct 29 15:34:55 2018 - [info]     Not candidate for the new Master (no_master is set)
    Mon Oct 29 15:34:55 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Mon Oct 29 15:34:55 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
    Mon Oct 29 15:34:55 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
    Mon Oct 29 15:34:55 2018 - [info] Checking slave configurations..
    Mon Oct 29 15:34:55 2018 - [info] Checking replication filtering settings..
    Mon Oct 29 15:34:55 2018 - [info]  Replication filtering check ok.
    Mon Oct 29 15:34:55 2018 - [info] Master is down!
    Mon Oct 29 15:34:55 2018 - [info] Terminating monitoring script.
    Mon Oct 29 15:34:55 2018 - [info] Got exit code 20 (Master dead).
    Mon Oct 29 15:34:55 2018 - [info] MHA::MasterFailover version 0.57.
    Mon Oct 29 15:34:55 2018 - [info] Starting master failover.
    Mon Oct 29 15:34:55 2018 - [info] 
    Mon Oct 29 15:34:55 2018 - [info] * Phase 1: Configuration Check Phase..
    Mon Oct 29 15:34:55 2018 - [info] 
    Mon Oct 29 15:34:56 2018 - [info] GTID failover mode = 1
    Mon Oct 29 15:34:56 2018 - [info] Dead Servers:
    Mon Oct 29 15:34:56 2018 - [info]
    Mon Oct 29 15:34:56 2018 - [info] Checking master reachability via MySQL(double check)...
    Mon Oct 29 15:34:56 2018 - [info]  ok.
    Mon Oct 29 15:34:56 2018 - [info] Alive Servers:
    Mon Oct 29 15:34:56 2018 - [info]
    Mon Oct 29 15:34:56 2018 - [info]
    Mon Oct 29 15:34:56 2018 - [info] Alive Slaves:
    Mon Oct 29 15:34:56 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:34:56 2018 - [info]     GTID ON
    Mon Oct 29 15:34:56 2018 - [info]     Replicating from
    Mon Oct 29 15:34:56 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
    Mon Oct 29 15:34:56 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:34:56 2018 - [info]     GTID ON
    Mon Oct 29 15:34:56 2018 - [info]     Replicating from
    Mon Oct 29 15:34:56 2018 - [info]     Not candidate for the new Master (no_master is set)
    Mon Oct 29 15:34:56 2018 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln309] Last failover was done at 2018/10/29 10:09:03. Current time is too early to do failover again. If you want to do failover, manually remove /var/log/masterha/app1/app1.failover.complete and run this script again.
    Mon Oct 29 15:34:56 2018 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln177] Got ERROR:  at /usr/local/bin/masterha_manager line 65.


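    The automatic failover is aborted because the previous failover (2018/10/29 10:09:03) happened less than 8 hours earlier, which is the default window in which MHA refuses to fail over again. Remove the marker file before retrying:

    rm -rf /var/log/masterha/app1/app1.failover.complete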
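
The error above fires because the last failover (10:09:03) was about 5 hours 26 minutes before the new attempt (15:34:56), inside MHA's default 8-hour window. The sketch below checks for that condition ahead of time. It assumes MHA compares against the marker file's modification time and that the window has not been changed with `--last_failover_minute`; the function name `failover_blocked` is hypothetical.

```shell
#!/bin/sh
# Succeed (exit 0) when FILE exists and was modified less than
# WINDOW_MIN minutes ago. 480 minutes = 8 hours, MHA's default window.
failover_blocked() {
  file="$1"
  window_min="${2:-480}"
  [ -f "$file" ] || return 1
  now=$(date +%s)
  mtime=$(stat -c %Y "$file")   # GNU stat; BSD/macOS would use: stat -f %m
  [ $(( now - mtime )) -lt $(( window_min * 60 )) ]
}

marker=/var/log/masterha/app1/app1.failover.complete
if failover_blocked "$marker"; then
  echo "recent failover marker found: $marker"
else
  echo "no recent failover marker"
fi
```

Running this on the manager host before a test shows whether an automatic failover would be refused, without waiting for the master to die first.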

    5. Manually fail over to a new master

    --ignore_last_failover tells MHA to ignore the app1.failover.complete file that the previous failover left in /var/log/masterha/app1, so the failover proceeds even if the last one was recent:
    # masterha_master_switch --conf=/etc/masterha/app1.cnf --dead_master_host= --master_state=dead --new_master_host= --ignore_last_failover
    --dead_master_ip=<dead_master_ip> is not set. Using
    --dead_master_port=<dead_master_port> is not set. Using 3306.
    Mon Oct 29 15:49:04 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Mon Oct 29 15:49:04 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
    Mon Oct 29 15:49:04 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
    Mon Oct 29 15:49:04 2018 - [info] MHA::MasterFailover version 0.57.
    Mon Oct 29 15:49:04 2018 - [info] Starting master failover.
    Mon Oct 29 15:49:04 2018 - [info] 
    Mon Oct 29 15:49:04 2018 - [info] * Phase 1: Configuration Check Phase..
    Mon Oct 29 15:49:04 2018 - [info] 
    Mon Oct 29 15:49:05 2018 - [info] GTID failover mode = 1
    Mon Oct 29 15:49:05 2018 - [info] Dead Servers:
    Mon Oct 29 15:49:05 2018 - [info]
    Mon Oct 29 15:49:05 2018 - [info] Checking master reachability via MySQL(double check)...
    Mon Oct 29 15:49:05 2018 - [info]  ok.
    Mon Oct 29 15:49:05 2018 - [info] Alive Servers:
    Mon Oct 29 15:49:05 2018 - [info]
    Mon Oct 29 15:49:05 2018 - [info]
    Mon Oct 29 15:49:05 2018 - [info] Alive Slaves:
    Mon Oct 29 15:49:05 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:49:05 2018 - [info]     GTID ON
    Mon Oct 29 15:49:05 2018 - [info]     Replicating from
    Mon Oct 29 15:49:05 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
    Mon Oct 29 15:49:05 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:49:05 2018 - [info]     GTID ON
    Mon Oct 29 15:49:05 2018 - [info]     Replicating from
    Mon Oct 29 15:49:05 2018 - [info]     Not candidate for the new Master (no_master is set)
    Master is dead. Proceed? (yes/NO): yes
    Mon Oct 29 15:49:06 2018 - [info] Starting GTID based failover.
    Mon Oct 29 15:49:06 2018 - [info] 
    Mon Oct 29 15:49:06 2018 - [info] ** Phase 1: Configuration Check Phase completed.
    Mon Oct 29 15:49:06 2018 - [info] 
    Mon Oct 29 15:49:06 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
    Mon Oct 29 15:49:06 2018 - [info] 
    Mon Oct 29 15:49:07 2018 - [info] HealthCheck: SSH to is reachable.
    Mon Oct 29 15:49:07 2018 - [info] Forcing shutdown so that applications never connect to the current master..
    Mon Oct 29 15:49:07 2018 - [info] Executing master IP deactivation script:
    Mon Oct 29 15:49:07 2018 - [info]   /usr/local/bin/master_ip_failover --orig_master_host= --orig_master_ip= --orig_master_port=3306 --command=stopssh --ssh_user=root  
    IN SCRIPT TEST====/sbin/ifconfig eth0:0 down==/sbin/ifconfig eth0:0 up===
    Disabling the VIP on old master: 
    Mon Oct 29 15:49:08 2018 - [info]  done.
    Mon Oct 29 15:49:08 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
    Mon Oct 29 15:49:08 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
    Mon Oct 29 15:49:08 2018 - [info] 
    Mon Oct 29 15:49:08 2018 - [info] * Phase 3: Master Recovery Phase..
    Mon Oct 29 15:49:08 2018 - [info] 
    Mon Oct 29 15:49:08 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
    Mon Oct 29 15:49:08 2018 - [info] 
    Mon Oct 29 15:49:08 2018 - [info] The latest binary log file/position on all slaves is my3306_binlog.000027:860190114
    Mon Oct 29 15:49:08 2018 - [info] Retrieved Gtid Set: 7390a401-b705-11e8-9ed9-080027b0b461:140923-149275
    Mon Oct 29 15:49:08 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
    Mon Oct 29 15:49:08 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:49:08 2018 - [info]     GTID ON
    Mon Oct 29 15:49:08 2018 - [info]     Replicating from
    Mon Oct 29 15:49:08 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
    Mon Oct 29 15:49:08 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:49:08 2018 - [info]     GTID ON
    Mon Oct 29 15:49:08 2018 - [info]     Replicating from
    Mon Oct 29 15:49:08 2018 - [info]     Not candidate for the new Master (no_master is set)
    Mon Oct 29 15:49:08 2018 - [info] The oldest binary log file/position on all slaves is my3306_binlog.000027:860190114
    Mon Oct 29 15:49:08 2018 - [info] Retrieved Gtid Set: 7390a401-b705-11e8-9ed9-080027b0b461:140923-149275
    Mon Oct 29 15:49:08 2018 - [info] Oldest slaves:
    Mon Oct 29 15:49:08 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:49:08 2018 - [info]     GTID ON
    Mon Oct 29 15:49:08 2018 - [info]     Replicating from
    Mon Oct 29 15:49:08 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
    Mon Oct 29 15:49:08 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
    Mon Oct 29 15:49:08 2018 - [info]     GTID ON
    Mon Oct 29 15:49:08 2018 - [info]     Replicating from
    Mon Oct 29 15:49:08 2018 - [info]     Not candidate for the new Master (no_master is set)
    Mon Oct 29 15:49:08 2018 - [info] 
    Mon Oct 29 15:49:08 2018 - [info] * Phase 3.3: Determining New Master Phase..
    Mon Oct 29 15:49:08 2018 - [info] 
    Mon Oct 29 15:49:08 2018 - [info] can be new master.
    Mon Oct 29 15:49:08 2018 - [info] New master is
    Mon Oct 29 15:49:08 2018 - [info] Starting master failover..
    Mon Oct 29 15:49:08 2018 - [info] 
    From: (current master)
    To: (new master)
    Starting master switch from to (yes/NO): yes
    Mon Oct 29 15:49:09 2018 - [info] New master decided manually is
    Mon Oct 29 15:49:09 2018 - [info] 
    Mon Oct 29 15:49:09 2018 - [info] * Phase 3.3: New Master Recovery Phase..
    Mon Oct 29 15:49:09 2018 - [info] 
    Mon Oct 29 15:49:09 2018 - [info]  Waiting all logs to be applied.. 
    Mon Oct 29 15:49:09 2018 - [info]   done.
    Mon Oct 29 15:49:09 2018 - [info] Getting new master's binlog name and position..
    Mon Oct 29 15:49:09 2018 - [info]  my3306_binlog.000016:965341200
    Mon Oct 29 15:49:09 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
    Mon Oct 29 15:49:09 2018 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: my3306_binlog.000016, 965341200, 7390a401-b705-11e8-9ed9-080027b0b461:1-149275,
    Mon Oct 29 15:49:09 2018 - [info] Executing master IP activate script:
    Mon Oct 29 15:49:09 2018 - [info]   /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host= --orig_master_ip= --orig_master_port=3306 --new_master_host= --new_master_ip= --new_master_port=3306 --new_master_user='mha_rep'   --new_master_password=xxx
    Unknown option: new_master_user
    Unknown option: new_master_password
    IN SCRIPT TEST====/sbin/ifconfig eth0:0 down==/sbin/ifconfig eth0:0 up===
    Enabling the VIP - on the new master - 
    bash: /usr/bin/arping: No such file or directory
    Mon Oct 29 15:49:10 2018 - [info]  OK.
    Mon Oct 29 15:49:10 2018 - [info] Setting read_only=0 on
    Mon Oct 29 15:49:10 2018 - [info]  ok.
    Mon Oct 29 15:49:10 2018 - [info] ** Finished master recovery successfully.
    Mon Oct 29 15:49:10 2018 - [info] * Phase 3: Master Recovery Phase completed.
    Mon Oct 29 15:49:10 2018 - [info] 
    Mon Oct 29 15:49:10 2018 - [info] * Phase 4: Slaves Recovery Phase..
    Mon Oct 29 15:49:10 2018 - [info] 
    Mon Oct 29 15:49:10 2018 - [info] 
    Mon Oct 29 15:49:10 2018 - [info] * Phase 4.1: Starting Slaves in parallel..
    Mon Oct 29 15:49:10 2018 - [info] 
    Mon Oct 29 15:49:10 2018 - [info] -- Slave recovery on host started, pid: 16740. Check tmp log /var/log/masterha/app1/ if it takes time..
    Mon Oct 29 15:49:12 2018 - [info] 
    Mon Oct 29 15:49:12 2018 - [info] Log messages from ...
    Mon Oct 29 15:49:12 2018 - [info] 
    Mon Oct 29 15:49:10 2018 - [info]  Resetting slave and starting replication from the new master
    Mon Oct 29 15:49:10 2018 - [info]  Executed CHANGE MASTER.
    Mon Oct 29 15:49:11 2018 - [info]  Slave started.
    Mon Oct 29 15:49:11 2018 - [info]  gtid_wait(7390a401-b705-11e8-9ed9-080027b0b461:1-149275,
    a4774e9e-bba9-11e8-bf3e-08002712513f:1-2) completed on Executed 0 events.
    Mon Oct 29 15:49:12 2018 - [info] End of log messages from
    Mon Oct 29 15:49:12 2018 - [info] -- Slave on host started.
    Mon Oct 29 15:49:12 2018 - [info] All new slave servers recovered successfully.
    Mon Oct 29 15:49:12 2018 - [info] 
    Mon Oct 29 15:49:12 2018 - [info] * Phase 5: New master cleanup phase..
    Mon Oct 29 15:49:12 2018 - [info] 
    Mon Oct 29 15:49:12 2018 - [info] Resetting slave info on the new master..
    Mon Oct 29 15:49:13 2018 - [info] Resetting slave info succeeded.
    Mon Oct 29 15:49:13 2018 - [info] Master failover to completed successfully.
    Mon Oct 29 15:49:13 2018 - [info] 
    ----- Failover Report -----
    app1: MySQL Master failover to succeeded
    Master is down!
    Check MHA Manager logs at mysqldb2 for details.
    Started manual(interactive) failover.
    Invalidated master IP address on
    Selected as a new master.
    OK: Applying all logs succeeded.
    OK: Activated master IP address.
    OK: Slave started, replicating from
    Resetting slave info succeeded.
    Master failover to completed successfully.
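
Two messages in the output above come from the custom `master_ip_failover` script rather than from MHA itself: it rejects `--new_master_user` and `--new_master_password`, which MHA 0.57 passes with `--command=start`, and `/usr/bin/arping` is missing on the new master. The source does not show that script, so the following is only a hypothetical shell sketch of option parsing that accepts those two options and checks for `arping` before relying on it. All function and variable names are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch: parse the --key=value options that MHA passes
# to a master_ip_failover_script.
parse_failover_args() {
  cmd="" ssh_user="" orig_master_host="" new_master_host=""
  new_master_user="" new_master_password=""
  for arg in "$@"; do
    case "$arg" in
      --command=*)             cmd="${arg#*=}" ;;
      --ssh_user=*)            ssh_user="${arg#*=}" ;;
      --orig_master_host=*)    orig_master_host="${arg#*=}" ;;
      --new_master_host=*)     new_master_host="${arg#*=}" ;;
      --new_master_user=*)     new_master_user="${arg#*=}" ;;
      --new_master_password=*) new_master_password="${arg#*=}" ;;
      --*=*) ;;  # options this sketch does not use (ip, port, ...)
      *) echo "unexpected argument: $arg" >&2; return 1 ;;
    esac
  done
}

# True when arping is installed somewhere on PATH.
have_arping() { command -v arping >/dev/null 2>&1; }

# Demo with placeholder values; prints: cmd=start new_master_user=mha_rep
parse_failover_args --command=start --ssh_user=root \
  --new_master_host=new-master.example --new_master_port=3306 \
  --new_master_user=mha_rep --new_master_password=xxx
echo "cmd=$cmd new_master_user=$new_master_user"
have_arping || echo "arping not found; neighbours may keep a stale ARP entry for the VIP" >&2
```

Accepting the extra options removes the `Unknown option` lines, and the `arping` check turns a silent miss into an explicit warning.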

    6. Manually rejoin the original mysqldb1 to the cluster

    mysql> start slave;
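
`start slave` on its own is enough only if mysqldb1 already holds replication settings that point at the new master. If it does not, the `CHANGE MASTER TO` statement that MHA printed during the failover can be reused. The sketch below extracts it from saved failover output; the helper name `change_master_stmt` and the host in the demo are hypothetical, and the password appears masked as `xxx` in the output, so the real one has to be substituted by hand.

```shell
#!/bin/sh
# Print the CHANGE MASTER statement found in MHA failover output on stdin.
change_master_stmt() {
  sed -n 's/.*Statement should be: \(CHANGE MASTER TO .*;\)[[:space:]]*$/\1/p'
}

# Demo with a line shaped like the one in the failover output above
# (the host name is a placeholder).
change_master_stmt <<'EOF'
Mon Oct 29 15:49:09 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='new-master.example', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
EOF

# On mysqldb1, after restarting mysqld, the sequence would then be:
#   mysql> CHANGE MASTER TO ...;   -- statement from above, with the real password
#   mysql> START SLAVE;
#   mysql> SHOW SLAVE STATUS\G     -- confirm both replication threads are running
```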

    7. Manual online switchover

    masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host= --orig_master_is_new_slave
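
An online switch is normally run while the MHA manager is not monitoring the cluster, and after replication has been verified as healthy. The sketch below prints one possible command sequence around the switch shown above. It is a dry run that only echoes the commands; the new master host is a hypothetical placeholder because the source elides it.

```shell
#!/bin/sh
CONF=/etc/masterha/app1.cnf
NEW_MASTER_HOST="${NEW_MASTER_HOST:-new-master.example}"   # hypothetical placeholder

# Print the commands for a manual online switch, in order.
online_switch_plan() {
  echo "masterha_check_status --conf=$CONF"   # is the manager currently monitoring?
  echo "masterha_stop --conf=$CONF"           # stop monitoring before switching
  echo "masterha_check_repl --conf=$CONF"     # verify replication health
  echo "masterha_master_switch --conf=$CONF --master_state=alive --new_master_host=$NEW_MASTER_HOST --orig_master_is_new_slave"
}

online_switch_plan
```

Printing the plan first leaves room to review each step; once the switch has completed, the manager would be started again so that monitoring resumes.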
  • Original article: https://www.cnblogs.com/wanbin/p/9899596.html