zoukankan      html  css  js  c++  java
  • mysql5.7 MGR配置

    1. 规划

    2 连接master报错, 使用主机名连接,所有需要修改主机 /etc/hosts

    cat /etc/hosts
    127.0.0.1    localhost
    172.18.12.2     master
    172.18.12.3     slave1
    172.18.12.4     slave2
    172.18.12.5     slave3
    172.18.12.6     slave4

    3.设置mysql启动参数

    #关于group replication 一些参数
    master_info_repository = TABLE
    relay_log_info_repository = TABLE
    relay_log_recovery=1
    gtid_mode = on
    enforce_gtid_consistency = 1
    log_slave_updates = ON
    binlog_checksum=NONE
    slave_preserve_commit_order = ON #开启并行复制相关
    server_id=1 #第一个节点是1 ,第二个节点是2 ......
    slave_parallel_type=LOGICAL_CLOCK
    slave_parallel_workers=2-8 #根据cpu数量 transaction_write_set_extraction
    =XXHASH64 loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" loose-group_replication_start_on_boot=off loose-group_replication_local_address="172.18.12.2:23306" loose-group_replication_group_seeds="172.18.12.2:23306,172.18.12.3:23306,172.18.12.4:23306,172.18.12.5:23306,172.18.12.6:23306" loose-grup_replication_bootstrap_group=off

    4.初始化:

    mysqld --initialize


    5.第一个节点启动

    mysqld --defaults-file=/etc/my.cnf &

    6. 登录到mysql, 创建一个账户,不记录到binlog中

    mysql -u root -p 

    set
    sql_log_bin=0;
    create user
    'feng'@'172.18.12.%'; grant replication slave on *.* to 'feng'@'172.18.12.%' identified by '123456'; set sql_log_bin=1;

    7.  channel 的名字不能改变

    root@localhost 03:07:  [(none)]> change master to master_user='feng', master_password='123456' FOR CHANNEL  'group_replication_recovery';

    8. 安装 加载group_replication插件

    install plugin group_replication soname 'group_replication.so';
    show plugins;

    9. 查看group相关参数

    root@localhost 03:11:  [(none)]> show global variables  like '%group%';

    10. 设置第一个节点启动为on

     set global group_replication_bootstrap_group=on;   #只在第一个节点使用

    11. 启动group_replication

    start group_replication;

    #启动后报错

      2020-03-25T13:31:11.636568+08:00 6 [Warning] Plugin group_replication reported: 'Group Replication requires slave-preserve-commit-order to be set to ON when using more than 1 applier threads.'

      slave_preserve_commit_order=on #并行复制相关 

    12. 查看是否加入到组中

    select * from performance_schema.replication_group_members;

    root@localhost 05:50: [performance_schema]> select * from replication_group_members;
    +---------------------------+--------------------------------------+-------------+-------------+--------------+
    | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
    +---------------------------+--------------------------------------+-------------+-------------+--------------+
    | group_replication_applier | a59bc963-5772-11ea-93aa-0242ac120c02 | master | 3306 | ONLINE |
    +---------------------------+--------------------------------------+-------------+-------------+--------------+

    二、加入二个节点172.18.12.3 节点

    2.设置mysql启动参数

    #关于group replication 一些参数
    master_info_repository = TABLE
    relay_log_info_repository = TABLE
    relay_log_recovery=1
    gtid_mode = on
    enforce_gtid_consistency = 1
    log_slave_updates = ON
    server_id=2 #第一个节点是1 ,第二个节点是2 ......
    binlog_checksum=NONE
    slave_preserve_commit_order = ON #开启并行复制相关
    slave_parallel_type=LOCIGCAL_CLOCK
    slave_parallel_workers=2-8 #根据cpu数量

    transaction_write_set_extraction=XXHASH64 loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" loose-group_replication_start_on_boot=off loose-group_replication_local_address="172.18.12.3:23306" loose-group_replication_group_seeds="172.18.12.2:23306,172.18.12.3:23306,172.18.12.4:23306,172.18.12.5:23306,172.18.12.6:23306" loose-grup_replication_bootstrap_group=off

    初始化mysql,启动mysql

    mysql --initialize
    
    mysqld --defaults-file=/etc/my.cnf &

     5. 登录到mysql, 创建一个账户,不记录到binlog中

    mysql -u root -p 

    set sql_log_bin=0; create user 'feng'@'172.18.12.%'; grant replication slave on *.* to 'feng'@'172.18.12.%' identified by '123456'; set sql_log_bin=1;

    6.  channel 的名字不能改变

    root@localhost 03:07:  [(none)]> change master to master_user='feng', master_password='123456' FOR CHANNEL  'group_replication_recovery';

    7. 安装 加载group_replication插件

    install plugin group_replication soname 'group_replication.so';
    show plugins;

    10. 启动group_replication

    set global group_replication_allow_local_disjoint_gtids_join=ON;
    start group_replication;

    11. 查看状态:

    root@localhost 07:12:  [(none)]> show slave status for channel 'group_replication_recovery' G;
    *************************** 1. row ***************************
                   Slave_IO_State: 
                      Master_Host: master
                      Master_User: rpl_user
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: 
              Read_Master_Log_Pos: 4
                   Relay_Log_File: slave1-relay-bin-group_replication_recovery.000001
                    Relay_Log_Pos: 4
            Relay_Master_Log_File: 
                 Slave_IO_Running: No
                Slave_SQL_Running: Yes
                  Replicate_Do_DB: 
              Replicate_Ignore_DB: 
               Replicate_Do_Table: 
           Replicate_Ignore_Table: 
          Replicate_Wild_Do_Table: 
      Replicate_Wild_Ignore_Table: 
                       Last_Errno: 0
                       Last_Error: 
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 0
                  Relay_Log_Space: 1534
                  Until_Condition: SQL_VIEW_ID
                   Until_Log_File: 
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File: 
               Master_SSL_CA_Path: 
                  Master_SSL_Cert: 
                Master_SSL_Cipher: 
                   Master_SSL_Key: 
            Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 2005
                    Last_IO_Error: error connecting to master 'rpl_user@master:3306' - retry-time: 60  retries: 1
                   Last_SQL_Errno: 0
                   Last_SQL_Error: 
      Replicate_Ignore_Server_Ids: 
                 Master_Server_Id: 0
                      Master_UUID: 
                 Master_Info_File: mysql.slave_master_info
                        SQL_Delay: 0
              SQL_Remaining_Delay: NULL
          Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
               Master_Retry_Count: 1
                      Master_Bind: 
          Last_IO_Error_Timestamp: 200225 07:15:55
         Last_SQL_Error_Timestamp: 
                   Master_SSL_Crl: 
               Master_SSL_Crlpath: 
               Retrieved_Gtid_Set: 
                Executed_Gtid_Set: 
                    Auto_Position: 1
             Replicate_Rewrite_DB: 
                     Channel_Name: group_replication_recovery
               Master_TLS_Version: 
    1 row in set (0.00 sec)
    
    ERROR: 
    No query specified

    再次查看:

    root@localhost 07:20:  [(none)]> show slave status for channel 'group_replication_recovery' G;
    *************************** 1. row ***************************
                   Slave_IO_State: 
                      Master_Host: <NULL>
                      Master_User: rpl_user
                      Master_Port: 0
                    Connect_Retry: 60
                  Master_Log_File: 
              Read_Master_Log_Pos: 4
                   Relay_Log_File: slave1-relay-bin-group_replication_recovery.000001
                    Relay_Log_Pos: 4
            Relay_Master_Log_File: 
                 Slave_IO_Running: No
                Slave_SQL_Running: No
                  Replicate_Do_DB: 
              Replicate_Ignore_DB: 
               Replicate_Do_Table: 
           Replicate_Ignore_Table: 
          Replicate_Wild_Do_Table: 
      Replicate_Wild_Ignore_Table: 
                       Last_Errno: 0
                       Last_Error: 
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 0
                  Relay_Log_Space: 496
                  Until_Condition: None
                   Until_Log_File: 
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File: 
               Master_SSL_CA_Path: 
                  Master_SSL_Cert: 
                Master_SSL_Cipher: 
                   Master_SSL_Key: 
            Seconds_Behind_Master: NULL
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error: 
                   Last_SQL_Errno: 0
                   Last_SQL_Error: 
      Replicate_Ignore_Server_Ids: 
                 Master_Server_Id: 0
                      Master_UUID: 
                 Master_Info_File: mysql.slave_master_info
                        SQL_Delay: 0
              SQL_Remaining_Delay: NULL
          Slave_SQL_Running_State: 
               Master_Retry_Count: 1
                      Master_Bind: 
          Last_IO_Error_Timestamp: 
         Last_SQL_Error_Timestamp: 
                   Master_SSL_Crl: 
               Master_SSL_Crlpath: 
               Retrieved_Gtid_Set: 
                Executed_Gtid_Set: 09440083-2b74-11ea-838b-549f3510fe78:1-4
                    Auto_Position: 1
             Replicate_Rewrite_DB: 
                     Channel_Name: group_replication_recovery
               Master_TLS_Version: 
    1 row in set (0.00 sec)
    
    ERROR: 
    No query specified

    11. 查看是否加入到组中

    root@localhost 07:21:  [performance_schema]> select * from replication_group_members;
    +---------------------------+--------------------------------------+-------------+-------------+--------------+
    | CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
    +---------------------------+--------------------------------------+-------------+-------------+--------------+
    | group_replication_applier | 66576b31-579b-11ea-b037-0242ac120c02 | master      |        3306 | ONLINE       |
    | group_replication_applier | 6808a18a-579b-11ea-97cd-0242ac120c03 | slave1      |        3306 | ONLINE       |
    | group_replication_applier | 695065b9-579b-11ea-a932-0242ac120c05 | slave3      |        3306 | ONLINE       |
    | group_replication_applier | 6bc82210-579b-11ea-bb3b-0242ac120c04 | slave2      |        3306 | ONLINE       |
    +---------------------------+--------------------------------------+-------------+-------------+--------------+
    4 rows in set (0.00 sec)

    配置出现的问题:

    1 [ERROR] Plugin group_replication reported: 'Gtid mode should be ON for Group Replication'

    没有打开GtID

    2 Last_IO_Error: error connecting to master 'rpl_user@master:3306' - retry-time: 60  retries: 1

    添加所有节点的主机名和ip的对应关系到/etc/hosts

    3.  [ERROR] Plugin group_replication reported: '[GCS] Error on opening a connection to 172.17.12.6:23306 on local port: 23306.

    /etc/my.cnf 配置文件 loose-group_replication_group_seeds ip地址写错了

    4. 第二个节点加入MGR时报错[

    ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: f9418f39-8f6a-11ea-8e60-005056841c10:1 > Group transactions: 6c0d784d-8f66-11ea-bd2e-00505684742a:1,
    aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-5'
    2020-05-06T16:09:46.375914+08:00 0 [ERROR] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.'

    解决办法 需要设置参数,也就是兼容加入组
    set global group_replication_allow_local_disjoint_gtids_join=ON;

    12 创建测试数据库(master可以写入,其他节点不能写入)

    master:

    root@localhost 05:52: [fengjian]> create database fengjian; root@localhost 05:52: [fengjian]> use fengjian; root@localhost 05:52: [fengjian]> create table test(id int, name varchar(60), primary key(id)); root@localhost 05:52: [fengjian]> insert into test values(1,'feng');


    slave: 报错,无法写入

      root@localhost 08:18: [fengjian]> insert into test values(2,'su');
      ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement


    查看其他节点所有节点的read_only 选项

    master
    slave1

    Group replication 工作模式

    1. 基于single-master 环境:

    在group replication 配置中,默认模式是 single-master 模式, 只有一个节点可以进行写操作,其他节点开启 --super-read-only 模式, 当主几点挂了, 其他节点会选出来一个新的master

    在 single-master 上查看那个节点是 主节点:

    root@localhost 08:25:  [performance_schema]> select variable_value from global_status where variable_name='group_replication_primary_member';
    +--------------------------------------+
    | variable_value                       |
    +--------------------------------------+
    | 43567e06-6e49-11ea-a1ef-0242ac110003 |
    +--------------------------------------+

    登录到 其他节点上,查看 uuid

    root@localhost 08:27:  [fengjian]> select @@server_uuid;
    +--------------------------------------+
    | @@server_uuid                        |
    +--------------------------------------+
    | 43567e06-6e49-11ea-a1ef-0242ac110003 |
    +--------------------------------------+

    multi-master 模式 (多主,性能有15%的损失)

    如果要开启multi-master 模式,需要在配置文件/etc/my.cnf 中指定

    loose-group_replication_single_primary_mode=OFF; 
    loose-group_replication_enforce_update_everywhere_checks=ON;

    不管是 single-master 还是 multi-master ,我们只提供一台负责写操作

    集群重启:

    1. 滚动重启

    节点上执行 :  start group_replication;

    2. 全部一起重启(集体掉电)

    第一个节点启动(single-master 和 multi-master 都一样)

    要在每个节点执行 show master stauts; 如果那个gtid多,则设置成主节点。
    
    set global_replication_bootstrap_group=on;
    
    start group_replication;

    其他节点 start  group-replication;

    MGR 监控点讲解:

    可用性监控

    root@localhost 09:28:  [performance_schema]> select member_id,member_host,member_state from replication_group_members where member_id=@@server_uuid;
    +--------------------------------------+-------------+--------------+
    | member_id                            | member_host | member_state |
    +--------------------------------------+-------------+--------------+
    | 66576b31-579b-11ea-b037-0242ac120c02 | master      | ONLINE       |
    +--------------------------------------+-------------+--------------+

    查看当前是不是可读

    root@localhost 09:32:  [performance_schema]> select * from performance_schema.global_variables where variable_name in ('read_only', 'super_read_only');
    +-----------------+----------------+
    | VARIABLE_NAME   | VARIABLE_VALUE |
    +-----------------+----------------+
    | read_only       | OFF            |
    | super_read_only | OFF            |
    +-----------------+----------------+

    为可读节点


    复制是不是存在延迟:
    对比获得到的GTID和本节点执行的GTID是不是一致:

    从远程获取的GTID:

    SELECT Received_transaction_set FROM performance_schema.replication_connection_status WHERE Channel_name = 'group_replication_applier';


    本节点执行的GTID:

    select  @@global.gtid_executed;   不显示结果

    root@localhost 09:47: [performance_schema]> show global variables like '%gtid_executed%';
    +----------------------------------+--------------------------------------------+
    | Variable_name | Value |
    +----------------------------------+--------------------------------------------+
    | gtid_executed | 09440083-2b74-11ea-838b-549f3510fe78:1-977 |

    
    远程获取的GTID - 本节点执行的GTID = 延迟的GTID数

    本节点执行队列是不是有堆积(大于0表示有延迟):

    select count_transactions_in_queue from replication_group_member_stats where member_id=@@server_uuid;

    流控(flow control)
    在MGR中如果节点落后集群中其它成员太多,就会发起让其它节点等他完成在做的控制,这个叫流控。

    当启用: 
    group_replication_flow_control_mode=QUOTA 是表示启用流控。 流控默认通过两个参数控制: group_replication_flow_control_applier_threshold (默认: 25000) group_replication_flow_control_certifier_threshold (默认: 25000)
    也就说默认延迟在25000个GTID时,会对整个集群Block住写操作。

    当然,也可以允许,节点延迟,就如同我们主从结构,从节点延迟,不往上面发请求就可以。

    关闭流控:

    set global group_replication_flow_control_mode='DISABLED';

    提示: 关闭流控制,注意查看是不是存在延迟,如果延迟,自已控制阀值不向上面发请求即可。 多IDC结构的MGR,建议关闭流控。

    MGR调优参数


    因为基本复制结构,所有的数据复制,还是逻辑的重放,所以优化也是复制优化点。
    更改:

    slave_parallel_type -> LOGICAL_CLOCK   #增强sql_thread个数:
    slave_parallel_workers -> 2-8          #如果CPU瓶颈,网络没问题,减少CPU压缩:
    group_replication_compression_threshold = 1000000 -> 2000000
    由原来的1M变成2M,再进行压缩(主要针对大事务传述优化)
    
     

    MGR备份及新节点的加入

    1. 通过备份工具,建立备份
    2. 加载MGR插件
    3. mysqldump -u root --single-transaction -A > full_bak.sql  其他从库,备份
    4. 新加入的mgr节点配置文件

      transaction_write_set_extraction=XXHASH64
      loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
      loose-group_replication_start_on_boot=off
      loose-group_replication_local_address="172.18.12.7:23306"
      loose-group_replication_group_seeds="172.18.12.2:23306,172.18.12.3:23306,172.18.12.4:23306,172.18.12.5:23306,172.18.12.6:23306,172.18.12.7:23306"
      loose-group_replication_bootstrap_group=off

    5. mysql -u root -p  ; reset master ; mysql -u root -p < full_bak.sql
    6. 可以加入集群的关键点: set global gtid_purged='xxx-xxxx-xxxx:id'
    7. start group_replication;
    8. use performance_schema; select * from replication_group_members;
     

    root@localhost 08:52: [performance_schema]> show slave status FOR CHANNEL 'group_replication_recovery' G
    *************************** 1. row ***************************
    Slave_IO_State:
    Master_Host: <NULL>
    Master_User: rpl_user
    Master_Port: 0
    Connect_Retry: 60
    Master_Log_File:
    Read_Master_Log_Pos: 4
    Relay_Log_File: slave3-relay-bin-group_replication_recovery.000001
    Relay_Log_Pos: 4
    Relay_Master_Log_File:
    Slave_IO_Running: No
    Slave_SQL_Running: No
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 0
    Relay_Log_Space: 496
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: NULL
    Master_SSL_Verify_Server_Cert: No
    Last_IO_Errno: 0
    Last_IO_Error:
    Last_SQL_Errno: 0
    Last_SQL_Error:
    Replicate_Ignore_Server_Ids:
    Master_Server_Id: 0
    Master_UUID:
    Master_Info_File: mysql.slave_master_info
    SQL_Delay: 0
    SQL_Remaining_Delay: NULL
    Slave_SQL_Running_State:
    Master_Retry_Count: 1
    Master_Bind:
    Last_IO_Error_Timestamp:
    Last_SQL_Error_Timestamp:
    Master_SSL_Crl:
    Master_SSL_Crlpath:
    Retrieved_Gtid_Set:
    Executed_Gtid_Set: 09440083-2b74-11ea-838b-549f3510fe78:1-896
    Auto_Position: 1
    Replicate_Rewrite_DB:
    Channel_Name: group_replication_recovery
    Master_TLS_Version:

    
    

    如果binlog已经过期, start group_replication 启动报错,因为从库的gtid信息已经删除了,

    引用

    https://www.cnblogs.com/manger/p/7212039.html

  • 相关阅读:
    SpringMVC从Request域中获取数据
    SpringMVC重定向
    SpringMVC的请求转发的三种方法
    SpringMVC文件上传
    SpringMVC处理请求释放静态资源的三种方式
    jackson实现json转换
    SpringMVC之请求部分
    SpringMVC的执行流程
    Java [Leetcode 39]Combination Sum
    深入解析Java对象的hashCode和hashCode在HashMap的底层数据结构的应用
  • 原文地址:https://www.cnblogs.com/fengjian2016/p/12361361.html
Copyright © 2011-2022 走看看