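Background:
A customer reported that the slave falls behind the master every day in the early hours of the morning, so we needed to track down the cause.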
1> First, check what the master does in the early hours of each day:
Analyze the binlog
    mysqlbinlog --no-defaults --base64-output=decode-rows -v mysql-bin.000204 > mysql-bin.000204.sql
The decoded log shows a large number of DELETE operations in the early hours:
    #191126 0:08:35 server id 1 end_log_pos 1073744234 CRC32 0x11bf4f5d Table_map: `user`.`table_name` mapped to number 4012691
    # at 1073744234
    #191126 0:08:35 server id 1 end_log_pos 1073744393 CRC32 0xa3229214 Delete_rows: table id 4012691 flags: STMT_END_F
    ### DELETE FROM `user`.`table_name`
    ### WHERE
    ###   @1=259121
    ###   @2='2019-11-25'
    ###   @3=1
    ###   @4=1
    ###   @5=0
    ###   @6='2019-11-25'
    ###   @7='08:30'
    ###   @8='2019-11-25'
    ###   @9='17:30'
    ###   @10=540
    ###   @11=''
    ###   @12=''
    ###   @13=NULL
    ###   @14=''
    ###   @15=''
    ###   @16=NULL
    ###   @17=0
    ###   @18=0
    ###   @19=0
    ###   @20=0
    ###   @21=0
    ###   @22=540
    ###   @23=0
    ###   @24=0
    ###   @25=0
    ###   @26=0
    ###   @27=0
    ###   @28='{}'
    # at 1073744393
    #191126 0:08:35 server id 1 end_log_pos 1073744424 CRC32 0x5a03e7aa Xid = 25909247548
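Reading the decoded file by eye does not scale, so a rough per-table count of deleted rows can help confirm which table is responsible. The following is only a sketch: the function name count_deletes is made up for illustration, and it assumes the file was produced with -v as above, so that every deleted row appears as a line starting with "### DELETE FROM".

```shell
#!/bin/sh
# Hypothetical helper: count row-level DELETE images per table in a decoded
# binlog dump (output of mysqlbinlog --base64-output=decode-rows -v).
# Prints "<count> <table>" lines, largest count first.
count_deletes() {
    awk '/^### DELETE FROM/ { n[$4]++ }
         END { for (t in n) print n[t], t }' "$1" | sort -rn
}

# Usage (file name taken from the command above):
#   count_deletes mysql-bin.000204.sql
```

To narrow the dump to the early-morning window first, mysqlbinlog also accepts --start-datetime and --stop-datetime.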
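2> Conclusion: the bulk DELETEs generate a large volume of binlog, and the slave cannot apply it fast enough.
In general, the root cause of a slave lagging far behind its master is that the replication threads on the slave cannot truly work concurrently. Put simply,
the master commits transactions concurrently (mainly with the InnoDB engine), whereas on the slave,
by default only a single SQL thread applies the binlog events, so under heavy concurrent writes the slave falls far behind the master.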
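To put a number on the lag, the Seconds_Behind_Master field of SHOW SLAVE STATUS can be sampled on the slave. This is a minimal sketch: lag_seconds is a hypothetical helper name, and it assumes the mysql client can connect without extra options (for example via credentials in ~/.my.cnf).

```shell
#!/bin/sh
# Hypothetical helper: read the output of SHOW SLAVE STATUS\G on stdin
# and print the value of Seconds_Behind_Master.
lag_seconds() {
    awk -F': ' '/Seconds_Behind_Master:/ { print $2 }'
}

# Usage on the slave (connection options are an assumption):
#   mysql -e 'SHOW SLAVE STATUS\G' | lag_seconds
```

Sampling this value during the early-morning window, before and after the change, gives a simple way to tell whether the parallel settings actually help.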
Check how the slave is currently configured for replication:
    mysql> show variables like "%slave%";
    +-------------------------------------------+----------------------+
    | Variable_name                             | Value                |
    +-------------------------------------------+----------------------+
    | init_slave                                |                      |
    | log_slave_updates                         | ON                   |
    | log_slow_slave_statements                 | OFF                  |
    | pseudo_slave_mode                         | OFF                  |
    | rpl_semi_sync_master_wait_for_slave_count | 1                    |
    | rpl_semi_sync_master_wait_no_slave        | OFF                  |
    | rpl_semi_sync_slave_enabled               | ON                   |
    | rpl_semi_sync_slave_trace_level           | 32                   |
    | rpl_stop_slave_timeout                    | 31536000             |
    | slave_allow_batching                      | OFF                  |
    | slave_checkpoint_group                    | 512                  |
    | slave_checkpoint_period                   | 300                  |
    | slave_compressed_protocol                 | OFF                  |
    | slave_exec_mode                           | STRICT               |
    | slave_load_tmpdir                         | /tmp                 |
    | slave_max_allowed_packet                  | 1073741824           |
    | slave_net_timeout                         | 60                   |
    | slave_parallel_type                       | DATABASE             |
    | slave_parallel_workers                    | 0                    |
    | slave_pending_jobs_size_max               | 134217728            |
    | slave_preserve_commit_order               | OFF                  |
    | slave_rows_search_algorithms              | INDEX_SCAN,HASH_SCAN |
    | slave_skip_errors                         | OFF                  |
    | slave_sql_verify_checksum                 | ON                   |
    | slave_transaction_retries                 | 10                   |
    | slave_type_conversions                    |                      |
    | sql_slave_skip_counter                    | 0                    |
    +-------------------------------------------+----------------------+
    27 rows in set (0.00 sec)
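The parameters for multi-threaded parallel replication on the slave (these are the key settings for multi-threaded replication) are set in the database configuration file my.cnf, under the [mysqld] section:
    slave-parallel-type=LOGICAL_CLOCK
    # 16 is the number of parallel worker threads; adjust it later according to the project's replication requirements
    slave-parallel-workers=16
    # Within a single schema, the slave_parallel_workers worker threads concurrently apply the transactions from the relay log that were committed on the master
    master_info_repository=TABLE
    relay_log_info_repository=TABLE
    relay_log_recovery=ON
Note: the variable slave-parallel-type can take two values:
    DATABASE: the default; parallel replication on a per-database basis
    LOGICAL_CLOCK: parallel replication based on group commit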
1. Check the current replication type and number of parallel workers
    mysql> show variables like 'slave_parallel_type';
    +---------------------+----------+
    | Variable_name       | Value    |
    +---------------------+----------+
    | slave_parallel_type | DATABASE |
    +---------------------+----------+
    1 row in set (0.00 sec)
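The current replication type is DATABASE, which means that transactions within the same database are applied by a single thread and cannot be replicated in parallel.
    mysql> show variables like 'slave_parallel_workers';
    +------------------------+-------+
    | Variable_name          | Value |
    +------------------------+-------+
    | slave_parallel_workers | 0     |
    +------------------------+-------+
    1 row in set (0.01 sec)
The current number of parallel worker threads is 0, so multi-threaded replication is disabled and everything is applied by the single SQL thread.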
Configure multi-threaded replication
1. Stop replication on the slave
    mysql> stop slave;
    Query OK, 0 rows affected (0.01 sec)
2. Set the replication type to LOGICAL_CLOCK
    mysql> set global slave_parallel_type='logical_clock';
    Query OK, 0 rows affected (0.00 sec)

    mysql> show variables like 'slave_parallel_type';
    +---------------------+---------------+
    | Variable_name       | Value         |
    +---------------------+---------------+
    | slave_parallel_type | LOGICAL_CLOCK |
    +---------------------+---------------+
    1 row in set (0.01 sec)
3. Set the number of parallel workers to 4
    mysql> set global slave_parallel_workers=4;
    Query OK, 0 rows affected (0.00 sec)

    mysql> show variables like 'slave_parallel_workers';
    +------------------------+-------+
    | Variable_name          | Value |
    +------------------------+-------+
    | slave_parallel_workers | 4     |
    +------------------------+-------+
    1 row in set (0.00 sec)
4. Start replication on the slave
    mysql> start slave;
    Query OK, 0 rows affected (0.02 sec)
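After replication is started again, it may be worth confirming that the worker threads are actually running. The sketch below assumes MySQL 5.7 or later with performance_schema enabled, where the table performance_schema.replication_applier_status_by_worker lists one row per applier worker; count_running_workers is a hypothetical helper name, and the connection options are again an assumption.

```shell
#!/bin/sh
# Hypothetical helper: read tab-separated "WORKER_ID<TAB>SERVICE_STATE" rows
# on stdin and print how many workers report SERVICE_STATE = ON.
count_running_workers() {
    awk -F'\t' '$2 == "ON" { n++ } END { print n + 0 }'
}

# Usage on the slave (-N drops the header row, -B gives tab-separated output):
#   mysql -N -B -e "SELECT WORKER_ID, SERVICE_STATE
#                   FROM performance_schema.replication_applier_status_by_worker" \
#     | count_running_workers
```

With the settings above, the count would be expected to match slave_parallel_workers (4 here). Note that values changed with set global do not survive a server restart, which is why the same parameters are also placed in my.cnf.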