Case 1: GTID is enabled and the replica is out of sync with the master.
1. Check the replication status on the replica
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.120.141.168
Master_User: sys_repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File:
Read_Master_Log_Pos: 4
Relay_Log_File: mysqld-relay-bin.000001
Relay_Log_Pos: 4
......
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
......
Retrieved_Gtid_Set: 79212d47-7122-11e7-8641-0050569f788a:13579-17760
Executed_Gtid_Set: 07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-17760,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
The replica reports error 1236. Next, check the status on the master:
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000480
Position: 10751
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: 07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-17797,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
ac29dfbf-aa66-11e8-9d1e-5254003471ec:1
1 row in set (0.00 sec)

mysql> show variables like '%gtid_purged%';
+---------------+-----------------------------------------------+
| Variable_name | Value                                         |
+---------------+-----------------------------------------------+
| gtid_purged   | 07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,     |
|               | 635227c2-af6c-11e8-a447-5254003471ec:1-297,   |
|               | 66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,    |
|               | 79212d47-7122-11e7-8641-0050569f788a:1-15338, |
|               | 9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,     |
|               | ac29dfbf-aa66-11e8-9d1e-5254003471ec:1        |
+---------------+-----------------------------------------------+
1 row in set (0.00 sec)
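Comparing two long GTID sets by eye is error-prone. The following is a minimal Python sketch, not part of the original troubleshooting session, that subtracts the replica's Executed_Gtid_Set from the master's gtid_purged to list the transactions the master has purged but the replica never executed. The function names (parse_gtid_set, gtid_subtract, format_gtid_set) are hypothetical helpers written for this illustration; MySQL also provides a built-in GTID_SUBTRACT() function that can do the same comparison on the server.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7, uuid2:3' into {uuid: [(first, last), ...]} (inclusive)."""
    result = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *ranges = part.split(":")
        for r in ranges:
            first, _, last = r.partition("-")
            result.setdefault(uuid.lower(), []).append((int(first), int(last or first)))
    return result


def gtid_subtract(a, b):
    """Return the GTIDs that are in a but not in b, in the same dict form."""
    missing = {}
    for uuid, intervals in a.items():
        for first, last in intervals:
            pieces = [(first, last)]
            for b_first, b_last in b.get(uuid, []):
                remaining = []
                for p_first, p_last in pieces:
                    if b_last < p_first or b_first > p_last:
                        remaining.append((p_first, p_last))  # no overlap, keep whole piece
                        continue
                    if p_first < b_first:
                        remaining.append((p_first, b_first - 1))  # part below the overlap
                    if p_last > b_last:
                        remaining.append((b_last + 1, p_last))  # part above the overlap
                pieces = remaining
            if pieces:
                missing.setdefault(uuid, []).extend(pieces)
    return missing


def format_gtid_set(gtid_set):
    """Render the dict form back into MySQL's 'uuid:first-last' text form."""
    return ",".join(
        uuid + "".join(f":{a}" if a == b else f":{a}-{b}" for a, b in sorted(iv))
        for uuid, iv in sorted(gtid_set.items())
    )


# Values copied from the outputs above.
MASTER_PURGED = """07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-15338,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
ac29dfbf-aa66-11e8-9d1e-5254003471ec:1"""

REPLICA_EXECUTED = """07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-17760,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6"""

missing = gtid_subtract(parse_gtid_set(MASTER_PURGED), parse_gtid_set(REPLICA_EXECUTED))
print(format_gtid_set(missing))  # ac29dfbf-aa66-11e8-9d1e-5254003471ec:1
```

A non-empty result means the master no longer holds binary logs for something the replica still needs, which is the condition that error 1236 reports.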
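It turns out there is a GTID that the replica had not yet executed but that the master had already purged (ac29dfbf-aa66-11e8-9d1e-5254003471ec:1 appears in the master's gtid_purged but not in the replica's Executed_Gtid_Set). Since the replica carries no business workload and is only a backup, I simply did the following: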
mysql> stop slave;
mysql> reset master;
mysql> set global gtid_purged = '07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
    '> 635227c2-af6c-11e8-a447-5254003471ec:1-297,
    '> 66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
    '> 79212d47-7122-11e7-8641-0050569f788a:1-15223,
    '> 9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
    '> ac29dfbf-aa66-11e8-9d1e-5254003471ec:1';    # the master's gtid_purged
mysql> change master to master_host='10.120.141.136',master_user='sys_replication',master_password='x!Jkz@SIe',master_port=3306,master_auto_position=1;
mysql> start slave;
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.120.141.136
Master_User: sys_replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000480
Read_Master_Log_Pos: 10272
Relay_Log_File: mysqld-relay-bin.000034
Relay_Log_Pos: 10285
Relay_Master_Log_File: mysql-bin.000480
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 10272
Relay_Log_Space: 10580
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 141135
Master_UUID: 79212d47-7122-11e7-8641-0050569f788a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 79212d47-7122-11e7-8641-0050569f788a:15224-17793
Executed_Gtid_Set: 07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-17793,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
ac29dfbf-aa66-11e8-9d1e-5254003471ec:1
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
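Summary:
Before a replica starts replicating, the master uses GTIDs to confirm that, once replication begins, the replica can replay every transaction that was executed on the master (including those applied by the slave SQL thread) and so end up fully consistent with the master.
The check is simple and rests on two conditions:
1. The master must not have purged any transaction the replica has not yet executed (that is, the master's gtid_purged must be contained in the replica's Executed_Gtid_Set).
2. The master's transaction numbers must not be behind the replica's (that is, the last transaction in the replica's Executed_Gtid_Set must fall within the master's Executed_Gtid_Set).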
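The two conditions can be expressed as two subset tests. Below is a minimal Python sketch of the rule as stated in this summary, not a reproduction of MySQL's internal check. The helper names (parse_gtid_set, is_subset, check_auto_position) are hypothetical, and the sketch assumes each set is in the normalized form MySQL prints (disjoint, non-adjacent intervals per UUID). On a live server the built-in GTID_SUBSET() function performs the same kind of test.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7, uuid2:3' into {uuid: [(first, last), ...]} (inclusive)."""
    result = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *ranges = part.split(":")
        for r in ranges:
            first, _, last = r.partition("-")
            result.setdefault(uuid.lower(), []).append((int(first), int(last or first)))
    return result


def is_subset(a, b):
    """True if every GTID in a is also in b (b assumed normalized)."""
    return all(
        any(b_first <= first and last <= b_last for b_first, b_last in b.get(uuid, []))
        for uuid, intervals in a.items()
        for first, last in intervals
    )


def check_auto_position(master_purged, master_executed, replica_executed):
    """Evaluate the two conditions from the summary on GTID set strings."""
    purged = parse_gtid_set(master_purged)
    m_exec = parse_gtid_set(master_executed)
    r_exec = parse_gtid_set(replica_executed)
    return {
        # Condition 1: nothing purged on the master that the replica still needs.
        "purged_ok": is_subset(purged, r_exec),
        # Condition 2: the replica has not executed anything the master lacks.
        "replica_not_ahead": is_subset(r_exec, m_exec),
    }


# Values from case 1 above (before the fix).
COMMON = (
    "07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,"
    "635227c2-af6c-11e8-a447-5254003471ec:1-297,"
    "66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,"
    "9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,"
)
master_purged = COMMON + (
    "79212d47-7122-11e7-8641-0050569f788a:1-15338,"
    "ac29dfbf-aa66-11e8-9d1e-5254003471ec:1"
)
master_executed = COMMON + (
    "79212d47-7122-11e7-8641-0050569f788a:1-17797,"
    "ac29dfbf-aa66-11e8-9d1e-5254003471ec:1"
)
replica_executed = COMMON + "79212d47-7122-11e7-8641-0050569f788a:1-17760"

result = check_auto_position(master_purged, master_executed, replica_executed)
print(result)  # {'purged_ok': False, 'replica_not_ahead': True}
```

In case 1 the first condition fails, which matches the 1236 error; the second condition holds.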
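Case 2: A dual-master architecture (one master and one replica, each replicating from the other). Business workloads and applications run on the master; the replica serves as a backup and carries almost no workload.
Replication with the replica (s) pointing at the master (m) works fine, but when the master (m) is pointed at the replica (s) it fails with error 1236.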
mysql> show slave status\G    # status on the master (m)
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.120.141.168
Master_User: sys_repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File:
Read_Master_Log_Pos: 4
Relay_Log_File: mysqld-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File:
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 0
Relay_Log_Space: 154
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 141168
Master_UUID: 055a9521-4906-11e8-8cdb-0050569f3621
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 181119 14:27:58
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set: 055a9521-4906-11e8-8cdb-0050569f3621:1-164859,
2f7211a3-fba0-11e5-b668-0050569f3621:1-627,
30d4f3f6-b56b-11e7-acf1-000d3a801c2f:1-14,
375942c7-0723-11e6-b55c-0050569f3621:1-16630,
61edd40b-af6c-11e8-a4f6-525400adeb6d:1-5059,
971844d6-d7ca-11e6-8d01-0050569f6058:1-11929269,
d408633e-fb9f-11e5-8de2-0050569f6058:1-1317354
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
Now look at gtid_purged on the replica (s):
mysql> show variables like '%gtid%purged%'\G
*************************** 1. row ***************************
Variable_name: gtid_purged
Value: 055a9521-4906-11e8-8cdb-0050569f3621:1-157766,
2f7211a3-fba0-11e5-b668-0050569f3621:1-627,
30d4f3f6-b56b-11e7-acf1-000d3a801c2f:1-14,
375942c7-0723-11e6-b55c-0050569f3621:1-16633,
971844d6-d7ca-11e6-8d01-0050569f6058:1-9581086,
d408633e-fb9f-11e5-8de2-0050569f6058:1-1317354
1 row in set (0.00 sec)
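The gtid_purged on the replica (s) goes beyond the Executed_Gtid_Set on the master (m): for UUID 375942c7-0723-11e6-b55c-0050569f3621, s has purged 1-16633 while m has executed only 1-16630. (Replication from the master to the replica was already set up; the error appears while pointing the master at the replica, so in this direction the current master (m) acts as the replica.)
By the rule summarized earlier, the source (s) must not have purged transactions that its replica (m) has not yet executed (that is, the gtid_purged on s must be contained in the Executed_Gtid_Set on m), so error 1236 is raised. Because business workloads and applications are still running on the master (m), a blunt reset master is not an option there, and the only choice is to bring the executed GTID set on m up to date.
My approach was to skip these three transactions. This is not the only solution, and if the gap spans many transactions it is a poor fit. In this architecture, an error like this most likely means an application executed transactions on the replica; if the replica has executed many of them, the cause needs to be investigated.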
stop slave;
set gtid_next='375942c7-0723-11e6-b55c-0050569f3621:16631';    -- GTID for the next transaction, i.e. the one to skip
begin; commit;    -- inject an empty transaction
set gtid_next='375942c7-0723-11e6-b55c-0050569f3621:16632';
begin; commit;
set gtid_next='375942c7-0723-11e6-b55c-0050569f3621:16633';
begin; commit;
set gtid_next='AUTOMATIC';    -- return to automatic GTID assignment
start slave;    -- resume replication
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.120.141.168
Master_User: sys_repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000026
Read_Master_Log_Pos: 185245755
Relay_Log_File: mysqld-relay-bin.000002
Relay_Log_Pos: 414
Relay_Master_Log_File: mysql-bin.000026
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 185245755
Relay_Log_Space: 622
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 141168
Master_UUID: 055a9521-4906-11e8-8cdb-0050569f3621
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set: 055a9521-4906-11e8-8cdb-0050569f3621:1-164859,
2f7211a3-fba0-11e5-b668-0050569f3621:1-627,
30d4f3f6-b56b-11e7-acf1-000d3a801c2f:1-14,
375942c7-0723-11e6-b55c-0050569f3621:1-16633,
61edd40b-af6c-11e8-a4f6-525400adeb6d:1-5059,
971844d6-d7ca-11e6-8d01-0050569f6058:1-11929305,
d408633e-fb9f-11e5-8de2-0050569f6058:1-1317354
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
That fixed it.
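Typing one empty transaction per GTID by hand invites mistakes once there are more than a few. As a rough sketch under stated assumptions, the Python helper below generates the same statement sequence used in the fix from a set of missing GTIDs. The function name empty_transaction_statements and its limit parameter are hypothetical and exist only for this illustration; the limit reflects the point made earlier that skipping is a poor fit when the gap is large. The generated statements should be reviewed before being run, since skipping a transaction means its data changes are never applied.

```python
def empty_transaction_statements(missing_gtid_set, limit=100):
    """Build the statements that commit one empty transaction per missing GTID.

    missing_gtid_set uses MySQL's text form, e.g. 'uuid:16631-16633'.
    limit is an illustrative safety cap, not a MySQL setting.
    """
    gtids = []
    for part in missing_gtid_set.split(","):
        uuid, *ranges = part.strip().split(":")
        for r in ranges:
            first, _, last = r.partition("-")
            first, last = int(first), int(last or first)
            # Check the size before expanding, so a huge range fails fast.
            if len(gtids) + (last - first + 1) > limit:
                raise ValueError("too many GTIDs to skip; investigate the cause instead")
            gtids.extend(f"{uuid}:{n}" for n in range(first, last + 1))

    statements = ["STOP SLAVE;"]
    for gtid in gtids:
        statements.append(f"SET gtid_next='{gtid}'; BEGIN; COMMIT;")
    statements.append("SET gtid_next='AUTOMATIC';")
    statements.append("START SLAVE;")
    return statements


# The three transactions skipped in case 2.
stmts = empty_transaction_statements("375942c7-0723-11e6-b55c-0050569f3621:16631-16633")
for line in stmts:
    print(line)
```

For the case 2 input this prints six lines: STOP SLAVE, three empty transactions for 16631 through 16633, the reset of gtid_next to AUTOMATIC, and START SLAVE.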