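The developers reported that SQL statements on one of our databases were timing out constantly; the system was almost paralyzed and no operation could complete. I logged in to the database and first looked at the current threads, and found a large number of them in the state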
Waiting for table flush
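Checking the current transactions showed one that started executing yesterday and had still not finished this morning. I have not yet dug into the root cause. I first killed that thread, after which the backup's FLUSH TABLES could succeed, and once the backup finished, the whole series of SQL statements blocked behind it ran normally. The transaction in question: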
mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 192611452
                 trx_state: RUNNING
               trx_started: 2017-11-30 18:33:58
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 3688
       trx_mysql_thread_id: 352932171
                 trx_query: DELETE FROM xx WHERE xx IN(SELECT xx FROM xx WHERE Remarks LIKE xx)
       trx_operation_state: unlock_row
         trx_tables_in_use: 2
         trx_tables_locked: 2
          trx_lock_structs: 3688
     trx_lock_memory_bytes: 368848
           trx_rows_locked: 4
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: READ COMMITTED
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
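A transaction like the one above can also be picked out programmatically. The following is only a minimal sketch: it assumes the rows of information_schema.innodb_trx have already been fetched as dictionaries, and the function name long_running_trx and the one-hour threshold are illustrative choices, not part of any MySQL tool.

```python
from datetime import datetime, timedelta


def long_running_trx(rows, now, threshold=timedelta(hours=1)):
    """Return (thread_id, age, query) for transactions older than threshold.

    rows: iterable of dicts holding the innodb_trx columns trx_started,
          trx_mysql_thread_id and trx_query (assumed input shape).
    """
    result = []
    for row in rows:
        started = row["trx_started"]
        if isinstance(started, str):
            started = datetime.strptime(started, "%Y-%m-%d %H:%M:%S")
        age = now - started
        if age >= threshold:
            result.append((row["trx_mysql_thread_id"], age, row["trx_query"]))
    # Oldest transaction first.
    result.sort(key=lambda item: item[1], reverse=True)
    return result


# Values copied from the innodb_trx output above; "now" is an assumed
# observation time on the morning of the incident.
rows = [{
    "trx_started": "2017-11-30 18:33:58",
    "trx_mysql_thread_id": 352932171,
    "trx_query": "DELETE FROM xx WHERE xx IN(SELECT xx FROM xx WHERE Remarks LIKE xx)",
}]
now = datetime(2017, 12, 1, 9, 36, 0)

for thread_id, age, query in long_running_trx(rows, now):
    # Print a KILL statement for a human to review, rather than running it.
    print(f"KILL {thread_id};  -- running for {age}: {query}")
```

The sketch only prints a candidate KILL statement; whether killing the thread is safe still has to be decided case by case.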
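Then it occurred to me that a physical backup runs every day at 2 a.m., so I checked the backup log and found that, sure enough, the transaction above had blocked the physical backup.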
The overall flow of the physical backup
First, the current redo log sequence number is recorded:
171201 02:00:02 >> log scanned up to (54138135415)
xtrabackup: Generating a list of tablespaces
xtrabackup: using the full scan for incremental backup
xtrabackup: Starting 4 threads for parallel data files transfer
Then the InnoDB tables are backed up:
171201 02:00:12 [01] Copying .
Once they are copied, the backup flushes the tables. Because this step was blocked, it only succeeded after the transaction had been released:
171201 02:00:17 Executing FLUSH NO_WRITE_TO_BINLOG TABLES...
Next, the non-transactional tables are backed up:
171201 09:36:13 Executing FLUSH TABLES WITH READ LOCK...
171201 09:36:13 >> log scanned up to (54147795188)
171201 09:36:14 Starting to backup non-InnoDB tables and files
171201 09:36:14 [01] Copying ....
xtrabackup: The latest check point (for incremental): '54138858140'
xtrabackup: Stopping log copying thread.
.171201 09:36:14 >> log scanned up to (54147795198)
171201 09:36:14 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
When that is done, the table lock is released:
171201 09:36:14 Executing UNLOCK TABLES
171201 09:36:14 All tables unlocked
171201 09:36:14 [00] Copying ib_buffer_pool to xxx
171201 09:36:14 [00] ...done
171201 09:36:14 Backup created in directory xxxx
MySQL binlog position: xxx
171201 09:36:14 [00] Writing backup-my.cnf
171201 09:36:14 [00] ...done
171201 09:36:14 [00] Writing xtrabackup_info
171201 09:36:14 [00] ...done
xtrabackup: Transaction log of lsn (54138129801) to (54147795198) was copied.
171201 09:36:15 completed OK!
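The stall is easy to miss when reading the log by eye, because the timestamps jump from 02:00:17 straight to 09:36:13. As a rough sketch, the longest pause between two timestamped log lines can be found like this; the helper name largest_gap is made up for illustration, and the regular expression assumes the "yymmdd HH:MM:SS" prefix seen in this particular log.

```python
import re
from datetime import datetime

# Matches the "yymmdd HH:MM:SS" prefix used in the log above; a stray
# leading dot (progress output) is tolerated.
TS = re.compile(r"^\.?(\d{6} \d{2}:\d{2}:\d{2}) (.*)$")


def largest_gap(log_lines):
    """Return (gap, message_before, message_after) for the longest pause
    between consecutive timestamped lines, or None if fewer than two."""
    stamped = []
    for line in log_lines:
        match = TS.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%y%m%d %H:%M:%S")
            stamped.append((when, match.group(2)))
    best = None
    for (t0, msg0), (t1, msg1) in zip(stamped, stamped[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, msg0, msg1)
    return best


# A few lines taken from the backup log above.
sample = [
    "171201 02:00:02 >> log scanned up to (54138135415)",
    "xtrabackup: Generating a list of tablespaces",
    "171201 02:00:12 [01] Copying .",
    "171201 02:00:17 Executing FLUSH NO_WRITE_TO_BINLOG TABLES...",
    "171201 09:36:13 Executing FLUSH TABLES WITH READ LOCK...",
    "171201 09:36:14 Executing UNLOCK TABLES",
]

gap, before, after = largest_gap(sample)
print(f"longest pause: {gap}")
print(f"  stuck after: {before}")
print(f"  resumed at:  {after}")
```

On these lines the longest pause sits between the FLUSH NO_WRITE_TO_BINLOG TABLES line and the FLUSH TABLES WITH READ LOCK line, which matches the diagnosis in this post.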
The statement that was blocked is FLUSH NO_WRITE_TO_BINLOG TABLES...
The official documentation describes FLUSH TABLES as follows:
Closes all open tables, forces all tables in use to be closed, and flushes the query cache and prepared statement cache.
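Nothing in that description mentions locks. Testing shows, however, that if FLUSH TABLES is executed from another session while a query or a change is still running, the FLUSH TABLES is blocked until that statement finishes. From then on, any SQL that touches a table used by the slow statement is blocked as well.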
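Given this behaviour, the likely culprit is a statement that was already running before the first thread started waiting. The sketch below applies that idea to the rows of SHOW PROCESSLIST, assumed to be available as dictionaries keyed by the usual column names (Id, Command, Time, State, Info). The function name triage and the sample rows are hypothetical, and the rule is only a heuristic for narrowing down suspects, not a definitive diagnosis.

```python
WAIT_STATE = "Waiting for table flush"
# Command values that are not ordinary client statements.
IGNORED_COMMANDS = {"Sleep", "Daemon", "Binlog Dump"}


def triage(processlist):
    """Split processlist rows into (waiting, suspects).

    waiting:  threads in the "Waiting for table flush" state.
    suspects: active threads that have been running longer than the
              longest waiter, oldest first (heuristic only).
    """
    waiting = [p for p in processlist if p["State"] == WAIT_STATE]
    if not waiting:
        return [], []
    longest_wait = max(p["Time"] for p in waiting)
    suspects = [
        p for p in processlist
        if p["Command"] not in IGNORED_COMMANDS
        and p["State"] != WAIT_STATE
        and p["Time"] > longest_wait
    ]
    suspects.sort(key=lambda p: p["Time"], reverse=True)
    return waiting, suspects


# Hypothetical rows for illustration; only thread id 352932171 and the
# DELETE text come from the incident described in this post.
processlist = [
    {"Id": 352932171, "Command": "Query", "Time": 54000, "State": "Sending data",
     "Info": "DELETE FROM xx WHERE xx IN(SELECT xx FROM xx WHERE Remarks LIKE xx)"},
    {"Id": 1001, "Command": "Query", "Time": 27000, "State": WAIT_STATE,
     "Info": "FLUSH NO_WRITE_TO_BINLOG TABLES"},
    {"Id": 1002, "Command": "Query", "Time": 120, "State": WAIT_STATE,
     "Info": "SELECT xx FROM xx"},
    {"Id": 1003, "Command": "Sleep", "Time": 90000, "State": "", "Info": None},
]

waiting, suspects = triage(processlist)
print(f"{len(waiting)} thread(s) waiting for table flush")
for p in suspects:
    print(f"suspect thread {p['Id']} (running {p['Time']}s): {p['Info']}")
```

Sleeping connections are skipped because an idle connection is not executing a statement against the table; anything the heuristic reports should still be confirmed against information_schema.innodb_trx before a thread is killed.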