http://book.2cto.com/201305/23345.html
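Doublewrite consists of two parts. One is the in-memory doublewrite buffer, 2 MB in size; the other is 128 contiguous pages in the shared tablespace on disk, that is, 2 extents, also 2 MB in size. When dirty pages in the buffer pool are flushed, they are not written straight to their data files. The pages are first copied into the in-memory doublewrite buffer with memcpy. The doublewrite buffer is then written sequentially to the shared tablespace on disk in two writes of 1 MB each, and fsync is called immediately afterwards to force the data to disk and avoid the problems of buffered writes. Because the doublewrite pages are contiguous, this step is a sequential write and its cost is modest. Once the doublewrite pages have been written, the pages in the doublewrite buffer are written to their individual tablespace files; these writes are scattered. Doublewrite activity can be observed with the following command: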
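The sizes quoted above can be checked with a little arithmetic. The sketch below assumes a 16 KB page and 64 pages per extent (the usual InnoDB defaults, not stated in the text); the constant and function names are made up for illustration.

```c
#include <assert.h>

/* Assumed values: 16 KB pages and 64 pages per extent. */
enum {
    PAGE_SIZE_BYTES  = 16 * 1024,
    PAGES_PER_EXTENT = 64,
    DBLWR_EXTENTS    = 2
};

/* Number of pages in the on-disk doublewrite area. */
unsigned long dblwr_pages(void)
{
    return (unsigned long) DBLWR_EXTENTS * PAGES_PER_EXTENT;
}

/* Total size of the doublewrite area in bytes. */
unsigned long dblwr_bytes(void)
{
    return dblwr_pages() * PAGE_SIZE_BYTES;
}

/* Size in bytes of each of the two sequential writes (one per extent). */
unsigned long dblwr_chunk_bytes(void)
{
    unsigned long chunk = dblwr_bytes() / DBLWR_EXTENTS;

    assert(chunk * DBLWR_EXTENTS == dblwr_bytes());
    return chunk;
}
```

Under these assumptions dblwr_pages() gives the 128 pages, dblwr_bytes() the 2 MB, and dblwr_chunk_bytes() the 1 MB per write mentioned in the text.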
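How does doublewrite work?
You can think of doublewrite as a short-term log file allocated inside the InnoDB tablespace, with space for about 100 pages (128 in the layout described above). When InnoDB flushes pages from the buffer pool it writes several pages at a time, so the pages are first written sequentially to the doublewrite buffer and fsync() is called to make sure they have reached the disk. Only then are the pages written to their real locations, followed by a second fsync(). During crash recovery InnoDB checks both the doublewrite buffer and the pages' original locations: a page that is inconsistent in the doublewrite buffer is simply discarded, and a page that is inconsistent in its original location is restored from the doublewrite buffer.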
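The write order and the recovery rule described above can be sketched as a small in-memory simulation. This is not InnoDB code: the page layout, the toy checksum, the two arrays standing in for disk areas, and every function name are assumptions made for illustration, and the fsync() calls are only marked in comments.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 16   /* tiny pages keep the sketch readable */
#define N_PAGES   4

/* A page stores a checksum of its payload so a torn page can be detected. */
typedef struct {
    unsigned char data[PAGE_SIZE];
    unsigned      checksum;
} page_t;

/* Toy checksum for the sketch; InnoDB uses its own page checksums. */
unsigned page_checksum(const page_t *p)
{
    unsigned sum = 0;
    for (int i = 0; i < PAGE_SIZE; i++) {
        sum = sum * 31 + p->data[i];
    }
    return sum;
}

int page_is_consistent(const page_t *p)
{
    return p->checksum == page_checksum(p);
}

/* Simulated disk: the doublewrite area and the pages' home locations. */
page_t dblwr_area[N_PAGES];
page_t data_file[N_PAGES];

/* Step 1: one sequential write of the batch into the doublewrite area.
   A real implementation would call fsync() before going further. */
void write_to_dblwr(const page_t *batch, int n)
{
    memcpy(dblwr_area, batch, (size_t) n * sizeof(page_t));
}

/* Step 2: write each page to its home location. torn_at >= 0 simulates
   a crash that leaves that page half written and ends the batch. */
void write_to_data_file(const page_t *batch, int n, int torn_at)
{
    for (int i = 0; i < n; i++) {
        if (i == torn_at) {
            memcpy(data_file[i].data, batch[i].data, PAGE_SIZE / 2);
            return;   /* crash: the rest never reaches the disk */
        }
        data_file[i] = batch[i];
    }
}

/* Recovery: skip doublewrite copies that are themselves torn, and restore
   home pages that are inconsistent. Returns the number of pages restored. */
int recover_from_dblwr(int n)
{
    int restored = 0;
    for (int i = 0; i < n; i++) {
        if (!page_is_consistent(&dblwr_area[i])) {
            continue;   /* torn doublewrite copy: discard it */
        }
        if (!page_is_consistent(&data_file[i])) {
            data_file[i] = dblwr_area[i];
            restored++;
        }
    }
    return restored;
}
```

Pages that were never reached before the simulated crash stay old but consistent, so this sketch leaves them alone; only the torn page is repaired from its doublewrite copy.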
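How does the doublewrite buffer affect MySQL?
Although doublewrite requires every page to be written twice, the performance overhead is far less than a factor of two. Writes to the doublewrite buffer are sequential, so they are cheap. Doublewrite also lets InnoDB issue fewer fsync() calls: instead of calling fsync() after every page write, it can submit many writes and call fsync() once at the end, which lets the operating system reorder the writes and use multiple storage devices in parallel. These optimizations could also be used without doublewrite; in practice they were implemented together with it. Overall, I expect the performance overhead of doublewrite to be no more than 5% to 10%.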
The master thread does three kinds of work: once-per-second operations, once-per-ten-seconds operations, and background operations.
/* Number of IO operations per second the server can do */
extern ulong srv_io_capacity;

/* Returns the number of IO operations that is X percent of the
capacity. PCT_IO(5) -> returns the number of IO operations that
is 5% of the max where max is srv_io_capacity. */
#define PCT_IO(p) ((ulong) (srv_io_capacity * ((double) p / 100.0)))
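A quick worked example of PCT_IO may help when reading the thresholds that follow. The sketch below re-declares the macro so it compiles on its own; the value 200 for srv_io_capacity is an assumption chosen for illustration, and io_budget is a hypothetical helper.

```c
#include <assert.h>

/* Stand-in for InnoDB's srv_io_capacity; 200 is an assumed value. */
unsigned long srv_io_capacity = 200;

/* Same arithmetic as the macro above, with the argument parenthesized. */
#define PCT_IO(p) ((unsigned long) (srv_io_capacity * ((double) (p) / 100.0)))

/* Hypothetical helper: the I/O budget for a given percentage. */
unsigned long io_budget(unsigned pct)
{
    assert(pct <= 1000);   /* sanity bound for the sketch only */
    return PCT_IO(pct);
}
```

With srv_io_capacity at 200, io_budget(5) is 10, io_budget(10) is 20, io_budget(100) is 200 and io_budget(200) is 400, so every page count in the master thread scales with the configured capacity.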
Once-per-second operations
1) Flush the log buffer to disk
/* Flush logs if needed */
srv_sync_log_buffer_in_background();
2) Merge the insert buffer
If disk I/O during the previous second was below 5% of innodb_io_capacity and few I/Os are pending, merge insert buffer entries for 5% of innodb_io_capacity pages.
#define SRV_RECENT_IO_ACTIVITY (PCT_IO(5))

/* If i/os during one second sleep were less than 5% of
capacity, we assume that there is free disk i/o capacity
available, and it makes sense to do an insert buffer merge. */
if (n_pend_ios < SRV_PEND_IO_THRESHOLD
    && (n_ios - n_ios_old < SRV_RECENT_IO_ACTIVITY)) {
    srv_main_thread_op_info = "doing insert buffer merge";
    ibuf_contract_for_n_pages(FALSE, PCT_IO(5));
}
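3) Flush dirty pages in the buffer pool to disk
If the ratio of dirty pages in the buffer pool exceeds 75%, flush innodb_io_capacity dirty pages to disk.
Otherwise, if adaptive flushing is enabled, the number of dirty pages to flush is derived from the rate at which redo log is being generated.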
srv_max_buf_pool_modified_pct: 75
buf_get_modified_ratio_pct(): ratio of dirty pages in the buffer pool

if (UNIV_UNLIKELY(buf_get_modified_ratio_pct()
                  > srv_max_buf_pool_modified_pct)) {
    n_pages_flushed = buf_flush_list(PCT_IO(100), IB_ULONGLONG_MAX);
} else if (srv_adaptive_flushing) {
    /* Derive the number of dirty pages to flush from the
    redo log generation rate */
    ulint n_flush = buf_flush_get_desired_flush_rate();

    if (n_flush) {
        n_flush = ut_min(PCT_IO(100), n_flush);
        n_pages_flushed = buf_flush_list(n_flush, IB_ULONGLONG_MAX);
    }
}
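The once-per-second decisions can be condensed into two pure functions that are easy to experiment with. This is a sketch, not the InnoDB implementation: io_capacity = 200 is an assumed setting, the pending-I/O threshold is passed in by the caller because its value is not given in the text, and all names are hypothetical.

```c
#include <assert.h>

/* Assumed innodb_io_capacity for the sketch. */
unsigned long io_capacity = 200;

/* Same arithmetic as PCT_IO in the source. */
unsigned long pct_io(unsigned p)
{
    return (unsigned long) (io_capacity * (p / 100.0));
}

/* Pages of insert buffer work to merge this second, or 0 to skip. */
unsigned long one_second_ibuf_merge_pages(unsigned long n_pend_ios,
                                          unsigned long pend_io_threshold,
                                          unsigned long ios_last_second)
{
    if (n_pend_ios < pend_io_threshold && ios_last_second < pct_io(5)) {
        return pct_io(5);
    }
    return 0;
}

/* Dirty pages to flush this second, or 0 to skip.
   desired_rate stands in for buf_flush_get_desired_flush_rate(). */
unsigned long one_second_flush_pages(unsigned dirty_pct,
                                     unsigned max_dirty_pct,
                                     int adaptive_flushing,
                                     unsigned long desired_rate)
{
    assert(dirty_pct <= 100);

    if (dirty_pct > max_dirty_pct) {
        return pct_io(100);              /* too dirty: full budget */
    }
    if (adaptive_flushing && desired_rate > 0) {
        unsigned long cap = pct_io(100); /* never exceed the budget */
        return desired_rate < cap ? desired_rate : cap;
    }
    return 0;
}
```

For example, with 80% dirty pages and a 75% limit the flush size is the full budget of 200 pages, while at 50% dirty with adaptive flushing it follows the desired rate, capped at 200.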
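Once-per-ten-seconds operations
1) If disk I/O over the past 10 seconds was below 200% of innodb_io_capacity, flush innodb_io_capacity dirty pages (PCT_IO(100)) from the buffer pool to disk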
#define SRV_PAST_IO_ACTIVITY (PCT_IO(200))

buf_get_total_stat(&buf_stat);
n_pend_ios = buf_get_n_pending_ios() + log_sys->n_pending_writes;
n_ios = log_sys->n_log_ios + buf_stat.n_pages_read
    + buf_stat.n_pages_written;

srv_main_10_second_loops++;
if (n_pend_ios < SRV_PEND_IO_THRESHOLD
    && (n_ios - n_ios_very_old < SRV_PAST_IO_ACTIVITY)) {
    srv_main_thread_op_info = "flushing buffer pool pages";
    buf_flush_list(PCT_IO(100), IB_ULONGLONG_MAX);

    /* Flush logs if needed */
    srv_sync_log_buffer_in_background();
}
2) Merge insert buffer entries for 5% of innodb_io_capacity pages
srv_main_thread_op_info = "doing insert buffer merge";
ibuf_contract_for_n_pages(FALSE, PCT_IO(5));
3) Flush the log buffer to disk, even for transactions that have not committed
srv_sync_log_buffer_in_background();
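4) If the ratio of dirty pages in the buffer pool exceeds 70%, flush 100% of innodb_io_capacity dirty pages to disk; otherwise flush only 10% of innodb_io_capacity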
srv_main_thread_op_info = "flushing buffer pool pages";

/* Flush a few oldest pages to make a new checkpoint younger */
if (buf_get_modified_ratio_pct() > 70) {
    /* If there are lots of modified pages in the buffer pool
    (> 70 %), we assume we can afford reserving the disk(s) for
    the time it requires to flush 100 pages */
    n_pages_flushed = buf_flush_list(PCT_IO(100), IB_ULONGLONG_MAX);
} else {
    /* Otherwise, we only flush a small number of pages so that
    we do not unnecessarily use much disk i/o capacity from
    other work */
    n_pages_flushed = buf_flush_list(PCT_IO(10), IB_ULONGLONG_MAX);
}
5) Create a new checkpoint
srv_main_thread_op_info = "making checkpoint";

/* Make a new checkpoint about once in 10 seconds */
log_checkpoint(TRUE, FALSE);
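The two threshold checks in the ten-second loop can be sketched the same way. Again this is an illustration under assumptions rather than InnoDB code: io_capacity = 200 is an assumed setting, the pending-I/O threshold is supplied by the caller, and the function names are hypothetical.

```c
#include <assert.h>

/* Assumed innodb_io_capacity for the sketch. */
unsigned long io_capacity = 200;

/* Same arithmetic as PCT_IO in the source. */
unsigned long pct_io(unsigned p)
{
    return (unsigned long) (io_capacity * (p / 100.0));
}

/* Step 1: was the disk quiet enough over the last 10 seconds to justify
   an extra flush of pct_io(100) pages? Returns 1 for yes, 0 for no. */
int ten_second_io_is_idle(unsigned long n_pend_ios,
                          unsigned long pend_io_threshold,
                          unsigned long ios_last_10s)
{
    return n_pend_ios < pend_io_threshold && ios_last_10s < pct_io(200);
}

/* Step 4: number of dirty pages flushed before the checkpoint. */
unsigned long ten_second_flush_pages(unsigned dirty_pct)
{
    assert(dirty_pct <= 100);
    return dirty_pct > 70 ? pct_io(100) : pct_io(10);
}
```

With the assumed capacity of 200, a buffer pool that is 71% dirty flushes 200 pages in step 4, while one at exactly 70% flushes 20, because the comparison is strict.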
Background operations
1) Purge the undo log
srv_master_do_purge();
2) Merge innodb_io_capacity insert buffer pages (this description is not entirely precise)
n_bytes_merged = ibuf_contract_for_n_pages(FALSE, PCT_IO(100));