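memcached uses an LRU (least recently used) algorithm to evict stale objects.
Simply put, each slab class is itself a doubly linked list. Hot data sits at the head of the list and cold data at the tail.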
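- When an object is created, it is placed at the head of the list.
- When there is not enough memory to allocate a new object, cold data at the tail is evicted.
- Updating an existing object refreshes its time attribute. Reading an object does not refresh the time attribute!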
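The three rules above can be sketched as a tiny doubly linked list. This is only an illustration of the idea, not memcached's actual code: the `node` and `lru_list` types, the `capacity` field, and the function names are all made up for this example, and a real slab class tracks memory rather than a fixed item count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified item: list links, a key and a timestamp. */
typedef struct node {
    struct node *prev, *next;
    unsigned int time;   /* last create/update time (illustrative) */
    int key;
} node;

/* Hypothetical list: head is the hottest item, tail the coldest. */
typedef struct {
    node *head;
    node *tail;
    size_t count;
    size_t capacity;     /* stands in for "memory available" */
} lru_list;

/* Detach n from the list, fixing up head/tail as needed. */
static void lru_unlink(lru_list *l, node *n) {
    if (n->prev) n->prev->next = n->next; else l->head = n->next;
    if (n->next) n->next->prev = n->prev; else l->tail = n->prev;
    n->prev = n->next = NULL;
    l->count--;
}

/* Put n at the head (hot end) of the list. */
static void lru_link_head(lru_list *l, node *n) {
    n->prev = NULL;
    n->next = l->head;
    if (l->head) l->head->prev = n; else l->tail = n;
    l->head = n;
    l->count++;
}

/* Create: if the list is full, evict the tail first, then link the new
 * item at the head. Returns the evicted node, or NULL if none. */
static node *lru_insert(lru_list *l, node *n, unsigned int now) {
    node *evicted = NULL;
    assert(l->capacity > 0);
    if (l->count >= l->capacity) {
        evicted = l->tail;
        lru_unlink(l, evicted);
    }
    n->time = now;
    lru_link_head(l, n);
    return evicted;
}

/* Update: refresh the timestamp and move the item back to the head. */
static void lru_touch(lru_list *l, node *n, unsigned int now) {
    n->time = now;
    lru_unlink(l, n);
    lru_link_head(l, n);
}
```

With a capacity of 2, inserting `a` and `b`, touching `a`, and then inserting `c` evicts `b`, because `b` is the item left at the tail.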
mc 1.4.23, released on 2015-4-19, reworked mc's LRU (release notes: https://code.google.com/p/memcached/wiki/ReleaseNotes1423). The official description reads:
This release is a reworking of memcached's core LRU algorithm. global cache_lock is gone, LRU's are now independently locked. LRU's are now split between HOT, WARM, and COLD LRU's. New items enter the HOT LRU. LRU updates only happen as items reach the bottom of an LRU. If active in HOT, stay in HOT, if active in WARM, stay in WARM. If active in COLD, move to WARM. HOT/WARM each capped at 32% of memory available for that slab class. COLD is uncapped. Items flow from HOT/WARM into COLD.
A background thread exists which shuffles items between/within the LRU's as capacities are reached. The primary goal is to better protect active items from "scanning". items which are never hit again will flow from HOT, through COLD, and out the bottom. Items occasionally active (reaching COLD, but being hit before eviction), move to WARM. There they can stay relatively protected. A secondary goal is to improve latency. The LRU locks are no longer used on item reads, only during sets and from the background thread. Also the background thread is likely to find expired items and release them back to the slab class asynchronously, which speeds up new allocations. Further work on the thread should improve this.
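With this strategy, I think the biggest difference is that an active item in the cold region is only promoted to the warm region, rather than jumping straight to the hot head of the list as I had previously understood. When the doubly linked list is large and LRU has to be triggered frequently, this can noticeably improve LRU performance.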
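To make the HOT/WARM/COLD flow easier to follow, here is a small decision function for the item sitting at the tail of one of the three lists. It is a sketch under my own assumptions: the enum names and `decide_tail` are invented for this post, the order of the checks follows the `lru_pull_tail` source quoted later in this post, and the real code additionally requires the maintainer thread to be enabled before it promotes from COLD to WARM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three LRU segments. */
typedef enum { SEG_HOT, SEG_WARM, SEG_COLD } segment;

/* Hypothetical outcomes for the item at the tail of a segment. */
typedef enum {
    ACT_LEAVE,    /* nothing to do, stop looking */
    ACT_BUMP,     /* active: relink within the same segment */
    ACT_TO_COLD,  /* HOT/WARM over its size limit: demote */
    ACT_TO_WARM,  /* active in COLD: promote to WARM, not HOT */
    ACT_EVICT     /* COLD tail and memory is needed: evict */
} tail_action;

/* Decide what happens to the tail item of `seg`.
 * active:      the item was hit since it was last examined
 * over_limit:  the segment holds more than its share (HOT/WARM only)
 * need_memory: the caller is allocating and asks for an eviction */
static tail_action decide_tail(segment seg, bool active,
                               bool over_limit, bool need_memory) {
    switch (seg) {
    case SEG_HOT:
    case SEG_WARM:
        if (over_limit) return ACT_TO_COLD;
        if (active) return ACT_BUMP;
        return ACT_LEAVE;
    case SEG_COLD:
        if (need_memory) return ACT_EVICT;
        if (active) return ACT_TO_WARM;
        return ACT_LEAVE;
    }
    assert(!"unknown segment");
    return ACT_LEAVE;
}
```

For example, `decide_tail(SEG_COLD, true, false, false)` yields `ACT_TO_WARM`: a cold item that was hit again moves to the warm list instead of going back to the hot head.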
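From the code's point of view, LRU is mainly triggered when do_item_alloc allocates a new object. Alternatively, if mc is started with the lru_maintainer option, a background thread runs while mc is up and performs LRU maintenance periodically. Either way, both paths end up calling lru_pull_tail to do the actual work. Here it is with my annotations added: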

    /* Returns number of items removed, expired, or evicted.
     * Callable from worker threads or the LRU maintainer thread
     * The actual LRU algorithm.
     * Depending on cur_lru, only one region (hot, warm or cold) is walked.
     * HOT and WARM items over the size limit are moved to COLD; only COLD
     * really evicts.
     *
     * slab_idx     id of the slab class
     * cur_lru      which LRU: cold, warm or hot
     * total_chunks 0? (used below to compute the HOT/WARM limit)
     * do_evict     whether to evict. Only COLD can evict; the other regions
     *              always pass false. In COLD, true means evict; false means
     *              an active item is moved to the warm region instead.
     * cur_hv       hash value
     * */
    static int lru_pull_tail(const int slab_idx, const int cur_lru,
            const unsigned int total_chunks, const bool do_evict, const uint32_t cur_hv) {
        // scratch variable: records whether we found an item to evict or to
        // move to another region
        item *it = NULL;
        int slabIdx = slab_idx;
        int removed = 0;
        if (slabIdx == 0)
            return 0;

        int tries = 5; // the loop below runs 5 times (crawler items are not counted)
        item *search;
        item *next_it;
        void *hold_lock = NULL;
        unsigned int move_to_lru = 0; // which region to move the item to
        uint64_t limit;

        // slab class id OR'ed with the LRU type; the result is used as the
        // index for the lock (and for tails/sizes/itemstats)
        slabIdx |= cur_lru;
        pthread_mutex_lock(&lru_locks[slabIdx]);
        search = tails[slabIdx]; // the last item of this region
        /* We walk up *only* for locked items, and if bottom is expired. */
        // ---------------------------------------------------------- loop start
        for (; tries > 0 && search != NULL; tries--, search=next_it) {
            /* we might relink search mid-loop, so search->prev isn't reliable */
            next_it = search->prev;
            // is this item a crawler?
            if (search->nbytes == 0 && search->nkey == 0 && search->it_flags == 1) {
                /* We are a crawler, ignore it. */
                tries++;
                continue;
            }
            uint32_t hv = hash(ITEM_key(search), search->nkey);
            /* Attempt to hash item lock the "search" item. If locked, no
             * other callers can incr the refcount. Also skip ourselves. */
            // same item as the caller's?
            if (hv == cur_hv || (hold_lock = item_trylock(hv)) == NULL)
                continue;
            /* Now see if the item is refcount locked */
            if (refcount_incr(&search->refcount) != 2) {
                /* Note pathological case with ref'ed items in tail.
                 * Can still unlink the item, but it won't be reusable yet */
                itemstats[slabIdx].lrutail_reflocked++;
                /* In case of refcount leaks, enable for quick workaround. */
                /* WARNING: This can cause terrible corruption */
                if (settings.tail_repair_time &&
                        search->time + settings.tail_repair_time < current_time) {
                    itemstats[slabIdx].tailrepairs++;
                    search->refcount = 1;
                    /* This will call item_remove -> item_free since refcnt is 1 */
                    do_item_unlink_nolock(search, hv);
                    item_trylock_unlock(hold_lock);
                    continue;
                }
            }

            /* Expired or flushed: delete it */
            if ((search->exptime != 0 && search->exptime < current_time)
                || is_flushed(search)) {
                itemstats[slabIdx].reclaimed++;
                if ((search->it_flags & ITEM_FETCHED) == 0) {
                    itemstats[slabIdx].expired_unfetched++;
                }
                /* refcnt 2 -> 1 */
                do_item_unlink_nolock(search, hv);
                /* refcnt 1 -> 0 -> item_free */
                do_item_remove(search);
                item_trylock_unlock(hold_lock);
                removed++;

                /* If all we're finding are expired, can keep going */
                continue;
            }

            /* If we're HOT_LRU or WARM_LRU and over size limit, send to COLD_LRU.
             * If we're COLD_LRU, send to WARM_LRU unless we need to evict
             * The item is settled; now decide what to do based on the LRU we are in.
             */
            switch (cur_lru) {
                case HOT_LRU:
                    limit = total_chunks * settings.hot_lru_pct / 100;
                    // HOT does not evict. There is no break here, so control
                    // falls through to WARM_LRU, where limit is overwritten
                    // using warm_lru_pct.
                case WARM_LRU:
                    // limit is how much the warm region may hold; above it,
                    // the item is thrown into the cold region
                    limit = total_chunks * settings.warm_lru_pct / 100;
                    if (sizes[slabIdx] > limit) {
                        itemstats[slabIdx].moves_to_cold++;
                        move_to_lru = COLD_LRU;
                        do_item_unlink_q(search);
                        it = search;
                        removed++;
                        break;
                    } else if ((search->it_flags & ITEM_ACTIVE) != 0) {
                        /* Refresh the item's time.
                         * Only allow ACTIVE relinking if we're not too large. */
                        itemstats[slabIdx].moves_within_lru++;
                        search->it_flags &= ~ITEM_ACTIVE;
                        do_item_update_nolock(search);
                        do_item_remove(search);
                        item_trylock_unlock(hold_lock);
                    } else {
                        /* Don't want to move to COLD, not active, bail out */
                        it = search;
                    }
                    break;
                case COLD_LRU:
                    /* In this case we always finish here and leave the loop.
                     * No matter what, we're stopping */
                    it = search;
                    if (do_evict) {
                        /* evict the item */
                        if (settings.evict_to_free == 0) {
                            /* Don't think we need a counter for this. It'll OOM. */
                            break;
                        }
                        itemstats[slabIdx].evicted++;
                        itemstats[slabIdx].evicted_time = current_time - search->time;
                        if (search->exptime != 0)
                            itemstats[slabIdx].evicted_nonzero++;
                        if ((search->it_flags & ITEM_FETCHED) == 0) {
                            itemstats[slabIdx].evicted_unfetched++;
                        }
                        do_item_unlink_nolock(search, hv);
                        removed++;
                    } else if ((search->it_flags & ITEM_ACTIVE) != 0
                            && settings.lru_maintainer_thread) {
                        itemstats[slabIdx].moves_to_warm++;
                        search->it_flags &= ~ITEM_ACTIVE;
                        move_to_lru = WARM_LRU;
                        do_item_unlink_q(search);
                        removed++;
                    }
                    break;
            }
            if (it != NULL)
                break;
        }
        // ------------------------------------------------------------ loop end
        pthread_mutex_unlock(&lru_locks[slabIdx]);

        if (it != NULL) {
            // the search found an item; move it to another region (warm or cold)
            if (move_to_lru) {
                it->slabs_clsid = ITEM_clsid(it);
                it->slabs_clsid |= move_to_lru;
                item_link_q(it);
            }
            do_item_remove(it);
            item_trylock_unlock(hold_lock);
        }

        return removed;
    }

itemstats only holds statistics and has little to do with the logic itself.
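One detail that is easy to miss in the function is `slabIdx |= cur_lru`: the slab class id and the LRU type are packed into a single array index, and `ITEM_clsid(it)` strips the LRU bits off again before `move_to_lru` is OR'ed in. The sketch below shows that packing in isolation. The constant values (0, 64, 128) and the two-bit mask are what I believe memcached of this era uses, but treat them as assumptions; the helper names `lru_index`, `class_of` and `lru_of` are mine and do not exist in memcached.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: the LRU type lives in the top two bits of a byte. */
#define HOT_LRU  0
#define WARM_LRU 64
#define COLD_LRU 128
#define LRU_MASK (3 << 6)

/* Combine a slab class id (low six bits) with an LRU type. */
static uint8_t lru_index(uint8_t class_id, uint8_t lru_type) {
    assert((class_id & LRU_MASK) == 0);  /* class id must fit in six bits */
    return (uint8_t)(class_id | lru_type);
}

/* Recover the plain slab class id, as ITEM_clsid does in the source. */
static uint8_t class_of(uint8_t idx) {
    return (uint8_t)(idx & ~LRU_MASK);
}

/* Recover the LRU type from a combined index. */
static uint8_t lru_of(uint8_t idx) {
    return (uint8_t)(idx & LRU_MASK);
}
```

Under these assumed values, slab class 5 in the cold region becomes index 133 (5 | 128), and moving an item from cold to warm amounts to `lru_index(class_of(133), WARM_LRU)`, which is 69.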