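As the database keeps running, memtables are gradually converted into sstables.
The number of sstables keeps growing, so they have to be merged; this is compaction.
The code calls MaybeScheduleCompaction() from several places, for example when a key is looked up and when the DB is written to,
to check whether a compaction is needed.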
void DBImpl::MaybeScheduleCompaction() {
  mutex_.AssertHeld();
  if (bg_compaction_scheduled_) {
    // Already scheduled
  } else if (shutting_down_.Acquire_Load()) {
    // DB is being deleted; no more background compactions
  } else if (imm_ == NULL &&
             manual_compaction_ == NULL &&
             !versions_->NeedsCompaction()) {
    // No work to be done
  } else {
    bg_compaction_scheduled_ = true;
    env_->Schedule(&DBImpl::BGWork, this);
  }
}

void DBImpl::BGWork(void* db) {
  reinterpret_cast<DBImpl*>(db)->BackgroundCall();
}

void DBImpl::BackgroundCall() {
  MutexLock l(&mutex_);
  assert(bg_compaction_scheduled_);
  if (!shutting_down_.Acquire_Load()) {
    BackgroundCompaction();
  }
  bg_compaction_scheduled_ = false;

  // Previous compaction may have produced too many files in a level,
  // so reschedule another compaction if needed.
  MaybeScheduleCompaction();
  bg_cv_.SignalAll();
}
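The scheduling pattern above (a single "already scheduled" flag, plus a re-check after each background run) can be sketched in isolation. The following is a simplified, single-threaded model and not LevelDB code: MiniDB, pending_work_, RunBackgroundQueue() and the job queue are hypothetical names standing in for DBImpl, the real "is there work" checks, and the Env background thread. Locking is left out on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>

// Hypothetical, single-threaded stand-in for DBImpl's scheduling logic.
class MiniDB {
 public:
  explicit MiniDB(int pending_work) : pending_work_(pending_work) {}

  // Mirrors the shape of MaybeScheduleCompaction(): at most one
  // background job is queued at any time.
  void MaybeScheduleCompaction() {
    if (bg_compaction_scheduled_) {
      // Already scheduled
    } else if (shutting_down_) {
      // DB is going away; no more background work
    } else if (pending_work_ == 0) {
      // No work to be done
    } else {
      bg_compaction_scheduled_ = true;
      queue_.push_back([this] { BackgroundCall(); });
    }
  }

  // Stand-in for the Env background thread: run jobs until none are left.
  void RunBackgroundQueue() {
    while (!queue_.empty()) {
      std::function<void()> job = queue_.front();  // copy before popping
      queue_.pop_front();
      job();
    }
  }

  void Shutdown() { shutting_down_ = true; }

  std::size_t queued_jobs() const { return queue_.size(); }
  int compactions_run() const { return compactions_run_; }
  int pending_work() const { return pending_work_; }
  bool scheduled() const { return bg_compaction_scheduled_; }

 private:
  // Mirrors the shape of BackgroundCall(): do one unit of work, clear the
  // flag, then check again because that work may have created more.
  void BackgroundCall() {
    assert(bg_compaction_scheduled_);
    if (!shutting_down_) {
      --pending_work_;  // one (pretend) compaction
      ++compactions_run_;
    }
    bg_compaction_scheduled_ = false;
    MaybeScheduleCompaction();
  }

  int pending_work_;
  int compactions_run_ = 0;
  bool bg_compaction_scheduled_ = false;
  bool shutting_down_ = false;
  std::deque<std::function<void()>> queue_;
};
```

In this model, calling MaybeScheduleCompaction() twice in a row queues only one job, and draining the queue keeps rescheduling until pending_work_ reaches zero.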
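The function that actually performs the compaction is void DBImpl::BackgroundCompaction().
1. When a compaction is triggered manually, a member variable of class DBImpl is filled in: ManualCompaction* manual_compaction_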
struct ManualCompaction {
  int level;
  bool done;
  const InternalKey* begin;   // NULL means beginning of key range
  const InternalKey* end;     // NULL means end of key range
  InternalKey tmp_storage;    // Used to keep track of compaction progress
};
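If the key range specified by the caller is too large, a single compaction would have to process too many sstable files. To avoid this, a manual compaction may not finish in one pass: done records whether the whole range has been completed, and tmp_storage holds the end key reached by the previous pass, which becomes the start key of the next pass.
The specified begin and end keys are handed to versions_ so that the compaction can be carried out afterwards: versions_->CompactRange(m->level, m->begin, m->end);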
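The done / tmp_storage bookkeeping described above can be illustrated with a small model. This is a minimal sketch under stated assumptions, not LevelDB's implementation: keys are plain integers, the files of the level are sorted and do not overlap, and the per-pass limit is a file count (max_files). FileRange, ManualRequest and CompactOneRound are hypothetical names; ManualRequest::begin plays the role that begin together with tmp_storage plays in the real struct.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical file descriptor with integer keys (inclusive bounds).
struct FileRange {
  int smallest;
  int largest;
};

// Hypothetical, simplified counterpart of ManualCompaction.
struct ManualRequest {
  int begin;  // start of the range that still has to be compacted
  int end;    // end of the requested range
  bool done;  // true once the whole range has been processed
};

// Runs one pass: picks at most max_files files overlapping [begin, end].
// If files are left over, the request is advanced so that the next pass
// resumes right after the last key handled in this pass.
std::vector<FileRange> CompactOneRound(const std::vector<FileRange>& files,
                                       ManualRequest* m,
                                       std::size_t max_files) {
  assert(max_files > 0);
  std::vector<FileRange> picked;
  bool truncated = false;
  for (const FileRange& f : files) {
    if (f.largest < m->begin || f.smallest > m->end) {
      continue;  // outside the remaining range
    }
    if (picked.size() == max_files) {
      truncated = true;  // more files remain for a later pass
      break;
    }
    picked.push_back(f);
  }
  if (truncated) {
    m->begin = picked.back().largest + 1;  // next pass starts here
  } else {
    m->done = true;
  }
  return picked;
}
```

A caller would loop on CompactOneRound until m.done is set, which corresponds to the background thread picking up the same manual compaction again on its next run.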
2. Otherwise, versions_->PickCompaction() chooses the level and key range to compact.
Compaction* VersionSet::PickCompaction() {
  Compaction* c;
  int level;

  // We prefer compactions triggered by too much data in a level over
  // the compactions triggered by seeks.
  const bool size_compaction = (current_->compaction_score_ >= 1);
  const bool seek_compaction = (current_->file_to_compact_ != NULL);
  if (size_compaction) {
    level = current_->compaction_level_;
    assert(level >= 0);
    assert(level+1 < config::kNumLevels);
    c = new Compaction(level);

    // Pick the first file that comes after compact_pointer_[level]
    for (size_t i = 0; i < current_->files_[level].size(); i++) {
      FileMetaData* f = current_->files_[level][i];
      if (compact_pointer_[level].empty() ||
          icmp_.Compare(f->largest.Encode(), compact_pointer_[level]) > 0) {
        c->inputs_[0].push_back(f);
        break;
      }
    }
    if (c->inputs_[0].empty()) {
      // Wrap-around to the beginning of the key space
      c->inputs_[0].push_back(current_->files_[level][0]);
    }
  } else if (seek_compaction) {
    level = current_->file_to_compact_level_;
    c = new Compaction(level);
    c->inputs_[0].push_back(current_->file_to_compact_);
  } else {
    return NULL;
  }

  c->input_version_ = current_;
  c->input_version_->Ref();

  // Files in level 0 may overlap each other, so pick up all overlapping ones
  if (level == 0) {
    InternalKey smallest, largest;
    GetRange(c->inputs_[0], &smallest, &largest);
    // Note that the next call will discard the file we placed in
    // c->inputs_[0] earlier and replace it with an overlapping set
    // which will include the picked file.
    current_->GetOverlappingInputs(0, &smallest, &largest, &c->inputs_[0]);
    assert(!c->inputs_[0].empty());
  }

  SetupOtherInputs(c);

  return c;
}
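PickCompaction decides which files to compact from two triggers: a level holding too much data (compaction_score_ >= 1) and a file that has been seeked too many times (file_to_compact_). The size trigger takes priority.
c = new Compaction(level) records the level to compact, and its inputs_ hold the pointers to the chosen files.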
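Two steps of PickCompaction lend themselves to a small standalone illustration: choosing the first file after the compaction pointer with wrap-around, and collecting every level-0 file that overlaps the chosen range. The sketch below assumes integer keys and uses hypothetical names (FileMeta, PickFileIndex, OverlappingFiles); it does a single pass over the files and does not try to reproduce everything GetOverlappingInputs or SetupOtherInputs do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical file descriptor with integer keys (inclusive bounds).
struct FileMeta {
  int smallest;
  int largest;
};

// Returns the index of the first file whose largest key lies after
// compact_pointer. has_pointer == false models an empty
// compact_pointer_[level]. Falls back to index 0 (wrap-around) when no
// file lies after the pointer.
std::size_t PickFileIndex(const std::vector<FileMeta>& files,
                          bool has_pointer, int compact_pointer) {
  assert(!files.empty());
  for (std::size_t i = 0; i < files.size(); i++) {
    if (!has_pointer || files[i].largest > compact_pointer) {
      return i;
    }
  }
  return 0;  // wrap around to the beginning of the key space
}

// Returns the indices of all files overlapping [smallest, largest].
// This matters for level 0, where files may overlap each other.
std::vector<std::size_t> OverlappingFiles(const std::vector<FileMeta>& files,
                                          int smallest, int largest) {
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < files.size(); i++) {
    const bool before = files[i].largest < smallest;
    const bool after = files[i].smallest > largest;
    if (!before && !after) {
      result.push_back(i);
    }
  }
  return result;
}
```

Advancing the pointer after each compaction and picking the next file in key order spreads compactions across the whole key space of a level instead of repeatedly compacting the same files.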
TODO: the actual compaction work, CompactMemTable() and DoCompactionWork()
References
《leveldb实现解析》 (An Analysis of the LevelDB Implementation), 那岩, Taobao