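I already store a lot of data in SSDB's hash structure, but am I using it correctly? Is a hash the right choice here?
1. SSDB is described as Redis-like, and both have a hash structure, but the naming differs: SSDB uses (name, key, value) where Redis uses (key, field, value). The commands themselves are used in much the same way.
So the question is: are SSDB hashes and Redis hashes the same in practice? In SSDB, neither name nor key may exceed SSDB_KEY_LEN_MAX = 255 bytes; Redis has no such 255-byte limit.
2. An SSDB hash entry is (name, key, value), but the underlying LevelDB, whose in-memory table is a SkipList, stores only (key, value) pairs;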
(the LevelDB key is actually a long concatenation: for an SSDB hash it is built from name + key, so every entry repeats the name and takes up a lot of space).

    std::string dbkey = encode_hash_key(name, key);
    leveldb::Status s = db->Get(leveldb::ReadOptions(), dbkey, val);

    std::string encode_hash_key(const Bytes &name, const Bytes &key){
        std::string buf;
        buf.append(1, DataType::HASH);          // one-byte type tag
        buf.append(1, (uint8_t)name.size());    // one-byte length of name
        buf.append(name.data(), name.size());
        buf.append(1, '=');                     // separator between name and key
        buf.append(key.data(), key.size());
        return buf;
    }
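To get a feel for how much space the concatenated key costs, here is a minimal standalone sketch of the same layout. It uses `std::string` instead of SSDB's `Bytes`, and the tag value `'h'` and the helper `per_field_overhead` are my own assumptions for illustration, not taken from the SSDB source shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed tag byte for hash entries; the real value lives in SSDB's DataType.
static const char kHashTag = 'h';

// Same layout as encode_hash_key above: tag | len(name) | name | '=' | key.
std::string encode_hash_key(const std::string &name, const std::string &key) {
    assert(name.size() <= 255);  // the name length must fit in one byte
    std::string buf;
    buf.append(1, kHashTag);
    buf.append(1, static_cast<char>(static_cast<uint8_t>(name.size())));
    buf.append(name);
    buf.append(1, '=');
    buf.append(key);
    return buf;
}

// Hypothetical helper: bytes of each LevelDB key that are not the field itself.
std::size_t per_field_overhead(const std::string &name) {
    return encode_hash_key(name, "").size();
}
```

With the name `user:1000` and the field `email`, the encoded key is 17 bytes long, of which 12 bytes (tag, length byte, name and separator) are repeated for every field of that hash.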
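3. In SSDB, multi_hget is best avoided because it is inefficient; hscan is the better choice, at least when the fields you need sit next to each other in key order. The code below is multi_hget, and it clearly just calls serv->ssdb->hget in a loop: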

    int proc_multi_hget(NetworkServer *net, Link *link, const Request &req, Response *resp){
        CHECK_NUM_PARAMS(3);
        SSDBServer *serv = (SSDBServer *)net->data;

        resp->push_back("ok");
        Request::const_iterator it=req.begin() + 1;
        const Bytes name = *it;
        it ++;
        // one separate hget (one LevelDB Get) per requested key
        for(; it!=req.end(); it+=1){
            const Bytes &key = *it;
            std::string val;
            int ret = serv->ssdb->hget(name, key, &val);
            if(ret == 1){
                resp->push_back(key.String());
                resp->push_back(val);
            }
        }
        return 0;
    }

Use hscan instead. It is implemented like this:

    HIterator* SSDBImpl::hscan(const Bytes &name, const Bytes &start, const Bytes &end, uint64_t limit){
        std::string key_start, key_end;
        key_start = encode_hash_key(name, start);
        if(!end.empty()){
            key_end = encode_hash_key(name, end);
        }
        return new HIterator(this->iterator(key_start, key_end, limit), name);
    }

    Iterator* SSDBImpl::iterator(const std::string &start, const std::string &end, uint64_t limit){
        leveldb::Iterator *it;
        leveldb::ReadOptions iterate_options;
        iterate_options.fill_cache = false;
        it = db->NewIterator(iterate_options);
        it->Seek(start);                            // a single seek to the start key
        if(it->Valid() && it->key() == start){
            it->Next();                             // the start key itself is excluded
        }
        return new Iterator(it, end, limit);
    }

    // LevelDB's SkipList iterator: Next() just follows the level-0 link
    template<typename Key, class Comparator>
    inline void SkipList<Key,Comparator>::Iterator::Next() {
        assert(Valid());
        node_ = node_->Next(0);
    }
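The difference between the two access patterns can be sketched without SSDB at all. The following is a toy model under stated assumptions: a `std::map` stands in for LevelDB's ordered keyspace, the tag byte `'h'` is assumed, and `Store`, `hash_prefix` and both functions are illustrative names of my own rather than SSDB's API. The range semantics (start excluded, end included, empty end meaning "until the hash runs out") are modelled on the iterator code above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Store = std::map<std::string, std::string>;  // stand-in for the ordered keyspace
using Pairs = std::vector<std::pair<std::string, std::string>>;

// tag | len(name) | name | '='  (tag value 'h' is an assumption)
std::string hash_prefix(const std::string &name) {
    assert(name.size() <= 255);
    std::string buf;
    buf.append(1, 'h');
    buf.append(1, static_cast<char>(static_cast<uint8_t>(name.size())));
    buf.append(name);
    buf.append(1, '=');
    return buf;
}

std::string encode_hash_key(const std::string &name, const std::string &key) {
    return hash_prefix(name) + key;
}

// multi_hget style: one independent lookup per requested field.
Pairs multi_hget(const Store &db, const std::string &name,
                 const std::vector<std::string> &keys) {
    Pairs out;
    for (const std::string &k : keys) {
        auto it = db.find(encode_hash_key(name, k));
        if (it != db.end()) out.emplace_back(k, it->second);
    }
    return out;
}

// hscan style: one seek, then step through neighbours in key order.
Pairs hscan(const Store &db, const std::string &name, const std::string &start,
            const std::string &end, uint64_t limit) {
    const std::string prefix = hash_prefix(name);
    const std::string key_start = prefix + start;
    const std::string key_end = end.empty() ? std::string() : prefix + end;

    auto it = db.lower_bound(key_start);                 // plays the role of Seek(start)
    if (it != db.end() && it->first == key_start) ++it;  // start itself is excluded

    Pairs out;
    for (; it != db.end() && out.size() < limit; ++it) {
        if (it->first.compare(0, prefix.size(), prefix) != 0) break;  // left this hash
        if (!key_end.empty() && it->first > key_end) break;           // went past end
        out.emplace_back(it->first.substr(prefix.size()), it->second);
    }
    return out;
}
```

In this model `multi_hget` pays one tree lookup per field, while `hscan` pays one lookup and then only moves to the next neighbour. That only helps when the wanted fields are contiguous; for a handful of scattered fields the scan would read entries you do not need.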
Looking at how a zset is written, each write actually updates three pieces of data:

- The total number of members in the zset.

      std::string encode_zsize_key(const Bytes &name){
          std::string buf;
          buf.append(1, DataType::ZSIZE);
          buf.append(name.data(), name.size());
          return buf;
      }

- The ranking ordered by score, with key = (name + score + key).

      std::string encode_zscore_key(const Bytes &name, const Bytes &key, const Bytes &score){
          std::string buf;
          buf.append(1, DataType::ZSCORE);
          buf.append(1, (uint8_t)name.size());
          buf.append(name.data(), name.size());

          int64_t s = score.Int64();
          // sign marker: '-' sorts before '=', so negative scores come first
          if(s < 0){
              buf.append(1, '-');
          }else{
              buf.append(1, '=');
          }
          s = encode_score(s);
          buf.append((char *)&s, sizeof(int64_t));
          buf.append(1, '=');
          buf.append(key.data(), key.size());
          return buf;
      }

- A plain key-value entry that maps (name + key) to the score.

      std::string encode_zset_key(const Bytes &name, const Bytes &key){
          std::string buf;
          buf.append(1, DataType::ZSET);
          buf.append(1, (uint8_t)name.size());
          buf.append(name.data(), name.size());
          buf.append(1, (uint8_t)key.size());
          buf.append(key.data(), key.size());
          return buf;
      }

Taking the zset write command as an example, here is how these three pieces of data get updated.

    // returns the number of newly added items
    static int zset_one(SSDBImpl *ssdb, const Bytes &name, const Bytes &key, const Bytes &new_score, char log_type){
        std::string old_score;          // declaration omitted in the excerpt
        int found = ssdb->zget(name, key, &old_score);
        if(found == 0 || old_score != new_score){
            std::string k0, k1, k2;     // declaration omitted in the excerpt

            if(found){
                // delete zscore key
                k1 = encode_zscore_key(name, key, old_score);
                ssdb->binlogs->Delete(k1);
            }

            // add zscore key
            k2 = encode_zscore_key(name, key, new_score);
            ssdb->binlogs->Put(k2, "");

            // update zset
            k0 = encode_zset_key(name, key);
            ssdb->binlogs->Put(k0, new_score);
            ssdb->binlogs->add_log(log_type, BinlogCommand::ZSET, k0);

            return found? 0 : 1;
        }
        return 0;
    }

    int SSDBImpl::zset(const Bytes &name, const Bytes &key, const Bytes &score, char log_type){
        Transaction trans(binlogs);

        int ret = zset_one(this, name, key, score, log_type);
        if(ret >= 0){
            if(ret > 0){
                // a new member was added, so bump the stored size
                if(incr_zsize(this, name, ret) == -1){
                    return -1;
                }
            }
            leveldb::Status s = binlogs->commit();
            if(!s.ok()){
                log_error("zset error: %s", s.ToString().c_str());
                return -1;
            }
        }
        return ret;
    }
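The bookkeeping in zset_one is easier to see in a stripped-down model. This is only a sketch under my own assumptions: `ToyZset` and its three members are hypothetical stand-ins for the three kinds of LevelDB entries, scores are plain `int64_t`, and there is no binlog or transaction.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy stand-in for the three kinds of entries (not SSDB's real encoding).
struct ToyZset {
    std::map<std::string, int64_t> score_of;             // like encode_zset_key   -> score
    std::set<std::pair<int64_t, std::string>> by_score;  // like encode_zscore_key -> ""
    int64_t size = 0;                                    // like encode_zsize_key  -> count
};

// Returns the number of newly added members (0 or 1), like zset_one.
int zset(ToyZset &z, const std::string &key, int64_t new_score) {
    auto it = z.score_of.find(key);
    const bool found = (it != z.score_of.end());
    if (found && it->second == new_score) {
        return 0;  // same score as before: nothing is written
    }
    if (found) {
        z.by_score.erase(std::make_pair(it->second, key));  // drop the old ordered entry
    }
    z.by_score.emplace(new_score, key);  // add the new ordered entry
    z.score_of[key] = new_score;         // update member -> score
    if (!found) {
        ++z.size;                        // only new members change the count
    }
    return found ? 0 : 1;
}
```

Changing the score of an existing member touches two entries (delete the old ordered entry, insert the new one, overwrite the score), and only a brand-new member also touches the size counter, which matches the `ret > 0` check in `SSDBImpl::zset`.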
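It turns out that a query such as "what is this user's rank?" is very inefficient:

    int64_t SSDBImpl::zrrank(const Bytes &name, const Bytes &key){
        ZIterator *it = ziterator(this, name, "", "", "", INT_MAX, Iterator::BACKWARD);
        int64_t ret = 0;    // signed, so that -1 can mean "not found"
        // walk from the highest score down, counting entries until key is found
        while(true){
            if(it->next() == false){
                ret = -1;
                break;
            }
            if(key == it->key){
                break;
            }
            ret ++;
        }
        delete it;
        return ret;
    }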
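To make the cost concrete, here is a small sketch of the same counting loop on an ordinary ordered container. The `ByScore` alias and the `steps` out-parameter are hypothetical additions for illustration only; they are not part of SSDB.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>

// Entries ordered by (score, member), like the zscore keys of one zset.
using ByScore = std::set<std::pair<int64_t, std::string>>;

// Reverse rank (0 = highest score), found by walking from the top like zrrank.
// `steps` reports how many entries had to be visited.
int64_t zrrank(const ByScore &entries, const std::string &key, uint64_t *steps) {
    int64_t rank = 0;
    *steps = 0;
    for (auto it = entries.rbegin(); it != entries.rend(); ++it) {
        ++*steps;
        if (it->second == key) {
            return rank;
        }
        ++rank;
    }
    return -1;  // member not present
}
```

With 1000 members, the top member is found after 1 step, while the lowest-scored member (or a missing one) costs 1000 steps, so the work grows linearly with the size of the zset.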
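Summary: traversing a zset by score range is efficient, and looking up a user's score is fast. Querying a user's rank, however, is slow, because it has to walk the entries in score order one at a time until it reaches that user.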
Reposted from: https://github.com/sunwsh/sunwsh.github.io/wiki/ssdb%E6%BA%90%E7%A0%81%E5%AD%A6%E4%B9%A0--%E7%AC%AC%E4%B8%80%E5%A4%A9%EF%BC%88hash%E7%BB%93%E6%9E%84%EF%BC%89