Implementation of hash functions in Redis:
Redis uses different hash functions for integer keys and for string keys.
For integer keys, Redis uses Thomas Wang's 32 bit Mix Function, implemented as dict.c/dictIntHashFunction:
/* Thomas Wang's 32 bit Mix Function */
unsigned int dictIntHashFunction(unsigned int key)
{
    key += ~(key << 15);
    key ^=  (key >> 10);
    key +=  (key << 3);
    key ^=  (key >> 6);
    key += ~(key << 11);
    key ^=  (key >> 16);
    return key;
}
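I have not yet had time to study why this code works so well, and I will add notes here once I have. In the meantime, here are two links that look good at first glance.
First, the page by Thomas Wang himself: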
http://web.archive.org/web/20071223173210/http://www.concentric.net/~Ttwang/tech/inthash.htm
Second, a summary that someone else wrote based on the link above and other material:
http://blog.csdn.net/jasper_xulei/article/details/18364313
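To make the role of the integer mix function more concrete, here is a minimal sketch of how such a hash might be turned into a bucket index. The names int_hash and bucket_index are hypothetical, and masking with a power-of-two table size is an assumption for illustration, not something taken from the text above. The mixing steps themselves are copied from dictIntHashFunction, only written with uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Same mixing steps as dictIntHashFunction, using a fixed-width type. */
uint32_t int_hash(uint32_t key) {
    key += ~(key << 15);
    key ^=  (key >> 10);
    key +=  (key << 3);
    key ^=  (key >> 6);
    key += ~(key << 11);
    key ^=  (key >> 16);
    return key;
}

/* Hypothetical helper: map a key to a slot in a table whose size is a
 * power of two (assumption), by masking the mixed hash. */
uint32_t bucket_index(uint32_t key, uint32_t table_size) {
    /* The mask trick only works for non-zero powers of two. */
    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
    return int_hash(key) & (table_size - 1);
}
```

Without the mixing step, masking alone would send every key that is a multiple of the table size (for example 16, 32, 48 with a 16-slot table) to slot 0. Running the key through the mix function first lets the high bits of the key influence the low bits that the mask keeps.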
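For string keys, Redis uses the MurmurHash2 algorithm and the djb algorithm:
MurmurHash2 is case-sensitive with respect to the key. In addition, the implementation below reads the input four bytes at a time as a native 32-bit word, so it produces different results on big-endian and little-endian machines.
dict.c/dictGenHashFunction in Redis is a C implementation of MurmurHash2: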
unsigned int dictGenHashFunction(const void *key, int len) {
    /* 'm' and 'r' are mixing constants generated offline.
     They're not really 'magic', they just happen to work well.  */
    uint32_t seed = dict_hash_function_seed;
    const uint32_t m = 0x5bd1e995;
    const int r = 24;

    /* Initialize the hash to a 'random' value */
    uint32_t h = seed ^ len;

    /* Mix 4 bytes at a time into the hash */
    const unsigned char *data = (const unsigned char *)key;

    while(len >= 4) {
        uint32_t k = *(uint32_t*)data;

        k *= m;
        k ^= k >> r;
        k *= m;

        h *= m;
        h ^= k;

        data += 4;
        len -= 4;
    }

    /* Handle the last few bytes of the input array  */
    switch(len) {
    case 3: h ^= data[2] << 16;
    case 2: h ^= data[1] << 8;
    case 1: h ^= data[0]; h *= m;
    };

    /* Do a few final mixes of the hash to ensure the last few
     * bytes are well-incorporated. */
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;

    return (unsigned int)h;
}
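The endianness dependence comes from the line that reads four bytes through a uint32_t pointer. As an illustration only, the sketch below shows one way to remove it: assemble each word from individual bytes in a fixed (little-endian) order. This is not the Redis code. The names load_le32 and murmur2_portable are hypothetical, and the seed is passed as a parameter instead of being read from dict_hash_function_seed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a 32-bit word from 4 bytes in little-endian
 * order, regardless of the byte order or alignment rules of the host. */
static uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* MurmurHash2-style hash with byte-wise loads (illustrative variant). */
uint32_t murmur2_portable(const void *key, int len, uint32_t seed) {
    const uint32_t m = 0x5bd1e995;
    const int r = 24;
    const unsigned char *data = (const unsigned char *)key;
    uint32_t h;

    assert(len >= 0);
    h = seed ^ (uint32_t)len;

    /* Mix 4 bytes at a time into the hash. */
    while (len >= 4) {
        uint32_t k = load_le32(data);
        k *= m;
        k ^= k >> r;
        k *= m;
        h *= m;
        h ^= k;
        data += 4;
        len -= 4;
    }

    /* Handle the remaining 0 to 3 bytes; the cases fall through on purpose. */
    switch (len) {
    case 3: h ^= (uint32_t)data[2] << 16; /* fall through */
    case 2: h ^= (uint32_t)data[1] << 8;  /* fall through */
    case 1: h ^= data[0]; h *= m;
    }

    /* Final mixing so the last bytes are well incorporated. */
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
    return h;
}
```

On a little-endian machine this variant should produce the same values as the pointer-cast version for the same seed. Because it reads one byte at a time, it also does not depend on the input buffer being 4-byte aligned. The hash remains case-sensitive: "abc" and "ABC" give different results.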
Redis also builds a case-insensitive hash function, dict.c/dictGenCaseHashFunction, on top of the djb function:
unsigned int dictGenCaseHashFunction(const unsigned char *buf, int len) {
    unsigned int hash = (unsigned int)dict_hash_function_seed;

    while (len--)
        hash = ((hash << 5) + hash) + (tolower(*buf++)); /* hash * 33 + c */
    return hash;
}
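A case-insensitive hash is only useful when it is paired with a case-insensitive key comparison, because keys that compare equal must also hash to the same value. The sketch below illustrates that pairing. The names djb_case_hash and keys_equal_nocase are hypothetical, and the seed is passed in as a parameter for the sake of a self-contained example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

/* djb-style hash (hash * 33 + c) over the lowercased bytes of the key. */
uint32_t djb_case_hash(const unsigned char *buf, int len, uint32_t seed) {
    uint32_t hash = seed;
    assert(len >= 0);
    while (len--)
        hash = ((hash << 5) + hash) + (uint32_t)tolower(*buf++);
    return hash;
}

/* Hypothetical companion: compare two keys while ignoring letter case.
 * Returns 1 when the keys match and 0 otherwise. */
int keys_equal_nocase(const unsigned char *a, int alen,
                      const unsigned char *b, int blen) {
    if (alen != blen)
        return 0;
    for (int i = 0; i < alen; i++)
        if (tolower(a[i]) != tolower(b[i]))
            return 0;
    return 1;
}
```

With these two functions, "SET" and "set" hash to the same value and compare as equal, so a hash table built on them would treat the two spellings as one key.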
The three hash functions above (dictIntHashFunction, dictGenHashFunction, dictGenCaseHashFunction) are used in different parts of Redis to meet different hashing needs. Later articles will cover them in detail.