
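TCMalloc Source Code Study (Part 2)

Replacing malloc and free in libc

The replacement mechanism differs by platform. On Unix-like systems that use glibc, TCMalloc replaces the allocator through weak aliases. Specifically, glibc defines these entry points as weak symbols, and GCC supports the alias attribute, so the replacement takes this general form: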

    void* malloc(size_t size) __THROW __attribute__ ((alias ("tc_malloc")));

As a result, every call to malloc ends up in the tc_malloc implementation.
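To make the alias mechanism concrete, here is a minimal sketch. The names demo_tc_malloc and demo_malloc are hypothetical stand-ins (the real entry points are tc_malloc and malloc), and the sketch assumes GCC or Clang on an ELF platform such as Linux, where the alias attribute is available.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for tc_malloc: it simply forwards to the system
// allocator so the example stays small.
extern "C" void* demo_tc_malloc(std::size_t size) {
  return std::malloc(size);
}

// demo_malloc has no body of its own. The alias attribute makes it a second
// name for the symbol demo_tc_malloc. The target is named by its symbol
// name as a string and must be defined in the same translation unit.
extern "C" void* demo_malloc(std::size_t size)
    __attribute__((alias("demo_tc_malloc")));
```

Calling demo_malloc(16) here runs the body of demo_tc_malloc, which is the same effect the article describes for malloc and tc_malloc.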

Small allocations: do_malloc_small

Requests of at most kMaxSize (256 KB) are classed as small allocations and are handled by do_malloc_small, which is defined as follows:

    inline void* do_malloc_small(ThreadCache* heap, size_t size) {
      ASSERT(Static::IsInited());
      ASSERT(heap != NULL);
      size_t cl = Static::sizemap()->SizeClass(size);
      size = Static::sizemap()->class_to_size(cl);

      if ((FLAGS_tcmalloc_sample_parameter > 0) && heap->SampleAllocation(size)) {
        return DoSampledAllocation(size);
      } else {
        // The common case, and also the simplest.  This just pops the
        // size-appropriate freelist, after replenishing it if it's empty.
        return CheckedMallocResult(heap->Allocate(size, cl));
      }
    }

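The requested size is rounded up by sizemap to a nearby size class. sizemap manages these mappings, and the mapping from the requested size to the allocated size goes through three maps:

The ClassIndex mapping
The mapping is described in some detail by a comment in the source: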

    // Sizes <= 1024 have an alignment >= 8.  So for such sizes we have an
    // array indexed by ceil(size/8).  Sizes > 1024 have an alignment >= 128.
    // So for these larger sizes we have an array indexed by ceil(size/128).
    //
    // We flatten both logical arrays into one physical array and use
    // arithmetic to compute an appropriate index.  The constants used by
    // ClassIndex() were selected to make the flattening work.
    //
    // Examples:
    //   Size       Expression                      Index
    //   -------------------------------------------------------
    //   0          (0 + 7) / 8                     0
    //   1          (1 + 7) / 8                     1
    //   ...
    //   1024       (1024 + 7) / 8                  128
    //   1025       (1025 + 127 + (120<<7)) / 128   129
    //   ...
    //   32768      (32768 + 127 + (120<<7)) / 128  376

In short: sizes <= 1024 bytes are indexed in 8-byte steps (rounded up), and sizes > 1024 bytes are indexed in 128-byte steps (rounded up).
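The arithmetic in that comment can be checked with a small sketch. The function below is reconstructed from the expressions in the comment, not copied from the tcmalloc source, and the constant name kMaxSmallSize is an assumption used here for the 1024-byte boundary.

```cpp
#include <cstddef>

// Assumed name for the boundary between the two index granularities.
constexpr std::size_t kMaxSmallSize = 1024;

// Flattened index as described in the source comment:
//   size <= 1024: ceil(size / 8)
//   size >  1024: ceil(size / 128) + 120
// The offset 120 makes the two ranges meet: 1024 maps to index 128, and
// 1025 maps to ceil(1025 / 128) + 120 = 9 + 120 = 129.
constexpr std::size_t ClassIndex(std::size_t s) {
  return s <= kMaxSmallSize ? (s + 7) >> 3
                            : (s + 127 + (120 << 7)) >> 7;
}
```

Evaluating ClassIndex at 0, 1, 1024, 1025 and 32768 reproduces the Index column of the table in the comment (0, 1, 128, 129, 376).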

class_array_ and class_to_size_

class_array_ and class_to_size_ are plain arrays. They are initialized in SizeMap::Init when the module is loaded:

    // Compute the size classes we want to use
    int sc = 1;   // Next size class to assign
    int alignment = kAlignment;
    CHECK_CONDITION(kAlignment <= kMinAlign);
    for (size_t size = kAlignment; size <= kMaxSize; size += alignment) {
      alignment = AlignmentForSize(size);
      CHECK_CONDITION((size % alignment) == 0);

      int blocks_to_move = NumMoveSize(size) / 4;
      size_t psize = 0;
      do {
        psize += kPageSize;
        // Allocate enough pages so leftover is less than 1/8 of total.
        // This bounds wasted space to at most 12.5%.
        while ((psize % size) > (psize >> 3)) {
          psize += kPageSize;
        }
        // Continue to add pages until there are at least as many objects in
        // the span as are needed when moving objects from the central
        // freelists and spans to the thread caches.
      } while ((psize / size) < (blocks_to_move));
      const size_t my_pages = psize >> kPageShift;

      if (sc > 1 && my_pages == class_to_pages_[sc-1]) {
        // See if we can merge this into the previous class without
        // increasing the fragmentation of the previous class.
        const size_t my_objects = (my_pages << kPageShift) / size;
        const size_t prev_objects = (class_to_pages_[sc-1] << kPageShift)
                                    / class_to_size_[sc-1];
        if (my_objects == prev_objects) {
          // Adjust last class to include this size
          class_to_size_[sc-1] = size;
          continue;
        }
      }

      // Add new class
      class_to_pages_[sc] = my_pages;
      class_to_size_[sc] = size;
      sc++;
    }

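The entries of class_to_size_ are produced by stepping size forward by the alignment that applies at each size. That alignment is computed by alignment = AlignmentForSize(size); whose code is: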

    int AlignmentForSize(size_t size) {
      int alignment = kAlignment;
      if (size > kMaxSize) {
        // Cap alignment at kPageSize for large sizes.
        alignment = kPageSize;
      } else if (size >= 128) {
        // Space wasted due to alignment is at most 1/8, i.e., 12.5%.
        alignment = (1 << LgFloor(size)) / 8;
      } else if (size >= kMinAlign) {
        // We need an alignment of at least 16 bytes to satisfy
        // requirements for some SSE types.
        alignment = kMinAlign;
      }
      // Maximum alignment allowed is page size alignment.
      if (alignment > kPageSize) {
        alignment = kPageSize;
      }
      CHECK_CONDITION(size < kMinAlign || alignment >= kMinAlign);
      CHECK_CONDITION((alignment & (alignment - 1)) == 0);
      return alignment;
    }

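LgFloor uses a binary search to find the position of the highest set bit of a value. The alignment rule can be condensed into the following formula, with the result capped at kPageSize:

\[
\mathrm{alignment}(size) =
\begin{cases}
kAlignment, & size < kMinAlign \\
kMinAlign, & kMinAlign \le size < 128 \\
2^{\lfloor \log_2 size \rfloor} / 8, & 128 \le size \le kMaxSize \\
kPageSize, & size > kMaxSize
\end{cases}
\]

With this formula, class_to_size_[1] = 8, class_to_size_[2] = 16, class_to_size_[3] = 32, and so on.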
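The article does not show LgFloor itself, so the following is a sketch of one way to write it, together with a compact restatement of the alignment rule so that a few values can be worked out. The constant values (kAlignment = 8, kMinAlign = 16, kPageSize = 8192, kMaxSize = 256 KB) are assumptions chosen to match the numbers quoted in the text; the real values depend on how tcmalloc is configured.

```cpp
#include <cstddef>

// Assumed values, see the note above.
constexpr std::size_t kAlignment = 8;
constexpr std::size_t kMinAlign = 16;
constexpr std::size_t kPageSize = 8192;
constexpr std::size_t kMaxSize = 256 * 1024;

// Position of the highest set bit of n, for 0 < n < 2^32. Each step tests
// whether the upper half of the remaining bit range is non-empty, which is
// the binary search the text refers to.
constexpr int LgFloor(std::size_t n) {
  int log = 0;
  for (int i = 4; i >= 0; --i) {
    const int shift = 1 << i;  // 16, 8, 4, 2, 1
    const std::size_t x = n >> shift;
    if (x != 0) {
      n = x;
      log += shift;
    }
  }
  return log;
}

// Same branches as AlignmentForSize in the listing, without the checks.
constexpr std::size_t AlignmentForSize(std::size_t size) {
  std::size_t alignment = kAlignment;
  if (size > kMaxSize) {
    alignment = kPageSize;
  } else if (size >= 128) {
    alignment = (std::size_t{1} << LgFloor(size)) / 8;
  } else if (size >= kMinAlign) {
    alignment = kMinAlign;
  }
  return alignment > kPageSize ? kPageSize : alignment;
}
```

Under these assumptions, a size of 1000 has its highest bit at position 9, so its alignment is 512 / 8 = 64, while sizes from 64 KB upward are capped at the page size.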

class_array_ is initialized after class_to_size_:

    // Initialize the mapping arrays
    int next_size = 0;
    for (int c = 1; c < kNumClasses; c++) {
      const int max_size_in_class = class_to_size_[c];
      for (int s = next_size; s <= max_size_in_class; s += kAlignment) {
        class_array_[ClassIndex(s)] = c;
      }
      next_size = max_size_in_class + kAlignment;
    }

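To sum up: ClassIndex generally works in 8-byte steps, the resulting class_to_size_ entries are generally 16-byte aligned, and class_array_ is what ties the two together.

A concrete example shows how the mapping works. When an application calls malloc(25), how much memory does tcmalloc actually hand out?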

          ClassIndex                  class_array_          class_to_size_
    25 ----------------> (25+7)/8 = 4 ----------------> 3 ----------------> 32

The result is a 32-byte block.
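The whole lookup chain can be traced with a small sketch. It uses a hypothetical, truncated class_to_size table of five classes (8, 16, 32, 48, 64) rather than the table tcmalloc really builds, and only the small-size branch of ClassIndex, so it illustrates the mechanism rather than reproducing the real SizeMap.

```cpp
#include <cstddef>

constexpr std::size_t kAlignment = 8;  // assumed, as in the text
constexpr int kNumClasses = 6;         // hypothetical; class 0 is unused
constexpr std::size_t kClassToSize[kNumClasses] = {0, 8, 16, 32, 48, 64};

// Small-size branch of ClassIndex only (sizes <= 1024).
constexpr std::size_t ClassIndex(std::size_t s) { return (s + 7) >> 3; }

// One slot per 8-byte step up to the largest size in the table.
struct ClassArray {
  unsigned char cls[64 / 8 + 1] = {};
};

// Same loop shape as the class_array_ initialization in SizeMap::Init:
// every 8-byte step between the previous class and this one maps to c.
constexpr ClassArray BuildClassArray() {
  ClassArray table{};
  std::size_t next_size = 0;
  for (int c = 1; c < kNumClasses; ++c) {
    const std::size_t max_size_in_class = kClassToSize[c];
    for (std::size_t s = next_size; s <= max_size_in_class; s += kAlignment) {
      table.cls[ClassIndex(s)] = static_cast<unsigned char>(c);
    }
    next_size = max_size_in_class + kAlignment;
  }
  return table;
}

// request -> ClassIndex -> class_array_ -> class_to_size_
// Only valid for requests up to 64 bytes with this truncated table.
constexpr std::size_t AllocatedSize(std::size_t request) {
  return kClassToSize[BuildClassArray().cls[ClassIndex(request)]];
}
```

With this table, AllocatedSize(25) goes through index 4 and class 3 and yields 32, matching the example above, and a request of 33 bytes lands in the 48-byte class.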

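class_to_pages_ and num_objects_to_move_

SizeMap contains two more maps: class_to_pages_ and num_objects_to_move_.

class_to_pages_ is used by the central free list. It gives the number of pages a size class requests from the page heap each time, and it is also initialized in SizeMap::Init: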

    do {
      psize += kPageSize;
      // Allocate enough pages so leftover is less than 1/8 of total.
      // This bounds wasted space to at most 12.5%.
      while ((psize % size) > (psize >> 3)) {
        psize += kPageSize;
      }
      // Continue to add pages until there are at least as many objects in
      // the span as are needed when moving objects from the central
      // freelists and spans to the thread caches.
    } while ((psize / size) < (blocks_to_move));

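The resulting page count is determined by two conditions:

1) The span must hold at least blocks_to_move objects, where blocks_to_move is NumMoveSize(size) / 4 (NumMoveSize is the value stored in num_objects_to_move_, the number of objects moved at a time between the central free list and a thread cache);

2) Once the pages are carved into objects of this size, the leftover space must not exceed 1/8 of the total span size, so the wasted space is at most psize/8 (12.5%).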
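The two conditions can be tried out on concrete numbers with the sketch below. It wraps the do/while loop from SizeMap::Init in a hypothetical helper named PagesForClass and assumes 8 KB pages (kPageShift = 13); the blocks_to_move values passed in are arbitrary inputs for illustration, not what NumMoveSize would return.

```cpp
#include <cstddef>

// Assumed page size of 8 KB; the real value is a build-time setting.
constexpr std::size_t kPageShift = 13;
constexpr std::size_t kPageSize = std::size_t{1} << kPageShift;

// Hypothetical helper around the loop shown above: returns how many pages
// a span for objects of `size` bytes would get.
constexpr std::size_t PagesForClass(std::size_t size,
                                    std::size_t blocks_to_move) {
  std::size_t psize = 0;
  do {
    psize += kPageSize;
    // Condition 2: grow until the leftover is at most 1/8 of the span.
    while ((psize % size) > (psize >> 3)) {
      psize += kPageSize;
    }
    // Condition 1: grow until the span holds at least blocks_to_move objects.
  } while ((psize / size) < blocks_to_move);
  return psize >> kPageShift;
}
```

For 5120-byte objects, one page would leave 3072 bytes unused, which is more than 8192 / 8, so the span grows to two pages (leftover 1024 out of 16384). Asking for at least four objects per span pushes it further, to four pages.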

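Summary

SizeMap gathers all of tcmalloc's size-related maps in one place, so allocation behavior can be tuned by adjusting SizeMap. The open question is why the requested size is first mapped in 8-byte steps, then aligned to 16 bytes, and finally looked up through two tables. My first idea was to map the source size directly in 16-byte steps: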

    src size    index        dst size
    0           0            0
    1           1            16
    2           1            16
    n           (n+15)/16    (n+15)/16 * 16

This would be simpler and more direct to implement, and it would also do the job. Perhaps tcmalloc has a deeper reason that I have not found.
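For comparison, the direct mapping proposed above can be written out as a sketch. The names DirectIndex, DirectSize and kDirectClassCount are made up for this illustration, and kMaxSize is the 256 KB figure quoted earlier in the article.

```cpp
#include <cstddef>

constexpr std::size_t kMaxSize = 256 * 1024;  // 256 KB, as quoted in the text

// The proposed single-step mapping: round every request up to 16 bytes.
constexpr std::size_t DirectIndex(std::size_t n) { return (n + 15) / 16; }
constexpr std::size_t DirectSize(std::size_t n) { return DirectIndex(n) * 16; }

// Number of distinct non-zero sizes this scheme produces up to kMaxSize.
constexpr std::size_t kDirectClassCount = DirectIndex(kMaxSize);
```

One thing the sketch makes visible is that a uniform 16-byte step yields 16384 distinct sizes up to 256 KB, each of which would presumably need its own free list. That may be part of why tcmalloc lets the alignment grow with the size instead, but this is an inference from the arithmetic and not something the source code states.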


Original article: https://www.cnblogs.com/persistentsnail/p/3446495.html
