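[Copyright notice: please respect original work. When reposting, keep the source: blog.csdn.net/shallnet. This article is for study and discussion only; please do not use it for commercial purposes.]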
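At the end of the previous section we noted that if requests for small memory areas were served by the buddy system, a great deal of free space inside each page would be left unusable. The slab allocator was introduced to handle requests for small memory areas (tens or hundreds of bytes). The main purpose of introducing slab in Linux is to reduce the number of calls to the buddy algorithm.
The kernel often reuses the same kind of memory area. For example, whenever the kernel creates a new process, it has to allocate memory for the data structures associated with that process (task_struct, open file objects, and so on). When the process ends, that memory is reclaimed. Because processes are created and destroyed very frequently, Linux keeps these frequently used pages in a cache and reuses them.
The slab allocator manages memory in terms of objects. Objects of the same type are grouped into one class (process descriptors are one such class). Whenever such an object is requested, the slab allocator hands out a free one; when the object is released, it is kept in the slab allocator again instead of being returned directly to the buddy system.
For objects that are requested frequently, a dedicated cache of objects of the appropriate size is created. For objects that are requested infrequently, a series of objects with geometrically distributed sizes is used (see general-purpose objects).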
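To make "geometrically distributed sizes" concrete, the sketch below rounds a request up to the first size class that can hold it. It is a userspace illustration, not kernel code: toy_sizes, toy_size_index() and toy_wasted_bytes() are hypothetical names, and the power-of-two classes from 32 bytes to 128 KB are an assumption, since the kernel's real table depends on its configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative size classes (powers of two). Assumption: the kernel's
 * real table differs depending on page size and cache line size. */
static const size_t toy_sizes[] = {
    32, 64, 128, 256, 512, 1024, 2048, 4096,
    8192, 16384, 32768, 65536, 131072
};
#define TOY_NUM_SIZES (sizeof(toy_sizes) / sizeof(toy_sizes[0]))

/* Return the index of the smallest class that fits size, or -1 if none does. */
static int toy_size_index(size_t size)
{
    for (size_t i = 0; i < TOY_NUM_SIZES; i++)
        if (size <= toy_sizes[i])
            return (int)i;
    return -1;
}

/* Bytes left unused inside the chosen object (internal fragmentation). */
static size_t toy_wasted_bytes(size_t size)
{
    int i = toy_size_index(size);

    assert(i >= 0); /* caller must not ask for more than the largest class */
    return toy_sizes[i] - size;
}
```

With this table, a 100-byte request lands in the 128-byte class and leaves 28 bytes unused, which is the price paid for not creating a dedicated cache for every possible size.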
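The slab allocation scheme groups objects into caches. Because the organization and management of a cache is closely tied to the hit rate of the hardware cache, a slab cache is not built directly from individual objects. Instead it consists of a series of slabs, and each slab contains a number of objects of the same type, each of which is either allocated or free. In effect, a cache is a region of main memory divided into blocks: each block is a slab, each slab consists of one or more pages, and the objects are stored inside the slab.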
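The layout described above can be sketched in userspace as a single slab that holds a fixed number of equally sized objects and tracks the free ones by index. This is only an illustration under stated assumptions: struct toy_slab and the toy_slab_*() helpers are hypothetical names, malloc() stands in for the pages a real slab would get from the buddy system, and none of the kernel's per-CPU caching, colouring or locking is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_END ((unsigned int)-1) /* marks the end of the free list */

/* Hypothetical stand-in for one slab: a block of equally sized objects. */
struct toy_slab {
    unsigned char *mem;      /* storage for all objects */
    unsigned int *next_free; /* next_free[i]: free object that follows i */
    unsigned int free_idx;   /* first free object, or TOY_END if none */
    unsigned int inuse;      /* objects currently handed out */
    unsigned int num;        /* objects per slab */
    size_t obj_size;         /* size of one object in bytes */
};

/* Set up a slab with num objects of obj_size bytes; returns 0 on success. */
static int toy_slab_init(struct toy_slab *s, unsigned int num, size_t obj_size)
{
    assert(num > 0 && obj_size > 0);
    s->mem = malloc((size_t)num * obj_size);
    s->next_free = malloc(num * sizeof(*s->next_free));
    if (!s->mem || !s->next_free) {
        free(s->mem);
        free(s->next_free);
        return -1;
    }
    /* Initially every object is free and chained to its neighbour. */
    for (unsigned int i = 0; i < num; i++)
        s->next_free[i] = (i + 1 < num) ? i + 1 : TOY_END;
    s->free_idx = 0;
    s->inuse = 0;
    s->num = num;
    s->obj_size = obj_size;
    return 0;
}

/* Hand out one free object, or NULL when the slab is full. */
static void *toy_slab_alloc(struct toy_slab *s)
{
    unsigned int idx = s->free_idx;

    if (idx == TOY_END)
        return NULL;
    s->free_idx = s->next_free[idx];
    s->inuse++;
    return s->mem + (size_t)idx * s->obj_size;
}

/* Put an object back on the free list; its memory stays in the slab. */
static void toy_slab_free(struct toy_slab *s, void *obj)
{
    size_t off = (size_t)((unsigned char *)obj - s->mem);
    unsigned int idx = (unsigned int)(off / s->obj_size);

    assert(idx < s->num);
    s->next_free[idx] = s->free_idx;
    s->free_idx = idx;
    s->inuse--;
}

/* Release the slab's backing memory. */
static void toy_slab_destroy(struct toy_slab *s)
{
    free(s->mem);
    free(s->next_free);
}
```

Freeing an object and allocating again returns the same address, which is the reuse behaviour the slab layer relies on: the memory never travels back to the underlying allocator in between.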
Slab-related data structures:
A cache is represented by the kmem_cache structure.
struct kmem_cache {
    /* 1) per-cpu data, touched during every alloc/free */
    struct array_cache *array[NR_CPUS];

    /* 2) Cache tunables. Protected by cache_chain_mutex */
    unsigned int batchcount;
    unsigned int limit;
    unsigned int shared;

    unsigned int buffer_size;
    u32 reciprocal_buffer_size;

    /* 3) touched by every alloc & free from the backend */
    unsigned int flags;             /* constant flags */
    unsigned int num;               /* # of objs per slab */

    /* 4) cache_grow/shrink */
    /* order of pgs per slab (2^n) */
    unsigned int gfporder;

    /* force GFP flags, e.g. GFP_DMA */
    gfp_t gfpflags;

    size_t colour;                  /* cache colouring range */
    unsigned int colour_off;        /* colour offset */
    struct kmem_cache *slabp_cache;
    unsigned int slab_size;
    unsigned int dflags;            /* dynamic flags */

    /* constructor func */
    void (*ctor)(void *obj);

    /* 5) cache creation/removal */
    const char *name;
    struct list_head next;

    /* 6) statistics */
#ifdef CONFIG_DEBUG_SLAB
    unsigned long num_active;
    unsigned long num_allocations;
    unsigned long high_mark;
    unsigned long grown;
    unsigned long reaped;
    unsigned long errors;
    unsigned long max_freeable;
    unsigned long node_allocs;
    unsigned long node_frees;
    unsigned long node_overflow;
    atomic_t allochit;
    atomic_t allocmiss;
    atomic_t freehit;
    atomic_t freemiss;

    /*
     * If debugging is enabled, then the allocator can add additional
     * fields and/or padding to every object. buffer_size contains the total
     * object size including these internal fields, the following two
     * variables contain the offset to the user object and its size.
     */
    int obj_offset;
    int obj_size;
#endif /* CONFIG_DEBUG_SLAB */

    /*
     * We put nodelists[] at the end of kmem_cache, because we want to size
     * this array to nr_node_ids slots instead of MAX_NUMNODES
     * (see kmem_cache_init())
     * We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
     * is statically defined, so we reserve the max number of nodes.
     */
    struct kmem_list3 *nodelists[MAX_NUMNODES];
    /*
     * Do not add fields after nodelists[]
     */
};
Here, struct kmem_list3 links the cache's slabs and its shared cache. It is defined as follows:
/*
 * The slab lists for all objects.
 */
struct kmem_list3 {
    struct list_head slabs_partial; /* partial list first, better asm code */
    struct list_head slabs_full;
    struct list_head slabs_free;
    unsigned long free_objects;
    unsigned int free_limit;
    unsigned int colour_next;       /* Per-node cache coloring */
    spinlock_t list_lock;
    struct array_cache *shared;     /* shared per node */
    struct array_cache **alien;     /* on other nodes */
    unsigned long next_reap;        /* updated without locking */
    int free_touched;               /* updated without locking */
};
This structure contains three lists, slabs_partial, slabs_full and slabs_free, which together hold all of the cache's slabs. The slab descriptor, struct slab, describes each individual slab:
/*
 * struct slab
 *
 * Manages the objs in a slab. Placed either at the beginning of mem allocated
 * for a slab, or allocated from an general cache.
 * Slabs are chained into three list: fully used, partial, fully free slabs.
 */
struct slab {
    struct list_head list;
    unsigned long colouroff;
    void *s_mem;            /* including colour offset */
    unsigned int inuse;     /* num of objs active in slab */
    kmem_bufctl_t free;
    unsigned short nodeid;
};
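How the inuse counter relates to the three lists can be sketched with two small helpers. This is a userspace illustration with hypothetical names (enum toy_list, toy_list_for(), toy_pick_source()). The preference for partially used slabs over completely free ones reflects my reading of the allocator's refill path, so treat it as an assumption rather than a specification.

```c
#include <assert.h>

/* The lists a slab can sit on, mirroring slabs_free/partial/full.
 * TOY_LIST_NONE is a hypothetical marker for "no slab available". */
enum toy_list { TOY_LIST_FREE, TOY_LIST_PARTIAL, TOY_LIST_FULL, TOY_LIST_NONE };

/* Decide which list a slab belongs on from its in-use object count. */
static enum toy_list toy_list_for(unsigned int inuse, unsigned int num)
{
    assert(num > 0 && inuse <= num);
    if (inuse == 0)
        return TOY_LIST_FREE;
    if (inuse == num)
        return TOY_LIST_FULL;
    return TOY_LIST_PARTIAL;
}

/*
 * Pick the list to allocate from: partially used slabs first, then
 * completely free ones. TOY_LIST_NONE means the cache has to grow
 * by obtaining new pages.
 */
static enum toy_list toy_pick_source(unsigned int n_partial, unsigned int n_free)
{
    if (n_partial > 0)
        return TOY_LIST_PARTIAL;
    if (n_free > 0)
        return TOY_LIST_FREE;
    return TOY_LIST_NONE;
}
```

Filling partial slabs first keeps completely free slabs intact, so their pages can later be handed back when memory is tight.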
A new cache is created with the following function:
struct kmem_cache *kmem_cache_create (const char *name, size_t size, size_t align, unsigned long flags, void (*ctor)(void *));
On success the function returns a pointer to the newly created cache. A cache is destroyed by calling the following function:
void kmem_cache_destroy(struct kmem_cache *cachep);
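Neither of these two functions may be used in interrupt context, because both may sleep.
Once a cache has been created, objects can be obtained from it with the following function: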
/**
 * kmem_cache_alloc - Allocate an object
 * @cachep: The cache to allocate from.
 * @flags: See kmalloc().
 *
 * Allocate an object from this cache. The flags are only relevant
 * if the cache has no available objects.
 */
void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
{
    void *ret = __cache_alloc(cachep, flags, __builtin_return_address(0));

    trace_kmem_cache_alloc(_RET_IP_, ret,
                           obj_size(cachep), cachep->buffer_size, flags);

    return ret;
}
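This function returns a pointer to an object taken from the given cache cachep.
If none of the slabs in the cache has a free object, the slab layer must obtain new pages through kmem_getpages(). The flags argument is passed on to __get_free_pages().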
static void *kmem_getpages(struct kmem_cache *cachep, gfp_t flags, int nodeid);
An object is released with the following function:
/**
 * kmem_cache_free - Deallocate an object
 * @cachep: The cache the allocation was from.
 * @objp: The previously allocated object.
 *
 * Free an object which was previously allocated from this
 * cache.
 */
void kmem_cache_free(struct kmem_cache *cachep, void *objp)
{
    unsigned long flags;

    local_irq_save(flags);
    debug_check_no_locks_freed(objp, obj_size(cachep));
    if (!(cachep->flags & SLAB_DEBUG_OBJECTS))
        debug_check_no_obj_freed(objp, obj_size(cachep));
    __cache_free(cachep, objp);
    local_irq_restore(flags);

    trace_kmem_cache_free(_RET_IP_, objp);
}
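If you need to create many objects of the same type frequently, you should consider using a slab cache.
In fact, the kmalloc() function covered in the previous section is also implemented on top of the slab allocator.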
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
    struct kmem_cache *cachep;
    void *ret;

    if (__builtin_constant_p(size)) {
        int i = 0;

        if (!size)
            return ZERO_SIZE_PTR;

#define CACHE(x) \
        if (size <= x) \
            goto found; \
        else \
            i++;
#include <linux/kmalloc_sizes.h>
#undef CACHE
        return NULL;
found:
#ifdef CONFIG_ZONE_DMA
        if (flags & GFP_DMA)
            cachep = malloc_sizes[i].cs_dmacachep;
        else
#endif
            cachep = malloc_sizes[i].cs_cachep;

        ret = kmem_cache_alloc_notrace(cachep, flags);

        trace_kmalloc(_THIS_IP_, ret,
                      size, slab_buffer_size(cachep), flags);

        return ret;
    }
    return __kmalloc(size, flags);
}

The kfree() function is implemented as follows:
/**
 * kfree - free previously allocated memory
 * @objp: pointer returned by kmalloc.
 *
 * If @objp is NULL, no operation is performed.
 *
 * Don't free memory not originally allocated by kmalloc()
 * or you will run into trouble.
 */
void kfree(const void *objp)
{
    struct kmem_cache *c;
    unsigned long flags;

    trace_kfree(_RET_IP_, objp);

    if (unlikely(ZERO_OR_NULL_PTR(objp)))
        return;
    local_irq_save(flags);
    kfree_debugcheck(objp);
    c = virt_to_cache(objp);
    debug_check_no_locks_freed(objp, obj_size(c));
    debug_check_no_obj_freed(objp, obj_size(c));
    __cache_free(c, (void *)objp);
    local_irq_restore(flags);
}

Finally, combining this with the previous section, here is how to choose among the allocation functions:
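If you need physically contiguous pages, use one of the low-level page allocators or kmalloc().
If you want to allocate from high memory, use alloc_pages().
If you do not need physically contiguous pages, only pages that are contiguous in the virtual address space, use vmalloc().
If you create and destroy many large data structures, consider setting up a slab cache.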