  • Understanding Linux Kernel Design (12): Memory Management and the Slab Allocator

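    [Copyright notice: please respect the original work and keep the source when reposting: blog.csdn.net/shallnet. This article is for study and discussion only; do not use it commercially.]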

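    The previous section ended by noting that if the buddy system served requests for small memory areas, much of each page would be left as unusable free space. The slab allocator exists to handle requests for small areas (tens or hundreds of bytes). A main goal of introducing slab in Linux is to reduce the number of calls into the buddy allocator.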

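    The kernel often uses the same kind of memory area repeatedly. For example, whenever the kernel creates a new process it must allocate memory for the associated data structures (task_struct, open file objects, and so on), and when the process exits it reclaims that memory. Because processes are created and destroyed very frequently, Linux keeps these frequently used pages in a cache and reuses them.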

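    The slab allocator manages memory by object. Objects of the same type are grouped into one class (process descriptors, for example, form one class). Whenever such an object is requested, the slab allocator hands out a free one; when the object is released, it is kept in the slab allocator for reuse instead of being returned directly to the buddy system.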
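
    The reuse idea can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the names toy_cache, toy_cache_alloc and toy_cache_free are hypothetical, and malloc()/free() stand in for the buddy system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy cache: freed objects are parked on a free list
 * instead of being handed back to the underlying allocator. */
struct toy_obj {
	struct toy_obj *next;		/* only meaningful while the object is free */
};

struct toy_cache {
	size_t obj_size;		/* bytes per object */
	struct toy_obj *free_list;	/* stack of recycled objects */
	unsigned long backend_allocs;	/* how often malloc() had to be called */
};

void toy_cache_init(struct toy_cache *c, size_t obj_size)
{
	assert(c != NULL);
	/* An object must be able to hold the free-list link. */
	if (obj_size < sizeof(struct toy_obj))
		obj_size = sizeof(struct toy_obj);
	c->obj_size = obj_size;
	c->free_list = NULL;
	c->backend_allocs = 0;
}

void *toy_cache_alloc(struct toy_cache *c)
{
	struct toy_obj *o = c->free_list;

	if (o) {			/* fast path: reuse a recycled object */
		c->free_list = o->next;
		return o;
	}
	c->backend_allocs++;		/* slow path: ask the backend (may return NULL) */
	return malloc(c->obj_size);
}

void toy_cache_free(struct toy_cache *c, void *p)
{
	struct toy_obj *o = p;

	o->next = c->free_list;		/* keep the object for later reuse */
	c->free_list = o;
}

/* Hand every parked object back to the backend. */
void toy_cache_destroy(struct toy_cache *c)
{
	while (c->free_list) {
		struct toy_obj *o = c->free_list;

		c->free_list = o->next;
		free(o);
	}
}
```

    In this model, allocating an object, freeing it and allocating again returns the same address, and the backend is called only once. The real allocator adds per-CPU caches, slabs and locking on top of this basic idea.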

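    Frequently requested objects get a dedicated cache of objects of exactly the right size. Infrequently requested objects are served from a series of general-purpose caches whose object sizes are geometrically distributed (see the general-purpose objects).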
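
    Picking a general-purpose size class amounts to finding the smallest class that fits the request. The following user-space sketch assumes a table that doubles from 32 bytes to 4096 bytes; the real list is generated from <linux/kmalloc_sizes.h> and depends on the kernel configuration. The names toy_size_classes, toy_size_class_index and toy_internal_waste are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size classes, doubling from 32 bytes. The kernel's own list
 * differs and depends on page size and configuration. */
static const size_t toy_size_classes[] = {
	32, 64, 128, 256, 512, 1024, 2048, 4096
};
#define TOY_NUM_CLASSES (sizeof(toy_size_classes) / sizeof(toy_size_classes[0]))

/* Return the index of the smallest class that can hold `size` bytes,
 * or -1 if the request is larger than every class. */
int toy_size_class_index(size_t size)
{
	size_t i;

	for (i = 0; i < TOY_NUM_CLASSES; i++)
		if (size <= toy_size_classes[i])
			return (int)i;
	return -1;
}

/* Bytes wasted inside the object when `size` bytes are served
 * from its size class. */
size_t toy_internal_waste(size_t size)
{
	int i = toy_size_class_index(size);

	assert(i >= 0);
	return toy_size_classes[i] - size;
}
```

    With this table a 100-byte request lands in the 128-byte class and wastes 28 bytes, which is the price paid for not creating a dedicated cache.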

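    The slab scheme groups objects into caches. Because the way a cache is organized and managed is closely tied to the hit rate of the hardware cache, a slab cache is not built directly from individual objects. It is built from a chain of "slabs", and each slab holds a number of objects of the same type, each of which is either allocated or free. In practice a cache is a region of main memory divided into blocks: each block is a slab, each slab consists of one or more pages, and each slab stores objects.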
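
    How many objects fit in one slab follows from simple arithmetic. The sketch below is a simplified user-space estimate, assuming 4096-byte pages and a slab descriptor stored outside the slab, so that the whole slab is available for objects; toy_slab_layout, toy_align_up and toy_slab_estimate are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size */

struct toy_slab_layout {
	size_t num;		/* objects that fit in one slab */
	size_t left_over;	/* unused bytes at the end of the slab */
};

/* Round `size` up to a multiple of `align` (a power of two). */
size_t toy_align_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Lay out a slab of 2^order pages holding objects of `obj_size` bytes.
 * Simplification: no on-slab descriptor, no debug padding. */
struct toy_slab_layout toy_slab_estimate(unsigned int order,
					 size_t obj_size, size_t align)
{
	struct toy_slab_layout l;
	size_t slab_bytes = (size_t)TOY_PAGE_SIZE << order;
	size_t stride = toy_align_up(obj_size, align);

	assert(stride != 0);
	l.num = slab_bytes / stride;
	l.left_over = slab_bytes - l.num * stride;
	return l;
}
```

    For a one-page slab and 100-byte objects aligned to 8 bytes, this gives 39 objects with 40 bytes left over. In the kernel, leftover space of this kind is what slab colouring draws on to shift objects in different slabs onto different hardware cache lines.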

    Slab data structures:

    A cache is represented by the kmem_cache structure.

    struct kmem_cache {
    /* 1) per-cpu data, touched during every alloc/free */
    	struct array_cache *array[NR_CPUS];
    /* 2) Cache tunables. Protected by cache_chain_mutex */
    	unsigned int batchcount;
    	unsigned int limit;
    	unsigned int shared;
    
    	unsigned int buffer_size;
    	u32 reciprocal_buffer_size;
    /* 3) touched by every alloc & free from the backend */
    
    	unsigned int flags;		/* constant flags */
    	unsigned int num;		/* # of objs per slab */
    
    /* 4) cache_grow/shrink */
    	/* order of pgs per slab (2^n) */
    	unsigned int gfporder;
    
    	/* force GFP flags, e.g. GFP_DMA */
    	gfp_t gfpflags;
    
    	size_t colour;			/* cache colouring range */
    	unsigned int colour_off;	/* colour offset */
    	struct kmem_cache *slabp_cache;
    	unsigned int slab_size;
    	unsigned int dflags;		/* dynamic flags */
    
    	/* constructor func */
    	void (*ctor)(void *obj);
    
    /* 5) cache creation/removal */
    	const char *name;
    	struct list_head next;
    
    /* 6) statistics */
    #ifdef CONFIG_DEBUG_SLAB
    	unsigned long num_active;
    	unsigned long num_allocations;
    	unsigned long high_mark;
    	unsigned long grown;
    	unsigned long reaped;
    	unsigned long errors;
    	unsigned long max_freeable;
    	unsigned long node_allocs;
    	unsigned long node_frees;
    	unsigned long node_overflow;
    	atomic_t allochit;
    	atomic_t allocmiss;
    	atomic_t freehit;
    	atomic_t freemiss;
    
    	/*
    	 * If debugging is enabled, then the allocator can add additional
    	 * fields and/or padding to every object. buffer_size contains the total
    	 * object size including these internal fields, the following two
    	 * variables contain the offset to the user object and its size.
    	 */
    	int obj_offset;
    	int obj_size;
    #endif /* CONFIG_DEBUG_SLAB */
    
    	/*
    	 * We put nodelists[] at the end of kmem_cache, because we want to size
    	 * this array to nr_node_ids slots instead of MAX_NUMNODES
    	 * (see kmem_cache_init())
    	 * We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
    	 * is statically defined, so we reserve the max number of nodes.
    	 */
    	struct kmem_list3 *nodelists[MAX_NUMNODES];
    	/*
    	 * Do not add fields after nodelists[]
    	 */
    };

    The struct kmem_list3 member links the slabs together and holds the shared cache. It is defined as follows:

    /*
     * The slab lists for all objects.
     */
    struct kmem_list3 {
    	struct list_head slabs_partial;	/* partial list first, better asm code */
    	struct list_head slabs_full;
    	struct list_head slabs_free;
    	unsigned long free_objects;
    	unsigned int free_limit;
    	unsigned int colour_next;	/* Per-node cache coloring */
    	spinlock_t list_lock;
    	struct array_cache *shared;	/* shared per node */
    	struct array_cache **alien;	/* on other nodes */
    	unsigned long next_reap;	/* updated without locking */
    	int free_touched;		/* updated without locking */
    };
    

    This structure contains three lists, slabs_partial, slabs_full and slabs_free, which together hold every slab in the cache. The slab descriptor, struct slab, describes each slab:

    /*
     * struct slab
     *
     * Manages the objs in a slab. Placed either at the beginning of mem allocated
     * for a slab, or allocated from an general cache.
     * Slabs are chained into three list: fully used, partial, fully free slabs.
     */
    struct slab {
    	struct list_head list;
    	unsigned long colouroff;
    	void *s_mem;		/* including colour offset */
    	unsigned int inuse;	/* num of objs active in slab */
    	kmem_bufctl_t free;
    	unsigned short nodeid;
    };
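
    The roles of the inuse and free fields and of the three lists can be modelled in user space. This is a sketch under assumptions, not the kernel implementation: it uses a fixed four objects per slab, and the names toy_slab, toy_slab_get_obj, toy_slab_put_obj and toy_slab_which_list are hypothetical. As in the kernel version quoted here, free is taken to be the index of the first free object, and an index array chains the remaining free objects.

```c
#include <assert.h>

#define TOY_OBJS_PER_SLAB 4u	/* assumed, kept tiny for illustration */
#define TOY_BUFCTL_END (~0u)	/* marks the end of the free chain */

enum toy_slab_list { TOY_SLABS_FREE, TOY_SLABS_PARTIAL, TOY_SLABS_FULL };

struct toy_slab {
	unsigned int inuse;	/* objects currently allocated */
	unsigned int free;	/* index of the first free object */
	unsigned int bufctl[TOY_OBJS_PER_SLAB];	/* bufctl[i]: next free index after i */
};

/* Start with every object free, chained 0 -> 1 -> ... -> END. */
void toy_slab_init(struct toy_slab *s)
{
	unsigned int i;

	for (i = 0; i < TOY_OBJS_PER_SLAB; i++)
		s->bufctl[i] = i + 1;
	s->bufctl[TOY_OBJS_PER_SLAB - 1] = TOY_BUFCTL_END;
	s->inuse = 0;
	s->free = 0;
}

/* Take the first free object; returns its index,
 * or TOY_BUFCTL_END if the slab is full. */
unsigned int toy_slab_get_obj(struct toy_slab *s)
{
	unsigned int idx = s->free;

	if (idx == TOY_BUFCTL_END)
		return TOY_BUFCTL_END;
	s->free = s->bufctl[idx];
	s->inuse++;
	return idx;
}

/* Return object `idx` by pushing it onto the head of the free chain. */
void toy_slab_put_obj(struct toy_slab *s, unsigned int idx)
{
	assert(idx < TOY_OBJS_PER_SLAB && s->inuse > 0);
	s->bufctl[idx] = s->free;
	s->free = idx;
	s->inuse--;
}

/* Which of the three lists the slab belongs on, judged by inuse. */
enum toy_slab_list toy_slab_which_list(const struct toy_slab *s)
{
	if (s->inuse == 0)
		return TOY_SLABS_FREE;
	if (s->inuse == TOY_OBJS_PER_SLAB)
		return TOY_SLABS_FULL;
	return TOY_SLABS_PARTIAL;
}
```

    A freshly initialized slab belongs on the free list, moves to the partial list after the first allocation, and to the full list once all four objects are handed out. Because freed objects are pushed on the head of the chain, the most recently freed object is the next one allocated.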
    

    A new cache is created with the following function:

    struct kmem_cache *kmem_cache_create (const char *name, size_t size, size_t align, unsigned long flags, void (*ctor)(void *)); 

    On success it returns a pointer to the newly created cache. To destroy a cache, call:

    void kmem_cache_destroy(struct kmem_cache *cachep);

    Neither function may be used in interrupt context, because both may sleep.

    Once a cache has been created, objects are obtained from it with:

    /**
     * kmem_cache_alloc - Allocate an object
     * @cachep: The cache to allocate from.
     * @flags: See kmalloc().
     *
     * Allocate an object from this cache.  The flags are only relevant
     * if the cache has no available objects.
     */
    void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
    {
    	void *ret = __cache_alloc(cachep, flags, __builtin_return_address(0));
    
    	trace_kmem_cache_alloc(_RET_IP_, ret,
    			       obj_size(cachep), cachep->buffer_size, flags);
    
    	return ret;
    }

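    This function returns a pointer to an object from the given cache cachep.

    If no slab in the cache has a free object, the slab layer must obtain new pages through kmem_getpages(). The flags argument is passed on to __get_free_pages().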

    static void *kmem_getpages(struct kmem_cache *cachep, gfp_t flags, int nodeid);

    Objects are released with:

    /**
     * kmem_cache_free - Deallocate an object
     * @cachep: The cache the allocation was from.
     * @objp: The previously allocated object.
     *
     * Free an object which was previously allocated from this
     * cache.
     */
    void kmem_cache_free(struct kmem_cache *cachep, void *objp)
    {
    	unsigned long flags;
    
    	local_irq_save(flags);
    	debug_check_no_locks_freed(objp, obj_size(cachep));
    	if (!(cachep->flags & SLAB_DEBUG_OBJECTS))
    		debug_check_no_obj_freed(objp, obj_size(cachep));
    	__cache_free(cachep, objp);
    	local_irq_restore(flags);
    
    	trace_kmem_cache_free(_RET_IP_, objp);
    }

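    If you frequently create many objects of the same type, you should consider using a slab cache.

    In fact, the kmalloc() function from the previous section also allocates through the slab allocator: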

    static __always_inline void *kmalloc(size_t size, gfp_t flags)
    {
    	struct kmem_cache *cachep;
    	void *ret;
    
    	if (__builtin_constant_p(size)) {
    		int i = 0;
    
    		if (!size)
    			return ZERO_SIZE_PTR;
    
    #define CACHE(x) \
    		if (size <= x) \
    			goto found; \
    		else \
    			i++;
    #include <linux/kmalloc_sizes.h>
    #undef CACHE
    		return NULL;
    found:
    #ifdef CONFIG_ZONE_DMA
    		if (flags & GFP_DMA)
    			cachep = malloc_sizes[i].cs_dmacachep;
    		else
    #endif
    			cachep = malloc_sizes[i].cs_cachep;
    
    		ret = kmem_cache_alloc_notrace(cachep, flags);
    
    		trace_kmalloc(_THIS_IP_, ret,
    			      size, slab_buffer_size(cachep), flags);
    
    		return ret;
    	}
    	return __kmalloc(size, flags);
    }

    kfree() is implemented as follows:

    /**
     * kfree - free previously allocated memory
     * @objp: pointer returned by kmalloc.
     *
     * If @objp is NULL, no operation is performed.
     *
     * Don't free memory not originally allocated by kmalloc()
     * or you will run into trouble.
     */
    void kfree(const void *objp)
    {
    	struct kmem_cache *c;
    	unsigned long flags;
    
    	trace_kfree(_RET_IP_, objp);
    
    	if (unlikely(ZERO_OR_NULL_PTR(objp)))
    		return;
    	local_irq_save(flags);
    	kfree_debugcheck(objp);
    	c = virt_to_cache(objp);
    	debug_check_no_locks_freed(objp, obj_size(c));
    	debug_check_no_obj_freed(objp, obj_size(c));
    	__cache_free(c, (void *)objp);
    	local_irq_restore(flags);
    }

    Finally, together with the previous section, here is how to choose an allocation function:

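    If you need physically contiguous pages, use one of the low-level page allocators or kmalloc().
    If you want to allocate from high memory, use alloc_pages().
    If you do not need physically contiguous pages, only virtually contiguous ones, use vmalloc().
    If you create and destroy many large data structures, consider setting up a slab cache.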

  • Original article: https://www.cnblogs.com/cxchanpin/p/7216182.html