  • Linux load average: concept, calculation, and caveats

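            A module I have been developing recently needs to schedule tasks based on the load of each node in the data center (metrics such as NIC I/O and load average). At first I did not have a clear picture of what load average means on a Linux machine. After some research I worked out how it is computed and what influences it, and I am recording the findings here as notes.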
    1. load average
            When you run the top command in a shell, the first line of its default output includes the load average, as shown below:

    top - 19:10:32 up 626 days,  4:58,  1 user,  load average: 7.74, 5.62, 6.51
    Tasks: 181 total,   8 running, 173 sleeping,   0 stopped,   0 zombie
    Cpu(s):  4.0% us,  0.5% sy,  0.0% ni, 95.4% id,  0.0% wa,  0.0% hi,  0.0% si

            The uptime command prints the load average as well:

    19:15:10 up 129 days,  5:12, 15 users,  load average: 0.01, 0.09, 0.05    

            According to man uptime, the three load average values are the system load averages for the past 1, 5, and 15 minutes.
            So how are these three values computed? The answer can be found in the Linux source code.
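    2. How Linux computes load average
            Wikipedia's article on load (reference 1 below) describes how Linux computes it. To verify this myself, I checked the relevant code in the Linux source (kernel 2.6.9), working from the top down as follows.
            In kernel/timer.c, the function that computes the system load is:

    // Source tree path: kernel/timer.c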
    /*
     * Hmm.. Changed this, as the GNU make sources (load.c) seems to
     * imply that avenrun[] is the standard name for this kind of thing.
     * Nothing else seems to be standardized: the fractional size etc
     * all seem to differ on different machines.
     *
     * Requires xtime_lock to access.
     */
    unsigned long avenrun[3];
    
    /*
     * calc_load - given tick count, update the avenrun load estimates.
     * This is called while holding a write_lock on xtime_lock.
     */
    static inline void calc_load(unsigned long ticks)
    {
    	unsigned long active_tasks; /* fixed-point */
    	static int count = LOAD_FREQ;
    
    	count -= ticks;
    	if (count < 0) {
    		count += LOAD_FREQ;
    		active_tasks = count_active_tasks();
    		CALC_LOAD(avenrun[0], EXP_1, active_tasks);
    		CALC_LOAD(avenrun[1], EXP_5, active_tasks);
    		CALC_LOAD(avenrun[2], EXP_15, active_tasks);
    	}
    }

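            The code above shows that the array avenrun[] has three elements, which hold the load averages for the past 1, 5, and 15 minutes. calc_load is the function that does the computation. Its ticks parameter is the number of timer ticks that have elapsed since the previous call; the static counter count is decremented by that amount, and a new sample is taken only when the counter drops below zero, which happens once per LOAD_FREQ ticks (5 seconds). At that point the function obtains the current number of active tasks and passes it to CALC_LOAD to update each of the three load averages.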
            Following the call chain, count_active_tasks() is defined as follows (also in kernel/timer.c):

    /*  
     * Nr of active tasks - counted in fixed-point numbers
     */
    static unsigned long count_active_tasks(void)
    {
    	return (nr_running() + nr_uninterruptible()) * FIXED_1;
    }

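            As the source shows, count_active_tasks() returns the current number of active tasks in fixed-point form (scaled by FIXED_1). Active tasks are: 1) runnable tasks, counted by nr_running(), which are either executing on a CPU or waiting in a run queue; 2) tasks in uninterruptible sleep, counted by nr_uninterruptible(), such as tasks blocked on an I/O operation.
            The code that computes nr_running and nr_uninterruptible is in kernel/sched.c:

    // Source tree path: kernel/sched.c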
    /*
     * nr_running, nr_uninterruptible and nr_context_switches:
     *
     * externally visible scheduler statistics: current number of runnable
     * threads, current number of uninterruptible-sleeping threads, total
     * number of context switches performed since bootup.
     */
    unsigned long nr_running(void)
    {
    	unsigned long i, sum = 0;
    
    	for (i = 0; i < NR_CPUS; i++)
    		sum += cpu_rq(i)->nr_running;
    
    	return sum;
    }
    
    unsigned long nr_uninterruptible(void)
    {
    	unsigned long i, sum = 0;
    
    	for_each_cpu(i)
    		sum += cpu_rq(i)->nr_uninterruptible;
    
    	return sum;
    }

            Continuing down the call chain, the definition of CALC_LOAD is in include/linux/sched.h:

    // Source tree path: include/linux/sched.h
    /*
     * These are the constant used to fake the fixed-point load-average
     * counting. Some notes:
     *  - 11 bit fractions expand to 22 bits by the multiplies: this gives
     *    a load-average precision of 10 bits integer + 11 bits fractional
     *  - if you want to count load-averages more often, you need more
     *    precision, or rounding will get you. With 2-second counting freq,
     *    the EXP_n values would be 1981, 2034 and 2043 if still using only
     *    11 bit fractions.
     */
    extern unsigned long avenrun[];		/* Load averages */
    
    #define FSHIFT		11		/* nr of bits of precision */
    #define FIXED_1		(1<<FSHIFT)	/* 1.0 as fixed-point */
    #define LOAD_FREQ	(5*HZ)		/* 5 sec intervals */
    #define EXP_1		1884		/* 1/exp(5sec/1min) as fixed-point */
    #define EXP_5		2014		/* 1/exp(5sec/5min) */
    #define EXP_15		2037		/* 1/exp(5sec/15min) */
    
    #define CALC_LOAD(load,exp,n) \
    	load *= exp; \
    	load += n*(FIXED_1-exp); \
    	load >>= FSHIFT;
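
            The constants and the update rule can be written out as a worked equation. This is my own reading of the header: the comments give only the form 1/exp(5sec/T), so the rounding to the nearest integer and the two-decimal intermediate values are inferred from the numbers, not stated in the source. Here a_k stands for the fixed-point active task count at sample k, and EXP is whichever of the three constants applies.

```latex
\begin{aligned}
\mathrm{EXP\_1}  &= \mathrm{round}\bigl(2^{11}\, e^{-5/60}\bigr)  = \mathrm{round}(1884.25) = 1884 \\
\mathrm{EXP\_5}  &= \mathrm{round}\bigl(2^{11}\, e^{-5/300}\bigr) = \mathrm{round}(2014.15) = 2014 \\
\mathrm{EXP\_15} &= \mathrm{round}\bigl(2^{11}\, e^{-5/900}\bigr) = \mathrm{round}(2036.65) = 2037 \\[4pt]
\mathrm{load}_{k+1} &= \left\lfloor \frac{\mathrm{load}_k \cdot \mathrm{EXP} + a_k \,(\mathrm{FIXED\_1} - \mathrm{EXP})}{2^{11}} \right\rfloor
\end{aligned}
```

            In other words, each value is an exponentially weighted moving average of the active task count, sampled every 5 seconds, and the three values differ only in how quickly old samples decay.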

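            As you can see, CALC_LOAD is a macro, and the resulting load average depends on its three arguments. For any single value (say, the 5-minute load average) exp is a constant, so each update depends only on the previous value and the current number of active tasks. Active tasks are of two kinds: runnable tasks and tasks in uninterruptible sleep.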
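            This matches what I observed on three Linux machines with identical hardware (8 CPUs, 16 GB memory, 1.8 TB disk) and a similar total number of processes (170+ each). On the first machine, one ordinary process ("ordinary" meaning neither CPU-bound nor I/O-bound) was running and the rest were sleeping. On the second, five CPU-bound processes were each at 99% CPU usage and the rest were sleeping. On the third, two processes were reading and writing the disk and the rest were sleeping. The result was clear: all three load average values were highest on the third machine, the second machine came next, and all three values on the first machine were close to 0.
            From this one might further infer that, compared with running tasks, uninterruptible tasks (for example, those doing I/O) have a larger effect on system load. (Note: I have no data or code to support this inference, and count_active_tasks() above weights both kinds of task equally. Corrections are welcome.)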

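    3. Understanding what load average means
            The sections above covered what load average is and how Linux computes it. So how should the value actually be interpreted? The article "Understanding Linux CPU Load - when should you be worried?" (reference 3) gives a detailed and vivid explanation, so I will not repeat it here.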

    [References]
    1. Wikipedia: Load (computing)
    2. Linux source code (kernel version 2.6.9)
    3. Understanding Linux CPU Load - when should you be worried?

    ================== EOF ===================


  • Original article: https://www.cnblogs.com/jiangu66/p/3162865.html