zoukankan      html  css  js  c++  java
  • 用linux perf命令来分析程序的cpu cache miss现象

    #include <stdio.h>
    #include <unistd.h>
     
    int main(int argc, char **argv)
    {
        int a[1000][1000];
        if(1 == argc)
        {
            for(int i = 0; i < 1000; ++i)
            {
                    for(int j = 0; j < 1000; ++j)
                    {
                            a[i][j] = 0;
                    }
            }
        }
        else
        {
            for(int i = 0; i < 1000; ++i)
            {
                    for(int j = 0; j < 1000; ++j)
                    {
                            a[j][i] = 0;
                    }
            }
        }
     
        return 0;
    }
     

     上面有两个小程序片段, 哪段效率高? 显然, 第一段效率高, 为什么呢? 因为在C/C++中,数组是按行存储的,程序的按行访问可以充分利用程序的局部性原理(空间局部性), 用time命令来看看结果:

    [root@bogon c++]# g++ miss.c  -o miss
    [root@bogon c++]# ./miss 
    [root@bogon c++]# time ./miss 
    
    real    0m0.009s
    user    0m0.009s
    sys     0m0.000s
    [root@bogon c++]# time ./miss 1
    
    real    0m0.013s
    user    0m0.013s
    sys     0m0.000s
    [root@bogon c++]# time ./miss 
    
    real    0m0.010s
    user    0m0.010s
    sys     0m0.000s
    [root@bogon c++]# time ./miss 1
    
    real    0m0.013s
    user    0m0.013s
    sys     0m0.000s
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss
    
     Performance counter stats for './miss':
    
                88,780      L1-dcache-load-misses                                       
    
           0.009002291 seconds time elapsed
    
           0.009174000 seconds user
           0.000000000 seconds sys
    
    
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss 1
    
     Performance counter stats for './miss 1':
    
             1,015,683      L1-dcache-load-misses                                       
    
           0.012000335 seconds time elapsed
    
           0.006059000 seconds user
           0.006059000 seconds sys
    
    
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss 1
    
     Performance counter stats for './miss 1':
    
             1,015,363      L1-dcache-load-misses                                       
    
           0.012145156 seconds time elapsed
    
           0.006134000 seconds user
           0.006134000 seconds sys
    
    
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss 0
    
     Performance counter stats for './miss 0':
    
             1,011,740      L1-dcache-load-misses                                       
    
           0.012363858 seconds time elapsed
    
           0.012484000 seconds user
           0.000000000 seconds sys
    
    
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss 0
    
     Performance counter stats for './miss 0':
    
             1,015,347      L1-dcache-load-misses                                       
    
           0.012348778 seconds time elapsed
    
           0.006237000 seconds user
           0.006237000 seconds sys
  • 相关阅读:
    【QTP】自动化测试:
    sql基本语句
    【转】ASP.NET网站怎么发布web项目程序和怎么部署
    NHibernate的简单例子
    解决ehcache的UpdateChecker问题
    正则表达式的贪婪与懒惰
    Linux查找文件夹名
    centos安装lxml和pyspider
    如何通过写一个chrome扩展启动本地程序
    网页图片滚动效果
  • 原文地址:https://www.cnblogs.com/dream397/p/14694998.html
Copyright © 2011-2022 走看看