zoukankan      html  css  js  c++  java
  • [20190401]那个更快的疑问.txt

    [20190401]那个更快的疑问.txt

    --//前一阵子,做了11g于10g下,单表单条记录唯一索引扫描的测试,摘要如下:
    --//参考链接:
    http://blog.itpub.net/267265/viewspace-2636321/
    http://blog.itpub.net/267265/viewspace-2636342/

    1.环境:
    --//当时的测试,在11g下测试结果如下:
    SCOTT@book> select method,count(*),round(avg(TIME_ELA),0),sum(TIME_ELA) from job_times group by method order by 3 ;
    METHOD                 COUNT(*) ROUND(AVG(TIME_ELA),0) SUM(TIME_ELA)
    -------------------- ---------- ---------------------- -------------
    result_cache                 50                   8611        430536
    id=1_unique_index            50                   9494        474714
    null                         50                  10664        533197
    id=1_index                   50                  28160       1407987
    notnull                      50                  29279       1463928

    --//在10g下测试结果如下:
    SCOTT@test> select method,count(*),round(avg(TIME_ELA),0),sum(TIME_ELA) from job_times group by method order by 3 ;
    METHOD                 COUNT(*) ROUND(AVG(TIME_ELA),0) SUM(TIME_ELA)
    -------------------- ---------- ---------------------- -------------
    id=1_unique_index            50                   4864        243192
    notnull                      50                  34134       1706713
    id=1_index                   50                  34703       1735173
    null                         50                  37234       1861717

    --//我的测试环境服务器硬件相同,os版本一样,对比可以发现id=1_unique_index的情况下,10g比11g快了1倍(指id=1_unique_index的情况).
    --//而其他方式下11g都明显快于10g,而10g下除了id=1_unique_index下其他执行方式都可以看到大量cbc latch等待事件.
    --//而11g仅仅在id=1_index,notnull下看到大量cbc latch等待时间,null方式下(全表扫描)的情况下反而看不到cbc
    --//latch等待事件.
    --//我一直再想,我是不是测试方法存在什么问题,或者11g做了什么改进?重复测试唯一索引的情况看看:

    1.环境:
    SCOTT@book> @ ver1
    PORT_STRING                    VERSION        BANNER
    ------------------------------ -------------- --------------------------------------------------------------------------------
    x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

    create table t as select rownum id from dual ;
    create unique index pk_t on t(id);
    create table job_times (sid number, time_ela number,method varchar2(20));
    --//分析表略.

    $ cat m1.txt
    set verify off
    host sleep $(echo &&3/50 | bc -l )
    insert into job_times values ( sys_context ('userenv', 'sid') ,dbms_utility.get_time ,'&&2') ;
    commit ;
    declare
    v_id number;
    v_d date;
    begin
        for i in 1 .. &&1 loop
            select /*+ &&3 */ count (*) into v_id from t where id=1;
            --//select /*+ &&3 */ count (*) into v_id from t ;
            --//select /*+ &&3 */ 1 into v_id from dual ;
            --//select /*+ &&3 */ sysdate into v_d from dual ;
        end loop;
    end ;
    /
    update job_times set time_ela = dbms_utility.get_time - time_ela where sid=sys_context ('userenv', 'sid') and method='&&2';
    commit;
    quit

    $ sqlplus -s -l scott/book @m1.txt 1e6 id=1_unique_index 0 >/dev/null
    SCOTT@book> Select method,count(*),round(avg(TIME_ELA),0),sum(TIME_ELA) from job_times group by method order by 3 ;
    METHOD                 COUNT(*) ROUND(AVG(TIME_ELA),0) SUM(TIME_ELA)
    -------------------- ---------- ---------------------- -------------
    id=1_unique_index             1                   2615          2615

    --//在10g环境下重复上面的步骤略.环境:
    SCOTT@test> @ &r/ver1
    PORT_STRING                    VERSION        BANNER
    ------------------------------ -------------- ----------------------------------------------------------------
    x86_64/Linux 2.4.xx            10.2.0.4.0     Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi

    $ sqlplus -s -l scott/btbtms @m1.txt 1e6 id=1_unique_index 0 >/dev/null

    SCOTT@test> select method,count(*),round(avg(TIME_ELA),0),sum(TIME_ELA) from job_times group by method order by 3 ;
    METHOD                 COUNT(*) ROUND(AVG(TIME_ELA),0) SUM(TIME_ELA)
    -------------------- ---------- ---------------------- -------------
    id=1_unique_index             1                   1535          1535

    --//可以看出即使我单个用户执行相似的sql语句情况下,唯一索引查询10g下明显快于11g.

    2.使用strace跟踪看看:
    --//执行1e6次有点慢,改成1e5看看.

    --//11g的测试:
    $ strace -f -c sqlplus -s -l scott/book @m1.txt 1e5 id=1_unique_index 0 >/dev/null
    ...
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     72.78    0.043919           0    700708           getrusage
     19.80    0.011947           0    200179           times
      6.67    0.004024         671         6         3 wait4
      0.24    0.000147          29         5           clone
      0.18    0.000106           0       375         2 read
      0.16    0.000096           0       264       106 open
    ...
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.060341                902967       206 total

    --//10g的测试:
    $ strace -f -c sqlplus -s -l scott/btbtms @m1.txt 1e5 id=1_unique_index 0 >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     72.00    0.056425           0    700486           getrusage
     24.98    0.019573        2796         7         2 wait4
      0.91    0.000714           3       236           read
      0.74    0.000582          97         6           clone
      0.73    0.000572         286         2           shmdt
      0.23    0.000177           1       302       189 open
      0.16    0.000122           1       145       108 stat
      0.05    0.000042           1        65           write
    ..
      0.03    0.000025           0       116           times
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.097352                702291       347 total

    --//差异在于10g很少做times的系统调用上.10g下仅仅116次.而11g高达200179.

    --//如果再次执行跟踪如下,11g:
    $ strace -f  -p 57003
    times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
    times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
    times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
    times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
    times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
    --//调用7次getrusage,调用2次times.
    --//对比前面的调用比例也可以看出getrusage调用700708,times调用times200179.非常接近7:2

    --//而10g下仅仅看到:
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0

    --//正是这样的差异导致10g下明显快于11g.
    --//顺便看看times输出表示什么?
    $ man 2 times
    NAME
           times - get process times

    SYNOPSIS
           #include <sys/times.h>

           clock_t times(struct tms *buf);

    DESCRIPTION
           times() stores the current process times in the struct tms that buf points to.  The struct tms is as defined in <sys/times.h>:

           struct tms {
                  clock_t tms_utime;  /* user time */
                  clock_t tms_stime;  /* system time */
                  clock_t tms_cutime; /* user time of children */
                  clock_t tms_cstime; /* system time of children */
           };

           The  tms_utime field contains the CPU time spent executing instructions of the calling process.  The tms_stime
           field contains the CPU time spent in the system while executing tasks on behalf of the calling process.  The
           tms_cutime field contains the sum of the tms_utime and tms_cutime values for all waited-for terminated children.
           The tms_cstime field contains the sum of the tms_stime and tms_cstime values for all waited-for terminated
           children.

           Times for terminated children (and their descendants) is added in at the moment wait(2) or waitpid(2) returns
           their process ID. In particular, times of grandchildren that the children did not wait for are never seen.

           All times reported are in clock ticks.

    RETURN VALUE
           times() returns the number of clock ticks that have elapsed since an arbitrary point in the past.  For Linux 2.4
           and earlier this point is the moment the system was booted.  Since Linux 2.6, this point is (2^32/HZ) -  300
           (i.e.,  about  429 million) seconds before system boot time.  The return value may overflow the possible range of
           type clock_t.  On error, (clock_t) -1 is returned, and errno is set appropriately.

    $ cat   /proc/uptime ;uptime
    104375436.65 104342102.07
    10:12:33 up 1208 days,  1:10,  2 users,  load average: 0.01, 0.01, 0.07

    --//我的计算:
    2^32 = 4294967296
    4294967296-300 = 4294966996 ,文档提到 429 million seconds,不对明显相差10被.难道Hz的单位1/10秒吗?不懂.
    4294967296/60-300 = 429496429.6
    10437543665+429496429.6 = 10867040094.6
    --//这样与上面的结果比较接近了,再次验证看看.

    $ strace -f -p 57447 ; cat /proc/uptime
    ...
    getrusage(RUSAGE_SELF, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
    times({tms_utime=2276, tms_stime=382, tms_cutime=0, tms_cstime=0}) = 10865578855
    times({tms_utime=2276, tms_stime=382, tms_cutime=0, tms_cstime=0}) = 10865578855
    getrusage(RUSAGE_SELF, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
    getrusage(RUSAGE_SELF, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
    ^CProcess 57447 detached
    104377639.45 2484149083.69

    4294967296/2593.795/1^6 -300  = 1655562.27747374021462760164

    104377639.45+1655562.27747374021462760164
    10437763945+(4294967296/10-300) = 10867260374.6

    --//存在很大的误差放弃!!

  • 相关阅读:
    计算机视觉概要和卷积神经网络基础概念
    详解类别不平衡问题
    机器学习中几种常见的模型评价指标
    目标检测中常用的IOU、NMS和mAP
    Trie树(字典树)的C++实现
    台大林轩田老师《机器学习技法》课程笔记2:Combining Predictive Features: Aggregation Models
    台大林轩田老师《机器学习技法》课程笔记1:Embedding Numerous Features: Kernel Models
    利用OpenCV在图片和摄像头中检测人脸
    台大林轩田老师《机器学习基石》课程笔记4:How Can Machines Learn Better?
    台大林轩田老师《机器学习基石》课程笔记3:How can machines learn?
  • 原文地址:https://www.cnblogs.com/lfree/p/10637195.html
Copyright © 2011-2022 走看看