zoukankan      html  css  js  c++  java
  • 直方图统计导致错误的执行计划

    今天下午一哥们遇到个case.他说如下SQL语句

    SELECT WORKITEMID
      FROM WFWIPARTICIPANT
     WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李锦');

    用DBMS_STATS收集统计信息之后会走全表扫描,而用

    analyze table WFWIPARTICIPANT  compute statistics ---收集统计信息会走索引

    analyze table WFWIPARTICIPANT  delete statistics    ----删除统计信息也会走索引,Oracle采用动态采样

    表结构如下:

    SQL> desc WFWIPARTICIPANT;
    Name            Type          Nullable Default Comments
    --------------- ------------- -------- ------- --------
    WIPARTICID      NUMBER                                 
    WORKITEMID      NUMBER        Y                        
    PARTICIPANTTYPE VARCHAR2(20)  Y                        
    PARTICIPANT     VARCHAR2(256) Y                        
    PARTICIPANT2    VARCHAR2(64)  Y                        
    WORKITEMSTATE   NUMBER(2)     Y                        
    PARTIINTYPE     VARCHAR2(20)  Y                        
    EXTEND1         VARCHAR2(64)  Y     

    并且在列partintype 上面有个索引。

    我叫那位哥们把表用exp导出,然后我自己导入到我电脑测试一下,测试过程:

    首先用DBMS_STATS收集统计信息:

    SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,CASCADE=>TRUE);

    PL/SQL procedure successfully completed

    执行该SQL

    SQL> SELECT WORKITEMID
      2    FROM WFWIPARTICIPANT
      3   WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李锦');

    已选择60行。


    执行计划
    ----------------------------------------------------------
    Plan hash value: 1708170390

    -------------------------------------------------------------------------------------
    | Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |                 |   530 |  7420 |   103   (5)| 00:00:02 |
    |*  1 |  TABLE ACCESS FULL| WFWIPARTICIPANT |   530 |  7420 |   103   (5)| 00:00:02 |
    -------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - filter("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR
                  "PARTICIPANT"='771' OR "PARTICIPANT"='99999' OR "PARTICIPANT"='李锦')

    发现确实是走了全表扫描,于是改用ANALYZE 收集统计信息

    SQL> analyze table WFWIPARTICIPANT  delete statistics;

    表已分析。

    SQL> analyze table WFWIPARTICIPANT  compute statistics;

    表已分析。

    SQL> SELECT WORKITEMID
      2    FROM WFWIPARTICIPANT
      3   WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李锦');

    已选择60行。


    执行计划
    ----------------------------------------------------------
    Plan hash value: 1217134846

    ---------------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |                    |    71 |   852 |    64   (0)| 00:00:01 |
    |   1 |  INLIST ITERATOR             |                    |       |       |            |          |
    |   2 |   TABLE ACCESS BY INDEX ROWID| WFWIPARTICIPANT    |    71 |   852 |    64   (0)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN          | WF_IDX_PARTICIPANT |    71 |       |     6   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       3 - access("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR "PARTICIPANT"='771' OR
                  "PARTICIPANT"='99999' OR "PARTICIPANT"='李锦')
    用ANALYZE命令分析之后,发现确实走了索引扫描

    解决问题的思路:遇到这类问题,首先应该查询索引的选择率:

    SQL> select a.owner,a.index_name,a.index_type,partitioned,b.num_rows,b.distinct_keys,b.num_rows/b.distinct_keys avg_row_per_key
      2  ,b.distinct_keys/b.num_rows SELECTIVITY,b.last_analyzed,b.stale_stats from dba_indexes a,dba_ind_statistics b
      3   where a.owner=b.owner and a.index_name=b.index_name and a.index_name='WF_IDX_PARTICIPANT';

    OWNER INDEX_NAME           INDEX_TYPE PARTITIONED   NUM_ROWS DISTINCT_KEYS AVG_ROW_PER_KEY SELECTIVITY LAST_ANALYZED STALE_STATS
    ----- -------------------- ---------- ----------- ---------- ------------- --------------- ----------- ------------- -----------
    TSSA  WF_IDX_PARTICIPANT   NORMAL     NO               57820          4073 14.195924380063 0.070442753 2010-4-15 17: NO

    注意观察,表一共有57820列,但是索引的列上面只有4073个不同的值,也就是说索引选择性为7%,很显然,这列数据分布不均衡。于是猜测该问题和直方图有关。现在查看列上面有没有直方图:

    SQL>  select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
      2  where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';

    OWNER TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT HISTOGRAM       NUM_BUCKETS
    ----- ------------------------------ ------------------------------ ------------ --------------- -----------
    TSSA  WFWIPARTICIPANT                PARTICIPANT                            4073 NONE                      1

    NUM_BUCKETS为1表示该列没有直方图,恩,这里用ANALYZE命令收集的统计信息里面没有直方图信息,于是改用

    DBMS_STATS收集统计信息,看看是否有直方图统计信息:

    SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,CASCADE=>TRUE);

    PL/SQL procedure successfully completed
    SQL>  select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
      2  where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';

    OWNER TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT HISTOGRAM       NUM_BUCKETS
    ----- ------------------------------ ------------------------------ ------------ --------------- -----------
    TSSA  WFWIPARTICIPANT                PARTICIPANT                            4073 HEIGHT BALANCED         254

    注意观察,NUM_BUCKETS=254,Oracle自动的对该列收集了直方图统计信息,于是怀疑直方图的存在影响了执行计划

    ,现在我删掉直方图的统计信息:

    SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,DEGREE=>16,method_opt=>'for columns size 1 PARTICIPANT',CASCADE=>TRUE);

    PL/SQL procedure successfully completed
    SQL> select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
      2  where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';

    OWNER TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT HISTOGRAM       NUM_BUCKETS
    ----- ------------------------------ ------------------------------ ------------ --------------- -----------
    TSSA  WFWIPARTICIPANT                PARTICIPANT                            4073 NONE                      1

    再次运行该查询语句

    SQL> SELECT WORKITEMID
      2    FROM WFWIPARTICIPANT
      3   WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李锦');

    已选择60行。


    执行计划
    ----------------------------------------------------------
    Plan hash value: 1217134846

    ---------------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |                    |    71 |   994 |    64   (0)| 00:00:01 |
    |   1 |  INLIST ITERATOR             |                    |       |       |            |          |
    |   2 |   TABLE ACCESS BY INDEX ROWID| WFWIPARTICIPANT    |    71 |   994 |    64   (0)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN          | WF_IDX_PARTICIPANT |    71 |       |     6   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       3 - access("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR "PARTICIPANT"='771' OR
                  "PARTICIPANT"='99999' OR "PARTICIPANT"='李锦')

    当删掉直通图统计信息之后,优化器选择了我们期望的访问路径

    这个案例提醒我们,对表收集统计信息的时候,要写好参数,另外,我们在对访问路径做优化的时候,首先应该查看的就是索引的选择率,索引的类型,以及列上面关于直方图的统计信息,有了直方图的统计信息并不总是会给我们带来好处,当然也不总是会对我们带来坏处,具体问题具体对待。另外一个值得注意的就是,关于统计信息收集的方式,一定要写好参数,如果Oracle自动去收集了直方图统计信息,而我们不知道,这样对于性能诊断会带来麻烦的。

  • 相关阅读:
    设计模式之工厂模式 练习
    c++智能指针(1)
    记录下 UTF6 GBK 转换函数
    ip白名单 通过* ? 检测IP匹配 轻量级
    stl学习记录(2)
    boost 学习(1)
    stl string 小练习
    stl string 使用指定的分隔符分割成数个子字符串
    [open source] skinbuilder发布
    Builder模式实例分析(C语言版)
  • 原文地址:https://www.cnblogs.com/hehe520/p/6330598.html
Copyright © 2011-2022 走看看