zoukankan      html  css  js  c++  java
  • SQL调优:带函数的谓词导致CBO Cardinality计算误差

    今天处理了这样一问题,where条件中存在函数fun(date)<to_date('9999-01-01','YYYY-MM-DD')这样的无实际意义谓词,导致CBO计算基数时cardinality远小于实际情况,导致优化器认为2个源数据集的基数都不大,从而选择了HASH JOIN Right SEMI+SORT ORDER BY的执行计划,但是由于实际基数远大于computed 计算值所以变成了大的数据集做HASH JOIN并全数据排序,而实际该SQL只要求返回几十行数据而已,使用NESTED LOOP SEMI JOIN可以立即返回排序的前20行数据。 这里就需要解释带函数的谓词时CBO如何计算基数,我们通过下面的例子来说明:    
    create or replace function check_date( RDATE in date)  return date is 
    begin
    IF rdate< to_date('2099-01-01','YYYY-MM-DD') then   return rdate;   ELSIF  rdate >=to_date('2099-01-01','YYYY-MM-DD') then 
     return to_date('2000-01-01');
     end if;
     end check_date;
     /
    
     SQL> select check_date (sysdate) from dual;
    
    CHECK_DAT
    ---------
    06-DEC-12
    
    drop table tab1;
    
    SQL> create table tab1 tablespace users as select * from dba_objects where rownum create view vtab1 as select object_id as id , object_name as name, object_type as type , check_date(created) cdata from tab1;
    
    View created.
    
    SQL> select count(distinct cdata) from vtab1;
    
    COUNT(DISTINCTCDATA)
    --------------------
                     130
    
    SQL> exec dbms_stats.gather_table_stats('','TAB1', method_opt=>'FOR ALL COLUMNS SIZE 254');
    
    PL/SQL procedure successfully completed.
        因为我们指定收集了直方图所以若直接以"created"为条件查询时可以获得较好的计算基数
       SQL> select count(*) from tab1 where  created  >= to_date('0001-10-10','YYYY-MM-DD');
    
      COUNT(*)
    ----------
         10000
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1117438016
    
    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |     1 |     8 |    40   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
    |*  2 |   TABLE ACCESS FULL| TAB1 | 10000 | 80000 |    40   (0)| 00:00:01 |
    ---------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - filter("CREATED">=TO_DATE(' 0001-10-10 00:00:00', 'syyyy-mm-dd
                  hh24:mi:ss'))
    
    Statistics
    ----------------------------------------------------------
              1  recursive calls
              0  db block gets
            133  consistent gets
              0  physical reads
              0  redo size
            526  bytes sent via SQL*Net to client
            523  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
        在以上查询中>= to_date('0001-10-10','YYYY-MM-DD'); 这样的过滤条件实际无意义,在直接使用 "created"列作为谓词的情况下,CBO可以获得很好的基数10000。    
    SQL> select * from TABLE(dbms_xplan.display_cursor(NULL,NULL,'ALLSTATS LAST'));
    
    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SQL_ID  6zy2k9dy4cv73, child number 0
    -------------------------------------
    select /*+ gather_plan_statistics */ count(*) from vtab1 where  cdata
    >= to_date('0001-10-10','YYYY-MM-DD')
    
    Plan hash value: 1117438016
    
    -------------------------------------------------------------------------------------
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.25 |     154 |
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.25 |     154 |
    |*  2 |   TABLE ACCESS FULL| TAB1 |      1 |    500 |  10000 |00:00:00.31 |     154 |
    -------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - filter("CHECK_DATE"("CREATED")>=TO_DATE(' 0001-10-10 00:00:00',
                  'syyyy-mm-dd hh24:mi:ss'))
    
    21 rows selected.
        通过gather_plan_statistics HINT,我们得到E-Rows 即CBO评估的基数,和 A-Rows实际的基数,可以看到这里E-Rows=500, 即在谓词左边存在使用内部函数或隐身装换的情况下,CBO无法通过现有统计信息的DISTINCT、DENSITY和HISTOGRAM获得较好的Cardinality,其基数总是统计信息中表的总行数/20,如上例中的 10000/20=500。 这就会引入不少的麻烦,因为开发人员有时候为了方便会在视图字段中嵌入自定义的函数,之后若在查询中使用该字段作为谓词条件,则可能导致CBO为相应表计算的基数偏少,是本身应当成本非常高的执行计划的COST变低,而容易被优化器选择。 对于上述问题可选的常见方案是若有这样问题的SQL较少则考虑加HINT或者SQL PROFILE,若较多还是需要考虑减少这种谓词左边有函数的现象。 implicit data_type conversion functions in Filter Predicates. Review Execution Plans. If Filter Predicatesinclude unexpected INTERNAL_FUNCTION to perform an implicit data_type conversion, be sure it is not preventing a column from being used as an Access Predicate.
  • 相关阅读:
    June 26th 2017 Week 26th Monday
    June 25th 2017 Week 26th Sunday
    June 24th 2017 Week 25th Saturday
    June 23rd 2017 Week 25th Friday
    June 22nd 2017 Week 25th Thursday
    2018最佳网页设计:就是要你灵感爆棚!!!
    图片素材类Web原型制作分享-Pexels
    想要打动HR的心,UX设计师求职信究竟应该怎么写?
    【UXPA大赛企业专访】Mockplus:“设计替代开发”将成为现实
    2018年最好的医疗网站设计及配色赏析
  • 原文地址:https://www.cnblogs.com/macleanoracle/p/2968127.html
Copyright © 2011-2022 走看看