  • Performance problems caused by a main-query predicate that cannot be pushed into a subquery

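    This test was run on Oracle 11g R2. It shows the performance problem that arises when a predicate of the main query cannot be pushed down into the second level of a subquery.
    Because the performance problems of the two SQL statements below are so obvious, only their execution plans are analyzed here.
    Steps to reproduce:
    --Test environment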
    create table t1 as select object_id,object_name from dba_objects;
    create table t2 as select object_id,object_name from user_objects;
    create table t3 as select rownum object_id,table_name object_name from user_tables;
    analyze table t1 compute statistics for table for all columns;
    analyze table t2 compute statistics for table for all columns;
    analyze table t3 compute statistics for table for all columns;
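    --Row counts: t1 = 49830; t2 = 37; t3 = 16
    Start of the test:
    1. A fairly simple subquery: the table in the subquery is correlated directly with the main table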
    select /*+gather_plan_statistics*/
    (select t2.object_name from t2 where t2.object_id = t1.object_id) object_name
    from t1;
    Execution plan:
    --------------------------------------------------------------------------------------------
    | Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
    --------------------------------------------------------------------------------------------
    |*  1 |  TABLE ACCESS FULL| T2   |  49830 |      1 |     36 |00:00:01.75 |    149K |     0 |
    |   2 |  TABLE ACCESS FULL| T1   |      1 |  49830 |  49830 |00:00:00.30 |    3546 |   231 |
    --------------------------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    1 - filter("T2"."OBJECT_ID"=:B1)
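    This plan is simple: t1 is scanned first, and each object_id it returns is then used as a filter on t2. The number of times t2 is executed is determined by the number of rows t1 returns (Starts = 49830 for T2). The relationship between t1 and t2 resembles a nested loop, with t2 acting as the inner table.
    Written this way the SQL is fairly inefficient. Moving t2 out of the subquery into the FROM clause and left outer joining it to t1 improves execution efficiency considerably.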
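The left outer join rewrite suggested above can be sketched as follows. This is a minimal illustration, not the author's test: it uses Python's built-in sqlite3 module instead of Oracle, and the four-row t1 and two-row t2 are made-up sample data. It only checks that both forms return the same rows; it says nothing about Oracle's execution plans.

```python
import sqlite3

# In-memory SQLite database with tiny, made-up sample data (assumption:
# t2.object_id is unique, so the scalar subquery returns at most one row).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    create table t1 (object_id integer, object_name text);
    create table t2 (object_id integer, object_name text);
    insert into t1 values (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    insert into t2 values (2, 'B'), (4, 'D');
""")

# Form 1: correlated scalar subquery in the select list, as in the article.
scalar_subquery = """
    select (select t2.object_name from t2 where t2.object_id = t1.object_id) object_name
    from t1
"""

# Form 2: t2 moved into the FROM clause and left outer joined to t1.
left_join = """
    select t2.object_name
    from t1
    left outer join t2 on t2.object_id = t1.object_id
"""

def rows(sql):
    # Sort so that NULLs (None) come first and the two results are comparable.
    return sorted(cur.execute(sql).fetchall(),
                  key=lambda r: (r[0] is not None, r[0] or ""))

subquery_rows = rows(scalar_subquery)
join_rows = rows(left_join)
print(subquery_rows == join_rows)
```

The two forms agree only while each t1 row matches at most one t2 row. With duplicate object_id values in t2, the join would return extra rows, whereas Oracle would reject the scalar subquery with ORA-01427 (single-row subquery returns more than one row).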
    2. A more complex subquery: the main-query table is correlated with the second level of the subquery
    select /*+gather_plan_statistics*/
    (select t2.object_name
    from t2
    where t2.object_name in
    (select t3.object_name from t3 where t1.object_id = t3.object_id)) object_name
    from t1;
    Execution plan:
    ---------------------------------------------------------------------------------------------
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
    ---------------------------------------------------------------------------------------------
    |*  1 |  FILTER            |      |  49830 |        |     15 |00:00:39.89 |   5680K |     2 |
    |   2 |   TABLE ACCESS FULL| T2   |  49830 |     37 |  1843K |00:00:07.89 |    149K |     0 |
    |*  3 |   TABLE ACCESS FULL| T3   |  1843K |      1 |     15 |00:00:27.95 |   5531K |     2 |
    |   4 |  TABLE ACCESS FULL | T1   |      1 |  49830 |  49830 |00:00:00.20 |    3546 |     0 |
    ---------------------------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    1 - filter( EXISTS (SELECT /*+ */ 0 FROM "T3" "T3" WHERE
    "T3"."OBJECT_NAME"=:B1 AND "T3"."OBJECT_ID"=:B2))
    3 - filter("T3"."OBJECT_NAME"=:B1 AND "T3"."OBJECT_ID"=:B2)
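    t1 is executed first, then t2, and finally t3 (predicate 3 shows that t3 depends on the object_id passed in from t1 and the object_name passed in from t2).
    A closer look at Starts and A-Rows shows that t1 returns 49830 rows and t2 is then executed once per row. Because t2 has no join condition with t1 at all, the number of rows returned from t2 is the row count of t2 multiplied by the number of times t2 is executed (the number of rows t1 returns), about 1843K rows.
    This directly determines how many times t3 is executed. In other words, no matter how many rows in t3 satisfy the condition, t3 is scanned (rows in t1) × (rows in t2) times.
    Once t1 and t2 are large enough, the resulting performance is easy to imagine.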
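The Starts and A-Rows figures in the plan follow from the row counts given in the test environment. A small arithmetic sketch (the variable names are illustrative, not from the original):

```python
# Row counts taken from the test environment above.
t1_rows = 49830
t2_rows = 37

# t2 is full-scanned once for every row returned by t1.
t2_starts = t1_rows

# With no join condition between t1 and t2, every scan returns all of t2.
t2_rows_returned = t2_starts * t2_rows

# Each row coming out of t2 triggers one full scan of t3.
t3_starts = t2_rows_returned

print(t2_starts)         # 49830, the Starts value for T2
print(t2_rows_returned)  # 1843710, shown as 1843K in A-Rows for T2
print(t3_starts)         # 1843710, shown as 1843K in Starts for T3
```

The cost therefore grows with the product of the two row counts and does not depend on how many t3 rows actually match.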
    The SQL can be rewritten as follows:
    select temp.object_name
    from t1
    left outer join (select distinct t3.object_id, t2.object_name
                     from t2, t3
                     where t2.object_name = t3.object_name) temp
    on t1.object_id = temp.object_id;
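As a rough check of this rewrite, the sketch below compares the nested subquery with a join against an inline view, with the outer references written against the inline view's alias. It rests on several assumptions: it runs on SQLite through Python's sqlite3 module rather than Oracle, the table contents are made-up sample data, and the inline view is aliased tmp instead of temp because temp is also a schema name in SQLite.

```python
import sqlite3

# In-memory SQLite database with tiny, made-up sample data.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    create table t1 (object_id integer, object_name text);
    create table t2 (object_id integer, object_name text);
    create table t3 (object_id integer, object_name text);
    insert into t1 values (1, 'X1'), (2, 'X2'), (3, 'X3'), (4, 'X4'), (5, 'X5');
    insert into t2 values (10, 'EMP'), (11, 'DEPT'), (12, 'BONUS');
    insert into t3 values (1, 'EMP'), (2, 'DEPT'), (3, 'SALGRADE');
""")

# Original form: t1 is referenced only in the second level of the subquery.
nested = """
    select (select t2.object_name
            from t2
            where t2.object_name in
                  (select t3.object_name from t3 where t1.object_id = t3.object_id)) object_name
    from t1
"""

# Rewritten form: join t2 and t3 once in an inline view, then outer join t1 to it.
rewritten = """
    select tmp.object_name
    from t1
    left outer join (select distinct t3.object_id, t2.object_name
                     from t2, t3
                     where t2.object_name = t3.object_name) tmp
    on t1.object_id = tmp.object_id
"""

def rows(sql):
    # Sort so that NULLs (None) come first and the two results are comparable.
    return sorted(cur.execute(sql).fetchall(),
                  key=lambda r: (r[0] is not None, r[0] or ""))

nested_rows = rows(nested)
rewritten_rows = rows(rewritten)
print(nested_rows == rewritten_rows)
```

The results match here because each t1 row maps to at most one name. If one object_id could map to several names, the join would return several rows for that t1 row, while the scalar subquery form would fail in Oracle with ORA-01427.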
     
  • Original article: https://www.cnblogs.com/zhaoshuangshuang/p/3236682.html