zoukankan      html  css  js  c++  java
  • Oracle :Insert ways.

    You Asked

    I'am working on enhancing the performance of some long running 
    queries in our application. The existing code uses the NOT EXISTS 
    clause extensively and I wondered if it would be a good idea to 
    use OUTER JOINS instead, wherever possible. For example, all else 
    being equal, which of the following would execute faster ?
    
    Insert into Title(......)
    select .....
    from rollup R
    where not exists ( select 'Found' from title T 
                       where R.source_id = T.Title_ID);
    
    Insert into Title(.......)
    select ......
    from title T, rollup R
    where R.source_id = T.Title_id(+)
    and T.Title_id is null;
    
    Note that Rollup has about 5 million rows and Title about 
    1 million rows and both are indexed on the join fields.
    In trials, I did notice that the latter was about 30% faster 
    but does that have a theoritical basis ?
    
    Would appreciate an early reply.
    
    Thanks,
    
    Sunder 


     

    and we said...

    There are at least 3 ways to do this above (minus might be a 4'th depending on your 
    circumstances).
    
    The way with NOT exists works well for small OUTER tables (or outer queries that have 
    other predicates the result in "smallish" result sets before we have to do the correlated 
    subquery).  A NOT exists works something like this:
    
      for x in ( select * from rollup ) loop
         if ( not exists ( that query ) ) then
                OUTPUT
         end if;
      end
    
    it in effect runs the subquery over and over and over....
    
    The outer join, method number 2, allows us to do more large scale, bulk type operations.  
    Anything we do in bulk, we can probably do faster.  So, by not running that subquery over 
    and over, we gained some efficiency.  It'll depend on your indexes and such as well how 
    well this goes.
    
    The bulk operations we do -- thats your "theroretical basis".
    
    A 3'rd alternative is a NOT IN query.  As long as you understand that NOT IN and NOT 
    EXISTS are semantically different, this might run even faster yet.  NOT IN will not work 
    correctly if the subquery returns a NULL (the result set will be empty if it does).  So, 
    we have to add a filter "where title_id is NOT NULL" to the subquery to make this be the 
    same.
    
    Here is a very very small test showing the three methods.  Existence (or lack of) indexes 
    may effect this test and so on.  You might consider the NOT IN with the HASH_AJ hint 
    (although using the CBO should have this happen automatically -- if the NOT IN runs 
    really really slow, its not using the HASH_AJ -- you'd have to hint it in that case).
    
    this test is not a benchmark, it is just illustrative of the three methods and contains 
    some comments on their plans.
    
    
    DOC>create table t as select * from all_objects;
    DOC>create index t_idx on t(object_id);
    
    DOC>create table t2 as select * from all_objects where mod(object_id,50) = 0;
    DOC>create index t2_idx on t2(object_id);
    
    DOC>*/
    
    ops$tkyte@DEV8I.WORLD> set autotrace on explain
    ops$tkyte@DEV8I.WORLD> select count(object_name)
      2    from t
      3   where NOT EXISTS( select null
      4                       from t2
      5                      where t2.object_id = t.object_id )
      6  /
    
    COUNT(OBJECT_NAME)
    ------------------
                 22251
    
    Elapsed: 00:00:00.85
    
    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=CHOOSE
       1    0   SORT (AGGREGATE)
       2    1     FILTER
       3    2       TABLE ACCESS (FULL) OF 'T'
       4    2       INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)
    
    Here, it full scans T and does that subquery for each row -- thats the index probe we 
    see in there, the subquery running
    
    ops$tkyte@DEV8I.WORLD> 
    ops$tkyte@DEV8I.WORLD> select /*+ HASH_AJ */ count(object_name)
      2    from t
      3   where object_id NOT IN ( select object_id
      4                              from t2 
      5                             where object_id IS NOT NULL )
      6  /
    
    COUNT(OBJECT_NAME)
    ------------------
                 22251
    
    Elapsed: 00:00:00.67
    
    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 
       1    0   SORT (AGGREGATE)
       2    1     FILTER
       3    2       INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) 
       4    2       INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) 
    
    There is the "anti join" method with may or may not run a little better in your 
    environment.  Its worth trying though
    
    ops$tkyte@DEV8I.WORLD> select count(t.object_name)
      2    from t, t2
      3   where t.object_id = t2.object_id(+)
      4     and t2.object_id is NULL
      5  /
    
    COUNT(T.OBJECT_NAME)
    --------------------
                   22251
    
    Elapsed: 00:00:00.75
    
    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=CHOOSE
       1    0   SORT (AGGREGATE)
       2    1     FILTER
       3    2       NESTED LOOPS (OUTER)
       4    3         TABLE ACCESS (FULL) OF 'T'
       5    3         INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)
    
    And then the outer join method -- in this PARTICULAR example it fell between the not 
    exists and the NOT IN.  Your results may vary ;)
    
    ops$tkyte@DEV8I.WORLD> set autotrace off 
  • 相关阅读:
    SpringBoot集成Redis
    独享锁 & 共享锁
    公平锁与非公平锁
    如何上传本地代码到码云
    SpringBoot+Mybatis+Pagehelper分页
    SpringBoot集成Mybatis(0配置注解版)
    高并发下接口幂等性解决方案
    SpringBoot全局配置文件
    干货
    Spring读取外部的资源配置文件—@PropertySource和@Value实现资源文件配置
  • 原文地址:https://www.cnblogs.com/tracy/p/2110399.html
Copyright © 2011-2022 走看看