zoukankan      html  css  js  c++  java
  • Oracle Update语句中多表关联中被关联表多次全表扫描问题

    Oracle Update语句中多表关联中被关联表多次全表扫描问题

    前言

    最近优化了一个update语句中,多表关联导致表多次全表扫描的性能问题。

    尝试用merge into改写后发现原来不知道多久能运行完的语句达到秒级别执行完,因为merge into可以避免多次的全表扫描。

    比较好模拟,接下来模拟一下,也好记录一下有些遇到的小问题。

    事故模拟

    这个是我实际生产优化的数据和语句的模拟。

    drop table t1 purge;
    drop table t2 purge;
    create table t1 as select * from dba_objects where rownum<=100;
    create table t2 as select * from dba_objects where rownum<=50;
    View Code
    10:34:30 SYS@zkm(1)> drop table t1 purge;
    
    Table dropped.
    
    Elapsed: 00:00:00.01
    10:35:27 SYS@zkm(1)> drop table t2 purge;
    
    Table dropped.
    
    Elapsed: 00:00:00.01
    10:35:27 SYS@zkm(1)> create table t1 as select * from dba_objects where rownum<=100;
    
    Table created.
    
    Elapsed: 00:00:00.01
    10:35:27 SYS@zkm(1)> create table t2 as select * from dba_objects where rownum<=50;
    
    Table created.
    
    Elapsed: 00:00:00.01

    update的语句如下:

    update t1
       set (t1.owner, t1.object_name) =
           (select t2.owner, t2.object_name
              from t2
             where t1.object_id = t2.object_id)
     where exists (select 1 from t2 where t1.object_id = t2.object_id);

    开启statistics_level为all,看看执行后对被关联表t2的执行次数是多少次。

    10:43:41 SYS@zkm(1)> set pagesize 9999 long 9999 line 500 timing on
    10:45:14 SYS@zkm(1)> alter session set statistics_level=all;
    
    Session altered.
    
    Elapsed: 00:00:00.00
    10:45:14 SYS@zkm(1)> update t1
    10:45:19   2     set (t1.owner, t1.object_name) =
    10:45:19   3         (select t2.owner, t2.object_name
    10:45:19   4            from t2
    10:45:19   5           where t1.object_id = t2.object_id)
    10:45:19   6   where exists (select 1 from t2 where t1.object_id = t2.object_id);
    
    50 rows updated.
    
    Elapsed: 00:00:00.00
    10:45:20 SYS@zkm(1)> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    
    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------------------------------------
    SQL_ID  aq3ptnvdvmnpq, child number 0
    -------------------------------------
    update t1    set (t1.owner, t1.object_name) =        (select t2.owner,
    t2.object_name           from t2          where t1.object_id =
    t2.object_id)  where exists (select 1 from t2 where t1.object_id =
    t2.object_id)
    
    Plan hash value: 1571583455
    
    ----------------------------------------------------------------------------------------------------------------------
    | Id  | Operation             | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
    ----------------------------------------------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT      |         |      1 |        |      0 |00:00:00.01 |     209 |       |       |          |
    |   1 |  UPDATE               | T1      |      1 |        |      0 |00:00:00.01 |     209 |       |       |          |
    |*  2 |   HASH JOIN RIGHT SEMI|         |      1 |    100 |     50 |00:00:00.01 |       7 |  2440K|  2440K| 1515K (0)|
    |   3 |    VIEW               | VW_SQ_1 |      1 |     50 |     50 |00:00:00.01 |       3 |       |       |          |
    |   4 |     TABLE ACCESS FULL | T2      |      1 |     50 |     50 |00:00:00.01 |       3 |       |       |          |
    |   5 |    TABLE ACCESS FULL  | T1      |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
    |*  6 |   TABLE ACCESS FULL   | T2      |     50 |      1 |     50 |00:00:00.01 |     150 |       |       |          |
    ----------------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("T1"."OBJECT_ID"="ITEM_1")
       6 - filter("T2"."OBJECT_ID"=:B1)
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    
    31 rows selected.
    
    Elapsed: 00:00:00.01
    10:45:22 SYS@zkm(1)> rollback;
    
    Rollback complete.
    
    Elapsed: 00:00:00.01
    10:45:24 SYS@zkm(1)> 

    可以看到id=6的starts的值为50,表示id为6的t2全表扫描被执行了50次,在生产事故中,t2表是比较大的,并且扫描次数相对较多,因此update执行时间是非常久的。

    这里id=1的子步骤为id=2和id=6,此处的id=1的UPDATE操作类似和“NESTED LOOPS”一样,id=2出有多少条符合条件的记录,则id=6就执行多少次。

    比如这里的,t1被update的记录通过“where exists (select 1 from t2 where t1.object_id = t2.object_id)”后,仍有50行符合可以被update,因此这50行中每行都会去全表扫t2一次。

    10:56:00 SYS@zkm(1)> select count(*) from t1 where exists (select 1 from t2 where t1.object_id = t2.object_id);
    
      COUNT(*)
    ----------
            50
    
    Elapsed: 00:00:00.00

    使用merge into改造后,效率变高(数据量小看不出执行时间的差别,你可以自己构造巨量数据去比较,这里主要看t2全表扫描的次数),

    merge into t1
    using t2
    on (t1.object_id = t2.object_id)
    when matched then
      update set t1.owner = t2.owner, t1.object_name = t2.object_name;
    View Code
    11:00:23 SYS@zkm(1)> merge into t1
    11:00:23   2  using t2
    11:00:23   3  on (t1.object_id = t2.object_id)
    11:00:23   4  when matched then
    11:00:23   5    update set t1.owner = t2.owner, t1.object_name = t2.object_name;
    
    50 rows merged.
    
    Elapsed: 00:00:00.00
    11:00:23 SYS@zkm(1)> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    
    PLAN_TABLE_OUTPUT
    ----------------------------------------------------------------------------------------------------------------------
    SQL_ID  agt8fym6y8hug, child number 0
    -------------------------------------
    merge into t1 using t2 on (t1.object_id = t2.object_id) when matched
    then   update set t1.owner = t2.owner, t1.object_name = t2.object_name
    
    Plan hash value: 2683531971
    
    ------------------------------------------------------------------------------------------------------------------
    | Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
    ------------------------------------------------------------------------------------------------------------------
    |   0 | MERGE STATEMENT      |      |      1 |        |      0 |00:00:00.01 |      59 |       |       |          |
    |   1 |  MERGE               | T1   |      1 |        |      0 |00:00:00.01 |      59 |       |       |          |
    |   2 |   VIEW               |      |      1 |        |     50 |00:00:00.01 |       7 |       |       |          |
    |*  3 |    HASH JOIN         |      |      1 |     50 |     50 |00:00:00.01 |       7 |   870K|   870K| 1289K (0)|
    |   4 |     TABLE ACCESS FULL| T2   |      1 |     50 |     50 |00:00:00.01 |       3 |       |       |          |
    |   5 |     TABLE ACCESS FULL| T1   |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
    ------------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    
    27 rows selected.
    
    Elapsed: 00:00:00.01
    11:00:26 SYS@zkm(1)> rollback;
    
    Rollback complete.
    
    Elapsed: 00:00:00.00

    对比可以知道,merge into的t2只扫描了一次,并且其他维度比如Buffers,OMem,1Mem,Used-Mem都减少了。

    merge into的限制

    重新构造另外的数据环境。

    drop table t1 purge;
    drop table t2 purge;
    
    create table t1 ( id number,name varchar2(20));
    create table t2 ( id number,name varchar2(20));
    
    insert into t1 values (1,'t1a');
    insert into t1 values (2,'t1b');
    insert into t1 values (3,'t1c');
    insert into t1 values (4,'t1d');
    insert into t1 values (5,'t1e');
    insert into t1 values (6,'t1f');
     
    
    insert into t2 values (2,'t2a');
    insert into t2 values (3,'t2b');
    insert into t2 values (4,'t2c');
    insert into t2 values (5,'t2d');
    insert into t2 values (6,'t2e');
    insert into t2 values (7,'t2f');
    commit;
    View Code
    11:04:36 SYS@zkm(1)> drop table t1 purge;
    
    Table dropped.
    
    Elapsed: 00:00:00.01
    11:04:42 SYS@zkm(1)> drop table t2 purge;
    
    Table dropped.
    
    Elapsed: 00:00:00.01
    11:04:43 SYS@zkm(1)> create table t1 ( id number,name varchar2(20));
    
    Table created.
    
    Elapsed: 00:00:00.01
    11:04:49 SYS@zkm(1)> create table t2 ( id number,name varchar2(20));
    
    Table created.
    
    Elapsed: 00:00:00.00
    11:04:49 SYS@zkm(1)> insert into t1 values (1,'t1a');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:52 SYS@zkm(1)> insert into t1 values (2,'t1b');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:52 SYS@zkm(1)> insert into t1 values (3,'t1c');
    
    1 row created.
    
    Elapsed: 00:00:00.01
    11:04:52 SYS@zkm(1)> insert into t1 values (4,'t1d');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:52 SYS@zkm(1)> insert into t1 values (5,'t1e');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:52 SYS@zkm(1)> insert into t1 values (6,'t1f');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:53 SYS@zkm(1)> insert into t2 values (2,'t2a');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:56 SYS@zkm(1)> insert into t2 values (3,'t2b');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:56 SYS@zkm(1)> insert into t2 values (4,'t2c');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:56 SYS@zkm(1)> insert into t2 values (5,'t2d');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:56 SYS@zkm(1)> insert into t2 values (6,'t2e');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:56 SYS@zkm(1)> insert into t2 values (7,'t2f');
    
    1 row created.
    
    Elapsed: 00:00:00.00
    11:04:56 SYS@zkm(1)> commit;
    
    Commit complete.
    
    Elapsed: 00:00:00.00
    11:04:57 SYS@zkm(1)> select * from t1;
    
            ID NAME
    ---------- ------------------------------------------------------------
             1 t1a
             2 t1b
             3 t1c
             4 t1d
             5 t1e
             6 t1f
    
    6 rows selected.
    
    Elapsed: 00:00:00.00
    11:05:15 SYS@zkm(1)> select * from t2;
    
            ID NAME
    ---------- ------------------------------------------------------------
             2 t2a
             3 t2b
             4 t2c
             5 t2d
             6 t2e
             7 t2f
    
    6 rows selected.
    
    Elapsed: 00:00:00.00

    对于update语句,

    update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id) where exists (select t2.id,t2.name from t2 where t1.id=t2.id);

     无法改造成merge into,如下:

    merge into t1
    using t2
    on (t1.id = t2.id)
    when matched then
      update set t1.id = t2.id, t1.name = t2.name;

    因为merge into中不能对匹配字段id进行update,会报ORA-38104,

    11:12:44 SYS@zkm(1)> merge into t1
    11:13:18   2  using t2
    11:13:18   3  on (t1.id = t2.id)
    11:13:18   4  when matched then
    11:13:18   5    update set t1.id = t2.id, t1.name = t2.name;
    on (t1.id = t2.id)
        *
    ERROR at line 3:
    ORA-38104: Columns referenced in the ON Clause cannot be updated: "T1"."ID"
    
    
    Elapsed: 00:00:00.00

    这种情况下,我只能想到在t2的id字段创建索引,避免多次的全表扫描。

    11:15:33 SYS@zkm(1)> set pagesize 9999 long 9999 line 500 timing on
    11:15:34 SYS@zkm(1)> alter session set statistics_level=all;
    
    Session altered.
    
    Elapsed: 00:00:00.00
    11:15:34 SYS@zkm(1)> update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id) where exists (select t2.id,t2.name from t2 where t1.id=t2.id);
    
    5 rows updated.
    
    Elapsed: 00:00:00.00
    11:15:35 SYS@zkm(1)> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    
    PLAN_TABLE_OUTPUT
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SQL_ID  c124snsjck4d6, child number 1
    -------------------------------------
    update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where
    t1.id=t2.id) where exists (select t2.id,t2.name from t2 where
    t1.id=t2.id)
    
    Plan hash value: 2611650519
    
    -------------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    -------------------------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT             |        |      1 |        |      0 |00:00:00.01 |      22 |
    |   1 |  UPDATE                      | T1     |      1 |        |      0 |00:00:00.01 |      22 |
    |*  2 |   FILTER                     |        |      1 |        |      5 |00:00:00.01 |       9 |
    |   3 |    TABLE ACCESS FULL         | T1     |      1 |      6 |      6 |00:00:00.01 |       3 |
    |*  4 |    INDEX RANGE SCAN          | IDX_ID |      6 |      1 |      5 |00:00:00.01 |       6 |
    |   5 |   TABLE ACCESS BY INDEX ROWID| T2     |      5 |      1 |      5 |00:00:00.01 |      10 |
    |*  6 |    INDEX RANGE SCAN          | IDX_ID |      5 |      1 |      5 |00:00:00.01 |       5 |
    -------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - filter( IS NOT NULL)
       4 - access("T2"."ID"=:B1)
       6 - access("T2"."ID"=:B1)
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    
    31 rows selected.
    
    Elapsed: 00:00:00.01
    11:15:36 SYS@zkm(1)> rollback;
    
    Rollback complete.
    
    Elapsed: 00:00:00.00

    注意点

    一开始调试的时候,我以为where条件是可以去掉的,但其实不对。

    什么意思呢?比如

    update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id) where exists (select t2.id,t2.name from t2 where t1.id=t2.id);

     我以为是和

    update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id);

     等价的,但是其实不等价。

    实际执行看看就知道了,

    11:19:34 SYS@zkm(1)> drop index idx_id;
    
    Index dropped.
    
    Elapsed: 00:00:00.01
    11:22:37 SYS@zkm(1)> update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id) where exists (select t2.id,t2.name from t2 where t1.id=t2.id);
    
    5 rows updated.
    
    Elapsed: 00:00:00.01
    11:22:39 SYS@zkm(1)> rollback;
    
    Rollback complete.
    
    Elapsed: 00:00:00.00
    11:22:42 SYS@zkm(1)> update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id);
    
    6 rows updated.
    
    Elapsed: 00:00:00.00
    11:22:46 SYS@zkm(1)> rollback;
    
    Rollback complete.
    
    Elapsed: 00:00:00.00

    为什么?

    where exists实际上是锁定其批量更新数据的范围,update的时候是根据被update的表t1的每一条数据去做更新的。

    比如语句update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id),没有任何条件的话那么肯定是全部的行都要被进行update的,在子查询中如果不符合t1.id=t2.id的行,被更新字段是会被值为null的,如下:

    11:37:31 SYS@zkm(1)> select * from t1;
    
            ID NAME
    ---------- ------------------------------------------------------------
             1 t1a
             2 t1b
             3 t1c
             4 t1d
             5 t1e
             6 t1f
    
    6 rows selected.
    
    Elapsed: 00:00:00.00
    11:37:33 SYS@zkm(1)> update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id);
    
    6 rows updated.
    
    Elapsed: 00:00:00.00
    11:37:36 SYS@zkm(1)> select * from t1;
    
            ID NAME
    ---------- ------------------------------------------------------------
    
             2 t2a
             3 t2b
             4 t2c
             5 t2d
             6 t2e
    
    6 rows selected.
    
    Elapsed: 00:00:00.00
    11:37:39 SYS@zkm(1)> rollback;
    
    Rollback complete.
    
    Elapsed: 00:00:00.00

    因为id=1在t2中匹配不到(可以理解为t2表中为null),但是又必须被更新(前边说了没限定条件是要全部被更新的),所以被设置为null。

    可以这么看,当取出t1表中第一行(id=1,name=t1a)后,将该行中t1.id和t2.id进行等价匹配(这次进行一次t2的全表扫描),发现没有符合条件的数据,因此只能都设置为null。

    因此,where exists的限定是必须要有的,我们来看看,

    11:40:55 SYS@zkm(1)> select * from t1;
    
            ID NAME
    ---------- ------------------------------------------------------------
             1 t1a
             2 t1b
             3 t1c
             4 t1d
             5 t1e
             6 t1f
    
    6 rows selected.
    
    Elapsed: 00:00:00.00
    11:41:09 SYS@zkm(1)> update t1 set (t1.id,t1.name)=(select t2.id,t2.name from t2 where t1.id=t2.id) where exists (select t2.id,t2.name from t2 where t1.id=t2.id);
    
    5 rows updated.
    
    Elapsed: 00:00:00.00
    11:41:10 SYS@zkm(1)> select * from t1;
    
            ID NAME
    ---------- ------------------------------------------------------------
             1 t1a
             2 t2a
             3 t2b
             4 t2c
             5 t2d
             6 t2e
    
    6 rows selected.
    
    Elapsed: 00:00:00.00
    11:41:14 SYS@zkm(1)> rollback;
    
    Rollback complete.
    
    Elapsed: 00:00:00.00

    可以这么看,对t1全表扫描后,取出的数据首先判断是否满足where exists的条件。

    比如第一行(id=1,name=t1a)不满足where exists (select t2.id,t2.name from t2 where t1.id=t2.id),因此,不对这一行做update。

    剩下的5行全部满足条件,所以进行update。

  • 相关阅读:
    DropDownList判断值是否存在下拉列表中
    postgre教程
    Cookie seesion 赋值
    Winform定时启动
    ASP.NET数据绑定控件
    ASP.NET常用数据绑定控件优劣总结
    Cards and Joy (dp好题)
    River Hopscotch (二分)
    剪花布条(KMP入门)
    GCD (区间数的质因子打表+容斥原理)
  • 原文地址:https://www.cnblogs.com/PiscesCanon/p/13527949.html
Copyright © 2011-2022 走看看