  • Oracle: rocket-speed optimization of a transaction-record existence check

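    In production we often see SQL of the form "select * from tbl_IsExist where date=?". After talking with the developers, we learned that this SQL checks whether a transaction record exists: if it does not exist, a row is inserted; if it does, the program exits.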

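    The front end decides whether matching data exists from the size of the returned list. Below we build a test table and analyze how this existence check can be optimized.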

    1. Build the test table

    --Create the test table
    drop table tbl_IsExist purge;
    create table tbl_IsExist
    (id number not null ,in_create date,in_remark varchar2(1000),
    primary key(id)
    );
    --Insert test data
    insert into tbl_IsExist
    select rownum,to_date(trunc(dbms_random.value(to_number(to_char(sysdate-5,'J')),to_number(to_char(sysdate,'J')))),'J'),rpad('*',1000,'*') from dual connect by rownum<20000;
    --Create the same index as in production
    SQL> create index ind_isexist_create on tbl_isexist(in_create);
    
    Index created.

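    In production the date values are evenly distributed, so the way the in_create column is populated here matches production. Next, run the SQL captured in production and look at its execution plan.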
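
    The even distribution can be checked directly on the test table. A minimal sketch, assuming the table was populated exactly as in step 1; the column alias row_cnt is only illustrative:

    ```sql
    -- Count rows per in_create value. With the data from step 1 the rows
    -- should fall on a few consecutive days in roughly equal shares.
    select in_create, count(*) as row_cnt
      from tbl_isexist
     group by in_create
     order by in_create;
    ```

    Because the dates come from dbms_random, the exact counts differ from run to run.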

    2. The existence-check SQL runs as a full table scan, which is expensive:

    --Set output parameters
    SQL> set linesize 2000
    SQL> set autotrace on;
    --Run the query
    SQL> select * from tbl_isexist where in_create=TO_DATE('2020-09-24 00:00:00','YYYY-MM-DD HH24:MI:SS');
    
    3962 rows selected.
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2271097464
    
    ---------------------------------------------------------------------------------
    | Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |             |  3269 |  1672K|   777   (1)| 00:00:01 |
    |*  1 |  TABLE ACCESS FULL| TBL_ISEXIST |  3269 |  1672K|   777   (1)| 00:00:01 |
    ---------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter("IN_CREATE"=TO_DATE(' 2020-09-24 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
    
    Note
    -----
       - dynamic statistics used: dynamic sampling (level=2)
    
    
    Statistics
    ----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3098  consistent gets
          0  physical reads
          0  redo size
      90787  bytes sent via SQL*Net to client
       3362  bytes received via SQL*Net from client
        266  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       3962  rows processed

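    The table has a usable index, but the SQL does not use it. The execution plan is a full table scan with a cost of 777 and 3098 logical reads (consistent gets). If the table holds many rows, a full-table-scan plan will consume a large number of logical reads.

    3. Start optimizing the existence-check SQL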

    ① Optimization idea 1:

    The table has a usable index. Did the optimizer reject it because its cost is higher? Let's verify:

    --Use a hint to force the index, then check the execution plan
    SQL> select /*+ INDEX(T IND_ISEXIST_CREATE)*/* from tbl_isexist T where in_create=TO_DATE('2020-09-24 00:00:00','YYYY-MM-DD HH24:MI:SS');
    
    3962 rows selected.
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1453123986
    
    ----------------------------------------------------------------------------------------------------------
    | Id  | Operation                           | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                    |                    |  3269 |  1672K|  2004   (1)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TBL_ISEXIST        |  3269 |  1672K|  2004   (1)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN                  | IND_ISEXIST_CREATE |  3269 |       |    10   (0)| 00:00:01 |
    ----------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("IN_CREATE"=TO_DATE(' 2020-09-24 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
    
    Note
    -----
       - dynamic statistics used: dynamic sampling (level=2)
    
    
    Statistics
    ----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2648  consistent gets
          0  physical reads
          0  redo size
    4126831  bytes sent via SQL*Net to client
       3397  bytes received via SQL*Net from client
        266  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       3962  rows processed

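    With the index forced, the cost is 2004, higher than the 777 of the full table scan. The predicate matches 3962 of the 19999 rows, about 20% of the table, so going through the index means a very large number of table accesses by rowid. Oracle's optimizer is cost-based, so it discards the index-scan plan.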
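
    One way to see why the index path is priced higher is to compare the index clustering factor with the table's block and row counts. The sketch below assumes statistics are gathered first (the plans above relied on dynamic sampling) and that the objects live in the current schema; the aliases table_blocks and table_rows are only illustrative:

    ```sql
    -- Gather table and index statistics for the current schema (SQL*Plus).
    exec dbms_stats.gather_table_stats(user, 'TBL_ISEXIST', cascade => true)

    -- A clustering factor much larger than the table's block count means that
    -- rows with the same in_create value are scattered across many blocks.
    select i.index_name, i.clustering_factor, i.leaf_blocks,
           t.blocks as table_blocks, t.num_rows as table_rows
      from user_indexes i
      join user_tables  t on t.table_name = i.table_name
     where i.index_name = 'IND_ISEXIST_CREATE';
    ```

    Because in_create was assigned at random, the rows for any one date are spread over most of the table's blocks, so a clustering factor well above the block count would be expected here.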

    ② Optimization idea 2:

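    Work on the number of rows: does an existence check really need every matching row returned? The Java code decides existence by checking that the size of the list is not 0, so as soon as the first matching row is found, existence is already established.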

    Now rewrite the SQL and look at its execution plan:

    --Fetch only the first matching row by adding the condition rownum=1
    SQL> select * from tbl_isexist where in_create=TO_DATE('2020-09-24 00:00:00','YYYY-MM-DD HH24:MI:SS') and rownum=1;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1281543153
    
    ------------------------------------------------------------------------------------------------------------
    | Id  | Operation                             | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                      |                    |     1 |   524 |     2   (0)| 00:00:01 |
    |*  1 |  COUNT STOPKEY                        |                    |       |       |            |          |
    |   2 |   TABLE ACCESS BY INDEX ROWID BATCHED | TBL_ISEXIST        |  3269 |  1672K|     2   (0)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN                   | IND_ISEXIST_CREATE |       |       |     1   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter(ROWNUM=1)
       3 - access("IN_CREATE"=TO_DATE(' 2020-09-24 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
    
    Note
    -----
       - dynamic statistics used: dynamic sampling (level=2)
    
    
    Statistics
    ----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
       1725  bytes sent via SQL*Net to client
        471  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

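    By fetching only the first matching row instead of all matching rows, the cost drops from 777 to 2 and logical reads drop from 3098 to 3, a speedup of several hundred times (777/2 ≈ 388 by cost, 3098/3 ≈ 1033 by logical reads), which is very gratifying. Still, is there room for more? Looking at the plan: the INDEX RANGE SCAN accounts for a cost of 1, and TABLE ACCESS BY INDEX ROWID BATCHED, that is, the table access by rowid, accounts for the remaining 1 (2-1).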

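    How the table access by rowid works: the queried columns are not all in the index, so Oracle uses the rowid stored in the index entry to locate the table row, fetches the required columns, and returns them to the client.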
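
    Whether a query can skip the table access depends on which columns the index holds. A minimal sketch for listing them from the data dictionary, assuming the index belongs to the current schema:

    ```sql
    -- List the columns stored in the index, in key order.
    select index_name, column_name, column_position
      from user_ind_columns
     where index_name = 'IND_ISEXIST_CREATE'
     order by column_position;
    ```

    For the index created in step 1 this should list only IN_CREATE, so any query that also needs id or in_remark has to visit the table.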

    ③ Optimization idea 3:

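    Avoid the table access to lower the cost by working on the columns the SQL returns. Selecting only in_create, the column of index "IND_ISEXIST_CREATE", lowers the cost and still meets the requirement.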

    SQL> select in_create from tbl_isexist where in_create=TO_DATE('2020-09-24 00:00:00','YYYY-MM-DD HH24:MI:SS') and rownum=1;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3010836204
    
    ----------------------------------------------------------------------------------------
    | Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |                    |     1 |     9 |     1   (0)| 00:00:01 |
    |*  1 |  COUNT STOPKEY    |                    |       |       |            |          |
    |*  2 |   INDEX RANGE SCAN| IND_ISEXIST_CREATE |  3269 | 29421 |     1   (0)| 00:00:01 |
    ----------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter(ROWNUM=1)
       2 - access("IN_CREATE"=TO_DATE(' 2020-09-24 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
    
    Note
    -----
       - dynamic statistics used: dynamic sampling (level=2)
    
    
    Statistics
    ----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        556  bytes sent via SQL*Net to client
        479  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

    That completes the optimization: the cost drops from 777 to 1 and logical reads from 3098 to 2, a rocket-like speedup!
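
    For completeness, the optimized probe can be slotted into the check-then-insert flow described at the start. The PL/SQL block below is only a sketch under stated assumptions: the application in this article performs the check from Java, and the id value 20000, the remark text, and the variable name v_found are made up for illustration.

    ```sql
    declare
      v_found pls_integer;  -- hypothetical flag: 1 if a matching row exists, else 0
    begin
      -- count(*) with rownum = 1 always returns exactly one row holding 0 or 1,
      -- and it only needs the in_create column, which is in the index.
      select count(*)
        into v_found
        from tbl_isexist
       where in_create = to_date('2020-09-24 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
         and rownum = 1;

      -- Insert only when no matching row was found (illustrative values).
      if v_found = 0 then
        insert into tbl_isexist (id, in_create, in_remark)
        values (20000,
                to_date('2020-09-24 00:00:00', 'YYYY-MM-DD HH24:MI:SS'),
                'inserted by existence check');
      end if;
    end;
    /
    ```

    Returning a 0/1 count means the caller never has to handle an empty result set. Two sessions running the check at the same moment could both see 0 and both insert, so if duplicates matter a unique constraint is still needed.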

    For other topics, see the index: https://www.cnblogs.com/handhead/

  • Original article: https://www.cnblogs.com/handhead/p/13755640.html