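  • [Repost] Oracle IN vs. EXISTS and NOT IN vs. NOT EXISTS: how they work and how they perform

    Reposted from http://www.2cto.com/database/201310/251176.html

    Many people are still confused about IN versus EXISTS and NOT IN versus NOT EXISTS. Some go as far as banning NOT IN outright and requiring NOT EXISTS everywhere. Is NOT EXISTS really more efficient?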

    [Experiment 1: IN vs. EXISTS, mechanism and performance]

    Prepare the data

    create table test1 as select * from dba_objects where rownum <=1000;

    create table test2 as select * from dba_objects;

    exec dbms_stats.gather_table_stats(user,'test1');

    exec dbms_stats.gather_table_stats(user,'test2');

    set autotrace traceonly

    IN query

    select * from test1 t1 where t1.object_id in (select t2.object_id from test2 t2);

    Execution Plan

    ----------------------------------------------------------
    Plan hash value: 3819917785

    ----------------------------------------------------------------------------
    | Id | Operation        | Name | Rows | Bytes | Cost (%CPU)| Time |
    ----------------------------------------------------------------------------
    | 0 | SELECT STATEMENT   |     | 1000 | 90000 | 307 (1)  | 00:00:04 |
    |* 1 | HASH JOIN SEMI    |     | 1000 | 90000 | 307 (1)  | 00:00:04 |
    | 2 | TABLE ACCESS FULL  | TEST1 | 1000 | 85000 | 6 (0)   | 00:00:01 |
    | 3 | TABLE ACCESS FULL  | TEST2 | 73119 | 357K| 301 (1)  | 00:00:04 |
    ----------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

    1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")


    Statistics
    ----------------------------------------------------------
    1 recursive calls
    0 db block gets
    98 consistent gets
    0 physical reads
    0 redo size
    50936 bytes sent via SQL*Net to client
    1226 bytes received via SQL*Net from client
    68 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    1000 rows processed

    EXISTS query

    select * from test1 t1 where exists(select 1 from test2 t2 where t1.object_id=t2.object_id);

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3819917785

    ----------------------------------------------------------------------------
    | Id | Operation        | Name | Rows | Bytes | Cost (%CPU)| Time |
    ----------------------------------------------------------------------------
    | 0 | SELECT STATEMENT   |     | 1000 | 90000 | 307 (1)  | 00:00:04 |
    |* 1 | HASH JOIN SEMI    |     | 1000 | 90000 | 307 (1)  | 00:00:04 |
    | 2 | TABLE ACCESS FULL  | TEST1 | 1000 | 85000 | 6 (0)   | 00:00:01 |
    | 3 | TABLE ACCESS FULL  | TEST2 | 73119 | 357K| 301 (1)  | 00:00:04 |
    ----------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

    1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")


    Statistics
    ----------------------------------------------------------
    1 recursive calls
    0 db block gets
    98 consistent gets
    0 physical reads
    0 redo size
    50936 bytes sent via SQL*Net to client
    1226 bytes received via SQL*Net from client
    68 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    1000 rows processed

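    Conclusion:

    In Oracle 11g, IN and EXISTS are effectively the same here: both are executed as a HASH JOIN SEMI between the two tables, with the same plan hash value and the same 98 consistent gets. The 10053 event also shows that the optimizer transforms both statements into the same final SQL.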
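
    The 10053 check mentioned above can be sketched roughly as follows. This is a minimal sketch, not the original author's exact steps: the trace file identifier is an arbitrary example name, and wrapping each query in EXPLAIN PLAN is an assumption made here to get a fresh hard parse, since the optimizer trace is only written when a statement is hard parsed.

    ```sql
    -- Tag the trace file so it is easy to find (the identifier is an arbitrary example name).
    alter session set tracefile_identifier = 'in_vs_exists';

    -- Turn on the optimizer (10053) trace for this session.
    alter session set events '10053 trace name context forever, level 1';

    -- Each statement is optimized here, which writes its section of the trace.
    explain plan for
    select * from test1 t1 where t1.object_id in (select t2.object_id from test2 t2);

    explain plan for
    select * from test1 t1 where exists (select 1 from test2 t2 where t1.object_id = t2.object_id);

    -- Turn the trace off again.
    alter session set events '10053 trace name context off';

    -- Locate the trace file written by this session (11g and later).
    select value from v$diag_info where name = 'Default Trace File';
    ```

    In the trace file, the "Final query after transformations" section of each statement is the part to compare.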

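    [Experiment 2: NOT IN vs. NOT EXISTS, mechanism and performance]

    An example where NOT EXISTS is supposed to be more efficient than NOT IN (I followed the reposted article's experiment, but my execution plans did not match the article's; the two statements turned out equally efficient, probably because my version, 11g, is newer than the one used in the original article)

    Keep the data in test1 and test2 unchanged: 1,000 rows and 70,000+ rows respectively.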

    select count(*) from test1 where object_id not in (select object_id from test2);

    select count(*) from test1 t1 where not exists(select 1 from test2 t2 where t1.object_id=t2.object_id);

    Both statements get the same execution plan (omitted here), so they perform the same.
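
    Since the plans are omitted, here is one way a reader could reproduce the comparison. It is a sketch that uses EXPLAIN PLAN with DBMS_XPLAN.DISPLAY instead of the autotrace setting used elsewhere in this post; results depend on the Oracle version and on the statistics.

    ```sql
    -- Plan for the NOT IN form.
    explain plan for
    select count(*) from test1 where object_id not in (select object_id from test2);

    select * from table(dbms_xplan.display);

    -- Plan for the NOT EXISTS form.
    explain plan for
    select count(*) from test1 t1
    where not exists (select 1 from test2 t2 where t1.object_id = t2.object_id);

    select * from table(dbms_xplan.display);
    ```

    Compare the join operation at Id 2 and the cost columns of the two outputs. If object_id is nullable, the NOT IN plan may show a null-aware variant of the anti join.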

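    An example where NOT IN is more efficient than NOT EXISTS (again my result differed from the reposted article: the two were still equally efficient, and NOT IN only showed an advantage after I used a hint to change its execution plan)

    Prepare the data

    Create tables t1 and t2 with the same structure as test1 and test2, but with 5 rows in t1 and 200,000+ rows in t2.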
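
    The original post does not show how t1 and t2 were built. One possible way, mirroring the setup of test1 and test2, is sketched below. The doubling inserts and the number of repetitions are assumptions; adjust them until the row count is large enough.

    ```sql
    -- Assumption: same source view as test1/test2, with only 5 rows in t1.
    create table t1 as select * from dba_objects where rownum <= 5;
    create table t2 as select * from dba_objects;

    -- Assumption: grow t2 by re-inserting its own contents.
    -- Each run doubles the row count; repeat as needed.
    insert into t2 select * from t2;
    insert into t2 select * from t2;
    commit;

    -- Check that t2 now holds more than 200,000 rows.
    select count(*) from t2;

    -- Refresh optimizer statistics, as was done for test1 and test2.
    exec dbms_stats.gather_table_stats(user,'t1');
    exec dbms_stats.gather_table_stats(user,'t2');
    ```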

    select count(*) from t1 where object_id not in (select /*+ no_unnest */ object_id from t2);

    -- Note: without the hint to change the plan, both statements still get the same execution plan.


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 59119136

    ----------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    ----------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 1 | 3 | 755 (1)| 00:00:10 |
    | 1 | SORT AGGREGATE | | 1 | 3 | | |
    |* 2 | FILTER | | | | | |
    | 3 | TABLE ACCESS FULL| T1 | 5 | 15 | 3 (0)| 00:00:01 |
    |* 4 | TABLE ACCESS FULL| T2 | 2 | 10 | 301 (1)| 00:00:04 |
    ----------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

    2 - filter( NOT EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T2" "T2"
    WHERE LNNVL("OBJECT_ID"<>:B1)))
    4 - filter(LNNVL("OBJECT_ID"<>:B1))


    Statistics
    ----------------------------------------------------------
    1 recursive calls
    0 db block gets
    23 consistent gets
    0 physical reads
    0 redo size
    522 bytes sent via SQL*Net to client
    500 bytes received via SQL*Net from client
    2 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    1 rows processed

    select count(*) from t1 where not exists (select 1 from t2 where t1.object_id=t2.object_id);

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1513027705

    ----------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    ----------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 1 | 8 | 2376 (1)| 00:00:29 |
    | 1 | SORT AGGREGATE | | 1 | 8 | | |
    |* 2 | HASH JOIN ANTI | | 1 | 8 | 2376 (1)| 00:00:29 |
    | 3 | TABLE ACCESS FULL| T1 | 5 | 15 | 3 (0)| 00:00:01 |
    | 4 | TABLE ACCESS FULL| T2 | 584K| 2856K| 2371 (1)| 00:00:29 |
    ----------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

    2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")


    Statistics
    ----------------------------------------------------------
    1 recursive calls
    0 db block gets
    8599 consistent gets
    0 physical reads
    0 redo size
    522 bytes sent via SQL*Net to client
    500 bytes received via SQL*Net from client
    2 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    1 rows processed

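    Conclusion

    In 11g, with data volumes like the ones I set up, the execution plans for IN and EXISTS, and for NOT IN and NOT EXISTS, are essentially the same, and the optimizer tends to choose a HASH JOIN. However, when the outer table is very small and the inner table is very large, using a hint to switch to a FILTER plan can outperform the HASH JOIN (23 consistent gets versus 8599 in the test above). This also shows that NOT IN is not necessarily slower than NOT EXISTS.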
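
    One caveat when swapping one form for the other: the LNNVL predicate in the NOT IN plan above is there because NOT IN has to account for NULLs, and the two forms can return different rows when the subquery column contains a NULL. The self-contained sketch below illustrates this with made-up inline rows (outer_rows and inner_rows are hypothetical names, not tables from the experiments).

    ```sql
    with outer_rows as (
      select 1 as id from dual
      union all
      select 2 as id from dual
    ),
    inner_rows as (
      select 1 as id from dual
      union all
      select cast(null as number) as id from dual
    )
    -- NOT IN: the NULL makes "2 <> NULL" unknown, so no row qualifies.
    select 'not in' as variant, count(*) as cnt
    from outer_rows o
    where o.id not in (select i.id from inner_rows i)
    union all
    -- NOT EXISTS: the NULL never matches, so id = 2 qualifies.
    select 'not exists' as variant, count(*) as cnt
    from outer_rows o
    where not exists (select 1 from inner_rows i where i.id = o.id);
    ```

    The NOT IN branch counts 0 rows and the NOT EXISTS branch counts 1, so the performance comparison in this post only applies when both forms are known to return the same result.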

  • Original URL: https://www.cnblogs.com/dudu-java/p/5711540.html