Oracle: IN vs. EXISTS and NOT IN vs. NOT EXISTS

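Conceptually, IN combines the outer and inner tables with a hash join, while EXISTS loops over the outer table and queries the inner table once per iteration. It is commonly assumed that EXISTS is always more efficient than IN, but that is inaccurate: which one is faster depends on the situation.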

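EXISTS walks the outer table row by row and evaluates the subquery inside EXISTS for each row. If the subquery returns at least one row (how many does not matter), the condition is true and the current outer row is returned; if the subquery returns no rows, the current outer row is discarded. In other words, the EXISTS condition acts as a boolean: true when the subquery returns a result set, false when it returns none.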

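For example:

select * from user where exists (select 1 from dual);

Each row of the user table is examined in turn. Because the subquery select 1 from dual always returns a row, every row of user is added to the result set, so the statement is equivalent to select * from user;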

Another example:

select * from user where exists (select * from user where userId = 0);

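While looping over user, the condition (select * from user where userId = 0) is checked for each row. Assuming userId is never 0, the subquery always returns an empty set, the condition is always false, and every row of user is discarded.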

NOT EXISTS is the opposite of EXISTS: when the subquery returns rows, the current outer row is discarded; otherwise it is added to the result set.

In short, if table A has n rows, an EXISTS query takes those n rows one at a time and evaluates the EXISTS condition n times.
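The row-by-row behaviour described above can be checked on a tiny data set. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module as a stand-in for Oracle, and the tables a and b with their contents are made up for the demo.

```python
import sqlite3

# Hypothetical demo schema; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO b VALUES (2), (3), (3), (4);
""")


def ids(sql):
    """Run a query and return the first column of every row, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# Subquery that always returns a row: every row of a is kept.
always_true = ids("SELECT id FROM a WHERE EXISTS (SELECT 1)")

# Correlated EXISTS: a row of a is kept if b has at least one matching row.
exists_ids = ids(
    "SELECT id FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.id = a.id)"
)

# NOT EXISTS keeps exactly the rows that EXISTS discards.
not_exists_ids = ids(
    "SELECT id FROM a WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id)"
)
```

Note that id 3 appears twice in b but only once in exists_ids: EXISTS asks only whether a match exists, not how many there are.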

An IN query is equivalent to a chain of OR conditions, which is easy to see in the following query:

select * from user where userId in (1, 2, 3);

is equivalent to

select * from user where userId = 1 or userId = 2 or userId = 3;

NOT IN is the opposite of IN:

select * from user where userId not in (1, 2, 3);

is equivalent to

select * from user where userId != 1 and userId != 2 and userId != 3;

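In short, an IN query first retrieves all rows of the subquery, say a result set B with m rows, then breaks that result set into m individual values and checks the outer rows against each of them.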
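The equivalences above (IN as a chain of ORs, NOT IN as a chain of ANDs over !=) can be verified with a short script. This is a sketch under assumptions: SQLite via Python's sqlite3 replaces Oracle, and the table is named users (a made-up stand-in for the article's user table) with five invented rows.

```python
import sqlite3

# Hypothetical demo table; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userId INTEGER);
    INSERT INTO users VALUES (1), (2), (3), (4), (5);
""")


def user_ids(where):
    """Return the sorted userId values that satisfy the given WHERE clause."""
    sql = f"SELECT userId FROM users WHERE {where}"
    return sorted(row[0] for row in conn.execute(sql))


in_list = user_ids("userId IN (1, 2, 3)")
or_chain = user_ids("userId = 1 OR userId = 2 OR userId = 3")

not_in_list = user_ids("userId NOT IN (1, 2, 3)")
and_chain = user_ids("userId != 1 AND userId != 2 AND userId != 3")
```

Both pairs return the same rows, which is all the equivalence claims; it says nothing about how a given optimizer actually executes either form.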

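Note that the subquery of an IN condition must return exactly as many columns as the expression on the left of IN. With a single column on the left, the subquery must return a single column, for example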

select * from user where userId in (select id from B);

and not

select * from user where userId in (select id, age from B);

EXISTS has no such restriction.

Now consider the performance of EXISTS and IN.

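In the simplified model used here, IN iterates over and compares values in memory, whereas EXISTS queries the database for each outer row; when table B is large, EXISTS is therefore the more efficient choice.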

Consider the following SQL statements:

select * from A where exists (select * from B where B.id = A.id);

select * from A where A.id in (select id from B);

1. select * from A where exists (select * from B where B.id = A.id);

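exists() is executed A.length times. Its result set is not cached, because the content of that result set is irrelevant; what matters is only whether the subquery's result set is empty (false) or non-empty (true).
The process is similar to the following: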

$result = [];
for ($i = 0; $i < count(A); $i++) {
    $a = get_record(A, $i);      # fetch the rows of A one at a time
    if (B.id == $a['id']) {      # the subquery condition holds for some row of B
        $result[] = $a;
    }
}
return $result;

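exists() is a good fit when table B is larger than table A, because it avoids most of the traversal and only needs to run one more query per outer row.
For example, if A has 10,000 rows and B has 1,000,000 rows, exists() runs 10,000 times to check whether each id in A matches an id in B.
If A has 10,000 rows and B has 100,000,000 rows, exists() still runs 10,000 times, because it only runs A.length times; the more rows B has, the more exists() pays off.
Conversely, if A has 10,000 rows and B has 100 rows, exists() still runs 10,000 times, and in() with its 10000*100 comparisons may well be cheaper, because in() compares in memory whereas exists() queries the database, and a database query costs far more than an in-memory comparison.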

Conclusion: exists() suits the case where table B is larger than table A.
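The claim that exists() runs A.length times no matter how large B is can be illustrated with a toy model. This is a sketch, not how Oracle executes the query: a Python set stands in for an index on B.id, and the function name exists_filter and the sample sizes are made up.

```python
def exists_filter(outer_rows, inner_index):
    """Model of EXISTS: one probe of the inner table per outer row.

    inner_index is a set of B.id values, standing in for an index on B.id.
    Returns the matching outer rows and the number of probes performed.
    """
    result, probes = [], 0
    for row in outer_rows:
        probes += 1                    # one subquery execution per outer row
        if row["id"] in inner_index:   # the subquery returned at least one row
            result.append(row)
    return result, probes


A = [{"id": i} for i in range(1, 101)]   # 100 outer rows
small_B = set(range(1, 11))              # 10 inner ids
large_B = set(range(1, 100_001))         # 100,000 inner ids

kept_small, probes_small = exists_filter(A, small_B)
kept_large, probes_large = exists_filter(A, large_B)
```

The probe count is 100 in both cases; only the number of kept rows changes. Whether each probe is cheap depends on B.id actually being indexed, which the model simply assumes.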

2. select * from A where id in (select id from B);

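in() runs only once: it fetches every id from table B and caches them. It then checks each id in A against the ids from B and adds the A row to the result set on a match, until all rows of A have been processed.

The process is similar to the following: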

Array A = (select * from A);
Array B = (select id from B);

for (int i = 0; i < A.length; i++) {
    for (int j = 0; j < B.length; j++) {
        if (A[i].id == B[j].id) {
            resultSet.add(A[i]);
            break;
        }
    }
}
return resultSet;

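As the loops show, in() is a poor fit when table B is large, because it may traverse all of B for every row of A.
For example, if A has 10,000 rows and B has 1,000,000 rows, there can be up to 10000*1000000 comparisons, which is very slow.
If A has 10,000 rows and B has 100 rows, there are at most 10000*100 comparisons: far fewer iterations and much better performance.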

Conclusion: in() suits the case where table B is smaller than table A.

When A and B are about the same size, in and exists perform about the same and either can be used.
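To put numbers on the nested-loop model of in(), the sketch below counts comparisons for a literal translation of the loops and contrasts it with a hashed lookup. It is a plain-Python illustration with made-up function names and data; a real database engine may use neither strategy exactly.

```python
def in_nested_loops(a_ids, b_ids):
    """Literal version of the nested loops; counts id comparisons."""
    result, comparisons = [], 0
    for a in a_ids:
        for b in b_ids:
            comparisons += 1
            if a == b:
                result.append(a)
                break                 # stop at the first match, as in the loops
    return result, comparisons


def in_hashed(a_ids, b_ids):
    """Same result, but B's ids are hashed once and each A id is looked up once."""
    b_set = set(b_ids)
    result = [a for a in a_ids if a in b_set]
    return result, len(a_ids)         # one hash lookup per row of A


a_ids = list(range(0, 200, 2))        # 100 even ids: 0, 2, ..., 198
b_ids = list(range(1, 2000, 2))       # 1000 odd ids, so nothing matches (worst case)

loop_result, loop_comparisons = in_nested_loops(a_ids, b_ids)
hash_result, hash_lookups = in_hashed(a_ids, b_ids)
```

In this worst case the nested loops perform 100 * 1000 comparisons while the hashed variant needs 100 lookups. The n * m figure is therefore an upper bound for one naive strategy rather than an inherent cost of IN.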

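Before inserting a record you often need to check whether it already exists and insert only when it does not. A NOT EXISTS condition can be used to prevent duplicate inserts: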
insert into A (name,age) select name,age from B where not exists (select 1 from A where A.id=B.id);
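A minimal runnable sketch of this insert-if-missing pattern follows. It makes several assumptions: SQLite via Python's sqlite3 stands in for Oracle, the table contents are invented, and, unlike the statement above, the sketch also copies the id column so that the NOT EXISTS check can see rows copied by an earlier run.

```python
import sqlite3

# Hypothetical demo schema; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT, age INTEGER);
    CREATE TABLE b (id INTEGER, name TEXT, age INTEGER);
    INSERT INTO a VALUES (1, 'ann', 30);
    INSERT INTO b VALUES (1, 'ann', 30), (2, 'bob', 41);
""")

# Copy only the rows of b whose id is not yet present in a.
COPY_MISSING = """
    INSERT INTO a (id, name, age)
    SELECT id, name, age FROM b
    WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.id = b.id)
"""


def count_a():
    """Number of rows currently in table a."""
    return conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]


conn.execute(COPY_MISSING)
after_first_run = count_a()     # row 2 was missing and has been copied

conn.execute(COPY_MISSING)
after_second_run = count_a()    # nothing left to copy, so no duplicates appear
```

Running the statement twice leaves a with two rows. Under concurrent writers this check alone is not a guarantee; a unique constraint on the key is the usual safeguard.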

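    On the efficiency of EXISTS versus IN: a common claim is that EXISTS is usually faster because IN does not use indexes. That claim does not hold in general (the execution plans later in this article show IN using an index), so judge by the actual situation. As a rule of thumb, IN suits a large outer table with a small inner table, and EXISTS suits a small outer table with a large inner table.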

Now consider NOT EXISTS and NOT IN.

1. select * from A where not exists (select * from B where B.id = A.id);

2. select * from A where A.id not in (select id from B);

Query 1, as before, can use the index on B. Query 2 can be rewritten as the following statement:

select * from A where A.id != 1 and A.id != 2 and A.id != 3;

NOT IN thus behaves like a chain of != comparisons, which cannot use an index, so every row of A has to be checked against the rows of B to see whether it appears there.

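NOT IN versus NOT EXISTS: in this model a NOT IN query full-scans both the inner and the outer table without using indexes, whereas the subquery of NOT EXISTS can still use the indexes on its table. So regardless of which table is larger, NOT EXISTS is generally faster than NOT IN.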
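Besides performance, the two forms differ in how they treat NULL, which matters before swapping one for the other. Under SQL's three-valued logic, a NOT IN condition is never true once the subquery returns a NULL, while NOT EXISTS is unaffected; Oracle follows the same rule. The sketch below demonstrates this with SQLite via Python's sqlite3 as a stand-in; the tables and values are made up.

```python
import sqlite3

# Hypothetical demo schema; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (NULL);
""")


def ids(sql):
    """Run a query and return the first column of every row, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# The NULL in b makes "id NOT IN (...)" unknown for 1 and 3, so no row qualifies.
not_in_ids = ids("SELECT id FROM a WHERE id NOT IN (SELECT id FROM b)")

# NOT EXISTS only asks whether a matching row exists, so 1 and 3 are returned.
not_exists_ids = ids(
    "SELECT id FROM a WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id)"
)

# Filtering the NULLs out of the subquery makes NOT IN agree with NOT EXISTS.
not_in_without_nulls = ids(
    "SELECT id FROM a WHERE id NOT IN (SELECT id FROM b WHERE id IS NOT NULL)"
)
```

If the subquery column can be NULL, the two statements are not interchangeable, so a rewrite for speed should first confirm that the column is declared NOT NULL or filter the NULLs explicitly.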

IN versus =

select name from student where name in ('zhang','wang','li','zhao');

and

select name from student where name='zhang' or name='li' or name='wang' or name='zhao'

return the same result.

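The conventional wisdom is that EXISTS (or NOT EXISTS) usually makes a query faster, so replacing IN with EXISTS is generally recommended. Is that really the case? Let us test it with concrete examples under the two optimizer modes.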

SEIANG@seiang11g>create table wjq1 as select * from dba_objects;

Table created.

SEIANG@seiang11g>create table wjq2 as select * from dba_tables ;

Table created.

SEIANG@seiang11g>create index idx_object_name on wjq1(object_name);

Index created.

SEIANG@seiang11g>create index idx_table_name on wjq2(table_name);

Index created.

SEIANG@seiang11g>select count(*) from wjq1;

COUNT(*)

----------

     86976

SEIANG@seiang11g>select count(*) from wjq2;

COUNT(*)

----------

      2868

I. The inner query's result set is small and the outer query is large

1. Under the CBO:

SEIANG@seiang11g>select * from wjq1 where object_name in (select table_name from wjq2 where table_name like 'M%');

815 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1638414738

---------------------------------------------------------------------------------------

| Id | Operation            | Name           | Rows | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT     |                | 1238 |   270K|   354   (1)| 00:00:05 |

|* 1 | HASH JOIN RIGHT SEMI|                | 1238 |   270K|   354   (1)| 00:00:05 |

|* 2 |   INDEX RANGE SCAN   | IDX_TABLE_NAME |   772 | 13124 |     7   (0)| 00:00:01 |

|* 3 |   TABLE ACCESS FULL | WJQ1           | 5503 | 1112K|   347   (1)| 00:00:05 |

---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - access("OBJECT_NAME"="TABLE_NAME")

   2 - access("TABLE_NAME" LIKE 'M%')

       filter("TABLE_NAME" LIKE 'M%')

   3 - filter("OBJECT_NAME" LIKE 'M%')

Note

-----

   - dynamic sampling used for this statement (level=2)

Statistics

----------------------------------------------------------

         17 recursive calls

          0 db block gets

       1462 consistent gets

       1256 physical reads

          0 redo size

      46140 bytes sent via SQL*Net to client

       1117 bytes received via SQL*Net from client

         56 SQL*Net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

        815 rows processed

SEIANG@seiang11g>select * from wjq1 where exists (select 1 from wjq2 where wjq1.object_name=wjq2.table_name and wjq2.table_name like 'M%');

815 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1638414738

---------------------------------------------------------------------------------------

| Id | Operation            | Name           | Rows | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT     |                | 1238 |   270K|   354   (1)| 00:00:05 |

|* 1 | HASH JOIN RIGHT SEMI|                | 1238 |   270K|   354   (1)| 00:00:05 |

|* 2 |   INDEX RANGE SCAN   | IDX_TABLE_NAME |   772 | 13124 |     7   (0)| 00:00:01 |

|* 3 |   TABLE ACCESS FULL | WJQ1           | 5503 | 1112K|   347   (1)| 00:00:05 |

---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - access("WJQ1"."OBJECT_NAME"="WJQ2"."TABLE_NAME")

   2 - access("WJQ2"."TABLE_NAME" LIKE 'M%')

       filter("WJQ2"."TABLE_NAME" LIKE 'M%')

   3 - filter("WJQ1"."OBJECT_NAME" LIKE 'M%')

Note

-----

   - dynamic sampling used for this statement (level=2)

Statistics

----------------------------------------------------------

         13 recursive calls

          0 db block gets

       1462 consistent gets

       1242 physical reads

          0 redo size

      46140 bytes sent via SQL*Net to client

       1117 bytes received via SQL*Net from client

         56 SQL*Net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

        815 rows processed

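Comparing the two execution plans above:
Under the CBO the two statements get exactly the same execution plan, and their statistics are nearly identical (1462 consistent gets each).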

Now let us look at the RBO, where things are somewhat more complicated.

2. Under the RBO:

SEIANG@seiang11g>select /*+ rule*/ * from wjq1 where object_name in (select table_name from wjq2 where table_name like 'M%');

815 rows selected.

Elapsed: 00:00:00.01

Execution Plan

----------------------------------------------------------

Plan hash value: 144941173

--------------------------------------------------------

| Id | Operation                    | Name            |

--------------------------------------------------------

|   0 | SELECT STATEMENT             |                 |

|   1 | NESTED LOOPS                |                 |

|   2 |   NESTED LOOPS               |                 |

|   3 |    VIEW                      | VW_NSO_1        |

|   4 |     SORT UNIQUE              |                 |

|* 5 |      INDEX RANGE SCAN        | IDX_TABLE_NAME |

|* 6 |    INDEX RANGE SCAN          | IDX_OBJECT_NAME |

|   7 |   TABLE ACCESS BY INDEX ROWID| WJQ1            |

--------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   5 - access("TABLE_NAME" LIKE 'M%')

       filter("TABLE_NAME" LIKE 'M%')

   6 - access("OBJECT_NAME"="TABLE_NAME")

Note

-----

   - rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

          0 recursive calls

          0 db block gets

        698 consistent gets

          0 physical reads

          0 redo size

      55187 bytes sent via SQL*Net to client

       1117 bytes received via SQL*Net from client

         56 SQL*Net roundtrips to/from client

          1 sorts (memory)

          0 sorts (disk)

        815 rows processed

SEIANG@seiang11g>select /*+ rule*/ * from wjq1 where exists (select 1 from wjq2 where wjq1.object_name=wjq2.table_name and wjq2.table_name like 'M%');

815 rows selected.

Elapsed: 00:00:00.15

Execution Plan

----------------------------------------------------------

Plan hash value: 3545670754

---------------------------------------------

| Id | Operation          | Name           |

---------------------------------------------

|   0 | SELECT STATEMENT   |                |

|* 1 | FILTER            |                |

|   2 |   TABLE ACCESS FULL| WJQ1           |

|* 3 |   INDEX RANGE SCAN | IDX_TABLE_NAME |

---------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - filter( EXISTS (SELECT 0 FROM "WJQ2" "WJQ2" WHERE

              "WJQ2"."TABLE_NAME"=:B1 AND "WJQ2"."TABLE_NAME" LIKE 'M%'))

   3 - access("WJQ2"."TABLE_NAME"=:B1)

       filter("WJQ2"."TABLE_NAME" LIKE 'M%')

Note

-----

   - rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

          0 recursive calls

          0 db block gets

      91002 consistent gets

       1242 physical reads

          0 redo size

      46140 bytes sent via SQL*Net to client

       1117 bytes received via SQL*Net from client

         56 SQL*Net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

        815 rows processed

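Comparing the two execution plans above:
  Here IN is actually more efficient than EXISTS (698 versus 91002 consistent gets). This can be understood as follows:
  For IN, the RBO uses the result of the inner query as the driving table of a nested loops join, so IN is efficient when the inner query's result set is small.
  For EXISTS, the RBO full-scans the outer table and filters each of its rows against the inner query, which is comparatively slow when the outer table is large.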
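The difference between the two RBO plans comes down to which side drives the work. The toy model below is a rough sketch and not a simulation of Oracle: Python dicts and sets stand in for the indexes, the function names in_plan and exists_plan are made up, and the row counts are invented and much smaller than in the test above.

```python
from collections import defaultdict


def in_plan(outer_rows, inner_keys):
    """Inner result drives: de-duplicate the inner keys, then look each one up
    in an index on the outer table (a dict standing in for the index)."""
    outer_index = defaultdict(list)        # built here; a real index already exists
    for row in outer_rows:
        outer_index[row["name"]].append(row)
    result, lookups = [], 0
    for key in sorted(set(inner_keys)):    # SORT UNIQUE over the inner result
        lookups += 1                       # one index lookup per distinct inner key
        result.extend(outer_index.get(key, []))
    return result, lookups


def exists_plan(outer_rows, inner_keys):
    """Outer table drives: scan every outer row and probe the inner keys for it."""
    inner_index = set(inner_keys)
    result, probes = [], 0
    for row in outer_rows:                 # full scan of the outer table
        probes += 1                        # one probe of the inner index per outer row
        if row["name"] in inner_index:
            result.append(row)
    return result, probes


outer = [{"name": f"OBJ{i % 500}"} for i in range(5000)]   # 5000 rows, 500 distinct names
inner = [f"OBJ{i}" for i in range(20)]                     # 20 inner keys

in_rows, in_lookups = in_plan(outer, inner)
exists_rows, exists_probes = exists_plan(outer, inner)
```

Both functions return the same 200 rows, but the first does 20 index lookups and the second 5000 probes, mirroring the gap in consistent gets seen above when the inner result is small and the outer table is large.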

II. The inner query's result set is large and the outer query is small

1. Under the CBO:

SEIANG@seiang11g>select * from wjq2 where table_name in (select object_name from wjq1 where object_name like 'S%');

278 rows selected.

Elapsed: 00:00:00.03

Execution Plan

----------------------------------------------------------

Plan hash value: 1807911610

--------------------------------------------------------------------------------------

| Id | Operation          | Name            | Rows | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |                 |   278 |   164K|    55   (0)| 00:00:01 |

|* 1 | HASH JOIN SEMI    |                 |   278 |   164K|    55   (0)| 00:00:01 |

|* 2 |   TABLE ACCESS FULL| WJQ2            |   278 |   146K|    31   (0)| 00:00:01 |

|* 3 |   INDEX RANGE SCAN | IDX_OBJECT_NAME | 4435 |   285K|    24   (0)| 00:00:01 |

--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - access("TABLE_NAME"="OBJECT_NAME")

   2 - filter("TABLE_NAME" LIKE 'S%')

   3 - access("OBJECT_NAME" LIKE 'S%')

       filter("OBJECT_NAME" LIKE 'S%')

Note

-----

   - dynamic sampling used for this statement (level=2)

Statistics

----------------------------------------------------------

         67 recursive calls

          0 db block gets

        403 consistent gets

        446 physical reads

          0 redo size

      22852 bytes sent via SQL*Net to client

        721 bytes received via SQL*Net from client

         20 SQL*Net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

        278 rows processed

SEIANG@seiang11g>select * from wjq2 where exists (select 1 from wjq1 where wjq1.object_name=wjq2.table_name and wjq1.object_name like 'S%');

278 rows selected.

Elapsed: 00:00:00.02

Execution Plan

----------------------------------------------------------

Plan hash value: 1807911610

--------------------------------------------------------------------------------------

| Id | Operation          | Name            | Rows | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |                 |   278 |   164K|    55   (0)| 00:00:01 |

|* 1 | HASH JOIN SEMI    |                 |   278 |   164K|    55   (0)| 00:00:01 |

|* 2 |   TABLE ACCESS FULL| WJQ2            |   278 |   146K|    31   (0)| 00:00:01 |

|* 3 |   INDEX RANGE SCAN | IDX_OBJECT_NAME | 4435 |   285K|    24   (0)| 00:00:01 |

--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - access("WJQ1"."OBJECT_NAME"="WJQ2"."TABLE_NAME")

   2 - filter("WJQ2"."TABLE_NAME" LIKE 'S%')

   3 - access("WJQ1"."OBJECT_NAME" LIKE 'S%')

       filter("WJQ1"."OBJECT_NAME" LIKE 'S%')

Note

-----

   - dynamic sampling used for this statement (level=2)

Statistics

----------------------------------------------------------

         13 recursive calls

          0 db block gets

        295 consistent gets

          2 physical reads

          0 redo size

      22852 bytes sent via SQL*Net to client

        721 bytes received via SQL*Net from client

         20 SQL*Net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

        278 rows processed

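Comparing the two execution plans above:
Although the execution plans are identical, the EXISTS run shows fewer physical reads (2 versus 446) and fewer logical reads (295 versus 403), so EXISTS looks somewhat more efficient here. Since the plans are the same, part of that gap probably comes from the IN statement running first, with more recursive calls (67 versus 13) and a colder buffer cache, rather than from EXISTS itself.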


Original article: https://www.cnblogs.com/UUUz/p/10118035.html
