传统的SQL优化思想认为IN、Exists这2种写法不同数据量的环境中各擅胜场,Developer应当根据实际情况合理运用IN或Exists。
实际我认为,这种认识对于现代SQL优化器Optimizer(10g以后)不再适用。 CBO优化器中本身包含了查询转换(Query Transformation)的功能。
为什么CBO要做查询转换(Query Transformation)?
回答: 用户提交给SQL引擎的语句可能不是执行查询最有效的语句
查询转换(Query Transformation)做了什么?
回答: 查询转换按照固定的语法将语句重写为等意的SQL,但是优化了其SQL结构
查询转换(Query Transformation)包括启发式地查询转换和基于成本的查询转换。
对于IN、Exists , CBO optimizer 的查询装换效果如何?:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for 64-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> select count(1) from joinA A where A.CUST_ID in ( select B.cust_id from joinb B where B.CUST_FIRST_NAME like 'C%');
COUNT(1)
----------
39381
SQL> alter system flush shared_pool;
System altered.
SQL> alter system flush shared_pool;
System altered.
SQL> alter session set events '10053 trace name context forever,level 1';
Session altered.
SQL> select count(1) from joinA A where A.CUST_ID in ( select B.cust_id from joinb B where B.CUST_FIRST_NAME like 'C%');
COUNT(1)
----------
39381
SQL> oradebug setmypid
Statement processed.
SQL> oradebug tracefile_name
D:\APP\ML\diag\rdbms\testem\testem\trace\testem_ora_6776.trc
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(1)"
FROM (SELECT "B"."CUST_ID" "CUST_ID"
FROM "SYS"."JOINB" "B"
WHERE "B"."CUST_FIRST_NAME" LIKE 'C%') "VW_NSO_1",
"SYS"."JOINA" "A"
WHERE "A"."CUST_ID" = "VW_NSO_1"."CUST_ID"
kkoqbc: optimizing query block SEL$683B0107 (#2)
SQL> alter session set events '10053 trace name context forever,level 1';
Session altered.
SQL> select count(1) from joinA A where exists ( select 1 from joinb B where B.CUST_FIRST_NAME like 'C%' and B.cust_id=A.cust_id );
COUNT(1)
----------
39381
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(1)"
FROM (SELECT "B"."CUST_ID" "ITEM_1"
FROM "SYS"."JOINB" "B"
WHERE "B"."CUST_FIRST_NAME" LIKE 'C%') "VW_SQ_1",
"SYS"."JOINA" "A"
WHERE "VW_SQ_1"."ITEM_1" = "A"."CUST_ID"
如以上10053 trace 所显示 对于2个分别使用了IN和Exists的查询, CBO Optimizer查询转换后得到的结果十分类似:
IN:
SELECT COUNT(*) "COUNT(1)"
FROM (SELECT "B"."CUST_ID" "CUST_ID"
FROM "SYS"."JOINB" "B"
WHERE "B"."CUST_FIRST_NAME" LIKE 'C%') "VW_NSO_1",
"SYS"."JOINA" "A"
WHERE "A"."CUST_ID" = "VW_NSO_1"."CUST_ID"
Exists:
SELECT COUNT(*) "COUNT(1)"
FROM (SELECT "B"."CUST_ID" "ITEM_1"
FROM "SYS"."JOINB" "B"
WHERE "B"."CUST_FIRST_NAME" LIKE 'C%') "VW_SQ_1",
"SYS"."JOINA" "A"
WHERE "VW_SQ_1"."ITEM_1" = "A"."CUST_ID"
CBO在完成Query transformations之后才会对转换后的语句进行APA(access path analysis)和后续的优化树构建、成本比较等等,一般来说如果你使用10g以后的CBO ,可以认为语义等价的IN或Exists在性能上也不会有太大的区别。