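I used to give developers this rule for writing SQL: put a table's filter conditions in the WHERE clause at the same level as that table.
Testing today showed that when outer joins are involved, a table's filter conditions belong in the ON clause at the same level as that table. (Besides the example below, this also covers the case where the filter on an inner-joined table is written in the ON clause of a later outer join.)
Consider the following example.
Test data: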
create table t1 as select * from dba_objects; -- 13436 rows
create table t2 as select * from dba_objects; -- 13435 rows
create table t3 as select * from t2; -- 13434 rows
insert into t2 select * from t2; -- run this statement three times (each run doubles t2) to grow t2 to 107480 rows
create index i_t2_id on t2(object_id);
Gather statistics:
analyze table t1 compute statistics for table;
analyze table t2 compute statistics for table for all indexes for all indexed columns;
analyze table t3 compute statistics for table;
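The ANALYZE ... COMPUTE STATISTICS commands above are the older way to collect optimizer statistics; Oracle's documentation recommends the DBMS_STATS package for this purpose. A minimal sketch of an equivalent step, assuming the three tables live in the current schema. Note that with default parameters DBMS_STATS also gathers column statistics, so the plans obtained afterwards may not match the ones shown in this post exactly.

```sql
-- Sketch only: gather optimizer statistics with DBMS_STATS instead of ANALYZE.
-- Assumption: T1, T2 and T3 belong to the current user.
begin
  dbms_stats.gather_table_stats(ownname => user, tabname => 'T1');
  -- cascade => true also gathers statistics for the indexes on T2 (I_T2_ID)
  dbms_stats.gather_table_stats(ownname => user, tabname => 'T2', cascade => true);
  dbms_stats.gather_table_stats(ownname => user, tabname => 'T3');
end;
/
```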
Compare the execution plans of the following two SQL statements:
SQL1:
SQL> select t1.object_type, count(*)
       from t1
       left outer join t2
         on t1.object_id = t2.object_id
       left outer join t3
         on t2.object_id = t3.object_id
        and t2.object_id in (11316, 11317, 11318, 11319, 11320)
      where t1.created < trunc(sysdate)
      group by t1.object_type;
37 rows selected.
Elapsed: 00:00:00.34
Execution Plan
----------------------------------------------------------
Plan hash value: 1458016722
------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |   107K|  3883K|  5722K  (1)| 19:04:26 |
|   1 |  HASH GROUP BY        |      |   107K|  3883K|  5722K  (1)| 19:04:26 |
|   2 |   NESTED LOOPS OUTER  |      |   107K|  3883K|  5722K  (1)| 19:04:26 |
|*  3 |    HASH JOIN OUTER    |      |   107K|  3883K|   468   (1)| 00:00:06 |
|*  4 |     TABLE ACCESS FULL | T1   | 13436 |   432K|    54   (2)| 00:00:01 |
|   5 |     TABLE ACCESS FULL | T2   |   107K|   419K|   414   (1)| 00:00:05 |
|   6 |    VIEW               |      |     1 |       |    53   (0)| 00:00:01 |
|*  7 |     FILTER            |      |       |       |            |          |
|*  8 |      TABLE ACCESS FULL| T3   |     1 |    13 |    53   (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID"(+))
4 - filter("T1"."CREATED"<TRUNC(SYSDATE@!))
7 - filter("T2"."OBJECT_ID"=11316 OR "T2"."OBJECT_ID"=11317 OR
"T2"."OBJECT_ID"=11318 OR "T2"."OBJECT_ID"=11319 OR
"T2"."OBJECT_ID"=11320)
8 - filter("T2"."OBJECT_ID"="T3"."OBJECT_ID" AND
("T3"."OBJECT_ID"=11316 OR "T3"."OBJECT_ID"=11317 OR
"T3"."OBJECT_ID"=11318 OR "T3"."OBJECT_ID"=11319 OR
"T3"."OBJECT_ID"=11320))
Note
-----
- dynamic sampling used for this statement (level=4)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
8803 consistent gets
0 physical reads
0 redo size
1368 bytes sent via SQL*Net to client
437 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
37 rows processed
SQL2:
SQL> select t1.object_type, count(*)
       from t1
       left outer join t2
         on t1.object_id = t2.object_id
        and t2.object_id in (11316, 11317, 11318, 11319, 11320)
       left outer join t3
         on t2.object_id = t3.object_id
      where t1.created < trunc(sysdate)
      group by t1.object_type;
37 rows selected.
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
Plan hash value: 2889713393
-----------------------------------------------------------------------------------
| Id  | Operation               | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |         | 13436 |   656K|   115   (3)| 00:00:02 |
|   1 |  HASH GROUP BY          |         | 13436 |   656K|   115   (3)| 00:00:02 |
|*  2 |   HASH JOIN RIGHT OUTER |         | 13436 |   656K|   114   (2)| 00:00:02 |
|   3 |    TABLE ACCESS FULL    | T3      | 13435 |   170K|    53   (0)| 00:00:01 |
|*  4 |    HASH JOIN RIGHT OUTER|         | 13436 |   485K|    60   (2)| 00:00:01 |
|   5 |     INLIST ITERATOR     |         |       |       |            |          |
|*  6 |      INDEX RANGE SCAN   | I_T2_ID |    40 |   160 |     6   (0)| 00:00:01 |
|*  7 |     TABLE ACCESS FULL   | T1      | 13436 |   432K|    54   (2)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T2"."OBJECT_ID"="T3"."OBJECT_ID"(+))
4 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID"(+))
6 - access("T2"."OBJECT_ID"(+)=11316 OR "T2"."OBJECT_ID"(+)=11317 OR
"T2"."OBJECT_ID"(+)=11318 OR "T2"."OBJECT_ID"(+)=11319 OR
"T2"."OBJECT_ID"(+)=11320)
7 - filter("T1"."CREATED"<TRUNC(SYSDATE@!))
Note
-----
- dynamic sampling used for this statement (level=4)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
365 consistent gets
0 physical reads
0 redo size
1356 bytes sent via SQL*Net to client
437 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
37 rows processed
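The transcripts above have the layout of SQL*Plus autotrace output. A minimal sketch of session settings that produce this kind of output, assuming SQL*Plus is the client and the account is allowed to use autotrace (typically via the PLUSTRACE role); the exact settings used for this post are not stated, so treat these as an assumption.

```sql
-- Sketch only: SQL*Plus settings for timing, plan and run-time statistics.
set timing on            -- prints the "Elapsed" line after each statement
set autotrace traceonly  -- suppresses the result rows, prints plan and statistics
-- run SQL1 or SQL2 here
set autotrace off
```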
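Analysis:
In SQL1, the restriction on t2.object_id is written in the ON clause of the left outer join to t3. In that position it is part of the t3 join condition: it decides which t2 rows are matched against t3, not which t2 rows are returned, so the optimizer cannot use it to restrict access to t2.
As a result, t2 is read with a full table scan (operation 5) rather than through the index on object_id, and the predicate is only evaluated as a filter (operation 7) inside the nested loops outer join to t3.
In SQL2, the restriction sits in the ON clause of the join to t2 itself, so it is applied when t2 is accessed and the plan uses an index range scan on I_T2_ID.
Logical reads drop accordingly, from 8803 consistent gets in SQL1 to 365 in SQL2, and elapsed time drops from 0.34 s to 0.03 s.
Note that the two statements are not logically equivalent: SQL1 keeps every matching t2 row and only limits the join to t3, while SQL2 limits which t2 rows are joined at all. The row estimates in the plans (107K against 13436) reflect this. Both queries return 37 groups here, but the count(*) values can differ, so the predicate should be moved only when the SQL2 semantics are what is intended.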
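Moving a predicate from one outer join's ON clause to another can change the result set as well as the plan, so it is worth comparing the output of both forms before adopting the faster one. The following is a sketch of such a check on the test tables from this post, using MINUS in both directions; the alias cnt and the labels in the src column are introduced here for illustration only.

```sql
-- Sketch only: list rows produced by just one of the two statements.
-- q1 and q2 repeat SQL1 and SQL2, with an alias (cnt) added for count(*).
with q1 as (
  select t1.object_type, count(*) as cnt
    from t1
    left outer join t2
      on t1.object_id = t2.object_id
    left outer join t3
      on t2.object_id = t3.object_id
     and t2.object_id in (11316, 11317, 11318, 11319, 11320)
   where t1.created < trunc(sysdate)
   group by t1.object_type
),
q2 as (
  select t1.object_type, count(*) as cnt
    from t1
    left outer join t2
      on t1.object_id = t2.object_id
     and t2.object_id in (11316, 11317, 11318, 11319, 11320)
    left outer join t3
      on t2.object_id = t3.object_id
   where t1.created < trunc(sysdate)
   group by t1.object_type
)
select 'SQL1 only' as src, d1.object_type, d1.cnt
  from (select object_type, cnt from q1
        minus
        select object_type, cnt from q2) d1
union all
select 'SQL2 only' as src, d2.object_type, d2.cnt
  from (select object_type, cnt from q2
        minus
        select object_type, cnt from q1) d2;
```

If this query returns no rows, the two forms agree on the current data; any rows returned show where the counts differ.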
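Guidelines:
1. With outer joins, always put a filter on a table in the ON clause of the join that introduces that table. Beyond this example, the same applies when t2 is inner joined: if the object_id filter on t2 is still written in the ON clause of a later outer join, the SQL remains inefficient.
2. Otherwise (no outer join involved), put a table's filter conditions in the WHERE clause at the same level as that table, for both performance and readability.
3. With outer joins, WHERE is applied last, after the join has been performed. A condition meant to filter rows during the join therefore goes in ON, and a condition meant to filter the final join result goes in WHERE.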
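The difference described in the last guideline can be seen on a very small data set. The sketch below uses two hypothetical tables, a and b, that are not part of the test case in this post; they exist only to show how the same condition behaves in ON and in WHERE.

```sql
-- Sketch only: hypothetical tables a and b, created just for this illustration.
create table a (id number);
create table b (id number, flag varchar2(1));
insert into a values (1);
insert into a values (2);
insert into b values (1, 'Y');
insert into b values (2, 'N');

-- Condition in ON: evaluated while joining, so every row of a is kept.
select a.id, b.flag
  from a
  left outer join b
    on a.id = b.id
   and b.flag = 'Y'
 order by a.id;
-- returns (1, 'Y') and (2, null)

-- Condition in WHERE: evaluated after the join, so the row for id 2 is removed.
select a.id, b.flag
  from a
  left outer join b
    on a.id = b.id
 where b.flag = 'Y'
 order by a.id;
-- returns only (1, 'Y')
```

In the second query the WHERE condition rejects every row where b.flag is not 'Y', including null-extended rows, which in effect turns the outer join into an inner join.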