Original article: https://www.modb.pro/db/23257?cyn
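Abstract: An outer join with an OR join condition can only use nested loops (NL). If the driving table's result set is large, the join is executed a large number of times, which causes performance problems and needs to be optimized.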
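When two tables are outer joined, there are two cases:
1. If the outer join uses NL, the main (preserved) table is fixed as the driving table, and a hint cannot change which table drives.
2. If the outer join uses a hash join, hints can swap the driving table and the driven table.
The following experiments cover these cases:
1. The plan is NL, t1 is the driving table (main table) and t2 is the driven table; try to make t2 the driving table and t1 the driven table.
2. The plan is NL, t1 is the driving table (main table) and t2 is the driven table; try to change the plan to hash.
3. The plan is hash, t1 is the driving table (main table) and t2 is the driven table; try to make t2 the driving table and t1 the driven table.
4. The plan is hash, t1 is the driving table (main table) and t2 is the driven table; try to change the plan to NL.
5. The outer join has an OR join condition (T1.ID = T2.ID OR T1.AGE = T2.ID); rewrite it into an equivalent form.
When the execution plan is NL
Consider the following SQL: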
SELECT T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
T1_ID T1_NAME T1_AGE T2_ID T2_NAME
---------- ---------- ---------- ---------- ----------
1 a 1 1 a
2 b 2 2 b
3 c 5 3 c
4 d 1
5 e 3
6 f 6
Execution plan:
Plan hash value: 3645848104
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 11 | | | |
| 1 | SORT ORDER BY | | 1 | 6 | 6 |00:00:00.01 | 11 | 2048 | 2048 | 2048 (0)|
| 2 | NESTED LOOPS OUTER | | 1 | 6 | 6 |00:00:00.01 | 11 | | | |
| 3 | TABLE ACCESS FULL | T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| T2 | 6 | 1 | 3 |00:00:00.01 | 4 | | | |
|* 5 | INDEX RANGE SCAN | IDX_ID_T2_01 | 6 | 1 | 3 |00:00:00.01 | 3 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("T1"."ID"="T2"."ID")
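The execution plan shows an NL join with t1 as the driving table.
1. The plan is NL, t1 is the driving table (main table) and t2 is the driven table; try to make t2 the driving table and t1 the driven table.
In an inner join the driving and driven tables can be swapped, but in an outer join using NL the driving table cannot be changed.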
SELECT /*+ leading(t2 t1) use_hash(t1)*/T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
Plan hash value: 2391546071
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 14 | | | |
| 1 | SORT ORDER BY | | 1 | 6 | 6 |00:00:00.01 | 14 | 2048 | 2048 | 2048 (0)|
|* 2 | HASH JOIN OUTER | | 1 | 6 | 6 |00:00:00.01 | 14 | 1753K| 1753K| 900K (0)|
| 3 | TABLE ACCESS FULL| T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
| 4 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |00:00:00.01 | 7 | | | |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."ID"="T2"."ID")
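The execution plan shows that t1 is still the driving table; the join order did not change (the use_hash hint only switched the join method to HASH JOIN OUTER). So for an outer join whose plan is NL, the driving and driven tables cannot be swapped.
Why it cannot be changed:
In this outer join t1 is the main table and is left outer joined to t2, so every row of t1 must be returned. A nested loop works by passing values. When the main table passes a value to the secondary table and no row matches, the secondary table's columns are simply shown as NULL. If instead the secondary table passed values to the main table, the main-table rows that have no match would never be reached, and NULL cannot be passed as a join value to find them. So when two tables are outer joined with nested loops, the driving table can only be the main table.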
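The reasoning above can be modeled with a short Python sketch. It only illustrates the nested-loops idea and is not Oracle's implementation; the helper names and the in-memory rows (copied from the query output above) are assumptions made for the example.

```python
# Sample rows taken from the query output above: T1(id, name, age), T2(id, name).
T1 = [(1, "a", 1), (2, "b", 2), (3, "c", 5), (4, "d", 1), (5, "e", 3), (6, "f", 6)]
T2 = [(1, "a"), (2, "b"), (3, "c")]


def nl_left_outer(main_rows, other_rows):
    """Nested loops driven by the main table (hypothetical helper)."""
    out = []
    for m in main_rows:                      # driving table: every row is visited
        matches = [o for o in other_rows if o[0] == m[0]]
        if matches:
            out.extend((m, o) for o in matches)
        else:
            out.append((m, None))            # no match: pad the other side with NULL
    return out


def nl_driven_by_other(main_rows, other_rows):
    """Same loops, but driven by the optional table (hypothetical helper)."""
    out = []
    for o in other_rows:                     # only ids that exist in T2 are ever probed
        for m in main_rows:
            if m[0] == o[0]:
                out.append((m, o))
    return out                               # main rows without a match are never seen


correct = nl_left_outer(T1, T2)
wrong = nl_driven_by_other(T1, T2)
print(len(correct), len(wrong))
```

Driving from T2 returns only the three matched rows; the T1 rows with id 4, 5 and 6 are lost, which is why the main table has to drive a nested-loops outer join.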
2. The plan is NL, t1 is the driving table (main table) and t2 is the driven table; try to change the plan to hash.
SELECT /*+ leading(t1 t2) use_nl(t2) swap_join_inputs(t1) */T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
SELECT /*+ leading(t1 t2) use_nl(t2) no_swap_join_inputs(t2) */T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
SELECT /*+ leading(t2 t1) use_nl(t1) swap_join_inputs(t2) */T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
SELECT /*+ leading(t2 t1) use_nl(t2) no_swap_join_inputs(t1) */T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
Plan hash value: 3645848104
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 11 | | | |
| 1 | SORT ORDER BY | | 1 | 6 | 6 |00:00:00.01 | 11 | 2048 | 2048 | 2048 (0)|
| 2 | NESTED LOOPS OUTER | | 1 | 6 | 6 |00:00:00.01 | 11 | | | |
| 3 | TABLE ACCESS FULL | T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| T2 | 6 | 1 | 3 |00:00:00.01 | 4 | | | |
|* 5 | INDEX RANGE SCAN | IDX_ID_T2_01 | 6 | 1 | 3 |00:00:00.01 | 3 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("T1"."ID"="T2"."ID")
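None of the hint combinations above took effect for this outer join query: all four statements produce the same NL plan, so these hints could not change the plan of the NL outer join.
3. The plan is hash, t1 is the driving table (main table) and t2 is the driven table; try to make t2 the driving table and t1 the driven table.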
SELECT T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
Plan hash value: 2391546071
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 14 | | | |
| 1 | SORT ORDER BY | | 1 | 6 | 6 |00:00:00.01 | 14 | 2048 | 2048 | 2048 (0)|
|* 2 | HASH JOIN OUTER | | 1 | 6 | 6 |00:00:00.01 | 14 | 1753K| 1753K| 900K (0)|
| 3 | TABLE ACCESS FULL| T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
| 4 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |00:00:00.01 | 7 | | | |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."ID"="T2"."ID")
The execution plan shows a hash join with t1 as the driving table.
Use a hint to swap the driving and driven tables:
SELECT /*+ leading(t2 t1) use_hash(t1)*/T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
Plan hash value: 2391546071
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 14 | | | |
| 1 | SORT ORDER BY | | 1 | 6 | 6 |00:00:00.01 | 14 | 2048 | 2048 | 2048 (0)|
|* 2 | HASH JOIN OUTER | | 1 | 6 | 6 |00:00:00.01 | 14 | 1753K| 1753K| 900K (0)|
| 3 | TABLE ACCESS FULL| T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
| 4 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |00:00:00.01 | 7 | | | |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."ID"="T2"."ID")
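The execution plan shows that t1 is still the driving table; the join order did not change.
Note:
One more hint is needed:
SWAP_JOIN_INPUTS: specifies which table in the join is the build table (driving table)
NO_SWAP_JOIN_INPUTS: specifies which table in the join is the probe table (driven table)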
SELECT /*+ leading(t2 t1) use_hash(t1) swap_join_inputs(t2) */T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
Plan hash value: 2146067096
--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 14 | | | |
| 1 | SORT ORDER BY | | 1 | 6 | 6 |00:00:00.01 | 14 | 2048 | 2048 | 2048 (0)|
|* 2 | HASH JOIN RIGHT OUTER| | 1 | 6 | 6 |00:00:00.01 | 14 | 2061K| 2061K| 937K (0)|
| 3 | TABLE ACCESS FULL | T2 | 1 | 3 | 3 |00:00:00.01 | 7 | | | |
| 4 | TABLE ACCESS FULL | T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
--------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."ID"="T2"."ID")
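The execution plan shows that t2 is now the driving table and t1 the driven table. The operation at Id 2 has also changed from HASH JOIN OUTER to HASH JOIN RIGHT OUTER. The two are equivalent: "t1 left outer join t2" is rewritten as "t2 right outer join t1".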
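A rough Python model of why the hash join can swap its inputs is sketched below. It is a simplification, not Oracle's algorithm; the function names are hypothetical and the rows are the sample data from the output above. Either side can be loaded into the hash table as long as unmatched main-table rows are still emitted.

```python
from collections import defaultdict

# Sample rows from the article's output: T1(id, name, age), T2(id, name).
T1 = [(1, "a", 1), (2, "b", 2), (3, "c", 5), (4, "d", 1), (5, "e", 3), (6, "f", 6)]
T2 = [(1, "a"), (2, "b"), (3, "c")]


def hash_outer_build_main(t1, t2):
    """Build on the main table t1, probe with t2 (models HASH JOIN OUTER)."""
    buckets = defaultdict(list)
    for pos, m in enumerate(t1):
        buckets[m[0]].append(pos)
    matched = set()
    out = []
    for o in t2:                                  # probe phase
        for pos in buckets.get(o[0], []):
            out.append((t1[pos], o))
            matched.add(pos)
    # Build-side rows that were never hit are emitted with NULLs at the end.
    out.extend((m, None) for pos, m in enumerate(t1) if pos not in matched)
    return out


def hash_outer_build_other(t1, t2):
    """Build on t2, probe with the main table t1 (models HASH JOIN RIGHT OUTER)."""
    buckets = defaultdict(list)
    for o in t2:
        buckets[o[0]].append(o)
    out = []
    for m in t1:                                  # probe phase
        hits = buckets.get(m[0])
        if hits:
            out.extend((m, o) for o in hits)
        else:
            out.append((m, None))                 # unmatched probe row padded with NULL
    return out


a = sorted(hash_outer_build_main(T1, T2), key=lambda r: r[0])
b = sorted(hash_outer_build_other(T1, T2), key=lambda r: r[0])
print(a == b)
```

Both variants scan each input once and return the same six rows, which matches the Starts = 1 values in the two hash plans above.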
4. The plan is hash, t1 is the driving table (main table) and t2 is the driven table; try to change the plan to NL.
SELECT /*+ leading(t2 t1) use_nl(t1) */T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
ORDER BY 1;
Plan hash value: 2391546071
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 14 | | | |
| 1 | SORT ORDER BY | | 1 | 6 | 6 |00:00:00.01 | 14 | 2048 | 2048 | 2048 (0)|
|* 2 | HASH JOIN OUTER | | 1 | 6 | 6 |00:00:00.01 | 14 | 1753K| 1753K| 903K (0)|
| 3 | TABLE ACCESS FULL| T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
| 4 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |00:00:00.01 | 7 | | | |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."ID"="T2"."ID")
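The execution plan shows that t1 is the driving table, t2 is the driven table, and the join is still a hash join; the hint could not change the plan to NL.
5. The outer join has an OR join condition; rewrite it into an equivalent form.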
SELECT T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON (T1.ID = T2.ID OR T1.AGE = T2.ID)
ORDER BY 1;
T1_ID T1_NAME T1_AGE T2_ID T2_NAME
---------- ---------- ---------- ---------- ----------
1 a 1 1 a
2 b 2 2 b
3 c 5 3 c
4 d 1 1 a
5 e 3 3 c
6 f 6
Plan hash value: 3004654521
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 49 | | | |
| 1 | SORT ORDER BY | | 1 | 6 | 6 |00:00:00.01 | 49 | 2048 | 2048 | 2048 (0)|
| 2 | NESTED LOOPS OUTER | | 1 | 6 | 6 |00:00:00.01 | 49 | | | |
| 3 | TABLE ACCESS FULL | T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
| 4 | VIEW | | 6 | 1 | 5 |00:00:00.01 | 42 | | | |
|* 5 | TABLE ACCESS FULL| T2 | 6 | 1 | 5 |00:00:00.01 | 42 | | | |
------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter(("T1"."ID"="T2"."ID" OR "T1"."AGE"="T2"."ID"))
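When the outer join has an OR join condition, the execution plan shows an NL join with t1 as the driving table, regardless of whether the driven table's join column is indexed (here the join column id of t2 has no index). The optimizer turns the access to t2 into a view and then performs the NL join. When t1's result set is large, this causes serious performance problems.
Analysis of the SQL above:
Why NL and not hash?
t1 and t2 are outer joined with t1 as the main table, so during the join t1 has to take one row at a time and match it against t2's id (say, starting from the first row).
When t1 supplies its first row, it passes both id and age to t2. t2 checks whether the given id equals its own id; if so, the row is returned regardless of whether age matches.
If the age value also equals that t2 id, the row qualifies as well, but because id and age belong to the same t1 row, only one record is returned.
Then the second row's id and age are passed to t2; if t1.id = t2.id holds, the record is returned regardless of age (if both id and age match, only one record is returned).
Then the third row's id and age are passed to t2; if t1.age = t2.id holds, the record is returned regardless of id (if both id and age match, only one record is returned).
…
This continues until every row of t1 has been matched against t2. Matches are returned; for rows with no match, the t2 columns are NULL.
Each row has to be matched against t2 one at a time, so NL is used because it can pass values row by row.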
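The row-by-row matching described above can be sketched in Python as follows. This is an illustrative model, not Oracle's implementation; the helper name and the comparison counter are assumptions added for the example, and the rows are the sample data from the output above.

```python
# Sample rows from the article's output: T1(id, name, age), T2(id, name).
T1 = [(1, "a", 1), (2, "b", 2), (3, "c", 5), (4, "d", 1), (5, "e", 3), (6, "f", 6)]
T2 = [(1, "a"), (2, "b"), (3, "c")]


def nl_left_outer_or(t1, t2):
    """Nested loops evaluating T1.ID = T2.ID OR T1.AGE = T2.ID (hypothetical helper)."""
    out = []
    comparisons = 0
    for m in t1:                       # one pass over the driving table
        found = False
        for o in t2:                   # full scan of t2 for every t1 row (no index)
            comparisons += 1
            if m[0] == o[0] or m[2] == o[0]:
                out.append((m, o))     # a t2 row matching on both columns is emitted once
                found = True
        if not found:
            out.append((m, None))      # unmatched t1 row: t2 columns are NULL
    return out, comparisons


rows, comparisons = nl_left_outer_or(T1, T2)
t2_ids = [o[0] if o else None for _, o in rows]
print(t2_ids, comparisons)
```

The number of predicate evaluations is the product of the two row counts (6 x 3 here), which is the cost that grows quickly once t1 becomes large.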
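This execution plan is clearly problematic:
1. The driven table is accessed by full table scan because its join column has no index; for every row t1 passes, t2 is fully scanned once.
2. Normally NL puts the small table first and the large table second. In an outer join, however, once NL is chosen (or the main table is determined), the main table must be the driving table. The main table can be a single table or a filtered result set, so when its result set is large, the driven table is probed many times, producing a large number of join operations and consuming a lot of resources.
Possible cases:
t1 is small and t2 is large, but t2's join column has no index, so t2 is always fully scanned;
t1 is small and t2 is small, but t2's join column has no index, so t2 is always fully scanned;
t1 is large and t2 is large, but t2's join column has no index, so t2 is always fully scanned;
t1 is large and t2 is small, but t2's join column has no index, so t2 is always fully scanned;
All of these are problematic: the join is NL, yet the driven table is fully scanned every time.
The remaining cases are those where t2's join column is indexed:
t1 is small and t2 is large, and t2's join column is indexed;
t1 is small and t2 is small, and t2's join column is indexed;
t1 is large and t2 is large, and t2's join column is indexed;
t1 is large and t2 is small, and t2's join column is indexed;
These perform better than the full-scan cases, but they still execute a large number of joins.
When t2's id column is indexed: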
SELECT T1.ID T1_ID
,T1.NAME T1_NAME
,T1.AGE T1_AGE
,T2.ID T2_ID
,T2.NAME T2_NAME
FROM T1
LEFT JOIN T2
ON (T1.ID = T2.ID OR T1.AGE = T2.ID)
ORDER BY 1;
Plan hash value: 4262054773
-------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 24 | | | |
| 1 | SORT ORDER BY | | 1 | 12 | 6 |00:00:00.01 | 24 | 2048 | 2048 | 2048 (0)|
| 2 | NESTED LOOPS OUTER | | 1 | 12 | 6 |00:00:00.01 | 24 | | | |
| 3 | TABLE ACCESS FULL | T1 | 1 | 6 | 6 |00:00:00.01 | 7 | | | |
| 4 | VIEW | | 6 | 2 | 5 |00:00:00.01 | 17 | | | |
| 5 | CONCATENATION | | 6 | | 5 |00:00:00.01 | 17 | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 6 | 1 | 4 |00:00:00.01 | 10 | | | |
|* 7 | INDEX RANGE SCAN | IDX_ID_T2_01 | 6 | 2 | 4 |00:00:00.01 | 6 | | | |
| 8 | TABLE ACCESS BY INDEX ROWID| T2 | 6 | 1 | 1 |00:00:00.01 | 7 | | | |
|* 9 | INDEX RANGE SCAN | IDX_ID_T2_01 | 6 | 2 | 1 |00:00:00.01 | 6 | | | |
-------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - access("T1"."AGE"="T2"."ID")
9 - access("T1"."ID"="T2"."ID")
filter(LNNVL("T1"."AGE"="T2"."ID"))
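Because both join conditions reference t2's id, the index is used when one exists on that column. However, the index is scanned twice, each scan followed by a table access by rowid, and the combined result is treated as a view. For every row t1 supplies, the view is scanned once, which is still a problem.
This approach has the following problems:
1. The join method is fixed: only NL can be used, whether or not the driven table's join column is indexed.
2. When the driving table is large and the driven table is small, NL is inefficient: the driven table is accessed once for every row in t1's result set.
An equivalent rewrite is needed so that the query no longer uses NL, or so that the driving table can be changed (which is impossible; as shown earlier, the driving table of an NL outer join cannot be changed).
So only an equivalent rewrite that removes the effect of the OR is considered.
The equivalent rewrite has two cases:
1. t2's id column has no duplicate values
2. t2's id column has duplicate values
When t2's id column has no duplicate values, the rewrite is as follows (thanks to Teacher Guo for the guidance):
(Here t3 stands in for t1 and t4 for t2, and t4's id column has a primary key.)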
SELECT *
FROM (SELECT T.*
,ROW_NUMBER() OVER(PARTITION BY T.T3_RID ORDER BY T.T4_ID) RN
FROM (SELECT T3.ID T3_ID
,T3.NAME T3_NAME
,T3.AGE T3_AGE
,T4.ID T4_ID
,T4.NAME T4_NAME
,T3.ROWID T3_RID
FROM T3
LEFT JOIN T4
ON T3.ID = T4.ID
UNION ALL
SELECT T3.ID T3_ID
,T3.NAME T3_NAME
,T3.AGE T3_AGE
,T4.ID T4_ID
,T4.NAME T4_NAME
,T3.ROWID T3_RID
FROM T3
LEFT JOIN T4
ON T3.AGE = T4.ID) T)
WHERE RN = 1
order by 1;
T3_ID T3_NAME T3_AGE T4_ID T4_NAME
---------- ---------- ---------- ---------- ----------
1 a 1 1 a
2 b 2 2 b
3 c 5 3 c
4 d 1 1 a
5 e 3 3 c
6 f 6
Plan hash value: 2223420675
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 7 |00:00:00.01 | 28 | | | |
| 1 | SORT ORDER BY | | 1 | 14 | 7 |00:00:00.01 | 28 | 2048 | 2048 | 2048 (0)|
|* 2 | VIEW | | 1 | 14 | 7 |00:00:00.01 | 28 | | | |
|* 3 | WINDOW SORT PUSHED RANK| | 1 | 14 | 14 |00:00:00.01 | 28 | 2048 | 2048 | 2048 (0)|
| 4 | VIEW | | 1 | 14 | 14 |00:00:00.01 | 28 | | | |
| 5 | UNION-ALL | | 1 | | 14 |00:00:00.01 | 28 | | | |
|* 6 | HASH JOIN OUTER | | 1 | 7 | 7 |00:00:00.01 | 14 | 1321K| 1321K| 891K (0)|
| 7 | TABLE ACCESS FULL | T3 | 1 | 7 | 7 |00:00:00.01 | 7 | | | |
| 8 | TABLE ACCESS FULL | T4 | 1 | 3 | 3 |00:00:00.01 | 7 | | | |
|* 9 | HASH JOIN OUTER | | 1 | 7 | 7 |00:00:00.01 | 14 | 1321K| 1321K| 891K (0)|
| 10 | TABLE ACCESS FULL | T3 | 1 | 7 | 7 |00:00:00.01 | 7 | | | |
| 11 | TABLE ACCESS FULL | T4 | 1 | 3 | 3 |00:00:00.01 | 7 | | | |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RN"=1)
3 - filter(ROW_NUMBER() OVER ( PARTITION BY "T"."T3_RID" ORDER BY "T"."T4_ID")<=1)
6 - access("T3"."ID"="T4"."ID")
9 - access("T3"."AGE"="T4"."ID")
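The execution plan shows that NL is no longer used; both joins are hash joins. The Starts column shows that every table access runs only once, so rewriting the SQL achieved the goal of turning NL into hash.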
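The logic of this rewrite can be checked on the sample data with a small Python model. This is only a sketch: the function names are hypothetical, the list position stands in for T3.ROWID, and the rows are assumed to be the six rows shown in the result above.

```python
# Assumed sample rows: T3(id, name, age), T4(id, name) with unique T4.id.
T3 = [(1, "a", 1), (2, "b", 2), (3, "c", 5), (4, "d", 1), (5, "e", 3), (6, "f", 6)]
T4 = [(1, "a"), (2, "b"), (3, "c")]


def or_join(t3, t4):
    """Reference result: LEFT JOIN ... ON (T3.ID = T4.ID OR T3.AGE = T4.ID)."""
    out = []
    for m in t3:
        hits = [o for o in t4 if m[0] == o[0] or m[2] == o[0]]
        if hits:
            out.extend((m, o) for o in hits)
        else:
            out.append((m, None))
    return out


def rewritten(t3, t4):
    """Two equality left joins + UNION ALL + ROW_NUMBER() ... RN = 1."""
    by_id = {o[0]: o for o in t4}                     # hash lookup; T4.id is unique
    union_all = []
    for rid, m in enumerate(t3):                      # rid plays the role of T3.ROWID
        union_all.append((rid, m, by_id.get(m[0])))   # branch 1: T3.ID  = T4.ID
        union_all.append((rid, m, by_id.get(m[2])))   # branch 2: T3.AGE = T4.ID
    first = {}
    for rid, m, o in union_all:
        key = (o is None, o[0] if o else 0)           # ORDER BY T4_ID, NULLs last
        if rid not in first or key < first[rid][0]:
            first[rid] = (key, m, o)                  # keep the RN = 1 row per rid
    return [(first[rid][1], first[rid][2]) for rid in sorted(first)]


same_on_sample = or_join(T3, T4) == rewritten(T3, T4)
tricky = [(1, "a", 2)]                                # id and age hit different T4 rows
print(same_on_sample, len(or_join(tricky, T4)), len(rewritten(tricky, T4)))
```

On the sample data both forms return the same rows. The last line also shows a caveat of this model: if a single T3 row matched one T4 row through ID and a different T4 row through AGE, the OR join would return two rows while the RN = 1 filter keeps only one, so the rewrite relies on the data not containing such rows.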
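When t2's id column has duplicate values, the rewrite is as follows:
(Here t01 stands in for t1 and t02 for t2; t02's id column has no primary key and contains duplicate values.)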
SELECT T01.ID T01_ID
,T01.NAME T01_NAME
,T01.AGE T01_AGE
,T02.ID T02_ID
,T02.NAME T02_NAME
FROM T01
LEFT JOIN T02
ON (T01.ID = T02.ID OR T01.AGE = T02.ID)
ORDER BY 1;
T01_ID T01_NAME T01_AGE T02_ID T02_NAME
---------- ---------- ---------- ---------- ----------
1 a 1 1 x
1 a 1 1 y
2 b 2 2 z
3 c 5 3 o
4 d 1 1 x
4 d 1 1 y
5 e 3 3 o
6 f 6
Equivalent rewrite (thanks to Teacher Liu for the guidance):
WITH TMP_A AS
(SELECT ID
,NAME
,AGE
,0 AS FLAG
FROM T01
UNION ALL
SELECT AGE
,NAME
,ID
,NULL
FROM T01
WHERE LNNVL(ID = AGE)),
TMP_B AS
(SELECT A.ID
,A.NAME
,A.AGE
,A.FLAG
,B.ID AS BID
,B.NAME AS BNAME
FROM TMP_A A
LEFT JOIN T02 B
ON A.ID = B.ID),
TMP_C AS
(SELECT NVL2(FLAG, ID, AGE) AS ID
,NAME
,NVL2(FLAG, AGE, ID) AS AGE
,BID
,BNAME
,FLAG
,DENSE_RANK() OVER(PARTITION BY NVL2(FLAG, ID, AGE), NAME, NVL2(FLAG, AGE, ID) ORDER BY NVL2(BID, 1, NULL) NULLS LAST) AS DRN
FROM TMP_B)
SELECT ID
,NAME
,AGE
,BID
,BNAME --,drn,flag
FROM TMP_C
WHERE DRN = 1
AND (FLAG IS NOT NULL OR BID IS NOT NULL)
ORDER BY 1
,2
,3
,4
,5;
ID NAME AGE BID BNAME
---------- ---------- ---------- ---------- ----------
1 a 1 1 x
1 a 1 1 y
2 b 2 2 z
3 c 5 3 o
4 d 1 1 x
4 d 1 1 y
5 e 3 3 o
6 f 6
Plan hash value: 225818522
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 8 |00:00:00.01 | 21 | 1 | | | |
| 1 | SORT ORDER BY | | 1 | 12 | 8 |00:00:00.01 | 21 | 1 | 2048 | 2048 | 2048 (0)|
|* 2 | VIEW | | 1 | 12 | 8 |00:00:00.01 | 21 | 1 | | | |
|* 3 | WINDOW SORT PUSHED RANK| | 1 | 12 | 11 |00:00:00.01 | 21 | 1 | 2048 | 2048 | 2048 (0)|
|* 4 | HASH JOIN OUTER | | 1 | 12 | 11 |00:00:00.01 | 21 | 1 | 1645K| 1645K| 1133K (0)|
| 5 | VIEW | | 1 | 9 | 9 |00:00:00.01 | 14 | 0 | | | |
| 6 | UNION-ALL | | 1 | | 9 |00:00:00.01 | 14 | 0 | | | |
| 7 | TABLE ACCESS FULL | T01 | 1 | 6 | 6 |00:00:00.01 | 7 | 0 | | | |
|* 8 | TABLE ACCESS FULL | T01 | 1 | 3 | 3 |00:00:00.01 | 7 | 0 | | | |
| 9 | TABLE ACCESS FULL | T02 | 1 | 4 | 4 |00:00:00.01 | 7 | 1 | | | |
--------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("DRN"=1 AND ("FLAG" IS NOT NULL OR "BID" IS NOT NULL)))
3 - filter(DENSE_RANK() OVER ( PARTITION BY NVL2("A"."FLAG","A"."ID","A"."AGE"),"A"."NAME",NVL2("A"."FLAG","A"."AGE",
"A"."ID") ORDER BY NVL2("B"."ID",1,NULL))<=1)
4 - access("A"."ID"="B"."ID")
8 - filter(LNNVL("ID"="AGE"))
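The execution plan shows that NL is no longer used and the join is a hash join. The Starts column shows that every table access runs only once, so rewriting the SQL achieved the goal of turning NL into hash.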
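The three WITH steps can be traced in plain Python to see why the result matches the OR join. This is a sketch under assumptions: the function names are hypothetical, the rows are taken from the result above, no NULLs appear in id or age (so `i != a` stands in for LNNVL(ID = AGE)), and the DENSE_RANK is reduced to "matched rows rank ahead of unmatched rows".

```python
from collections import defaultdict

# Assumed sample rows: T01(id, name, age), T02(id, name) with duplicate ids.
T01 = [(1, "a", 1), (2, "b", 2), (3, "c", 5), (4, "d", 1), (5, "e", 3), (6, "f", 6)]
T02 = [(1, "x"), (1, "y"), (2, "z"), (3, "o")]


def or_join(t01, t02):
    """Reference result: LEFT JOIN ... ON (T01.ID = T02.ID OR T01.AGE = T02.ID)."""
    out = []
    for i, n, a in t01:
        hits = [b for b in t02 if i == b[0] or a == b[0]] or [(None, None)]
        out.extend((i, n, a, bid, bname) for bid, bname in hits)
    return out


def rewritten(t01, t02):
    # TMP_A: original rows (flag 0) plus rows with id/age swapped (flag None).
    tmp_a = [(i, n, a, 0) for i, n, a in t01]
    tmp_a += [(a, n, i, None) for i, n, a in t01 if i != a]
    # TMP_B: a single equality left join, done with a hash table on T02.ID.
    buckets = defaultdict(list)
    for b in t02:
        buckets[b[0]].append(b)
    tmp_b = []
    for k, n, other, flag in tmp_a:
        hits = buckets.get(k) or [(None, None)]
        tmp_b.extend((k, n, other, flag, bid, bname) for bid, bname in hits)
    # TMP_C: swap id/age back so both branches share the original (id, name, age).
    tmp_c = [((k, n, other) if flag is not None else (other, n, k)) + (bid, bname, flag)
             for k, n, other, flag, bid, bname in tmp_b]
    has_match = {row[:3] for row in tmp_c if row[3] is not None}
    out = []
    for row in tmp_c:
        # DRN = 1 for matched rows, or for unmatched rows when nothing matched at all.
        drn = 1 if row[3] is not None or row[:3] not in has_match else 2
        # Drop the NULL-padded row that comes from the swapped branch.
        if drn == 1 and (row[5] is not None or row[3] is not None):
            out.append(row[:5])
    return out


def sort_key(r):
    return (r[0], r[1], r[2], r[3] is None, r[4] or "")


expected = sorted(or_join(T01, T02), key=sort_key)
actual = sorted(rewritten(T01, T02), key=sort_key)
print(expected == actual, len(actual))
```

In this model TMP_A holds 9 rows and TMP_B 11 rows, the same counts as the A-Rows values at Id 5 and Id 4 in the plan above, and the final result has the same 8 rows as the OR join.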