Oracle where条件中substr(字段,1,?)='XXX...'建议改写为like
前言
类似where中substr(name,1,2)='AB'会导致name上的普通索引一定无法使用。
案例
原SQL语句如下(经过处理):
INSERT INTO XXXX_XXXXXXXXX_TMP SELECT XXXXXXX_ID1, XXXX_ID2, XXX_ID3, XXXX_XXXXXXXXX_SEQ_NO.NEXTVAL, XXXX_TYPE1, XXXX_TYPE2, XXXX_NAME, FILE_SIZE, FILE_FLAG, SENDER, SUBSTR(RECEIVE, 1, 7) || '28' RECEIVE, XX_TIME, IN_POSITION, NULL, NULL, 0, 'Failed_HZ', SAVE_POSITION, KEYWORDS, FUNCTION_CODE FROM XXXX_XXXXXXXXX X WHERE X.SEQ_NO IN (SELECT MAX(T.SEQ_NO) SEQ_NO FROM XXXX_XXXXXXXXX T WHERE T.XX_TIME >= SYSDATE - 5 AND T.XXXX_TYPE1 = '处理' AND T.XXXX_TYPE2 = '处理处理' AND SUBSTR(T.XXXX_NAME, 1, 1) = 'F' GROUP BY UPPER(SUBSTR(T.XXXX_NAME, INSTR(T.XXXX_NAME, 'CN'), INSTR(T.XXXX_NAME, '_', -1) - 8) || '.xml') HAVING COUNT(*) = '2');
该语句跑完会导致XXXX_XXXXXXXXX 至少全表扫描1次以上。
下边是select部分执行完的执行计划信息:
Plan hash value: 733887443 ------------------------------------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:56.11 | 1260K| 1260K| | | | | 1 | SEQUENCE | XXXX_XXXXXXXXX_SEQ_NO | 1 | | 0 |00:00:56.11 | 1260K| 1260K| | | | | 2 | NESTED LOOPS | | 1 | 3 | 0 |00:00:56.11 | 1260K| 1260K| | | | | 3 | NESTED LOOPS | | 1 | 3 | 0 |00:00:56.11 | 1260K| 1260K| | | | | 4 | VIEW | VW_NSO_1 | 1 | 3 | 0 |00:00:56.11 | 1260K| 1260K| | | | |* 5 | FILTER | | 1 | | 0 |00:00:56.11 | 1260K| 1260K| | | | | 6 | HASH GROUP BY | | 1 | 3 | 4 |00:00:56.11 | 1260K| 1260K| 846K| 846K| 727K (0)| | 7 | PARTITION RANGE ITERATOR | | 1 | 227 | 4 |00:00:56.11 | 1260K| 1260K| | | | |* 8 | TABLE ACCESS FULL | XXXX_XXXXXXXXX | 1 | 227 | 4 |00:00:56.11 | 1260K| 1260K| | | | |* 9 | INDEX UNIQUE SCAN | PK_XXXX_XXXXXXXXX | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | | | 10 | TABLE ACCESS BY GLOBAL INDEX ROWID| XXXX_XXXXXXXXX | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | | ------------------------------------------------------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 5 - filter(COUNT(*)=2) 8 - filter(("T"."XXXX_TYPE2"='处理处理' AND "T"."XXXX_TYPE1"='处理' AND SUBSTR("T"."XXXX_NAME",1,1)='F' AND "T"."XX_TIME">=SYSDATE@!-5)) 9 - access("X"."SEQ_NO"="SEQ_NO")
语句执行56s左右,根据历史执行情况,执行时间在1min~2min不等。
其中,表高达62G,每次执行全表都是物理读(直接路径读)。
由于分区字段问题,除了每个月前5天,均只会访问1个分区,产生的物理读为1260000*8/1024/1024=9.7G。
而每个月前5天,由于条件为T.XX_TIME >= SYSDATE - 5,因此会导致访问2个分区,那么物理读只会更大。
另外,T.XXXX_NAME是有索引的,且第一个字符是'F'的可选择率非常小。
原语句写法为SUBSTR(T.XXXX_NAME, 1, 1) = 'F'会导致该字段上的索引无法使用。
因此建议该条件改为:AND T.FILE_NAME LIKE 'F%':
INSERT INTO FILE_TRANSFERS_TMP SELECT CUSTOMS_ID, PORT_ID, DOCK_ID, FILE_TRANSFERS_SEQ_NO.NEXTVAL, BUSI_TYPE, FILE_TYPE, FILE_NAME, FILE_SIZE, FILE_FLAG, SENDER, SUBSTR(RECEIVE, 1, 7) || '28' RECEIVE, IN_TIME, IN_POSITION, NULL, NULL, 0, 'Failed_HZ', SAVE_POSITION, KEYWORDS, FUNCTION_CODE FROM FILE_TRANSFERS X WHERE X.SEQ_NO IN (SELECT MAX(T.SEQ_NO) SEQ_NO FROM FILE_TRANSFERS T WHERE T.IN_TIME >= SYSDATE - 5 AND T.BUSI_TYPE = '平文' AND T.FILE_TYPE = '回执处理' AND T.FILE_NAME LIKE 'F%' GROUP BY UPPER(SUBSTR(T.FILE_NAME, INSTR(T.FILE_NAME, 'CN'), INSTR(T.FILE_NAME, '_', -1) - 8) || '.xml') HAVING COUNT(*) = '2');
新的语句select部分秒出,执行计划消耗信息如下:
Plan hash value: 4213418537 ------------------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.02 | 1048 | | | | | 1 | SEQUENCE | XXXX_XXXXXXXXX_SEQ_NO | 1 | | 0 |00:00:00.02 | 1048 | | | | | 2 | NESTED LOOPS | | 1 | 1 | 0 |00:00:00.02 | 1048 | | | | | 3 | NESTED LOOPS | | 1 | 1 | 0 |00:00:00.02 | 1048 | | | | | 4 | VIEW | VW_NSO_1 | 1 | 1 | 0 |00:00:00.02 | 1048 | | | | |* 5 | FILTER | | 1 | | 0 |00:00:00.02 | 1048 | | | | | 6 | HASH GROUP BY | | 1 | 1 | 4 |00:00:00.02 | 1048 | 846K| 846K| 743K (0)| |* 7 | TABLE ACCESS BY GLOBAL INDEX ROWID| XXXX_XXXXXXXXX | 1 | 18 | 4 |00:00:00.02 | 1048 | | | | |* 8 | INDEX RANGE SCAN | IDX_XXXX_NAME | 1 | 2807 | 7781 |00:00:00.01 | 112 | | | | |* 9 | INDEX UNIQUE SCAN | PK_XXXX_XXXXXXXXX | 0 | 1 | 0 |00:00:00.01 | 0 | | | | | 10 | TABLE ACCESS BY GLOBAL INDEX ROWID | XXXX_XXXXXXXXX | 0 | 1 | 0 |00:00:00.01 | 0 | | | | ------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - filter(COUNT(*)=2) 7 - filter(("T"."XXXX_TYPE"='处理处理' AND "T"."XXXX_TYPE"='处理' AND "T"."XX_TIME">=SYSDATE@!-5)) 8 - access("T"."XXXX_NAME" LIKE 'F%') filter("T"."XXXX_NAME" LIKE 'F%') 9 - access("X"."SEQ_NO"="SEQ_NO")
没有了物理读,逻辑读也仅为8.2M。
建议
一般情况下,substr(字段,1,?)='XXX...'均写为“字段 like 'XXX...%'”,为可能使用索引创造条件。