zoukankan      html  css  js  c++  java
  • 利用递归WITH子查询进行优化的实例

    利用递归WITH子查询进行优化的实例

    递归WITH子查询我已经用过不少了,不过都是当作玩具,没有在实践中用过。昨天碰到了一个实用例子。
    在一个OLTP中有一张表,主键是随着创建时间递增的。每天产生大约1万条记录,全表大约有几百万,没有分区,创建时间没有索引。
    现在要求取出最近两天的数据并且从中过滤出部分记录。过滤条件上也没有索引。
    原来的查询为全表扫描,效率太低,有没有办法改善?前提:因为原表索引够多了,不能新增索引;原查询来自JAVA程序,修改之后必须还是一个SQL, 交由JAVA程序执行。
    测试数据:
    CREATE TABLE items AS
    SELECT 3000000 - LEVEL AS item_id
    ,SYSDATE-LEVEL/10000 AS created_date
    ,TRUNC(DBMS_RANDOM.VALUE(1,101)) AS item_type
    ,SYS_GUID() AS DESCRIPTION
    FROM DUAL
    CONNECT BY LEVEL<=1000000;
    ALTER TABLE items ADD CONSTRAINT items_pk PRIMARY KEY (item_id) USING INDEX;
    EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'ITEMS');
    原表要复杂得多,现在简化为100万的一张表。查询大约相当于:
    SELECT * FROM items WHERE created_date >= TRUNC(SYSDATE)-2 AND item_type=14;
    294 rows selected.
    Elapsed: 00:00:00.90
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 446380563
    ---------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    ---------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 258 | 8772 | 1501 (1)| 00:00:19 |
    |* 1 | TABLE ACCESS FULL| ITEMS | 258 | 8772 | 1501 (1)| 00:00:19 |
    ---------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    1 - filter("ITEM_TYPE"=14 AND "CREATED_DATE">=TRUNC(SYSDATE@!)-2)
    Statistics
    ----------------------------------------------------------
    0 recursive calls
    0 db block gets
    5429 consistent gets
    5406 physical reads
    0 redo size
    13636 bytes sent via SQL*Net to client
    625 bytes received via SQL*Net from client
    21 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    294 rows processed
    例子表简化了很多, 所以只花了0.9秒,5429个一致读。实际表要胖得多,查询需要几十秒。
    改写:
    因为主键是随着时间递增的,所以“取最近的数据”可转换为:取ID最大的N条数据,如果其中最早的一条还是两天之内就增大N, 直到日期落在两天之外停止查询。
    WITH t (item_id,cnt) AS (
    SELECT max(item_id),1 FROM items ---先取最近的
    UNION ALL
    select (SELECT MIN(item_id) FROM (SELECT item_id FROM items ORDER BY item_id DESC) WHERE ROWNUM<=t.cnt+2000) ---- 跳跃取2000行之后的ID
    ,cnt+2000 ----- 当日期还在区间内则递增取ID的范围。根据每天的数据两选取合适的步长,这里定为2000
    FROM t
    WHERE (SELECT MAX(created_date) FROM items WHERE item_id=t.item_id)>=TRUNC(SYSDATE)-2) ---- 当取到的ID落在区间外则停止递归
    CYCLE item_id SET cycle_flag TO 'Y' DEFAULT 'N' ---- 虽然ID都不重复但是ORACLE会报告有循环数据,所以在这里加上CYCLE语句
    select *
    FROM items
    WHERE item_id>=(SELECT min(item_id) from t) ----- 利用前面的搜索结果
    AND item_type=14
    AND created_date >= TRUNC(SYSDATE)-2;
    294 rows selected.
    Elapsed: 00:00:00.16
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1844953721
    ---------------------------------------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    ---------------------------------------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 13 | 442 | 81 (0)| 00:00:01 |
    |* 1 | TABLE ACCESS BY INDEX ROWID | ITEMS | 13 | 442 | 72 (0)| 00:00:01 |
    |* 2 | INDEX RANGE SCAN | ITEMS_PK | 9000 | | 23 (0)| 00:00:01 |
    | 3 | SORT AGGREGATE | | 1 | 13 | | |
    | 4 | VIEW | | 2 | 26 | 9 (0)| 00:00:01 |
    | 5 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | |
    | 6 | SORT AGGREGATE | | 1 | 6 | | |
    | 7 | INDEX FULL SCAN (MIN/MAX) | ITEMS_PK | 1 | 6 | 3 (0)| 00:00:01 |
    | 8 | SORT AGGREGATE | | 1 | 13 | | |
    |* 9 | COUNT STOPKEY | | | | | |
    | 10 | VIEW | | 1000K| 12M| 2238 (1)| 00:00:27 |
    | 11 | INDEX FULL SCAN DESCENDING | ITEMS_PK | 1000K| 5859K| 2238 (1)| 00:00:27 |
    |* 12 | FILTER | | | | | |
    | 13 | RECURSIVE WITH PUMP | | | | | |
    | 14 | SORT AGGREGATE | | 1 | 14 | | |
    | 15 | TABLE ACCESS BY INDEX ROWID | ITEMS | 1 | 14 | 3 (0)| 00:00:01 |
    |* 16 | INDEX UNIQUE SCAN | ITEMS_PK | 1 | | 2 (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    1 - filter("ITEM_TYPE"=14 AND "CREATED_DATE">=TRUNC(SYSDATE@!)-2)
    2 - access("ITEM_ID">= (SELECT MIN("ITEM_ID") FROM "T" "T"))
    9 - filter(ROWNUM<=:B1+2000)
    12 - filter( (SELECT MAX("CREATED_DATE") FROM "ITEMS" "ITEMS" WHERE
    "ITEM_ID"=:B1)>=TRUNC(SYSDATE@!)-2)
    16 - access("ITEM_ID"=:B1)
    Statistics
    ----------------------------------------------------------
    0 recursive calls
    0 db block gets
    722 consistent gets
    0 physical reads
    0 redo size
    13636 bytes sent via SQL*Net to client
    625 bytes received via SQL*Net from client
    21 SQL*Net roundtrips to/from client
    15 sorts (memory)
    0 sorts (disk)
    294 rows processed
    主键索引用得上了,一致读下降为722。

  • 相关阅读:
    C语言寒假大作战04
    C语言寒假大作战03
    C语言寒假大作战02
    C语言寒假大作战01
    C语言I作业12—学期总结
    C语言I博客作业11
    C语言I博客作业10
    C语言I博客作业09
    C语言I博客作业08
    c语言||作业01
  • 原文地址:https://www.cnblogs.com/tracy/p/2082914.html
Copyright © 2011-2022 走看看