zoukankan      html  css  js  c++  java
  • [Oracle 9i] Subquery Factoring in 9i (用With语句做公共子查询提取)

    Subquery Factoring, 其实就是平常比较熟悉的With语句,Adrian Billington 在他的网站上写了一篇介绍Subquery Factoring很好的文章,见这里。这篇Blog同样是对他的这篇文章的笔记。

    With语句一般有两种用途,一种就是用来把复杂的SQL语句简单化:复杂的SQL语句一般都会嵌套很多层次,无论是写起来还是读起来都很困难,通过用With语句,把子查询语句抽取出来,这样可以使得SQL语句“扁平化”,写起来会很方便,读起来也很容易。 With的这种用法其实就是相当于把复杂语句中的inline-view给提取出来作为一个单独的查询语句,并赋予一个名字,这样用起来就跟访问一个视图一样。

    With的另外一种用处是可以用来优化SQL语句,如果一个复杂的SQL语句需要重复访问一张表(最好是数据量比较大的表),这个时候如果用With把这部分需要重复访问底层表的SQL语句提取出来(Oracle往往会把这部分数据“物化”到一个临时表中),之后就不用重复多次访问底层表,从而可以提升SQL语句的执行效率。

    下面主要看看With在SQL优化方面的一个例子,(注意, 我得到的结果跟Adrian很不一样, 我用的是10g来run他的例子的!)

    首先创建测试用例:

    SQL> show user
    USER is "FRANK"
    SQL> CREATE TABLE sales
      2  NOLOGGING
      3  AS
      4     SELECT al.owner    AS region
      5     ,      al.object_type AS product
      6     ,      al.object_id   AS order_amt
      7     FROM   all_objects al
      8     ,      all_objects a2
      9     WHERE  ROWNUM <= 1000000;
     
    Table created.
     
    SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER, 'sales');
     
    PL/SQL procedure successfully completed.

    首先看看不用With语句的效果,

    SQL> set autotrace on
    SQL> set timing on
    SQL>
    SQL> SELECT region
      2  ,      total_sales
      3  FROM   (
      4          SELECT region
      5          ,      NVL(SUM(order_amt), 0) AS total_sales
      6          FROM   sales
      7          GROUP  BY
      8                 region
      9          ) ilv
     10  WHERE  total_sales > ( SELECT SUM(order_amt)/3 AS one_third_sales
     11                         FROM  sales);
     
    REGION                         TOTAL_SALES
    ------------------------------ -----------
    SYS                               27760438
     
    Elapsed: 00:00:01.68
     
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 302444457
     
    -----------------------------------------------------------------------------
    | Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT    |       |     1 |     7 |   753  (26)| 00:00:10 |
    |*  1 |  FILTER             |       |       |       |            |          |
    |   2 |   HASH GROUP BY     |       |     1 |     7 |   753  (26)| 00:00:10 |
    |   3 |    TABLE ACCESS FULL| SALES |  1006K|  6877K|   597   (7)| 00:00:08 |
    |   4 |   SORT AGGREGATE    |       |     1 |     3 |            |          |
    |   5 |    TABLE ACCESS FULL| SALES |  1006K|  2947K|   597   (7)| 00:00:08 |
    -----------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       1 - filter(NVL(SUM("ORDER_AMT"),0)> (SELECT SUM("ORDER_AMT")/3 FROM
                  "SALES" "SALES"))
     
     
    Statistics
    ----------------------------------------------------------
              1  recursive calls
              0  db block gets
           5034  consistent gets
             18  physical reads
              0  redo size
            477  bytes sent via SQL*Net to client
            396  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
     
    SQL>

    从执行计划可以看到,确实出现了两次访问底层表sales!

    这条SQL语句的执行时间是1.68秒,cost 是753, 物理I/O为18.  (这跟Adrian的测试结果差了很多,可能是由于数据量的问题)

    然后再看看用With 语句把重复访问表的部分抽出来作为临时表的效果,

    SQL> WITH region_sales AS
      2  (  SELECT region
      3     ,      NVL(SUM(order_amt), 0) AS total_sales
      4     FROM   sales
      5     GROUP  BY
      6            region
      7  )
      8  SELECT region
      9  ,      total_sales
     10  FROM   region_sales
     11  WHERE  total_sales > (SELECT SUM(total_sales)/3 AS one_third_sales
     12                        FROM   region_sales);
     
    REGION                         TOTAL_SALES
    ------------------------------ -----------
    SYS                               27760438
     
    Elapsed: 00:00:02.26
     
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1988045888
     
    --------------------------------------------------------------------------------------------------------
    | Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT           |                           |     1 |    30 |   757  (26)| 00:00:10 |
    |   1 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |          |
    |   2 |   LOAD AS SELECT           |                           |       |       |            |          |
    |   3 |    HASH GROUP BY           |                           |     1 |     7 |   753  (26)| 00:00:10 |
    |   4 |     TABLE ACCESS FULL      | SALES                     |  1006K|  6877K|   597   (7)| 00:00:08 |
    |*  5 |   VIEW                     |                           |     1 |    30 |     2   (0)| 00:00:01 |
    |   6 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6612_10B3A7 |     1 |     7 |     2   (0)| 00:00:01 |
    |   7 |    SORT AGGREGATE          |                           |     1 |    13 |            |          |
    |   8 |     VIEW                   |                           |     1 |    13 |     2   (0)| 00:00:01 |
    |   9 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6612_10B3A7 |     1 |     7 |     2   (0)| 00:00:01 |
    --------------------------------------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       5 - filter("TOTAL_SALES"> (SELECT SUM("TOTAL_SALES")/3 FROM  (SELECT /*+ CACHE_TEMP_TABLE
                  ("T1") */ "C0" "REGION","C1" "TOTAL_SALES" FROM "SYS"."SYS_TEMP_0FD9D6612_10B3A7" "T1")
                  "REGION_SALES"))
     
     
    Statistics
    ----------------------------------------------------------
            102  recursive calls
             11  db block gets
           2540  consistent gets
             43  physical reads
           1472  redo size
            477  bytes sent via SQL*Net to client
            396  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
     
    SQL>

    从执行计划可以看出,这次是访问一次底层表(创建临时表),然后是两次对临时表的访问(代价很明显小多了)

    时间花销是2.26秒, 代价是757,物理I/O为43,每项数据都比不用With要多! (这个跟Adrian的差得太远了!)这也说明了一点,With语句并不一定会提升性能,从With语句的执行计划中可以看出有一步叫Temp Table Transformation,Oracle会创建一个global temporary table, 并从底层表sales 装载数据到这个临时表中,这些从执行计划都可以看到。这一步带来的代价是很高的(753),所以如果直接访问底层多次的代价比创建临时表的代价还要小的话,用With语句就没有多大价值了。相反,如果底层表的数据量很大,多次访问地层表的代价就很可能超过一次创建临时表的代价,这样With语句的优势就显现出来了!

    还有一点要注意的是,用With语句并不代表Oracle一定会创建临时表,因为查询优化器很“聪明”,它会分析值不值得这么做。这个时候,如果我们想明确告诉Oracle,我们需要它来创建这样一个临时表,我们可以用一个hint – materialize .  但是由于这是一个没有官方文档说明的hint, 所以很难保证这个hint会一直起作用,所以用这个hint也要很小心才行!

     

    关于With语句用法的一些限制性,Adrian说得很清楚,可以参考他的文章

  • 相关阅读:
    转:每个架构师都应该研究下康威定律
    使用OpenShiftFQ上外网
    关系模式设计
    数据库应用系统工程过程
    数据库系统
    四种常见 Git 工作流比较
    Git 进阶指南
    C#高性能TCP服务的多种实现方式
    浮动广告
    <span></span>
  • 原文地址:https://www.cnblogs.com/fangwenyu/p/1639527.html
Copyright © 2011-2022 走看看