  • Partition Pruning in Data Warehouses

    Partition Pruning

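    Partition pruning is a very effective performance feature in data warehouses. During pruning, the optimizer analyzes the FROM and WHERE clauses of a SQL statement and eliminates the partitions the query does not need. This can greatly reduce the amount of data read from disk, which shortens run time, improves query performance, and avoids wasted resources. Pruning also works on indexes: even when an index is partitioned differently from its table (a global partitioned index), Oracle can eliminate the index partitions that are not needed.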

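    Which kind of pruning applies depends on the SQL statement. Oracle has two kinds: static pruning and dynamic pruning. Static pruning happens at compile time: when the execution plan is built, Oracle already knows which partitions will be accessed. Dynamic pruning happens at run time: the partitions to access only become known while the statement executes. A typical dynamic case is a WHERE clause in which a function or subquery returns the partition key value.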

    Information That Can Be Used for Partition Pruning

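    Oracle prunes partitions when you use range, LIKE, equality, and IN-list predicates on the partitioning columns of range- or list-partitioned objects, and when you use equality and IN-list predicates on the partitioning columns of hash-partitioned objects.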
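The difference between the two predicate lists can be sketched with a toy model in Python. This is not Oracle's implementation: the partition count `HASH_PARTITIONS`, the helper names, and the modulo hash are all assumptions made for illustration (Oracle's real hash function is internal). The sketch only shows why equality and IN-list predicates can narrow a hash-partitioned object while a range predicate cannot.

```python
HASH_PARTITIONS = 4  # hypothetical partition count


def bucket(key):
    # Stand-in for the database's internal hash function (an assumption).
    return key % HASH_PARTITIONS


def prune_hash(op, operand):
    """Return the set of hash partitions a predicate on the key may touch."""
    if op == "=":
        return {bucket(operand)}
    if op == "in":
        return {bucket(value) for value in operand}
    # Range and LIKE predicates: hashing scatters neighbouring keys across
    # partitions, so every partition remains a candidate.
    return set(range(HASH_PARTITIONS))


print(prune_hash("=", 1200))
print(prune_hash("in", [1, 5, 9]))
print(prune_hash("between", (1, 2)))
```

With range or list partitioning the partition boundaries follow the key order, which is why range and LIKE predicates can prune there as well.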

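    For composite-partitioned objects, Oracle can prune at each partitioning level. In the SQL below, for example, the table sales_range_hash is range partitioned on s_saledate and hash subpartitioned on s_productid: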

    CREATE TABLE sales_range_hash(

    s_productid NUMBER,

    s_saledate   DATE,

    s_custid   NUMBER,

    s_totalprice NUMBER)

    PARTITION BY RANGE (s_saledate)

    SUBPARTITION BY HASH (s_productid) SUBPARTITIONS 8

    (PARTITION sal99q1 VALUES LESS THAN

      (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),

    PARTITION sal99q2 VALUES LESS THAN

      (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),

    PARTITION sal99q3 VALUES LESS THAN

      (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),

    PARTITION sal99q4 VALUES LESS THAN

      (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')));



    SELECT * FROM sales_range_hash

    WHERE s_saledate BETWEEN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY'))

    AND (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')) AND s_productid = 1200;



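    Oracle prunes as follows:

    At the range level, Oracle accesses only partitions sal99q3 and sal99q4. VALUES LESS THAN bounds are exclusive and BETWEEN is inclusive, so 01-JUL-1999 falls in sal99q3 and 01-OCT-1999 falls in sal99q4.
    At the hash level, Oracle accesses within each of those partitions only the subpartition that s_productid = 1200 hashes to.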
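The two pruning levels can be reproduced with a small Python model of the table definition above. It is a sketch under stated assumptions, not Oracle's algorithm: the partition names and bounds are copied from the CREATE TABLE statement, while the helper names and the modulo hash are hypothetical (Oracle's hash function is internal, so the subpartition slot printed here is illustrative only).

```python
from bisect import bisect_right
from datetime import date

# Exclusive upper bounds, copied from the CREATE TABLE statement.
PARTITIONS = [
    ("sal99q1", date(1999, 4, 1)),
    ("sal99q2", date(1999, 7, 1)),
    ("sal99q3", date(1999, 10, 1)),
    ("sal99q4", date(2000, 1, 1)),
]
BOUNDS = [upper for _, upper in PARTITIONS]
SUBPARTITIONS = 8


def partition_index(d):
    """Index of the range partition holding d (VALUES LESS THAN is exclusive)."""
    i = bisect_right(BOUNDS, d)
    if i == len(BOUNDS):
        raise ValueError(f"{d} is above the highest partition bound")
    return i


def prune_range(low, high):
    """Partitions that can hold rows with low <= s_saledate <= high."""
    first, last = partition_index(low), partition_index(high)
    return [name for name, _ in PARTITIONS[first:last + 1]]


def prune_hash(product_id):
    # Stand-in hash (an assumption); only the idea "one slot per key" matters.
    return product_id % SUBPARTITIONS


range_parts = prune_range(date(1999, 7, 1), date(1999, 10, 1))
subpart = prune_hash(1200)
print(range_parts, "subpartition slot", subpart)
```

In this model the inclusive BETWEEN range maps to sal99q3 and sal99q4, because a value equal to a VALUES LESS THAN bound belongs to the next partition. Only one of the eight subpartitions in each of those two partitions has to be read.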

    How to Identify Whether Partition Pruning has been Used

    EXPLAIN PLAN shows whether partition pruning took effect. Check the PSTART (PARTITION_START) and PSTOP (PARTITION_STOP) columns of the plan table.

    Static Partition Pruning

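    In most cases Oracle determines at compile time which partitions will be accessed. Static pruning occurs whenever you use static predicates, except in the following cases:

    The pruning condition comes from the result of a subquery
    The optimizer rewrites the query with a star transformation, and pruning occurs after the transformation
    The most efficient execution plan is a nested loop

    These three cases result in dynamic pruning.

    Consider the following example: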

    SQL> explain plan for select * from sales where time_id = to_date('01-jan-2001', 'dd-mon-yyyy');

    Explained.



    SQL> select * from table(dbms_xplan.display);

    PLAN_TABLE_OUTPUT

    ----------------------------------------------------------------------------------------------

    Plan hash value: 3971874201

    ----------------------------------------------------------------------------------------------

    | Id | Operation         | Name | Rows | Bytes | Cost (%CPU)| Time   | Pstart| Pstop |

    ----------------------------------------------------------------------------------------------

    | 0 | SELECT STATEMENT     |     | 673 | 19517 | 27     (8)| 00:00:01 |     |     |

    | 1 | PARTITION RANGE SINGLE|     | 673 | 19517 | 27     (8)| 00:00:01 | 17   | 17   |

    |* 2 |   TABLE ACCESS FULL   | SALES | 673 | 19517 | 27     (8)| 00:00:01 | 17   | 17   |

    ----------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

      2 - filter("TIME_ID"=TO_DATE('2001-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

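    The plan shows that Oracle accesses only partition number 17 (see the PSTART and PSTOP columns). There is one exception: for a full table scan on an interval-partitioned table, the plan shows PSTART as 1 and PSTOP as 1048575, regardless of how many partitions actually exist.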
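Reading Pstart and Pstop off a plan can also be scripted. The Python sketch below parses the pipe-delimited rows of the serial plan shown above; the function names are hypothetical, and it assumes Pstart and Pstop are the last two columns, which holds for this plan but not for a parallel plan with extra trailing columns.

```python
PLAN_ROWS = [
    "| 0 | SELECT STATEMENT     |     | 673 | 19517 | 27     (8)| 00:00:01 |     |     |",
    "| 1 | PARTITION RANGE SINGLE|     | 673 | 19517 | 27     (8)| 00:00:01 | 17   | 17   |",
    "|* 2 |   TABLE ACCESS FULL   | SALES | 673 | 19517 | 27     (8)| 00:00:01 | 17   | 17   |",
]


def pruning_info(row):
    """Return (operation, pstart, pstop) from one pipe-delimited plan row."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    # Assumption: Pstart and Pstop are the last two columns of the row.
    return cells[1], cells[-2], cells[-1]


def describe(pstart, pstop):
    """Summarise what a Pstart/Pstop pair says about pruning."""
    if not pstart and not pstop:
        return "no partition information on this step"
    if pstart.isdigit() and pstop.isdigit():
        return f"static: partitions {pstart}..{pstop} fixed at compile time"
    return f"resolved at run time ({pstart}..{pstop})"


for row in PLAN_ROWS:
    operation, pstart, pstop = pruning_info(row)
    print(f"{operation:24} {describe(pstart, pstop)}")
```

Numeric values mean the partition range was fixed when the plan was built. Any non-numeric marker means the range is only resolved when the statement runs.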

    Dynamic Partition Pruning

    Dynamic pruning occurs when static pruning is not possible. The following cases are typical:

    Dynamic Pruning with Bind Variables

    Statements that use bind variables against the partitioning column result in dynamic pruning. For example:

    SQL> explain plan for select * from sales s where time_id in ( :a, :b, :c, :d);

    Explained.



    SQL> select * from table(dbms_xplan.display);

    PLAN_TABLE_OUTPUT

    ---------------------------------------------------------------------------------------------------

    Plan hash value: 513834092

    ---------------------------------------------------------------------------------------------------

    | Id | Operation                 |   Name |Rows|Bytes|Cost (%CPU)| Time | Pstart| Pstop|

    ---------------------------------------------------------------------------------------------------

    | 0 | SELECT STATEMENT             |       |2517|72993|   292 (0)|00:00:04|     |     |

    | 1 | INLIST ITERATOR             |       |   |   |       |     |     |     |

    | 2 |   PARTITION RANGE ITERATOR     |       |2517|72993|   292 (0)|00:00:04|KEY(I) |KEY(I)|

    | 3 |   TABLE ACCESS BY LOCAL INDEX ROWID| SALES |2517|72993|   292 (0)|00:00:04|KEY(I) |KEY(I)|

    | 4 |   BITMAP CONVERSION TO ROWIDS   |       |   |   |       |     |     |     |

    |* 5 |     BITMAP INDEX SINGLE VALUE   |SALES_TIME_BIX| |   |       |     |KEY(I) |KEY(I)|

    ---------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

    5 - access("TIME_ID"=:A OR "TIME_ID"=:B OR "TIME_ID"=:C OR "TIME_ID"=:D)

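    For parallel execution plans, only the partition start and stop columns (PSTART and PSTOP) carry the pruning information; the Operation column shows the parallel operation instead, as in the following example: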



    SQL> explain plan for select * from sales where time_id in (:a, :b, :c, :d);

    Explained.



    SQL> select * from table(dbms_xplan.display);

    PLAN_TABLE_OUTPUT

    -------------------------------------------------------------------------------------------------

    Plan hash value: 4058105390

    -------------------------------------------------------------------------------------------------

    | Id| Operation       | Name |Rows|Bytes|Cost(%CP| Time |Pstart| Pstop| TQ |INOUT| PQ Dis|

    -------------------------------------------------------------------------------------------------

    | 0| SELECT STATEMENT   |     |2517|72993| 75(36)|00:00:01|     |     |   |   |     |

    | 1| PX COORDINATOR   |     |   |   |     |     |     |     |   |   |     |

    | 2| PX SEND QC(RANDOM)|:TQ10000|2517|72993| 75(36)|00:00:01|     |     |Q1,00| P->S|QC(RAND|

    | 3|   PX BLOCK ITERATOR|     |2517|72993| 75(36)|00:00:01|KEY(I)|KEY(I)|Q1,00| PCWC|     |

    |* 4|   TABLE ACCESS FULL| SALES |2517|72993| 75(36)|00:00:01|KEY(I)|KEY(I)|Q1,00| PCWP|     |

    -------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

    4 - filter("TIME_ID"=:A OR "TIME_ID"=:B OR "TIME_ID"=:C OR "TIME_ID"=:D)
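Why the bind-variable plans show KEY(I) instead of partition numbers can be sketched in Python. This is a toy model under stated assumptions: the quarterly bounds in `BOUNDS` are hypothetical and are not the real SALES layout, and `compile_plan` and `execute` are illustrative names, not Oracle APIs.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical exclusive upper bounds for four quarterly partitions.
BOUNDS = [date(2000, 4, 1), date(2000, 7, 1), date(2000, 10, 1), date(2001, 1, 1)]


def compile_plan(bind_names):
    """At compile time only the bind names exist, so Pstart/Pstop stay symbolic."""
    return {"binds": list(bind_names), "pstart": "KEY(I)", "pstop": "KEY(I)"}


def execute(plan, bind_values):
    """At run time each in-list value maps to the one partition that can hold it."""
    touched = []
    for name in plan["binds"]:
        part = bisect_right(BOUNDS, bind_values[name]) + 1  # 1-based number
        if part > len(BOUNDS):
            continue  # above the highest bound: no partition can hold the value
        if part not in touched:
            touched.append(part)
    return touched


plan = compile_plan(["a", "b", "c", "d"])
parts = execute(plan, {
    "a": date(2000, 1, 15), "b": date(2000, 2, 1),
    "c": date(2000, 8, 20), "d": date(2000, 8, 21),
})
print(plan["pstart"], parts)
```

The same plan can be reused with different bind values and will touch different partitions each time, which is why the partition numbers cannot appear in the plan itself.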

    Dynamic Pruning with Subqueries

    Statements that use a subquery against the partitioning column also result in dynamic pruning. For example:

    SQL> explain plan for select sum(amount_sold) from sales where time_id in

      (select time_id from times where fiscal_year = 2000);

    Explained.



    SQL> select * from table(dbms_xplan.display);

    PLAN_TABLE_OUTPUT

    ----------------------------------------------------------------------------------------------------

    Plan hash value: 3827742054



    ----------------------------------------------------------------------------------------------------

    | Id | Operation             | Name | Rows | Bytes | Cost (%CPU)| Time   | Pstart| Pstop |

    ----------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT       |     |   1 |   25 |   523   (5)| 00:00:07 |     |     |

    |   1 | SORT AGGREGATE         |     |   1 |   25 |         |       |     |     |

    |* 2 |   HASH JOIN           |     |   191K| 4676K|   523   (5)| 00:00:07 |     |     |

    |* 3 |   TABLE ACCESS FULL     | TIMES |   304 | 3648 |   18   (0)| 00:00:01 |     |     |

    |   4 |   PARTITION RANGE SUBQUERY|     |   918K|   11M|   498   (4)| 00:00:06 |KEY(SQ)|KEY(SQ)|

    |   5 |   TABLE ACCESS FULL     | SALES |   918K|   11M|   498   (4)| 00:00:06 |KEY(SQ)|KEY(SQ)|

    ----------------------------------------------------------------------------------------------------



    Predicate Information (identified by operation id):

    ---------------------------------------------------



      2 - access("TIME_ID"="TIME_ID")

      3 - filter("FISCAL_YEAR"=2000)
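The idea behind KEY(SQ) is that the subquery is evaluated first and its result decides which partitions of the large table are read. The Python sketch below models that order of operations. All data, the yearly `BOUNDS`, and the function names are hypothetical; it illustrates the principle and does not describe how Oracle executes the plan internally.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical exclusive upper bounds: partition 1 = before 2000, 2 = 2000, 3 = 2001.
BOUNDS = [date(2000, 1, 1), date(2001, 1, 1), date(2002, 1, 1)]

# Hypothetical dimension rows (times) and fact rows (sales) stored per partition.
TIMES = [
    {"time_id": date(1999, 12, 31), "fiscal_year": 1999},
    {"time_id": date(2000, 3, 1), "fiscal_year": 2000},
    {"time_id": date(2000, 9, 15), "fiscal_year": 2000},
    {"time_id": date(2001, 2, 1), "fiscal_year": 2001},
]
SALES = {
    1: [(date(1999, 12, 31), 10.0)],
    2: [(date(2000, 3, 1), 20.0), (date(2000, 9, 15), 5.0), (date(2000, 5, 5), 7.0)],
    3: [(date(2001, 2, 1), 40.0)],
}


def sum_amount_sold(fiscal_year):
    """Run the subquery first, then scan only the partitions it points to."""
    wanted = {t["time_id"] for t in TIMES if t["fiscal_year"] == fiscal_year}
    parts = sorted({bisect_right(BOUNDS, d) + 1 for d in wanted})
    total = sum(amount for p in parts for d, amount in SALES[p] if d in wanted)
    return total, parts


print(sum_amount_sold(2000))
```

Running the subquery costs something, so this form of pruning only pays off when the dimension table is small compared with the partitions it allows the scan to skip.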

    Dynamic Pruning with Star Transformation

    An example of dynamic pruning combined with a star transformation:

    SQL> explain plan for select p.prod_name, t.time_id, sum(s.amount_sold)

      from sales s, times t, products p

      where s.time_id = t.time_id and s.prod_id = p.prod_id and t.fiscal_year = 2000

      and t.fiscal_week_number = 3 and p.prod_category = 'Hardware'

      group by t.time_id, p.prod_name;

    Explained.



    SQL> select * from table(dbms_xplan.display);

    PLAN_TABLE_OUTPUT

    ------------------------------------------------------------------------------------------------------

    Plan hash value: 4020965003



    ------------------------------------------------------------------------------------------------------

    | Id | Operation                   | Name           | Rows | Bytes | Pstart| Pstop |

    ------------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT               |               |   1 |   79 |     |     |

    |   1 | HASH GROUP BY                 |               |   1 |   79 |     |     |

    |* 2 |   HASH JOIN                   |               |   1 |   79 |     |     |

    |* 3 |   HASH JOIN                 |               |   2 |   64 |     |     |

    |* 4 |   TABLE ACCESS FULL           | TIMES           |   6 |   90 |     |     |

    |   5 |   PARTITION RANGE SUBQUERY       |               |   587 | 9979 |KEY(SQ)|KEY(SQ)|

    |   6 |     TABLE ACCESS BY LOCAL INDEX ROWID| SALES           |   587 | 9979 |KEY(SQ)|KEY(SQ)|

    |   7 |     BITMAP CONVERSION TO ROWIDS   |               |     |     |     |     |

    |   8 |     BITMAP AND               |               |     |     |     |     |

    |   9 |       BITMAP MERGE             |               |     |     |     |     |

    | 10 |       BITMAP KEY ITERATION       |               |     |     |     |     |

    | 11 |       BUFFER SORT           |               |     |     |     |     |

    |* 12 |         TABLE ACCESS FULL       | TIMES           |   6 |   90 |     |     |

    |* 13 |       BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX     |     |     |KEY(SQ)|KEY(SQ)|

    | 14 |       BITMAP MERGE             |               |     |     |     |     |

    | 15 |       BITMAP KEY ITERATION       |               |     |     |     |     |

    | 16 |       BUFFER SORT           |               |     |     |     |     |

    | 17 |         TABLE ACCESS BY INDEX ROWID| PRODUCTS         |   14 |   658 |     |     |

    |* 18 |         INDEX RANGE SCAN       | PRODUCTS_PROD_CAT_IX |   14 |     |     |     |

    |* 19 |       BITMAP INDEX RANGE SCAN   | SALES_PROD_BIX     |     |     |KEY(SQ)|KEY(SQ)|

    | 20 |   TABLE ACCESS BY INDEX ROWID     | PRODUCTS         |   14 |   658 |     |     |

    |* 21 |   INDEX RANGE SCAN             | PRODUCTS_PROD_CAT_IX |   14 |     |     |     |

    ------------------------------------------------------------------------------------------------------



    Predicate Information (identified by operation id):

    ---------------------------------------------------



      2 - access("S"."PROD_ID"="P"."PROD_ID")

      3 - access("S"."TIME_ID"="T"."TIME_ID")

      4 - filter("T"."FISCAL_WEEK_NUMBER"=3 AND "T"."FISCAL_YEAR"=2000)

    12 - filter("T"."FISCAL_WEEK_NUMBER"=3 AND "T"."FISCAL_YEAR"=2000)

    13 - access("S"."TIME_ID"="T"."TIME_ID")

    18 - access("P"."PROD_CATEGORY"='Hardware')

    19 - access("S"."PROD_ID"="P"."PROD_ID")

    21 - access("P"."PROD_CATEGORY"='Hardware')



    Note

    -----

      - star transformation used for this statement

    Dynamic Pruning with Nested Loop Joins

    An example of dynamic pruning with a nested loop join:

    SQL> explain plan for select t.time_id, sum(s.amount_sold)

      from sales s, times t

      where s.time_id = t.time_id and t.fiscal_year = 2000 and t.fiscal_week_number = 3

      group by t.time_id;

    Explained.



    SQL> select * from table(dbms_xplan.display);

    PLAN_TABLE_OUTPUT

    ----------------------------------------------------------------------------------------------------

    Plan hash value: 50737729



    ----------------------------------------------------------------------------------------------------

    | Id | Operation             | Name | Rows | Bytes | Cost (%CPU)| Time   | Pstart| Pstop |

    ----------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT       |     |   6 |   168 |   126   (4)| 00:00:02 |     |     |

    |   1 | HASH GROUP BY         |     |   6 |   168 |   126   (4)| 00:00:02 |     |     |

    |   2 |   NESTED LOOPS         |     | 3683 |   100K|   125   (4)| 00:00:02 |     |     |

    |* 3 |   TABLE ACCESS FULL     | TIMES |   6 |   90 |   18   (0)| 00:00:01 |     |     |

    |   4 |   PARTITION RANGE ITERATOR|     |   629 | 8177 |   18   (6)| 00:00:01 |   KEY |   KEY |

    |* 5 |   TABLE ACCESS FULL     | SALES |   629 | 8177 |   18   (6)| 00:00:01 |   KEY |   KEY |

    ----------------------------------------------------------------------------------------------------



    Predicate Information (identified by operation id):

    ---------------------------------------------------



      3 - filter("T"."FISCAL_WEEK_NUMBER"=3 AND "T"."FISCAL_YEAR"=2000)

      5 - filter("S"."TIME_ID"="T"."TIME_ID")
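In the nested loop plan above, Pstart and Pstop show KEY because the partition to probe is worked out again for every row coming from the outer table. A Python sketch of that loop follows. The rows, the quarterly `BOUNDS`, and the function name are hypothetical, and the model ignores everything a real executor does beyond choosing the partition.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical exclusive upper bounds for four quarterly partitions.
BOUNDS = [date(2000, 4, 1), date(2000, 7, 1), date(2000, 10, 1), date(2001, 1, 1)]

TIMES = [
    {"time_id": date(2000, 1, 17), "fiscal_year": 2000, "fiscal_week_number": 3},
    {"time_id": date(2000, 1, 18), "fiscal_year": 2000, "fiscal_week_number": 3},
    {"time_id": date(2000, 6, 1), "fiscal_year": 2000, "fiscal_week_number": 22},
]
SALES = {
    1: [(date(2000, 1, 17), 100.0), (date(2000, 1, 17), 50.0),
        (date(2000, 1, 18), 25.0), (date(2000, 2, 2), 9.0)],
    2: [(date(2000, 6, 1), 70.0)],
}


def nested_loop_sum(fiscal_year, fiscal_week):
    """For each qualifying outer row, probe only the partition its key maps to."""
    totals, probes = {}, []
    for t in TIMES:  # outer row source
        if (t["fiscal_year"], t["fiscal_week_number"]) != (fiscal_year, fiscal_week):
            continue
        part = bisect_right(BOUNDS, t["time_id"]) + 1  # KEY resolved per outer row
        probes.append(part)
        for sale_date, amount in SALES.get(part, []):  # inner row source
            if sale_date == t["time_id"]:
                totals[sale_date] = totals.get(sale_date, 0.0) + amount
    return totals, probes


totals, probes = nested_loop_sum(2000, 3)
print(totals, probes)
```

Each outer row triggers one probe of exactly one partition, so the total work depends on how many outer rows qualify.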

  • Original article: https://www.cnblogs.com/sopost/p/2190064.html