zoukankan      html  css  js  c++  java
  • 10g中HASH GROUP BY引起的临时表空间不足

    今天早上应用人员反映一个原本在9i上可以顺利完成的CTAS脚本,迁移到10g后运行总是报“ORA-1652: unable to extend temp segment by 128 in tablespace TS_HQY1_TEMP "无法扩展临时表空间的错误。应用人员表示该脚本涉及的数据量在迁移前后变化不大,而且令人匪夷所思的是在新的10g库上临时表空间大小已达40多个G,要远大于原9i库。很显然这不是由于临时表空间过小导致的该问题,更多的原因肯定是出在迁移后Oracle不同的行为方式上。 该脚本每月执行一次用以汇总数据,其中一个单表接近4亿行记录,GROUP BY操作涉及到的数据量十分庞大。我们来具体看一下这个SQL:
    create table gprs_bill.zou_201007_cell_id as
    select /* g_all_cdr01,60 */
     calling_num mobile_number,
     lac,
     lpad(cell_id, 5, '0') cell_id,
     count(*) c,
     sum(call_duration) call_duration,
     sum(decode(record_type, '00', 1, 0) * call_duration) moc_call_duration,
     sum(decode(record_type, '01', 1, 0) * call_duration) mtc_call_duarion
      from gprs_bill.g_all_cdr01
     where substr(calling_num, 1, 7) in
           (select mobile_prefix from gprs_bill.zou_mobile_prefix)
     group by calling_num, lac, lpad(cell_id, 5, '0');
    
    SQL> set autotrace traceonly exp
    SQL> select /* g_all_cdr01,60 */
      2  calling_num mobile_number,
      3  lac,
      4  lpad(cell_id,5,'0') cell_id,
      5  count(*) c,
      6  sum(call_duration) call_duration,
      7  sum(decode(record_type,'00',1,0)*call_duration) moc_call_duration,
      8  sum(decode(record_type,'01',1,0)*call_duration) mtc_call_duarion
      9  from  gprs_bill.g_all_cdr01
     10  where substr(calling_num,1,7) in (select mobile_prefix from gprs_bill.zou_mobile_prefix)
     11  group by
     12  calling_num ,
     13  lac,
     14  lpad(cell_id,5,'0');
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 212866585
    
    --------------------------------------------------------------------------------
    -------------------
    
    | Id  | Operation             | Name              | Rows  | Bytes |TempSpc| Cost
     (%CPU)| Time     |
    
    --------------------------------------------------------------------------------
    -------------------
    
    |   0 | SELECT STATEMENT      |                   |   229K|  9880K|       |  103
    3K  (3)| 03:26:41 |
    
    |   1 |  HASH GROUP BY        |                   |   229K|  9880K|    22M|  103
    3K  (3)| 03:26:41 |
    
    |*  2 |   HASH JOIN RIGHT SEMI|                   |   229K|  9880K|       |  103
    0K  (3)| 03:26:10 |
    
    |   3 |    TABLE ACCESS FULL  | ZOU_MOBILE_PREFIX |  1692 | 13536 |       |    1
    1   (0)| 00:00:01 |
    
    |   4 |    TABLE ACCESS FULL  | G_ALL_CDR01       |   388M|    13G|       |  102
    6K  (2)| 03:25:21 |
    
    --------------------------------------------------------------------------------
    -------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("MOBILE_PREFIX"=SUBSTR("CALLING_NUM",1,7))
    可以看到Oracle使用了HASH GROUP BY 算法以实现数据分组;HASH算法是10g中新引入的分组算法。 下面我们来详细介绍下10g中数据分组的改动: 在10g中GROUP BY操作仍将引发排序操作,但10g中引入了新的算法,这些算法都不保证返回的数据行有序排列;在10g中如果想保证"GROUP BY"后返回的数据有序排列则需要强制使用"ORDER BY"子句,这点和9i是截然不同的。若你没有指定"ORDER BY"子句,则不能保证返回的结果正确排序。 在10g中"GROUP BY"子句更倾向于使用一种HASH算法而非原先的SORT算法来分组数据,HASH算法的CPU COST要低于原先的SORT算法。但这2种算法在10g中都不保证返回数据正常排序,当采用SORT算法时可能"碰巧"出现返回正常排序数据的状况。 MOS建议,如果迁移中出现大量不利的变化,则可以通过修改参数来确保沿用原先的算法。但需要注意的是,即便采用了以下参数仍不能保证10g后"GROUP BY"后返回的数据如9i中那样排序,你需要做的是加入显式的"ORDER BY"子句以保证Oracle为你做到这一点。
    alter session set "_gby_hash_aggregation_enabled" = false;
    alter session set optimizer_features_enable="9.2.0";
    或者
    alter session set optimizer_features_enable="8.1.7";
    其中_gby_hash_aggregation_enabled隐式参数决定了Oracle是否可以启用新的HASH算法来进行数据分组(也适用于distinct等操作)。 对于以上说法我们通过实验进一步验证:
    在11g中的测试如下:
    SQL> select  * from v$version;
    
    BANNER
    ----------------------------------------------------------------------
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
    PL/SQL Release 11.2.0.1.0 - Production
    CORE    11.2.0.1.0      Production
    TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
    NLSRTL Version 11.2.0.1.0 - Production
    
    SQL> select  *  from youyus;
    
    T1                 T2
    ---------- ----------
    A                  10
    B                  10
    F                  30
    G                  30
    H                  40
    I                  40
    J                  40
    L                  20
    M                  20
    
    已选择9行。
    SQL>  analyze table youyus compute statistics for all columns;
    
    表已分析。
    
    SQL> set autotrace on;
    
    SQL>  select t2,count(*) from youyus group by t2;
    
            T2   COUNT(*)
    ---------- ----------
            30          2
            20          2
            40          3
            10          2
    
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 2940504347
    
    -----------------------------------------------------------------------------
    | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |        |     4 |     8 |     3  (34)| 00:00:01 |
    |   1 |  HASH GROUP BY     |        |     4 |     8 |     3  (34)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2   (0)| 00:00:01 |
    -----------------------------------------------------------------------------
    /*可以看到使用了hash算法,且返回结果未按t2列大小顺序排列*/
    
    SQL> select t2,count(*) from youyus group by t2 order by t2;
    
            T2   COUNT(*)
    ---------- ----------
            10          2
            20          2
            30          2
            40          3
    
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 1349668650
    
    -----------------------------------------------------------------------------
    | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |        |     4 |     8 |     3  (34)| 00:00:01 |
    |   1 |  SORT GROUP BY     |        |     4 |     8 |     3  (34)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2   (0)| 00:00:01 |
    -----------------------------------------------------------------------------
    /*加入order by子句后,又变回了SORT算法,而且正常排序*/
    SQL> alter session set "_gby_hash_aggregation_enabled" = false;
    
    会话已更改。
    SQL> alter session set optimizer_features_enable="9.2.0";
    
    会话已更改。
    SQL> select t2,count(*) from youyus group by t2;
    
            T2   COUNT(*)
    ---------- ----------
            10          2
            20          2
            30          2
            40          3
    
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 1349668650
    
    -------------------------------------------------------------
    | Id  | Operation          | Name   | Rows  | Bytes | Cost  |
    -------------------------------------------------------------
    |   0 | SELECT STATEMENT   |        |     4 |     8 |    11 |
    |   1 |  SORT GROUP BY     |        |     4 |     8 |    11 |
    |   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2 |
    -------------------------------------------------------------
    
    Note
    -----
       - cpu costing is off (consider enabling it)
    /*optimizer_features_enable设置为9.2.0后cpu cost被off了;返回数据正确排序,但我们要记住这是"侥幸"*/
    
    SQL> alter session set optimizer_features_enable="10.2.0.5";
    
    会话已更改。
    SQL> select t2,count(*) from youyus group by t2;
    
            T2   COUNT(*)
    ---------- ----------
            10          2
            20          2
            30          2
            40          3
    
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 1349668650
    
    -----------------------------------------------------------------------------
    | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |        |     4 |     8 |     3  (34)| 00:00:01 |
    |   1 |  SORT GROUP BY     |        |     4 |     8 |     3  (34)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2   (0)| 00:00:01 |
    -----------------------------------------------------------------------------
    /*optimizer_features_enable设为10.2.0.5 一切正常*/
    SQL> alter session set optimizer_features_enable="11.2.0.1";
    
    会话已更改。
    
    SQL> select t2,count(*) from youyus group by t2;
    
            T2   COUNT(*)
    ---------- ----------
            10          2
            20          2
            30          2
            40          3
    
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 1349668650
    
    -----------------------------------------------------------------------------
    | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |        |     4 |     8 |     3  (34)| 00:00:01 |
    |   1 |  SORT GROUP BY     |        |     4 |     8 |     3  (34)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     2   (0)| 00:00:01 |
    -----------------------------------------------------------------------------
    /*11.2.0.1中没有变化*/
    SQL> alter session set optimizer_features_enable="8.1.7";
    
    会话已更改。
    
    SQL> alter session set "_gby_hash_aggregation_enabled" =true;
    
    会话已更改。
    /*看看optimizer_features_enable设为8.1.7,而_gby_hash_aggregation_enabled为true,这种"矛盾"情况下的表现*/
    SQL> select t2,count(*) from youyus group by t2;
    
            T2   COUNT(*)
    ---------- ----------
            30          2
            20          2
            40          3
            10          2
    
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 2940504347
    
    -------------------------------------------------------------
    | Id  | Operation          | Name   | Rows  | Bytes | Cost  |
    -------------------------------------------------------------
    |   0 | SELECT STATEMENT   |        |     4 |     8 |    10 |
    |   1 |  HASH GROUP BY     |        |     4 |     8 |    10 |
    |   2 |   TABLE ACCESS FULL| YOUYUS |     9 |    18 |     1 |
    -------------------------------------------------------------
    
    Note
    -----
       - cpu costing is off (consider enabling it)
    /*居然仍采用了HASH GROUP BY,看起来类似_gby_hash_aggregation_enabled这类参数优先级要高于optimizer_features_enable*/
    
    9i上的表现如下:
    SQL> select * from v$version;
    
    BANNER
    ----------------------------------------------------------------
    Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
    PL/SQL Release 9.2.0.4.0 - Production
    CORE    9.2.0.3.0       Production
    TNS for Linux: Version 9.2.0.4.0 - Production
    NLSRTL Version 9.2.0.4.0 - Production
    
    SQL> analyze table youyus_9i compute statistics for all columns;
    
    Table analyzed.
    
    SQL> select * from youyus_9i;
    
    T1         T2
    -- ----------
    A          10
    B          10
    F          30
    G          30
    H          40
    I          40
    J          40
    L          20
    M          20
    
    9 rows selected.
    
    SQL> alter session set optimizer_mode=ALL_ROWS;
    
    Session altered.
    
    SQL> select t2,count(*) from youyus_9i group by t2;
    
            T2   COUNT(*)
    ---------- ----------
            10          2
            20          2
            30          2
            40          3
    
    
    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=4 Bytes=8)
       1    0   SORT (GROUP BY) (Cost=4 Card=4 Bytes=8)
       2    1     TABLE ACCESS (FULL) OF 'YOUYUS_9I' (Cost=2 Card=21 Bytes
              =42)
    /*9i下虽然没有指定order by,但我们可以放心返回的数据总是排序的;*/
    
    SQL> alter session set "_gby_hash_aggregation_enabled" =true;
    alter session set "_gby_hash_aggregation_enabled" =true
                      *
    ERROR at line 1:
    ORA-02248: invalid option for ALTER SESSION
    /*9i下不存在_gby_hash_aggregation_enabled隐式参数*/
    
    That's great!
    
    应用脚本没有数据一定要正确排序的强制要求,但使用HASH GROUP BY算法后临时表空间的使用量大幅上升,远大于之前在9i上的使用量,最后导致语句无法顺利完成。首先想到的当然是通过修改_gby_hash_aggregation_enabled参数恢复到原先的SORT算法,并观察其临时表空间使用量:
    SQL> alter session set "_gby_hash_aggregation_enabled"=false;
    Session altered.
    
    SQL> select /* g_all_cdr01,60 */
      2  calling_num mobile_number,
      3  lac,
      4  lpad(cell_id,5,'0') cell_id,
      5  count(*) c,
      6  sum(call_duration) call_duration,
      7  sum(decode(record_type,'00',1,0)*call_duration) moc_call_duration,
      8  sum(decode(record_type,'01',1,0)*call_duration) mtc_call_duarion
      9  from  gprs_bill.g_all_cdr01
     10  where substr(calling_num,1,7) in (select mobile_prefix from gprs_bill.zou_mobile_prefix)
     11  group by
     12  calling_num ,
     13  lac,
     14  lpad(cell_id,5,'0');
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 4013005149
    
    --------------------------------------------------------------------------------
    -------------------
    
    | Id  | Operation             | Name              | Rows  | Bytes |TempSpc| Cost
     (%CPU)| Time     |
    
    --------------------------------------------------------------------------------
    -------------------
    
    |   0 | SELECT STATEMENT      |                   |   229K|  9880K|       |  103
    3K  (3)| 03:26:41 |
    
    |   1 |  SORT GROUP BY        |                   |   229K|  9880K|    22M|  103
    3K  (3)| 03:26:41 |
    
    |*  2 |   HASH JOIN RIGHT SEMI|                   |   229K|  9880K|       |  103
    0K  (3)| 03:26:10 |
    
    |   3 |    TABLE ACCESS FULL  | ZOU_MOBILE_PREFIX |  1692 | 13536 |       |    1
    1   (0)| 00:00:01 |
    
    |   4 |    TABLE ACCESS FULL  | G_ALL_CDR01       |   388M|    13G|       |  102
    6K  (2)| 03:25:21 |
    
    --------------------------------------------------------------------------------
    -------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("MOBILE_PREFIX"=SUBSTR("CALLING_NUM",1,7))
    
    /*重新执行出现问题的脚本*/
    create table gprs_bill.zou_201007_cell_id as
        select /* g_all_cdr01,60 */
        calling_num mobile_number,
        lac,
        lpad(cell_id,5,'0') cell_id,
        count(*) c,
        sum(call_duration) call_duration,
        sum(decode(record_type,'00',1,0)*call_duration) moc_call_duration,
        sum(decode(record_type,'01',1,0)*call_duration) mtc_call_duarion
        from  gprs_bill.g_all_cdr01
        where substr(calling_num,1,7) in (select mobile_prefix from gprs_bill.zou_mobile_prefix)
        group by
        calling_num ,
        lac,
        lpad(cell_id,5,'0');
    可以看到在会话级别设置_gby_hash_aggregation_enabled为false后,Oracle不再采用10g中的HASH分组算法;因为该CTAS SQL脚本运行时间较长,我们通过动态视图V$SORT_USAGE来观察其运行期间的排序段使用量:
    SQL> set time   on;
    14:30:59 SQL> select tablespace,contents,segtype,blocks*8/1024 from v$sort_usage where username='GPRS_BILL';
    
    TABLESPACE                      CONTENTS  SEGTYPE   BLOCKS*8/1024
    ------------------------------- --------- --------- -------------
    TS_HQY1_TEMP                    TEMPORARY SORT               9349
    
    14:35:59 SQL> /
    
    TABLESPACE                      CONTENTS  SEGTYPE   BLOCKS*8/1024
    ------------------------------- --------- --------- -------------
    TS_HQY1_TEMP                    TEMPORARY SORT              10011
    
    /*5分钟内共用10011-9349=662MB 临时空间*/
    15:02:46 SQL> select target ,totalwork,sofar,time_remaining,elapsed_seconds from v$session_longops where sofar!=totalwork;
    
    TARGET                                                            TOTALWORK      SOFAR TIME_REMAINING ELAPSED_SECONDS
    ---------------------------------------------------------------- ---------- ---------- -------------- ---------------
    GPRS_BILL.G_ALL_CDR01                                               5575890    5435796            143            5557
    
    15:05:10 SQL> select target ,totalwork,sofar,time_remaining,elapsed_seconds from v$session_longops where sofar!=totalwork;
    
    TARGET                                                            TOTALWORK      SOFAR TIME_REMAINING ELAPSED_SECONDS
    ---------------------------------------------------------------- ---------- ---------- -------------- ---------------
    GPRS_BILL.G_ALL_CDR01                                               5575890    5562082             14            5692
    
    15:05:13 SQL> select tablespace,contents,segtype,blocks*8/1024 from v$sort_usage where username='GPRS_BILL';
    
    TABLESPACE                      CONTENTS  SEGTYPE   BLOCKS*8/1024
    ------------------------------- --------- --------- -------------
    TS_HQY1_TEMP                    TEMPORARY SORT              13835
    
    15:12:22 SQL> select tablespace,contents,segtype,blocks*8/1024 from v$sort_usage where username='GPRS_BILL';
    
    TABLESPACE                      CONTENTS  SEGTYPE   BLOCKS*8/1024
    ------------------------------- --------- --------- -------------
    TS_HQY1_TEMP                    TEMPORARY SORT              13922
    
    /* 排序已经完成,排序段不再增长*/
    该分组操作最后排序段使用量为13922MB,在客户可以接受的范围内。看起来新引入的HASH算法虽然有CPU成本低于SORT算法的优势,但可能消耗大量临时空间,可谓有得有失。
  • 相关阅读:
    第十五周总结
    第二阶段冲刺
    课堂练习
    第十四周总结
    WP8 NavigationInTransition实现页面切换效果
    WP8学习笔记:如何在页面显示前自动转向到其他页面
    WP8简单的计算器
    WP8滑动条(Slider)控件的使用
    Wp8滚动区域(ScrollViewer)控件的使用
    C#中ToString格式大全
  • 原文地址:https://www.cnblogs.com/macleanoracle/p/2967485.html
Copyright © 2011-2022 走看看