zoukankan      html  css  js  c++  java
  • 一次direct path read 故障处理

    说明:
    产生direct path read事件的原因有三种情况:

    Causes

    This situation occurs in the following situations:

    • The sorts are too large to fit in memory and some of the sort data is written out directly to disk. This data is later read back in, using direct reads.

    • Parallel slaves are used for scanning data.

    • The server process is processing buffers faster than the I/O system can return the buffers. This can indicate an overloaded I/O system.

    10.3.4.2 Actions

    The file_id shows if the reads are for an object in TEMP tablespace (sorts to disk) or full table scans by parallel slaves. This wait is the largest wait for large data warehouse sites. However, if the workload is not a Decision Support Systems (DSS) workload, then examine why this situation is happening.

    10.3.4.2.1 Sorts to Disk

    Examine the SQL statement currently being run by the session experiencing waits to see what is causing the sorts. Query V$TEMPSEG_USAGE to find the SQL statement that is generating the sort. Also query the statistics from V$SESSTAT for the session to determine the size of the sort. See if it is possible to reduce the sorting by tuning the SQL statement. If WORKAREA_SIZE_POLICY is MANUAL, then consider increasing the SORT_AREA_SIZE for the system (if the sorts are not too big) or for individual processes. If WORKAREA_SIZE_POLICY is AUTO, then investigate whether to increase PGA_AGGREGATE_TARGET. See .

    10.3.4.2.2 Full Table Scans

    If tables are defined with a high degree of parallelism, then this setting could skew the optimizer to use full table scans with parallel slaves. Check the object being read into using the direct path reads. If the full table scans are a valid part of the workload, then ensure that the I/O subsystem is adequate for the degree of parallelism. Consider using disk striping if you are not already using it or Oracle Automatic Storage Management (Oracle ASM).

    10.3.4.2.3 Hash Area Size

    For query plans that call for a hash join, excessive I/O could result from having HASH_AREA_SIZE too small. If WORKAREA_SIZE_POLICY is MANUAL, then consider increasing the HASH_AREA_SIZE for the system or for individual processes. If WORKAREA_SIZE_POLICY is AUTO, then investigate whether to increasePGA_AGGREGATE_TARGET.


    案例:

    2014年11月17日16:30-17:30时营销数据库(toyx1a)出现在大量的direct path read、direct path read temp、direct path write temp、db file scattered read、read by other session事件。通过分析出现所有的等待事件都集中在表UTL.T_C_L_CUST_INFO_M上。

    ASH 报告中direct path read在整个等待事件里占比为26.24%

    EventEvent Class% EventAvg Active Sessions
    CPU + Wait for CPU CPU 36.40 2.66
    direct path read User I/O 26.24 1.92
    direct path read temp User I/O 8.00 0.58
    direct path write temp User I/O 6.73 0.49
    db file scattered read User I/O 5.94 0.43

    Top Event P1/P2/P3 Values

    Event% EventP1 Value, P2 Value, P3 Value% ActivityParameter 1Parameter 2Parameter 3
    direct path read 26.24 "10","209408","128" 0.03 file number first dba block cnt
    direct path read temp 8.00 "2001","469469","31" 0.01 file number first dba block cnt
    direct path write temp 6.73 "2001","492862","31" 0.01 file number first dba block cnt
    db file scattered read 5.94 "9","461440","49" 0.01 file# block# blocks
    read by other session 4.76 "15","470969","1" 0.01 file# block# class#


    通过direct path read事件的p1,p2,p3值我们可以定位到该事件在哪个对象上发生等待

    • P1: File_id for the read call

    • P2: Start block_id for the read call

    • P3: Number of blocks in the read call

    SQL> select owner,segment_name,segment_type,partition_name from dba_extents where file_id=10 and 209408 between block_id and block_id+blocks-1; 

    owner  segment_name                segment_type   partition_name
    --------- ---------------------       ------------------- ------------------------------
    UTL     T_C_L_CUST_INFO_M   TABLE PARTITION CUSTOM_PROFILE_PART_10    =====>direct path read 发生在表T_C_L_CUST_INFO_M上。

    通过read by other session事件的p1,p2.p3值同样也可以定位到该事件在哪个对象上发生等待

    ParameterDescription
    file# See 
    block# See 
    class# See 


    SQL> select owner,segment_name,segment_type,partition_name from dba_extents where file_id=15 and 470969 between block_id and block_id+blocks-1; 
    OWNER                          SEGMENT_NAME                                                                      SEGMENT_TYPE       PARTITION_NAME
    ------------------------------ --------------------------------------------------------------------------------- ------------------ ------------------------------
    UTL                            T_C_L_CUST_INFO_M                                                                 TABLE PARTITION    CUSTOM_PROFILE_PART_2

     


    通过db file scattered read 的p1,p2,p3同样也可以诊断该事件在哪个对象上发生等待

    Similar to db file sequential read, except that the session is reading multiple data blocks.

    Wait Time: The wait time is the actual time it takes to do all of the I/Os

    ParameterDescription
    file# See 
    block# See 
    blocks The number of blocks that the session is trying to read from the file# starting at block#


    SQL> set lines 200
    SQL> set pages 500
    SQL> select owner,segment_name,segment_type,partition_name from dba_extents where file_id=9 and 461440 between block_id and block_id+blocks-1;


    OWNER                          SEGMENT_NAME                                                                      SEGMENT_TYPE       PARTITION_NAME
    ------------------------------ --------------------------------------------------------------------------------- ------------------ ------------------------------
    UTL                            T_C_L_CUST_INFO_M                                                                 TABLE PARTITION    CUSTOM_PROFILE_PART_2

     

    从上述三种等待事来看,所有的等待都集中在表T_C_L_CUST_INFO_M 上,因此可以断定是由包含此表的相关语句导致的性能问题。
    因此我们在ASH报告里搜索表T_C_L_CUST_INFO_M发现如下语句,且该语句都存在order by 排序,因此也可以断定为什么该语句产生大量的direct path read 、direct path read temp事件了。

    005wu57dp2nvz select * from (select * from (select SERV_NO, GPRS_FEE_2, TOLL_CALL_FEE, GPRS_FEE_1, ARPU, GPRS_PRIV_FEE3, OVERPKG_MARK, JTCT_CALL_DUR, PKG_USED_FLOW, MYZJ_CALL_DUR, PKG_USED_PROP, OVERPKG_FLOW, CALL_PRICE2, VPMN_CALL_DUR, BDCT_CALL_DUR, STABILITY_SCORE, STAR_LEVEL_MARK, VPMNSP_CALL_DUR, ENTERPRISE_MARK, VPMN_FLAG, GPRS_FLOW_M, PKG_ALL_FLOW, ARPU_PRICE, PHOTOMEM_MARK, IS_MIGU_ORDER, IS_LINGXI_ORDER, IS_MOBMAP_ORDER, IS_PIM_ACTIVE, IS_MIGU_ACTIVE, CRING_FLAG, IS_CHEZHU_ACTIVE, IS_PIM_ORDER, IS_FETION_ACTIVE, IS_MOBMAP_ACTIVE, IS_139_5_ACTIVE, IS_MOBREAD_ORDER, IS_MOBGAME_ORDER, IS_FETION_ORDER, IS_139_20_ACTIVE, IS_LINGXI_ACTIVE, IS_139_5_ORDER, IS_139_20_ORDER, TERM_BRAND, TERM_USE_TIME, TERM_CODE, TERM_GET_DATE, TERM_TYPE, COOPERATE_FLEG, OPEN_DATE, YXPH_FREE, TOTAL_SCORE, YX_ZFPH, YX_BAODI, YXPH_HYYJ, YXPH_FEE, YXPH_OBJ, YXPH_TERM, PLAN_NAME, ACTIVE_DEPT_ID, ACTIVE_AREA_ID, GPRS_PRIV_FEE1, ROAM_CALL_FEE, GPRS_PRIV_FEE2, PLAN_ID from T_C_L_CUST_INFO_M where SERV_NO ='13583152372' ord er by DATA_MONTH desc ) where rownum=1 ) t0
    0vyp4gqddp7mk select * from (select * from (select SERV_NO, PAY_TYPE, GPRS_FEE_2, GPRS_FEE_1, ARPU, BRAND_ID, FEE_INCREASE, FEE_DECREASE, GPRS_FLOW_2, BECOME_DUE, UNBALANCE_FLAG, CALL_DURA_M, PRIV_4G_MARK, BECOME_TIME, SPRODUCT_MARK, SC_USER_MARK, TIME_DECREASE, UNIONACCT_MARK, PHOTOMEM_MARK, TERM_BRAND, TERM_USE_TIME, TERM_CODE, TERM_GET_DATE, TERM_TYPE, SEX_ID, USER_STATUS, OPEN_DATE, CITY_ID, PREPAY_FEE, GIVE_FEE, TOTAL_SCORE, YX_BAODI, CREDIT_LEV from T_C_L_CUST_INFO_M where SERV_NO ='13688689651' order by DATA_MONTH desc ) where rownum=1 ) t0
    gyzcb1gqtscms select * from (select * from (select SERV_NO, GPRS_FEE_2, TOLL_CALL_FEE, GPRS_FEE_1, ARPU, GPRS_PRIV_FEE3, OVERPKG_MARK, JTCT_CALL_DUR, PKG_USED_FLOW, MYZJ_CALL_DUR, PKG_USED_PROP, OVERPKG_FLOW, CALL_PRICE2, VPMN_CALL_DUR, BDCT_CALL_DUR, STABILITY_SCORE, STAR_LEVEL_MARK, VPMNSP_CALL_DUR, ENTERPRISE_MARK, VPMN_FLAG, GPRS_FLOW_M, PKG_ALL_FLOW, ARPU_PRICE, PHOTOMEM_MARK, IS_MIGU_ORDER, IS_LINGXI_ORDER, IS_MOBMAP_ORDER, IS_PIM_ACTIVE, IS_MIGU_ACTIVE, CRING_FLAG, IS_CHEZHU_ACTIVE, IS_PIM_ORDER, IS_FETION_ACTIVE, IS_MOBMAP_ACTIVE, IS_139_5_ACTIVE, IS_MOBREAD_ORDER, IS_MOBGAME_ORDER, IS_FETION_ORDER, IS_139_20_ACTIVE, IS_LINGXI_ACTIVE, IS_139_5_ORDER, IS_139_20_ORDER, TERM_BRAND, TERM_USE_TIME, TERM_CODE, TERM_GET_DATE, TERM_TYPE, COOPERATE_FLEG, OPEN_DATE, YXPH_FREE, TOTAL_SCORE, YX_ZFPH, YX_BAODI, YXPH_HYYJ, YXPH_FEE, YXPH_OBJ, YXPH_TERM, PLAN_NAME, ACTIVE_DEPT_ID, ACTIVE_AREA_ID, GPRS_PRIV_FEE1, ROAM_CALL_FEE, GPRS_PRIV_FEE2, PLAN_ID from T_C_L_CUST_INFO_M where SERV_NO ='13969809161' ord er by DATA_MONTH desc ) where rownum=1 ) t0 


    语句查找出来了,哪为啥突然之间会出现性能问题呢?
    首先我们查看该语句的执行计划:
    SQL> select * from table(dbms_xplan.display_awr('005wu57dp2nvz',null,null,'all'));

    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SQL_ID 005wu57dp2nvz
    --------------------
                               select * from (select * from (select
    SERV_NO,GPRS_FEE_2,TOLL_CALL_FEE,GPRS_FEE_1,ARPU,GPRS_PRIV_FEE3,OVERPKG_
    MARK,JTCT_CALL_DUR,PKG_USED_FLOW,MYZJ_CALL_DUR,PKG_USED_PROP,OVERPKG_FLO
    W,CALL_PRICE2,VPMN_CALL_DUR,BDCT_CALL_DUR,STABILITY_SCORE,STAR_LEVEL_MAR
    K,VPMNSP_CALL_DUR,ENTERPRISE_MARK,VPMN_FLAG,GPRS_FLOW_M,PKG_ALL_FLOW,ARP
    U_PRICE,PHOTOMEM_MARK,IS_MIGU_ORDER,IS_LINGXI_ORDER,IS_MOBMAP_ORDER,IS_P
    IM_ACTIVE,IS_MIGU_ACTIVE,CRING_FLAG,IS_CHEZHU_ACTIVE,IS_PIM_ORDER,IS_FET
    ION_ACTIVE,IS_MOBMAP_ACTIVE,IS_139_5_ACTIVE,IS_MOBREAD_ORDER,IS_MOBGAME_
    ORDER,IS_FETION_ORDER,IS_139_20_ACTIVE,IS_LINGXI_ACTIVE,IS_139_5_ORDER,I


    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    S_139_20_ORDER,TERM_BRAND,TERM_USE_TIME,TERM_CODE,TERM_GET_DATE,TERM_TYP
    E,COOPERATE_FLEG,OPEN_DATE,YXPH_FREE,TOTAL_SCORE,YX_ZFPH,YX_BAODI,YXPH_H
    YYJ,YXPH_FEE,YXPH_OBJ,YXPH_TERM,PLAN_NAME,ACTIVE_DEPT_ID,ACTIVE_AREA_ID,
    GPRS_PRIV_FEE1,ROAM_CALL_FEE,GPRS_PRIV_FEE2,PLAN_ID from
    T_C_L_CUST_INFO_M where SERV_NO ='13583152372' order by DATA_MONTH desc
    )  where rownum=1  ) t0


    Plan hash value: 3173533799


    --------------------------------------------------------------------------------------------------------------
    | Id  | Operation                | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |


    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    --------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT         |                   |       |       |  2682K(100)|          |       |       |
    |   1 |  VIEW                    |                   |     1 |  1135 |  2682K  (2)| 08:56:34 |       |       |
    |   2 |   COUNT STOPKEY          |                   |       |       |            |          |       |       |
    |   3 |    VIEW                  |                   |     1 |  1135 |  2682K  (2)| 08:56:34 |       |       |
    |   4 |     SORT ORDER BY STOPKEY|                   |     1 |   289 |  2682K  (2)| 08:56:34 |       |       |             ====〉存在排序
    |   5 |      PARTITION RANGE ALL |                   |     1 |   289 |  2682K  (2)| 08:56:34 |     1 |    18 |
    |   6 |       TABLE ACCESS FULL  | T_C_L_CUST_INFO_M |     1 |   289 |  2682K  (2)| 08:56:34 |     1 |    18 |     ====〉该表发生全表扫。
    --------------------------------------------------------------------------------------------------------------


    Query Block Name / Object Alias (identified by operation id):


    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    -------------------------------------------------------------


       1 - SEL$2 / T0@SEL$1
       2 - SEL$2
       3 - SEL$3 / from$_subquery$_002@SEL$2
       4 - SEL$3
       6 - SEL$3 / T_C_L_CUST_INFO_M@SEL$3

    发现该语句的执行计划发生了改变,查看对象索引的创建时间发现创建时间为2014-11-17 17:19:35 因此可以断定此索引刚刚建上。

    SQL> SELECT OWNER,CREATED FROM DBA_OBJECTS WHERE OBJECT_NAME='INX_T_C_L_CUST_INFO_M_SERV_NO';


    OWNER                          CREATED
    ------------------------------ -------------------
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35
    UTL                            2014-11-17 17:19:35

     

    结论:
      最后通过和研发沟通,该索引删除的目的是为了加快sqlldr导数据的速度。

    喜欢请赞赏一下啦^_^

    微信赞赏

    支付宝赞赏

  • 相关阅读:
    Android 开发笔记___复选框__checkbox
    Android 开发笔记___FrameLayout
    Android 开发笔记___RelativeLayout
    Android 开发笔记___初级控件之实战__计算器
    Android 开发笔记___shape
    Android 开发笔记___DateUtil——Time
    改良版 导航栏自动跟随
    简洁 js排序算法
    简单使用rem方案适配移动设备
    导航栏监听页面滚动跟随 简单封装
  • 原文地址:https://www.cnblogs.com/lkj371/p/15148307.html
Copyright © 2011-2022 走看看