zoukankan      html  css  js  c++  java
  • 转://11g之后,通过v$wait_chains视图诊断数据库hang和Contention

    1g之前,通常我们数据库hang住了之后,我们会对数据库做hang analyze来进行分析,在11g之后,我们可以通过一个新的视图v$wait_chains来诊断数据库hang和contention。在11gR1这个版本里面,Oracle通过diag进程实现了一个功能,每隔3秒做一次本地的hang analyze,每隔10秒做一次global的hang analyze。而这些信息会存放在内存里面,Oracle把这一块内存称作”hang analysis cache”。而这一部分内存信息,对我们数据库诊断hang和contention起着非常重要的作用。而数据库还有一些特性及工具也需要使用这块内存区域。比如Hang Management, Resource Manager Idle Blocker Kill, SQL Tune Hang Avoidance和pmon清除,还有一些外部工具如Procwatcher。

    我们看一下v$wait_chains视图的定义。以11gR2为例。

    SQL> desc v$wait_chains
     Name                                      Null?    Type
     ----------------------------------------- -------- ----------------------------
     CHAIN_ID                                           NUMBER
     CHAIN_IS_CYCLE                                     VARCHAR2(5)
     CHAIN_SIGNATURE                                    VARCHAR2(801)
     CHAIN_SIGNATURE_HASH                               NUMBER
     INSTANCE                                           NUMBER
     OSID                                               VARCHAR2(25)
     PID                                                NUMBER
     SID                                                NUMBER
     SESS_SERIAL#                                       NUMBER
     BLOCKER_IS_VALID                                   VARCHAR2(5)
     BLOCKER_INSTANCE                                   NUMBER
     BLOCKER_OSID                                       VARCHAR2(25)
     BLOCKER_PID                                        NUMBER
     BLOCKER_SID                                        NUMBER
     BLOCKER_SESS_SERIAL#                               NUMBER
     BLOCKER_CHAIN_ID                                   NUMBER
     IN_WAIT                                            VARCHAR2(5)
     TIME_SINCE_LAST_WAIT_SECS                          NUMBER
     WAIT_ID                                            NUMBER
     WAIT_EVENT                                         NUMBER
     WAIT_EVENT_TEXT                                    VARCHAR2(64)
     P1                                                 NUMBER
     P1_TEXT                                            VARCHAR2(64)
     P2                                                 NUMBER
     P2_TEXT                                            VARCHAR2(64)
     P3                                                 NUMBER
     P3_TEXT                                            VARCHAR2(64)
     IN_WAIT_SECS                                       NUMBER
     TIME_REMAINING_SECS                                NUMBER
     NUM_WAITERS                                        NUMBER
     ROW_WAIT_OBJ#                                      NUMBER
     ROW_WAIT_FILE#                                     NUMBER
     ROW_WAIT_BLOCK#                                    NUMBER
     ROW_WAIT_ROW#                                      NUMBER

    继续查询该视图的定义。可以发现该数据来自于基表x$ksdhng_chains。因为前面介绍过进程会10秒做一次global的hang,所以这个视图是包含了全局的信息的。虽然它是v$开头的。

    SQL> select * from V$FIXED_VIEW_DEFINITION where view_name like '%WAIT_CHAINS%';
    
    VIEW_NAME            VIEW_DEFINITION
    -------------------- ----------------------------------------------------------------------------------------------------------------------------------
    V$WAIT_CHAINS        select  s.chain_id,  decode(s.chain_is_cycle, 0,'FALSE','TRUE'),  s.chain_signature, s.chain_signature_hash,   s.instance, s.osid,
                          s.pid, s.sid,  s.sess_serial#,   decode(s.blocker_is_valid, 0,'FALSE','TRUE'),  decode(s.blocker_is_valid, 0, to_number(null), s.
                         blocker_instance),  s.blocker_osid,  decode(s.blocker_is_valid, 0, to_number(null), s.blocker_pid),  decode(s.blocker_is_valid, 0,
                          to_number(null), s.blocker_sid),  decode(s.blocker_is_valid, 0, to_number(null), s.blocker_sess_serial#),  decode(s.blocker_chain
                         _id, 0, to_number(null), s.blocker_chain_id),   decode(s.in_wait, 0,'FALSE','TRUE'),   decode(s.in_wait, 0, s.time_since_last_wait
                         _secs, to_number(null)),   decode(s.in_wait, 0, to_number(null), s.wait_id),  decode(s.in_wait, 0, to_number(null), s.wait_event),
                           s.wait_event_text,  decode(s.in_wait, 0, to_number(null), s.p1),  s.p1_text,  decode(s.in_wait, 0, to_number(null), s.p2),  s.p2
                         _text,  decode(s.in_wait, 0, to_number(null), s.p3),  s.p3_text,   decode(s.in_wait, 0, to_number(null), s.in_wait_secs),  decode(
                         s.in_wait, 0, to_number(null), s.time_remaining_secs),  s.num_waiters,   decode(s.in_wait, 0, to_number(null), s.row_wait_obj#),
                         decode(s.in_wait, 0, to_number(null), s.row_wait_file#),  decode(s.in_wait, 0, to_number(null), s.row_wait_block#),  decode(s.in_w
                         ait, 0, to_number(null), s.row_wait_row#)  from X$KSDHNG_CHAINS s

    Oracle在mos上提供了一些脚本来做一些信息诊断。一种是普通版本的都可以使用的,还有一种是11gR2专用的。因为在11gR2的v$session视图中有一个字段叫final_blocking_session,这个字段能够去查看最上层的阻塞者。最终的blocker一般都处于wait_chain的顶端。这样的session才会引起问题。我们先来看看普通的查询.首先随便制造两个session共同更新一行的情况。

    SQL> SELECT chain_id, num_waiters, in_wait_secs, osid, blocker_osid, substr(wait_event_text,1,30)
      FROM v$wait_chains;
    
      CHAIN_ID NUM_WAITERS IN_WAIT_SECS OSID                      BLOCKER_OSID              SUBSTR(WAIT_EVENT_TEXT,1,30)
    ---------- ----------- ------------ ------------------------- ------------------------- ------------------------------------------------------------
                8 31377                     31447                     enq: TX - row lock contention
               15 31447                                               SQL*Net message from client

    通用的查询:

    接下来在执行下一个基础的格式化后的脚本.

    set pages 1000
     set lines 120
     set heading off
     column w_proc format a50 tru
     column instance format a20 tru
     column inst format a28 tru
     column wait_event format a50 tru
     column p1 format a16 tru
     column p2 format a16 tru
     column p3 format a15 tru
     column Seconds format a50 tru
     column sincelw format a50 tru
     column blocker_proc format a50 tru
     column waiters format a50 tru
     column chain_signature format a100 wra
     column blocker_chain format a100 wra
    
     SELECT * 
     FROM (SELECT 'Current Process: '||osid W_PROC, 'SID '||i.instance_name INSTANCE, 
     'INST #: '||instance INST,'Blocking Process: '||decode(blocker_osid,null,'',blocker_osid)|| 
     ' from Instance '||blocker_instance BLOCKER_PROC,'Number of waiters: '||num_waiters waiters,
     'Wait Event: ' ||wait_event_text wait_event, 'P1: '||p1 p1, 'P2: '||p2 p2, 'P3: '||p3 p3,
     'Seconds in Wait: '||in_wait_secs Seconds, 'Seconds Since Last Wait: '||time_since_last_wait_secs sincelw,
     'Wait Chain: '||chain_id ||': '||chain_signature chain_signature,'Blocking Wait Chain: '||decode(blocker_chain_id,null,
     '',blocker_chain_id) blocker_chain
     FROM v$wait_chains wc,
     v$instance i
     WHERE wc.instance = i.instance_number (+)
     AND ( num_waiters > 0
     OR ( blocker_osid IS NOT NULL
     AND in_wait_secs > 10 ) )
     ORDER BY chain_id,
     num_waiters DESC)
     WHERE ROWNUM < 101;

    最终结果如下图所示,我们能够清楚的看到进程31447阻塞了进程31377。进程31377在等待enq: TX – row lock contention。

    基于11gR2的查询

    使用final_blocking_session字段,能查到最上端的阻塞进程。

    set pages 1000
    set lines 120
    set heading off
    column w_proc format a50 tru
    column instance format a20 tru
    column inst format a28 tru
    column wait_event format a50 tru
    column p1 format a16 tru
    column p2 format a16 tru
    column p3 format a15 tru
    column Seconds format a50 tru
    column sincelw format a50 tru
    column blocker_proc format a50 tru
    column fblocker_proc format a50 tru
    column waiters format a50 tru
    column chain_signature format a100 wra
    column blocker_chain format a100 wra
    
    SELECT * 
    FROM (SELECT 'Current Process: '||osid W_PROC, 'SID '||i.instance_name INSTANCE, 
     'INST #: '||instance INST,'Blocking Process: '||decode(blocker_osid,null,'',blocker_osid)|| 
     ' from Instance '||blocker_instance BLOCKER_PROC,
     'Number of waiters: '||num_waiters waiters,
     'Final Blocking Process: '||decode(p.spid,null,'',
     p.spid)||' from Instance '||s.final_blocking_instance FBLOCKER_PROC, 
     'Program: '||p.program image,
     'Wait Event: ' ||wait_event_text wait_event, 'P1: '||wc.p1 p1, 'P2: '||wc.p2 p2, 'P3: '||wc.p3 p3,
     'Seconds in Wait: '||in_wait_secs Seconds, 'Seconds Since Last Wait: '||time_since_last_wait_secs sincelw,
     'Wait Chain: '||chain_id ||': '||chain_signature chain_signature,'Blocking Wait Chain: '||decode(blocker_chain_id,null,
     '',blocker_chain_id) blocker_chain
    FROM v$wait_chains wc,
     gv$session s,
     gv$session bs,
     gv$instance i,
     gv$process p
    WHERE wc.instance = i.instance_number (+)
     AND (wc.instance = s.inst_id (+) and wc.sid = s.sid (+)
     and wc.sess_serial# = s.serial# (+))
     AND (s.final_blocking_instance = bs.inst_id (+) and s.final_blocking_session = bs.sid (+))
     AND (bs.inst_id = p.inst_id (+) and bs.paddr = p.addr (+))
     AND ( num_waiters > 0
     OR ( blocker_osid IS NOT NULL
     AND in_wait_secs > 10 ) )
    ORDER BY chain_id,
     num_waiters DESC)
    WHERE ROWNUM < 101;

    这里可以看到当前会话是2395在等待enq: TM – contention,而它的顶级阻塞者是2309。通过这些脚本我们能够方便的进行查询,能够方便的找到谁是阻塞者,甚至是最上层的阻塞者。当然在我们的diaghang.sql的脚本里面,我们看到了下列内容。这里Oracle通过我们内存直接访问,从x$ksdhng_chains里面把chain的信息全部获取出来,用于最终的hang分析的诊断。

    -- dump hang analysis chains
    oradebug direct_access enable trace
    oradebug direct_access disable reply
    oradebug direct_access set content_type = 'text/plain'
    oradebug direct_access select * from x$ksdhng_chains
  • 相关阅读:
    全面理解javascript的caller,callee,call,apply概念(修改版)
    动态显示更多信息(toggle_visible函数的运用)
    再论call和apply
    RSS News Module的应用
    准备制作一套全新的DNN皮肤(包括个人定制或企业级定制)
    ControlPanel研究系列二:简单Ajax模式的ControlPanel(SimplAjax)
    New_Skin发布了....
    如何定制dnn的FckEditor
    Blog已迁移到dnnsun.com(最新DotNetNuke咨询平台)
    新DNN皮肤的经验及成果分享
  • 原文地址:https://www.cnblogs.com/zfox2017/p/8584502.html
Copyright © 2011-2022 走看看