zoukankan      html  css  js  c++  java
  • Troubleshooting query v$asm_disk v$asm_diskgroup hang

    今天遇到了同样的问题

    接大神的总结加深下记忆原文:https://www.anbob.com/archives/5713.html

    An Oracle 11.2.0.4.7  4-nodes RAC on linux env,  the fourth instance has this problem just, When the monitoring tool queries the usage of ASM DISKGroup, the query of V$asm_disk and v$asm_diskgroup has not been completed. from ASM INSTANCE found that waiting for ‘enq: DD – contention’ and final blocker session is waiting for ‘GPnP Get Item’ event. but query v$asm_diskgroup_stat work fine.

     
     # DB Instance
    USERNAME           SID EVENT                MACHINE    MODULE               STATUS   LAST_CALL_ET SQL_ID          WAI_SECINW ROW_WAIT_OBJ# SQLTEXT                        BS          CH# OSUSER     HEX
    ----------- ---------- -------------------- ---------- -------------------- -------- ------------ --------------- ---------- ------------- ------------------------------ ---------- ---- ---------- ---------
    SYS               1715 On CPU / runqueue    anbob      sqlplus              ACTIVE           2228 38z4yk2z6z3tm   4:2228                37 select * from v$asm_diskgroup  :             0 oracle       4000000
    SYS               3032 On CPU / runqueue    anbob      sqlplus              ACTIVE           2962 d7r15ajvb30sp   3:2963              5161 select count(*) from  X$KFGRP  :             0 oracle       4000001
    WEEJAR           387 On CPU / runqueue    anbob      SQL*Plus             ACTIVE           2987 40sc4zz430aag   3:2987            122518 select 'datapoint:' || name fr :             0 itmuser      4036f77
    
    SQL> oradebug setorapid 89
    Oracle pid: 89, Unix process pid: 14711, image: oracle@kdhd4 (TNS V1-V3)
    SQL> oradebug short_stack;
    ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-read()+14<-ntpfprd()+117<-nsbasic_brc()+376<-nsbrecv()+69<-nioqrc()+495<-ttcdrv()+1469<-nioqwa()+61<-upirtrc()+1378<-kpurcsc()+98<-kpuexec()+10790<-OCIStmtExecute()+39<-kfgGrpTableCbInternal()+3255<-kfgGrpTableCb()+56<-qerfxFetch()+3141<-opifch2()+2766<-kpoal8()+2833<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+230
    SQL> oradebug short_stack;
    ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-read()+14<-ntpfprd()+117<-nsbasic_brc()+376<-nsbrecv()+69<-nioqrc()+495<-ttcdrv()+1469<-nioqwa()+61<-upirtrc()+1378<-kpurcsc()+98<-kpuexec()+10790<-OCIStmtExecute()+39<-kfgGrpTableCbInternal()+3255<-kfgGrpTableCb()+56<-qerfxFetch()+3141<-opifch2()+2766<-kpoal8()+2833<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+230
    Parsed short stack:
    ksedsts()+465                    kernel service  error debug dump the call stack short stack 
    ksdxfstk()+32                    kernel service  debug internal errors dump abridged os stack 
    ksdxcb()+1927                    kernel service  debug internal errors ksdx callback for sosd layer signal handler 
    sspuser()+112                    operating system dependent system handle SIGUSR2 for Oracle 
    ntpfprd()+117                    network transport (drivers) ipc ? read 
    nsbasic_brc()+376                network session/transparent network service block(?) basic block receive 
    nsbrecv()+69                     network session/transparent network service block(?) receive 
    nioqrc()+495                     network interface operating system dependent networking receive 
    ttcdrv()+1469                    two task common called when message received 
    nioqwa()+61                      network interface operating system dependent networking wait 
    upirtrc()+1378                   user program interface router with callback 
    kpurcsc()+98                     kernel programmatic interface user remote call with service context remote call with service context, callbacks 
    kpuexec()+10790                  kernel programmatic interface user execute 
    OCIStmtExecute()+39              statement execute 
    kfgGrpTableCbInternal()+3255     kernel automatic storage management diskgroups  [partial hit for: kfg ] 
    kfgGrpTableCb()+56               kernel automatic storage management diskgroups  [partial hit for: kfg ] 
    qerfxFetch()+3141                query execute rowsource fixed table fetch 
    opifch2()+2766                   oracle program interface oracle side of the fetch interface main routine 
    kpoal8()+2833                    kernel programmatic interface oracle V8 bundled execution 
    opiodr()+917                     oracle program interface oracle code request driver, route the current request 
    ttcpip()+2183                    two task common pipe read/write 
    opitsk()+1710                    oracle program interface two task function dispatcher 
    opiino()+969                     oracle program interface initialize opi 
    opiodr()+917                     oracle program interface oracle code request driver, route the current request 
    opidrv()+570                     oracle program interface route current request driver, entry side into two task interface 
    sou2o()+103                      main oracle executable entry point 
    opimai_real()+133                oracle program interface main real oracle start point 
    

    # ASM Instance

    USERNAME           SID EVENT                MACHINE    MODULE               STATUS   LAST_CALL_ET SQL_ID          WAI_SECINW ROW_WAIT_OBJ# SQLTEXT                        B      CH# OSUSER     HEX
    ----------- ---------- -------------------- ---------- -------------------- -------- ------------ --------------- ---------- ------------- ------------------------------ ---------- ---- ---------- ---------
    SYS                752 wait for unread mess anbob      oracle               ACTIVE              1                 0:0                   -1                                :0 grid
    SYS                877 wait for unread mess anbob      oracle               ACTIVE              2                 0:0                   -1                                :0 oracle
    SYS               1008 enq: DD - contention anbob      oracle               ACTIVE         585919 b8a2pdbq3p2mj   0:247938              -1 select grpnum_kfdsk, number_kf 4:2251    0 oracle       4000064
    SYS               1258 enq: DD - contention anbob      oracle               ACTIVE         626533 gztn2jj9jtjnu   0:626532              -1 select name_kfgrp, number_kfgr 4:2251    0 oracle       400572f
    SYS                883 enq: DD - contention anbob      oracle               ACTIVE         627267 88jqaz641rxf6   0:419928              -1 select name_kfgrp, number_kfgr 4:2251    0 oracle       403280f
    SYS                758 enq: DD - contention anbob      oracle               ACTIVE         627292 88jqaz641rxf6   0:627292              -1 select name_kfgrp, number_kfgr 4:2251    0 itmuser      403280e
    SYS                633 enq: DD - contention anbob      oracle               ACTIVE         627490 88jqaz641rxf6   0:349040              -1 select name_kfgrp, number_kfgr 4:2251    0 oracle       403280d
    SYS                508 enq: DD - contention anbob      oracle               ACTIVE         627510 88jqaz641rxf6   0:627510              -1 select name_kfgrp, number_kfgr 4:2251    0 oracle       403280c
    SYS                383 enq: DD - contention anbob      oracle               ACTIVE         627541 88jqaz641rxf6   0:550178              -1 select name_kfgrp, number_kfgr 4:2251    0 oracle       403280b
    
    SQL> @sed dd
    Show wait event descriptions matching %dd%..
    
    EVENT# EVENT_NAME               WAIT_CLASS           PARAMETER1                PARAMETER2                PARAMETER3                ENQUEUE_NAME                   REQ_REASON                       REQ_DESCRIPTION
    ------ ------------------------ -------------------- ------------------------- ------------------------- ------------------------- ------------------------------ -------------------------------- -------------------------------------------------
     1265 enq: DD - contention     Other                name|mode                 disk group                type                      ASM Local Disk Group           contention                       Synchronizes local accesses to ASM disk groups
    ...
    
    
    SQL> select program,command,status,last_call_et,wait_class,event,blocking_session,state from v$session where sid=2251;
    
    PROGRAM                    COMMAND STATUS   LAST_CALL_ET WAIT_CLASS      EVENT                BLOCKING_SESSION STATE
    ----------------------- ---------- -------- ------------ --------------- -------------------- ---------------- -------------------
    oracle@anbob (RBAL)              0 ACTIVE      107650374 Other           GPnP Get Item                         WAITING
    
    grid@anbob:/home/grid>asmcmd
    ASMCMD> lsdg
    State    Type    Rebal  Sector  Block       AU   Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
    MOUNTED  EXTERN  N         512   4096  1048576      10240      9751                0            9751              0             Y  CRSDG/
    MOUNTED  EXTERN  N         512   4096  1048576  100663296      2375                0            2375              0             N  DATADG/
    MOUNTED  EXTERN  N         512   4096  1048576   31457280    228642                0          228642              0             N  DATADG2/
    MOUNTED  EXTERN  N         512   4096  1048576   25165824  16809264                0        16809264              0             N  DATADG3/
    ASMCMD> lsop
    Group_Name  Dsk_Num  State  Power  EST_WORK  EST_RATE  EST_TIME  
    ASMCMD> 
    
    
    SQL> select pid,spid from v$process where addr='00000000A1B3C2A0';
    
           PID SPID
    ---------- ------------------------
            18 28092
    
    SQL> oradebug setorapid 18
    Oracle pid: 18, Unix process pid: 28092, image: oracle@anbob (RBAL)
    SQL> oradebug setorapid 18
    Oracle pid: 18, Unix process pid: 28092, image: oracle@anbob (RBAL)
    SQL> oradebug short_stack;
    ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-__poll()+47<-sgipcwWaitHelper()+5002<-sgipcwWait()+718<-gipcWaitOsd()+387<-gipcInternalWait()+14323<-gipcWaitF()+3225<-gipcInternalRecvSync()+8749<-gipcRecvSyncF()+3334<-clsgpnpm_gipcGets()+297<-clsgpnpm_receiveMsg()+414<-clsgpnpm_exchange()+1059<-clsgpnp_profileCallUrlInt()+3568<-clsgpnp_getProfileEx()+325<-clsgpnp_dbmsGetItem()+263<-kggpnpAttrGet()+926<-kfdParseProfileString()+924<-kfdDiscoverShallow()+2023<-kfgbDriver()+1768<-ksbabs()+771<-ksbrdp()+1045<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+230
    SQL> oradebug short_stack;
    ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-__poll()+47<-sgipcwWaitHelper()+5002<-sgipcwWait()+718<-gipcWaitOsd()+387<-gipcInternalWait()+14323<-gipcWaitF()+3225<-gipcInternalRecvSync()+8749<-gipcRecvSyncF()+3334<-clsgpnpm_gipcGets()+297<-clsgpnpm_receiveMsg()+414<-clsgpnpm_exchange()+1059<-clsgpnp_profileCallUrlInt()+3568<-clsgpnp_getProfileEx()+325<-clsgpnp_dbmsGetItem()+263<-kggpnpAttrGet()+926<-kfdParseProfileString()+924<-kfdDiscoverShallow()+2023<-kfgbDriver()+1768<-ksbabs()+771<-ksbrdp()+1045<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+230
    
    
    Parsed short stack:
    ksedsts()+465                        kernel service  error debug dump the call stack short stack 
    ksdxfstk()+32                        kernel service  debug internal errors dump abridged os stack 
    ksdxcb()+1927                        kernel service  debug internal errors ksdx callback for sosd layer signal handler 
    sspuser()+112                        operating system dependent system handle SIGUSR2 for Oracle 
    __sighandler()                        (?)  [partial hit for:  ] 
    __poll()+47                           (?)  [partial hit for:  ] 
    sgipcwWaitHelper()+5002              operating system dependent grid inter process calls  [partial hit for: sgipc ] 
    sgipcwWait()+718                     operating system dependent grid inter process calls  [partial hit for: sgipc ] 
    gipcWaitOsd()+387                    grid inter process calls  [partial hit for: gipc ] 
    gipcInternalWait()+14323             grid inter process calls  [partial hit for: gipc ] 
    gipcWaitF()+3225                     grid inter process calls  [partial hit for: gipc ] 
    gipcInternalRecvSync()+8749          grid inter process calls  [partial hit for: gipc ] 
    gipcRecvSyncF()+3334                 grid inter process calls  [partial hit for: gipc ] 
    clsgpnpm_gipcGets()+297              clusterware integration grid plug and play  [partial hit for: clsgpnp ] 
    clsgpnpm_receiveMsg()+414            clusterware integration grid plug and play  [partial hit for: clsgpnp ] 
    clsgpnpm_exchange()+1059             clusterware integration grid plug and play  [partial hit for: clsgpnp ] 
    clsgpnp_profileCallUrlInt()+3568     clusterware integration grid plug and play  [partial hit for: clsgpnp ] 
    clsgpnp_getProfileEx()+325           clusterware integration grid plug and play  [partial hit for: clsgpnp ] 
    clsgpnp_dbmsGetItem()+263            clusterware integration grid plug and play  [partial hit for: clsgpnp ] 
    kggpnpAttrGet()+926                  kernel generic generic routines  [partial hit for: kgg ] 
    kfdParseProfileString()+924          kernel automatic storage management disk  [partial hit for: kfd ] 
    kfdDiscoverShallow()+2023            kernel automatic storage management disk  [partial hit for: kfd ] 
    kfgbDriver()+1768                    kernel automatic storage management diskgroups background  [partial hit for: kfgb ] 
    ksbabs()+771                         kernel service  background processes action based server 
    ksbrdp()+1045                        kernel service  background processes run a detached background process 
    opirip()+623                         oracle program interface run independent process. invoked from special function sou2o direct call in opimai of detached process invocation. initializes fixed-PGA in RAM and SQL layers. 
    
    SQL> oradebug setmypid
    Statement processed.
    SQL> oradebug unlimit
    Statement processed.
    SQL> oradebug -g all hanganalyze 3
    Hang Analysis in /oracle/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_diag_32198.trc
    
    
    ===============================================================================
    
    Chains most likely to have caused the hang:
     [a] Chain 1 Signature: 'GPnP Get Item'<='rdbms ipc reply'<='enq: DD - contention'
         Chain 1 Signature Hash: 0xec6b634f
     [b] Chain 2 Signature: 'GPnP Get Item'<='rdbms ipc reply'<='enq: DD - contention'
         Chain 2 Signature Hash: 0xec6b634f
     [c] Chain 3 Signature: 'GPnP Get Item'<='rdbms ipc reply'<='enq: DD - contention' 
    Chain 3 Signature Hash: 0xec6b634f
    =============================================================================== 
    
    SQL> show parameter string
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    asm_diskstring                       string      /dev/*-disk*
    
    grid@anbob:/home/grid>kfod asm_diskstring='/dev/*-disk*' disks=all
    --------------------------------------------------------------------------------
     Disk          Size Path                                     User     Group   
    ================================================================================
       1:    1048576 Mb /dev/asm-disk001                         grid     asmadmin
       2:    1048576 Mb /dev/asm-disk002                         grid     asmadmin
       3:    1048576 Mb /dev/asm-disk003                         grid     asmadmin
       4:    1048576 Mb /dev/asm-disk004                         grid     asmadmin
       5:    1048576 Mb /dev/asm-disk005                         grid     asmadmin
       6:    1048576 Mb /dev/asm-disk006                         grid     asmadmin
    ...
    

    Cause
    This is caused by non-published bug:12398300 which is a duplicate of bug:12356910 (also non-published).

    Solution
    Killing the ora.gpnpd on the node when ASM is blocked in the gpnp wait, permits to avoid to stop the ASM instance.

    grid@anbob:/home/grid>ps -ef|grep gpnp
    grid     16952     1  0  2016 ?        09:44:39 /oracle/app/11.2.0.4/grid/bin/gpnpd.bin
    grid     32187 31703  0 18:04 pts/1    00:00:00 grep gpnp
    
    grid@anbob:/home/grid>kill -HUP 16952
    
    grid@anbob:/home/grid>ps -ef|grep gpnp
    grid      6147     1  2 18:04 ?        00:00:00 /oracle/app/11.2.0.4/grid/bin/gpnpd.bin
    grid     11573 31703  0 18:05 pts/1    00:00:00 grep gpnp
    

    Note:
    kill -HUP The HUP signal is sent to a process when its controlling terminal is closed. It was originally designed to notify the process of a serial line drop (HUP stands for “Hang Up”). In modern systems, this signal usually means that the controlling pseudo or virtual terminal has been closed.

    It is recommended to query  usage of ASM DISK[GROUP] using  the V$ASM_DISK[GROUP]_STAT view, the VIEW data is in memory without scanning disk, and has the same column as v$asm_diskgroup.

  • 相关阅读:
    初入博客
    MFC中Enter、ESC的屏蔽及PreTranslateMessage
    数据结构中链表的创建、添加、删除、清空、倒序输出及链表倒置
    数据库重要知识点总结
    云端服务器永久运行node项目的方法!!!!!!!!
    腾讯云centeros mysql相关问题与解决指南!!!!搭建属于自己的前端服务器!!!!
    最全的正则匹配!!!!!!!!!手机号,邮箱
    win10硬盘开启 bitlocker后手动加锁
    myeclipse相同变量的颜色高亮
    一款基于Bootstrap扁平化的后台框架Ace
  • 原文地址:https://www.cnblogs.com/nadian-li/p/13369240.html
Copyright © 2011-2022 走看看