zoukankan      html  css  js  c++  java
  • ora-7445[opiaba],ora-600 [kgh_heap_sizes:ds], ora-600 [17147]

    最近有一套数据库 crash ,环境2nodes RAC 11.2.0.3 on AIX , alert日志中记录了在crash前出现了ora-7445[opiaba],ora-600 [kgh_heap_sizes:ds], ora-600 [17147], 下面记录当时的一些日志。

    # DB alert log。

    Wed Sep 16 23:44:23 2015
    Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xF00000E161B7CD2] [PC:0x10662AFD4, opiaba()+788] [flags: 0x0, count: 1]
    Errors in file /u01/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_21759124.trc  (incident=191931):
    ORA-07445: 出现异常错误: 核心转储 [opiaba()+788] [SIGSEGV] [ADDR:0xF00000E161B7CD2] [PC:0x10662AFD4] [Address not mapped to object] []
    Incident details in: /u01/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_191931/anbob1_ora_21759124_i191931.trc
    Use ADRCI or Support Workbench to package the incident.
    See Note 411.1 at My Oracle Support for error and packaging details.
    Wed Sep 16 23:46:37 2015
    Errors in file /u01/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_21759124.trc  (incident=191932):
    ORA-00600: 内部错误代码, 参数: [kgh_heap_sizes:ds], [0x700000E82B30018], [], [], [], [], [], [], [], [], [], []
    ORA-07445: 出现异常错误: 核心转储 [opiaba()+788] [SIGSEGV] [ADDR:0xF00000E161B7CD2] [PC:0x10662AFD4] [Address not mapped to object] []
    Incident details in: /u01/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_191932/anbob1_ora_21759124_i191932.trc
    Use ADRCI or Support Workbench to package the incident.
    See Note 411.1 at My Oracle Support for error and packaging details.
    Wed Sep 16 23:47:14 2015
    Use ADRCI or Support Workbench to package the incident.
    See Note 411.1 at My Oracle Support for error and packaging details.
    Wed Sep 16 23:47:14 2015
    Dumping diagnostic data in directory=[cdmp_20150916234714], requested by (instance=1, osid=21759124), summary=[incident=191932].
    Wed Sep 16 23:56:58 2015
    Errors in file /u01/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_14549208.trc  (incident=203370):
    ORA-04030: out of process memory when trying to allocate 103504 bytes (QERHJ hash-joi,kllcqc:kllcqslt)
    Use ADRCI or Support Workbench to package the incident.
    See Note 411.1 at My Oracle Support for error and packaging details.
    Wed Sep 16 23:57:02 2015
    Sweep [inc][203370]: completed
    Sweep [inc][191932]: completed
    Sweep [inc][191931]: completed
    Wed Sep 16 23:58:03 2015
    Process 0x700000fd0ac9dc8 appears to be hung while dumping
    Current time = 2507987599, process death time = 2507922725 interval = 60000
    Attempting to kill process 0x700000fd0ac9dc8 with OS pid = 21759124
    OSD kill succeeded for process 700000fd0ac9dc8
    ERROR: Unable to normalize symbol name for the following short stack (at offset 521):
    dbgexProcessError()+180<-dbgeExecuteForError()+72<-dbgePostErrorKGE()+2048<-dbkePostKGE_kgsf()+68<-kgeade()+364<-kgeriv_int()+116<-kgeriv()+36<-kgeric1()+60<-kgherror()+324<-kghfrmrg()+1964<-kghfre()+368<-kghfrh_internal()+552<-kghfrh()+16<-kksFreeHeap()+272<-kksLockDelete()+760<-kksLockOperation()+140<-kgllkdl()+3136<-kgllkds()+104<-kssxdl()+1872<-kssdel()+724<-kssdch()+5732<-ksuxds()+940<-kssxdl()+472<-kssdel()+724<-kssdch()+5732<-ksudlp()+792<-kssxdl()+472<-kssdel()+724<-ksuxdl()+456<-ksuxda_del_proc()+188<-IPRA.$ksucln_dpc_cleanup()+272<-IPRA.$ksucln_dpc_dfs()+268<-ksucln_dpc_main()+892<-ksucln_dpc()+36<-ksucln()+1340<-ksbrdp()+2044<-opirip()+1624<-opidrv()+608<-sou2o()+136<-opimai_real()+188<-ssthrdmain()+268<-main()+204<-__start()+112
    Errors in file /u01/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_pmon_16253104.trc  (incident=190123):
    ORA-00600: internal error code, arguments: [17147], [0x700000E82B30000], [], [], [], [], [], [], [], [], [], []
    Incident details in: /u01/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_190123/anbob1_pmon_16253104_i190123.trc
    Use ADRCI or Support Workbench to package the incident.
    See Note 411.1 at My Oracle Support for error and packaging details.
    Wed Sep 16 23:58:12 2015
    Dumping diagnostic data in directory=[cdmp_20150916235812], requested by (instance=1, osid=16253104 (PMON)), summary=[incident=190123].
    Errors in file /u01/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_pmon_16253104.trc:
    ORA-00600: internal error code, arguments: [17147], [0x700000E82B30000], [], [], [], [], [], [], [], [], [], []
    PMON (ospid: 16253104): terminating the instance due to error 472
    Wed Sep 16 23:58:12 2015
    opiodr aborting process unknown ospid (15336054) as a result of ORA-1092
    Wed Sep 16 23:58:14 2015
    ORA-1092 : opitsk aborting process
    Wed Sep 16 23:58:18 2015
    Instance terminated by PMON, pid = 16253104
    

    # anbob1_pmon_16253104.trc

    *** TRACE FILE RECREATED AFTER BEING REMOVED ***
    
    Process 0x700000fd0ac9dc8 appears to be hung while dumping
    Current time = 2507987599, process death time = 2507922725 interval = 60000
    Attempting to kill process 0x700000fd0ac9dc8 with OS pid = 21759124
    OSD kill succeeded for process 700000fd0ac9dc8
    
    *** 2015-09-16 23:58:06.766
    KGX cleanup...
    KGX Atomic Operation Log 70000101e762520
     Mutex 700000ea6aeec70(1300, 0) idn 49899eb8 oper EXCL
     Cursor Parent uid 1300 efd 11 whr 17 slp 0
     oper=OPERATION_DEFAULT pt1=700000ea6aef098 pt2=700000ea6aeebf8 pt3=0
     pt4=0 u41=0 stt=0
    Incident 190123 created, dump file: /u01/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_190123/anbob1_pmon_16253104_i190123.trc
    ORA-00600: internal error code, arguments: [17147], [0x700000E82B30000], [], [], [], [], [], [], [], [], [], []
    
    PMON: fatal error while deleting s.o. 700001011075e88 in this tree:
    ----------------------------------------
    SO: 0x700000fd0ac9dc8, type: 2, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x700000fd0ac9dc8, name=process, file=ksu.h LINE:12616 ID:, pg=0
    (process) Oracle pid:228, ser:25, calls cur/top: 0x700000f0aca7218/0x700000f0ac85808
              flags : (0x1) DEAD
              flags2: (0x8000),  flags3: (0x10) 
              intr error: 0, call error: 0, sess error: 0, txn error 0
              intr queue: empty
      ksudlp FALSE at location: 0
      Cleanup details:
        Found dead = 5 sec ago
        Total Cleanup attempts = 1, Cleanup time = 0 sec, Cleanup timer = 0.000 sec
        Last attempt (full) started 5 sec ago, Length = in progress
      (post info) last post received: 2147483647 0 2
                  last post received-location: ksl2.h LINE:2362 ID:kslpsr
                  last process to post me: 700000fe0aa3210 1 6
                  last post sent: 0 0 26
                  last post sent-location: ksa2.h LINE:285 ID:ksasnd
                  last process posted by me: 700000fb8ab0d80 1 6
      (latch info) wait_event=0 bits=0
      Process Group: DEFAULT, pseudo proc: 0x700000fc0b81778
      O/S info: user: grid, term: UNKNOWN, ospid: 21759124 (DEAD)
      OSD pid info: Unix process pid: 21759124, image: oracle@4adata1
      ----------------------------------------
      SO: 0x700001020e39a58, type: 4, owner: 0x700000fd0ac9dc8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
       proc=0x700000fd0ac9dc8, name=session, file=ksu.h LINE:12624 ID:, pg=0
      (session) sid: 1300 ser: 54478 trans: 0x0, creator: 0x700000fd0ac9dc8
                flags: (0x100045) USR/- flags_idl: (0x21) BSY/-/-/DEL/-/-
                flags2: (0x40008) -/-/-
                DID: , short-term DID: 
                txn branch: 0x0
                oct: 47, prv: 0, sql: 0x70000100eea0d10, psql: 0x70000100b3773b8, user: 118/HBYDRMS
      ksuxds FALSE at location: 0
      service name: SYS$USERS
      client details:
        O/S info: user: Administrator, term: unknown, ospid: 1234
        machine: WIN-1BBB4BL8USU program: JDBC Thin Client
      Current Wait Stack:
        Not in wait; last wait ended 167 min 17 sec ago 
      Wait State:
        fixed_waits=0 flags=0x21 boundary=0x0/-1
      Session Wait History:
          elapsed time of 167 min 17 sec since last wait
       0: waited for 'latch: shared pool'
          address=0x700000000109838, number=0x133, tries=0x0
          wait_id=2824 seq_num=2825 snap_id=1
          wait times: snap=0.000022 sec, exc=0.000022 sec, total=0.000022 sec
          wait times: max=infinite
          wait counts: calls=0 os=0
          occurred after 129 min 40 sec of elapsed time
       1: waited for 'latch: shared pool'
          address=0x700000000109838, number=0x133, tries=0x0
          wait_id=2823 seq_num=2824 snap_id=1
          wait times: snap=23.825311 sec, exc=23.825311 sec, total=23.825311 sec
          wait times: max=infinite
          wait counts: calls=0 os=0
          occurred after 0.015511 sec of elapsed time
    	  ...
    	  ...
    	  ...
    ORA-00600: internal error code, arguments: [17147], [0x700000E82B30000], [], [], [], [], [], [], [], [], [], []
    kjzduptcctx: Notifying DIAG for crash event
    ----- Abridged Call Stack Trace -----
    ksedsts()+360<-kjzdssdmp()+240<-kjzduptcctx()+228<-kjzdicrshnfy()+100<-ksuitm()+5124<-ksbrdp()+4508<-opirip()+1624<-opidrv()+608<-sou2o()+136<-opimai_real()+188<-ssthrdmain()+268<-main()+204<-__start()+112 
    ----- End of Abridged Call Stack Trace -----
    
    *** 2015-09-16 23:58:12.650
    PMON (ospid: 16253104): terminating the instance due to error 472
    ksuitm: waiting up to [5] seconds before killing DIAG(16843208)
    

    # anbob1_ora_21759124_i191931.trc

    Dump file /u01/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_191931/anbob1_ora_21759124_i191931.trc
    Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
    With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
    Data Mining and Real Application Testing options
    ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1
    System name: AIX
    Node name: 4adata1
    Release: 1
    Version: 6
    Machine: 00F8D36E4C00
    Instance name: anbob1
    Redo thread mounted by this instance: 1
    Oracle process number: 228
    Unix process pid: 21759124, image: oracle@4adata1
    
    
    Dump continued from file: /u01/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_21759124.trc
    ORA-07445: 出现异常错误: 核心转储 [opiaba()+788] [SIGSEGV] [ADDR:0xF00000E161B7CD2] [PC:0x10662AFD4] [Address not mapped to object] []
    
    
    *** 2015-09-16 23:44:23.229
    dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
    ----- Current SQL Statement for this session (sql_id=by0ga2x4sm7ps) -----
    begin execute immediate 'TRUNCATE TABLE sm_organization'; insert into sm_organization(groupcode,groupname,parentgroupcode) values(:1 ,:2 ,:3 );
    .............
    ; insert into sm_organization(groupcode,groupname,parentgroupcode) values(:412987 ,:412988 ,:412989 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:412990 ,:412991 ,:412992 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:412993 ,:412994 ,:412995 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:412996 ,:412997 ,:412998 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:412999 ,:413000 ,:413001 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413002 ,:413003 ,:413004 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413005 ,:413006 ,:413007 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413008 ,:413009 ,:413010 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413011 ,:413012 ,:413013 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413014 ,:413015 ,:413016 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413017 ,:413018 ,:413019 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413020 ,:413021 ,:413022 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413023 ,:413024 ,:413025 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413026 ,:413027 ,:413028 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413029 ,:413030 ,:413031 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413032 ,:413033 ,:413034 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413035 ,:413036 ,:413037 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413038 ,:413039 ,:413040 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413041 ,:413042 ,:413043 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413044 ,:413045 ,:413046 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413047 ,:413048 ,:413049 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413050 ,:413051 ,:413052 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413053 ,:413054 ,:413055 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413056 ,:413057 ,:413058 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413059 ,:413060 ,:413061 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413062 ,:413063 ,:413064 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413065 ,:413066 ,:413067 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413068 ,:413069 ,:413070 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413071 ,:413072 ,:413073 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413074 ,:413075 ,:413076 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413077 ,:413078 ,:413079 ); insert into sm_organization(groupcode,groupname,parentgroupcode) values(:413080 ,:413081 ,:413082 ); end; 
    
    ----- Call Stack Trace -----
    skdstdst <- ksedst1 <- ksedst <- dbkedDefDump <- ksedmp
    <- ssexhd <- 48bc <- opiprs <- kksParseChildCursor <- rpiswu2
    <- kksLoadChild <- kxsGetRuntimeLock <- kksfbc <- kkspsc0 <- kksParseCursor
    <- opiosq0 <- kpooprx <- kpoal8 <- opiodr <- ttcpip
    <- opitsk <- opiino <- opiodr <- opidrv <- sou2o
    <- opimai_real <- ssthrdmain <- main <- start 
    

    Note:
    触发ora-7445的PL/SQL是一条拼接的匿名块清空表后再多次insert的调用,绑定变量个数超过了41万, 在MOS中找到了篇note非常符合, instance crashed after ORA-7445 [opiaba] and ORA-600 [17147] (文档 ID 1466343.1),原因是当同一个sql/ plsql文本中使用的绑定变量个数大于65535时触发的bug 12578873, 导致数据库instance crash。
    符合note中记录的在call stack 中含部分或全部下列函数调用:
    opiaba opiprs rpiswu2 kksLoadChild kxsGetRuntimeLock kksfbc

    Solution:
    解决方法是可以打上补丁12578873, 该补丁只是避免类似SQL时宕机, 同时还是要建议修改SQL 使用少于65535的绑定变量个数, 本案例建议使用loop 修改, 另外注意大量的绑定变量使用在SQL解析时需要更久的DB TIME。

  • 相关阅读:
    修改eclipse的背景色(转载)
    c#调用 windows api实现WinForm中嵌入EXE程序
    VS2008序列号
    Microsoft Visual Studio 2005 获取与升级
    Oracle数组一般可以分为固定数组和可变数组
    深圳香港之行杂记
    [难过]小明住院了
    青岛之行杂记
    喀纳斯之行杂记
    亲历北京721大雨
  • 原文地址:https://www.cnblogs.com/shujuyr/p/15355224.html
Copyright © 2011-2022 走看看