  • ASMB bug (ORA-04030, kfmditer) causes a database crash

    Symptom:
    One instance of a customer's critical production RAC system crashed. The alert log showed:

    Fri Jun 21 17:05:52 2013
    Errors in file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_asmb_11391.trc (incident=31397):
    ORA-04030: out of process memory when trying to allocate 592 bytes (callheap,kfmditer)
    Incident details in: /opt/app/diag/rdbms/jyj/jyj1/incident/incdir_31397/jyj1_asmb_11391_i31397.trc

    Fri Jun 21 17:05:55 2013
    Errors in file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_rbal_11389.trc (incident=31389):
    ORA-04030: out of process memory when trying to allocate bytes (,)
    Incident details in: /opt/app/diag/rdbms/jyj/jyj1/incident/incdir_31389/jyj1_rbal_11389_i31389.trc
    Fri Jun 21 17:06:14 2013
    Instance terminated by ASMB, pid = 11391

    Next, the ASMB trace file. The alert log entries that reference it:
    Errors in file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_asmb_11391.trc (incident=31397):
    ORA-04030: out of process memory when trying to allocate 592 bytes (callheap,kfmditer)
    Incident details in: /opt/app/diag/rdbms/jyj/jyj1/incident/incdir_31397/jyj1_asmb_11391_i31397.trc
    Fri Jun 21 17:05:52 2013
    Trace dumping is performing id=[cdmp_20130621170552]
    Errors in file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_asmb_11391.trc:
    ORA-04030: out of process memory when trying to allocate 592 bytes (callheap,kfmditer)
    ASMB (ospid: 11391): terminating the instance due to error 4030
    System state dump is made for local instance
    System State dumped to trace file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_diag_11345.trc
    Fri Jun 21 17:05:53 2013
    Errors in file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_lms0_11363.trc (incident=31301):
    ORA-04030: out of process memory when trying to allocate bytes (,)
    Incident details in: /opt/app/diag/rdbms/jyj/jyj1/incident/incdir_31301/jyj1_lms0_11363_i31301.trc
    Fri Jun 21 17:05:53 2013
    Errors in file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_lmon_11359.trc (incident=31277):
    ORA-04030: out of process memory when trying to allocate bytes (,)
    Incident details in: /opt/app/diag/rdbms/jyj/jyj1/incident/incdir_31277/jyj1_lmon_11359_i31277.trc
    Fri Jun 21 17:05:53 2013
    Errors in file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_lms1_11367.trc (incident=31309):
    ORA-04030: out of process memory when trying to allocate bytes (,)
    Incident details in: /opt/app/diag/rdbms/jyj/jyj1/incident/incdir_31309/jyj1_lms1_11367_i31309.trc
    Fri Jun 21 17:05:54 2013
    ORA-1092 : opitsk aborting process
    Fri Jun 21 17:05:54 2013
    License high water mark = 327
    Fri Jun 21 17:05:55 2013
    Errors in file /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_rbal_11389.trc (incident=31389):
    ORA-04030: out of process memory when trying to allocate bytes (,)
    Incident details in: /opt/app/diag/rdbms/jyj/jyj1/incident/incdir_31389/jyj1_rbal_11389_i31389.trc
    Fri Jun 21 17:06:14 2013
    Instance terminated by ASMB, pid
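
When an alert log contains a burst of ORA-04030 entries like the one above, it helps to tabulate which background process raised each error and which process terminated the instance. The following Python sketch does that for log text in the format shown above. The function name `summarize_ora4030` and the regular expressions are my own assumptions derived from these sample lines; this is not an Oracle-provided tool.

```python
import re

# Patterns inferred from the alert log lines above; they are assumptions,
# not an official description of the alert log format.
TRACE_RE = re.compile(r"Errors in file \S*/([^/_\s]+)_([a-z0-9]+)_(\d+)\.trc")
ORA4030_RE = re.compile(
    r"ORA-04030: out of process memory when trying to allocate"
    r"\s*(\d*)\s*bytes \((\w*),(\w*)\)"
)
TERMINATE_RE = re.compile(
    r"(\w+) \(ospid: (\d+)\): terminating the instance due to error (\d+)"
)


def summarize_ora4030(alert_text):
    """Return (errors, terminator) parsed from alert log text.

    errors is a list of dicts (process, pid, bytes, heap, comment), one per
    ORA-04030 line that follows an "Errors in file" line.
    terminator is a (process, pid, error) tuple, or None if no process is
    reported as terminating the instance.
    """
    errors = []
    terminator = None
    current = None  # (process, pid) taken from the latest "Errors in file" line
    for line in alert_text.splitlines():
        m = TRACE_RE.search(line)
        if m:
            current = (m.group(2).upper(), int(m.group(3)))
            continue
        m = ORA4030_RE.search(line)
        if m and current:
            errors.append({
                "process": current[0],
                "pid": current[1],
                # Some entries in the log above omit the size and heap names.
                "bytes": int(m.group(1)) if m.group(1) else None,
                "heap": m.group(2),
                "comment": m.group(3),
            })
            current = None
            continue
        m = TERMINATE_RE.search(line)
        if m:
            terminator = (m.group(1), int(m.group(2)), int(m.group(3)))
    return errors, terminator
```

Calling `summarize_ora4030(open(path).read())` on the alert log returns the list of failing processes and the terminating process. Errors that the alert log reports twice are counted twice, exactly as logged.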

    jyj1_asmb_11391_i31397.trc:

    Dump file /opt/app/diag/rdbms/jyj/jyj1/incident/incdir_31397/jyj1_asmb_11391_i31397.trc
    Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
    With the Partitioning, Real Application Clusters, OLAP, Data Mining
    and Real Application Testing options
    ORACLE_HOME = /opt/app/ora11gR1db
    System name: Linux
    Node name: KSJYJ_DB01
    Release: 2.6.18-164.el5
    Version: #1 SMP Thu Sep 3 04:15:13 EDT 2009
    Machine: x86_64
    Instance name: jyj1
    Redo thread mounted by this instance: 1
    Oracle process number: 24
    Unix process pid: 11391, image: oracle@KSJYJ_DB01 (ASMB)


    *** 2013-06-21 17:05:52.045
    *** SESSION ID:(532.1) 2013-06-21 17:05:52.046
    *** CLIENT ID:() 2013-06-21 17:05:52.046
    *** SERVICE NAME:(SYS$BACKGROUND) 2013-06-21 17:05:52.046
    *** MODULE NAME:() 2013-06-21 17:05:52.046
    *** ACTION NAME:() 2013-06-21 17:05:52.046
     
    Dump continued from file: /opt/app/diag/rdbms/jyj/jyj1/trace/jyj1_asmb_11391.trc
    ORA-04030: out of process memory when trying to allocate 592 bytes (callheap,kfmditer)

    ========= Dump for incident 31397 (ORA 4030) ========

    *** 2013-06-21 17:05:52.046
    ----- SQL Statement (None) -----
    Current SQL information unavailable - no cursor.

    skdstdst <- ksedst1 <- ksedst <- dbkedDefDump <- ksedmp
     <- ksfdmp <- dbgexPhaseII <- dbgexProcessError <- dbgeExecuteForError <- dbgePostErrorKGE
     <- 1774 <- dbkePostKGE_kgsf <- kgesev <- kgesec3 <- kghnospc
     <- kghalf <- kfmdIterInit <- kfkIterInit <- kfnbIostatiterOp <- 110
     <- kfnbRun <- ksbrdp <- opirip <- opidrv <- sou2o

    Process state
    -----------------------

    SO: 0x940dd1b98, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x940dd1b98, name=process, file=ksu.h LINE:10286, pg=0
     (process) Oracle pid:24, ser:1, calls cur/top: 0x920f28eb8/0x920f28eb8
     flags: (0x6) SYSTEM
     int error: 0, call error: 0, sess error: 0, txn error 0
     (post info) last post received: 0 0 34
     last post received-location: ksr2.h LINE:594 ID:ksrpublish
     last process to post me: 950dfd540 47 2
     last post sent: 0 0 64
     last post sent-location: kso2.h LINE:316 ID:ksoreq_reply
     last process posted by me: 930e5c948 1 0
     (latch info) wait_event=0 bits=0
     Process Group: DEFAULT, pseudo proc: 0x950e4c060
     O/S info: user: oracle, term: UNKNOWN, ospid: 11391
     OSD pid info: Unix process pid: 11391, image: oracle@KSJYJ_DB01 (ASMB)
    Dump of memory from 0x00000009D0DC0A10 to 0x00000009D0DC0C18


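    Analysis:
    Judging from the error (ORA-04030), I suspected an Oracle bug, because I had previously run into a similar memory leak bug in the ASMB process.
    So I searched Metalink for the keywords: asmb 04030
    The very first hit matched the customer's problem.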
    ASMB process grows raising ora-4030 intermittently (Doc ID 735180.1)
    ASMB process grows on memory, eventually leading to ora-4030 errors
    which causes DB crash.

    The reported error:
    ORA-04030: out of process memory when trying to allocate 552 Bytes (callheap,kfmditer)
     
    In the ASMB process heapdump we can see most of memory chunks are for 'kfmditer',
    example:

     BreakDown
     ~~~~~~~~~
     Type     Count   Sum        Average
     ~~~~     ~~~~~   ~~~        ~~~~~~~
     Free     285684  142841492  500.00
     kfmditer 285685  157698132  552.00   <-- the ASMB heap dump likewise shows that the vast majority of chunks are kfmditer

     Total = 300539624 bytes 293495.73k 286.62MB
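
The breakdown quoted above can be checked with a little arithmetic: the two rows add up to the stated total, kfmditer chunks account for just over half of the heap, and the free chunks are almost exactly as numerous as the kfmditer ones. A minimal Python sketch using only the numbers from the note (the variable and function names are mine):

```python
# Rows copied from the heap dump breakdown above: type -> (count, sum in bytes)
breakdown = {
    "Free": (285684, 142841492),
    "kfmditer": (285685, 157698132),
}


def heap_summary(rows):
    """Compute the total size, per-type share and average chunk size."""
    total = sum(size for _, size in rows.values())
    return {
        "total_bytes": total,
        "total_mb": round(total / 1024 / 1024, 2),
        "share": {name: size / total for name, (_, size) in rows.items()},
        "average": {name: round(size / count, 2)
                    for name, (count, size) in rows.items()},
    }


summary = heap_summary(breakdown)
print(summary["total_bytes"], "bytes =", summary["total_mb"], "MB")
for name, share in summary["share"].items():
    print(f"{name}: {share:.1%} of the heap, "
          f"average chunk {summary['average'][name]} bytes")
```

The per-chunk average of 552 bytes is also the allocation size in the error message quoted in the note (552 bytes, kfmditer).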
     
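     This bug affects 11.1 (the customer runs 11.1.0.6, per the trace header above) and, judging by the fix list, some 10.2 releases as well. It is fixed in the following releases and patch sets: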
     
     This issue is fixed in

    11.2.0.1 (Base Release)
    11.1.0.7.1 (Patch Set Update)
    10.2.0.5 (Server Patch Set)
    11.1.0.7 Patch 11 on Windows Platforms
    11.1.0.7 RAC Recommended Patch Bundle #1
    11.1.0.6 Patch 11 on Windows Platforms

    If you would rather not upgrade to a newer patch set, you can instead apply the one-off Patch 6851110, which resolves this problem.
    You can check if Patch 6851110 is available for your patchset release and
    O/S environment.
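
Before and after patching, it is worth confirming whether the one-off patch is already in the Oracle home. The sketch below runs `opatch lsinventory` and looks for the patch number. It assumes the inventory listing names each interim patch on a line that begins with `Patch` followed by the number; verify that against the output of your own OPatch version. The function names are mine.

```python
import os
import re
import subprocess


def read_inventory(oracle_home):
    """Run `opatch lsinventory` for the given ORACLE_HOME and return its output."""
    opatch = os.path.join(oracle_home, "OPatch", "opatch")
    result = subprocess.run(
        [opatch, "lsinventory"],
        capture_output=True,
        text=True,
        check=True,
        env=dict(os.environ, ORACLE_HOME=oracle_home),
    )
    return result.stdout


def patch_listed(inventory_text, patch_number):
    """Return True if the inventory text lists the given patch number.

    Assumption: an installed interim patch appears on a line that starts
    with "Patch" followed by the patch number.
    """
    pattern = re.compile(
        r"^\s*Patch\s+" + re.escape(str(patch_number)) + r"\b",
        re.MULTILINE,
    )
    return bool(pattern.search(inventory_text))
```

For the environment in this post the call would be `patch_listed(read_inventory("/opt/app/ora11gR1db"), 6851110)`, run as the Oracle software owner.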

    Resolution:
    I applied Patch 6851110 to the customer's database. After a period of continued monitoring, the problem has not recurred.
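
One way to do this kind of follow-up monitoring on Linux (the platform shown in the trace header) is to sample the resident set size of the ASMB process from `/proc/<pid>/status`. The sketch below is an assumption about how one might do it, not the procedure actually used at the customer site, and the function names are mine. Note that VmRSS also counts shared memory pages the process has touched, so treat it as a rough trend indicator rather than an exact measure of the process heap.

```python
import re


def parse_vm_rss_kb(status_text):
    """Extract VmRSS in kB from the contents of /proc/<pid>/status, or None."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def read_rss_kb(pid):
    """Read the current resident set size (kB) of a running process on Linux."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_rss_kb(f.read())


def is_growing(samples_kb, min_increase_kb=0):
    """True if each sample exceeds the previous one by more than min_increase_kb."""
    return len(samples_kb) >= 2 and all(
        later - earlier > min_increase_kb
        for earlier, later in zip(samples_kb, samples_kb[1:])
    )
```

Collecting `read_rss_kb(asmb_pid)` at a fixed interval and passing the samples to `is_growing` gives a simple signal that the process is still growing steadily. The ASMB pid changes on every instance restart, so look it up first (for example with `ps -ef | grep asmb`).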

  • Original article: https://www.cnblogs.com/fuhaots2009/p/3481872.html