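Reposted from http://blog.itpub.net/28883355/viewspace-1080879/
oradebug can start tracing any session, dump the SGA and other memory structures, wake up Oracle processes such as SMON and PMON, suspend and resume a process by its process ID, and a good deal more. Most of these features are rarely needed day to day, but you will often see oradebug used when other people diagnose problems. The part I find handiest is that it prints the name of the generated trace file, full path included, which saves a lot of hassle. It is also useful for analyzing a hung system. Here are its most common uses.
After logging in as SYSDBA: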
SQL> oradebug help
HELP [command] Describe one or all commands
SETMYPID Debug current process
SETOSPID Set OS pid of process to debug
SETORAPID ['force'] Set Oracle pid of process to debug
SETORAPNAME Set Oracle process name to debug
SHORT_STACK Get abridged OS stack -- shows the process's call stack
CURRENT_SQL Get current SQL
DUMP <dump_name>[addr] Invoke named dump
DUMPSGA [bytes] Dump fixed SGA
DUMPLIST Print a list of available dumps
EVENT Set trace event in process
SESSION_EVENT Set trace event in session
DUMPVAR <p|s|uga> [level] Print/dump a fixed PGA/SGA/UGA variable
DUMPTYPE
SETVAR <p|s|uga> Modify a fixed PGA/SGA/UGA variable
PEEK [level] Print/Dump memory
POKE Modify memory
WAKEUP Wake up Oracle process
SUSPEND Suspend execution
RESUME Resume execution
FLUSH Flush pending writes to trace file
CLOSE_TRACE Close trace file
TRACEFILE_NAME Get name of trace file
LKDEBUG Invoke global enqueue service debugger
NSDBX Invoke CGS name-service debugger
-G Parallel oradebug command prefix
-R Parallel oradebug prefix (return output
SETINST <instance# .. | all> Set instance list in double quotes
SGATOFILE Dump SGA to file; dirname in double quotes
DMPCOWSGA Dump & map SGA as COW; dirname in double quotes
MAPCOWSGA Map SGA as COW; dirname in double quotes
HANGANALYZE [level] [syslevel] Analyze system hang
FFBEGIN Flash Freeze the Instance
FFDEREGISTER FF deregister instance from cluster
FFTERMINST Call exit and terminate instance
FFRESUMEINST Resume the flash frozen instance
FFSTATUS Flash freeze status of instance
SKDSTTPCS Helps translate PCs to names
WATCH <self|exist|all|target> Watch a region of memory
DELETE <local|global|target>watchpoint Delete a watchpoint
SHOW <local|global|target>watchpoints Show watchpoints
DIRECT_ACCESS Fixed table access
CORE Dump core without crashing process
IPC Dump ipc information
UNLIMIT Unlimit the size of the trace file
PROCSTAT Dump process statistics
CALL [-t count] [arg1]...[argn] Invoke function with arguments
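Those are the oradebug commands. They support many different kinds of tracing and are quite powerful. Let's start by using oradebug to run a 10046 trace at the Oracle process level: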
SQL> select distinct sid from v$mystat;
SID
----------
96
SQL> select spid,pid from v$Process where addr=(select paddr from v$session where sid=96);
SPID PID
------------------------ ----------
2556166 19
SQL> !ps -ef | grep LOCAL
oracle 3670242 10485930 0 11:25:50 - 0:00 oraclexupeng11g (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 2556166 2031934 0 11:13:54 - 0:00 oraclexupeng11g (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 10617238 2031934 0 11:34:30 pts/0 0:00 grep LOCAL
SQL> oradebug setorapid 19
Oracle pid: 19, Unix process pid: 2556166, image: oracle@cecgt (TNS V1-V3)
SQL> oradebug event 10046 trace name context forever,level 28;
Statement processed.
SQL> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/xupeng11g/xupeng11g/trace/xupeng11g_ora_2556166.trc
One complete section of the trace file is enough to show what a 10046 event trace started through oradebug contains:
SQL> !more /u01/app/oracle/diag/rdbms/xupeng11g/xupeng11g/trace/xupeng11g_ora_2556166.trc
*** 2014-02-13 12:22:03.400
WAIT #0: nam='SQL*Net message from client' ela= 79182308 driver id=1650815232 #bytes=1 p3=0 obj#=528 tim=11404921879
=====================
PARSING IN CURSOR #24 len=202 dep=1 uid=0 oct=3 lid=0 tim=11404923285 hv=3819099649 ad='70000017ede6ec8' sqlid='3nkd3g3ju5ph1'
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and na
mespace=:3 and remoteowner is null and linkname is null and subname is null
END OF STMT
PARSE #24:c=0,e=747,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=11404923283
BINDS #24:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=111c1c958 bln=22 avl=01 flg=05
value=0
Bind#1
oacdty=01 mxl=32(01) mxlc=00 mal=00 scl=00 pre=00
oacflg=18 fl2=0001 frm=01 csi=873 siz=32 off=0
kxsbbbfp=111c1c920 bln=32 avl=01 flg=05
value="T"
Bind#2
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=111c1c8f0 bln=24 avl=02 flg=05
value=1
EXEC #24:c=10000,e=20358,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=853875749,tim=11404943911
FETCH #24:c=0,e=52,p=0,cr=4,cu=0,mis=0,r=1,dep=1,og=4,plh=853875749,tim=11404944012
STAT #24 id=1 cnt=1 pid=0 pos=1 obj=18 op='TABLE ACCESS BY INDEX ROWID OBJ$ (cr=4 pr=0 pw=0 time=0 us cost=3 size=77 card=1)'
STAT #24 id=2 cnt=1 pid=1 pos=1 obj=37 op='INDEX RANGE SCAN I_OBJ2 (cr=3 pr=0 pw=0 time=0 us cost=2 size=0 card=1)'
CLOSE #24:c=0,e=86,dep=1,type=3,tim=11404944134
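To see where time went in an excerpt like the one above, the database-call lines (PARSE, EXEC, FETCH, CLOSE) can be summed per cursor. The following is only a rough sketch in Python, not part of oradebug or any Oracle tool: the function name summarize_calls is made up for illustration, and it assumes the e= field is the elapsed time of the call in microseconds, which is my understanding for this release.

```python
import re
from collections import defaultdict

# Matches database-call lines such as:
#   EXEC #24:c=10000,e=20358,p=0,...
CALL_RE = re.compile(r"^(PARSE|EXEC|FETCH|CLOSE) #(\d+):(.*)$")


def summarize_calls(lines):
    """Sum the e= field per (cursor number, call type).

    Hypothetical helper; assumes e= is elapsed time in microseconds.
    """
    totals = defaultdict(int)
    for line in lines:
        m = CALL_RE.match(line.strip())
        if not m:
            continue  # WAIT, BINDS, STAT and SQL text lines are skipped
        call, cursor, rest = m.group(1), int(m.group(2)), m.group(3)
        # Split "c=0,e=747,..." into a {"c": "0", "e": "747", ...} dict
        fields = dict(kv.split("=", 1) for kv in rest.split(",") if "=" in kv)
        totals[(cursor, call)] += int(fields.get("e", 0))
    return dict(totals)


# Call lines copied from the trace excerpt in this post
sample = [
    "PARSE #24:c=0,e=747,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=11404923283",
    "EXEC #24:c=10000,e=20358,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=853875749,tim=11404943911",
    "FETCH #24:c=0,e=52,p=0,cr=4,cu=0,mis=0,r=1,dep=1,og=4,plh=853875749,tim=11404944012",
    "CLOSE #24:c=0,e=86,dep=1,type=3,tim=11404944134",
]

result = summarize_calls(sample)
for (cursor, call), elapsed in sorted(result.items()):
    print(f"cursor #{cursor} {call}: {elapsed}")
```

For this recursive query against obj$, nearly all of the recorded elapsed time sits in the EXEC call. For real analysis of a full trace file, tkprof is the usual tool; this sketch only shows how the raw lines are laid out.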
Next, we use dump to capture system state information.
SQL> select spid,pid from v$Process where addr=(select paddr from v$session where sid=160);
SPID PID
------------------------ ----------
2032030 21
Set the process to debug by its OS pid (alternatively, use oradebug setorapid with the Oracle pid, 21 here):
SQL> oradebug setospid 2032030
Oracle pid: 21, Unix process pid: 2032030, image: oracle@cecgt (TNS V1-V3)
Once the target process is set, we can dump information about it. There are many dumps available; let's list them:
SQL> oradebug dumplist
TRACE_BUFFER_ON
TRACE_BUFFER_OFF
LATCHES
PROCESSSTATE
SYSTEMSTATE
INSTANTIATIONSTATE
REFRESH_OS_STATS
CROSSIC
CONTEXTAREA
HANGDIAG_HEADER
HEAPDUMP
HEAPDUMP_ADDR
POKE_ADDRESS
POKE_LENGTH
POKE_VALUE
POKE_VALUE0
GLOBAL_AREA
REALFREEDUMP
FLUSH_JAVA_POOL
POOL_SIMULATOR
PGA_DETAIL_GET
PGA_DETAIL_DUMP
PGA_DETAIL_CANCEL
PGA_SUMMARY
MODIFIED_PARAMETERS
EVENT_TSM_TEST
ERRORSTACK
CALLSTACK
TEST_STACK_DUMP
TEST_GET_CALLER
RECORD_CALLSTACK
EXCEPTION_DUMP
BG_MESSAGES
ENQUEUES
KSTDUMPCURPROC
KSTDUMPALLPROCS
KSTDUMPALLPROCS_CLUSTER
SIMULATE_EOV
KSFQP_LIMIT
KSKDUMPTRACE
DBSCHEDULER
LDAP_USER_DUMP
LDAP_KERNEL_DUMP
DUMP_ALL_OBJSTATS
DUMPGLOBALDATA
HANGANALYZE
HANGANALYZE_PROC
HANGANALYZE_GLOBAL
GES_STATE
OCR
CSS
CRS
SYSTEMSTATE_GLOBAL
GIPC
MMAN_ALLOC_MEMORY
MMAN_CREATE_DEF_REQUEST
MMAN_CREATE_IMM_REQUEST
MMAN_IMM_REQUEST
DUMP_ALL_COMP_GRANULE_ADDRS
DUMP_ALL_COMP_GRANULES
DUMP_ALL_REQS
DUMP_TRANSFER_OPS
DUMP_ADV_SNAPSHOTS
ADJUST_SCN
NEXT_SCN_WRAP
CONTROLF
FLUSH_CACHE
FULL_DUMPS
BUFFERS
RECOVERY
SET_TSN_P1
BUFFER
PIN_BLOCKS
BC_SANITY_CHECK
PIN_RANDOM_BLOCKS
SET_NBLOCKS
CHECK_ROREUSE_SANITY
DUMP_PINNED_BUFFER_HISTORY
KCBO_OBJ_CHECK_DUMP
KCB_WORKING_SET_DUMP
REDOLOGS
ARCHIVE_ERROR
LOGHIST
REDOHDR
LOGERROR
OPEN_FILES
DATA_ERR_ON
DATA_READ_ERR_ON
DATA_ERR_OFF
BLK0_FMTCHG
UPDATE_BLOCK0_FORMAT
TR_SET_BLOCK
TR_SET_ALL_BLOCKS
TR_SET_SIDE
TR_CRASH_AFTER_WRITE
TR_READ_ONE_SIDE
TR_CORRUPT_ONE_SIDE
TR_RESET_NORMAL
TEST_DB_ROBUSTNESS
LOCKS
GC_ELEMENTS
FILE_HDRS
KRB_CORRUPT_INTERVAL
KRB_CORRUPT_SIZE
KRB_CORRUPT_REPEAT
KRB_CORRUPT_OFFSET
KRB_PIECE_FAIL
KRB_OPTIONS
KRB_FAIL_INPUT_FILENO
KRB_SIMULATE_NODE_AFFINITY
KRB_TRACE
KRB_BSET_DAYS
KRB_SET_TIME_SWITCH
KRB_OVERWRITE_ACTION
KRB_CORRUPT_SPHEADER_INTERVAL
KRB_CORRUPT_SPHEADER_REPEAT
KRB_CORRUPT_SPBITMAP_INTERVAL
KRB_CORRUPT_SPBITMAP_REPEAT
KRB_CORRUPT_SPBAD_INTERVAL
KRB_CORRUPT_SPBAD_REPEAT
KRB_UNUSED_OPTION
KRBMRSR_LIMIT
KRBMROR_LIMIT
KRBABR_TRACE
KRDRSBF
KRC_TRACE
KRA_OPTIONS
KRA_TRACE
FBTAIL
FBINC
FBHDR
FLASHBACK_GEN
KTPR_DEBUG
DUMP_TEMP
DROP_SEGMENTS
TEST_SPACEBG
TREEDUMP
LONGF_CREATE
KDLIDMP
ROW_CACHE
LIBRARY_CACHE
LIBRARY_CACHE_OBJECT
CURSORDUMP
CURSORTRACE
CURSOR_STATS
XS_SESSION_STATE
SHARED_SERVER_STATE
LISTENER_REGISTRATION
JAVAINFO
KXFPCLEARSTATS
KXFPDUMPTRACE
KXFPBLATCHTEST
KXFXSLAVESTATE
KXFXCURSORSTATE
KXFRHASHMAP
WORKAREATAB_DUMP
KUPPLATCHTEST
OBJECT_CACHE
SAVEPOINTS
RULESETDUMP
RULESETDUMP_ADDR
FAILOVER
OLAP_DUMP
SELFTESTASM
ASMDISK_ERR_ON
ASMDISK_READ_ERR_ON
ASMDISK_ERR_OFF
IOERREMUL
IOERREMULRNG
ALRT_TEST
AWR_TEST
AWR_FLUSH_TABLE_ON
AWR_FLUSH_TABLE_OFF
ASHDUMP
MMON_TEST
ATSK_TEST
HM_FW_TRACE
HM_FDG_VERS
IR_FW_TRACE
KSDTRADV_TES
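Most of these dumps take a trace level; values such as 2, 4, 6, 8, 10, and 12 are commonly seen. Choose the level according to the situation at hand, because different levels produce different amounts of output and have a different impact on the system.
To get comprehensive information, we use level 10: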
SQL> oradebug unlimit
Statement processed.
SQL> oradebug dump systemstate 10
Statement processed.
SQL> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/xupeng11g/xupeng11g/trace/xupeng11g_ora_2032030.trc
*** 2014-02-13 14:42:44.140
Processing Oradebug command 'dump systemstate 10'
===================================================
SYSTEM STATE (level=10)
------------
System global information:
processes: base 0x70000017f2dc6d0, size 150, cleanup 0x70000017f2e7408
allocation: free sessions 0x70000017f343dc0, free calls 0x0
control alloc errors: 0 (process), 0 (session), 0 (call)
PMON latch cleanup depth: 0
seconds since PMON's last scan for dead processes: 52
system statistics:
0 OS CPU Qt wait time
762 logons cumulative
24 logons current
48731 opened cursors cumulative
29 opened cursors current
1201 user commits
0 user rollbacks
7230 user calls
496615 recursive calls
2438 recursive cpu usage
4 pinned cursors current
325991 session logical reads
0 session stored procedure space
1929 CPU used when call started
3746 CPU used by this session
1307212 DB time
0 cluster wait time
72445489 concurrency wait time
165 application wait time
1310983 user I/O wait time
0 scheduler wait time
87778877 non-idle wait time
35641 non-idle wait count
0 session connect time
0 process last non-idle time
1013656413808 session uga memory
553837976 session uga memory max
4591 messages sent
4591 messages received
52440 background timeouts
0 remote Oradebug requests
536394816 session pga memory
560708672 session pga memory max
0 recursive system API invocations
2 enqueue timeouts
12 enqueue waits
0 enqueue deadlocks
112575 enqueue requests
2567 enqueue conversions
112558 enqueue releases
0 global enqueue gets sync
0 global enqueue gets async
0 global enqueue get time
0 global enqueue releases
21247 physical read total IO requests
64 physical read total multi block requests
0 physical read requests optimized
339247616 physical read total bytes
13191 physical write total IO requests
22 physical write total multi block requests
178188288 physical write total bytes
517435904 cell physical IO interconnect bytes
0 spare statistic 1
0 spare statistic 2
0 spare statistic 3
0 spare statistic 4
0 IPC CPU used by this session
0 gcs messages sent
0 ges messages sent
0 global enqueue CPU used by this session
SO: 0x70000017f0cd860, type: 1, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x0, name=cleanup state object, file=kss.h LINE:2051 ID:, pg=0
(cleanup state object) description: ASM file cleanup
latch: 0x700000000045c08
BEGIN DISPATCHER DUMPS
DISPATCHER 0x700000179e4e3e0 (0, 1)
Holder:
----------------------------------------
SO: 0x700000179e4e468, type: 83, owner: 0x70000000000b360, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x70000017c2b4818, name=circuit holder, file=kmc.h LINE:2612 ID:, pg=0
(circuit holder) disp = 0x700000179e4e3e0 (0, 1), proc = (0x70000017c2b4818, 1)
END DISPATCHER DUMPS
END OF SYSTEM STATE
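The "system statistics:" block in a systemstate dump like the one above is just a list of "value name" pairs, so it is easy to pull into a dictionary for a quick look. Here is a minimal sketch in Python, assuming the layout shown in this post (a "system statistics:" header followed by one counter per line); parse_system_statistics is a made-up name, not an Oracle utility.

```python
import re

# A statistics line is an integer, whitespace, then the statistic name,
# e.g. "24 logons current"
STAT_RE = re.compile(r"^\s*(\d+)\s+(\S.*)$")


def parse_system_statistics(lines):
    """Collect the counters that follow the 'system statistics:' header.

    Hypothetical helper; stops at the first line that is not 'value name'.
    """
    stats = {}
    in_section = False
    for line in lines:
        if line.strip() == "system statistics:":
            in_section = True
            continue
        if in_section:
            m = STAT_RE.match(line)
            if not m:
                break  # e.g. the first "SO: 0x..." state object line
            stats[m.group(2).strip()] = int(m.group(1))
    return stats


# A few lines copied from the dump in this post
sample = [
    "seconds since PMON's last scan for dead processes: 52",
    "system statistics:",
    "0 OS CPU Qt wait time",
    "762 logons cumulative",
    "24 logons current",
    "112575 enqueue requests",
    "112558 enqueue releases",
    "SO: 0x70000017f0cd860, type: 1, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3",
]

stats = parse_system_statistics(sample)
print(stats["logons current"])
print(stats["enqueue requests"] - stats["enqueue releases"])
```

Comparing pairs such as enqueue requests and enqueue releases is one quick sanity check you can run on the parsed values before reading the state objects that follow.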
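Next, we use oradebug to analyze why a system is hung. Whether this works depends on how badly it is hung: if the SYS user can still log in to the database, oradebug is an option; if the instance is completely frozen, another approach is needed.
I'll list only the commands here: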
oradebug setmypid
oradebug unlimit
oradebug setinst all -- RAC environments
oradebug hanganalyze 3 -- level 3 is usually sufficient
oradebug -g def dump systemstate 10 -- RAC environments
oradebug tracefile_name
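If you keep these commands in a script, the RAC-only lines are the part that changes between environments. The sketch below, in Python, assembles the command list for either case. It is just one reading of the "RAC environments" comments above: the function name and its parameters are hypothetical, and the single-instance variant reuses the plain oradebug dump systemstate 10 form shown earlier in this post.

```python
def hang_analysis_commands(rac=False, hang_level=3, systemstate_level=10):
    """Return the oradebug commands for a hang analysis, in order.

    Hypothetical helper: only strings that appear in this post are emitted;
    nothing is sent to a database.
    """
    cmds = ["oradebug setmypid", "oradebug unlimit"]
    if rac:
        # Address all instances in the cluster
        cmds.append("oradebug setinst all")
    cmds.append(f"oradebug hanganalyze {hang_level}")
    if rac:
        cmds.append(f"oradebug -g def dump systemstate {systemstate_level}")
    else:
        cmds.append(f"oradebug dump systemstate {systemstate_level}")
    cmds.append("oradebug tracefile_name")
    return cmds


single = hang_analysis_commands()
cluster = hang_analysis_commands(rac=True)
print("\n".join(cluster))
```

The output can be pasted into a SQL*Plus session connected as SYSDBA. Keeping the levels as parameters makes it easy to start low and raise them only if the first dump does not show enough.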
Get the state of a particular process:
oradebug setospid 22180
oradebug dump processstate 10
oradebug tracefile_name
Dump the error stack of a process:
oradebug setospid 22180
oradebug dump errorstack 3
Trace the cause of an error, such as ORA-04031:
oradebug event 4031 trace name errorstack level 3
oradebug can do much more; the above covers the commonly used features. More tuning methodology and tool recommendations will follow in later posts!