[20190418]exclusive latch spin count.txt
--//昨天测试"process allocation" latch,主要这个latch与其它拴锁spin方式有点不同,但是缺省都是spin 20000.如何验证一直是困扰我的问题.
--//而且现在的模式是spin 一定数量后,调用semop睡眠,等待唤醒.在这步消耗cpu资源很少.而不是像以前反复spin,指数回退.
--//链接:http://andreynikolaev.wordpress.com/2011/01/06/spin-tales-part-1-exclusive-latches-in-oracle-9-2-11g/
--//对方的服务器Solaris,不管是Solaris X86还是 Solaris sparc CPU,都可以使用dtrace工具探究,我没有也不熟悉这个工具.他监测调用的函数如下:
LATCH_FUNC ADDR LNAME
---------- -------- ------------------------
kslgetl 500063D0 first_spare_latch
...
kslgetl - KSL GET exclusive Latch
sskgslgf - immediate latch get
kslges - wait latch get
skgslsgts
sskgslspin - spin for the latch
sskgslspin
sskgslspin
sskgslspin
sskgslspin
--//latch spin 使用sskgslspin函数调用,可是linux下使用intel cpu并没有对应的oracle内部函数.
(gdb) b sskgslspin
Function "sskgslspin" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 3 (sskgslspin) pending.
--//链接:https://fritshoogland.wordpress.com/2015/07/17/oracle-12-and-latches/
Some searching around revealed that a CPU register reveals this information. Add this to the above gdb script:
--//一些搜索显示CPU寄存器显示了这些信息。将其添加到上面的gdb脚本中:
break *0xc29b51
commands
silent
printf " kslges loop: %d
", $ecx
c
end
--//他没有讲如何获得这个地址,仅仅给出1个线索这些信息在CPU寄存器里面,如何探究呢?我通过我的测试说明.
--//首先说明一下我并不熟悉gdb调式工具.也不要在生产系统做这样的测试!!
1.环境:
SCOTT@book> @ ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
SYS@book> @ hide spin_count
NAME DESCRIPTION DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE
----------------- ---------------------------------- ------------- ------------- ------------
_mutex_spin_count Mutex spin count TRUE 255 255
_spin_count Amount to spin waiting for a latch TRUE 2000 2000
SYS@book> select CLASS_KSLLT,decode(CLASS_KSLLT,2,KSLLTNAM,3,KSLLTNAM) name,count(*) from x$kslltr group by CLASS_KSLLT,decode(CLASS_KSLLT,2,KSLLTNAM,3,KSLLTNAM);
CLASS_KSLLT NAME COUNT(*)
----------- ------------------ --------
0 581
2 process allocation 1
--//前面我已经说过,仅仅process allocation latch比较特殊,使用latch clase=2.我个人认为这样设计避免登录出现阻塞.响应更快一些.
--//其它拴锁都是latch class="0".
SYS@book> select * from x$ksllclass ;
ADDR INDX INST_ID SPIN YIELD WAITTIME SLEEP0 SLEEP1 SLEEP2 SLEEP3 SLEEP4 SLEEP5 SLEEP6 SLEEP7
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
00000000861986C0 0 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861986EC 1 1 20000 0 1 1000 1000 1000 1000 1000 1000 1000 1000
0000000086198718 2 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
0000000086198744 3 1 20000 0 1 1000 1000 1000 1000 1000 1000 1000 1000
0000000086198770 4 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
000000008619879C 5 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861987C8 6 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861987F4 7 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
8 rows selected.
--//不管什么latch class,缺省spin count=20000,注意不是2000.
$ cat exclusive_latch.txt
/* 参数如下: @ exclusive_latch.txt latch_name willing why where sleep_num */
--//connect / as sysdba
col laddr new_value laddr
SELECT addr laddr FROM v$latch_parent WHERE NAME='&&1';
oradebug setmypid
oradebug call kslgetl 0x&laddr &&2 &&3 &&4
host sleep &&5
oradebug call kslfre 0x&laddr
--//exit
--//注:我前几天的测试脚本有connect / as sysdba,exit这两行,我为了调式方便,先注解这2行,避免反复退出进入会话.
--//session 1:
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 1 2 100000
LADDR
----------------
00000000600098D8
Statement processed.
Function returned 1
--//后面的参数是sleep的秒数,数值大一些,避免跟踪时退出.想继续按ctrl+c就可以中断sleep.
--//session 2:
SYS@book> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
44 45 37744 DEDICATED 37745 27 21 alter system kill session '44,45' immediate;
--//记下SPID=37745.在打开一个终端窗口执行如下:
--//暂且称为window 3:
$ gdb -p 37745
--//session 2:
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 3 4 1
--//挂起!!
--//windows 3:
(gdb) c
Continuing.
--//session 2:
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 3 4 1
LADDR
----------------
00000000600098D8
Statement processed.
--//停在oradebug call kslgetl 调用,因为session 1目前持有该拴锁.调用前几天测试使用latch_free.sql脚本:
SYS@book> @ latch_free
2019-04-18 11:41:09
Process 26
holding: 00000000600098D8 "test excl. parent l0" lvl=0 whr=2 why=1, SID=30
Process 27, waiting for: 00000000600098D8 whr=4 why=3
--//回到window 3,按ctrl+c中断:
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00000037990d6407 in semop () from /lib64/libc.so.6
(gdb)
(gdb) bt 6
#0 0x00000037990d6407 in semop () from /lib64/libc.so.6
#1 0x0000000009809c0f in sskgpwwait ()
#2 0x00000000098089ce in skgpwwait ()
#3 0x00000000093f9fe1 in kslges ()
#4 0x00000000093f997a in kslgetl ()
#5 0x0000000007d7402e in skdxcall ()
(More stack frames follow...)
--//可以确定函数调用的堆栈或者称为顺序,当前停在semop睡眠上,可以发现调用kslgetl后,紧接着的是kslges.这样猜测spin计数在调用kslges函数里面.
2.重复前面测试,在gdb下设置断点:
--//在session 1按ctrl+c,退出window 3的gdb程序,重新执行gdb.
--//window 3:
$ rlwrap gdb -p 37745
(gdb)
(gdb) break kslges
Breakpoint 1 at 0x93f9b74
--//设置断点在kslges函数调用上.然后在session 1,2分别执行(后面不再说明):
--//注:我前面加入rlwrap,主要记忆一些命令~/.gdb_history,避免反复打入(主要原因有时候要退出gdb界面),实际上gdb是支持方向键的.
--//session 1:
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 1 2 100000
--//session 2:
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 3 4 1
--//再次挂起!在window 3,执行如下:
(gdb) c
Continuing.
Breakpoint 1, 0x00000000093f9b74 in kslges ()
(gdb)
(gdb) info register
rax 0x0 0
rbx 0x0 0
rcx 0x3 3
rdx 0x0 0
rsi 0x0 0
rdi 0x600098d8 1610651864
rbp 0x7fff93de4f40 0x7fff93de4f40
rsp 0x7fff93de4f40 0x7fff93de4f40
r8 0x4 4
r9 0x0 0
r10 0x0 0
r11 0xa 10
r12 0x600098d8 1610651864
r13 0x1 1
r14 0x3 3
r15 0x4 4
rip 0x93f9b74 0x93f9b74 <kslges+4>
--//这里应该指向下一条执行的地址.
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
fctrl 0x27f 639
fstat 0x0 0
ftag 0xffff 65535
fiseg 0x0 0
fioff 0x9847daa 159677866
foseg 0x7fff 32767
fooff 0x93de38d0 -1814153008
fop 0x0 0
mxcsr 0x1fa0 [ PE IM DM ZM OM UM PM ]
--//到底那个寄存器是spin计数呢?到目前根本看不出来.
(gdb) info register ecx
Invalid register `ecx'
--//昏,根本没有ecx这个寄存器,难道对方服务器不是intel系列的吗?
(gdb) set pagination off
(gdb) help alias
Aliases of other commands.
List of commands:
ni -- Step one instruction
--//ni 表示 Step one instruction
rc -- Continue program being debugged but run it in reverse
rni -- Step backward one instruction
rsi -- Step backward exactly one instruction
si -- Step one instruction exactly
stepping -- Specify single-stepping behavior at a tracepoint
tp -- Set a tracepoint at specified line or function
tty -- Set terminal for future runs of program being debugged
where -- Print backtrace of all stack frames
ws -- Specify single-stepping behavior at a tracepoint
Type "help" followed by command name for full documentation.
Type "apropos word" to search for commands related to "word".
Command name abbreviations are allowed if unambiguous.
--//如果ni,info register交替执行明显太慢.spin至少20000次呢,
--//如果我执行ni 1000应该不会错过什么,可以这时看寄存器应该猜测spin count在那个寄存器中.
(gdb) ni 1000
0x00000000093f9dfb in kslges ()
(gdb) info register
rax 0x4dc0 19904
rbx 0x0 0
rcx 0x4dbe 19902
rdx 0x100 256
rsi 0x0 0
rdi 0x1a 26
rbp 0x7fff93de4f40 0x7fff93de4f40
rsp 0x7fff93de4c00 0x7fff93de4c00
r8 0x861ca808 2250024968
r9 0x19c 412
r10 0x0 0
r11 0x1b 27
r12 0x8620f490 2250306704
r13 0x600098d8 1610651864
r14 0x4e20 20000
r15 0x1b 27
rip 0x93f9dfb 0x93f9dfb <kslges+651>
eflags 0x217 [ CF PF AF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
fctrl 0x27f 639
fstat 0x0 0
ftag 0xffff 65535
fiseg 0x0 0
fioff 0x9847daa 159677866
foseg 0x7fff 32767
fooff 0x93de38d0 -1814153008
fop 0x0 0
mxcsr 0x1fa0 [ PE IM DM ZM OM UM PM ]
--//^_^,明显spin count保存在rax,rcx寄存器中.现在必须通过rip地址确定循环的开头(实际上直接拿这个地址测试也可以的,毕竟每次spin 循
--//环都会停在这里)
(gdb) ni
0x00000000093f9dfe in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9dfe 0x93f9dfe <kslges+654>
(gdb) ni
0x00000000093f9ddc in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9ddc 0x93f9ddc <kslges+620>
--//运气太好了!!注意看rax的变化从19904=>19903.以及rip变小了,可以确定循环开头在0x93f9ddc地址.
--//下班,下午继续.
--//看看1个spin循环需要多少指令.
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9dfe 0x93f9dfe <kslges+654>
(gdb) ni
0x00000000093f9ddc in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9ddc 0x93f9ddc <kslges+620>
(gdb) ni
0x00000000093f9dde in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9dde 0x93f9dde <kslges+622>
(gdb) ni
0x00000000093f9de4 in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9de4 0x93f9de4 <kslges+628>
(gdb) ni
0x00000000093f9deb in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9deb 0x93f9deb <kslges+635>
(gdb) ni
0x00000000093f9def in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9def 0x93f9def <kslges+639>
(gdb) ni
0x00000000093f9df2 in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9df2 0x93f9df2 <kslges+642>
(gdb) ni
0x00000000093f9df8 in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbe 19902
rip 0x93f9df8 0x93f9df8 <kslges+648>
(gdb) ni
0x00000000093f9dfb in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbf 19903
rcx 0x4dbd 19901
rip 0x93f9dfb 0x93f9dfb <kslges+651>
(gdb) ni
0x00000000093f9dfe in kslges ()
(gdb) info register rax rcx rip
rax 0x4dbe 19902
rcx 0x4dbd 19901
rip 0x93f9dfe 0x93f9dfe <kslges+654>
--//一共需要9条指令.有了这些信息,可以写出gbd脚本.
(gdb) disassemble kslges
--//反汇编看看.
...
0x00000000093f9ddc <kslges+620>: xor %esi,%esi
0x00000000093f9dde <kslges+622>: mov %esi,-0xd8(%rbp)
0x00000000093f9de4 <kslges+628>: mov %sil,-0xa6(%rbp)
0x00000000093f9deb <kslges+635>: mov 0x0(%r13),%rdi
0x00000000093f9def <kslges+639>: test %rdi,%rdi
0x00000000093f9df2 <kslges+642>: je 0x93fa6c0 <kslges+2896>
0x00000000093f9df8 <kslges+648>: add $0xffffffffffffffff,%ecx
0x00000000093f9dfb <kslges+651>: add $0xffffffffffffffff,%eax
0x00000000093f9dfe <kslges+654>: jne 0x93f9ddc <kslges+620>
...
--//^_^.看不懂,估计这个$0xffffffffffffffff表示-1,明白了我前面ecx对应这里,如何显示呢?不懂.
--//我写入 printf " spin count loop: %d
", $ecx报错!!
Program received signal SIGSEGV, Segmentation fault.
0x0000000009805bd5 in slaac_int ()
--//实例崩溃了.
$ sqlplus -prelim /nolog
SQL*Plus: Release 11.2.0.4.0 Production on Thu Apr 18 16:23:45 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
@> connect sys as sysdba
Enter password:
Prelim connection established
SYS@book> shutdown immediate ;
ORA-01012: not logged on
SYS@book> shutdown abort;
ORACLE instance shut down.
--//注:作者有说明硬件=SGI system.
3.重复前面测试,先写出gdb脚本:
$ cat spin.gdb
break kslgetl
commands
silent
printf "kslgetl %x, %d, %d, %d
", $rdi, $rsi, $rdx, $rcx
c
end
break kslges
commands
silent
printf "kslges %x, %d, %d, %d
", $rdi, $rsi, $rdx, $rcx
c
end
break skgpwwait
commands
silent
printf "skgpwwait %d, %d, %d, %d
", $rdi, $rsi, $rdx, $rcx
c
end
break sskgpwwait
commands
silent
printf "sskgpwwait %d, %d, %d, %d
", $rdi, $rsi, $rdx, $rcx
c
end
break semop
commands
silent
printf "semop %d, %d, %d, %d
", $rdi, $rsi, $rdx, $rcx
c
end
break *0x93f9ddc
commands
silent
printf " spin count loop: %d %d %x
", $rax,$rcx,$rip
c
end
#0 0x00000037990d6407 in semop () from /lib64/libc.so.6
#1 0x0000000009809c0f in sskgpwwait ()
#2 0x00000000098089ce in skgpwwait ()
#3 0x00000000093f9fe1 in kslges ()
#4 0x00000000093f997a in kslgetl ()
--//重复测试:
--//window 3:
$ gdb -p 37745 -x spin.gdb
--//session 1:
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 1 2 100000
--//session 2:
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 3 4 1
--//window 3:
kslgetl 600098d8, 1, 3, 4
kslges 600098d8, 0, 0, 3
spin count loop: 20000 19999 93f9ddc
spin count loop: 19999 19998 93f9ddc
spin count loop: 19998 19997 93f9ddc
spin count loop: 19997 19996 93f9ddc
spin count loop: 19996 19995 93f9ddc
spin count loop: 19995 19994 93f9ddc
spin count loop: 19994 19993 93f9ddc
spin count loop: 19993 19992 93f9ddc
spin count loop: 19992 19991 93f9ddc
spin count loop: 19991 19990 93f9ddc
spin count loop: 19990 19989 93f9ddc
spin count loop: 19989 19988 93f9ddc
.....
--//不断按return继续...
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait -1814147480, 202182304, -2044659696, 0
sskgpwwait -1814147480, 202182304, -2044659696, 0
semop 314408960, -1814148224, 1, -1
spin count loop: 20000 19999 93f9ddc
spin count loop: 19999 19998 93f9ddc
.....
--//不断按return继续...
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait -1814147480, 202182304, -2044659696, 0
sskgpwwait -1814147480, 202182304, -2044659696, 0
semop 314408960, -1814148224, 1, -1
--//session 1:
--//按ctrl+c中断.
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 1 2 100000
old 1: SELECT addr laddr FROM v$latch_parent WHERE NAME='&&1'
new 1: SELECT addr laddr FROM v$latch_parent WHERE NAME='test excl. parent l0'
LADDR
----------------
00000000600098D8
Statement processed.
Function returned 1
Function returned 0
--//session 2等1秒也执行完成.
SYS@book> @ exclusive_latch.txt "test excl. parent l0" 1 3 4 1
LADDR
----------------
00000000600098D8
Statement processed.
Function returned 1
Function returned 0
SYS@book>
--//window 3界面显示如下:
semop 314408960, -1814148224, 1, -1
spin count loop: 20000 19999 93f9ddc
--//执行完成.可以发现执行2次周期 ,每个周期20000次,第3次获得.为什么是2次呢?
--//我再次重复测试:
(gdb) info break 6
Num Type Disp Enb Address What
6 breakpoint keep y 0x00000000093f9ddc <kslges+620>
breakpoint already hit 20001 times
silent
printf " spin count loop: %d %d %x
", $rax,$rcx,$rip
c
--//就只有20001次.
4.换一种方式测试:
--//定制spin次数如下:
*._spin_count=20
SYS@book> startup pfile=/tmp/@.ora
ORACLE instance started.
Total System Global Area 643084288 bytes
Fixed Size 2255872 bytes
Variable Size 205521920 bytes
Database Buffers 427819008 bytes
Redo Buffers 7487488 bytes
Database mounted.
Database opened.
SYS@book> select * from x$ksllclass ;
ADDR INDX INST_ID SPIN YIELD WAITTIME SLEEP0 SLEEP1 SLEEP2 SLEEP3 SLEEP4 SLEEP5 SLEEP6 SLEEP7
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
00000000861986C0 0 1 20 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861986EC 1 1 20 0 1 1000 1000 1000 1000 1000 1000 1000 1000
0000000086198718 2 1 20 0 1 8000 8000 8000 8000 8000 8000 8000 8000
0000000086198744 3 1 20 0 1 1000 1000 1000 1000 1000 1000 1000 1000
0000000086198770 4 1 20 0 1 8000 8000 8000 8000 8000 8000 8000 8000
000000008619879C 5 1 20 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861987C8 6 1 20 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861987F4 7 1 20 0 1 8000 8000 8000 8000 8000 8000 8000 8000
8 rows selected.
--//重复测试,细节不再列出,仅仅记录gdb输出.
(gdb) c
Continuing.
kslgetl 6010d860, 1, 2087607608, 3991
kslgetl 6010d860, 1, 2087558280, 3991
kslgetl 6010d860, 1, 0, 4039
kslgetl 6010d860, 1, 0, 3980
kslgetl 6010d860, 1, 0, 4039
kslgetl 6010d860, 1, 2087563160, 3991
kslgetl 6010d860, 1, 2087572104, 3991
kslgetl 600098d8, 1, 3, 4
kslges 600098d8, 0, 0, 3
spin count loop: 20 19 93f9ddc
spin count loop: 19 18 93f9ddc
spin count loop: 18 17 93f9ddc
spin count loop: 17 16 93f9ddc
spin count loop: 16 15 93f9ddc
spin count loop: 15 14 93f9ddc
spin count loop: 14 13 93f9ddc
spin count loop: 13 12 93f9ddc
spin count loop: 12 11 93f9ddc
spin count loop: 11 10 93f9ddc
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait -1031167592, 202182304, -2044672536, 0
sskgpwwait -1031167592, 202182304, -2044672536, 0
semop 314933248, -1031168336, 1, -1
spin count loop: 20 19 93f9ddc
spin count loop: 19 18 93f9ddc
spin count loop: 18 17 93f9ddc
spin count loop: 17 16 93f9ddc
spin count loop: 16 15 93f9ddc
spin count loop: 15 14 93f9ddc
spin count loop: 14 13 93f9ddc
spin count loop: 13 12 93f9ddc
spin count loop: 12 11 93f9ddc
spin count loop: 11 10 93f9ddc
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait -1031167592, 202182304, -2044672536, 0
sskgpwwait -1031167592, 202182304, -2044672536, 0
semop 314933248, -1031168336, 1, -1
spin count loop: 20 19 93f9ddc
--//第1次执行一共2次spin周期2次,每个周期20次.从另外一个方面验证spin 计数来之视图select * from x$ksllclass ;.
(gdb) info break 6
Num Type Disp Enb Address What
6 breakpoint keep y 0x00000000093f9ddc <kslges+620>
breakpoint already hit 41 times
silent
printf " spin count loop: %d %d %x
", $rax,$rcx,$rip
c
--//breakpoint already hit 41 times,如果接着重复测试:
(gdb) c
Continuing.
kslgetl 600098d8, 1, 3, 4
kslges 600098d8, 0, 0, 3
spin count loop: 20 19 93f9ddc
spin count loop: 19 18 93f9ddc
spin count loop: 18 17 93f9ddc
spin count loop: 17 16 93f9ddc
spin count loop: 16 15 93f9ddc
spin count loop: 15 14 93f9ddc
spin count loop: 14 13 93f9ddc
spin count loop: 13 12 93f9ddc
spin count loop: 12 11 93f9ddc
spin count loop: 11 10 93f9ddc
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait -1031167592, 202182304, -2044672536, 0
sskgpwwait -1031167592, 202182304, -2044672536, 0
semop 314933248, -1031168336, 1, -1
spin count loop: 20 19 93f9ddc
--//这次仅仅spin周期1次.
(gdb) info break 6
Num Type Disp Enb Address What
6 breakpoint keep y 0x00000000093f9ddc <kslges+620>
breakpoint already hit 62 times
silent
printf " spin count loop: %d %d %x
", $rax,$rcx,$rip
c
--//breakpoint already hit 62 times,也就是最后1次执行break 21次.
5.继续定制spin次数,采用不同类看看:
select addr,name,level#,latch#,gets,misses,sleeps,immediate_gets,immediate_misses,waiters_woken,waits_holding_latch,spin_gets,wait_time from v$latch_parent where lower(name) like '%'||lower('test excl. parent l0')||'%'
ADDR NAME LEVEL# LATCH# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES WAITERS_WOKEN WAITS_HOLDING_LATCH SPIN_GETS WAIT_TIME
---------------- -------------------- ------ ---------- ---------- ---------- ---------- -------------- ---------------- ------------- ------------------- ---------- ----------
00000000600098D8 test excl. parent l0 0 4 64 30 39 0 0 0 0 7 1.4769E+10
--//LATCH#=4
--//定制spin次数如下,修改参数文件加入:
#*._spin_count=20
*._latch_classes='4:3'
*._latch_class_3='10 0 1 10000 20000 30000 40000 50000 60000 70000 50000'
SYS@book> startup pfile=/tmp/@.ora
ORACLE instance started.
Total System Global Area 643084288 bytes
Fixed Size 2255872 bytes
Variable Size 205521920 bytes
Database Buffers 427819008 bytes
Redo Buffers 7487488 bytes
Database mounted.
Database opened.
SYS@book> select * from x$ksllclass ;
ADDR INDX INST_ID SPIN YIELD WAITTIME SLEEP0 SLEEP1 SLEEP2 SLEEP3 SLEEP4 SLEEP5 SLEEP6 SLEEP7
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
00000000861986C0 0 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861986EC 1 1 20000 0 1 1000 1000 1000 1000 1000 1000 1000 1000
0000000086198718 2 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
0000000086198744 3 1 10 0 1 10000 20000 30000 40000 50000 60000 70000 50000
0000000086198770 4 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
000000008619879C 5 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861987C8 6 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
00000000861987F4 7 1 20000 0 1 8000 8000 8000 8000 8000 8000 8000 8000
8 rows selected.
--//SPIN=10.
--//重复测试,细节不再列出,仅仅记录gdb输出.
kslgetl 6010d860, 1, 2087529088, 3991
kslgetl 80641188, 1, 0, 4174
kslgetl 6010d860, 1, 2087479760, 3991
kslgetl 6010d860, 1, 0, 4039
kslgetl 6010d860, 1, 0, 3980
kslgetl 6010d860, 1, 0, 4039
kslgetl 6010d860, 1, 2087484640, 3991
kslgetl 6010d860, 1, 2087493584, 3991
kslgetl 600098d8, 1, 3, 4
kslges 600098d8, 0, 0, 3
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 10000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 20000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 30000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 40000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 50000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 60000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 70000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 50000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
skgpwwait 1986434424, 202182304, 0, 50000
spin count loop: 10 9 93f9ddc
spin count loop: 9 8 93f9ddc
spin count loop: 8 7 93f9ddc
spin count loop: 7 6 93f9ddc
spin count loop: 6 5 93f9ddc
spin count loop: 5 4 93f9ddc
spin count loop: 4 3 93f9ddc
spin count loop: 3 2 93f9ddc
spin count loop: 2 1 93f9ddc
spin count loop: 1 0 93f9ddc
...
(gdb) info break 6
Num Type Disp Enb Address What
6 breakpoint keep y 0x00000000093f9ddc <kslges+620>
breakpoint already hit 199 times
silent
printf " spin count loop: %d %d %x
", $rax,$rcx,$rip
c
(gdb) bt
#0 0x00000037990ce183 in __select_nocancel () from /lib64/libc.so.6
#1 0x0000000002d9751c in skgpnap ()
#2 0x0000000009808a7a in skgpwwait ()
#3 0x00000000093fa63b in kslges ()
#4 0x00000000093f997a in kslgetl ()
#5 0x0000000007d7402e in skdxcall ()
#6 0x00000000076c96aa in ksdxcall ()
#7 0x00000000076cdbcb in ksdxen_int ()
#8 0x00000000076d11a0 in ksdxen ()
#9 0x00000000095bbdad in opiodr ()
#10 0x00000000097a629f in ttcpip ()
#11 0x000000000186470e in opitsk ()
#12 0x0000000001869235 in opiino ()
#13 0x00000000095bbdad in opiodr ()
#14 0x00000000018607ac in opidrv ()
#15 0x0000000001e3a48f in sou2o ()
#16 0x0000000000a29265 in opimai_real ()
#17 0x0000000001e407ad in ssthrdmain ()
#18 0x0000000000a291d1 in main ()
--//这种方式不断的spin循环,获取拴锁.
总结:
1.不要在生产系统做这样测试.
2.主要是自己不熟悉gdb调式工具
3.又写太长,我主要记录比较详细,避免以后看不懂^_^.
4.可以看出latch 获取的变化,缺省都是20000次,对于exclusive latch.
5.我个人不主张定制化解决这类拴锁的问题,应该从应用着手,比如优化sql语句,减少执行测试等等.
6.明天分析shared latch,按照andreynikolaev.wordpress.com介绍是2*_spin_count次数.明天验证看看.
7.实际上仅仅记住现在latch机制与原来的不同,不再使用原来的指数回退sleep机制.而是仅仅spin 20000次,然后执行semop,等待唤醒.
--//为什么前面测试出现2次spin周期,我还不是很清楚...
--//再补充一个例子说明:
$ cat exclusive_latch.txt
/* 参数如下: @ exclusive_latch.txt latch_name willing why where sleep_num */
--//connect / as sysdba
col laddr new_value laddr
SELECT addr laddr FROM v$latch_parent WHERE NAME='&&1';
oradebug setmypid
oradebug call kslgetl 0x&laddr &&2 &&3 &&4
host sleep &&5
oradebug call kslfre 0x&laddr
exit
$ cat p6.sh
#! /bin/bash
vdate=$(date '+%Y%m%d%H%M%S')
echo $vdate
source peek.sh "$1" 20 | timestamp.pl >| /tmp/peekx_${vdate}.txt &
sqlplus -s -l / as sysdba <<EOF >| /tmp/latch_free_${vdate}.txt &
$(seq 20 | xargs -I {} echo -e '@latch_free
host sleep 1')
EOF
sleep 1
# 参数如下: @ exclusive_latch.txt latch_name willing why where sleep_num
sqlplus / as sysdba @ exclusive_latch.txt "$1" 1 4 5 10 > /dev/null &
p=$!
strace -fttT -p $p -o /tmp/pp_${vdate}_${p}.txt > /dev/null &
sleep 2
sqlplus / as sysdba @ exclusive_latch.txt "$1" 1 6 7 5 > /dev/null &
p=$!
strace -fttT -p $p -o /tmp/pp_${vdate}_${p}.txt > /dev/null &
wait
$ . p6.sh "test excl. parent l0"
20190419090713
Process 30017 attached - interrupt to quit
Process 30020 attached
Process 30023 attached
Process 30017 suspended
Process 30026 attached - interrupt to quit
Process 30028 attached
Process 30017 resumed
Process 30023 detached
Process 30047 attached
Process 30026 suspended
Process 30017 detached
Process 30020 detached
Process 30026 resumed
Process 30047 detached
Process 30026 detached
Process 30028 detached
[1] Done source peek.sh "$1" 20 | timestamp.pl >|/tmp/peekx_${vdate}.txt
[3] Done sqlplus / as sysdba @ exclusive_latch.txt "$1" 1 4 5 10 > /dev/null
[4] Done strace -fttT -p $p -o /tmp/pp_${vdate}_${p}.txt > /dev/null
[5]- Done sqlplus / as sysdba @ exclusive_latch.txt "$1" 1 6 7 5 > /dev/null
[6]+ Done strace -fttT -p $p -o /tmp/pp_${vdate}_${p}.txt > /dev/null
[2]+ Done sqlplus -s -l / as sysdba >|/tmp/latch_free_${vdate}.txt <<EOF
$(seq 20 | xargs -I {} echo -e '@latch_free
host sleep 1')
EOF
$ grep sem /tmp/pp_20190419090713*.txt
/tmp/pp_20190419090713_30017.txt:30020 09:07:25.053803 semctl(315195392, 33, SETVAL, 0x1) = 0 <0.000025>
/tmp/pp_20190419090713_30026.txt:30028 09:07:17.040321 semop(315195392, 0x7ffff4363890, 1) = 0 <8.013580>
--//你可以发现进程1发出kslfre时调用semctl时间在09:07:25.053803,进程2从09:07:17.040321开始执行semop.加上8.013580秒被唤醒.
--//在09:07:28:053901结束semop
9.select函数 属于 strace -e desc 可以跟踪到,-e ipc可以单独跟踪semop,semctl,semtimedop系统调用.