zoukankan      html  css  js  c++  java
  • (转载)硬解析和软解析情况下latch持有情况分析

    http://www.dbsnake.net/hard-and-soft-parse-latch-analysis.html#comment-359

    硬解析和软解析情况下latch持有情况分析

    为了方便阐述,这篇文章里我用了如下一些术语:

    1、用“library cache lock”代表名为library cache lock的enqueue;

    2、用“library cache pin”代表名为library cache pin的enqueue;

    3、用“Child library cache”代表名为library cache的子latch;

    4、用“Child shared pool”代表名为shared pool的子latch;

    5、用“Child library cache lock”代表名为library cache lock的子latch;

    6、用“Child library cache pin”代表名为library cache pin的子latch;

    你可能需要先阅读如下的三篇文章才能看懂我在说什么:

    shared pool latch和library cache latch

    library cache lock和library cache pin到底是什么

    library cache lock和library cache pin到底是什么(续)

    另外需要注意的是:从11gR1开始,Oracle用mutex替换了各种跟library cache相关的latch,所以这篇文章的结论并不适用于11g。

    我们首先来看library cache的经典架构:

     library_cache_architecture

    其中我们可以看到在每一个library cache object handle上都存在4个链表,它们分别是lock owners,lock waiters,Pin owners,Pin waiters。所以实际上library cache object handle你可以把它看成是library cache locklibrary cache pin所保护的resource。而向lock ownerslock waiters里添加/修改一个library cache lock时是需要先持有Child library cache lock;向Pin ownersPin waiters里添加/修改一个library cache pin时是需要先持有Child library cache pin

    无论是硬解析还是软解析,在扫描/修改library cache object handle链表的时候是需要先持有Child library cache;另外,硬解析还涉及到要在shared pool里分配内存,所以硬解析还会持有Child shared pool

    好了,我们来分6步进一步细化和扩展上述结论,这6步考察latch持有情况的思路均为:同时开3个session,session 1中先持有所有的相关子latch;session 2在session 1持有完所有的相关子latch后去执行硬解析/软解析,这个时候session 2一定会hang住;同时我们在session 3中去做session 2的process dump,从session 3的dump文件中即可看到session 2中的latch持有与等待的情况。

    好了,我们开始。

    第一步:先来证明硬解析会在持有Child library cache且并不释放的情况下转而去持有Child shared pool

    先分别到session 1、session 2、session 3里做一下准备工作:

    Session 1:

    SQL> select name,value from all_parameters where name=’_kghdsidx_count’;

    NAME                                                                             VALUE

    ——————————————–     —————————————————————-

    _kghdsidx_count                  1

    SQL> select name,level#,child#,addr,gets from v$latch_children where name like ‘library cache%’ or name=’shared pool’;

    NAME                                   LEVEL#     CHILD# ADDR           GETS

    ————————————————– ———- ———- ——– ———-

    shared pool                                      7          1 04E5D308     783011

    shared pool                                      7          2 04E5D370         26

    shared pool                                      7          3 04E5D3D8         26

    shared pool                                      7          4 04E5D440         26

    shared pool                                      7          5 04E5D4A8         26

    shared pool                                      7          6 04E5D510         26

    shared pool                                      7          7 04E5D578         26

    library cache                                     5          1 3213DE80     406403

    library cache                                     5          2 3213DE18     322236

    library cache                                     5          3 3213DDB0      96778

    library cache lock                                 6          1 3213E0F0     107337

    library cache lock                                 6          2 3213E088      86643

    library cache lock                                 6          3 3213E020      28278

    library cache pin                                  6          1 3213DFB8     204881

    library cache pin                                  6          2 3213DF50     170659

    library cache pin                                  6          3 3213DEE8      51722

    library cache pin allocation                         3          1 3213E228        561

    library cache pin allocation                         3          2 3213E1C0         38

    library cache pin allocation                         3          3 3213E158        206

    library cache lock allocation                        3          1 3213E360        722

    library cache lock allocation                        3          2 3213E2F8         46

    library cache lock allocation                        3          3 3213E290        229

    library cache hash chains                          9          1 3213E498          0

    library cache hash chains                          9          2 3213E430          0

    library cache hash chains                          9          3 3213E3C8          0

    25 rows selected

    从上述结果可以看到,Child library cache的level为5,Child library cache lock和Child library cache pin的level为6,Child shared pool的level为7,所以已经具备了在不释放Child library cache的情况下进而去持有Child shared pool的条件。

    SQL> select ‘oradebug poke 0x’ || addr || ‘ 4 0×00000001′ from v$latch_children where name=’shared pool’ and child#=1;

    ‘ORADEBUGPOKE0X’||ADDR||’40X00

    ————————————-

    oradebug poke 0x04E5D308 4 0×00000001

    因为要测硬解析,这里我flush一下shared pool:

    SQL> alter system flush shared_pool;

    系统已更改。

    Session 2:

    SQL> select sid from v$mystat where rownum=1;

           SID

    ———-

           137

    SQL> select spid from v$process where addr in (select paddr from v$session where sid=138);

    SPID

    ————

    3952

    Session 3:

    SQL> oradebug setospid 3952

    Oracle pid: 24, Windows thread id: 3952, image: ORACLE.EXE (SHAD)

    准备工作已经做完,我们现在去session 1里手工持有上述Child shared pool:

    Session 1:

    SQL> oradebug setmypid

    已处理的语句

    SQL> oradebug poke 0x04E5D308 4 0×00000001

    BEFORE: [4E5D308, 4E5D30C) = 00000000

    AFTER:  [4E5D308, 4E5D30C) = 00000001

    然后到session 2中去随便执行一条sql,因为刚才在session 1里面flush了shared pool,所以这里一定是硬解析,并且这里一定会hang住:

    Session 2

    SQL> select * from scott.emp;

    ……这里hang住了

    接着我们到session 3中去对session 2做一下process dump:

    Session 3:

    SQL> oradebug dump processstate 10

    已处理的语句

    SQL> oradebug tracefile_name

    d:\oracle\admin\cuihua\udump\cuihua_ora_3952.trc

    d:\oracle\admin\cuihua\udump\cuihua_ora_3952.trc的内容:

    Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_3952.trc

    Sat May 05 01:13:57 2012

    ORACLE V10.2.0.1.0 - Production vsnsta=0

    vsnsql=14 vsnxtr=3

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production

    With the Partitioning, OLAP and Data Mining options

    Windows XP Version V5.1 Service Pack 2

    CPU                 : 2 - type 586

    Process Affinity    : 0x00000000

    Memory (Avail/Total): Ph:1005M/2030M, Ph+PgF:2333M/3921M, VA:1273M/2047M

    Instance name: cuihua

     

    Redo thread mounted by this instance: 1

     

    Oracle process number: 24

     

    Windows thread id: 3952, image: ORACLE.EXE (SHAD)

     

     

    *** 2012-05-05 01:13:57.468

    *** ACTION NAME:(Command Window - New) 2012-05-05 01:13:57.406

    *** MODULE NAME:(PL/SQL Developer) 2012-05-05 01:13:57.406

    *** SERVICE NAME:(cuihua) 2012-05-05 01:13:57.406

    *** SESSION ID:(137.75) 2012-05-05 01:13:57.406

    Received ORADEBUG command 'dump processstate 10' from process Windows thread id: 2924, image:

    ===================================================

    PROCESS STATE

    -------------

    Process global information:

         process: 33E4FB7C, call: 33F606A8, xact: 00000000, curses: 33F254F4, usrses: 33F254F4

      ----------------------------------------

      SO: 33E4FB7C, type: 2, owner: 00000000, flag: INIT/-/-/0x00

      (process) Oracle pid=24, calls cur/top: 33F606A8/33F606A8, flag: (0) -

                int error: 0, call error: 0, sess error: 0, txn error 0

      (post info) last post received: 0 0 0

                  last post received-location: No post

                  last process to post me: none

                  last post sent: 0 0 0

                  last post sent-location: No post

                  last process posted by me: none

        (latch info) wait_event=0 bits=20

            Location from where call was made: kghupr1: Chunk Header

            Context saved from call: 815704132

          waiting for 4e5d308 Child shared pool level=7 child#=1

            Location from where latch is held: kghupr1: Chunk Header

            Context saved from call: 737738784

            state=busy, wlstate=free

              waiters [orapid (seconds since: put on list, posted, alive check)]:

               10 (33, 1336151634, 33)

               20 (33, 1336151634, 33)

               24 (15, 1336151634, 15)

               waiter count=3

              gotten 857154 times wait, failed first 316 sleeps 24

              gotten 0 times nowait, failed: 0

          on wait list for 4e5d308

          holding    (efd=5) 3213de18 Child library cache level=5 child#=2

            Location from where latch is held: kglpndl: child: before processing: latch

            Context saved from call: 1

            state=busy, wlstate=free

        Process Group: DEFAULT, pseudo proc: 33E7F368

        O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 3952

        OSD pid info: Windows thread id: 3952, image: ORACLE.EXE (SHAD)

    ……省略显示其他内容

    注意上述trace文件中有如下内容:

    waiting for 4e5d308 Child shared pool level=7 child#=1

    holding    (efd=5) 3213de18 Child library cache level=5 child#=2

    所以我们已经证明了硬解析会在持有Child library cache且并不释放的情况下转而去持有Child shared pool。

    第二步:我们来证明硬解析会去持有Child library cache lock,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cacheChild library cache lock

    先在session 1里手工持有所有的Child library cache lock:

    Session 1:

    SQL> alter system flush shared_pool;

    系统已更改。

    SQL> select ‘oradebug poke 0x’ || addr || ‘ 4 0×00000001′ from v$latch_children where name=’library cache lock’;

    ‘ORADEBUGPOKE0X’||ADDR||’40X00

    ————————————-

    oradebug poke 0x3213E0F0 4 0×00000001

    oradebug poke 0x3213E088 4 0×00000001

    oradebug poke 0x3213E020 4 0×00000001

    SQL> oradebug poke 0x3213E0F0 4 0×00000001

    BEFORE: [3213E0F0, 3213E0F4) = 00000000

    AFTER:  [3213E0F0, 3213E0F4) = 00000001

    SQL> oradebug poke 0x3213E088 4 0x00000001

    BEFORE: [3213E088, 3213E08C) = 00000000

    AFTER:  [3213E088, 3213E08C) = 00000001

    SQL> oradebug poke 0x3213E020 4 0x00000001

    BEFORE: [3213E020, 3213E024) = 00000000

    AFTER:  [3213E020, 3213E024) = 00000001

    然后到session 2中去随便执行一条sql,因为刚才在session 1里面flush了shared pool,所以这里一定是硬解析,并且这里一定会hang住:

    Session 2:

    SQL> select * from scott.emp;

    ……这里hang住了

    接着我们到session 3中去对session 2做一下process dump:

    Session 3:

    SQL> oradebug dump processstate 10

    已处理的语句

    SQL> oradebug tracefile_name

    d:\oracle\admin\cuihua\udump\cuihua_ora_3140.trc

    d:\oracle\admin\cuihua\udump\cuihua_ora_3140.trc的内容:

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production

    With the Partitioning, OLAP and Data Mining options

    Windows XP Version V5.1 Service Pack 2

    CPU                 : 2 - type 586

    Process Affinity    : 0x00000000

    Memory (Avail/Total): Ph:1142M/2030M, Ph+PgF:2355M/3921M, VA:1345M/2047M

    Instance name: cuihua

    Redo thread mounted by this instance: 0 <none>

    Oracle process number: 0

    Windows thread id: 3140, image: ORACLE.EXE (SHAD)

    KCCDEBUG_LEVEL = 0

    Dynamic strand is set to TRUE

    Running with 2 shared and 18 private strand(s). Zero-copy redo is FALSE

    Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_3140.trc

    Sat May 05 01:26:29 2012

    ORACLE V10.2.0.1.0 - Production vsnsta=0

    vsnsql=14 vsnxtr=3

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production

    With the Partitioning, OLAP and Data Mining options

    Windows XP Version V5.1 Service Pack 2

    CPU                 : 2 - type 586

    Process Affinity    : 0x00000000

    Memory (Avail/Total): Ph:969M/2030M, Ph+PgF:2332M/3921M, VA:1267M/2047M

    Instance name: cuihua

    Redo thread mounted by this instance: 1

    Oracle process number: 31

    Windows thread id: 3140, image: ORACLE.EXE (SHAD)

    *** 2012-05-05 01:26:29.609

    *** ACTION NAME:(Command Window - New) 2012-05-05 01:26:29.562

    *** MODULE NAME:(PL/SQL Developer) 2012-05-05 01:26:29.562

    *** SERVICE NAME:(cuihua) 2012-05-05 01:26:29.562

    *** SESSION ID:(153.27) 2012-05-05 01:26:29.562

    Received ORADEBUG command 'dump processstate 10' from process Windows thread id: 2844, image:

    ===================================================

    PROCESS STATE

    -------------

    Process global information:

         process: 33E5250C, call: 31A8ACBC, xact: 00000000, curses: 33F38174, usrses: 33F38174

      ----------------------------------------

      SO: 33E5250C, type: 2, owner: 00000000, flag: INIT/-/-/0x00

      (process) Oracle pid=31, calls cur/top: 31A8ACBC/31A8ACBC, flag: (0) -

                int error: 0, call error: 0, sess error: 0, txn error 0

      (post info) last post received: 0 0 0

                  last post received-location: No post

                  last process to post me: none

                  last post sent: 0 0 0

                  last post sent-location: No post

                  last process posted by me: none

        (latch info) wait_event=0 bits=0

            Location from where call was made: kgllkdl: child: cleanup: latch

            Context saved from call: 1

          waiting for 3213e088 Child library cache lock level=6 child#=2

            Location from where latch is held: kgllkdl: child: cleanup: latch

            Context saved from call: 1

            state=busy, wlstate=free

              waiters [orapid (seconds since: put on list, posted, alive check)]:

               10 (15, 1336152387, 15)

               31 (3, 1336152387, 3)

               waiter count=2

              gotten 135516 times wait, failed first 6 sleeps 5

              gotten 0 times nowait, failed: 0

          on wait list for 3213e088

        Process Group: DEFAULT, pseudo proc: 33E7F368

        O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 3140

        OSD pid info: Windows thread id: 3140, image: ORACLE.EXE (SHAD)

    ……省略显示其他内容

    注意上述trace文件中有如下内容:

    waiting for 3213e088 Child library cache lock level=6 child#=2

    但注意这里没有holding Child library cache所以我们已经证明了硬解析会持有Child library cache lock,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cache和Child library cache lock。

    第三步:我们来证明硬解析会去持有Child library cache pin,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cacheChild library cache pin

    证明步骤和思路与第二步一模一样,这里不再赘述,我们直接来看相关的process dump的trace文件:

    d:\oracle\admin\cuihua\udump\cuihua_ora_2444.trc的内容:

    Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_2444.trc

    Sat May 05 01:34:27 2012

    ORACLE V10.2.0.1.0 – Production vsnsta=0

    vsnsql=14 vsnxtr=3

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production

    With the Partitioning, OLAP and Data Mining options

    Windows XP Version V5.1 Service Pack 2

    CPU                 : 2 – type 586

    Process Affinity    : 0×00000000

    Memory (Avail/Total): Ph:963M/2030M, Ph+PgF:2334M/3921M, VA:1269M/2047M

    Instance name: cuihua

    Redo thread mounted by this instance: 1

    Oracle process number: 30

    Windows thread id: 2444, image: ORACLE.EXE (SHAD)

    *** 2012-05-05 01:34:27.609

    *** ACTION NAME:(Command Window – New) 2012-05-05 01:34:27.546

    *** MODULE NAME:(PL/SQL Developer) 2012-05-05 01:34:27.546

    *** SERVICE NAME:(cuihua) 2012-05-05 01:34:27.546

    *** SESSION ID:(130.15) 2012-05-05 01:34:27.546

    Received ORADEBUG command ‘dump processstate 10′ from process Windows thread id: 800, image:

    ===================================================

    PROCESS STATE

    ————-

    Process global information:

         process: 33E51F1C, call: 317DD38C, xact: 00000000, curses: 33F1D17C, usrses: 33F1D17C

      —————————————-

      SO: 33E51F1C, type: 2, owner: 00000000, flag: INIT/-/-/0×00

      (process) Oracle pid=30, calls cur/top: 317DD38C/317DD38C, flag: (0) -

                int error: 0, call error: 0, sess error: 0, txn error 0

      (post info) last post received: 0 0 0

                  last post received-location: No post

                  last process to post me: none

                  last post sent: 0 0 0

                  last post sent-location: No post

                  last process posted by me: none

        (latch info) wait_event=0 bits=0

            Location from where call was made: kglpnal: child: alloc space: latch

            Context saved from call: 1

          waiting for 3213df50 Child library cache pin level=6 child#=2

            Location from where latch is held: kglpndl: child

            Context saved from call: 1

            state=busy, wlstate=free

              waiters [orapid (seconds since: put on list, posted, alive check)]:

               10 (49, 1336152864, 49)

               30 (24, 1336152864, 24)

               waiter count=2

              gotten 348654 times wait, failed first 2 sleeps 1

              gotten 0 times nowait, failed: 0

          on wait list for 3213df50

        Process Group: DEFAULT, pseudo proc: 33E7F368

        O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 2444

        OSD pid info: Windows thread id: 2444, image: ORACLE.EXE (SHAD)

    ……省略显示其他内容

    注意上述trace文件中有如下内容:

    waiting for 3213df50 Child library cache pin level=6 child#=2

    但注意这里没有holding Child library cache所以我们已经证明了硬解析会持有Child library cache pin,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cache和Child library cache pin。

    第四步:我们来证明软解析也会去持有Child library cache,所以过度的软解析实际上也会导致Child library cache的争用:

    先到session 1、session 2和session 3中去做一下准备工作:

    Session 1:

    SQL> select ‘oradebug poke 0x’ || addr || ‘ 4 0×00000001′ from v$latch_children where name=’library cache’;

    ‘ORADEBUGPOKE0X’||ADDR||’40X00

    ————————————-

    oradebug poke 0x3213DE80 4 0×00000001

    oradebug poke 0x3213DE18 4 0×00000001

    oradebug poke 0x3213DDB0 4 0×00000001

    Session 2:

    SQL> show parameter session_cache;

    NAME                                 TYPE        VALUE

    ———————————— ———– ——————————

    session_cached_cursors               integer     20

    这里session_cached_cursors的值大于0,所以这里执行同样的sql三次以上的时候才会是软软解析,如果同样的sql的执行次数大于一次但不超过三次,则是软解析。

    SQL> select * from scott.emp;

    EMPNO ENAME      JOB         MGR HIREDATE          SAL      COMM DEPTNO

    —– ———- ——— —– ———– ——— ——— ——

     7981 CUIHUA                7981                                

     7369 SMITH      CLERK      7902 1980-12-17     800.00               20

     7499 ALLEN      SALESMAN   7698 1981-2-20     1600.00    300.00     30

     7521 WARD       SALESMAN   7698 1981-2-22     1250.00    500.00     30

     7566 JONES      MANAGER    7839 1981-4-2      2975.00               20

     7654 MARTIN     SALESMAN   7698 1981-9-28     1250.00   1400.00     30

     7698 BLAKE      MANAGER    7839 1981-5-1      2850.00               30

     7782 CLARK      MANAGER    7839 1981-6-9      2450.00               10

     7788 SCOTT      ANALYST    7566 1987-4-19     3000.00               20

     7839 KING       PRESIDENT       1981-11-17    5000.00               10

     7844 TURNER     SALESMAN   7698 1981-9-8      1500.00      0.00     30

     7876 ADAMS      CLERK      7788 1987-5-23     1100.00               20

     7800 JAME4      CLERK      7698 1981-12-3      950.00               30

    13 rows selected

    注意这里子cursor和执行次数均为1:

    SQL> select sql_text, SQL_ID, VERSION_COUNT,executions from v$sqlarea where sql_text like ‘select * from scott.emp%’;

    SQL_TEXT                                                                    SQL_ID        VERSION_COUNT EXECUTIONS

    ——————————————————————————– ————- ————- ———-

    select * from scott.emp                                                          dzk32w36udhc5             1          1

    SQL> select PLAN_HASH_VALUE, CHILD_NUMBER from v$sql where sql_id=’dzk32w36udhc5′;

    PLAN_HASH_VALUE CHILD_NUMBER

    ————— ————

         3956160932            0

    SQL> select sid from v$mystat where rownum=1;

           SID

    ———-

           134

    SQL> select spid from v$process where addr in (select paddr from v$session where sid=134);

    SPID

    ————

    2740

    Session 3:

    SQL> oradebug setospid 2740

    Oracle pid: 30, Windows thread id: 2740, image: ORACLE.EXE (SHAD)

    接着我们回到session 1,去手工持有所有的Child library cache:

    Session 1:

    SQL> oradebug setmypid

    已处理的语句

    SQL> oradebug poke 0x3213DE80 4 0×00000001

    BEFORE: [3213DE80, 3213DE84) = 00000000

    AFTER:  [3213DE80, 3213DE84) = 00000001

    SQL> oradebug poke 0x3213DE18 4 0x00000001

    BEFORE: [3213DE18, 3213DE1C) = 00000000

    AFTER:  [3213DE18, 3213DE1C) = 00000001

    SQL> oradebug poke 0x3213DDB0 4 0x00000001

    BEFORE: [3213DDB0, 3213DDB4) = 00000000

    AFTER:  [3213DDB0, 3213DDB4) = 00000001

    接着到session 2去执行一下我们刚才已经在session 1中执行过的sql,这里也会hang住:

    Session 2:

    SQL> select * from scott.emp;

    ……这里hang住了

    接着我们到session 3中去对session 2做一下process dump:

    Session 3:

    SQL> oradebug dump processstate 10

    已处理的语句

    SQL> oradebug tracefile_name

    d:\oracle\admin\cuihua\udump\cuihua_ora_2740.trc

    d:\oracle\admin\cuihua\udump\cuihua_ora_2740.trc的内容:

    Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_2740.trc

    Thu Jan 28 09:08:16 2010

    ORACLE V10.2.0.1.0 - Production vsnsta=0

    vsnsql=14 vsnxtr=3

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production

    With the Partitioning, OLAP and Data Mining options

    Windows XP Version V5.1 Service Pack 2

    CPU                 : 2 - type 586

    Process Affinity    : 0x00000000

    Memory (Avail/Total): Ph:1387M/2030M, Ph+PgF:2882M/3921M, VA:1272M/2047M

    Instance name: cuihua

    Redo thread mounted by this instance: 1

    Oracle process number: 24

    Windows thread id: 2740, image: ORACLE.EXE (SHAD)

    *** SERVICE NAME:(cuihua) 2010-01-28 09:08:16.375

    *** SESSION ID:(140.4) 2010-01-28 09:08:16.375

    Memory Notification: Library Cache Object loaded into SGA

    Heap size 2210K exceeds notification threshold (2048K)

    LIBRARY OBJECT HANDLE: handle=33872c70 mutex=33872D24(0)

    name=XDB.XDbD/PLZ01TcHgNAgAIIegtw==

    hash=e0f82b0c545a707da230a62675d34c80 timestamp=08-30-2005 14:28:09

    namespace=XDBS flags=KGHP/TIM/SML/[02000000]

    kkkk-dddd-llll=0000-0000-0000 lock=S pin=S latch#=2 hpc=0002 hlc=0002

    lwt=33872CCC[33872CCC,33872CCC] ltm=33872CD4[33872CD4,33872CD4]

    pwt=33872CB0[33872CB0,33872CB0] ptm=33872CB8[33872CB8,33872CB8]

    ref=33872CEC[33872CEC,33872CEC] lnd=33872CF8[33872CF8,33872CF8]

      LIBRARY OBJECT: object=303c0ba0

      type=XDBS flags=EXS/LOC[0005] pflags=[0000] status=VALD load=X

      DATA BLOCKS:

      data#     heap  pointer    status pins change whr alloc(K)  size(K)

      —– ——– ——– ——— —- —— — ——– ——–

          0 33872c00 303c0c5c I/P/A/-/-    0 NONE   00      0.34     0.00

          1 303c0c98 3038daa8 I/P/A/-/-    1 NONE   00   2210.98  2224.10

    Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_2740.trc

    Sat May 05 08:48:42 2012

    ORACLE V10.2.0.1.0 – Production vsnsta=0

    vsnsql=14 vsnxtr=3

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production

    With the Partitioning, OLAP and Data Mining options

    Windows XP Version V5.1 Service Pack 2

    CPU                 : 2 – type 586

    Process Affinity    : 0×00000000

    Memory (Avail/Total): Ph:1290M/2030M, Ph+PgF:2663M/3921M, VA:1268M/2047M

    Instance name: cuihua

    Redo thread mounted by this instance: 1

    Oracle process number: 30

    Windows thread id: 2740, image: ORACLE.EXE (SHAD)

    *** 2012-05-05 08:48:42.781

    *** ACTION NAME:(Command Window – New) 2012-05-05 08:48:42.687

    *** MODULE NAME:(PL/SQL Developer) 2012-05-05 08:48:42.687

    *** SERVICE NAME:(cuihua) 2012-05-05 08:48:42.687

    *** SESSION ID:(134.7) 2012-05-05 08:48:42.687

    Received ORADEBUG command ‘dump processstate 10′ from process Windows thread id: 400, image:

    ===================================================

    PROCESS STATE

    ————-

    Process global information:

         process: 33E51F1C, call: 317C80A0, xact: 00000000, curses: 33F21C9C, usrses: 33F21C9C

      —————————————-

      SO: 33E51F1C, type: 2, owner: 00000000, flag: INIT/-/-/0×00

      (process) Oracle pid=30, calls cur/top: 317C80A0/317C80A0, flag: (0) -

                int error: 0, call error: 0, sess error: 0, txn error 0

      (post info) last post received: 0 0 0

                  last post received-location: No post

                  last process to post me: none

                  last post sent: 0 0 0

                  last post sent-location: No post

                  last process posted by me: none

        (latch info) wait_event=0 bits=0

            Location from where call was made: kglpin: child: heap processing: latch

            Context saved from call: 1

          waiting for 3213de18 Child library cache level=5 child#=2

            Location from where latch is held: kglpndl: child: after processing: latch

            Context saved from call: 1

            state=busy, wlstate=free

              waiters [orapid (seconds since: put on list, posted, alive check)]:

               30 (39, 1336178921, 39)

               waiter count=1

              gotten 179887 times wait, failed first 15 sleeps 1

              gotten 206 times nowait, failed: 0

          on wait list for 3213de18

        Process Group: DEFAULT, pseudo proc: 33E7F368

        O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 2740

    OSD pid info: Windows thread id: 2740, image: ORACLE.EXE (SHAD)

    ……省略显示其他内容

    注意上述trace文件中有如下内容:

    waiting for 3213de18 Child library cache level=5 child#=2

    所以我们已经证明了软解析也会去持有Child library cache(这个其实很正常,因为软解析还是要扫描library cache object handle链表的嘛),即过度的软解析实际上也会导致Child library cache的争用。

    手工释放上述Child library cache后回到session 2,刚才hang住的sql已恢复正常:

    SQL> select * from scott.emp;

    EMPNO ENAME      JOB         MGR HIREDATE          SAL      COMM DEPTNO

    —– ———- ——— —– ———– ——— ——— ——

     7981 CUIHUA                7981                                

     7369 SMITH      CLERK      7902 1980-12-17     800.00               20

     7499 ALLEN      SALESMAN   7698 1981-2-20     1600.00    300.00     30

     7521 WARD       SALESMAN   7698 1981-2-22     1250.00    500.00     30

     7566 JONES      MANAGER    7839 1981-4-2      2975.00               20

     7654 MARTIN     SALESMAN   7698 1981-9-28     1250.00   1400.00     30

     7698 BLAKE      MANAGER    7839 1981-5-1      2850.00               30

     7782 CLARK      MANAGER    7839 1981-6-9      2450.00               10

     7788 SCOTT      ANALYST    7566 1987-4-19     3000.00               20

     7839 KING       PRESIDENT       1981-11-17    5000.00               10

     7844 TURNER     SALESMAN   7698 1981-9-8      1500.00      0.00     30

     7876 ADAMS      CLERK      7788 1987-5-23     1100.00               20

     7800 JAME4      CLERK      7698 1981-12-3      950.00               30

    13 rows selected

    从如下结果可以看到,子cursor的数量还是1,执行次数变成了2,即刚才第二次执行的sql毫无疑问是软解析:

    SQL> select sql_text, SQL_ID, VERSION_COUNT,executions from v$sqlarea where sql_text like ‘select * from scott.emp%’;

    SQL_TEXT                                                                         SQL_ID        VERSION_COUNT EXECUTIONS

    ——————————————————————————– ————- ————- ———-

    select * from scott.emp                                                          dzk32w36udhc5             1          2

    第五步:我们来证明软解析还是会去持有Child library cache lock,但是是在释放了Child library cache的情况下,也就是说软解析会去持有Child library cache lock,但不会同时持有Child library cacheChild library cache lock

    证明步骤和思路与第二步一模一样,这里不再赘述,我们直接来看相关的process dump的trace文件:

    d:\oracle\admin\cuihua\udump\cuihua_ora_2988.trc的内容:

    Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_2988.trc

    Sat May 05 01:50:36 2012

    ORACLE V10.2.0.1.0 – Production vsnsta=0

    vsnsql=14 vsnxtr=3

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production

    With the Partitioning, OLAP and Data Mining options

    Windows XP Version V5.1 Service Pack 2

    CPU                 : 2 – type 586

    Process Affinity    : 0×00000000

    Memory (Avail/Total): Ph:1031M/2030M, Ph+PgF:2330M/3921M, VA:1269M/2047M

    Instance name: cuihua

    Redo thread mounted by this instance: 1

    Oracle process number: 24

    Windows thread id: 2988, image: ORACLE.EXE (SHAD)

    *** 2012-05-05 01:50:36.531

    *** ACTION NAME:(Command Window – New) 2012-05-05 01:50:36.468

    *** MODULE NAME:(PL/SQL Developer) 2012-05-05 01:50:36.468

    *** SERVICE NAME:(cuihua) 2012-05-05 01:50:36.468

    *** SESSION ID:(132.36) 2012-05-05 01:50:36.468

    Received ORADEBUG command ‘dump processstate 10′ from process Windows thread id: 972, image:

    ===================================================

    PROCESS STATE

    ————-

    Process global information:

         process: 33E4FB7C, call: 33F61154, xact: 00000000, curses: 33F1F70C, usrses: 33F1F70C

      —————————————-

      SO: 33E4FB7C, type: 2, owner: 00000000, flag: INIT/-/-/0×00

      (process) Oracle pid=24, calls cur/top: 33F61154/33F61154, flag: (0) -

                int error: 0, call error: 0, sess error: 0, txn error 0

      (post info) last post received: 0 0 0

                  last post received-location: No post

                  last process to post me: none

                  last post sent: 0 0 0

                  last post sent-location: No post

                  last process posted by me: none

        (latch info) wait_event=0 bits=0

            Location from where call was made: kgllkdl: child: cleanup: latch

            Context saved from call: 2

          waiting for 3213e0f0 Child library cache lock level=6 child#=1

            Location from where latch is held: kgllkdl: child: cleanup: latch

            Context saved from call: 2

            state=busy, wlstate=free

              waiters [orapid (seconds since: put on list, posted, alive check)]:

               25 (123, 1336153833, 123)

               27 (123, 1336153833, 123)

               11 (120, 1336153833, 120)

               20 (114, 1336153833, 114)

               19 (111, 1336153833, 111)

               24 (39, 1336153833, 39)

               waiter count=6

              gotten 261770 times wait, failed first 74 sleeps 22

              gotten 0 times nowait, failed: 0

          on wait list for 3213e0f0

        Process Group: DEFAULT, pseudo proc: 33E7F368

        O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 2988

    OSD pid info: Windows thread id: 2988, image: ORACLE.EXE (SHAD)

    ……省略显示其他内容

    注意上述trace文件中有如下内容:

    waiting for 3213e0f0 Child library cache lock level=6 child#=1

    但注意这里没有holding Child library cache所以我们已经证明了软解析依然会持有Child library cache lock,但是是在释放了Child library cache的情况下,也就是说软解析会去持有Child library cache lock,但不会同时持有Child library cache和Child library cache lock。

    第六步:我们来证明软解析还是会去持有Child library cache pin,但是是在释放了Child library cache的情况下,也就是说软解析会去持有Child library cache pin,但不会同时持有Child library cacheChild library cache pin

    证明步骤和思路与第五步一模一样,这里不再赘述。

    结合我文章开头提到的那三篇文章(尤其是大熊的那篇文章)和上面的六步证明过程,我们得出如下关于硬解析和软解析时latch持有情况的结论(这个结论是近似的,并没有包括所有的情况):

    1SQL在硬解析的时候,会先去持有Child library cache以扫描library cache object handle链表,接着会释放Child library cache(因为找不到相关的library cache object handle),接下来会再次持有Child library cache(因为要修改library cache object handle链表),然后在不释放Child library cache的情况下进而去持有Child shared pool以分配内存,接着释放Child shared pool(因为已经分配完了),接着再释放Child library cache然后因为要以null模式获得library cache lock,所以要先持有Child library cache lock;接着要把相关sql的执行计划load到对应的子cursor的library cache object handle中,即要以exclusive模式获得library cache pin,所以要先持有Child library cache pin;

    2SQL在软解析的时候和硬解析类似,只不过不需要持有Child shared pool并且持有Child library cache的时间会更短。

    最后我们来说一下硬解析和软解析在latch争用情况下的本质区别:

    1、软解析不会导致Child shared pool的争用;

    2、硬解析跟软解析比会导致更为剧烈的Child library cache的争用,这个当然是因为硬解析在不释放Child library cache的情况下进而会去持有Child shared pool

     


    8 Comments on “硬解析和软解析情况下latch持有情况分析”

      1. 测试的真仔细!

      2. overmars

        关于硬分析持有的latch,我看大部分的书都是这样描述:
        硬分析应该是先持有SQL handle的library cache lock ,然后在生成计划的过程中,对SQLhandle的LCO以exclusive 方式持有library cache pin.

        但是楼主的说法,好像恰好反过来了,请楼主能否进一步解释下2者持有的顺序,lock和pin的对象分别是针对的对象是谁?

      3. overmars

        ok,等待您的大作!

        • Cui Hua

          这个你是对的,应该是先持有library cache lock latch,再持有library cache pin latch,我已经改过来了。

      4. wzh

        测试非常详细,看完后加深了对软解析、硬解析的理解。

      5. taowang2016

        你好,goog job!
        我在重现以下过程:
        第三步:我们来证明硬解析会去持有Child library cache pin,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cache和Child library cache pin
        时出现了一个问题,向你请教。

        环境:windows 32
        oracle:10.2.0.1

        准备信息如下:
        session1:
        SQL> select ‘oradebug poke 0x’ || addr || ‘ 4 0×00000001′ from v$latch_children where name=’library cache pin’;

        ‘ORADEBUGPOKE0X’||ADDR||’40X00000001′
        ————————————————
        oradebug poke 0x240BDF50 4 0×00000001
        oradebug poke 0x240BDEE8 4 0×00000001
        oradebug poke 0x240BDE80 4 0×00000001

        SQL> alter system flush shared_pool;

        系统已更改。

        SQL> oradebug setmypid
        已处理的语句

        session2的准备信息如下:
        SQL> select sid from v$mystat where rownum=1;

        SID
        ———-
        148

        SQL> select spid from v$process where addr in (select paddr from v$session where sid=148);

        SPID
        ————————
        4072

        session3准备信息如下:
        SQL> oradebug setospid 4072

        Oracle pid: 19, Windows thread id: 4072, image: ORACLE.EXE (SHAD)

        接着
        在session1中poke了library cache pin

        SQL> oradebug poke 0x240BDF50 4 0×00000001
        BEFORE: [240BDF50, 240BDF54) = 00000000
        AFTER: [240BDF50, 240BDF54) = 00000001
        SQL> oradebug poke 0x240BDEE8 4 0×00000001
        BEFORE: [240BDEE8, 240BDEEC) = 00000000
        AFTER: [240BDEE8, 240BDEEC) = 00000001
        SQL> oradebug poke 0x240BDE80 4 0×00000001
        BEFORE: [240BDE80, 240BDE84) = 00000000
        AFTER: [240BDE80, 240BDE84) = 00000001

        session2中执行以下查询,被hang住:

        SQL> select * from scott.emp;
        select * from scott.emp
        hang住

        session3中dump process信息,此时出现了如下错误:

        SQL> oradebug dump processstate 10
        ORA-00072: 进程 “Windows thread id: 4072, image: ORACLE.EXE (SHAD)” 不活动

        同时,发现session2中有如下提示错误:

        select * from scott.emp
        *
        第 1 行出现错误:
        ORA-03113: 通信通道的文件结束

        请问这是如何导致的,经过测试library cache lock的对的,但这个library cache pin的测试做了很多次,一直都是上面的错误信息,是我测试的哪一步出了问题呢?

        • taowang2016

          还是同样的测试,刚特意在centos 5.7 64bit, oracle 10.2.0.1 64bit的环境上进行,就没有这个问题,也就是说在linux上的测试结果和你的是一样的,可是在windows上,试了很多次,一直提示如上的错误,问题可能出在什么地方呢?

  • 相关阅读:
    微信小程序-上传多张图片加进度条(支持预览、删除)
    php中120个内置函数
    angular6 NgModule中定义模块module
    Aliasing input/output properties
    angular6 Can't bind to 'zzst' since it isn't a known property of
    [转]DOM 中 Property 和 Attribute 的区别
    Angular6
    [转]VirtualBox 修改UUID实现虚拟硬盘复制
    pthread_create如何传递两个参数以上的参数
    linux 线程操作问题undefined reference to 'pthread_create'的解决办法(cmake)
  • 原文地址:https://www.cnblogs.com/taowang2016/p/2932069.html
Copyright © 2011-2022 走看看