zoukankan      html  css  js  c++  java
  • [20190409]pre_page_sga=true与连接缓慢的问题.txt

    [20190409]pre_page_sga=true与连接缓慢的问题.txt

    --//曾经遇到11g下设置pre_page_sga=true启动缓慢的问题(没有使用hugepages).
    --//链接:http://blog.itpub.net/267265/viewspace-2295412/
    --//实际上这样系统也会遇到连接缓慢的情况,通过测试说明问题.

    1.建立pfile:
    $ cat initxxxx.ora
    db_name=xxxx
    instance_name=xxxx
    sga_target=20G
    #sga_target=0
    sga_max_size=20G
    pre_page_sga=true

    $ export ORACLE_SID=xxxx

    SYS@xxxx> startup nomount
    ORACLE instance started.
    Total System Global Area 2.1379E+10 bytes
    Fixed Size                  2262656 bytes
    Variable Size            2818574720 bytes
    Database Buffers         1.8522E+10 bytes
    Redo Buffers               36073472 bytes

    $ grep -i page /proc/meminfo
    AnonPages:        239336 kB
    PageTables:       701008 kB
    AnonHugePages:         0 kB
    HugePages_Total:     104
    HugePages_Free:       48
    HugePages_Rsvd:       41
    HugePages_Surp:        0
    Hugepagesize:       2048 kB

    --//才启动没有任何用户连接,PageTables=701008 kB.

    2.测试登陆问题:
    $ time sqlplus -s -l  / as sysdba <<<quit
    real    0m4.944s
    user    0m0.024s
    sys     0m0.016s
    --//单个连接需要5秒.如果多个用户并发呢?
    $ cat aa.sql
    quit

    $ time seq 50 | xargs -I{} -P 50 sqlplus -s -l  / as sysdba @ aa.sql
    real    0m13.978s
    user    0m1.833s
    sys     0m1.017s
    --//多个用户同时登录,全部完成需要13秒.

    3.分析:
    $ strace -f -c sqlplus -s -l  / as sysdba <<<quit
    Process 3120 attached
    Process 3120 detached
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     99.85    0.419648       69941         6           shmdt
      0.10    0.000434           1       413         2 read
      0.01    0.000046           0       287       127 open
      0.01    0.000045           0       180           mmap
      0.01    0.000037           0       178           close
      0.01    0.000032           0        79           rt_sigaction
      ....
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.420259                  1970       227 total

    $ man shmdt
    ...
    shmdt() detaches the shared memory segment located at the address specified by shmaddr from the address space of the
    calling process.  The to-be-detached segment must be currently attached with shmaddr equal to the value returned by the
    attaching shmat() call.

    On a successful shmdt() call the system updates the members of the shmid_ds structure associated with the shared memory
    segment as follows:

    shm_dtime is set to the current time.
    shm_lpid is set to the process-ID of the calling process.
    shm_nattch is decremented by one.  If it becomes 0 and the segment is marked for deletion, the segment is deleted.

    --//给人的感觉好像消耗在调用shmdt,实际上消耗的时间0.419648.也就是这样还是无法定位问题在那里.

    $ strace -fTtt  -e shmdt sqlplus -s -l  / as sysdba
    Process 6046 attached
    [pid  6046] 08:41:55.548586 shmdt(0x7f2c41600000) = 0 <0.000036>
    quit
    08:41:59.992835 shmdt(0x60000000)       = 0 <0.000042>
    08:41:59.992983 shmdt(0x68000000)       = 0 <0.000042>
    08:41:59.993097 shmdt(0x6c000000)       = 0 <0.360175>
    08:42:00.353446 shmdt(0x4ec000000)      = 0 <0.037200>
    08:42:00.390737 shmdt(0x560000000)      = 0 <0.000026>
    Process 6046 detached
    --//我手工执行quit,可以发现shmdt调用发生在退出阶段,也就是不是它导致登陆缓慢.

    $ ipcs -smt

    ------ Shared Memory Attach/Detach/Change Times --------
    shmid      owner      attached             detached             changed
    352092161  oracle      Apr 10 17:26:09      Apr 10 17:26:17      Apr 10 17:09:55
    352124930  oracle      Apr 10 17:26:09      Apr 10 17:26:17      Apr 10 17:09:55
    352157699  oracle      Apr 10 17:26:09      Apr 10 17:26:18      Apr 10 17:09:55
    352190469  oracle      Apr 10 17:26:09      Apr 10 17:26:18      Apr 10 17:09:55
    352223238  oracle      Apr 10 17:26:09      Apr 10 17:26:18      Apr 10 17:09:55

    ------ Semaphore Operation/Change Times --------
    semid    owner      last-op                    last-changed
    310738944 oracle      Wed Apr 10 17:11:33 2019   Wed Apr 10 17:11:33 2019
    --//从执行时间看4.5秒并不是消耗在这里.

    4.继续分析,排除调用shmdt的情况:

    $ strace -fTr -o /tmp/a2.txt  sqlplus  -s -l / as sysdba <<<quit
    $ awk '{print $2}' /tmp/a2.txt | sort -nr | head
    5.041521
    0.287119
    0.064745
    0.008298
    0.006386
    0.004288
    0.004272
    0.001319
    0.000982
    0.000978

    --//查看a2.txt
    3366       0.000067 stat("/u01/app/oracle/product/11.2.0.4/dbhome_1/lib", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0 <0.000025>
    3366       0.000112 chdir("/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs") = 0 <0.000027>
    3366       5.041521 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4c2c0b3000 <0.000048>
    3366       0.000407 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4c2c090000 <0.000015>
    3366       0.000127 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4c2c06d000 <0.000013>
    3366       0.000218 getcwd("/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs"..., 256) = 46 <0.000016>
    3366       0.000087 access("/etc/intchg.ora", F_OK) = -1 ENOENT (No such file or directory) <0.000019>
    ...
    --//也就是在chdir调用与mmap调用之间有5秒不知道在干什么?(chdir 仅仅需要0.000027秒).strace看不出来!!
    --//如果你使用top -u oracle观察,你可以发现RES字段不断在增加,接近SGA设置.我的感觉就是在这里建立pagetables的过程.

    $ cat aa.sql
    host sleep 1
    quit

    $ cat a.sh
    #! /bin/bash
    export ORACLE_SID=xxxx
    sqlplus  -s -l / as sysdba @ aa.sql &
    a=$(($!+2))
    top -p $a -d 1 -b

    $ . a.sh | tee /tmp/aa.txt
    ...
    --//按ctrl+c中断.

    $ grep PID /tmp/aa.txt | head -1; egrep 'oracle' /tmp/aa.txt
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     8498 oracle    20   0 20.2g 1.5g 1.5g R 93.9  1.2   0:00.47 oracle
     8498 oracle    20   0 20.2g 5.8g 5.8g R 99.7  4.6   0:01.47 oracle
     8498 oracle    20   0 20.2g  10g  10g R 99.8  8.4   0:02.47 oracle
     8498 oracle    20   0 20.2g  15g  15g R 99.8 12.2   0:03.47 oracle
    --//sleep 1
     8498 oracle    20   0 20.2g  19g  19g S 95.9 15.7   0:04.43 oracle
     8498 oracle    20   0 20.2g  19g  19g S  0.0 15.7   0:04.43 oracle
     8498 oracle    20   0 20.0g  19g  19g R  1.0 15.6   0:04.44 oracle
    --//你可以发现RES,SHR都在增加,CPU使用率在9X%,感觉11g版本在设置pre_page_sga=true下有问题.
    --//大致定位时间定位在建立pagetables的过程.

    5.改用hugepages呢?
    $ grep -i page /proc/meminfo
    AnonPages:        244144 kB
    PageTables:        11508 kB
    AnonHugePages:         0 kB
    HugePages_Total:   26000
    HugePages_Free:    15800
    HugePages_Rsvd:       41
    HugePages_Surp:        0
    Hugepagesize:       2048 kB

    $ time  sqlplus -s -l  / as sysdba <<<quit
    real    0m0.180s
    user    0m0.029s
    sys     0m0.011s

    $ strace -fTr -o /tmp/a3.txt  sqlplus  -s -l / as sysdba <<<quit
    $ awk '{print $2}' /tmp/a3.txt | sort -nr | head
    0.121929
    0.009182
    0.009144
    0.009047
    0.005081
    0.002050
    0.001984
    0.001932
    0.001321
    0.001243

    --//查看a3.txt
    8716       0.000095 chdir("/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs") = 0 <0.000019>
    --//这里等待时间很小.
    8716       0.121929 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f552659d000 <0.000029>
    8716       0.000143 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f552657a000 <0.000020>
    8716       0.000166 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5526557000 <0.000020>
    8716       0.000260 getcwd("/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs"..., 256) = 46 <0.000023>

    --//总之:
    如果pre_page_sga=true最好开启hugepages.
    很奇怪的是12c 缺省pre_page_sga=true,没有开启hugepages的情况下,没有这个问题.

  • 相关阅读:
    java.lang.UnsatisfiedLinkError: No implementation found for
    target release 1.5 conflicts with default source release 1.7
    (转)makefile里PHONY的相关介绍
    Hint: A potential Change-Id was found, but it was not in the footer (last paragraph) of the commit message
    linux jdk版本随时切换
    提高Service的优先级
    第一章、数字图像的描述
    gluas图形处理——导读
    图像处理学习过程——网站,视频,书籍(长期更新)
    基数排序
  • 原文地址:https://www.cnblogs.com/lfree/p/10687961.html
Copyright © 2011-2022 走看看