  • Learning ASM: rebalance

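    When a rebalance happens:

           Adding disks to an ASM disk group, dropping disks from it, or resizing them all trigger a rebalance on the ASM instance.

    The rebalance process:

    Phase 1, planning:

      ASM works out a rough rebalance plan from the number and size of the disks, disk throughput, AU size and so on. This phase usually takes only a few minutes.

    Phase 2, extent relocating:

      The actual rebalancing, which spreads extents evenly across the disks. This is usually the most time-consuming phase of the whole rebalance.

    Phase 3, compacting:

      Supported from ASM 11.1.0.7 onward. It moves the data on each disk as far as possible toward the outer tracks (the outer tracks of a spinning disk are faster). It takes relatively little time.

    Actual amount of work and time a rebalance needs

      When will the disk group rebalance finish? This is probably the question most people care about. You can usually check gv$asm_operation.est_minutes to see how many minutes remain, although the value is not accurate.

    A small experiment to walk through the whole process:

    First, a formula: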

    Formula: (TAU/N)*(N-1)
    Where:
    TAU = total allocation units to move
    N = number of disks
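
    The formula can be checked with a few lines of Python. This is only a minimal sketch: the function name extents_to_move is made up for illustration, and truncating to a whole number of allocation units is an assumption chosen to match the "129 = 194/3*2" figure quoted in the experiment below.

```python
def extents_to_move(total_au: int, n_disks: int) -> int:
    """Estimate (TAU / N) * (N - 1), truncated to whole allocation units."""
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    # Multiply before dividing so integer arithmetic does not lose precision.
    return total_au * (n_disks - 1) // n_disks


# Values from the experiment: 194 extents, N = 3.
print(extents_to_move(194, 3))
```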

    Check how many extents the disk group holds:

    13:59:11 SQL> Select count(pxn_kffxp) Extents, disk_kffxp disk_number, group_kffxp group_number from x$kffxp where group_kffxp=6 group by disk_kffxp, group_kffxp order by group_kffxp,disk_kffxp;

       EXTENTS DISK_NUMBER GROUP_NUMBER
    ---------- ----------- ------------
           194           0            6

    Elapsed: 00:00:00.00


    13:59:16 SQL> select (total_mb-free_mb) REQUIRED_MIRROR_FREE_MB,USABLE_FILE_MB, type,ALLOCATION_UNIT_SIZE/1024 from v$asm_diskgroup where name='TEST';

    REQUIRED_MIRROR_FREE_MB USABLE_FILE_MB TYPE         ALLOCATION_UNIT_SIZE/1024
    ----------------------- -------------- ------------ -------------------------
                        200         457144 EXTERN                            1024

    Elapsed: 00:00:00.03

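    The 200 MB in use roughly matches the 194 extents found above (the AU size is 1024 KB, so 194 extents is about 194 MB).

    23:27:36 SQL> show parameter asm                         -- check the ASM rebalance power (degree of parallelism)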

    NAME                                 TYPE                   VALUE
    ------------------------------------ ---------------------- ------------------------------
    asm_power_limit                      integer                1

    23:06:28 SQL> alter diskgroup archdg drop disk ARCHDG_0005;

    Diskgroup altered.

    14:01:09 SQL> select * from v$asm_operation;        -- EST_WORK shows that 129 extents actually need rebalancing: 129 = 194/3*2

    GROUP_NUMBER OPERATION  STATE         POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE
    ------------ ---------- -------- ---------- ---------- ---------- ---------- ---------- ----------- ----------------------------------------------------------------------------------------
               6 REBAL      RUN               1          1          2        129       5098           0

    Now look at the EST_MINUTES column:

    23:06:45 SQL> select INST_ID, OPERATION, STATE, POWER, SOFAR, EST_WORK, EST_RATE, EST_MINUTES from GV$ASM_OPERATION where GROUP_NUMBER=1;

       INST_ID OPERATION  STATE         POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
    ---------- ---------- -------- ---------- ---------- ---------- ---------- -----------
             2 REBAL      WAIT              1
             1 REBAL      RUN               1       6166      53989      10157           4

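    The view reports 4 minutes remaining, yet the alert log shows the rebalance actually ran from 23:06 to 23:13, about 7 minutes. There is little data here, so the gap is not yet very pronounced.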
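
    The EST_MINUTES values shown are consistent with dividing the remaining work by the rate. The sketch below assumes that EST_RATE is in allocation units per minute and that the result is rounded down; Oracle's exact calculation is not stated in this article, so treat the formula as an approximation that happens to fit the query outputs here. The function name est_minutes is illustrative only.

```python
def est_minutes(sofar: int, est_work: int, est_rate: int) -> int:
    """Whole minutes remaining: (EST_WORK - SOFAR) / EST_RATE, rounded down."""
    if est_rate <= 0:
        # No measured rate (for example once relocation has finished).
        return 0
    remaining = max(est_work - sofar, 0)
    return remaining // est_rate


# SOFAR, EST_WORK, EST_RATE taken from the two RUN rows shown above.
print(est_minutes(6166, 53989, 10157))
print(est_minutes(2, 129, 5098))
```

    Because the estimate covers only the remaining relocation work at the current rate, it says nothing about the compacting phase that follows, which is one reason the real elapsed time can be longer.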

    SQL> alter diskgroup archdg drop disk ARCHDG_0005 
    NOTE: GroupBlock outside rolling migration privileged region
    NOTE: requesting all-instance membership refresh for group=1
    Wed Jul 04 23:06:42 2018
    GMON updating for reconfiguration, group 1 at 61 for pid 35, osid 30893
    NOTE: group ARCHDG: updated PST location: disk 0002 (PST copy 0)
    NOTE: group ARCHDG: updated PST location: disk 0004 (PST copy 1)
    NOTE: group 1 PST updated.
    Wed Jul 04 23:06:42 2018
    NOTE: membership refresh pending for group 1/0x8ab98994 (ARCHDG)
    GMON querying group 1 at 62 for pid 18, osid 20082
    SUCCESS: refreshed membership for 1/0x8ab98994 (ARCHDG)
    SUCCESS: alter diskgroup archdg drop disk ARCHDG_0005
    NOTE: Attempting voting file refresh on diskgroup ARCHDG
    NOTE: Refresh completed on diskgroup ARCHDG. No voting file found.
    NOTE: starting rebalance of group 1/0x8ab98994 (ARCHDG) at power 1
    Starting background process ARB0
    Wed Jul 04 23:06:48 2018
    ARB0 started with pid=36, OS id=41870 
    NOTE: assigning ARB0 to group 1/0x8ab98994 (ARCHDG) with 1 parallel I/O
    cellip.ora not found.
    Wed Jul 04 23:11:33 2018
    NOTE: GroupBlock outside rolling migration privileged region
    NOTE: requesting all-instance membership refresh for group=1
    Wed Jul 04 23:11:36 2018
    GMON updating for reconfiguration, group 1 at 63 for pid 37, osid 52733
    NOTE: group ARCHDG: updated PST location: disk 0002 (PST copy 0)
    NOTE: group ARCHDG: updated PST location: disk 0004 (PST copy 1)
    NOTE: group 1 PST updated.
    SUCCESS: grp 1 disk ARCHDG_0005 emptied
    NOTE: erasing header on grp 1 disk ARCHDG_0005
    NOTE: process _x000_+asm1 (52733) initiating offline of disk 5.3916003717 (ARCHDG_0005) with mask 0x7e in group 1
    NOTE: initiating PST update: grp = 1, dsk = 5/0xe9697985, mask = 0x6a, op = clear
    GMON updating disk modes for group 1 at 64 for pid 37, osid 52733
    NOTE: group ARCHDG: updated PST location: disk 0002 (PST copy 0)
    NOTE: group ARCHDG: updated PST location: disk 0004 (PST copy 1)
    NOTE: PST update grp = 1 completed successfully 
    NOTE: initiating PST update: grp = 1, dsk = 5/0xe9697985, mask = 0x7e, op = clear
    GMON updating disk modes for group 1 at 65 for pid 37, osid 52733
    NOTE: group ARCHDG: updated PST location: disk 0002 (PST copy 0)
    NOTE: group ARCHDG: updated PST location: disk 0004 (PST copy 1)
    NOTE: cache closing disk 5 of grp 1: ARCHDG_0005
    NOTE: PST update grp = 1 completed successfully 
    GMON updating for reconfiguration, group 1 at 66 for pid 37, osid 52733
    NOTE: cache closing disk 5 of grp 1: (not open) ARCHDG_0005
    NOTE: group ARCHDG: updated PST location: disk 0002 (PST copy 0)
    NOTE: group ARCHDG: updated PST location: disk 0004 (PST copy 1)
    NOTE: group 1 PST updated.
    Wed Jul 04 23:11:36 2018
    NOTE: membership refresh pending for group 1/0x8ab98994 (ARCHDG)
    GMON querying group 1 at 67 for pid 18, osid 20082
    GMON querying group 1 at 68 for pid 18, osid 20082
    NOTE: Disk ARCHDG_0005 in mode 0x0 marked for de-assignment
    SUCCESS: refreshed membership for 1/0x8ab98994 (ARCHDG)
    NOTE: Attempting voting file refresh on diskgroup ARCHDG
    NOTE: Refresh completed on diskgroup ARCHDG. No voting file found.
    Wed Jul 04 23:13:57 2018
    NOTE: stopping process ARB0
    SUCCESS: rebalance completed for group 1/0x8ab98994 (ARCHDG)
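
    The real duration can be read off the alert log mechanically. The following is a rough sketch, not an Oracle tool: it assumes the timestamp format seen in the log above ("Wed Jul 04 23:06:42 2018"), an English/C locale for day and month names, and that each message belongs to the closest timestamp line before it. The function name rebalance_elapsed_seconds is hypothetical.

```python
from datetime import datetime

TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Wed Jul 04 23:06:42 2018"


def rebalance_elapsed_seconds(lines):
    """Seconds between 'starting rebalance' and 'rebalance completed',
    using the closest preceding timestamp line for each message."""
    last_ts = None
    start = None
    for raw in lines:
        line = raw.strip()
        try:
            last_ts = datetime.strptime(line, TS_FORMAT)
            continue  # a pure timestamp line carries no message
        except ValueError:
            pass
        if "starting rebalance of group" in line:
            start = last_ts
        elif "rebalance completed for group" in line:
            if start is not None and last_ts is not None:
                return (last_ts - start).total_seconds()
    return None


# A few lines copied from the alert log above.
sample = """\
Wed Jul 04 23:06:42 2018
NOTE: starting rebalance of group 1/0x8ab98994 (ARCHDG) at power 1
Wed Jul 04 23:06:48 2018
ARB0 started with pid=36, OS id=41870
Wed Jul 04 23:13:57 2018
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x8ab98994 (ARCHDG)
""".splitlines()

print(rebalance_elapsed_seconds(sample))
```

    For the excerpt above this gives 435 seconds, i.e. 7 minutes 15 seconds, which matches the roughly 7 minutes noted earlier.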

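    At this point you can run ls -lrth +ASM*_arb0* to find the latest ARB trace file. Its name ends in a number, for example +ASM1_arb0_70552.trc, and its contents show the ARB0 process relocating data: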


    *** 2018-07-04 23:24:07.940
    ARB0 relocating file +ARCHDG.2115.980618919 (120 entries)

    *** 2018-07-04 23:24:09.072
    ARB0 relocating file +ARCHDG.2115.980618919 (120 entries)

    *** 2018-07-04 23:24:11.826
    ARB0 relocating file +ARCHDG.2115.980618919 (120 entries)

    *** 2018-07-04 23:24:14.527
    ARB0 relocating file +ARCHDG.2115.980618919 (120 entries)

    pstack also shows this process going through the kfgbRebalExecute - kfdaExecute - kffRelocate functions:

    #pstack 70552
    #0 0x00007fe8467bc644 in __io_getevents_0_4 () from /lib64/libaio.so.1
    #1 0x0000000002baea09 in skgfrliopo ()
    #2 0x0000000002bae801 in skgfospo ()
    #3 0x00000000086ba843 in skgfrwat ()
    #4 0x0000000008599e5d in ksfdwtio ()
    #5 0x000000000220019b in ksfdwat_internal ()
    #6 0x0000000003edce17 in kfk_reap_ufs_async_io ()
    #7 0x0000000003edcd5f in kfk_reap_ios_from_subsys ()
    #8 0x0000000000adefa4 in kfk_reap_ios ()
    #9 0x0000000003edc382 in kfk_io1 ()
    #10 0x0000000003edbf28 in kfkRequest ()
    #11 0x0000000003ee272e in kfk_transitIO ()
    #12 0x0000000003e35d5e in kffRelocateWait ()
    #13 0x0000000003e5cbf6 in kffRelocate ()
    #14 0x0000000003dd22df in kfdaExecute ()
    #15 0x0000000003eb563f in kfgbRebalExecute ()
    #16 0x0000000003ea2414 in kfgbDriver ()
    #17 0x00000000021a86d7 in ksbabs ()
    #18 0x0000000003eb964c in kfgbRun ()
    #19 0x00000000021ad44b in ksbrdp ()
    #20 0x00000000023d3eef in opirip ()
    #21 0x000000000167e7b5 in opidrv ()
    #22 0x0000000001c58477 in sou2o ()
    #23 0x00000000008472c2 in opimai_real ()
    #24 0x0000000001c5e795 in ssthrdmain ()
    #25 0x00000000008471b9 in main ()

    23:09:13 SQL> /

       INST_ID OPERATION  STATE         POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
    ---------- ---------- -------- ---------- ---------- ---------- ---------- -----------
             2 REBAL      WAIT              1
             1 REBAL      RUN               1      53989      53989          0           0

    When 0 minutes remain, the rebalance has in fact entered phase 3, compacting.
    pstack likewise shows the kfdCompact function in use:

    #pstack 85306
    #0 0x00007fef99cdb644 in __io_getevents_0_4 () from /lib64/libaio.so.1
    #1 0x0000000002baea09 in skgfrliopo ()
    #2 0x0000000002bae801 in skgfospo ()
    #3 0x00000000086ba843 in skgfrwat ()
    #4 0x0000000008599e5d in ksfdwtio ()
    #5 0x000000000220019b in ksfdwat_internal ()
    #6 0x0000000003edce17 in kfk_reap_ufs_async_io ()
    #7 0x0000000003edcd5f in kfk_reap_ios_from_subsys ()
    #8 0x0000000000adefa4 in kfk_reap_ios ()
    #9 0x0000000003edc382 in kfk_io1 ()
    #10 0x0000000003edbf28 in kfkRequest ()
    #11 0x0000000003ee272e in kfk_transitIO ()
    #12 0x0000000003e35d5e in kffRelocateWait ()
    #13 0x0000000003e5cbf6 in kffRelocate ()
    #14 0x0000000003dd22df in kfdaExecute ()
    #15 0x0000000003da1281 in kfdCompact ()
    #16 0x0000000003da221a in kfdExecute ()
    #17 0x0000000003eb5342 in kfgbRebalExecute ()
    #18 0x0000000003ea2414 in kfgbDriver ()
    #19 0x00000000021a86d7 in ksbabs ()
    #20 0x0000000003eb964c in kfgbRun ()
    #21 0x00000000021ad44b in ksbrdp ()
    #22 0x00000000023d3eef in opirip ()
    #23 0x000000000167e7b5 in opidrv ()
    #24 0x0000000001c58477 in sou2o ()
    #25 0x00000000008472c2 in opimai_real ()
    #26 0x0000000001c5e795 in ssthrdmain ()
    #27 0x00000000008471b9 in main ()

      

    So how can a rebalance be sped up?

    Factors that affect rebalance speed:

    • Asm_power_limit
    • Diskgroup redundancy
    • Disks throughput
    • Disks RAID
    • Disk provider
    • AU size
    • How many disks are being added or dropped
    • Failure groups

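    A few points, without going into every one:

    1. Do not rebalance more often than necessary

    Running alter diskgroup rebalance puts a heavy I/O load on ASM, so avoid rebalancing more than needed: whatever can be done in one operation should not be split into two. In the test below, adding two disks one at a time took 26.82 s and 25.97 s, while adding both in a single statement took 16.77 s.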

    15:42:29 SQL> alter diskgroup testdd add disk '/dev/qdata/mpath-s03.3264.01.P0B00S05' rebalance power 20 wait;

    Diskgroup altered.

    Elapsed: 00:00:26.82

    15:42:50 SQL> alter diskgroup testdd add disk '/dev/qdata/mpath-s02.3264.01.P0B00S05' rebalance power 20 wait;

    Diskgroup altered.

    Elapsed: 00:00:25.97

    15:41:32 SQL> alter diskgroup testdd add disk '/dev/qdata/mpath-s03.3264.01.P0B00S05', '/dev/qdata/mpath-s02.3264.01.P0B00S05' rebalance power 20 wait;

    Diskgroup altered.

    Elapsed: 00:00:16.77

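    2. The higher the power setting, the faster the rebalance

    The same two-disk add run at power 1 took 20.11 s, compared with 16.77 s at power 20 above.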

    15:48:00 SQL> alter diskgroup testdd add disk '/dev/qdata/mpath-s03.3264.01.P0B00S05', '/dev/qdata/mpath-s02.3264.01.P0B00S05' rebalance power 1 wait;

    Diskgroup altered.

    Elapsed: 00:00:20.11

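    3. Skip the compact phase:

    Set _DISABLE_REBALANCE_COMPACT=TRUE, or set the disk group attribute: ALTER DISKGROUP <diskgroup_name> SET ATTRIBUTE '_rebalance_compact'='FALSE'; Before changing these hidden settings it is still advisable to open an SR with Oracle Support, or to ask ASM support engineer Bane Radulovic.

    New rebalance features:

    ASM fast rebalance:

    Shut down the databases and restart the ASM instance in restricted mode. Databases cannot connect in this state, so the rebalance runs somewhat faster. Restart the ASM instance once the rebalance has finished. In practice this does not seem very useful.

    ASM fast mirror resync:

    Restoring disk group redundancy after a failed disk has been repaired can sometimes take a long time. In this scenario Oracle ASM fast mirror resync can effectively reduce the resync time. Disk group compatibility must be set to 11.1.0 or higher for it to take effect. Fast mirror resync tracks the extents that change while the disk is offline and resynchronizes them once the disk comes back online.

    The repair window defaults to 3.6 hours and is set through disk_repair_time. Before dropping a disk, make sure no disks are offline.

    In 12c, the resync phase can be observed through the V$ASM_OPERATION.PASS column.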

    SQL> SELECT GROUP_NUMBER, PASS, STATE FROM V$ASM_OPERATION;

    GROUP_NUMBER PASS      STAT
    ------------ --------- ----
               1 RESYNC    RUN
               1 REBALANCE WAIT
               1 COMPACT   WAIT


  • Original article: https://www.cnblogs.com/huayng/p/9270786.html