  • Oracle 11gR2: CRS service crashes during fibre link failover


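    Background:

    We deployed Oracle 11gR2 (11.2.0.4) on Red Hat Enterprise Linux 5.8, with multipath link redundancy provided by RDAC. Once the deployment was finished we needed to test that redundancy. The fibre links are referred to below as ① to ④. We ran the following cable-pull test combinations: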

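    Cable-pull test combination 1:

    1. Unplug fibre links ② and ④ first: everything is normal. Plug them back in, wait five minutes, then run step 2.

    2. Then unplug fibre links ① and ③: the database service stays up, but the CRS process crashes and cannot be reached. Restarting CRS manually restores it.

    Cable-pull test combination 2:

    1. Unplug fibre links ① and ③ first: everything is normal. Plug them back in, wait five minutes, then run step 2.

    2. Then unplug fibre links ② and ④: the database service stays up, but the CRS process crashes and cannot be reached. Restarting CRS manually restores it.

    Cable-pull test combination 3:

    1. Unplug fibre links ① and ④ first: everything is normal. Plug them back in, wait five minutes, then run step 2.

    2. Then unplug fibre links ② and ③: everything is normal.

    Cable-pull test combination 4:

    1. Unplug fibre links ② and ③ first: everything is normal. Plug them back in, wait five minutes, then run step 2.

    2. Then unplug fibre links ① and ④: everything is normal.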

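    Controller switchover test:

    1. Log in to the storage management console. The disks are currently owned by controller A; manually move all of them to controller B. Everything is normal.

    2. Five minutes later, log in to the storage management console again and move all disks from controller B back to controller A. Everything is normal.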



     

    Symptoms:

    The problem occurs at step 2 of test combinations 1 and 2. The symptoms are as follows:

    [grid@db01 ~] $ crs_stat -t -v
    CRS-0184: Cannot communicate with the CRS daemon.
    
    [root@db01 ~]# ps -ef|grep ora
    oracle    2687     1  0 00:12 ?        00:00:00 ora_pmon_woo
    oracle    2689     1  0 00:12 ?        00:00:00 ora_psp0_woo
    oracle    2691     1  0 00:12 ?        00:00:00 ora_vktm_woo
    oracle    2695     1  0 00:12 ?        00:00:00 ora_gen0_woo
    oracle    2697     1  0 00:12 ?        00:00:00 ora_diag_woo
    oracle    2699     1  0 00:12 ?        00:00:00 ora_dbrm_woo
    oracle    2701     1  0 00:12 ?        00:00:00 ora_dia0_woo
    oracle    2703     1  0 00:12 ?        00:00:00 ora_mman_woo
    oracle    2705     1  0 00:12 ?        00:00:00 ora_dbw0_woo
    oracle    2707     1  0 00:12 ?        00:00:00 ora_lgwr_woo
    oracle    2709     1  0 00:12 ?        00:00:01 ora_ckpt_woo
    oracle    2711     1  0 00:12 ?        00:00:00 ora_smon_woo
    oracle    2713     1  0 00:12 ?        00:00:00 ora_reco_woo
    oracle    2715     1  0 00:12 ?        00:00:00 ora_mmon_woo
    oracle    2717     1  0 00:12 ?        00:00:00 ora_mmnl_woo
    oracle    2719     1  0 00:12 ?        00:00:00 ora_d000_woo
    oracle    2721     1  0 00:12 ?        00:00:00 ora_s000_woo
    oracle    2728     1  0 00:12 ?        00:00:00 ora_rvwr_woo

    SQL> select host_name,instance_name,status from gv$instance;

    HOST_NAME  INSTANCE_NAME    STATUS
    ---------- ---------------- ------------
    db01       woo              OPEN
    db02       woo              OPEN


    Log analysis:

    OS messages:

    Oct 30 13:48:23 db01 kernel: lpfc 0000:1b:00.0: 1:(0):0203 Devloss timeout on WWPN 20:34:00:80:e5:3f:7b:f0 NPort x0000e4 Data: x0 x7 x0
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:0 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 7 [RAIDarray.mpp]oracledb:0:0 Path Failed
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:0 No new path: fall to failover controller case. vcmnd SN 74635 pdev H8:C0:T0:L0 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:0 Failed controller to 1. retry. vcmnd SN 74635 pdev H8:C0:T0:L0 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:0 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:0 No new path: fall to failover controller case. vcmnd SN 74625 pdev H8:C0:T0:L0 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:0 Failed controller to 1. retry. vcmnd SN 74625 pdev H8:C0:T0:L0 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74645 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74645 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74644 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74644 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74643 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74643 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74642 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74642 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74634 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74634 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74632 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74632 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74631 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74631 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74624 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74624 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:23 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74622 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74622 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74621 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74621 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74620 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74620 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:1 No new path: fall to failover controller case. vcmnd SN 74619 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:1 Failed controller to 1. retry. vcmnd SN 74619 pdev H8:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:2 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:2 No new path: fall to failover controller case. vcmnd SN 74641 pdev H8:C0:T0:L2 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:2 Failed controller to 1. retry. vcmnd SN 74641 pdev H8:C0:T0:L2 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:2 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:2 No new path: fall to failover controller case. vcmnd SN 74633 pdev H8:C0:T0:L2 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:2 Failed controller to 1. retry. vcmnd SN 74633 pdev H8:C0:T0:L2 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:2 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:2 No new path: fall to failover controller case. vcmnd SN 74623 pdev H8:C0:T0:L2 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:2 Failed controller to 1. retry. vcmnd SN 74623 pdev H8:C0:T0:L2 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:3 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:3 No new path: fall to failover controller case. vcmnd SN 74636 pdev H8:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:3 Failed controller to 1. retry. vcmnd SN 74636 pdev H8:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:3 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:3 No new path: fall to failover controller case. vcmnd SN 74626 pdev H8:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:3 Failed controller to 1. retry. vcmnd SN 74626 pdev H8:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:3 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:3 No new path: fall to failover controller case. vcmnd SN 74615 pdev H8:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:3 Failed controller to 1. retry. vcmnd SN 74615 pdev H8:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:3 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:3 No new path: fall to failover controller case. vcmnd SN 74610 pdev H8:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:3 Failed controller to 1. retry. vcmnd SN 74610 pdev H8:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:4 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:4 No new path: fall to failover controller case. vcmnd SN 74637 pdev H8:C0:T0:L4 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:4 Failed controller to 1. retry. vcmnd SN 74637 pdev H8:C0:T0:L4 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:4 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:4 No new path: fall to failover controller case. vcmnd SN 74630 pdev H8:C0:T0:L4 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:4 Failed controller to 1. retry. vcmnd SN 74630 pdev H8:C0:T0:L4 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:4 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:4 No new path: fall to failover controller case. vcmnd SN 74614 pdev H8:C0:T0:L4 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:4 Failed controller to 1. retry. vcmnd SN 74614 pdev H8:C0:T0:L4 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:4 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:4 No new path: fall to failover controller case. vcmnd SN 74609 pdev H8:C0:T0:L4 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:4 Failed controller to 1. retry. vcmnd SN 74609 pdev H8:C0:T0:L4 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:5 Selection Retry count exhausted
    Oct 30 13:48:24 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:5 No new path: fall to failover controller case. vcmnd SN 74638 pdev H8:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:24 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:5 Failed controller to 1. retry. vcmnd SN 74638 pdev H8:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:5 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:5 No new path: fall to failover controller case. vcmnd SN 74629 pdev H8:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:5 Failed controller to 1. retry. vcmnd SN 74629 pdev H8:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:5 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:5 No new path: fall to failover controller case. vcmnd SN 74616 pdev H8:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:5 Failed controller to 1. retry. vcmnd SN 74616 pdev H8:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:5 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:5 No new path: fall to failover controller case. vcmnd SN 74613 pdev H8:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:5 Failed controller to 1. retry. vcmnd SN 74613 pdev H8:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:6 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:6 No new path: fall to failover controller case. vcmnd SN 74639 pdev H8:C0:T0:L6 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:6 Failed controller to 1. retry. vcmnd SN 74639 pdev H8:C0:T0:L6 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:6 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:6 No new path: fall to failover controller case. vcmnd SN 74628 pdev H8:C0:T0:L6 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:6 Failed controller to 1. retry. vcmnd SN 74628 pdev H8:C0:T0:L6 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:6 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:6 No new path: fall to failover controller case. vcmnd SN 74618 pdev H8:C0:T0:L6 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:6 Failed controller to 1. retry. vcmnd SN 74618 pdev H8:C0:T0:L6 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:6 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:6 No new path: fall to failover controller case. vcmnd SN 74611 pdev H8:C0:T0:L6 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:6 Failed controller to 1. retry. vcmnd SN 74611 pdev H8:C0:T0:L6 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:7 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:7 No new path: fall to failover controller case. vcmnd SN 74640 pdev H8:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:7 Failed controller to 1. retry. vcmnd SN 74640 pdev H8:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:7 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:7 No new path: fall to failover controller case. vcmnd SN 74627 pdev H8:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:7 Failed controller to 1. retry. vcmnd SN 74627 pdev H8:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:7 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:7 No new path: fall to failover controller case. vcmnd SN 74617 pdev H8:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:7 Failed controller to 1. retry. vcmnd SN 74617 pdev H8:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:7 Selection Retry count exhausted
    Oct 30 13:48:25 db01 kernel: 496 [RAIDarray.mpp]oracledb:0:0:7 No new path: fall to failover controller case. vcmnd SN 74612 pdev H8:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 497 [RAIDarray.mpp]oracledb:0:0:7 Failed controller to 1. retry. vcmnd SN 74612 pdev H8:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
    Oct 30 13:48:25 db01 kernel: 10 [RAIDarray.mpp]oracledb:1 Failover command issued
    Oct 30 13:48:25 db01 kernel: 801 [RAIDarray.mpp]Failover succeeded to oracledb:1
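
    The kernel log above is long and repetitive, so a small script can help summarise it. The following is a minimal sketch, not part of the original troubleshooting: it assumes the syslog line layout shown above, treats the string after `[RAIDarray.mpp]` as a device identifier (its last field appears to match the `L0` to `L7` LUN in the `pdev` field), and uses only a handful of lines copied from the log as sample input. The names `summarize`, `LINE` and `RETRY` are made up for this illustration.

```python
import re
from collections import Counter
from datetime import datetime

# A few lines copied from the kernel log above (same host, same day).
SAMPLE = """\
Oct 30 13:48:23 db01 kernel: lpfc 0000:1b:00.0: 1:(0):0203 Devloss timeout on WWPN 20:34:00:80:e5:3f:7b:f0 NPort x0000e4 Data: x0 x7 x0
Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:0 Selection Retry count exhausted
Oct 30 13:48:23 db01 kernel: 7 [RAIDarray.mpp]oracledb:0:0 Path Failed
Oct 30 13:48:23 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
Oct 30 13:48:24 db01 kernel: 94 [RAIDarray.mpp]oracledb:0:0:1 Selection Retry count exhausted
Oct 30 13:48:25 db01 kernel: 10 [RAIDarray.mpp]oracledb:1 Failover command issued
Oct 30 13:48:25 db01 kernel: 801 [RAIDarray.mpp]Failover succeeded to oracledb:1
"""

# "Mon DD HH:MM:SS host kernel: message"; only the time of day is captured.
LINE = re.compile(r"^\w{3} +\d+ (\d\d:\d\d:\d\d) \S+ kernel: (.*)$")
RETRY = re.compile(r"\[RAIDarray\.mpp\](\S+) Selection Retry count exhausted")


def summarize(log_text):
    """Return (devloss time, failover-success time, retry-exhausted count per device)."""
    devloss = failover_ok = None
    exhausted = Counter()
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%H:%M:%S")
        msg = m.group(2)
        if "Devloss timeout" in msg:
            if devloss is None:  # keep the first occurrence only
                devloss = when
        elif "Failover succeeded" in msg:
            failover_ok = when
        else:
            hit = RETRY.search(msg)
            if hit:
                exhausted[hit.group(1)] += 1
    return devloss, failover_ok, exhausted


devloss, failover_ok, exhausted = summarize(SAMPLE)
print("devloss timeout -> failover succeeded:",
      (failover_ok - devloss).total_seconds(), "s")
for device, count in sorted(exhausted.items()):
    print(device, "retry-exhausted events:", count)
```

    Fed with the full log instead of the sample, the same function shows how many I/Os per device ran out of retries before the driver failed over to the other controller.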

    ASM alert log:

    Thu Oct 30 13:47:40 2014
    kjbdomdet send to inst 2
    detach from dom 4, sending detach message to inst 2
    WARNING: Offline for disk OCR_VOTE1 in mode 0x7f failed.
    WARNING: Offline for disk OCR_VOTE2 in mode 0x7f failed.
    WARNING: Offline for disk OCR_VOTE3 in mode 0x7f failed.
    WARNING: Offline for disk OCR_VOTE4 in mode 0x7f failed.
    WARNING: Offline for disk OCR_VOTE5 in mode 0x7f failed.
    Thu Oct 30 13:47:40 2014
    List of instances:
     1 2
    Dirty detach reconfiguration started (new ddet inc 1, cluster inc 4)
     Global Resource Directory partially frozen for dirty detach
    * dirty detach - domain 4 invalid = TRUE 
     17 GCS resources traversed, 0 cancelled
    Dirty Detach Reconfiguration complete
    Thu Oct 30 13:47:40 2014
    WARNING: dirty detached from domain 4
    NOTE: cache dismounted group 4/0x4558828B (OCR_VOT001) 
    SQL> alter diskgroup OCR_VOT001 dismount force /* ASM SERVER:1163428491 */ 
    Thu Oct 30 13:47:40 2014
    NOTE: cache deleting context for group OCR_VOT001 4/0x4558828b
    GMON dismounting group 4 at 19 for pid 28, osid 17335
    NOTE: Disk OCR_VOTE1 in mode 0x7f marked for de-assignment
    NOTE: Disk OCR_VOTE2 in mode 0x7f marked for de-assignment
    NOTE: Disk OCR_VOTE3 in mode 0x7f marked for de-assignment
    NOTE: Disk OCR_VOTE4 in mode 0x7f marked for de-assignment
    NOTE: Disk OCR_VOTE5 in mode 0x7f marked for de-assignment
    NOTE:Waiting for all pending writes to complete before de-registering: grpnum 4
    Thu Oct 30 13:47:58 2014
     Received dirty detach msg from inst 2 for dom 4
    Thu Oct 30 13:47:58 2014
    List of instances:
     1 2
    Dirty detach reconfiguration started (new ddet inc 2, cluster inc 4)
     Global Resource Directory partially frozen for dirty detach
    * dirty detach - domain 4 invalid = TRUE 
     0 GCS resources traversed, 0 cancelled
    freeing rdom 4
    Dirty Detach Reconfiguration complete
    Thu Oct 30 13:48:10 2014
    NOTE:Waiting for all pending writes to complete before de-registering: grpnum 4
    Thu Oct 30 13:48:24 2014
    ASM Health Checker found 1 new failures
    Thu Oct 30 13:48:26 2014
    Errors in file /DBSoft/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_6826.trc:
    ORA-15079: ASM file is closed
    Errors in file /DBSoft/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_6826.trc:
    ORA-15079: ASM file is closed
    Errors in file /DBSoft/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_6826.trc:
    ORA-15079: ASM file is closed
    Errors in file /DBSoft/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_6826.trc:
    ORA-15079: ASM file is closed
    Errors in file /DBSoft/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_6826.trc:
    ORA-15079: ASM file is closed
    WARNING: requested mirror side 1 of virtual extent 0 logical extent 0 offset 6754304 is not allocated; I/O request failed
    WARNING: requested mirror side 2 of virtual extent 0 logical extent 1 offset 6754304 is not allocated; I/O request failed
    WARNING: requested mirror side 3 of virtual extent 0 logical extent 2 offset 6754304 is not allocated; I/O request failed
    Errors in file /DBSoft/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_6826.trc:
    ORA-15079: ASM file is closed
    ORA-15079: ASM file is closed
    ORA-15079: ASM file is closed
    Thu Oct 30 13:48:26 2014
    SQL> alter diskgroup OCR_VOT001 check /* proxy */ 
    Thu Oct 30 13:48:40 2014
    SUCCESS: diskgroup OCR_VOT001 was dismounted
    SUCCESS: alter diskgroup OCR_VOT001 dismount force /* ASM SERVER:1163428491 */
    Thu Oct 30 13:48:40 2014
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "OCR_VOT001" does not exist or is not mounted
    ERROR: alter diskgroup OCR_VOT001 check /* proxy */
    Thu Oct 30 13:48:40 2014
    NOTE: diskgroup resource ora.OCR_VOT001.dg is offline
    SUCCESS: ASM-initiated MANDATORY DISMOUNT of group OCR_VOT001
    Thu Oct 30 13:48:46 2014
    NOTE: client exited [6814]
    Thu Oct 30 13:48:47 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17643] opening OCR file
    Thu Oct 30 13:48:49 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17656] opening OCR file
    Thu Oct 30 13:48:51 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17673] opening OCR file
    Thu Oct 30 13:48:53 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17720] opening OCR file
    Thu Oct 30 13:48:55 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17740] opening OCR file
    Thu Oct 30 13:49:02 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17760] opening OCR file
    Thu Oct 30 13:49:04 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17773] opening OCR file
    Thu Oct 30 13:49:06 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17790] opening OCR file
    Thu Oct 30 13:49:08 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17812] opening OCR file
    Thu Oct 30 13:49:10 2014
    NOTE: [crsd.bin@db01 (TNS V1-V3) 17826] opening OCR file
    Thu Oct 30 13:49:42 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18043] opening OCR file
    Thu Oct 30 13:49:43 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18087] opening OCR file
    Thu Oct 30 13:49:43 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18221] opening OCR file
    Thu Oct 30 13:49:44 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18274] opening OCR file
    Thu Oct 30 13:49:44 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18292] opening OCR file
    Thu Oct 30 13:49:45 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18416] opening OCR file
    Thu Oct 30 13:49:45 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18434] opening OCR file
    Thu Oct 30 13:49:46 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18536] opening OCR file
    Thu Oct 30 13:49:47 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18641] opening OCR file
    Thu Oct 30 13:49:47 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 18659] opening OCR file
    Thu Oct 30 13:53:36 2014
    WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
    WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
    WARNING: Waited 15 secs for write IO to PST disk 0 in group 2.
    WARNING: Waited 15 secs for write IO to PST disk 0 in group 2.
    WARNING: Waited 15 secs for write IO to PST disk 0 in group 3.
    WARNING: Waited 15 secs for write IO to PST disk 0 in group 3.
    Thu Oct 30 13:54:43 2014
    NOTE: [crsctl.bin@db01 (TNS V1-V3) 19810] opening OCR file
    

    CRS log:

    2014-10-30 13:48:26.715: [   CRSPE][1174640960]{2:1454:184} RI [ora.OCR_VOT001.dg db02 1] new target state: [OFFLINE] old value: [ONLINE]
    2014-10-30 13:48:26.716: [  CRSOCR][1166235968]{2:1454:184} Multi Write Batch processing...
    2014-10-30 13:48:26.716: [   CRSPE][1174640960]{2:1454:184} RI [ora.OCR_VOT001.dg db02 1] new internal state: [STOPPING] old value: [STABLE]
    2014-10-30 13:48:26.716: [   CRSPE][1174640960]{2:1454:184} Sending message to agfw: id = 3284
    2014-10-30 13:48:26.716: [   CRSPE][1174640960]{2:1454:184} CRS-2673: Attempting to stop 'ora.OCR_VOT001.dg' on 'db02'
    
    2014-10-30 13:48:26.720: [   CRSPE][1174640960]{2:1454:184} Received reply to action [Stop] message ID: 3284
    2014-10-30 13:48:26.725: [  OCRRAW][1166235968]proprior: Header check from OCR device 0 offset 6651904 failed (26).
    2014-10-30 13:48:26.725: [  OCRRAW][1166235968]proprior: Retrying buffer read from another mirror for disk group [+OCR_VOT001] for block at offset [6651904]
    2014-10-30 13:48:26.725: [  OCRASM][1166235968]proprasmres: Total 0 mirrors detected
    2014-10-30 13:48:26.725: [  OCRASM][1166235968]proprasmres: Only 1 mirror found in this disk group.
    2014-10-30 13:48:26.725: [  OCRASM][1166235968]proprasmres: Need to invoke checkdg. Mirror #0 has an invalid buffer. 
    2014-10-30 13:48:26.740: [   CRSPE][1174640960]{2:1454:184} Received reply to action [Stop] message ID: 3284
    2014-10-30 13:48:26.740: [   CRSPE][1174640960]{2:1454:184} RI [ora.OCR_VOT001.dg db02 1] new internal state: [STABLE] old value: [STOPPING]
    2014-10-30 13:48:26.740: [   CRSPE][1174640960]{2:1454:184} RI [ora.OCR_VOT001.dg db02 1] new external state [OFFLINE] old value: [ONLINE] label = [] 
    2014-10-30 13:48:26.740: [   CRSPE][1174640960]{2:1454:184} CRS-2677: Stop of 'ora.OCR_VOT001.dg' on 'db02' succeeded
    
    2014-10-30 13:48:26.740: [  CRSRPT][1176742208]{2:1454:184} Published to EVM CRS_RESOURCE_STATE_CHANGE for ora.OCR_VOT001.dg
    2014-10-30 13:48:40.891: [  OCRASM][1166235968]proprasmres: kgfoControl returned error [8]
    [  OCRASM][1166235968]SLOS : SLOS: cat=8, opn=kgfoCkDG01, dep=15032, loc=kgfokge
    
    2014-10-30 13:48:40.891: [  OCRASM][1166235968]ASM Error Stack : ORA-27091: unable to queue I/O
    ORA-15079: ASM file is closed
    ORA-06512: at line 4
    
    2014-10-30 13:48:55.542: [ CRSMAIN][199140176] Initializing OCR
    [   CLWAL][199140176]clsw_Initialize: OLR initlevel [70000]
    2014-10-30 13:48:55.805: [  OCRASM][199140176]proprasmo: Error in open/create file in dg [OCR_VOT001]
    [  OCRASM][199140176]SLOS : SLOS: cat=8, opn=kgfoOpen01, dep=15056, loc=kgfokge
    
    2014-10-30 13:48:55.805: [  OCRASM][199140176]ASM Error Stack : 
    2014-10-30 13:48:55.825: [  OCRASM][199140176]proprasmo: kgfoCheckMount returned [6]
    2014-10-30 13:48:55.825: [  OCRASM][199140176]proprasmo: The ASM disk group OCR_VOT001 is not found or not mounted
    2014-10-30 13:48:55.825: [  OCRRAW][199140176]proprioo: Failed to open [+OCR_VOT001]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
    2014-10-30 13:48:55.825: [  OCRRAW][199140176]proprioo: No OCR/OLR devices are usable
    2014-10-30 13:48:55.825: [  OCRASM][199140176]proprasmcl: asmhandle is NULL
    2014-10-30 13:48:55.826: [    GIPC][199140176] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 690], original from [clsss.c : 5343]
    2014-10-30 13:48:55.826: [ default][199140176]clsvactversion:4: Retrieving Active Version from local storage.
    2014-10-30 13:48:55.827: [ CSSCLNT][199140176]clssgsgrppubdata: group (ocr_db-cluster) not found
    2014-10-30 13:48:55.827: [  OCRRAW][199140176]proprio_repairconf: Failed to retrieve the group public data. CSS ret code [20]
    2014-10-30 13:48:55.830: [  OCRRAW][199140176]proprioo: Failed to auto repair the OCR configuration.
    2014-10-30 13:48:55.830: [  OCRRAW][199140176]proprinit: Could not open raw device 
    2014-10-30 13:48:55.830: [  OCRASM][199140176]proprasmcl: asmhandle is NULL
    2014-10-30 13:48:55.831: [  OCRAPI][199140176]a_init:16!: Backend init unsuccessful : [26]
    2014-10-30 13:48:55.832: [  CRSOCR][199140176] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage
    2014-10-30 13:48:55.832: [    CRSD][199140176] Created alert : (:CRSD00111:) :  Could not init OCR, error: PROC-26: Error while accessing the physical storage
    2014-10-30 13:48:55.832: [    CRSD][199140176][PANIC] CRSD exiting: Could not init OCR, code: 26
    2014-10-30 13:48:55.832: [    CRSD][199140176] Done.
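
    Putting the three logs on one timeline makes the ordering easier to see: ASM had already started force-dismounting OCR_VOT001 before the multipath driver finished failing over. The sketch below is only one reading of the logs, not the author's analysis. The timestamps come from the excerpts above; `OLD_WAIT = 15` is an assumption based on the "Waited 15 secs for write IO to PST disk" warnings in the ASM alert log, and the moment the I/O stall actually began is not recorded, so the estimated stall length is hypothetical as well.

```python
from datetime import datetime


def t(hms):
    """Parse an HH:MM:SS timestamp (all events are on the same day)."""
    return datetime.strptime(hms, "%H:%M:%S")


# (time, log, event) tuples taken from the three logs above.
EVENTS = [
    (t("13:47:40"), "ASM", "offline of OCR_VOTE1..5 fails, dismount force issued"),
    (t("13:48:23"), "OS ", "lpfc devloss timeout, mpp selection retries exhausted"),
    (t("13:48:25"), "OS ", "mpp failover to oracledb:1 succeeded"),
    (t("13:48:40"), "ASM", "diskgroup OCR_VOT001 dismounted"),
    (t("13:48:55"), "CRS", "CRSD exits: could not init OCR (PROC-26)"),
]

asm_gave_up = EVENTS[0][0]
path_restored = EVENTS[2][0]

# How long after ASM gave up did the surviving path become usable?
gap = (path_restored - asm_gave_up).total_seconds()

# ASSUMPTION: ASM waited OLD_WAIT seconds before giving up, so the I/O
# stall began OLD_WAIT seconds before 13:47:40. The logs do not record this.
OLD_WAIT = 15
NEW_WAIT = 120
stall = gap + OLD_WAIT


def survives(wait_seconds, stall_seconds):
    """True if ASM's wait outlasts the I/O stall."""
    return wait_seconds > stall_seconds


for when, source, what in EVENTS:
    print(when.strftime("%H:%M:%S"), source, what)
print("path restored", gap, "s after ASM gave up")
print("estimated stall:", stall, "s")
print("wait=15 survives:", survives(OLD_WAIT, stall))
print("wait=120 survives:", survives(NEW_WAIT, stall))
```

    Under these assumptions a 15-second wait expires well before the failover completes, while a 120-second wait does not, which is consistent with CRSD losing the OCR disk group in the failing test runs.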
    

    There are two ways to handle the fault:

    At the multipath failover layer, refer to the following parameters:

    FailOverQuiescenceTime:

    Quiescence Timeout before Failover (Mode Select Page 2C) command. The time, in seconds, the array will wait for a quiescence condition to clear for an explicit failover operation. A typical setting is 20 seconds.

    FailedPathCheckingInterval:

    This parameter defines how long (in seconds) the MPP driver should wait before initiating a path-validation action. The default value is 60 seconds.

    Example:

    [root@db01 ~]# cat /etc/mpp.conf
    VirtualDiskProductId=VirtualDisk
    DebugLevel=0x0
    NotReadyWaitTime=270
    BusyWaitTime=270
    QuiescenceWaitTime=270
    InquiryWaitTime=60
    MaxLunsPerArray=256
    MaxPathsPerController=4
    ScanInterval=60
    InquiryInterval=1
    MaxArrayModules=30
    ErrorLevel=3
    SelectionTimeoutRetryCount=0
    UaRetryCount=10
    RetryCount=10
    SynchTimeout=170
    FailOverQuiescenceTime=20
    FailoverTimeout=120
    FailBackToCurrentAllowed=1
    ControllerIoWaitTime=300
    ArrayIoWaitTime=600
    DisableLUNRebalance=0
    SelectiveTransferMaxTransferAttempts=5
    SelectiveTransferMinIOWaitTime=3
    IdlePathCheckingInterval=60
    RecheckFailedPathWaitTime=30
    FailedPathCheckingInterval=60
    ArrayFailoverWaitTime=300
    PrintSenseBuffer=0
    ClassicModeFailover=0
    AVTModeFailover=0
    LunFailoverDelay=3
    LoadBalancePolicy=1
    ImmediateVirtLunCreate=0
    BusResetTimeout=150
    LunScanDelay=2
    AllowHBAsgDevs=0
    S2ToS3Key=471f51f35ec5426e
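
    To check the two parameters discussed above without reading the whole file by eye, the `key=value` layout can be parsed with a few lines of Python. This is a minimal sketch under stated assumptions: the sample text is a subset of the `/etc/mpp.conf` listing above, `parse_mpp_conf` is a name made up for this example, and skipping `#` lines is a precaution rather than a documented feature of the file format.

```python
# Subset of the /etc/mpp.conf listing above, used as sample input.
SAMPLE_CONF = """\
VirtualDiskProductId=VirtualDisk
DebugLevel=0x0
FailOverQuiescenceTime=20
FailoverTimeout=120
FailedPathCheckingInterval=60
S2ToS3Key=471f51f35ec5426e
"""


def parse_mpp_conf(text):
    """Parse key=value lines into a dict; values stay strings because
    some are hex (DebugLevel) or opaque keys (S2ToS3Key)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


conf = parse_mpp_conf(SAMPLE_CONF)
for name in ("FailOverQuiescenceTime", "FailedPathCheckingInterval"):
    print(name, "=", int(conf[name]), "seconds")
```

    On a real host the same function could be given the contents of `/etc/mpp.conf` read from disk instead of `SAMPLE_CONF`.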


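    At the ASM detection-time layer:

    You only need to raise the ASM hidden parameter _asm_hbeatiowait. I set it straight to 120 and reran all five test groups; the problem did not recur, and the fault was resolved.

    (For how to view hidden parameter values, see: archive-1980)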

     

    Example:

    [root@db01 ~] # su - grid
    [grid@db01 ~] $ sqlplus sysasm/oracle
    SQL*Plus: Release 11.2.0.4.0 Production on Wed Nov 12 22:15:18 2014
    Copyright (c) 1982, 2013, Oracle.  All rights reserved.
    
    Connected to:
    Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
    With the Partitioning, OLAP, Data Mining and Real Application Testing options
    
    SQL> alter system set "_asm_hbeatiowait"=120 scope=spfile sid='*';
    System altered.
    SQL>

  • Original article: https://www.cnblogs.com/gavanwanggw/p/6972829.html