zoukankan      html  css  js  c++  java
  • 案例:Standby RAC遭遇ORA-1157,1111,1110导致实例crash处理

    案例:Standby RAC遭遇ORA-1157,1111,1110导致实例crash处理
    环境:RHEL 6.5 + Oracle RAC 11.2.0.4 + Dataguard
    今天在实验环境的Pirmary RAC主库上做了一个增加表空间的操作,结果Standby RAC启动同步后直接crash,具体报错如下:

    Fri Aug 11 19:14:20 2017
    Completed: ALTER DATABASE MOUNT /* db agent *//* {1:53752:2} */
    ALTER DATABASE OPEN /* db agent *//* {1:53752:2} */
    AUDIT_TRAIL initialization parameter is changed to OS, as DB is NOT compatible for database opened with read-only access
    This instance was first to open
    Fri Aug 11 19:14:23 2017
    Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST
    Beginning Standby Crash Recovery.
    Serial Media Recovery started
    Fri Aug 11 19:14:24 2017
    Managed Standby Recovery starting Real Time Apply
    Fri Aug 11 19:14:25 2017
    Warning: VKTM detected a time drift.
    Time drifts can result in an unexpected behavior such as time-outs. Please check trace file for more details.
    Fri Aug 11 19:14:26 2017
    Primary database is in MAXIMUM PERFORMANCE mode
    Fri Aug 11 19:14:26 2017
    Primary database is in MAXIMUM PERFORMANCE mode
    RFS[1]: Assigned to RFS process 4031
    RFS[1]: Selected log 11 for thread 1 sequence 171 dbid -1785877518 branch 919999037
    Standby Crash Recovery aborted due to error 1111.
    Errors in file /opt/app/oracle/diag/rdbms/mynas/jyzhao1/trace/jyzhao1_ora_3923.trc:
    ORA-01111: name for data file 10 is unknown - rename to correct file
    ORA-01110: data file 10: '/opt/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00010'
    ORA-01157: cannot identify/lock data file 10 - see DBWR trace file
    ORA-01111: name for data file 10 is unknown - rename to correct file
    ORA-01110: data file 10: '/opt/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00010'
    Fri Aug 11 19:14:27 2017
    RFS[2]: Assigned to RFS process 4029
    RFS[2]: Selected log 13 for thread 1 sequence 170 dbid -1785877518 branch 919999037
    Fri Aug 11 19:14:28 2017
    RFS[3]: Assigned to RFS process 4043
    RFS[3]: Selected log 22 for thread 2 sequence 134 dbid -1785877518 branch 919999037
    RFS[4]: Assigned to RFS process 4041
    RFS[4]: Selected log 23 for thread 2 sequence 135 dbid -1785877518 branch 919999037
    Completed Standby Crash Recovery.
    Fri Aug 11 19:14:31 2017
    Abort recovery for domain 0
    Errors in file /opt/app/oracle/diag/rdbms/mynas/jyzhao1/trace/jyzhao1_ora_3923.trc:
    ORA-10458: standby database requires recovery
    ORA-01157: cannot identify/lock data file 10 - see DBWR trace file
    ORA-01111: name for data file 10 is unknown - rename to correct file
    ORA-01110: data file 10: '/opt/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00010'
    ORA-10458 signalled during: ALTER DATABASE OPEN /* db agent *//* {1:53752:2} */...
    Fri Aug 11 19:14:33 2017
    Shutting down instance (abort)
    License high water mark = 9
    USER (ospid: 4089): terminating the instance
    Instance terminated by USER, pid = 4089
    Fri Aug 11 19:14:34 2017
    Instance shutdown complete
    

    根据报错查到MOS相关文档:

    How to resolve ORA-01111 ORA-01110 ORA-01157 in a physical standby database (文档 ID 1416554.1)

    这里应该就可以猜测出原因基本就是由于主库建立一个文件,备库同步日志,没有创建正确。

    MOS 解决方案核心内容:

    SQL>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    SQL>ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=MANUAL SID='*';
    Rename the unknown datafile 97.
    
    SQL> ALTER DATABASE RENAME FILE '/u01/app/oracle/product/10.2.0.5/db_1/dbs/UNNAMED00097' to '</absolute path/real datafile name>';
    
    Create a empty datafile which same structure as the datafile 97 and it would need all archivelogs from time of creation for recovery
    
    Alter database create datafile '/u01/app/oracle/product/10.2.0.5/db_1/dbs/UNNAMED00097' as '+DATA_ACS4/' size <Actual size of datafile on Primary>
    Please note while giving the above command on ASM you also need to specify the action size of the datafile from the primary (Query v$datafiles for Bytes column)
    
    SQL>ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO SID='*';
    SQL>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
    SQL>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
    

    看起来和STANDBY_FILE_MANAGEMENT有关系,进一步查看show parameter STANDBY_FILE_MANAGEMENT

    On Primary:
    SYS@jyzhao1 > show parameter STANDBY_FILE_MANAGEMENT
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    standby_file_management              string      MANUAL
    SYS@jyzhao1 >
    
    On Standby:
    SQL> show parameter STANDBY_FILE_MANAGEMENT
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    standby_file_management              string      MANUAL
    SQL> 
    

    发现果然都是MANUAL。

    先处理当前问题,按照MOS的几种方法,选择适合我这里环境的方案:
    创建和问题文件相同结构的空文件,然后备库开启应用所有归档文件即可完成恢复。

    创建与错误数据文件结构相同的空文件,具体大小从主库查询,不加大小可能遭遇其他问题,详见上面MOS文档:
    Alter database create datafile '/opt/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00010' as '+DATA/' size 104857600;
    创建后,可以发现v$datafile已经正确:

    SQL> select name from v$datafile;
    
    NAME
    --------------------------------------------------------------------------------
    +DATA/mynas/datafile/system.258.951608183
    +DATA/mynas/datafile/sysaux.257.951608183
    +DATA/mynas/datafile/undotbs1.259.951608185
    +DATA/mynas/datafile/users.265.951608205
    +DATA/mynas/datafile/undotbs2.261.951608185
    +DATA/mynas/datafile/dbs_d_jingyu.262.951608185
    +DATA/mynas/datafile/dbs_i_jingyu.263.951608185
    +DATA/mynas/datafile/test.264.951608185
    +DATA/mynas/datafile/test2.260.951608185
    /opt/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00010
    
    10 rows selected.
    
    SQL> Alter database create datafile '/opt/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00010' as '+DATA/' size 104857600;
    
    Database altered.
    
    SQL> select name from v$datafile;
    
    NAME
    --------------------------------------------------------------------------------
    +DATA/mynas/datafile/system.258.951608183
    +DATA/mynas/datafile/sysaux.257.951608183
    +DATA/mynas/datafile/undotbs1.259.951608185
    +DATA/mynas/datafile/users.265.951608205
    +DATA/mynas/datafile/undotbs2.261.951608185
    +DATA/mynas/datafile/dbs_d_jingyu.262.951608185
    +DATA/mynas/datafile/dbs_i_jingyu.263.951608185
    +DATA/mynas/datafile/test.264.951608185
    +DATA/mynas/datafile/test2.260.951608185
    +DATA/mynas/datafile/dbs_d_hank.273.951774293
    
    10 rows selected.
    

    然后修改STANDBY_FILE_MANAGEMENT=AUTO,正常开启数据库,开启实时应用:

    SQL>ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO SID='*';
    SQL>ALTER DATABASE OPEN;
    SQL>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
    

    再测试增加数据文件,删除数据文件,都可以正常同步:

    SYS@jyzhao1 >create tablespace dbs_only_test;
    SYS@jyzhao1 >drop tablespace dbs_only_test including contents and datafiles;
    

    备库都会自动的正常添加删除,日志都会记录如下:

    Fri Aug 11 21:49:11 2017
    Media Recovery Waiting for thread 1 sequence 175 (in transit)
    Recovery of Online Redo Log: Thread 1 Group 11 Seq 175 Reading mem 0
      Mem# 0: +FRA/mynas/standbylog/standby_group_11.log
    Fri Aug 11 21:51:15 2017
    Successfully added datafile 11 to media recovery
    Datafile #11: '+DATA/mynas/datafile/dbs_only_test.275.951774665'
    Fri Aug 11 21:53:20 2017
    Recovery deleting file #11:'+DATA/mynas/datafile/dbs_only_test.275.951774665' from controlfile.
    Deleted Oracle managed file +DATA/mynas/datafile/dbs_only_test.275.951774665
    Recovery dropped tablespace 'DBS_ONLY_TEST'
    

    总结:DG环境,一定要注意参数STANDBY_FILE_MANAGEMENT=AUTO,否则可能遇到类似情况。

  • 相关阅读:
    hdu In Action
    hdu Bus Pass
    acm steps chapter6总结
    原创:扫雷游戏
    排列组合算法
    原创:泛型方法例子
    原创:QQ群发言统计
    菜鸟也做有道难题①
    查询时带Ntext或nvarchar(Max)到底有多影响速度
    使用正则要注意的几点
  • 原文地址:https://www.cnblogs.com/jyzhao/p/7348235.html
Copyright © 2011-2022 走看看