zoukankan      html  css  js  c++  java
  • RMAN备份-未使用catalog-控制文件丢失


    情况描述
    客户报告数据库故障,新来的系统管理员误操作。删掉了一些文件。具体情况是:删掉了所有重要数据文件、所有控制文件。数据库原来是归档模式,用rman备份数据,而rman 使用控制文件。 幸运的是,最后一次 rman full 备份是包括了控制文件在内。系统没有设定自动备份控制文件。现在状况是数据库无法启动。
    不用说,客户的备份方案不够完善,但是这时候再去说这些话责备用户有事后诸葛亮之嫌,"用户是上帝,不要去得罪他"。还有,客户有Full备份(虽然不是自动备份控制文件,这样无法用常规的恢复步骤来进行恢复)。这对我们来说是个绝对的好消息。
    下面我们通过一次模拟操作来演示这个问题的解决办法。
    背景知识
    在Oracle 816 以后的版本中,Oracle提供了一个包:DBMS_BACKUP_RESTORE.DBMS_BACKUP_RESTORE
    包是由dbmsbkrs.sql 和 prvtbkrs.plb 这两个脚本创建的.catproc.sql 脚本运行后会调用这两个包.所以是每个数据库都有的这个包是Oracle服务器和操作系统之间IO操作的接口.由恢复管理器直接调用。而且据说这两个脚本的功能是内建到Oracle的一些库文件中的.
    由此可见,我们可以在数据库 nomount 情况下调用这些package ,来达到我们的恢复目的。在dbmsbkrs.sql 和prvtbkrs.plb 这两个脚本中有详细的说明文档,出于篇幅问题,就不一一加以翻译了,但在下面会直接引用一些原文说明。
    关键的内容有:

    FUNCTION  deviceAllocate(
           type IN varchar2 default NULL
          ,name IN varchar2 default NULL
          ,ident IN varchar2 default NULL
          ,noio IN boolean default FALSE
          ,params IN varchar2 default NULL )
    RETURN varchar2;
    
    -- Describe the device to be used for sequential I/O. For device types where
    -- only one process at a time can use a device, this call allocates a device
    -- for exclusive use by this session. The device remains allocated until
    -- deviceDeallocate is called or session termination. The device can be used
    -- both for creating and restoring backups.
    --
    -- Specifying a device allocates a context that exists until the session
    -- terminates or deviceDeallocate is called. Only one device can be specified
    -- at a time for a particular session. Thus deviceDeallocate must be called
    -- before a different device can be specified. This is not a limitation since
    -- a session can only read or write one backup at a time.
    --
    -- The other major effect of allocating a device is to specify the name space
    -- for the backup handles (file names). The handle for a sequential file does
    -- not necessarily define the type of device used to write the file. Thus it
    -- is necessary to specify the device type in order to interpret the file
    -- handle. The NULL device type is defined for all systems. It is the file
    -- system supplied by the operating system. The sequential file handles are
    -- thus normal file names.
    --
    -- A device can be specified either by name or by type.
    -- If the type is specified but not the name, the system picks an
    -- available device of that type.
    -- If the name is specified but not the type, the type is determined
    -- from the device.
    -- If neither the type or the name is given, the backups are files in
    -- the operating system file system.
    
    -- Note that some types of devices, optical disks for example, can be shared
    -- by many processes, and thus do not really require allocation of the device
    -- itself. However we do need to allocate the context for accessing the
    -- device, and we do need to know the device type for proper interpretation
    -- of the file handle. Thus it is always necessary to make the device
    -- allocation call before making most other calls in this package.
    --
    -- Input parameters:
    -- type
    -- If specified, this gives the type of device to use for sequential
    -- I/O. The allowed types are port specific. For example a port may
    -- support the type "TAPE" which is implemented via the Oracle tape
    -- API. If no type is specified, it may be implied by specifying a
    -- particular device name to allocate. The type should be allowed to
    -- default to NULL if operating system files are to be used.
    --
    -- name
    -- If specified, this names a particular piece of hardware to use for
    -- accessing sequential files. If not specified, any available
    -- device of the correct type will be allocated. If the device cannot
    -- be shared, it is allocated to this session for exclusive use.
    -- The name should be allowed to default to NULL if operating system
    -- files are to be used.
    --
    -- ident
    -- This is the users identifier that he uses to name this device. It
    -- is only used to report the status of this session via
    -- dbms_application_info. This value will be placed in the CLIENT_INFO
    -- column of the V$SESSION table, in the row corresponding to the
    -- session in which the device was allocated. This value can also
    -- be queried with the dbms_application_info.read_client_info procedure.
    --
    -- noio
    -- If TRUE, the device will not be used for doing any I/O. This allows
    -- the specification of a device type for deleting sequential files
    -- without actually allocating a piece of hardware. An allocation for
    -- noio can also be used for issuing device commands. Note that some
    -- commands may actually require a physical device and thus will get
    -- an error if the allocate was done with noio set to TRUE.
    --
    -- params
    -- This string is simply passed to the device allocate OSD. It is
    -- completely port and device specific.
    --
    -- Returns:
    -- It returns a valid device type. This is the type that should be
    -- allocated to access the same sequential files at a later date. Note
    -- that this might not be exactly the same value as the input string.
    -- The allocate OSD may do some translation of the type passed in. The
    -- return value is NULL when using operating system files.
    
    
    PROCEDURE restoreControlfileTo(cfname IN varchar2);
    
    -- This copies the controlfile from the backup set to an operating system
    -- file. If the database is mounted, the name must NOT match any of the
    -- current controlfiles.
    --
    -- Input parameters:
    -- cfname
    -- Name of file to create or overwrite with the controlfile from the
    -- backup set.
    
    PROCEDURE restoreDataFileTo( dfnumber IN binary_integer
    ,toname IN varchar2 default NULL);
    --
    -- restoreDataFileTo creates the output file from a complete backup in the
    -- backup set. 
    restore_procedure
    如果您有兴趣可以去阅读一下这两个文件的注释说明.
    解决过程
    首先,用控制文件作数据库系统的全备份:
    C:WUTemp>rman target / 
     
    Recovery Manager: Release 9.2.0.1.0 -
    Production.
     
    Copyright (c) 1995, 2002, Oracle
    Corporation. All rights reserved.
     
    connected to target database: DEMO (DBID=3272375326)
     
    RMAN> run {
    2> allocate channel C1 type disk;
    3> backup full tag 'FullBackup' format 'd:KDE\%d_%u_%s_%p.dbf'
    database include current controlfile;
    4> sql ' alter system archive log current';
    5> release channel C1;
    6> }
     
    using target database controlfile
    instead of recovery catalog
    allocated channel: C1
    channel C1: sid=15 devtype=DISK
     
    Starting backup at 18-JUL-04
    channel C1: starting full datafile
    backupset
    channel C1: specifying datafile(s) in
    backupset
    including current SPFILE in backupset
    including current controlfile in
    backupset
    input datafile fno=00001
    name=D:ORACLEORADATADEMOSYSTEM01.DBF
    input datafile fno=00002 name=D:ORACLEORADATADEMOUNDOTBS01.DBF
    input datafile fno=00004
    name=D:ORACLEORADATADEMOEXAMPLE01.DBF
    input datafile fno=00009 name=D:ORACLEORADATADEMOXDB01.DBF
    input datafile fno=00005 name=D:ORACLEORADATADEMOINDX01.DBF
    input datafile fno=00008 name=D:ORACLEORADATADEMOUSERS01.DBF
    input datafile fno=00003 name=D:ORACLEORADATADEMODRSYS01.DBF
    input datafile fno=00006 name=D:ORACLEORADATADEMOODM01.DBF
    input datafile fno=00007 name=D:ORACLEORADATADEMOTOOLS01.DBF
    channel C1: starting piece 1 at
    18-JUL-04
    channel C1: finished piece 1 at
    18-JUL-04
    piece handle=D:KDEDEMO_01FR79OT_1_1.DBF comment=NONE
    channel C1: backup set complete, elapsed
    time: 00:01:17
    Finished backup at 18-JUL-04
     
    sql statement: alter system archive log
    current
     
    released channel: C1
    如上所示,我们做了一次数据库的Full备份.备份片中包括控制文件.注意上面输出内容的黑体部分.我们在后面的恢复操作中会用到.
    模拟错误,关掉实例,删掉所有的控制文件和所有的.DBF文件。然后starup会看到如下的出错信息:
    SQL> startup
    ORACLE instance started.
     
    Total System Global Area 152115804 bytes
    Fixed Size 453212 bytes
    Variable Size 100663296 bytes
    Database Buffers 50331648 bytes
    Redo Buffers 667648 bytes
    ORA-00205: error in identifying
    controlfile, check alert log for more info
    查看alert Log,应该是系统找不到控制文件.现在情形和客户问题一致.不过在继续讲述之前,我们还需要介绍一点背景知识.
    我们首先尝试恢复控制文件:
     
    SQL>startup force nomount;
     
    SQL> DECLARE
    2 devtype varchar2(256);
    3 done boolean;
    4 BEGIN
    5
    devtype:=sys.dbms_backup_restore.deviceAllocate(type=>'',ident=>'T1');
    6
    sys.dbms_backup_restore.restoreSetDatafile;
    7
    sys.dbms_backup_restore.restoreControlfileTo(cfname=>'d:oracleControl01.ctl');
    8
    sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>'D:KDEDEMO_01FR79OT_1_1.DBF',
    params=>null);
    9
    sys.dbms_backup_restore.deviceDeallocate;
    10 END;
    11 /
     
    PL/SQL procedure successfully completed.
    OK,控制文件恢复完成.对以上内容的解释:
    • 第五行 分配一个device channel,因为使用的操作系统文件,所以这里为空,如果是从磁带上恢复要用  "sbt_tape";
    • 第六行 指明开始restore ;
    • 第七行 指出待恢复文件目标存储位置;
    • 第八行 从哪个备份片中恢复;
    • 第九行 释放设备通道.
    不妨对以上操作的结果验证一下:
    SQL> host dir d:oracle
    Volume in drive D is DATA
    Volume Serial Number is DC79-57F8
    Directory of d:oracle
     
    07/18/2004 09:08 PM <DIR> .
    07/18/2004 09:08 PM <DIR> ..
    06/08/2004 03:21 PM <DIR> admin
    07/18/2004 09:08 PM 1,871,872
    CONTROL01.CTL 
    07/16/2004 11:27 AM <DIR> ORA92
    07/18/2004 09:02 PM <DIR> oradata
     
    这样,我们成功的restore了控制文件 .如果控制文件在Full备份之后单独做的,接下来关掉实例,拷贝控制文件到具体位置,然后rman 执行restore database;即可。 可是,我们这里的情况有些不同. 视丢失文件的情况而定,继续进行如下的恢复操作:
     
    SQL> DECLARE
    2 devtype varchar2(256);
    3 done boolean;
    4 BEGIN
    5
    devtype:=sys.dbms_backup_restore.deviceAllocate (type=>'',ident=>'t1');
    6 sys.dbms_backup_restore.restoreSetDatafile;
    7
    sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>01,toname=>'
    d:oracleoradatademoSYSTEM01.DBF');
    8
    sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>02,toname=>'
    d:oracleoradatademoUNDOTBS01.DBF');
    9 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>03,toname=>'
    d:oracleoradatademoDRSYS01.DBF');
    10
    sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>04,toname=>'
    d:oracleoradatademoEXAMPLE01.DBF');
    11
    sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>05,toname=>'
    d:oracleoradatademoINDX01.DBF');
    12
    sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>06,toname=>'
    d:oracleoradatademoODM01.DBF');
    13
    sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>07,toname=>'
    d:oracleoradatademoTOOLS01.DBF');
    14
    sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>08,toname=>'
    d:oracleoradatademoUSERS01.DBF');
    15
    sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>09,toname=>'
    d:oracleoradatademoXDB01.DBF');
    16
    sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>'
    D:KDEDEMO_01FR79OT_1_1.DBF',
    params=>null);
    17
    sys.dbms_backup_restore.deviceDeallocate;
    18 END;
    19 /
     
    PL/SQL procedure successfully completed.
    --我们的情形是所有的数据文件都丢失了,那就如法炮制 ........... --文件对应编号来自前面全备份时候的屏幕输出内容.所以,在备份的时候保留操作Log是个很好的习惯.
     
    SQL> startup force mount;
    ORACLE instance started.
     
    Total System Global Area 152115804 bytes
    Fixed Size 453212 bytes
    Variable Size 100663296 bytes
    Database Buffers 50331648 bytes
    Redo Buffers 667648 bytes
    Database mounted.
    SQL> Recover database using backup
    controlfile until cancel ;
    ORA-00279: change 243854 generated at
    07/18/2004 20:57:03 needed for thread 1
    ORA-00289: suggestion :
    D:KDEARC00002.001
    ORA-00280: change 243854 for thread 1 is
    in sequence #2
    Specify log: {<RET>=suggested |
    filename | AUTO | CANCEL}
     
    D:KDEARC00002.001
    ORA-00279: change 244089 generated at
    07/18/2004 20:58:18 needed for thread 1
    ORA-00289: suggestion :
    D:KDEARC00003.001
    ORA-00280: change 244089 for thread 1 is
    in sequence #3
    ORA-00278: log file
    'D:KDEARC00002.001' no longer needed for this recovery
     
    Specify log: {<RET>=suggested |
    filename | AUTO | CANCEL}
    cancel
    Media recovery cancelled.
     
    SQL> alter database open resetlogs;
     
    Database altered.
    最后,不得不resetlogs .
    然后,打扫战场,马上进行数据库的全备份。如果您是DBA的话,应该进一步制定并完善备份计划.亡羊补牢,为时未晚。
    总结一下
    • 控制文件在备份中意义重大,建议每次对其单独备份,如果数据库版本允许的话,应该设置为控制文件自动备 份。同时应该尽可能地增大CONTROL_FILE_RECORD_KEEP_TIME这个初始化参数的值。以便备份信息能更长时间的保留
    • 应该制定比较完善的备份计划,否则备份计划一旦出现缺口,将可能给系统带来灾难.记住, "可能出错的地方一定会出错".
    • 熟悉RMAN内部备份机制,对DBMS_BACKUP_RESTORE的用法有一定的掌握在关键时侯很有帮助.
    • 备份脚本应该对Log重定向并保存.以便在出错的查找有用信息.


    参考信息


    RMAN Recovery Without Recovery Catalog
    or Controlfiles by Bonnie Bizzaro 
    dbmsbkrs.sql 和 prvtbkrs.plb 文件说明注释(可在你的系统 $ORACLE_HOME/rdbms/admin/中找到.)
    相关链接
    本文的更多讨论,请看这里 - http://www.itpub.net/244345.html 
    本文的Blog讨论,请看这里 - http://blog.csdn.net/fenng/archive/2004/07/19/44945.aspx 
     
  • 相关阅读:
    cf914D. Bash and a Tough Math Puzzle(线段树)
    RNQOJ [stupid]愚蠢的矿工(树形依赖背包)
    BZOJ4552: [Tjoi2016&Heoi2016]排序(线段树 二分)
    多项式系数学习笔记
    BZOJ4653: [Noi2016]区间(线段树 双指针)
    洛谷P3372 【模板】线段树 1(树状数组)
    BZOJ3261: 最大异或和(可持久化trie树)
    BZOJ4260: Codechef REBXOR (01Tire树)
    Android 关于显示键盘,布局错乱网上顶的问题
    Java 输入流读取文本文件换行符问题
  • 原文地址:https://www.cnblogs.com/jackhub/p/3332197.html
Copyright © 2011-2022 走看看