不完全数据库恢复
到目前为止,前面讨论的都是完全恢复数据库,这是recover database ecover tablespace ecover datafile默认的行为特征。
所谓完全恢复指在介质恢复阶段(执行recover)必须应用所有的应该应用的重做日志:它们包括从数据文件头中的检查点RBA所指向的重做记录开始,
一致到在线的当前的CURRENT redo log结束,不论在归档文件还是redo中,recover将尽可能寻找这些记录,最后返回
media recovery complete
前面章节讨论没有丢失任何一个归档跟redo日志,所以都是完全恢复总是成功的。
不完全恢复在介质恢复阶段不需要应用所有的重做记录,而只需要一部分:介质恢复的起点必须是数据文件头中的检查点RBA所指向的重做记录,终点可以
是之后的任意一条重做记录,这些重做记录可以在归档日志或是在线日志中。不完全恢复的结果是将数据库恢复到“过去”的一个时间点,而不像完全恢复那样将db恢复到当前时间点。
完全恢复不会丢失任何一个已经提交的事务,而不完全恢复可能丢失已提交的事务。
12.1 工作原理
不完全恢复的结果是一样的,2个特点:
--1 不完全恢复极可能造成已经提交的事务丢失,即将db回退到过去的一个时间点,导致某些变更就像根本没有发生过一样。因此,不完全恢复也被称为指定时间目标(point in time)恢复,PIT恢复
--2 不完全恢复的影响范围一定是数据库范围的,即只有recover database命令支持不完全恢复。
由于极可能造成事务丢失,并且影响整个数据库,绝非必要,否则不应该使用不完全恢复解决问题。
主动与被动不完全恢复
--被动恢复:原计划完全恢复,但是在恢复过程中发现recover需要的一个归档日志或处于active或current的redo log丢失或损坏,结果被迫以不完全恢复,否则db无法打开。
--主动恢复:由于人为错误破坏了数据库内的大量信息(比如update,truncate重要的表),为了找回正确的信息使数据库回到破坏操作之前采用的不完全恢复。
PIT的4种表达形式
--1 精确到重做日志的基于日志序列号recover database until sequence,仅适用RMAN
比如,RECOVER DATABASE UNTIL SEQUENCE 100视图将db恢复至99号日志的最后一条重做记录,不包括100号日志。
--2 精确到重做日志的基于交互式手动喊停的recover database until cancel,仅适用sqlplus
--3 精确到重做记录的基于时间的recover database until time,适用于RMAN和sqlplus。
比如,recover database until time ‘sysdate-1’ 试图将db恢复至大于24小时之前
--4 精确到重做记录的基于SCN的recover database until scn和recover database until restore point,适用于RMAN;recover database until change 和recover database until restore point适用于sqlplus。
比如recover database until scn 10000试图将db恢复至PIT点SCN10000
为了使不完全恢复能够成功,在recover执行之前的restore命令有一个非常重要的注意事项,那就是必须还原足够“旧”的备份中的数据文件。因为recover命令使用的是重做记录,重做记录只能将数据向“前”推,增大其头部的检查点SCN。
假设不完全恢复PIT点的SCN是70000,但是restore命令还原的数据文件头部中的SCN是80000,这会导致recover database until scn 70000抛出RMAN-06556异常,意思是数据文件不够“旧”,
即restore命令还原的数据文件头SCN不够小,其值应该小于或等于70000,recover命令才能成功。
12.2 被动恢复
场景1:数据库有备份,运行时数据文件yhqt01.dbf丢失,并且某些在备份后产生的未备份的归档日志丢失,应用报错ORA-01116无法打开10号文件
丢失的是普通数据文件,所以
RMAN> run { sql'alter database datafile 10 offline'; restore datafile 10; recover datafile 10; sql'alter database datafile 10 online';} Starting recover at 25-JUL-19 using channel ORA_DISK_1 starting media recovery archived log for thread 1 with sequence 84 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_84_gml7nj0y_.arc archived log for thread 1 with sequence 85 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_85_gml7nk2o_.arc archived log for thread 1 with sequence 88 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_88_gml7o84w_.arc RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of recover command at 07/25/2019 11:26:00 RMAN-06053: unable to perform media recovery because of missing log RMAN-06025: no backup of archived log for thread 1 with sequence 87 and starting SCN of 1013521 found to restore RMAN-06025: no backup of archived log for thread 1 with sequence 86 and starting SCN of 1013510 found to restore
--由于手工删除了86.87号归档日志
查看当前在线日志的情况
SQL> select group#,sequence#,status from v$log; GROUP# SEQUENCE# STATUS ---------- ---------- ---------------- 1 88 INACTIVE 2 89 CURRENT 3 87 INACTIVE
查看被还原的10号文件yhqt01.dbf头部的SCN号,该数据文件头部的SCN即使recover命令恢复的起点
SQL> select file#,change# from v$recover_file; FILE# CHANGE# ---------- ---------- 10 1006178
使用SCN号是1006178去咨询一下v$log_history视图,得到recover datafile 10的起点在哪个日志
SQL> select sequence# from v$log_history where 1006178 between first_change# and next_change#-1; SEQUENCE# ---------- 82
结果是82号日志
根据上面3个查询,刚才使用的recover datafile 10,需要从82号日志恢复到89号日志,其中88,89为在线日志,RMAN报告86,87号归档日志找不到,既然不能成功地从82号一直恢复到89号,recover命令异常。
接下来去归档路径下找一找,发现确实找不到,这里人为的rm掉了
[oracle@DSI 2019_07_25]$ ll total 5148 -rw-r----- 1 oracle oinstall 5262336 Jul 25 11:14 o1_mf_1_84_gml7nj0y_.arc -rw-r----- 1 oracle oinstall 2048 Jul 25 11:14 o1_mf_1_85_gml7nk2o_.arc -rw-r----- 1 oracle oinstall 2560 Jul 25 11:14 o1_mf_1_88_gml7o84w_.arc
再使用RMAN查看归档备份的情况
RMAN> list backup of archivelog all;
那么oracle认为86 87号归档日志在哪里呢
RMAN> list archivelog low sequence 86 high sequence 87; List of Archived Log Copies for database with db_unique_name ORCL ===================================================================== Key Thrd Seq S Low Time ------- ---- ------- - --------- 11 1 86 A 25-JUL-19 Name: /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_86_gml7o58q_.arc 12 1 87 A 25-JUL-19 Name: /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_87_gml7o64p_.arc
这种情况下,似乎只能将yhqt01.dbf文件恢复到85号日志的最后一条重做记录,
RMAN> recover datafile 10 until sequence 86;
命令执行成功。但是不要误认为oracle可以支持单独的一个数据文件进行不完全恢复。
上线10号文件报错
RMAN> sql'alter database datafile 10 online'; sql statement: alter database datafile 10 online RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03009: failure of sql command on default channel at 07/25/2019 11:36:40 RMAN-11003: failure during parse/execution of SQL statement: alter database datafile 10 online ORA-01113: file 10 needs media recovery ORA-01110: data file 10: '/u01/app/oracle/oradata/orcl/yhqt01.dbf'
错误RMAN-11003表示yhqt01.dbf文件和其他数据文件无法同步,还需要更多恢复,需要86号日志,但是86号日志已经丢失找不到了。这个时候,
有两种办法,要么在也不上线yhqt,不访问yhqt01.dbf数据文件。要么在数据库范围上执行一次真正的不完全恢复,使所有数据文件倒退至85号文件的最后一条重做日志,保证所有数据文件之间的一致性。
RMAN> recover database until sequence 86; Starting recover at 25-JUL-19 using channel ORA_DISK_1 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of recover command at 07/25/2019 11:40:03 RMAN-06556: datafile 1 must be restored from backup older than SCN 1013510
直接执行会报错,因为其他数据文件不够旧
关闭数据库
SQL> shutdown abort; ORACLE instance shut down.
RMAN> run { startup force mount; restore database; }2> 3> 4> Oracle instance started database mounted Total System Global Area 784998400 bytes Fixed Size 2257352 bytes Variable Size 499125816 bytes Database Buffers 276824064 bytes Redo Buffers 6791168 bytes Starting restore at 25-JUL-19 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=133 device type=DISK channel ORA_DISK_1: starting datafile backup set restore channel ORA_DISK_1: specifying datafile(s) to restore from backup set channel ORA_DISK_1: restoring datafile 00001 to /u01/app/oracle/oradata/orcl/system01.dbf channel ORA_DISK_1: restoring datafile 00002 to /u01/app/oracle/oradata/orcl/sysaux01.dbf channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/orcl/users01.dbf channel ORA_DISK_1: restoring datafile 00005 to /u01/app/oracle/oradata/orcl/test01.dbf channel ORA_DISK_1: restoring datafile 00008 to /u01/app/oracle/oradata/orcl/rc_data01.dbf channel ORA_DISK_1: reading from backup piece /home/oracle/backup/bol_fullbak01u7hht8_1_1_20190725 channel ORA_DISK_1: piece handle=/home/oracle/backup/bol_fullbak01u7hht8_1_1_20190725 tag=BOL_FULLBAK channel ORA_DISK_1: restored backup piece 1 channel ORA_DISK_1: restore complete, elapsed time: 00:00:07 channel ORA_DISK_1: starting datafile backup set restore channel ORA_DISK_1: specifying datafile(s) to restore from backup set channel ORA_DISK_1: restoring datafile 00003 to /u01/app/oracle/oradata/orcl/undotbs01.dbf channel ORA_DISK_1: restoring datafile 00006 to /u01/app/oracle/oradata/orcl/assm01.dbf channel ORA_DISK_1: restoring datafile 00007 to /u01/app/oracle/oradata/orcl/mssm01.dbf channel ORA_DISK_1: restoring datafile 00009 to /u01/app/oracle/oradata/orcl/ogg01.dbf channel ORA_DISK_1: restoring datafile 00010 to /u01/app/oracle/oradata/orcl/yhqt01.dbf channel ORA_DISK_1: reading from backup piece /home/oracle/backup/bol_fullbak02u7hhtf_1_1_20190725 channel ORA_DISK_1: piece handle=/home/oracle/backup/bol_fullbak02u7hhtf_1_1_20190725 tag=BOL_FULLBAK channel ORA_DISK_1: restored backup piece 1 channel ORA_DISK_1: restore complete, elapsed time: 00:00:07 Finished restore at 25-JUL-19
利用动态视图检查所还原的数据文件是否足够“旧”但又不至于“太旧”。
SQL> select 'restore check ok' "restore_check" from (select max(next_change#-1) tail,min(first_change#) head from v$log_history where sequence# between 82 and 86) lh where (select max(change#) from v$recover_file) <= lh.tail and (select min(change#) from 2 3 4 5 6 7 8 9 v$recover_file) >= lh.head; restore_check ---------------- restore check ok
restore database默认从最新的备份中还原,在82号日志之后是备份过数据库,并且没有在85号日志之后备份过数据库。
实际情况应该在restore中给出from关键字指定哪个备份集还原
RMAN> run { 2> set until sequence 86; 3> restore database; 4> } executing command: SET until clause Starting restore at 25-JUL-19 using channel ORA_DISK_1 skipping datafile 1; already restored to file /u01/app/oracle/oradata/orcl/system01.dbf skipping datafile 2; already restored to file /u01/app/oracle/oradata/orcl/sysaux01.dbf skipping datafile 4; already restored to file /u01/app/oracle/oradata/orcl/users01.dbf skipping datafile 5; already restored to file /u01/app/oracle/oradata/orcl/test01.dbf skipping datafile 8; already restored to file /u01/app/oracle/oradata/orcl/rc_data01.dbf skipping datafile 3; already restored to file /u01/app/oracle/oradata/orcl/undotbs01.dbf skipping datafile 6; already restored to file /u01/app/oracle/oradata/orcl/assm01.dbf skipping datafile 7; already restored to file /u01/app/oracle/oradata/orcl/mssm01.dbf skipping datafile 9; already restored to file /u01/app/oracle/oradata/orcl/ogg01.dbf skipping datafile 10; already restored to file /u01/app/oracle/oradata/orcl/yhqt01.dbf restore not done; all files read only, offline, or already restored Finished restore at 25-JUL-19
在restore前使用set until可以有效防止还原不够“旧”的备份。这里还原的数据文件头部的SCN必须小于86号日志的第一条重做记录的SCN。
RMAN> recover database until sequence 86; Starting recover at 25-JUL-19 using channel ORA_DISK_1 starting media recovery archived log for thread 1 with sequence 84 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_84_gml7nj0y_.arc archived log for thread 1 with sequence 85 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_85_gml7nk2o_.arc channel ORA_DISK_1: starting archived log restore to default destination channel ORA_DISK_1: restoring archived log archived log thread=1 sequence=82 channel ORA_DISK_1: restoring archived log archived log thread=1 sequence=83 channel ORA_DISK_1: reading from backup piece /home/oracle/backup/arch_06u7hhtm_1_1_20190725 channel ORA_DISK_1: piece handle=/home/oracle/backup/arch_06u7hhtm_1_1_20190725 tag=TAG20190725T110004 channel ORA_DISK_1: restored backup piece 1 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_82_gml9gk3o_.arc thread=1 sequence=82 channel default: deleting archived log(s) archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_82_gml9gk3o_.arc RECID=16 STAMP=1014551121 archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_83_gml9gk46_.arc thread=1 sequence=83 channel default: deleting archived log(s) archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_83_gml9gk46_.arc RECID=17 STAMP=1014551121 archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_84_gml7nj0y_.arc thread=1 sequence=84 archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_85_gml7nk2o_.arc thread=1 sequence=85 media recovery complete, elapsed time: 00:00:00 Finished recover at 25-JUL-19
以上的命令也能这样执行
RMAN> run{
set until sequence 86;
recover database;
}
至此,从86号日志第一条重做开始其后的所有重做记录描述的变更均未生效,数据文件中没有这些结果的变更。
SQL> alter database open resetlogs; Database altered. SQL> select sequence#,status from v$log; SEQUENCE# STATUS ---------- ---------------- 1 CURRENT 0 UNUSED 0 UNUSED
整个事故到此尚未结束,对业务来说需要将85号日志以后提交的事务重新录入数据库。
上面的步骤总体可以总结为:
RMAN> run { startup force mount; set until sequence 86; restore database; recover database; alter database open resetlogs; }
另外,若在RAC环境下,为了避免PIT点产生歧义,在set until时指定线程号
RMAN> run { startup force mount; set until sequence 86 thread 1; restore database; recover database; alter database open resetlogs; }
话外篇
此时查询yhqt依然报错
SQL> conn yhqt/**** Connected. SQL> select count(*) from yhqtest_3; select count(*) from yhqtest_3 * ERROR at line 1: ORA-00376: file 10 cannot be read at this time ORA-01110: data file 10: '/u01/app/oracle/oradata/orcl/yhqt01.dbf' SQL> select file#||chr(9)||status||chr(9)||name||chr(9)||bytes from v$datafile where file#=10; FILE#||CHR(9)||STATUS||CHR(9)||NAME||CHR(9)||BYTES -------------------------------------------------------------------------------- 10 OFFLINE /u01/app/oracle/oradata/orcl/yhqt01.dbf 52428800 SQL> alter database datafile 10 online; alter database datafile 10 online * ERROR at line 1: ORA-01190: control file or data file 10 is from before the last RESETLOGS ORA-01110: data file 10: '/u01/app/oracle/oradata/orcl/yhqt01.dbf' SQL> set linesize 999
查看相关的检查点
--数据文件检查点 SQL> select file#,CREATION_CHANGE#,CHECKPOINT_CHANGE# ,UNRECOVERABLE_CHANGE#,LAST_CHANGE#,OFFLINE_CHANGE#,STATUS from v$datafile order by 1; FILE# CREATION_CHANGE# CHECKPOINT_CHANGE# UNRECOVERABLE_CHANGE# LAST_CHANGE# OFFLINE_CHANGE# STATUS ---------- ---------------- ------------------ --------------------- ------------ --------------- ------- 1 7 1013532 0 1013510 SYSTEM 2 1786 1013532 0 1013510 ONLINE 3 2807 1013532 0 1013510 ONLINE 4 15626 1013532 0 1013510 ONLINE 5 923171 1013532 0 1013510 ONLINE 6 923455 1013532 0 1013510 ONLINE 7 923728 1013532 0 1013510 ONLINE 8 924001 1013532 0 1013510 ONLINE 9 924272 1013532 0 1013510 ONLINE 10 986669 1013531 0 1013532 1013510 OFFLINE --数据文件头检查点 SQL> select checkpoint_change# from v$datafile_header; CHECKPOINT_CHANGE# ------------------ 1013532 1013532 1013532 1013532 1013532 1013532 1013532 1013532 1013532 1013510 --系统检查点 SQL> select checkpoint_change# from v$database; CHECKPOINT_CHANGE# ------------------ 1013532
查看incrarnation
RMAN> list incarnation; List of Database Incarnations DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time ------- ------- -------- ---------------- --- ---------- ---------- 1 1 ORCL 1542322764 PARENT 1 25-JUL-19 2 2 ORCL 1542322764 CURRENT 1013511 25-JUL-19
10号数据文件的头部检查点还在reset scn的前面
恢复10号文件
SQL> recover datafile 10; Media recovery complete. SQL> select checkpoint_change# from v$datafile_header; CHECKPOINT_CHANGE# ------------------ 1013532 1013532 1013532 1013532 1013532 1013532 1013532 1013532 1013532 1013532 10 rows selected. SQL> alter database datafile 10 online; Database altered. SQL> select count(*) from yhqtest_3; COUNT(*) ---------- 5001