很多朋友一直对DataGuard的fast-start failover持怀疑的态度;通过observer的观察,(Fast-Start Failover)FSFO提供了一种在primary数据库不可用情况下自动故障切换到standby数据库的能力。造成很多朋友不愿意使用FSFO的原因之一是故障切换后不得不重建原primary数据库,不过如果我们能配合使用10g中的闪回数据库特性的话,这一重建数据库的工作便也可以一劳永逸了。
/* 启用fast_start failover的前提之一是在primary和standby库上都启用flashback database */
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
/* 当前的primary库情况 */
SQL> select open_mode ,database_role,flashback_on from v$database;
OPEN_MODE DATABASE_ROLE FLASHBACK_ON
-------------------- ---------------- ------------------
READ WRITE PRIMARY YES
SQL> alter session set nls_date_format='YYYY-MM-DD hh24:mi:ss';
Session altered.
SQL> select OLDEST_FLASHBACK_TIME from v$flashback_database_log;
OLDEST_FLASHBACK_TI
-------------------
2011-02-19 22:40:39
/* 当前的standby库情况 */
SQL> select open_mode ,database_role,flashback_on from v$database;
OPEN_MODE DATABASE_ROLE FLASHBACK_ON
-------------------- ---------------- ------------------
READ ONLY WITH APPLY PHYSICAL STANDBY YES
SQL> alter session set nls_date_format='YYYY-MM-DD hh24:mi:ss';
Session altered.
SQL> select OLDEST_FLASHBACK_TIME from v$flashback_database_log;
OLDEST_FLASHBACK_TI
-------------------
2011-02-19 22:34:56
DGMGRL> edit configuration set property faststartfailoverthreshold=10;
Property "faststartfailoverthreshold" updated
DGMGRL> show configuration verbose;
Configuration - pro
Protection Mode: MaxAvailability
Databases:
sbdb - Primary database
prod - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
/* 另开一个终端启动observer */
DGMGRL> start observer
Observer started
DGMGRL> show fast_start failover
Fast-Start Failover: DISABLED
Threshold: 10 seconds
Target: (none)
Observer: rh3.oracle.com
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
/* 启动快速故障切换 */
DGMGRL> enable fast_start failover
Enabled.
/* 记录实际切换前的时间 */
[maclean@rh3 ~]$ date
Sat Feb 19 23:35:44 CST 2011
[maclean@rh2 ~]$ ps -ef|grep pmon|grep -v grep
maclean 25165 1 0 22:58 ? 00:00:00 ora_pmon_SBDB
maclean 26080 1 0 23:04 ? 00:00:00 ora_pmon_PROD
[maclean@rh2 ~]$ kill -9 25165
/* 通过kill pmon进程造成主库crash,引发fast-start failover */
/* observer观察到primary库意外终止后,实施了快速故障切换 */
23:37:27.92 Saturday, February 19, 2011
Initiating Fast-Start Failover to database "prod"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "prod"
23:37:33.69 Saturday, February 19, 2011
23:39:17.98 Saturday, February 19, 2011
Initiating reinstatement for database "sbdb"...
Reinstating database "sbdb", please wait...
Error: ORA-16653: failed to reinstate database
Failed.
Reinstatement of database "sbdb" failed
23:39:26.30 Saturday, February 19, 2011
23:40:05.00 Saturday, February 19, 2011
Initiating reinstatement for database "sbdb"...
Reinstating database "sbdb", please wait...
Error: ORA-16653: failed to reinstate database
Failed.
Reinstatement of database "sbdb" failed
23:40:09.24 Saturday, February 19, 2011
/* 接下来我们通过闪回数据库来将失败切换后的primary库flashback到faliover之前以便重用*/
SQL> startup mount;
SQL> flashback database to timestamp to_timestamp('2011-02-19 23:30:44','YYYY-MM-DD hh24:mi:ss');
flashback database to timestamp to_timestamp('2011-02-19 23:35:44','YYYY-MM-DD hh24:mi:ss')
*
ERROR at line 1:
ORA-38754: FLASHBACK DATABASE not started; required redo log is not available
ORA-38762: redo logs needed for SCN 1143987 to SCN End-of-Redo
ORA-38761: redo log sequence 10 in thread 1, incarnation 3 could not be
accessed
/* 缺少最近的归档日志而无法完成闪回,需要到目前的primary库中查找 */
SQL> select name from v$archived_log where FIRST_CHANGE#<1143987 and NEXT_CHANGE#>1143987;
NAME
--------------------------------------------------------------------------------
/s01/fast_recovery_area/PROD/archivelog/2011_02_19/o1_mf_1_10_6ozpzh9c_.arc
SQL> alter database register physical logfile '/s01/fast_recovery_area/PROD/archivelog/2011_02_19/o1_mf_1_10_6ozpzh9c_.arc';
Database altered.
SQL> flashback database to timestamp to_timestamp('2011-02-19 23:30:44','YYYY-MM-DD hh24:mi:ss');
Flashback complete.
/* 成功闪回后,observer将主动去尝试reinstate这个目前可用的standby库,如以下日志*/
23:46:20.16 Saturday, February 19, 2011
Initiating reinstatement for database "sbdb"...
Reinstating database "sbdb", please wait...
Operation requires shutdown of instance "SBDB" on database "sbdb"
Shutting down instance "SBDB"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "SBDB" on database "sbdb"
Starting instance "SBDB"...
ORACLE instance started.
Database mounted.
Continuing to reinstate database "sbdb" ...
Reinstatement of database "sbdb" succeeded
23:47:13.49 Saturday, February 19, 2011
DGMGRL> show configuration verbose
Configuration - pro
Protection Mode: MaxAvailability
Databases:
prod - Primary database
sbdb - (*) Physical standby database
(*) Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 10 seconds
Target: sbdb
Observer: rh3.oracle.com
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configuration Status:
SUCCESS
/*可以通过show configuration命令监控到当前primary与standby库都处于可用状态,
且fast_start failover也为启用状态 */