客户报告,用RMAN 的 duplicate 命令,在具备RAC环境的standby 端,创建standby 数据库时,失败。
报:ORA-19504、ORA-17502、ORA-15001、ORA-27140
执行的过程如下:
[oracle @ racddb001g ~] $ export ORACLE_SID = tmt011
[oracle @ racddb001g ~] $
[oracle @ racddb001g ~] $ sqlplus/as sysdba
SQL * Plus: Release 12.2.0.1.0 Production on Fri April 4 02:26:18 2021
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 --64bit Production
Connected to.
SQL> shutdown immediate
ORA-01507: database is not mounted
The ORACLE instance has been shut down.
SQL> startup nomount pfile ='/media/dg/stby_inittmt01.ora'
The ORACLE instance has started.
Total System Global Area 1593835520 bytes
Fixed Size 8421136 bytes
Variable Size 453985776 bytes
Database Buffers 848860800 bytes
Redo Buffers 294367808 bytes
SQL> exit
[oracle@racddb001g ~]$ export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'
[oracle@racddb001g ~]$ rman target 'sys@tmt01H' auxiliary /
RMAN> duplicate target database for standby dorecover nofilenamecheck;
Duplicate Db started at 2021/04/04 02:28:29
Channel: ORA_AUX_DISK_1 assigned
Channel ORA_AUX_DISK_1: SID = 30 Instance = tmt011 Device Type = DISK
The current log has been archived.
Memory script content:
{
set until scn 5645034;
restore clone standby controlfile;
}
Running a memory script
Execution command: SET until clause
restore is starting at 2021/04/02 02:28:40
Use of channel ORA_AUX_DISK_1
Channel ORA_AUX_DISK_1: Restoring control file
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
Failover to previous backup
Channel ORA_AUX_DISK_1: Restoring control file
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210326-01
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210326-01
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060827010-20210401-00
Channel ORA_AUX_DISK_1: ORA-19870: error during restore of backup piece /my/oracle/dbhome_1/dbs/c-2060827010-20210401-00
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece/media/dg/backup_db_07stlrd3_1_1
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece/media/dg/backup_db_07stlrd3_1_1
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece/media/dg/backup_db_07stlrd3_1_1
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece/media/dg/backup_db_07stlrd3_1_1
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup
Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210320-00
Channel ORA_AUX_DISK_1: ORA-19870: error during restore of backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210320-00
ORA-19505: failed to identify file "/my/oracle/dbhome_1/dbs/c-2060537070-20210320-00".
ORA-27037: unable to get file status
Failover to previous backup
RMAN-00571: ============================================== =============
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ============================================== =============
RMAN-03002: Duplicate Db command failed at 04/02/2021 02:45:39
RMAN-05501: abort copy of target database
RMAN-03015: error in stored script Memory Script
RMAN-06026: missing target--stop restore
RMAN-06024: cannot find backup or copy to restore control file
RMAN>
从上面的出错信息上看,这是duplicate过程创建辅助instance,向磁盘组 restore 控制文件失败了:
Channel ORA_AUX_DISK_1: Restoring control file ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl". ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl ORA-15001: diskgroup "DG001" does not exist or is not mounted ORA-27140: attach to post/wait facility failed Failover to previous backup
首先看看用户oracle 有没有对磁盘组的读写权限:
SQL> select NAME,STATE from v$asm_diskgroup; NAME ------------------ STATE ------------------ DG001 MOUNTED SQL> select name,PATH from v$asm_disk; NAME ------------------------------------- PATH ------------------------------------- DG001_0000 /dev/mapper/ora01
查看磁盘组对应的物理磁盘,发现映射到 /dev/dm-11 设备。
$ ls -l /dev/mapper/ora01 lrwxrwxrwx 1 root root 8 Mar 27 00:52 /dev/mapper/ora01 -> ../dm-11
该设备的 owner 和组,分别是 grid:asmadmin。
$ ls -l /dev/dm-* brw-rw---- 1 grid asmadmin 253, 10 Mar 27 00:58 /dev/dm-10 brw-rw---- 1 grid asmadmin 253, 11 Mar 27 00:58 /dev/dm-11
看一下 grid 用户的组:
# su - grid $ id uid=10000(grid) gid=11000(oinstall) groups=11000(oinstall),11002(asmadmin) ,11003(asmdba),11004(asmoper)
再看一下oracle 用户的组:
# su - oracle $ id uid=10001(oracle) gid=11000(oinstall) groups=11000(oinstall),11001(dba),11003(asmdba),11005(racdba),11006(backupdba),11007(dgdba),11008(kmdba),11009(oper)
发现 oracle 用户,根本不在 asmadmin 组里。可能是因为这个原因,没有办法访问数据库。
请客户把 oracle 用户也加入到 asmadmin 中,但是再次执行duplicate 没有什么变化:
usermod -a -G asmadmin oracle
现在,还需要考虑其它的原因,就是 oracle 可执行文件,是否有权限以 grid 用户身份运行。
<primary> # su - oracle $ ls -l $ORACLE_HOME/bin/oracle -rwsr-s--x 1 oracle asmadmin 408674152 3月 19 00:22 /my/oracle/dbhome_1/bin/oracle [root@rachdb001g ~]# su - grid 最終ログイン: 2021/04/01 (木) 16:29:18 JST [grid@rachdb001g ~]$ ls -l $ORACLE_HOME/bin/oracle -rwsr-s--x 1 grid oinstall 373409344 Mar 18 03:05 /opt/oracle/grid/12.2.0/grid/bin/oracle [grid@rachdb001g ~]$ <standby> #su - oracle $ ls -l $ORACLE_HOME/bin/oracle -rwsr-s--x 1 oracle asmadmin 408674152 3月 24 22:19 /my/oracle/dbhome_1/bin/oracle # su - grid $ ls -l $ORACLE_HOME/bin/oracle -rwxr-x--x 1 grid oinstall 373409344 Mar 20 01:56 /opt/oracle/grid/12.2.0/grid/bin/oracle
可以看到,主库和备库上的 grid 用户的 $ORACLE_HOME/bin/oracle 的权限是不一样的。
一个是: -rwsr-x--x ,一个是 -rwxr-x--x。
需要进行设置:
chown grid:oinstall $GI_HOME/bin/oracle chmod 6751 $GI_HOME/bin/oracle
重新启动 standby 端的辅助instance,再次执行 duplicate, 已经可以成功执行。