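  • GoldenGate Upgrade: Upgrading the Target (Replicat) Side

    Reposted from 红黑联盟: GoldenGate Upgrade: Upgrading the Target (Replicat) Side

    The Replicat side needs upgrading because the target-side OGG software version differs from the source-side version, and in production use the Replicat was frequently found to lose transactions. The target-side OGG software therefore has to be upgraded to the same version as the source.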

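    1. Environment before the upgrade

    Source OGG version: 11.2.1.0.1

    Target OGG version: 11.1.1.1.2

    Before the upgrade, the source and target OGG versions differed and could not replicate normally. As a workaround, the format release 11.1 conversion option was added where the source writes its trail file. It was configured in both the Extract and the data pump, as follows: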

    EXTTRAIL ./dirdat/tr, format release 11.1
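
    The decision behind this option can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the rule that just the major.minor part of the version matters is an assumption based on the "format release 11.1" value used above. The set of values OGG actually accepts should be checked against the Oracle documentation.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '11.2.1.0.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def format_release_clause(source: str, target: str) -> str:
    """Return the FORMAT RELEASE suffix for the source trail, or '' if none.

    Assumption: only major.minor is compared, as in 'format release 11.1'.
    """
    src = parse_version(source)[:2]
    tgt = parse_version(target)[:2]
    if src > tgt:
        # Source is newer: write the trail in the older target's format.
        return f", format release {tgt[0]}.{tgt[1]}"
    return ""


# Before the upgrade: source 11.2, target 11.1
print("EXTTRAIL ./dirdat/tr" + format_release_clause("11.2.1.0.1", "11.1.1.1.2"))
# After the upgrade: both sides on 11.2, no clause needed
print("EXTTRAIL ./dirdat/tr" + format_release_clause("11.2.1.0.1", "11.2.1.0.1"))
```

    With the versions from this article, the first call reproduces the pre-upgrade setting and the second returns the plain EXTTRAIL line that the parameter files should contain once both sides match.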

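    2. Upgrade goal

    Upgrade the target-side OGG from 11.1.1.1.2 to 11.2.1.0.1, the same version as the source.

    3. Preparation before the upgrade

    3.1 Stop the Extract and data pump processes on the source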

    GGSCI>stop exttr

    GGSCI>stop dpetr

    3.2 Stop the Replicat and Manager (mgr) processes on the target

    GGSCI>stop reptr

    GGSCI>stop mgr

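    3.3 Remove the format release parameter from the Extract and data pump

    Both the Extract and the data pump are configured with the format release 11.1 option. Once the target is upgraded, this option has to be removed. Removing it is not just a matter of deleting it from the parameter files, though: an ETROLLOVER must also be performed on each process. Otherwise the process fails at startup with the following error: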

    ERROR OGG-01416 File ./dirdat/tr000008, with format RELEASE 10.4/11.1, does not match

    current format specification of RELEASE 11.2. Modify the parameter file to specify format RELEASE 10.4/11.1

    or issue ETROLLOVER prior to restart.

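    How to do it: in GGSCI, open the parameter files with edit params extract_name and edit params datapump_name and remove the option (detailed steps omitted).

    3.4 Run ETROLLOVER on the Extract and the data pump

    Because the format release setting was changed for both the Extract and the data pump, both need an ETROLLOVER.

    GGSCI>alter extract exttr ETROLLOVER

    GGSCI>alter extract dpetr ETROLLOVER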

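    3.5 Reset the EXTSEQNO and EXTRBA of the source data pump

    After ETROLLOVER is run on the source Extract exttr, its trail position (extseqno and extrba) is reset to RBA 0 of the next sequence number. The data pump is not aware of this change and keeps waiting at the extseqno and extrba from before the Extract's ETROLLOVER. No new RBA will ever appear in that sequence, so the data newly captured by the Extract is never shipped to the target.

    Therefore, after the Extract's ETROLLOVER, use the command “alter extract group_name EXTSEQNO X, EXTRBA 0” to reset the data pump's read checkpoint, so that it can continue shipping data to the remote side.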
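
    The position arithmetic described above is small enough to sketch in Python. The function names are hypothetical. The six-digit sequence suffix is inferred from trail file names such as ./dirdat/tr000003 that appear in this article, and may differ in other OGG releases.

```python
def trail_file_name(prefix: str, seqno: int) -> str:
    """Build a trail file name: prefix plus a six-digit sequence number."""
    return f"{prefix}{seqno:06d}"


def position_after_rollover(current_seqno: int) -> tuple:
    """ETROLLOVER moves writing to the next sequence number at RBA 0."""
    return current_seqno + 1, 0


# The Extract was writing ./dirdat/tr000003 before the rollover.
seqno, rba = position_after_rollover(3)
print(trail_file_name("./dirdat/tr", seqno), rba)

# The data pump then has to be pointed at that same position.
print(f"alter extract dpetr, EXTSEQNO {seqno}, EXTRBA {rba}")
```

    In practice the new sequence number should be read from the Extract itself rather than computed; the sketch only shows why the value ends up as the next sequence at RBA 0.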

    3.5.1 Record the trail extseqno and extrba of exttr after ETROLLOVER

    GGSCI> info exttr, detail

    GGSCI (server1) 140> info exttr, detail

    EXTRACT EXTTR Initialized 2015-01-30 13:02 Status STOPPED

    Checkpoint Lag 00:00:00 (updated 00:00:27 ago)

    Log Read Checkpoint Oracle Redo Logs

    2015-01-30 13:18:27 Seqno 365, RBA 17822208

    SCN 0.17726669 (17726669)

    Target Extract Trails:

    Remote Trail Name Seqno RBA Max MB

    ./dirdat/tr 4 0 100

    Extract Source Begin End

    ……

    3.5.2 Set the extseqno and extrba of the data pump

    GGSCI>Alter extract dpetr EXTSEQNO 4,EXTRBA 0
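
    Reading the Seqno and RBA off the "Target Extract Trails" section by eye is error-prone, so a small parser can help. The sketch below assumes the output has been captured as text and that the trail row consists of a name followed by three numeric columns, as in the listing above. The column spacing of real GGSCI output may differ, and the function name is hypothetical.

```python
SAMPLE = """\
EXTRACT EXTTR Initialized 2015-01-30 13:02 Status STOPPED
Checkpoint Lag 00:00:00 (updated 00:00:27 ago)
Log Read Checkpoint Oracle Redo Logs
2015-01-30 13:18:27 Seqno 365, RBA 17822208
Target Extract Trails:
Remote Trail Name Seqno RBA Max MB
./dirdat/tr 4 0 100
"""


def target_trail_position(info_detail: str) -> tuple:
    """Return (trail name, seqno, rba) from 'info <group>, detail' output."""
    lines = [ln.strip() for ln in info_detail.splitlines() if ln.strip()]
    start = lines.index("Target Extract Trails:")
    for line in lines[start + 1:]:
        fields = line.split()
        # A data row is: trail name, seqno, RBA, max MB (last three numeric).
        # The column header fails this test and is skipped.
        if len(fields) >= 4 and all(f.isdigit() for f in fields[-3:]):
            return fields[0], int(fields[-3]), int(fields[-2])
    raise ValueError("no trail row found under Target Extract Trails")


name, seqno, rba = target_trail_position(SAMPLE)
print(f"alter extract dpetr, EXTSEQNO {seqno}, EXTRBA {rba}")
```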

    3.6 Look up the extseqno and extrba under Target Extract Trails of the data pump after ETROLLOVER

    GGSCI (server1) 160> info dpetr, detail

    EXTRACT DPETR Initialized 2015-01-30 13:03 Status STOPPED

    Checkpoint Lag 00:00:00 (updated 00:02:10 ago)

    Log Read Checkpoint File ./dirdat/tr000003

    2015-01-30 13:18:18.000000 RBA 48953966

    Target Extract Trails:

    Remote Trail Name Seqno RBA Max MB

    ./dirdat/tr 4 0 100

    ……

    3.7 Back up the OGG directory on the target

    # cp -ra /u01/ogg /u01/ogg_backup

    3.8 Record the Replicat checkpoint on the target

    GGSCI (server2) 1> info reptr, showch

    REPLICAT REPTR Last Started 2015-01-30 13:03 Status STOPPED

    Checkpoint Lag 00:00:00 (updated 00:10:36 ago)

    Log Read Checkpoint File ./dirdat/tr000003

    2015-01-30 13:18:16.296427 RBA 48953996

    Current Checkpoint Detail:

    Read Checkpoint #1

    GGS Log Trail

    Startup Checkpoint (starting position in the data source):

    Sequence #: 0

    RBA: 0

    Timestamp: Not Available

    Extract Trail: ./dirdat/tr

    Current Checkpoint (position of last record read in the data source):

    Sequence #: 3

    RBA: 48953996

    Timestamp: 2015-01-30 13:18:16.296427

    Extract Trail: ./dirdat/tr

    ……

    The Current Checkpoint is Sequence #: 3, RBA: 48953996.
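
    Since this checkpoint is compared again after the upgrade, it can be worth extracting it programmatically. The following Python sketch assumes the showch output was saved as text in the layout shown above. The function name is hypothetical and the parsing is deliberately minimal.

```python
import re

SHOWCH = """\
Current Checkpoint Detail:
Read Checkpoint #1
GGS Log Trail
Startup Checkpoint (starting position in the data source):
Sequence #: 0
RBA: 0
Timestamp: Not Available
Extract Trail: ./dirdat/tr
Current Checkpoint (position of last record read in the data source):
Sequence #: 3
RBA: 48953996
Timestamp: 2015-01-30 13:18:16.296427
Extract Trail: ./dirdat/tr
"""


def current_checkpoint(showch: str) -> tuple:
    """Return (sequence, rba) from the 'Current Checkpoint (' block."""
    # Skip everything before the block, including the Startup Checkpoint.
    _, _, tail = showch.partition("Current Checkpoint (")
    seq = re.search(r"Sequence #:\s*(\d+)", tail)
    rba = re.search(r"RBA:\s*(\d+)", tail)
    if not (seq and rba):
        raise ValueError("Current Checkpoint block not found")
    return int(seq.group(1)), int(rba.group(1))


before_upgrade = current_checkpoint(SHOWCH)
print(before_upgrade)
```

    Saving the tuple before the upgrade and calling the same function on the post-upgrade output reduces the later comparison to a simple equality check.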

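    4. Upgrading the OGG software

    4.1 Copy the new OGG 11.2.1.0.1 package into the existing OGG directory

    $ cp ogg112101_fbo_ggs_Linux_x64_ora11g_64bit.zip /u01/ogg

    4.2 Delete the fbo_ggs_Linux_x64_ora11g_64bit.tar file from the OGG directory

    fbo_ggs_Linux_x64_ora11g_64bit.tar is the tar file that was extracted when the old OGG version was installed. Unzipping the new OGG package produces a file with the same name, so instead of deleting it by hand you can also let unzip overwrite it.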

    $rm fbo_ggs_Linux_x64_ora11g_64bit.tar

    4.3 Install (extract) the new OGG software in the existing OGG directory

    $unzip ogg112101_fbo_ggs_Linux_x64_ora11g_64bit.zip

    $tar xvf fbo_ggs_Linux_x64_ora11g_64bit.tar

    4.4 Verify that the upgrade succeeded

    $cd $OGG

    [oracle@server2 u01]$ cd $OGG

    [oracle@server2 ogg]$ ./ggsci

    Oracle GoldenGate Command Interpreter for Oracle

    Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO

    Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14

    Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.

    GGSCI (server2) 1>

    This shows that the OGG software has been upgraded to 11.2.1.0.1.
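
    When several servers are upgraded, the version check can be scripted instead of read off the banner. This is a minimal Python sketch that assumes the ggsci banner has already been captured as text (how it is captured is left out here). The function name is hypothetical.

```python
import re

BANNER = """\
Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14
"""


def ggsci_version(banner: str) -> str:
    """Extract the dotted version number from the 'Version ...' banner line."""
    match = re.search(r"^\s*Version\s+(\d+(?:\.\d+)+)", banner, re.MULTILINE)
    if match is None:
        raise ValueError("no version line in banner")
    return match.group(1)


expected = "11.2.1.0.1"  # the source-side version in this article
found = ggsci_version(BANNER)
print(found, "OK" if found == expected else "MISMATCH")
```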

    5. Verify that the Replicat checkpoint is unchanged after the upgrade

    5.1 Check whether the Replicat checkpoint after the upgrade matches the one recorded before it

    GGSCI (server2) 46> info reptr, detail

    REPLICAT REPTR Last Started 2015-01-30 13:03 Status STOPPED

    Checkpoint Lag 00:00:00 (updated 00:12:58 ago)

    Log Read Checkpoint File ./dirdat/tr000003

    2015-01-30 13:18:16.296427 RBA 48953996

    Extract Source Begin End

    ./dirdat/tr000003 * Initialized * 2015-01-30 13:18

    ./dirdat/tr000000 * Initialized * First Record

    Current directory /u01/ogg

    Report file /u01/ogg/dirrpt/REPTR.rpt

    Parameter file /u01/ogg/dirprm/reptr.prm

    Checkpoint file /u01/ogg/dirchk/REPTR.cpr

    Checkpoint table GOLDENGATE.CHECKPOINT_REPTR_01

    Process file /u01/ogg/dirpcs/REPTR.pcr

    Stdout file /u01/ogg/dirout/REPTR.out

    Error log /u01/ogg/ggserr.log

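    The Current Checkpoint is the same as before the upgrade, still Sequence #: 3, RBA: 48953996, which further confirms that the upgrade succeeded.

    6. Preparing a test scenario to verify replication after the upgrade

    6.1 Confirm that the source and target row counts match

    (In a production database this step can be skipped, because business activity keeps changing the table data. In this case a dedicated table was created just for testing the upgrade.)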

    Source:

    SQL> select count(*) from goldengate.ogg_upg;

    COUNT(*)

    ----------

    2150000

    Target:

    SQL> select count(*) from goldengate.ogg_upg;

    COUNT(*)

    ----------

    2150000

    The source and target row counts match.

    6.2 Delete some rows on the source before the target Replicat is started

    SQL> delete goldengate.ogg_upg where rownum <1000001;

    1000000 rows deleted.

    SQL> commit;

    SQL> select count(*) from goldengate.ogg_upg;

    COUNT(*)

    ----------

    1150000

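    7. Recreate the target Replicat

    Why the Replicat has to be recreated: in ogg_11.1.1.1.2 each Replicat has only one table, CHECKPOINT, whereas in ogg_11.2.x.x.1 each Replicat has two tables, CHECKPOINT and CHECKPOINT_LOX. If the Replicat is started directly after the upgrade, it fails to start with the following error: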

    ERROR OGG-00665 OCI Error describe for query (status = 942-ORA-00942: table or view does not exist), SQL<SELECT a.current_dir, a.seqno, a.rba, a.audit_ts, a.log_csn, a.log_xid, a.log_cmplt_csn, a.log_cmplt_xids, b.log_cmplt_xids FROM GOLDENGATE.CHECKPOINT_REPTR_01 a LEFT JOIN GOLDENGATE.CHECKPOINT_REPTR_01_lox b ON a.group_name = b.group_name AND a.group_key = b.group_key AND a.log_cmplt_csn = b.log_cmplt_csn WHERE a.group_name = 'REPTR' AND a.group_key = 2810015614>.

    2015-01-28 05:12:59 ERROR OGG-01668 PROCESS ABENDING.

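    The error says that a table does not exist. The table in question is GOLDENGATE.CHECKPOINT_REPTR_01_lox (the CHECKPOINT_LOX table). The point of recreating the Replicat is to have both checkpoint tables created automatically along the way.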
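
    The naming rule visible in the error (the main checkpoint table plus a companion with the _lox suffix) can be turned into a quick pre-start check. The Python sketch below is illustrative only. The function names are hypothetical, the list of existing tables is assumed to come from a separate data dictionary query, and upper-casing assumes unquoted Oracle identifiers.

```python
def required_checkpoint_tables(checkpoint_table: str) -> list:
    """Tables the 11.2 Replicat queries: the main table and its _LOX companion."""
    name = checkpoint_table.upper()
    return [name, name + "_LOX"]


def missing_tables(checkpoint_table: str, existing: set) -> list:
    """Return the required tables that are not in the set of existing tables."""
    existing_upper = {t.upper() for t in existing}
    return [t for t in required_checkpoint_tables(checkpoint_table)
            if t not in existing_upper]


# State right after the software upgrade: only the old table exists.
print(missing_tables("GOLDENGATE.CHECKPOINT_REPTR_01",
                     {"GOLDENGATE.CHECKPOINT_REPTR_01"}))
```

    An empty result would mean both tables are present; in the situation described here, the _LOX table is reported as missing, which matches the ORA-00942 error above.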

    7.1 Delete the Replicat and its checkpoint table

    GGSCI>dblogin userid goldengate,password goldengate

    GGSCI>delete replicat reptr

    GGSCI>delete checkpointtable GOLDENGATE.CHECKPOINT_REPTR_01

    7.2 Recreate the Replicat

    GGSCI>dblogin userid goldengate, password goldengate

    GGSCI>add checkpointtable goldengate.checkpoint_reptr_01

    GGSCI>add replicat reptr, exttrail ./dirdat/tr,checkpointtable goldengate.checkpoint_reptr_01

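    At this point the Replicat still cannot continue applying the trail shipped from the source once it is started, because ETROLLOVER was run on the source data pump. The Replicat's sequence# and RBA must be set manually to match the seqno and RBA under Target Extract Trails of the data pump.

    7.3 Set the extseqno and extrba of the target Replicat

    Use the seqno and RBA of the source data pump's remote trail file, found in step 3.6, to decide the extseqno and extrba to set on the Replicat.

    GGSCI (server2) 37> Alter replicat reptr EXTSEQNO 4, EXTRBA 0

    7.4 Check the sequence# and RBA of the newly created Replicat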

    GGSCI(server2) 27> info reptr, showch

    GGSCI (server2) 3> info reptr, showch

    REPLICAT REPTR Initialized 2015-01-30 13:39 Status STOPPED

    Checkpoint Lag 00:00:00 (updated 00:00:07 ago)

    Log Read Checkpoint File ./dirdat/tr000004

    First Record RBA 0

    Current Checkpoint Detail:

    Read Checkpoint #1

    GGS Log Trail

    Startup Checkpoint (starting position in the data source):

    Sequence #: 4

    RBA: 0

    Timestamp: Not Available

    Extract Trail: ./dirdat/tr

    Current Checkpoint (position of last record read in the data source):

    Sequence #: 4

    RBA: 0

    Timestamp: Not Available

    Extract Trail: ./dirdat/tr

    ……

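    The sequence# and RBA of the Startup Checkpoint have been manually positioned at the state after the source ETROLLOVER.

    The sequence# and RBA of the Current Checkpoint have likewise been manually positioned at the state after the source ETROLLOVER.

    7.5 Inspect the contents of the checkpoint tables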

    SQL>select * from goldengate.checkpoint_reptr_01;

    No rows selected

    SQL>select * from goldengate.checkpoint_reptr_01_lox;

    No rows selected

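    At this point both checkpoint tables are still empty. Once the Replicat is started and running, its process state is written to the checkpoint tables.

    8. Start the source and target processes

    8.1 Start the Manager (mgr) and Replicat processes on the target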

    GGSCI>start mgr

    GGSCI>start reptr

    8.2 Start the data pump process on the source

    GGSCI>start dpetr

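    9. Verify that replication continues

    The most important thing in this step is to verify that operations performed on the source during the upgrade window are replicated to the target database. In this case, 1000000 rows were deleted from the goldengate.ogg_upg table during the upgrade.

    9.1 Check the row count of goldengate.ogg_upg on the target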

    Source:

    SQL> select count(*) from goldengate.ogg_upg;

    COUNT(*)

    ----------

    2050000

    Target:

    SQL> select count(*) from goldengate.ogg_upg;

    COUNT(*)

    ----------

    2050000

    9.2 Check the checkpoint state in the checkpoint table on the target

    select * from goldengate.checkpoint_reptr_01

    -------------------------------------------------

    REPTR 2149948420 4 19280017 2015-01-30 13:55:59.368501 2015/1/30 13:36:43 2015/1/30 14:01:01 /u01/ogg 17929533 7.16.20512 17929533 7.16.20512 1

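    These two checks show that after the upgrade the Replicat is working normally and continues replicating from where it left off before the upgrade.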

  • Original article: https://www.cnblogs.com/yangyudexiaobai/p/4432228.html