Oracle GoldenGate Maintenance Essentials
1. Check process status and logs
1.1 Check the status of all processes, e.g.
GGSCI (IP0106) 36> info all
Program Status Group Lag Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING EX1 00:00:00 00:00:07
EXTRACT RUNNING PMP1 00:00:00 00:00:01
REPLICAT RUNNING REP1 00:00:00 00:00:02
REPLICAT RUNNING REP2 00:00:00 00:00:04
A status of RUNNING is normal. If a process shows ABENDED or STOPPED, you need to find the cause of the error. Lag is normally 0.
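The check described above can be automated once the output of info all has been captured as text. The following is a minimal sketch, not part of GoldenGate: the function name find_unhealthy and the sample text are assumptions made up for illustration, and the parser assumes the whitespace-separated column layout shown above.

```python
def find_unhealthy(info_all_text):
    """Return (group, status, lag) for every process that needs attention."""
    problems = []
    for line in info_all_text.splitlines():
        fields = line.split()
        # Skip the GGSCI prompt, the header row, and blank lines.
        if len(fields) < 2 or fields[0] not in ("MANAGER", "EXTRACT", "REPLICAT"):
            continue
        program, status = fields[0], fields[1]
        # The MANAGER row has no group name and no lag column.
        group = fields[2] if len(fields) > 2 else program
        lag = fields[3] if len(fields) > 3 else "00:00:00"
        if status != "RUNNING" or lag != "00:00:00":
            problems.append((group, status, lag))
    return problems


# Hypothetical sample in the same layout as the output above.
sample = """\
Program Status Group Lag Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING EX1 00:00:00 00:00:07
EXTRACT ABENDED PMP1 00:00:00 00:12:41
REPLICAT RUNNING REP1 00:03:15 00:00:02
"""
print(find_unhealthy(sample))
```

A non-empty result means at least one process is not RUNNING or has a lag other than zero, and its report should be checked.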
1.2 Check the status of a single process, e.g.
GGSCI (IP0106) 37> info ex1,detail
EXTRACT EX1 Last Started 2011-06-04 11:35 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:00 ago)
Log Read Checkpoint Oracle Redo Logs
2011-06-12 13:29:37 Seqno 1763, RBA 179811328
Target Extract Trails:
Remote Trail Name Seqno RBA Max MB
./dirdat/ex 134 834768473 1000
Extract Source Begin End
/u01/app/oracle/oradata/orcl/redo04.log 2011-06-04 10:29 2011-06-12 13:29
/u01/app/oracle/oradata/orcl/redo05.log 2011-05-10 11:17 2011-06-04 10:29
/u01/app/oracle/oradata/orcl/redo05.log 2011-05-06 16:49 2011-05-10 11:17
/u01/app/oracle/oradata/orcl/redo04.log 2011-05-02 13:16 2011-05-06 16:49
/u01/app/oracle/oradata/orcl/redo04.log 2011-04-30 14:05 2011-05-02 13:16
/u01/app/oracle/oradata/orcl/redo02.log 2011-04-30 12:48 2011-04-30 14:05
/u01/app/oracle/oradata/orcl/redo05.log 2011-04-30 11:51 2011-04-30 12:48
/u01/app/oracle/oradata/orcl/redo05.log 2011-04-30 11:29 2011-04-30 11:51
/u01/app/oracle/oradata/orcl/redo05.log 2011-04-30 06:11 2011-04-30 11:29
/u01/app/oracle/oradata/orcl/redo05.log 2011-04-29 16:29 2011-04-30 06:11
Not Available * Initialized * 2011-04-29 16:29
/u01/app/oracle/arch/1_144_745802503.dbf 2011-04-29 15:13 2011-04-29 15:13
/u01/app/oracle/arch/1_144_745802503.dbf 2011-04-29 15:13 2011-04-29 15:13
Not Available * Initialized * 2011-04-29 15:13
Current directory /u01/app/ggs
Report file /u01/app/ggs/dirrpt/EX1.rpt
Parameter file /u01/app/ggs/dirprm/ex1.prm
Checkpoint file /u01/app/ggs/dirchk/EX1.cpe
Process file /u01/app/ggs/dirpcs/EX1.pce
Stdout file /u01/app/ggs/dirout/EX1.out
Error log /u01/app/ggs/ggserr.log
The main thing to check is the Log Read Checkpoint timestamp. If it is stuck at some point in the past, the extract process is not reading the logs.
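To make "stuck in the past" concrete, the checkpoint timestamp can be compared with the current time. This is a rough sketch that assumes the detail output has been saved as text in the layout shown above; the function name checkpoint_age and the 10-minute threshold are illustrative assumptions, not GoldenGate settings.

```python
import re
from datetime import datetime, timedelta


def checkpoint_age(detail_text, now):
    """Return how far the Log Read Checkpoint timestamp is behind `now`."""
    lines = detail_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("Log Read Checkpoint"):
            # The timestamp sits on the same line or on the line right below it.
            for candidate in lines[i:i + 2]:
                m = re.search(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", candidate)
                if m:
                    stamp = datetime.strptime(m.group(0), "%Y-%m-%d %H:%M:%S")
                    return now - stamp
    return None  # no checkpoint timestamp found in the text


detail = """\
Checkpoint Lag 00:00:00 (updated 00:00:00 ago)
Log Read Checkpoint Oracle Redo Logs
2011-06-12 13:29:37 Seqno 1763, RBA 179811328
"""
age = checkpoint_age(detail, now=datetime(2011, 6, 12, 15, 0, 0))
# Assumed threshold: treat anything older than 10 minutes as suspicious.
print(age, age > timedelta(minutes=10))
```

In real use, now would be datetime.now() on a host whose clock and time zone match the source database server.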
2. View system and process logs
2.1 View the system log, e.g.
GGSCI>view ggsevt
2.2 View a process report, which is often needed when troubleshooting, e.g.
GGSCI>view report ex1,detail
Look for the OGG error and the corresponding database error number.
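When a report is long, the error numbers can be pulled out with a small script. The sketch below assumes the report text has been read into a string and that errors use the OGG-nnnnn and ORA-nnnnn formats; the function name error_codes is an assumption, and the sample lines are written for illustration rather than copied from a real report.

```python
import re


def error_codes(report_text):
    """Return the distinct OGG-/ORA- error codes in order of first appearance."""
    codes = []
    for match in re.finditer(r"\b(?:OGG|ORA)-\d{5}\b", report_text):
        if match.group(0) not in codes:
            codes.append(match.group(0))
    return codes


# Illustrative report excerpt (not real output).
report = """\
2011-06-12 13:30:02 ERROR OGG-01296 Error mapping from A.T1 to B.T1.
ORA-01403: no data found
2011-06-12 13:30:02 ERROR OGG-01668 PROCESS ABENDING.
"""
print(error_codes(report))
```

The OGG code identifies the GoldenGate-side failure and the ORA code the underlying database error, so the two are best looked up together.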
3. Start the processes
3.1 Start the extract, e.g.
GGSCI>start ex1
3.2 Start the data pump, e.g.
GGSCI>start pmp1
3.3 Start the replicat, e.g.
GGSCI>start rep1
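4. Handling stopped replication
4.1 The extract process stopped because of an error
Check the process error log. Most errors are log-read errors, and the process can be restarted manually with start ex1. If the status is still ABENDED, the log read is still failing. In that case it is advisable to wait until there are no synchronization tasks on the source,
then delete the extract and rebuild the process by adding it again with add extract, e.g.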
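GGSCI>delete extract ex1
GGSCI>add extract ex1,vam,begin now --MySQL extract syntax; for an Oracle source reading redo logs, use tranlog instead of vam
GGSCI>add exttrail ./sq2/ex1,extract ex1 --the trail location must match the trail used before the rebuild; check it with
GGSCI>info exttrail * --to view it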
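4.2 The data pump stopped because of an error
The usual cause is that the port of the remote mgr cannot be reached. Check the remote mgr port and the firewall configuration.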
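A quick way to test whether the remote mgr port is reachable from the source host is a plain TCP connection attempt. This is a minimal sketch: the function name port_reachable is an assumption, and the host and port in the example call are placeholders (7809 is a commonly used manager port; the real value is the PORT parameter in the remote mgr parameter file).

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (placeholder host and port):
# port_reachable("target-host", 7809)
```

A False result points to a firewall rule, a stopped remote mgr, or a wrong host or port in the data pump parameters. A True result only shows that the mgr port is open; if the remote mgr uses DYNAMICPORTLIST, those ports need to be allowed through the firewall as well.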
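4.3 The replicat stopped because of an error
Check the replicat process error log to find the specific cause.
If you need to skip the problem transaction to resume replication: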
GGSCI>start rep1,skiptransaction
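Once the replicat lag is 0 and its status is RUNNING, replication has generally recovered.
5. Trail maintenance
Automatic purging of trail files is usually configured in the mgr process, e.g.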
GGSCI>edit params mgr
purgeoldextracts ./dirdat/ex*,minkeepdays 3,minkeepfiles 10
You can change the retention period by hand and then refresh mgr, e.g.
GGSCI>refresh mgr