Chinaunix首页 | 论坛 | 博客
  • 博客访问: 294672
  • 博文数量: 60
  • 博客积分: 1437
  • 博客等级: 中尉
  • 技术积分: 632
  • 用 户 组: 普通用户
  • 注册时间: 2011-02-10 14:12
文章存档

2012年(7)

2011年(53)

分类: Oracle

2011-03-01 11:17:46

今天早上来收到stream 的监控邮件显示,应用一直停在凌晨01:54:41 不动:

检查capture的状态,发现也状态正常嘛,两边状态都正常,怎么就不传数据呢?难道是今天没有跑业务?检查一次今天生成的日志,发现已经有十几个了嘛,但是stream怎么就不正常了呢?接着检查capture 端的alert日志:
Tue Mar 01 02:31:24 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16948, /pgprod_arch/pgprod_arc_1_16948_729213055.arc
Tue Mar 01 02:31:24 GMT+08:00 2011LOGMINER: End mining logfile: /pgprod_arch/pgprod_arc_1_16948_729213055.arc
Tue Mar 01 02:46:25 GMT+08:00 2011RFS[1611]: Archived Log: '/pgprod_arch/pgprod_arc_1_16949_729213055.arc'
Tue Mar 01 02:46:25 GMT+08:00 2011RFS LogMiner: Registered logfile [/pgprod_arch/pgprod_arc_1_16949_729213055.arc] to LogMiner session id [4]
Tue Mar 01 02:46:25 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16949, /pgprod_arch/pgprod_arc_1_16949_729213055.arc
Tue Mar 01 02:46:25 GMT+08:00 2011LOGMINER: End mining logfile: /pgprod_arch/pgprod_arc_1_16949_729213055.arc
Tue Mar 01 02:47:15 GMT+08:00 2011Propagation Schedule for (STRMADMIN.MARS_CAP_QUEUE, "STRMADMIN"."MARS_APP_QUEUE"@GZPROD.DEST) encountered following error:
ORA-02068: following severe error from GZPROD.DEST
ORA-03135: connection lost contact
ORA-06512: at "SYS.DBMS_AQADM_SYS", line 1034
ORA-06512: at "SYS.DBMS_AQADM_SYS", line 7902
ORA-06512: at "SYS.DBMS_AQADM", line 631
ORA-06512: at line 1

咦,网络似乎在这个时间点出了问题,难道网络不通?
bash-3.2$ sqlplus strmadmin/pwd;
SQL*Plus: Release 10.2.0.5.0 - Production on Tue Mar 1 10:45:28 2011
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select sysdate from dual@GZPROD.DEST;
SYSDATE
---------
01-MAR-11
SQL>
现在网络是通的,难道propagation进程挂了?

发现状态也是好好的啊。。。。那就重启propagation进程吧,但是很悲愤,重启不了,那就重启数据库吧,发现重启数据库是解决stream 问题的唯一办法,有时甚至要shutdown abort才能关闭数据库!这也是stream硬伤啊。。。。
重启capture端的数据库后,检查一下日志,可以看到stream已经恢复正常了:
bash-3.2$ tail -f /ora10g/admin/rptdb/bdump/alert_rptdb.log
LOGMINER: session# = 4, preparer process P002 started with pid=27 OS id=5243270
Tue Mar 01 10:54:06 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16968, /pgprod_arch/pgprod_arc_1_16968_729213055.arc
Tue Mar 01 10:54:07 GMT+08:00 2011STREAMS Capture C  1: first scn changed.
scn: 0x0992.723a0dc3
Tue Mar 01 10:54:07 GMT+08:00 2011LOGMINER: End mining logfile: /pgprod_arch/pgprod_arc_1_16968_729213055.arc
Tue Mar 01 10:54:07 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16969, /pgprod_arch/pgprod_arc_1_16969_729213055.arc
Tue Mar 01 10:54:11 GMT+08:00 2011LOGMINER: End mining logfile: /pgprod_arch/pgprod_arc_1_16969_729213055.arc
Tue Mar 01 10:54:11 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16970, /pgprod_arch/pgprod_arc_1_16970_729213055.arc
Tue Mar 01 10:55:14 GMT+08:00 2011Thread 1 advanced to log sequence 10928 (LGWR switch)
  Current log# 6 seq# 10928 mem# 0: /rptdata/oradata/rptdb/redo06.log
Tue Mar 01 10:56:07 GMT+08:00 2011LOGMINER: End mining logfile: /pgprod_arch/pgprod_arc_1_16970_729213055.arc
Tue Mar 01 10:56:07 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16971, /pgprod_arch/pgprod_arc_1_16971_729213055.arc
Tue Mar 01 10:56:26 GMT+08:00 2011LOGMINER: End mining logfile: /pgprod_arch/pgprod_arc_1_16971_729213055.arc
Tue Mar 01 10:56:26 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16972, /pgprod_arch/pgprod_arc_1_16972_729213055.arc
Tue Mar 01 10:56:56 GMT+08:00 2011LOGMINER: End mining logfile: /pgprod_arch/pgprod_arc_1_16972_729213055.arc
Tue Mar 01 10:56:56 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16973, /pgprod_arch/pgprod_arc_1_16973_729213055.arc
Tue Mar 01 10:57:33 GMT+08:00 2011LOGMINER: End mining logfile: /pgprod_arch/pgprod_arc_1_16973_729213055.arc
Tue Mar 01 10:57:33 GMT+08:00 2011LOGMINER: Begin mining logfile for session 4 thread 1 sequence 16974, /pgprod_arch/pgprod_arc_1_16974_729213055.arc
Tue Mar 01 10:58:00 GMT+08:00 2011Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[1]: Assigned to RFS process 8585418
RFS[1]: Identified database type as 'repository'
RFS[1]: Identified database type as 'repository'
Tue Mar 01 10:58:00 GMT+08:00 2011RFS LogMiner: Client enabled and ready for notification
Tue Mar 01 10:58:00 GMT+08:00 2011RFS LogMiner: RFS id [8585418] assigned as thread [1] PING handler
RFS[1]: Archived Log: '/pgprod_arch/pgprod_arc_1_16986_729213055.arc'
Tue Mar 01 10:58:04 GMT+08:00 2011RFS LogMiner: Registered logfile [/pgprod_arch/pgprod_arc_1_16986_729213055.arc] to LogMiner session id [4]
Tue Mar 01 10:58:13 GMT+08:00 2011C001: large txn committed, xid: 0x000d.006.00065e75

propagation队列里的数据也不断的变化了:
bash-3.2$ ./propa_status.sh
***************************
Schedule of propagation job
***************************

                          Process Schedule         Avg       Total          Last Run   Next Run   Next Run
Destination               Name    Disabled      Number      Number Failures Date       Date       Time
------------------------- ------- -------- ----------- ----------- -------- ---------- ---------- ----------
"STRMADMIN"."MARS_APP_QUE J000    N                  0           0        0 2011-03-01
UE"@GZPROD.DEST                                                              10:54:48

"STRMADMIN"."MARS_APP_QUE J000    N              91904       91904        0 2011-03-01
UE"@GZPROD.DEST                                                              10:54:48

Determining the Total Number of Messages and Bytes Propagated
*************************************************************
bash-3.2$ ./propa_status.sh
***************************
Schedule of propagation job
***************************

                          Process Schedule         Avg       Total          Last Run   Next Run   Next Run
Destination               Name    Disabled      Number      Number Failures Date       Date       Time
------------------------- ------- -------- ----------- ----------- -------- ---------- ---------- ----------
"STRMADMIN"."MARS_APP_QUE J000    N                  0           0        0 2011-03-01
UE"@GZPROD.DEST                                                              10:54:48

"STRMADMIN"."MARS_APP_QUE J000    N              96512       96512        0 2011-03-01
UE"@GZPROD.DEST                                                              10:54:48

Determining the Total Number of Messages and Bytes Propagated
*************************************************************

apply端也开始应用变化了:








阅读(3034) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~