Logical Standby Dataguard sql apply hung的问题-cwn21-ChinaUnix博客

oracle dba

首页　| 　博文目录　| 　关于我

cwn21

博客访问： 82597
博文数量： 6
博客积分： 1370
博客等级：中尉
技术积分： 350
用户组：普通用户
注册时间： 2008-02-19 12:19

文章分类

全部博文（6）

Life（0）
Liunx（0）
Solaris（1）
Aix（0）
Oracle（5）

Dba（3）

Dataguard（1）

Rac（0）

Install（0）

Tuning（1）

ToubleShooting（0）
未分配的博文（0）

文章存档

2008年（6）

我的朋友

相关博文

Logical Standby Dataguard sql apply hung的问题

分类： Oracle

2008-03-03 10:49:59

OS: LINUX AS4 DB Version: oracle10.2.02

主机PRIMARY和备机STANDY配置一样。

在logical standby dataguard维护过程中发现以下现象：

1、主机的归档日志已经完全同步到备机上，并且注册到备机数据库上；

SQL> select sequence#, first_time, next_time, dict_begin, dict_end from dba_logstdby_log order by sequence#;

SEQUENCE# FIRST_TIME   NEXT_TIME    DIC DIC
---------- ------------ ------------ --- ---
     12675 27-DEC-07    27-DEC-07    NO NO
     12676 27-DEC-07    27-DEC-07    NO NO
     …………………………

12711 28-DEC-07 28-DEC-07 NO NO
12712 28-DEC-07 28-DEC-07 NO NO

38 rows selected.

2、备机上没有需要手动补偿的事务；

SQL> select event,xidusn,xidslt,xidsqn from dba_logstdby_events where event_time=(select max(event_time) from dba_logstdby_events);

EVENT
--------------------------------------------------------------------------------
XIDUSN XIDSLT XIDSQN
---------- ---------- ----------

3、备机数据同步执行非常缓慢，貌似数据库hung住；

alter session set nls_date_format = 'DD-Mon-YYYY hh24:mi:ss';

select * from dba_logstdby_progress;

检查发现，数据库只同步到昨天晚上9点。

SQL>SELECT TYPE, HIGH_SCN, STATUS FROM V$LOGSTDBY;

TYPE                  HIGH_SCN                         STATUS
COORDINATOR        9,371,034,866,807        ORA-16116: no work available
READER          9,371,034,866,812        ORA-16127: stalled waiting for additional transactions to be applied
BUILDER                9,371,034,866,807        ORA-16127: stalled waiting for additional transactions to be applied
PREPARER        9,371,034,866,806        ORA-16127: stalled waiting for additional transactions to be applied
ANALYZER        9,371,034,866,807        ORA-16116: no work available
APPLIER          9,371,034,866,805        ORA-16124: transaction 20 17 74407 is waiting on another transaction
APPLIER                                        ORA-16116: no work available
APPLIER                                        ORA-16116: no work available
APPLIER                                        ORA-16116: no work available
APPLIER                                        ORA-16116: no work available

查看ORACLE METALINK发现：

If there are a number of appliers that consistently have a status code of 16124, then there is a possibility of SQL Apply being delayed by an excessive number of Eager Transactions.

applier状态出现16124说明sql apply正处理一个非常大的事务，造成运行缓慢。（后来询问了开发的同事，果然是在主库中有个job，需要在一个事务中更新几万行记录）

An Eager Transaction is a transaction that updates many rows. From Data Guard 10g Release 1, an Eager Transaction is defined as updating more than 201 rows. The purpose of an Eager Transaction is to alleviate an apparent "hang" by trickling records into the standby database as they become available rather than waiting for the entire transaction to be built before applying it to the standby database.

This works well for most cases. However the performance of SQL Apply will deteriorate if an application creates multiple concurrent eager transactions on the primary database (multiple sessions connected to the primary database, all modifying more than 201 rows each).

An Eager Transaction指需要更新多行的大事务，从DATAGUAD 10G的第一个版本开始，an Eager Transaction 定义为可以更新多于201行数据。。。。。。。。。

If the slowed performance of SQL Apply impacts the standby database service levels, it is possible to reduce the interference by increasing the definition of an Eager Transaction.

如果sql applay性能缓慢，导致standy服务受到影响，可以通过增加an Eager Transaction的限制参数来降低影响。

Use the following query to determine if this is the cause of a hung or slowly progressing SQL Apply:

col event format a50 trunc
select ls.status_code, s.event, count(1) No_Of_Appliers
from v$logstdby ls
      , v$streams_apply_server sas
      , v$session s
where ls.type = 'APPLIER'
   and ls.status_code in ( 16124, 16123 )
   and ls.logstdby_id = sas.server_id
   and s.sid = sas.sid
group by ls.status_code, s.event;

STATUS_CODE EVENT                                              No_Of_Appliers
----------- -------------------------------------------------- --------------
      16123 rdbms ipc message                                              12
      16124 rdbms ipc message                                              97
      16124 db file sequential read                                         1

In the example above, the large number of appliers waiting on the "rdbms ipc message" event suggests that a number of eager transactions are being applied, and that performance would benefit from increasing the number of rows that define an eager transaction.

To determine an appropriate value for Eager Size, analyze the application (if possible) that is creating the slow down to occur, determine the number of rows that make up the transaction, and set Eager Size equal to this value.

To increase the Eager Transaction parameter, SQL Apply must be stopped briefly.

步骤：
SQL> alter database stop logical standby apply;
SQL> execute dbms_logstdby.apply_set('_EAGER_SIZE',);
SQL> alter database start logical standby apply [immediate];

The value to which the _EAGER_SIZE should be increased should not exceed 1000. Since a higher value will result in fewer transactions being defined as Eager, this may result in the LCR Cache (controlled by the MAX_SGA parameter) becoming full.

After changing the _EAGER_SIZE parameter, double-check to ensure paging is not occurring per b above. If paging is occurring, then the MAX_SGA parameter should be increased (if possible) or else the _EAGER_SIZE parameter should be reduced to a level where paging does not occur.

至此，故障得到解决。在部署dataguard的过程中，除了安装官方文档进行安装外，还要在部署前检查在primary 机器上面有没大事务需要处理的，尽量将事务分解成若干个小事务，或者调大AN EAGER SIZE的限定，以免影响dataguard的同步性能。

阅读(1863) | 评论(0) | 转发(0) |

上一篇：10g 采用v$session快速查找阻塞

下一篇：常见的wait等待事件及处理

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6