Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1759202
  • 博文数量: 335
  • 博客积分: 4690
  • 博客等级: 上校
  • 技术积分: 4341
  • 用 户 组: 普通用户
  • 注册时间: 2010-05-08 21:38
个人简介

无聊之人--除了技术,还是技术,你懂得

文章分类

全部博文(335)

文章存档

2016年(29)

2015年(18)

2014年(7)

2013年(86)

2012年(90)

2011年(105)

分类: DB2/Informix

2016-02-21 16:39:01

现在IDC大行其道,同城/异地双活甚至多活的概念已经日益普及。12年的时候,写过一篇mysql中关于MASTER/SLAVE的实现mysql replication---master/salve IN ONE PC 。针对不同的DBMS,双活具体的实现不同,但都基于相同的理论或是基于SQL REPLICATION,或是基于LOG REPLICATION。这里介绍一下IBM中关于A/A的实现,关于Q的架构以及实现后面专门抽出一章来进行介绍,这里是实践贴,使你对Q有一个感性认识。关于本文最后的Q latency部分,也是每一个系统运维人员都必须要掌握的,因为它对于RPO/RTO的重要性不言而喻。

Why Active/Active ? - Business Requirements

standby to a/a


Replication Technologies for Business Continuity

technologies for bc

Requirements and trade-offs to consider in selecting technologies:

What needs to be recovered?: Application Data vs. DBMS vs. Entire Systems
How long does it take? Recovery Time Objective (RTO): One hours or more vs. few seconds
How much data could you lose? Recovery Point Objective (RPO): No data loss vs. seconds of data
Distance required between sites?: 10s of kilometers vs. 100s of kilometers
Hardware Utilization: Standby vs. Active
Impact on applications: Direct overhead (synchronous technologies) vs. no impact (async technologies)
CPU Overhead: Negligible (hardware e.g., PPRC) vs. Proportional to the workload (transaction replay
technology)


SENARIO OF TWO NODE QREP

Objects needed in WebSphere MQ


TWO NODE

OBJECTS IN Q


LOG BASED QREPLICATION

OBJECT DEFINE DESCRIPTION

OBJECT NAMES NODE1 NODE2
HOSTNAME SITEA SITEB
IP 10.1.1.1 10.2.1.1
DB2 VERSION 10 11
SUBSYSTEM DB1 DB2
LOCATION DB0A DB0B
DATABASE PORT 8000 9000
DATABASE DB1 DB2
Q MGR QMGR1 QMGR2
MQ PROT 8001 9001
RESTARTQ MQ1.RESTARTQ MQ2.RESTARTQ
ADMINQ MQ1.ADMINQ MQ2.ADMINQ
SENDQ MQ1.SENDQ.Q1 MQ2.SENDQ.Q1
SENDQ MQ1.SENDQ.Q2 MQ2.SENDQ.Q2
SENDQ MQ1.SENDQ.Q3 MQ2.SENDQ.Q3
REP SCHEMA ASN1 ASN2

PREPARE ACTION

  1. BIND REPLICATION PROGRAM

  2. APF AUTHORIZE Q-RELATED PROGRAM

  3. ENABLE Q TABEL DATA CAPTURES

    ALTER TABLE QTABLE DATA CAPTURE CHANGES;

  4. CONFIG CDB TABLE

    远程DB2连接使用DRDA协议,在使用之前需要配置对应的CDB信息。CDB 主要是Qapply用来连接Qcapture的属性配置,第一条SQLP配置了本端的连接信息,后面3条SQL配置了,对端的连接信息。

    NODE1:

    INSERT INTO SYSIBM.LOCATIONS(LOCATION, LINKNAME, PORT) VALUES ('DB0A', 'DB0A', '');
    INSERT INTO SYSIBM.IPNAMES(LINKNAME, SECURITY_OUT, USERNAMES, IPADDR) VALUES ('DB0B', 'P', 'O', 'demo.TEST.com');
    INSERT INTO SYSIBM.USERNAMES(TYPE, LINKNAME, NEWAUTHID, PASSWORD) VALUES ('O', 'DB0B', 'TESTUSER', 'NOTTELLYOU');
    INSERT INTO SYSIBM.LOCATIONS(LOCATION, LINKNAME, PORT) VALUES ('DB0B', 'DB0B', '9000');

    NODE2:

    INSERT INTO SYSIBM.LOCATIONS(LOCATION, LINKNAME, PORT) VALUES ('DB0B', 'DB0B', '');
    INSERT INTO SYSIBM.IPNAMES(LINKNAME, SECURITY_OUT, USERNAMES, IPADDR) VALUES ('DB0A', 'P', 'O', 'DEMO.TEST.com');
    INSERT INTO SYSIBM.USERNAMES(TYPE, LINKNAME, NEWAUTHID, PASSWORD) VALUES ('O', 'DB0A', 'BRIDDEL', 'XXXXXXXX');
    INSERT INTO SYSIBM.LOCATIONS(LOCATION, LINKNAME, PORT) VALUES ('DB0A', 'DB0A', '8000');

    MQ OBJECT DEFINE

    定义MQ传输使用的各种对象。

     
    		
    		
    DEFINE QLOCAL(MQ1.ADMINQ) DESC('ADMINQ OF MQ2 IN SITEA') PUT(ENABLED) GET(enabled) SHARE DEFSOPT(SHARED) MAXDEPTH(1000) DEFPSIST(YES)
    DEFINE QLOCAL(MQ1.RESTARQ) DESC('RESTARTQ OF MQ2 IN SITEA') PUT(ENABLE) GET(ENABLE) SHARE DEFSOPT(SHARED) MAXDEPTH(1) INDEXTYPE(MSGID) DEFPSIS(YES)
    DEFINE QLOCAL(MQ1.SENDFQ.Q1) REPLACE DESC('LOCAL SEND Q FOR FX Q') PUT(ENABLED) GET(ENABLED) SHARE DEFSOPT(SHARED) DEFPSIS(YES) MAXDEPTH(99999) INDEXTYPE(MSGID)       
    DEFINE QMODEL('IBMQREP.SPILL,QMODELQ') REPLACE DEFSOPT(SHARED) MAXDEPTH(99999999) MSGDLVSQ(FIFO) DEFTYPE(PERMDYN)
    DEFINE CHANNEL('MQ1.TO.MQ2') REPLACE CHLTYPE(SDR) TRPTYPE(TCP) DISCINT(0) DESC('SENDER CHANNEL TO MQ2') XMITQ(XMQ1) CONNAME('10.2.1.1(9001)')

DEFPSIST

YES means that unless instructed otherwise, the Queue Manager logs transactions to these queues and can recover those messages in the event of failure or restart.



Q CONTROL TABLES -PARMTABLES

All control tables that are located on a node have to have the same schema.both the Q Capture and Q Apply on NODE 1 will use the replication schema ASN1 AND CONTROL TABLE ON NODE USE ASN2.

这里配置Q脚本是通过 ASNCLP配置的。它既支持交互方式,也支持批量方式。

举例说明

SET QMANAGER "MQ1A" FOR NODE 1;
SET QMANAGER "MQ1B" FOR NODE 2;
CREATE CONTROL TABLES FOR NODE 1 USING
CAPPARMS
RESTARTQ "MQ1.RESTARTQ"
ADMINQ "MQ1.ADMINQ"
MONITOR INTERVAL 10000
APPPARMS
IN ZOS PAGE LOCK DB DEMODB QCNTLAP CREATE
ROW LOCK DB DEMODB2 QCNTLAR CREATE
MONITOR INTERVAL 10000;
CREATE CONTROL TABLES FOR NODE 2 USING
CAPPARMS
RESTARTQ "MQ2.RESTARTQ"
ADMINQ "MQ2.ADMINQ"
MONITOR INTERVAL 10000
APPPARMS
IN ZOS PAGE LOCK DB RBBDEMO QCNTLAP CREATE
ROW LOCK DB RBBDEMO QCNTLAR CREATE
MONITOR INTERVAL 10000;

Create Q Maps

Two Q Maps were used for the first part of this exercise. Even though a single Q

Map can efficiently contain hundreds of Q Subscriptions, you might want to

separate your subscriptions into multiple Q Maps for administrative purposes.

这里即确定了表的订阅关系。

CREATE REPLQMAP BIDIND1_TO_BIDIND2_MAP1 ( NODE 1, NODE 2 ) USING
ADMINQ "BIDIND1.ADMINQ"
RECVQ "BIDIND1.TO.BIDIND2.DATAQ1"
SENDQ "BIDIND1.TO.BIDIND2.DATAQ1"
NUM APPLY AGENTS 2;
CREATE REPLQMAP BIDIND1_TO_BIDIND2_MAP2 ( NODE 1, NODE 2 ) USING
ADMINQ "BIDIND1.ADMINQ"
RECVQ "BIDIND1.TO.BIDIND2.DATAQ2"
SENDQ "BIDIND1.TO.BIDIND2.DATAQ2"
NUM APPLY AGENTS 2;

Create Q subscriptions

定义一张表属于哪一个Qmap,以及是否进行Q复制。

#
# Use MAP1 for TABLE1 and TABLE2
#
SET CONNECTION SOURCE DB1A.ASNB1 TARGET DB1C.ASNB2
REPLQMAP BIDIND1_TO_BIDIND2_MAP1 ;
SET TABLES (DB1.ASN1.TABLE1);
CREATE QSUB SUBTYPE U
FROM NODE DB1A.ASNB1 SOURCE HAS LOAD PHASE N
TARGET CONFLICT ACTION I
FROM NODE DB1C.ASNB2 SOURCE HAS LOAD PHASE N
TARGET CONFLICT ACTION F;

Q HOUSE-KEEPINT WORK ACTION

CHANLE ACTION:

START CHANNEL(MQ1A.TO.MQ1B)

DISPLAY CHSTATUS(MQ1BA.TO.MQ1B)

START CHANNEL(MQ1B.TO.MQ1A)

DISPLAY CHSTATUS(MQ1B.TO.MQ1A)

MQ QUEUE ACTION

+MQ2 CLEAR QLOCA(MQ2.SENDQ.Q1)
DISPLAY QLOCAL(MQ2.SENDQ.Q1) CURDEPTH

QAPPLY

Latency analysis

关于Q latency它与我们搭建Q的目标息息相关,特别是对IDC RPT/ROT 两个指标的影响巨大。因此分析Q lantency变成了DBA一项必须掌握的技能。要想进行latency分析,你必须明确latency是如何定义的,现有的工具以及如何进行调整。

LATENCY OUTLOOK

Control tables for replication monitoring

? IBMQREP_CAPMON

? IBMQREP_CAPQMON

? IBMQREP_APPLYMON

LATENCY

上图中每一个数字对应的动作如下:

t1 = transaction committed at the source site

t2 = the Q Capture program read the transaction from the log

t3 = the Q Capture program committed the message to the send queue

t4 = the Q Apply program read the message from the receive queue

t5 = the Q Apply successfully committed the transaction at the target site

对应的Q END-TO-END latency=T5-T1

即一个交易在目标端与源端commit之间的时间间隔。影响Q latency的因素一个有10个,具体的分布如下:

DETAIL LATENCY

需要说明的是背景为蓝色的影响因素对Q 的配置以及SQL语句关联较大。而白色背景部分的影响相对可控,前提是MQ,DBMS WORK WELL.

下面给出上述10个影响因素的定义,对应的monitor信息,以及如何调优。

seq Factor impacting latency Relevant statistics
1 Q Capture log latency IBMQREP_CAPMON:CURRENT_LOG_TIME
2 Q Capture time reading DB2 log records (DB2 IFI calls) IBMQREP_CAPMON:LOGREAD_API_TIME
3 Q Capture time waiting for memory IBMQREP_CAPMON:LOGRDR_SLEEPTIME,NUM_END_OF_LOGS
4 Q Capture time to put WebSphere MQ messages in a send queue and Q Capture time to commit all WebSphere MQ transactions IBMQREP_CAPQMON: MQPUT_TIME, QFULL_ERROR_COUNT IBMQREP_CAPMON:MQCMIT_TIME
5 WebSphere MQ time sending and staging WebSphere MQ messages not saved in monitor tables
6 Q Apply time to retrieve WebSphere MQ messages from a receive queue IBMQREP_APPLYMON:MQGET_TIME
7 Q Apply time waiting due to transaction dependencies IBMQREP_APPLYMON:DEPENDENCY_DELAY
8 Q Apply waiting for agents to pick up DBMS transactions ready to be applied IBMQREP_APPLYMON:WORKQ_WAIT_TIME
9 Q Apply time retrying SQL due to RI, unique violation,or deadlock IBMQREP_APPLYMON:RETRY_TIME
10 Q Apply time in DB2 processing successful transactions IBMQREP_APPLYMON:DBMS_TIME

CAPTURE LATENCY ANALYSIS

LOGREAD_API_TIME:IBMQREP_CAPMON The time spent in DB2 returning log records to Q Capture
NUM_END_OF_LOGS:IBMQREP_CAPMON table, lists the number of times that the Q Capture program reached the end of the DB2 log
LOGRDR_SLEEPTIME:IBMQREP_CAPMON LOGREADER SLEEP TIME FOR NUM_END_OF+_LOGS OR  it reached its memory limit threshold
MQPUT_TIME:IBMQREP_CAPMONThe time delay spent in WebSphere MQ to put new messages in the send queue 
MQCMIT_TIME: IBMQREP_CAPMON The time delay spent in WebSphere MQ to commit the MQ transactions 
XMITQDEPTH : IBMQREP_CAPQMON The transmit queue depth valu If the transmit queue depth keeps growing and there is a lot of I/O to the physical pageset of the transmit queue, this means that the capacity of the MQ channel has been exceeded

TRAN_BATCH_SZIE

平均每一个交易的大小 = MQ_BYTES/TRANS_PUBLISHED

平均每一个消息所包含的交易数目=MQ_MESSAGES /TRANS_PUBLISHED,该值应该近似等于TRAN_BATCH_SIZE

Q Apply latency

WORKQ_WAIT_TIME:IBMQREP_APPLYMON table tracks how long it takes transactions that are ready to be applied at the target site to be picked up by a Q Apply agent
RETRY_TIME     :IBMQREP_APPLYMON Q Apply is retrying some of the SQL statements included in
the replicated transactions,SUCH RI,UI,TIMEOUT,DEADLOCK
The DBMS_TIME  :IBMQREP_APPLYMON  tracks the average time spent in DB2 applying the transactions for a given Q Apply monitor interval. The transactions include user tables (replicated tables) and Q Replication tables (control table data that tracks the Q Apply progress)

QREP RECOVERY

STARTLRSN: lsn value from where it is safe to start reading the DB2 log

commit lsn: lsn value up to which it is safe to skip committed transactions as Q Capture is reading the log records, because the transactions were previously published.

本文所有的内容均整理自互联网,仅供参考学习,如有涉及版权问题,请自行删除本文,谢谢。

QREP RECOVERY

阅读(6015) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~