A Bored Person -- nothing but technology, and more technology, you know
Category: DB2/Informix
2016-03-02 22:45:04
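This post covers the CF, the most important component in implementing a DATA SHARING GROUP. It first presents the system architecture built around the CF and then explains why the CF was introduced. Next, from a product installation perspective, it tells the DBA what CFRM is, how a policy is defined and activated, and what the parameters mean. After the installation part, it introduces the two types of CF request, synchronous and asynchronous, and how synchronous requests are converted according to a heuristic algorithm. The CF contains three types of structure: LIST, CACHE, and LOCK. The list part is fairly simple, and the composition of the lock part was already covered in A DB2 Performance Tuning Roadmap --DIVE INTO LOCK, so the focus here is on the cache. Although this overlaps somewhat with the earlier post DSG CF 之cache 漫谈, the emphasis here is on the store-in cache, which matters most in production and which to a DBA means the GBP. Finally, it describes GBP shortage, which needs close attention in production operations. Only the definition of GBP shortage is given here; how to monitor and tune it will be covered later.
Some readers may wonder whether this strays from the A DB2 Performance Tuning Roadmap series. In fact, this material is indispensable groundwork for tuning: only with a thorough understanding of it can you later prescribe the right remedy.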
CF OVERVIEW
CF ARCHITECTURE SIMPLE OVERVIEW
WHY CF IMPORTED?
CF-CFRM POLICY
DEFINE-UTILITIES
IXCL1DSU
IXCMIAPU
CFRM ACTIVE AND HOUSE-KEEPING WORKS
Structure Allocation Procedures
CF STRUCTURE PROPERTIES RELATED TO DBA
CF REQUEST
CF Synchronous Request Processing
CF Asynchronous Request Processing
Sync CF Request Heuristic Algorithm
LIST-SCA
CACHE-OVERVIEW
CACHE STRUCTURE
CACHE STRUCTURE USAGE
STORE-IN CACHE STRUCTURE(GBP) PERFORMANCE FACTOR
CF CACHE STRUCTURE REPORT(RMF)
LOCK-DSNGROUP_LOCK1
SCENARIO OF GBP SHORTAGE
REF
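We start with a simple framework of the CF to give an overall feel for it. Each of these topics is discussed in more detail later.
My personal understanding is that the CF essentially exists for DATA SHARING. Unlike Oracle RAC, which implements a shared DBMS purely in software, IBM implements the DATA SHARING GROUP with hardware and software together, and it does well on RAS. This is not meant to belittle Oracle; it is just my personal view. The CF is mainly used as a high-speed cache.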
CF can be used as a high speed caching facility
data consistency / buffer validation
ability to maintain a shared copy of data in cache structure in CF
ability to keep track of shared data that does not reside in CF
permanent storage (i.e. disk)
local storage (i.e. z/OS or subsystem buffers)
high speed data access
Shared data can be stored in cache structure and made available to every system in sysplex
Invalid local copy of data can be refreshed with CF cached copy
CF access faster than I/O subsystem cache
FUNCTION: Creates the CFRM Couple Data Set
CONTENT OF DATASETS
POLICY
CF structure definitions used to set up and run a parallel sysplex, and are stored on the CFRM couple data set
Formatting the CFRM Couple Data Set
Because the definition of the CFRM policy data set is self-explanatory, it is not explained further here.
DATA TYPE(CFRM)
ITEM NAME(POLICY) NUMBER(6)
ITEM NAME(CF) NUMBER(2)
ITEM NAME(STR) NUMBER(200)
ITEM NAME(CONNECT) NUMBER(32)
Sample CFRM Policy GBP Definition
STRUCTURE NAME(groupname_GBP1)
SIZE(20000)
INITSIZE(15000)
PREFLIST(CF01, xxxx)
EXCLLIST(groupname_LOCK1,groupname_SCA)
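These parameters are important, so each is explained in turn:
SIZE: the maximum size of the structure, in KB
INITSIZE: the size allocated when the first DB2 connects, or when the structure is reallocated on a rebuild
PREFLIST: the list of candidate CFs; XES chooses an available CF in list order to allocate the structure
EXCLLIST: the exclusion list, naming structures that XES should try not to place in the same CF as this one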
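To make the parameter relationships concrete, here is a minimal Python sketch that parses the sample GBP definition above into a dictionary and checks that INITSIZE does not exceed SIZE. The function names `parse_structure` and `check_sizes` are made up for this illustration; they are not part of any IBM utility, and the real validation is done by IXCMIAPU.

```python
import re

def parse_structure(text):
    """Parse KEYWORD(value) pairs from a CFRM STRUCTURE statement into a dict."""
    params = {}
    for key, value in re.findall(r"(\w+)\(([^)]*)\)", text):
        if key in ("SIZE", "INITSIZE", "MINSIZE"):
            params[key] = int(value)                      # sizes are numeric (KB)
        elif key in ("PREFLIST", "EXCLLIST"):
            params[key] = [v.strip() for v in value.split(",")]
        else:
            params[key] = value.strip()
    return params

def check_sizes(params):
    """Return a list of problems with the size keywords (empty if none)."""
    problems = []
    if params.get("INITSIZE", 0) > params["SIZE"]:
        problems.append("INITSIZE exceeds SIZE")
    return problems

definition = ("STRUCTURE NAME(groupname_GBP1) SIZE(20000) INITSIZE(15000) "
              "PREFLIST(CF01, xxxx) EXCLLIST(groupname_LOCK1,groupname_SCA)")
gbp1 = parse_structure(definition)
print(gbp1["NAME"], gbp1["SIZE"], gbp1["INITSIZE"], check_sizes(gbp1))
```

The gap between INITSIZE and SIZE is the room left for the structure to grow later without a policy change.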
FUNCTION:Creates CFRM Policy
ACTIVE COMMAND
SETXCF START,POLICY,TYPE=CFRM,POLNAME=(xxxx)
D XCF,STR
IXC359I 13.52.26 DISPLAY XCF
STRNAME | ALLOCATION TIME | STATUS |
---|---|---|
DSNDB0A_GBP0 | 06/08/2006 15:12:41 | ALLOCATED |
DSNDB0B_GBP0 | | NOT ALLOCATED |
... | ... | ... |
ISTGENERIC | 06/08/2006 13:14:47 | POLICY CHANGE PENDING |
ISTGENERIC used by VTAM
Two IXC* structures are used by XCF signaling services. These structures are extremely important to data sharing, as they support XCF communication services, which are heavily used by DB2 and IRLM
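When DB2/IRLM connects for the first time, XES goes through a series of checks, as shown in the figure. If allocation in the first CF of the PREFLIST fails, XES allocates the structure in the second candidate CF of the PREFLIST.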
Placement Control for z/OS CF Structures
ENFORCEORDER(YES) causes the system to enforce the order of CFs in the preference list in the process of structure allocation.
ENFORCEORDER(NO) Allows the User To Override XES Decision On Structure Location
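The preference-list fallback can be sketched in a few lines of Python. This is only a simplified model that walks the list in strict order and checks free space; real XES allocation weighs more criteria (connectivity, exclusion list, and so on). The function name `choose_cf`, the CF name `CF02`, and the free-space figures are assumptions for the example.

```python
def choose_cf(preflist, free_kb, needed_kb):
    """Return the first CF in preference order with enough free space, else None."""
    for cf in preflist:
        if free_kb.get(cf, 0) >= needed_kb:
            return cf
    return None

# Hypothetical free space per CF, in KB.
free_space = {"CF01": 10000, "CF02": 50000}
print(choose_cf(["CF01", "CF02"], free_space, needed_kb=15000))
```

Here CF01 cannot hold the 15000 KB structure, so the second candidate is chosen.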
REBUILDPERCENT
This is used in recovery situations and should always be specified as 1; do not leave it blank.
3. 100% Loss of Connectivity - HOW TO RECOVER
A FlashCopy restore is a similar situation in which you need to recover.
If the GBP was defined with AUTOREC(YES), DB2 does an internal recovery of the GRECP page sets, accessing the logs of members that updated the affected page sets. If AUTOREC is set to NO for the GBP, then manual START DATABASE commands must be issued to recover the affected page sets.
DUPLEX - Sets the duplexing attribute
DISABLED - Duplexing turned off; run in simplex mode
ENABLED - Run with duplexed GBPs
ALLOWED - Duplexing can be started with a command
Sizing Control for DB2 Structures
INITSIZE(1500) MINSIZE(1500) ALLOWAUTOALT(YES) FULLTHRESHOLD(85)
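SIZE is one of the more important parameters; INITSIZE is the size at allocation time.
MINSIZE is the minimum size the structure needs in order to provide normal service.
ALLOWAUTOALT specifies whether XES may, when needed (for example on a storage shortage), automatically adjust the structure's RATIO or its size to relieve the shortage.
For a GBP, XES first tries to adjust the RATIO; if that does not resolve the storage shortage, it then increases the allocated size above INITSIZE.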
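The order of actions described above (adjust the ratio first, then grow) can be sketched as a small decision function. This is a rough model under stated assumptions, not the XES algorithm: the function name `auto_alter_step`, the `ratio_change_helps` flag and the fixed `step_kb` increment are all invented for the illustration. The sizes reuse the sample GBP definition and the threshold reuses FULLTHRESHOLD(85).

```python
def auto_alter_step(used_pct, full_threshold, allow_auto_alt,
                    ratio_change_helps, current_kb, max_kb, step_kb):
    """Return (action, new_size_kb) for one monitoring pass."""
    # Nothing happens unless auto alter is allowed and the threshold is reached.
    if not allow_auto_alt or used_pct < full_threshold:
        return ("none", current_kb)
    # First choice: rebalance directory entries against data elements.
    if ratio_change_helps:
        return ("adjust ratio", current_kb)
    # Second choice: grow the structure, never beyond SIZE.
    if current_kb < max_kb:
        return ("grow", min(current_kb + step_kb, max_kb))
    return ("at maximum", current_kb)

print(auto_alter_step(90, 85, True, False, 15000, 20000, 1000))
```

Once the structure has reached SIZE, only a policy change can give it more room.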
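I did not originally plan to cover this, but it is needed later for the performance material, so it is summarized here. The exploiters of requests are subsystems such as DB2: they issue the requests, and the CF processes them.
There are two types of request: asynchronous requests and synchronous requests.
The factors that affect a request are as follows: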
Speed of requesting CPU--a larger processor will ‘wait faster’ for a response
Busy conditions (Subchannel, path)
Requests sent to a CF may encounter busy conditions
Subchannel busy
Path busy
If busy condition is encountered
Some can be queued
Other request must complete synchronously (i.e. lock request)
CF accesses can be classified into two categories
Synchronous requests
Asynchronous requests
Time it takes to transmit data to the CF
CF link performance
Speed of data over link
Distance – Geographically Dispersed Parallel Sysplex?
Speed of CF processor
Shared LPAR versus dedicated LPAR
Shared CF versus Dedicated CF
Requesting processor spins waiting for CF request to complete
Two types of sync requests
Those that must continuously run as synchronous
Lock requests - XES spins
Those that start out as sync
But converted to async if doing so helps performance
Sync cache/list requests - XES changes to async
Requesting processor can do other processing while the CF request is queued
Immediate reply is not required
Async cache/list requests - XES queues
Serialized access is not required
Some async requests may have started out as sync requests
Sync cache/list requests - XES changes to async
This topic is actually more complicated; a general impression is enough. Its characteristics are:
‘self learning’ function available
The fundamental reason for this algorithm is to improve CPU/CF utilization without hurting performance.
Fact: Long running Sync CF requests use more CPU on the sender
During sync requests, the requesting z/OS system / LPAR spins waiting for the CF response
CF response times for sync requests are monitored
List, lock, and cache request times are compared to a threshold value
All long (and only long) requests are converted to async, no matter the reason
The threshold used is determined by certain factors
Whether it is a simplex or duplex environment
Lock and non-lock requests are normalized by processor type
Since it is heuristic
The decision threshold is continuously reevaluated by allowing every nth sync request to be issued unchanged; z/OS then compares the response time of that request with the current threshold values
So in coupling facility measurements you will still see elongated sync requests
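The sampling idea can be illustrated with a toy model in Python. It is not the z/OS implementation: the class name `SyncHeuristic`, the exponential averaging, the fixed threshold and the numbers are all assumptions chosen to show how "convert long requests, but keep every nth one synchronous to re-measure" behaves.

```python
class SyncHeuristic:
    def __init__(self, threshold_us, sample_every):
        self.threshold_us = threshold_us   # fixed here; z/OS re-derives its own values
        self.sample_every = sample_every   # every nth request stays synchronous
        self.avg_us = 0.0                  # running average of sync response time
        self.count = 0                     # requests seen so far

    def choose(self):
        """Decide how to issue the next request: 'sync' or 'async'."""
        self.count += 1
        # Every nth request is left synchronous so the estimate stays fresh.
        if self.count % self.sample_every == 0:
            return "sync"
        return "async" if self.avg_us > self.threshold_us else "sync"

    def record_sync(self, response_us, weight=0.5):
        """Fold a measured sync response time into the running average."""
        self.avg_us = weight * response_us + (1 - weight) * self.avg_us

h = SyncHeuristic(threshold_us=50, sample_every=4)
first = h.choose()            # nothing measured yet, so stay sync
h.record_sync(200)            # a slow response pushes the average over the threshold
later = [h.choose() for _ in range(3)]
print(first, later)
```

The sampled sync request is why elongated sync requests still show up in measurements even while conversion is active.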
DEFINITION: There is one list structure per data sharing group used as a Shared Communication Area (SCA) for the members of the group.
contents OF SCA
Logical Page List (LPL)
Database Exception Table (DBET)
Boot Strap Data Set (BSDS) - names of all members
Log data set - names of all members
Enabling RBA value
LRSN delta
Copy Pending
Write Error ranges
Image copy data for certain system data spaces
AIM
The SCA is also used to coordinate startup
DB2 uses the SCA to coordinate recovery--GROUP RESTART
CF can be used as a high speed caching facility
Cache structure made up of
directory to keep track of registered data elements
optionally, data elements
Usage of cache structure
data consistency / buffer validation
ability to maintain a shared copy of data in cache structure in CF
ability to keep track of shared data that does not reside in CF
permanent storage (i.e. disk)
local storage (i.e. z/OS or subsystem buffers)
high speed data access
Shared data can be stored in cache structure and made available to every system in sysplex
Invalid local copy of data can be refreshed with CF cached copy
CF access faster than I/O subsystem cache
CACHE WORK MECHANISM
DIRECTORY ENTRIES / DATA ENTRIES
Directory entries
Used to keep track of data entries that are shared among multiple systems
Every system that has a copy of a particular piece of shared data has a registration entry in this portion of the cache structure.
It is this directory whose entries are used to generate cross invalidation signals to indicate that a record in a local cache buffer may be invalid
Data entries
Used to contain a cached version of the data
Optional
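A small Python sketch may help picture how registrations in the directory drive cross-invalidation. It is a simplified model, not the CFCC data structure: the class name `CacheDirectory`, the member names and the page name are assumptions for the example.

```python
class CacheDirectory:
    def __init__(self):
        self.registered = {}   # page name -> set of systems holding a local copy

    def register(self, system, page):
        """Record that `system` has a local copy of `page`."""
        self.registered.setdefault(page, set()).add(system)

    def write(self, system, page):
        """Record an update by `system`; return the systems whose copy is invalidated."""
        others = self.registered.get(page, set()) - {system}
        # Only the writer keeps a valid registration after the update.
        self.registered[page] = {system}
        return sorted(others)

d = CacheDirectory()
for member in ("DB2A", "DB2B", "DB2C"):
    d.register(member, "P1")
invalidated = d.write("DB2A", "P1")
print(invalidated)
```

A member whose copy was invalidated has to register its interest again when it next reads the page.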
CONCRETE OVERVIEW
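There are three ways to use a cache; they were described in detail in DSG CF 之cache 漫谈.
The DB2 structure that works as a STORE-IN CACHE is the GBP, which will be covered later together with the LOCAL BUFFER POOL. Briefly, introducing the GBP turns the usual two-level cache into a three-level one, LBP + GBP + DISK CACHE (ignoring the L1 and L2 caches inside the CPU), so that data can be shared across DB2 systems.
The main difference between a STORE-THROUGH CACHE and a STORE-IN CACHE is whether an update must be written to disk synchronously. A STORE-IN CACHE writes to disk asynchronously, which is the CASTOUT process.
Here we look at the factors that affect GBP performance. The main ones are: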
SIZE OF DIRECTORY ENTRY
Is the size too small
Forcing cross invalidates
Forcing castout processing
SIZE OF DATA ENTRY
Is the size too small
Forcing cross invalidates
Forcing castout processing
SIZE OF DATA ENTRY RELATIVE TO DIRECTORY
READ FROM GBP
When shortage of space occurs--DIRECTORY RECLAIM
Directory entries for unchanged data are reclaimed via LRU algorithm
Buffer invalidation on host systems must occur
CF notifies all systems with a registered interest in the structure
Access times will suffer if the data needs to be re-accessed, because I/O must occur
Castout processing
Natural or being forced due to too small size of data entry
REQUEST Breakdown of sync versus async requests
DETAIL REASON Since many cache structures are duplex
SUBCH – delay due to Subchannel busy
PR WT – delay due to waiting on a peer to send
PR CMP – delay due to waiting on a peer to complete
READ/WRITE
READS - Number of read hits
Count of the number of times the CF returned data on a read request by any connector
WRITES - Number of writes to the CF structure
Count of times a connector placed changed or unchanged data into the CF structure
Conditions of Interest – Reads versus Writes
One key usage of a cache structure is to take advantage of caching the data in the CF for data sharing
Prefer to avoid file I/O
High Writes versus Low Reads
Never getting the benefit of caching the data
Condition may indicate:
Insufficient structure space allocated, and data entries (and perhaps directory entries) are being discarded by the coupling facility space management routines
Inappropriate allocation of the ratio of directory entry to data elements is causing the data entries to be discarded by the coupling facility space management routines
Note: For duplexed structures, expect secondary structure to have no/few reads
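The "high writes versus low reads" condition can be turned into a simple check over the READS and WRITES counters. The sketch below is only an illustration: the function name `low_reuse` and the 0.1 cut-off are assumptions, not an RMF or IBM recommendation, and the counter values are invented.

```python
def low_reuse(reads, writes, min_ratio=0.1):
    """Flag a structure whose read hits are small relative to its writes."""
    if writes == 0:
        return False          # nothing written, so nothing to judge
    return reads / writes < min_ratio

print(low_reuse(reads=50, writes=10000))
print(low_reuse(reads=8000, writes=10000))
```

For a duplexed structure the check only makes sense on the primary, since the secondary is expected to show few or no reads.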
CASTOUT
Number of times cast-out processing occurred (changed data)
This is a count of the number of times a connector retrieved a changed data entry,wrote the data to DASD and caused the changed attribute to be reset to unchanged.
Castouts due to reclaims are not desirable and will adversely affect the data base manager and/or the user of the data base manager
This counter is of interest for store-in cache structures (i.e. DB2 group buffer pool structures) in determining the volume of changed data being removed from the structure
Note: This counter is not an indicator of the number of times cast out processing was performed during the RMF interval.
A large amount of cast out activity on a single structure may warrant additional cache structures and redirecting locally buffered data to different cache structure.
Cast out processing by the connectors must keep pace with the rate at which changed data is placed in the structure
When all directory or data elements are associated with changed data, no new data items can be registered or written to the structure.
Data particular to Cache Structures (DATA ACCESS)==XI
XI’s - This is the number of times a data item residing in a local buffer pool was marked invalid by the coupling facility during the interval
XI'S count values are seen for directory, store-in and store-thru caches. This count reflects both the amount of data sharing among the users of the cache and the amount of write/update activity against the data bases.
To the cache structure user, this means the data item must be re-acquired from DASD or perhaps the coupling facility structure, and interest in the item must be re-registered in the coupling facility structure
There are several "XI counts" obtained from the coupling facility which are consolidated into this value. They are:
XI for Write
XI for Name Invalidation
XI for Complement Invalidation
XI for Local Cache Vector Entry Replacement
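This was already covered in A DB2 Performance Tuning Roadmap --DIVE INTO LOCK. The lock structure makes cross-system concurrency control possible and keeps concurrent access to the data consistent.
One addition: the command to display DB2 RETAINED LOCKS on the system is as follows:
Displaying Retained Locks with a z/OS Command
F IRA1IRLM,STATUS,ALLD
Sizes Need to be Estimated. The LIST and LOCK sizes are relatively easy to estimate; the troublesome one is the GBP size, and if the GBP becomes unavailable the consequences are severe.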
What if No Room in the GBP Cache for Updated Pages?
What if No Room in the GBP for Directory Entries
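These two problems mean the CF needs an adaptive mechanism to adjust the directory and element sizes automatically. The CF provides two mechanisms. The first is DIRECTORY RECLAIM (think of it as an adjustment inside the GBP): XES, together with the CFCC, adjusts the directory-to-element ratio as needed. The second is GBP auto alter (enlarging the GBP): XES, together with the CFCC, adjusts the size of the whole GBP as needed (Structure Extensions).
With DIRECTORY RECLAIM, the GBP uses an LRU algorithm to steal the least recently used directory entry.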
It will scan those directory entries pointing to locally cached pages,and find the oldest one – the page that has not been referenced for ‘awhile’, and has moved to the end of the least recently used chain. It will steal, or ‘reclaim’ that entry for the new page’s information
DIRECTORY RECLAIM only happens when the GBP can no longer provide service as it is, for example on a directory shortage or an element shortage.
DSNB787I - RECLAIMS
FOR DIRECTORY ENTRIES = 1512
FOR DATA ENTRIES = 5510
CASTOUTS = 12102
DSNB788I - CROSS INVALIDATIONS
DUE TO DIRECTORY RECLAIMS = 1642
DUE TO WRITES = 3124
EXPLICIT = 0
DSNB787I displays counters for RECLAIMS: the number of times DB2 needed to ‘steal’ entries for new requests, such as for a directory entry
DSNB788I reports the number of cross-invalidations performed by the CF. If the first counter, ‘DUE TO DIRECTORY RECLAIMS’, is too high, and there is a large amount of re-referencing of locally cached data pages, there is a high performance cost
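Note that the number of cross-invalidations triggered by directory reclaims is not equal to the number of directory reclaims; the former is larger. Reclaiming one directory entry may trigger several XIs. As a result, each XI caused by a directory entry reclaim produces a synchronous I/O, which is a side effect of XI for the system.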
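Why the XI count can exceed the reclaim count is easy to see in a toy LRU directory. The Python sketch below is a simplified model under stated assumptions, not how the CFCC manages storage: the class name `GbpDirectory`, the tiny capacity, and the member and page names are invented for the illustration.

```python
from collections import OrderedDict

class GbpDirectory:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # page -> registered members, oldest first
        self.reclaims = 0
        self.xi_due_to_reclaims = 0

    def register(self, member, page):
        if page in self.entries:
            self.entries.move_to_end(page)            # mark as most recently used
        else:
            if len(self.entries) >= self.capacity:
                # Directory shortage: steal the least recently used entry.
                _, members = self.entries.popitem(last=False)
                self.reclaims += 1
                # Every member registered for that page gets its copy invalidated.
                self.xi_due_to_reclaims += len(members)
            self.entries[page] = set()
        self.entries[page].add(member)

d = GbpDirectory(capacity=2)
d.register("DB2A", "P1")
d.register("DB2B", "P1")
d.register("DB2A", "P2")
d.register("DB2A", "P3")   # no free entry left, so P1's entry is reclaimed
print(d.reclaims, d.xi_due_to_reclaims)
```

One reclaim here invalidates two local copies, which matches the pattern in the sample DSNB788I output where cross-invalidations due to directory reclaims (1642) outnumber directory reclaims (1512).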
DSNB789I - REGISTER PAGE LIST = 545025
RETRIEVE CHANGED PAGES = 11106
RETRIEVE CLEAN PAGES = 7102
FAILED READS DUE TO LACK OF STORAGE = 0
‘REGISTER PAGE LIST’ - Number of RPL requests to the CF to register a list of pages (when PREFETCH - sequential or list - is in effect.)
‘RETRIEVE CHANGED PAGES’ - CF requests to retrieve pages marked as ‘changed’ in the CF as a result of feedback from registering that page.
‘FAILED READS DUE TO LACK OF STORAGE’ - Number of CF reads that failed because of lack of storage for a directory entry.
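All content in this post was compiled from the Internet and is for reference and study only. If any of it raises a copyright issue, please delete this post. Thank you.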