小秦我在这里总结了下DB2中比较常用的动态视图和监控表函数。等有空的时候会再发一篇和监控有关的文章,介绍一下DB2中监控的具体组件以及相关的监控模式、监控参数。
查看容器信息:
db2 “select tbsp_name,container_name,container_type,total_pages,usable_pages,accessible,1.0*fs_total_size_kb/1024 as fs_total_size_mb,1.0*fs_used_size_kb/1024 as fs_used_size_mb,1.0*fs_used_size_kb/fs_total_size_kb as fs_used_pct from SYSIBMADM.CONTAINER_UTILIZATION”
查看app的rows read/rows return:
db2 “select application_handle,rows_read_per_rows_returned from sysibmadm.mon_connection_summary”
查看当前正在执行的耗时最长的SQL:
db2 “select elapsed_time_sec,activity_state,activity_type,application_handle from sysibmadm.mon_current_sql order by elapsed_time_sec desc fetch first 5 rows only”
查看缓冲池的效率:
db2 “SELECT BP_NAME, AVG_WRITE_TIME, SYNC_WRITES_PERCENT,AVG_SYNC_WRITE_TIME, AVG_ASYNC_WRITE_TIME FROM SYSIBMADM.MON_BP_UTILIZATION”
查看某个服务类中的各种耗时元素信息,单位是毫秒:
db2 “SELECT SUM(TOTAL_WAIT_TIME) AS WAIT,SUM(TOTAL_COMPILE_PROC_TIME) AS COMPILE,SUM(TOTAL_IMPLICIT_COMPILE_PROC_TIME) AS IMP_COMPILE,SUM(TOTAL_SECTION_PROC_TIME) AS SECTION,SUM(total_routine_user_code_proc_time) as ROUTINE_USER_CODE,SUM(TOTAL_COMMIT_PROC_TIME) AS COMMIT,SUM(TOTAL_REORG_PROC_TIME) AS REORG,SUM(TOTAL_RUNSTATS_PROC_TIME) AS RUNSTATS,SUM(TOTAL_ROLLBACK_PROC_TIME) AS ROLLBACK,SUM(TOTAL_LOAD_PROC_TIME) AS LOAD,SUM(total_connect_request_proc_time) as CONNECT_REQUEST FROM TABLE(MON_GET_SERVICE_SUBCLASS(‘SYSDEFAULTUSERCLASS’,’SYSDEFAULTSUBCLASS’,NULL))”
查看总的耗用时间百分比:
db2 “WITH PCTPROC AS (SELECT SUM(TOTAL_SECTION_TIME) AS SECT_TIME, SUM(TOTAL_SECTION_PROC_TIME) AS SECT_PROC_TIME,SUM(TOTAL_COMPILE_TIME) AS COMP_TIME, SUM(TOTAL_COMPILE_PROC_TIME) AS COMP_PROC_TIME,SUM(TOTAL_IMPLICIT_COMPILE_TIME) AS IMP_C_TIME, SUM(TOTAL_IMPLICIT_COMPILE_PROC_TIME) AS IMP_C_PROC_TIME,SUM(TOTAL_COMMIT_TIME) AS COMMIT_TIME, SUM(TOTAL_COMMIT_PROC_TIME) AS COMMIT_PROC_TIME,SUM(TOTAL_ROLLBACK_TIME) AS ROLLBACK_TIME, SUM(TOTAL_ROLLBACK_PROC_TIME) AS ROLLBACK_PROC_TIME,SUM(TOTAL_RUNSTATS_TIME) AS RUNSTATS_TIME, SUM(TOTAL_RUNSTATS_PROC_TIME)AS RUNSTATS_PROC_TIME,SUM(TOTAL_REORG_TIME) AS REORG_TIME, SUM(TOTAL_REORG_PROC_TIME) AS REORG_PROC_TIME,SUM(TOTAL_LOAD_TIME) AS LOAD_TIME, SUM(TOTAL_LOAD_PROC_TIME) AS LOAD_PROC_TIME FROM TABLE(MON_GET_CONNECTION(NULL, -2)) AS METRICS) SELECT CASE WHEN SECT_TIME > 0 THEN DEC((FLOAT(SECT_PROC_TIME) / FLOAT(SECT_TIME)) * 100,5,1) ELSE NULL END AS SECT_PROC_PCT, CASE WHEN COMP_TIME > 0 THEN DEC((FLOAT(COMP_PROC_TIME) / FLOAT(COMP_TIME)) * 100,5,1) ELSE NULL END AS COMPILE_PROC_PCT,CASE WHEN IMP_C_TIME > 0 THEN DEC((FLOAT(IMP_C_PROC_TIME) / FLOAT(IMP_C_TIME)) * 100,5,1) ELSE NULL END AS IMPL_COMPILE_PROC_PCT, CASE WHEN ROLLBACK_TIME > 0 THEN DEC((FLOAT(ROLLBACK_PROC_TIME) / FLOAT(ROLLBACK_TIME)) * 100,5,1) ELSE NULL END AS ROLLBACK_PROC_PCT, CASE WHEN COMMIT_TIME > 0 THEN DEC((FLOAT(COMMIT_PROC_TIME) / FLOAT(COMMIT_TIME)) * 100,5,1) ELSE NULL END AS COMMIT_PROC_PCT, CASE WHEN RUNSTATS_TIME > 0 THEN DEC((FLOAT(RUNSTATS_PROC_TIME) / FLOAT(RUNSTATS_TIME)) * 100,5,1) ELSE NULL END AS RUNSTATS_PROC_PCT, CASE WHEN REORG_TIME > 0 THEN DEC((FLOAT(REORG_PROC_TIME) / FLOAT(REORG_TIME)) * 100,5,1) ELSE NULL END AS REORG_PROC_PCT, CASE WHEN LOAD_TIME > 0 THEN DEC((FLOAT(LOAD_PROC_TIME) / FLOAT(LOAD_TIME)) * 100,5,1) ELSE NULL END AS LOAD_PROC_PCT FROM PCTPROC”
根据lock名获取信息:
db2 “SELECT SUBSTR(NAME,1,20) AS NAME,SUBSTR(VALUE,1,50) AS VALUE FROM TABLE( MON_FORMAT_LOCK_NAME(‘03001100000000000000000054’)) as LOCK”
MON_GET_ACTIVITY:只会的获取正在运行的application信息。换句话说,应该就不能查看历史信息了,除非使用ACTIVITY事件监视器:
一些基础信息:
[db2inst1@DB2_105 ~]$ db2 “select application_handle,LOCAL_START_TIME,uow_id,activity_id,parent_uow_id,parent_activity_id,activity_state,activity_type FROM table(MON_GET_ACTIVITY(NULL, -1))”
第一组:
db2 “select application_handle,uow_id,activity_id,total_cpu_time,rows_read,rows_returned,query_cost_estimate FROM table(MON_GET_ACTIVITY(NULL, -1))”
第二组:
db2 “select application_handle,uow_id,activity_id,direct_reads,direct_writes,effective_isolation,effective_lock_timeout,effective_query_degree FROM table(MON_GET_ACTIVITY(NULL, -1))”
第三组:
db2 “select application_handle,uow_id,activity_id,direct_reads,direct_writes,effective_isolation,effective_lock_timeout,effective_query_degree FROM table(MON_GET_ACTIVITY(NULL, -1))”
第四组:
db2 “select application_handle,uow_id,activity_id,stmt_text,eff_stmt_text FROM table(MON_GET_ACTIVITY(NULL, -1))”
获取mempool信息,单位都是KB:
db2 “SELECT varchar(memory_set_type, 20) AS set_type,varchar(memory_pool_type,20) AS pool_type,varchar(db_name, 20) AS dbname,memory_pool_used,memory_pool_used_hwm FROM TABLE(MON_GET_MEMORY_POOL(NULL, CURRENT_SERVER, -2))”
MON_GET_PKG_CACHE_STMT:和上面那个MON_GET_ACTIVITY的区别在于,上面那个东西只会的去查找正在运行的XXX,而这个则从package cache里找。不过,对于那些package cache中已经不存在的东东来说,就看不到啦,要看的话开package cache监视器:
第一组:
db2 “select section_type,effective_isolation,num_executions FROM TABLE(MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2))”
第二组:
db2 “select lock_wait_time,log_buffer_wait_time,log_disk_wait_time,evmon_wait_time,diaglog_write_wait_time,pool_read_time,pool_write_time,direct_read_time,direct_write_time,prefetch_wait_time,total_section_proc_time FROM TABLE(MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2))”
获取Bufferpool命中率:
db2 “WITH BPMETRICS AS (SELECT bp_name, pool_data_l_reads + pool_temp_data_l_reads +pool_index_l_reads + pool_temp_index_l_reads +pool_xda_l_reads + pool_temp_xda_l_reads as logical_reads, pool_data_p_reads + pool_temp_data_p_reads +pool_index_p_reads + pool_temp_index_p_reads +pool_xda_p_reads + pool_temp_xda_p_reads as physical_reads,member FROM TABLE(MON_GET_BUFFERPOOL(”,-2)) AS METRICS) SELECT VARCHAR(bp_name,20) AS bp_name,logical_reads,physical_reads,CASE WHEN logical_reads > 0 THEN DEC((1 – (FLOAT(physical_reads) / FLOAT(logical_reads))) * 100,5,2) ELSE NULL END AS HIT_RATIO,member FROM BPMETRICS”
或
db2 “select substr(bp_name,1,30) as BP_NAME,data_hit_ratio_percent,index_hit_ratio_percent,prefetch_ratio_percent from sysibmadm.mon_bp_utilization where bp_name not like ‘IBMSYSTEM%'”
查看热表:
db2 “select substr(tabschema,1,15) as tabschema,substr(tabname,1,15) as tabname,TAB_TYPE,TABLE_SCANS,ROWS_READ,(ROWS_INSERTED+ROWS_UPDATED+ROWS_DELETED) as rows_IUD from table(MON_GET_TABLE(null,null,null)) order by rows_read desc”
查看索引的相关信息:
db2 “select substr(mon.tabname,1,16) as table,member,substr(cat.indname,1,16) as IX_Name,mon.IID as Index_id,mon.index_scans, mon.index_only_scans from table(mon_get_index(NULL,NULL,-2)) as mon,syscat.indexes as cat where mon.tabname=cat.tabname and mon.tabschema=cat.tabschema and mon.iid=cat.iid order by mon.tabname”
查看排序的相关信息:
db2 “with dbcfg1 as(select int(value) as sheapthres_shr from sysibmadm.dbcfg where name=’sheapthres_shr’) select sheapthres_shr,sort_shrheap_allocated,dec((100*sort_shrheap_allocated)/sheapthres_shr,5,2) as sheap_alloc_pct,dec((100*sort_shrheap_top)/sheapthres_shr,5,2) as MAX_SHEAP_ALLOC_PCT,total_sorts,total_sort_time,1.0 * total_sort_time/(total_sorts + 1) as avg_sort_time,sort_overflows,1.0 * sort_overflows / (total_sorts+1) as overflow_pct,active_sorts from dbcfg1,sysibmadm.snapdb”
查看包缓存的一些信息:
db2 “with dbcfg1 as(select int(value) as pckcachesz from sysibmadm.dbcfg where name=’pckcachesz’) select pckcachesz,pkg_cache_lookups,pkg_cache_inserts,pkg_cache_num_overflows,100*pkg_cache_size_top/(pckcachesz*4096) as pkg_cache_alloc_pct from dbcfg1,sysibmadm.snapdb”
查看编目缓存的一些信息:
db2 “with dbcfg1 as(select int(value) as catcachesz_pages from sysibmadm.dbcfg where name=’catalogcache_sz’) select catcachesz_pages,CAT_CACHE_LOOKUPS,CAT_CACHE_INSERTS,CAT_CACHE_OVERFLOWS,(1.0-1.0*CAT_CACHE_INSERTS/CAT_CACHE_LOOKUPS) as cat_insert_pct,100*CAT_CACHE_SIZE_TOP/(catcachesz_pages*4096) as cat_cache_alloc_pct from dbcfg1,sysibmadm.snapdb”
查看日志使用信息:
db2 “select 1.0*TOTAL_LOG_AVAILABLE/1024/1024 as total_log_available_mb,1.0*TOTAL_LOG_USED/1024/1024 as total_log_used_mb,1.0*SEC_LOG_USED_TOP/1024/1024 as sec_log_used_top_mb,1.0*TOT_LOG_USED_TOP/1024/1024 as total_log_used_top,1.0*SEC_LOGS_ALLOCATED/1024/1024 as sec_logs_allocated_mb,APPL_ID_OLDEST_XACT from TABLE(SNAP_GET_DB(CAST (NULL AS VARCHAR(128)), -2))”
查看日志速率:
db2 “select log_reads,(1.0*LOG_READ_TIME_S+1.0*LOG_READ_TIME_NS)/1000000000.0 as log_read_time_s,LOG_WRITES,(1.0*LOG_WRITE_TIME_S + 1.0*LOG_WRITE_TIME_NS)/1000000000.0 as log_write_time_s,NUM_LOG_WRITE_IO,NUM_LOG_READ_IO,(1.0*LOG_READ_TIME_S+1.0*LOG_READ_TIME_NS)/1000000000.0/(1.0+num_log_read_io) as read_io_speed_s,(1.0*LOG_WRITE_TIME_S + 1.0*LOG_WRITE_TIME_NS)/1000000000.0/(1.0+num_log_write_io) as write_io_speed_s,NUM_LOG_PART_PAGE_IO,NUM_LOG_BUFFER_FULL,NUM_LOG_DATA_FOUND_IN_BUFFER FROM TABLE(SNAP_GET_DB(CAST (NULL AS VARCHAR(128)), -2))”
语句执行时间,可以调整那个order by来看:
db2 “select num_executions,total_act_time as total_time_ms,(total_act_time/num_executions) as avg_time_ms,total_sorts,(total_sorts/num_executions) as sorts_per_stmt,substr(stmt_text,1,35) as sql_stmt from table(mon_get_pkg_cache_stmt(‘d’,NULL,NULL,-1)) as dyn_cache where num_executions > 0 order by 2 desc fetch first 5 rows only”
查看长时间运行的SQL语句:
db2 “select substr(application_name,1,16) as appl_name,elapsed_time_sec as elapsed_secondes,substr(activity_state,1,20) as status,substr(session_auth_id,1,16) as auth_id,total_cpu_time,rows_returned,substr(stmt_text,1,30) as sql_stmt from sysibmadm.mon_current_sql order by 2 desc”
查看当前app的wait time:
db2 “select application_handle as app_id,total_wait_time,pool_read_time,pool_write_time,log_disk_wait_time,log_buffer_wait_time,tcpip_send_wait_time,tcpip_recv_wait_time,lock_wait_time from table(mon_get_connection(NULL,-1)) order by total_wait_time”
查看锁等:
db2 “select substr(lw.hld_application_name,1,10) as hold_app,substr(lw.hld_userid,1,10) as holder,substr(HLD_CURRENT_STMT_TEXT,1,16) as holder_stmt,substr(lw.req_application_name,1,10) as wait_app,substr(lw.req_userid,1,10) as waiter,substr(REQ_STMT_TEXT,1,16) as waiter_stmt,lw.lock_mode,lw.lock_object_type,substr(lw.tabname,1,10) as tabname,substr(lw.tabschema,1,10) as schema,lw.lock_wait_elapsed_time as waiting_seconds from sysibmadm.mon_lockwaits lw”
查看持有的锁信息:
db2 “select substr(conn.application_name,1,10) as application,substr(conn.system_auth_id,1,10) as authid,conn.num_locks_held as number_of_locks,conn.lock_escals as escalations,conn.lock_timeouts as lock_timeouts,conn.deadlocks as deadlocks,(conn.lock_wait_time/100) as lock_wait_time_ms from table(mon_get_connection(null,-1)) as conn”
查看锁内存相关:
db2 “with dbcfg1 as ( select float(bigint(value)*4096) as locklist from sysibmadm.dbcfg where name=’locklist’),dbcfg2 as (select float(bigint(value)) as maxlocks from sysibmadm.dbcfg where name=’maxlocks’) select dec((lock_list_in_use/locklist)*100,4,1) as LOCK_LIST_USED_PCT,dec((lock_list_in_use/(locklist*(maxlocks/100))*100),4,1) as TO_MAXLOCK_PCT,appls_cur_cons as NUMBER_OF_CONS,lock_list_in_use/appls_cur_cons as avg_lock_mem_per_con_bytes from dbcfg1,dbcfg2,sysibmadm.snapdb”
查看全表扫描的信息:
db2 “select substr(session_auth_id,1,10) as auth_id,substr(application_name,1,20) as appl_name,io_wait_time_percent as io_wait_pct,rows_read_per_rows_returned as rows_read_vs_returned from sysibmadm.mon_connection_summary”
查看脏页换取的百分比:
db2 “with db_snap as(select float(pool_drty_pg_steal_clns) as pg_steal,float(pool_drty_pg_thrsh_clns) as chg_pg_thrsh,float(pool_lsn_gap_clns) as softmax,float(pool_drty_pg_steal_clns+pool_drty_pg_thrsh_clns+pool_lsn_gap_clns) as total_clns from sysibmadm.snapdb where (pool_drty_pg_steal_clns+pool_drty_pg_thrsh_clns+pool_lsn_gap_clns) <> 0) select dec((pg_steal/total_clns)*100,4,1) as steals_pct,dec((chg_pg_thrsh/total_clns)*100,4,1) as threshold_pct,dec((softmax/total_clns)*100,4,1) as softmax_pct from db_snap”
查看预取方面的信息(asny读就是直接读数据,而不是从索引读数据,且一开始读的时候还不在缓存池中,sync读就死直接从索引读数据,且不用从索引在去扫表):
db2 “with bp_info as( select substr(bp_name,1,30) as bp_name,unread_prefetch_pages,pool_async_data_reads+pool_async_index_reads as async_reads,pool_data_p_reads+pool_index_p_reads+pool_temp_data_p_reads+pool_temp_index_p_reads as total_reads from table(mon_get_bufferpool(null,-1)) as bp_stats where bp_name not like ‘IBMSYSTEM%’) select bp_name,unread_prefetch_pages,dec(100*(total_reads-async_reads)/total_reads,5,2) as sync_reads_pct,dec(100*unread_prefetch_pages/total_reads,5,2) as unread_pages_pct from bp_info”
查看memory pool的信息:
db2 “select pool_id,pool_secondary_id,pool_cur_size,pool_watermark from sysibmadm.snapdb_memory_pool order by 1,2”
查看最老事务的信息:
db2 “select substr(uow.workload_occurrence_state,1,20) as status,substr(uow.session_auth_id,1,10) as authid,uow.application_handle as appl_handle,int(uow.uow_log_space_used/1024/1024) as log_used_MB,uow.total_act_time as total_activity_time_ms,uow.total_act_wait_time as total_activity_wait_time,uow.uow_start_time as uow_start_time from sysibmadm.snapdb db,table(mon_get_unit_of_work(null,-1)) as uow where uow.application_handle=db.appl_id_oldest_xact”
查看表空间相关信息:
db2 “select substr(tbsp_name,1,30) as tablespace_name,tbsp_type as type,TBSP_USING_AUTO_STORAGE,TBSP_AUTO_RESIZE_ENABLED,substr(tbsp_state,1,20) as status,(tbsp_total_size_kb/1024) as size_MB,(100-tbsp_utilization_percent) as free_space_pct,(((100-tbsp_utilization_percent)*tbsp_usable_size_kb)/100000) as free_space_mb from sysibmadm.mon_tbsp_utilization”
获取PATH的信息:
db2 “select substr(type,1,20) as type,substr(path,1,50) as path from sysibmadm.dbpaths order by type”
当前connection的一些信息,如平均CPU使用时间、ROWS_READ_PER_ROWS_RETURNED:
db2 “select application_handle as app_hdl,total_app_commits,total_app_rollbacks,avg_rqst_cpu_time,ROWS_READ_PER_ROWS_RETURNED from sysibmadm.MON_CONNECTION_SUMMARY”
当前connection中wait time在总请求时间中的所占比例:
db2 “select application_handle as app_hdl,ROUTINE_TIME_RQST_PERCENT,RQST_WAIT_TIME_PERCENT from sysibmadm.MON_CONNECTION_SUMMARY”
当前connection中各种WAIT TIME所占时间百分比(IO里没有算LOG的时间):
db2 “select application_handle as app_hdl,IO_WAIT_TIME_PERCENT,LOCK_WAIT_TIME_PERCENT,AGENT_WAIT_TIME_PERCENT,NETWORK_WAIT_TIME_PERCENT from sysibmadm.MON_CONNECTION_SUMMARY”
当前connection中实际干活的用的时间的百分比:
db2 “select application_handle as app_hdl,SECTION_SORT_PROC_TIME_PERCENT,(SECTION_PROC_TIME_PERCENT-SECTION_SORT_PROC_TIME_PERCENT) as SECTION_NON_SORT_TIME_PERCENT,COMPILE_PROC_TIME_PERCENT,TRANSACT_END_PROC_TIME_PERCENT,UTILS_PROC_TIME_PERCENT from sysibmadm.MON_CONNECTION_SUMMARY”
获取当前rows read/rows returned前10的SQL:
db2 “select APPLICATION_HANDLE,substr(ACTIVITY_STATE,1,15) as state,substr(ACTIVITY_TYPE,1,15) as type,ELAPSED_TIME_SEC,TOTAL_CPU_TIME,ROWS_READ,ROWS_RETURNED,QUERY_COST_ESTIMATE,substr(STMT_TEXT,1,15) as stmt from sysibmadm.mon_current_sql order by ROWS_READ/(ROWS_RETURNED+1) desc fetch first 10 rows only”
获取热表信息:
db2 “SELECT varchar(tabschema,20) as tabschema, varchar(tabname,20) as tabname, sum(rows_read) as total_rows_read, sum(rows_inserted) as total_rows_inserted, sum(rows_updated) as total_rows_updated, sum(rows_deleted) as total_rows_deleted,sum(TABLE_SCANS) as TABLE_SCANS FROM TABLE(MON_GET_TABLE(”,”,-2)) AS t GROUP BY tabschema, tabname ORDER BY TABLE_SCANS DESC fetch first 20 rows only”
–Time Span Monitor SQL
1.整体级别查看(在服务类、workload级别):
查看某个服务类中的各种耗时元素信息,单位是毫秒。先用这个看下某个服务子类下的耗时的实际情况:
db2 “SELECT SUM(TOTAL_WAIT_TIME) AS WAIT,SUM(TOTAL_COMPILE_PROC_TIME) AS COMPILE,SUM(TOTAL_IMPLICIT_COMPILE_PROC_TIME) AS IMP_COMPILE,SUM(TOTAL_SECTION_PROC_TIME) AS SECTION,SUM(total_routine_user_code_proc_time) as ROUTINE_USER_CODE,SUM(TOTAL_COMMIT_PROC_TIME) AS COMMIT,SUM(TOTAL_REORG_PROC_TIME) AS REORG,SUM(TOTAL_RUNSTATS_PROC_TIME) AS RUNSTATS,SUM(TOTAL_ROLLBACK_PROC_TIME) AS ROLLBACK,SUM(TOTAL_LOAD_PROC_TIME) AS LOAD,SUM(total_connect_request_proc_time) as CONNECT_REQUEST FROM TABLE(MON_GET_SERVICE_SUBCLASS(‘SYSDEFAULTUSERCLASS’,’SYSDEFAULTSUBCLASS’,NULL))”
老版本的DB2用这个:
db2 “SELECT SUM(TOTAL_WAIT_TIME) AS WAIT,SUM(TOTAL_COMPILE_PROC_TIME) AS COMPILE,SUM(TOTAL_IMPLICIT_COMPILE_PROC_TIME) AS IMP_COMPILE,SUM(TOTAL_SECTION_PROC_TIME) AS SECTION,SUM(TOTAL_COMMIT_PROC_TIME) AS COMMIT,SUM(TOTAL_REORG_PROC_TIME) AS REORG,SUM(TOTAL_RUNSTATS_PROC_TIME) AS RUNSTATS,SUM(TOTAL_ROLLBACK_PROC_TIME) AS ROLLBACK,SUM(TOTAL_LOAD_PROC_TIME) AS LOAD FROM TABLE(MON_GET_SERVICE_SUBCLASS(‘SYSDEFAULTUSERCLASS’,’SYSDEFAULTSUBCLASS’,NULL))”
或用这个基于XML的:
db2 “SELECT SUBSTR(T.SERVICE_SUPERCLASS_NAME,1,19) as SUPERCLASS,SUBSTR(T.SERVICE_SUBCLASS_NAME,1,19) as SUBCLASS,T.MEMBER,SUBSTR(U.METRIC_NAME, 1,20) AS METRIC_NAME,SUBSTR(U.PARENT_METRIC_NAME,1,20) AS PARENT_NAME,U.TOTAL_TIME_VALUE,U.COUNT FROM TABLE(MON_GET_SERVICE_SUBCLASS_DETAILS(NULL, NULL, -2)) AS T, TABLE(MON_FORMAT_XML_TIMES_BY_ROW(T.DETAILS)) AS U where U.PARENT_METRIC_NAME is null or U.PARENT_METRIC_NAME=’TOTAL_RQST_TIME’ order by superclass,subclass,total_time_value desc”
然后,如果发现等待时间过长,就要用下面这个去看下到底是哪一部分的等待时间过长:
db2 “SELECT sum(agent_wait_time) as agent,sum(wlm_queue_time_total) as wlm,sum(lock_wait_time) as lock,sum(log_buffer_wait_time) as log_buffer,sum(log_disk_wait_time) as log_disk,sum(tcpip_recv_wait_time+tcpip_send_wait_time) as tcpip,sum(ipc_recv_wait_time+ipc_send_wait_time) as ipc,sum(fcm_recv_wait_time+fcm_send_wait_time) as fcm,sum(audit_subsystem_wait_time+audit_file_write_wait_time) as audit,sum(diaglog_write_wait_time) as diaglog,sum(pool_write_time) as pool_write,sum(pool_read_time) as pool_read,sum(direct_read_time) as direct_read,sum(direct_write_time) as direct_write,sum(evmon_wait_time) as evmon,sum(total_extended_latch_wait_time) as latch,sum(prefetch_wait_time) as prefetch,sum(comm_exit_wait_time) as comm_exit,sum(ida_send_wait_time+ida_recv_wait_time) as ida,sum(cf_wait_time) as cluster_caching_facility,sum(reclaim_wait_time) as reclaim,sum(spacemappage_reclaim_wait_time) as space_map FROM TABLE(MON_GET_SERVICE_SUBCLASS(‘SYSDEFAULTUSERCLASS’,’SYSDEFAULTSUBCLASS’,NULL))”
老版本的用这个:
db2 “SELECT sum(agent_wait_time) as agent,sum(wlm_queue_time_total) as wlm,sum(lock_wait_time) as lock,sum(log_buffer_wait_time) as log_buffer,sum(log_disk_wait_time) as log_disk,sum(tcpip_recv_wait_time+tcpip_send_wait_time) as tcpip,sum(ipc_recv_wait_time+ipc_send_wait_time) as ipc,sum(fcm_recv_wait_time+fcm_send_wait_time) as fcm,sum(pool_write_time) as pool_write,sum(pool_read_time) as pool_read,sum(direct_read_time) as direct_read,sum(direct_write_time) as direct_write FROM TABLE(MON_GET_SERVICE_SUBCLASS(‘SYSDEFAULTUSERCLASS’,’SYSDEFAULTSUBCLASS’,NULL))”
或者,用下面这个更加方便的基于XML的东东,这里看到的TOTAL_WAIT_TIME是下面总的时间之和:
db2 “SELECT SUBSTR(TFXML.WORKLOAD_NAME, 1, 13) AS WORKLOAD_NAME,SUBSTR(WAITS.METRIC_NAME, 1, 25) AS METRIC_NAME,WAITS.TOTAL_TIME_VALUE,WAITS.COUNT FROM TABLE( MON_GET_WORKLOAD_DETAILS( NULL, -2 ) ) AS TFXML,TABLE( MON_FORMAT_XML_WAIT_TIMES_BY_ROW(TFXML.DETAILS)) AS WAITS ORDER BY WAITS.TOTAL_TIME_VALUE DESC”
PS:
可以用这个看看某一部分的PROC时间和总时间,来看下WAIT时间是否正常:
db2 “SELECT SUBSTR(T.SERVICE_SUPERCLASS_NAME,1,19) AS SUPERCLASS,SUBSTR(T.SERVICE_SUBCLASS_NAME,1,19) AS SUBCLASS,T.MEMBER,SUBSTR(COMP.METRIC_NAME,1,20) AS METRIC_NAME,SUBSTR(COMP.PARENT_METRIC_NAME,1,20) AS PARENT_NAME,COMP.TOTAL_TIME_VALUE AS TOTAL_TIME,COMP.PROC_TIME_VALUE AS TOTAL_PROC_TIME,COMP.COUNT FROM TABLE (MON_GET_SERVICE_SUBCLASS_DETAILS(NULL, NULL,-2)) AS T,TABLE(MON_FORMAT_XML_COMPONENT_TIMES_BY_ROW(T.DETAILS)) AS COMP WHERE COMP.PARENT_METRIC_NAME IS NOT NULL order by (COMP.TOTAL_TIME_VALUE – COMP.PROC_TIME_VALUE) desc”
2.如果在上面发现了问题,那么就要去定位了。一般问题都是发生在SQL上。换句话说,我们要找到哪条SQL,这条SQL的某个指标过大,造成了上面的WAIT等指标过大。
比如,如果POOL_READ_TIME时间过大,那么:
db2 “select SECTION_TYPE,NUM_EXECUTIONS,POOL_READ_TIME,substr(STMT_TEXT,1,32) as stmt from TABLE(MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2)) order by POOL_READ_TIME desc”
又比如,LOCK_WAIT_TIME过大,那么:
db2 “select SECTION_TYPE,NUM_EXECUTIONS,LOCK_WAIT_TIME,LOCK_ESCALS,LOCK_ESCALS,DEADLOCKS,LOCK_TIMEOUTS,substr(STMT_TEXT,1,32) as stmt from TABLE(MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2)) order by LOCK_WAIT_TIME desc”
如果DIRECT_READ_TIME过大,那么:
db2 “select SECTION_TYPE,NUM_EXECUTIONS,direct_read_time,substr(STMT_TEXT,1,32) as stmt from TABLE(MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2)) order by direct_read_time desc fetch first 5 rows only”
还比如,TOTAL_SECTION_PROC_TIME过大,那么(另外,还可以从CPU角度去看,但这其实就是另外一种衡量的方法了):
db2 “select SECTION_TYPE,NUM_EXECUTIONS,total_section_time,total_section_sort_time,total_col_time,total_routine_non_sect_time,substr(STMT_TEXT,1,32) as stmt from TABLE(MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2)) order by total_section_time desc”
一般WAIT的指标都可以在MON_GET_PKG_CACHE_STMT去找对应的指标来看,可以查查Infocenter。
从CPU的角度的化,建议用下面的这个视图MON_PKG_CACHE_SUMMARY (注意,有些单位是微秒,有些则是毫秒):
db2 “select section_type,TOTAL_STMT_EXEC_TIME,AVG_STMT_EXEC_TIME,TOTAL_CPU_TIME,AVG_CPU_TIME,TOTAL_LOCK_WAIT_TIME,AVG_LOCK_WAIT_TIME,TOTAL_IO_WAIT_TIME,AVG_IO_WAIT_TIME,PREP_TIME,ROWS_READ_PER_ROWS_RETURNED,AVG_ACT_WAIT_TIME,AVG_LOCK_ESCALS,AVG_RECLAIM_WAIT_TIME,AVG_SPACEMAPPAGE_RECLAIM_WAIT_TIME,substr(stmt_text,1,16) from SYSIBMADM.MON_PKG_CACHE_SUMMARY”
9.7用下面这个:
db2 “select section_type,TOTAL_STMT_EXEC_TIME,AVG_STMT_EXEC_TIME,TOTAL_CPU_TIME,AVG_CPU_TIME,TOTAL_LOCK_WAIT_TIME,AVG_LOCK_WAIT_TIME,TOTAL_IO_WAIT_TIME,AVG_IO_WAIT_TIME,PREP_TIME,ROWS_READ_PER_ROWS_RETURNED,substr(stmt_text,1,16) from SYSIBMADM.MON_PKG_CACHE_SUMMARY order by AVG_STMT_EXEC_TIME desc fetch first 5 rows only”
或者试试这个,可以按照平均执行时间或排序时间来看下问题SQL:
db2 “select num_executions,AVERAGE_EXECUTION_TIME_S,STMT_SORTS,SORTS_PER_EXECUTION,substr(stmt_text,1,16) as stmt from SYSIBMADM.TOP_DYNAMIC_SQL “