环境:
OS:Red Hat Linux As 5
DB:10.2.0.4
数据库在hang住的情况下,我们通常通过查询v$session_wait和v$lock视图找到相应的等待事件,但使用hanganalyze同样能得到这些信息.如下我们通过使用hangalyze获取到两个会话更新同一行数据的等待事件情况.
1.session1更新行数据
SQL> connect scott/scott
Connected.
SQL> create table tb_hang(id number,remark varchar2(20));
Table created.
SQL> insert into tb_hang values(1,'test');
1 row created.
SQL> commit;
Commit complete.
SQL> select USERENV('sid') from dual;
USERENV('SID')
--------------
146
SQL> update tb_hang set remark='hang' where id=1;
1 row updated.
这个时候不提交
2.session2同样更新session1更新的行
SQL> select USERENV('sid') from dual;
USERENV('SID')
--------------
154
SQL> update tb_hang set remark='hang' where id=1;
这个时候已经hang住了
3.session3使用hangalyze生成trace文件
SQL> connect / as sysdba
Connected.
SQL> oradebug hanganalyze 3;
Hang Analysis in /u01/app/oracle/admin/oracl/udump/oracl_ora_3941.trc
4.查看trace文件的内容
[oracle@hxl ~]$ more /u01/app/oracle/admin/oracl/udump/oracl_ora_3941.trc
/u01/app/oracle/admin/oracl/udump/oracl_ora_3941.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
System name: Linux
Node name: hxl
Release: 2.6.18-8.el5xen
Version: #1 SMP Fri Jan 26 14:42:21 EST 2007
Machine: i686
Instance name: oracl
Redo thread mounted by this instance: 1
Oracle process number: 21
Unix process pid: 3941, image: (TNS V1-V3)
*** SERVICE NAME:(SYS$USERS) 2012-06-16 01:13:29.241
*** SESSION ID:(144.14) 2012-06-16 01:13:29.241
*** 2012-06-16 01:13:29.241
==============
HANG ANALYSIS:
==============
Open chains found:
Chain 1 : :
<0/146/5/0x7861b254/3858/SQL*Net message from client>
-- <0/154/5/0x7861c370/3903/enq: TX - row lock contention>
Other chains found:
Chain 2 : :
<0/144/14/0x7861d48c/3941/No Wait>
Chain 3 : :
<0/149/1/0x7861ced8/3806/Streams AQ: waiting for time man>
Chain 4 : :
<0/151/1/0x7861c924/3804/Streams AQ: qmn coordinator idle>
Chain 5 : :
<0/158/5/0x7861da40/3810/Streams AQ: qmn slave idle wait>
Extra information that will be dumped at higher levels:
[level 4] : 1 node dumps -- [REMOTE_WT] [LEAF] [LEAF_NW]
[level 5] : 4 node dumps -- [SINGLE_NODE] [SINGLE_NODE_NW] [IGN_DMP]
[level 6] : 1 node dumps -- [NLEAF]
[level 10] : 13 node dumps -- [IGN]
State of nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/start/finish/[adjlist]/predec
essor):
[143]/0/144/14/0x786fa3dc/3941/SINGLE_NODE_NW/1/2//none
[145]/0/146/5/0x786fc944/3858/LEAF/3/4//153
[148]/0/149/1/0x78700160/3806/SINGLE_NODE/5/6//none
[150]/0/151/1/0x787026c8/3804/SINGLE_NODE/7/8//none
[153]/0/154/5/0x78705ee4/3903/NLEAF/9/10/[145]/none
[154]/0/155/1/0x78707198/3797/IGN/11/12//none
[155]/0/156/1/0x7870844c/3799/IGN/13/14//none
[157]/0/158/5/0x7870a9b4/3810/SINGLE_NODE/15/16//none
[159]/0/160/1/0x7870cf1c/3782/IGN/17/18//none
[160]/0/161/1/0x7870e1d0/3784/IGN/19/20//none
[161]/0/162/1/0x7870f484/3788/IGN/21/22//none
[162]/0/163/1/0x78710738/3786/IGN/23/24//none
[163]/0/164/1/0x787119ec/3774/IGN/25/26//none
[164]/0/165/1/0x78712ca0/3780/IGN/27/28//none
[165]/0/166/1/0x78713f54/3778/IGN/29/30//none
[166]/0/167/1/0x78715208/3776/IGN/31/32//none
[167]/0/168/1/0x787164bc/3770/IGN/33/34//none
[168]/0/169/1/0x78717770/3772/IGN/35/36//none
[169]/0/170/1/0x78718a24/3768/IGN/37/38//none
====================
END OF HANG ANALYSIS
====================
说明:
1.trace中最重要的部分
State of nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/start/finish/[adjlist]/predec
essor):
[145]/0/146/5/0x786fc944/3858/LEAF/9/10//153 --这里的LEAF表示该SID阻塞了predec
essor=153对应的SID=154,即为被阻塞者
[148]/0/149/1/0x78700160/3806/SINGLE_NODE/11/12//none
[150]/0/151/1/0x787026c8/3804/SINGLE_NODE/13/14//none
[153]/0/154/5/0x78705ee4/3903/NLEAF/15/16/[145]/none --这里的NLEAF表示该SID被阻塞了,adjlist对应的145对应的SID=146,即为阻塞者.
在上面的实例中的解释如下:
nodenum:定义每个session的序列号
sid: session的sid
sess_srno: session的Serial#
ospid: OS的进程ID
state: node的状态
adjlist: 表示blocker node
predecessor: 表示waiter node
IN_HANG:这表示该node处于死锁状态,通常还有其他node(blocker)也处于该状态
LEAF/LEAF_NW:该node通常是blocker.通过条目的”predecessor”列可以判断这个node是否是blocker。LEAF说明该NODE没有等待其他资源,而LEAF_NW则可能是没有等待其他资源或者是在使用CPU
NLEAF:通常可以看作这些会话是被阻塞的资源.发生这种情况一般说明数据库发生性能问题而不是数据库hang
IGN/IGN_DMP:这类会话通常被认为是空闲会话,除非其adjlist列里存在node。如果是非空闲会话则说明其adjlist里的node正在等待其他node释放资源。
SINGLE_NODE/SINGLE_NODE_NW:近似于空闲会话.
2.hanganalyze级别的定义
10 Dump all processes (IGN state)
5 Level 4 + Dump all processes involved in wait chains (NLEAF state)
4 Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF,LEAF_NW,IGN_DMP state)
3 Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
1-2 Only HANGANALYZE output, no process dump at all
一般情况下,我们都定义到级别3,级别3以上的使用需要谨慎,因为级别3以上的,dump信息很多,影响I/O.
3.单节点和RAC下hanganalyze的使用:
单节点:
ALTER SESSION SET EVENTS 'immediate trace name HANGANALYZE level ';
ORADEBUG hanganalyze
Rac:
ORADEBUG setmypid
ORADEBUG setinst all
ORADEBUG -g def hanganalyze
-- The End --
阅读(4973) | 评论(1) | 转发(1) |