分类: Oracle
2017-10-10 17:05:33
原文链接 http://blog.csdn.net/tianlesoftware/article/details/7237729
如果Oracle数据库hang住了,对Oracle做system dump,或做 hang analyze,是研究和解决问题的有效办法。如果能够连接数据库,并能够进行操作,那么用oradebug是简单快捷的办法。
OracleHANGANALYZE 功能诊断 DBhanging
http://blog.csdn.net/tianlesoftware/article/details/6321961
http://blog.csdn.net/tianlesoftware/article/details/6525628
SYS@anqing1(rac1)> oradebug setmypid
Statement processed.
SYS@anqing1(rac1)> oradebug dump systemstate 10
Statement processed.
SYS@anqing1(rac1)> oradebug tracefile_name
/u01/app/oracle/admin/anqing/udump/anqing1_ora_8725.trc
SYS@anqing1(rac1)> oradebug close_trace
Statement processed.
在系统hung的时候,systemstate基本等同于hanganalyze,可以用于诊断system hung。但是如果直接来查看这个trace file,比较麻烦,这时候可以使用ass.awk 脚本来帮助我们查看trace file。
Ass.Awk 脚本在LTOM的ltom431\ltom\tom_base\tom\src目录下,其名称是:ass109.awk。这个是目前的最新版本,ass.awk 脚本可以帮助我们分析和格式化system state dump。
LTOM的MOS说明:
LTOM - TheOn-Board Monitor User Guide [ID 352363.1]
LTOM 的下载地址:
在看AWK, awk是一种编程语言,用于在linux/unix下对文本和数据进行处理。数据可以来自标准输入、一个或多个文件,或其它命令的输出。
关于AWK的更多内容参考:
http://blog.csdn.net/tianlesoftware/article/details/6278273
我们这里使用的是AWR命令的 -fscripfile 参数,即从脚本文件中读取awk命令。这个脚本就是我们LTOM中的ass109.awk。
示例:
[oracle@rac1 src]$ awk -f ass109.awk /u01/app/oracle/admin/anqing/udump/anqing1_ora_8725.trc
Starting Systemstate 1
...................................
Ass.Awk Version 1.0.9 - Processing/u01/app/oracle/admin/anqing/udump/anqing1_ora_8725.trc
System State 1
~~~~~~~~~~~~~~~~
1:
2: waiting for 'pmon timer' wait
3: waiting for 'DIAG idle wait' wait
4: waiting for 'rdbms ipc message' wait
5: waiting for 'rdbms ipc message' wait
6: waiting for 'ges remote message' wait
7: waiting for 'gcs remote message' wait
8: waiting for 'rdbms ipc message' wait
9: waiting for 'rdbms ipc message' wait
10: waiting for 'rdbms ipc message' wait
11: waiting for 'rdbms ipc message' wait
12: waiting for 'buffer busy waits'(2,89,21) wait
13: waiting for 'rdbms ipc message' wait
14: waiting for 'enq: JS - queue lock'[Enqueue JS-00000000-00000001] wait
15: waiting for 'buffer busy waits'(2,89,21) wait
16: waiting for 'rdbms ipc message' wait
17:
18:
19: waiting for 'rdbms ipc message' wait
20: waiting for 'log file switch (archivingneeded)' wait
21: waiting for 'ASM background timer' wait
22: waiting for 'rdbms ipc message' wait
23: waiting for 'log file switch (archivingneeded)' wait
24: last wait for 'SQL*Net message fromclient'
25: for 'Streams AQ: waiting for messagesin the queue' wait
26: waiting for 'Streams AQ: qmn slave idlewait' wait
27: waiting for 'rdbms ipc message' wait
28: waiting for 'rdbms ipc message' wait
29: waiting for 'class slave wait' wait
30:
31: waiting for 'Streams AQ: qmncoordinator idle wait' wait
32: waiting for 'SQL*Net message fromclient' wait
34: waiting for 'SQL*Net message fromclient' wait
36: waiting for 'jobq slave wait' wait
37: for 'Streams AQ: waiting for timemanagement or cleanup tasks' wait
Blockers
~~~~~~~~
Above is a list of all the processes. If they are waiting for a resource
then it will be given in square brackets. Below is a summary of the
waited upon resources, together with the holder of that resource.
Notes:
~~~~~
o A process id of '???' implies that the holder was not found in the
systemstate.
Resource Holder State
Enqueue JS-00000000-00000001 23: waiting for 'log file switch (archivingneeded)'
Object Names
~~~~~~~~~~~~
Enqueue JS-00000000-00000001
48768 Lines Processed.
[oracle@rac1 src]$
这个使用ass.awk 处理之后的结果就直接查看trace 直观很多:
[oracle@rac1 src]$ head -100/u01/app/oracle/admin/anqing/udump/anqing1_ora_8725.trc
/u01/app/oracle/admin/anqing/udump/anqing1_ora_8725.trc
Oracle Database 10g Enterprise EditionRelease 10.2.0.4.0 - Production
With the Partitioning, Real ApplicationClusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME =/u01/app/oracle/product/10.2.0/db_1
System name: Linux
Node name: rac1
Release: 2.6.18-194.el5
Version: #1 SMP Tue Mar 16 21:52:43 EDT 2010
Machine: i686
Instance name: anqing1
Redo thread mounted by this instance: 1
Oracle process number: 24
Unix process pid: 8725, image: oracle@rac1(TNS V1-V3)
*** 2012-02-06 20:32:20.758
*** ACTION NAME:() 2012-02-06 20:32:20.755
*** MODULE NAME:(sqlplus@rac1 (TNS V1-V3))2012-02-06 20:32:20.755
*** SERVICE NAME:(SYS$USERS) 2012-02-0620:32:20.755
*** SESSION ID:(118.34295) 2012-02-0620:32:20.755
===================================================
SYSTEM STATE
------------
System global information:
processes: base 0x30e0ec78, size 150, cleanup 0x30e1dff8
allocation: free sessions 0x30e5a6c4, free calls (nil)
control alloc errors: 0 (process), 0 (session), 0 (call)
PMON latch cleanup depth: 0
seconds since PMON's last scan for dead processes: 64
system statistics:
313621 logons cumulative
30 logons current
14842838 opened cursors cumulative
….