Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1209364
  • 博文数量: 398
  • 博客积分: 10110
  • 博客等级: 上将
  • 技术积分: 4055
  • 用 户 组: 普通用户
  • 注册时间: 2007-12-23 20:01
个人简介

新博客http://www.cnblogs.com/zhjh256 欢迎访问

文章分类

全部博文(398)

文章存档

2012年(1)

2011年(41)

2010年(16)

2009年(98)

2008年(142)

2007年(100)

我的朋友

分类: Oracle

2007-12-28 21:02:57

等待事件:续

       在一个以读为主的系统中,log file syncenqueue等待事件通常是很小的。

       buffer busy waits:进程希望访问当前不在内存中的数据块,但是其他进程已经发出I/O请求读取该块到内存中;在内存中,但是不在兼容模式,如当前模式相对于一致性模式;

       free buffer waits:会话需要一个空闲缓冲以将数据块读入并且等待一个为脏的缓冲可用。通常是因为DBWR写脏缓冲的速度不够快;

       log file switch completion:会话等待日志切换完成,通常这样才能产生更多的重做;

       undo segment extension:会话等待撤销段扩张或收缩;

       write complete waits:会话等待一个请求的缓冲写入磁盘,缓冲在正在写期间不能使用;

       log file switch completion等待事件是一个日志切换影响的主要指示器。而且,如果日志切换性能很差,log buffer space通常也会立刻出现。调整日志切换的主要目标是最小化 log file switch completion并清除log buffer space

    调整归档时第一件需要确保的事是LGWR没有等待ARCn完成归档一个日志文件。第二个需要考虑的是ARCn活动时的影响对前台进程最小化。而这两个方面通常是刚好相对的,因此最好的目标就是调整ARCn使其足够快,并在可以证明其影响了前台进程时降低其速度。

 

       关于等待事件不能提供的信息主要是CPU密集的操作,主要还包括逻辑读,等待latch时的自旋,语句解析。这是因为Oracle进程花费CPU在解析语句或在繁忙的latch上自旋,并没有发生等待。

       Statspack中,报告了详细地等待事件信息以及CPU使用率,这些信息是非常有帮助的,但是其中报告的信息并不是来自于等待接口视图,而是v$sysstat视图。

       Oracle 9i Release 2开始,基于时间的统计自动收集,并且某些时间的单位为微秒。

 

收集等待事件信息

       具有SYSDBADBMS_SUPPORTDBMS_SYSTEM权限的用户都可以跟踪其他会话。

v$system_event视图

Name                            Null?    Type

    ------------------------------- -------- ----

    EVENT                                    VARCHAR2(64)

    TOTAL_WAITS                              NUMBER

    TOTAL_TIMEOUTS                           NUMBER

    TIME_WAITED                              NUMBER

    AVERAGE_WAIT                             NUMBER

    TIME_WAITED_MICRO                        NUMBER

TIME_WAITED_MICROTIME_WAITED的信息相同,区别是除以1000000得到秒。

v$session_event视图

Name                            Null?    Type

    ------------------------------- -------- ----

    SID                                      NUMBER

    EVENT                                    VARCHAR2(64)

    TOTAL_WAITS                              NUMBER

    TOTAL_TIMEOUTS                           NUMBER

    TIME_WAITED                              NUMBER

    AVERAGE_WAIT                             NUMBER

    MAX_WAIT                                 NUMBER

    TIME_WAITED_MICRO                        NUMBER

       之前,通常使用10046事件进行跟踪,从该版本开始,可以使用PL/SQL内置包,并且通常这更加友好。可使用如下的过程跟踪:

EXECUTE SYS.DBMS_SUPPORT.START_TRACE_IN_SESSION (sid, serial#)
EXECUTE SYS.DBMS_SYSTEM.SET_EV (sid, serial#, 10046, 8, '')

       EXECUTE SYS.DBMS_SUPPORT.START_TRACE

       DBMS_SUPPORT等价与100468。该包需要通过dbmssupp.sql安装。由于DBMS_SYSTEM.SET_EV包允许在任何会话中设置任何调试事件,因此官方不支持。

       关闭跟踪的方法:

EXECUTE SYS.DBMS_SUPPORT.STOP_TRACE

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

ALTER SESSION SET sql_trace = FALSE;

EXECUTE SYS.DBMS_SUPPORT.STOP_TRACE_IN_SESSION (sid, serial#)

oradebug setorapid [Oracle PID from v$process]

oradebug session_event 10046 trace name context forever, level 0

EXECUTE SYS.DBMS_SYSTEM.SET_EV (sid, serial#, 10046, 0, '')

       如下:

=====================

PARSING IN CURSOR #1 len=80 dep=0 uid=502 oct=3 lid=502

    tim=2293771931 hv=2293373707 ad='511dca20'

    SELECT /*+ FULL */ SUM (LENGTH(notes))

    FROM   customer_calls

    WHERE  status = :x

    END OF STMT

    PARSE #1:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=0,tim=2293771931

    BINDS #1:

        bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0

      size=24 offset=0

       bfp=09717724 bln=22 avl=02 flg=05

       value=43

    EXEC #1:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=2293771931

    WAIT #1: nam='SQL*Net message to client' ela= 0 p1=675562835 p2=1 p3=0

    WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=923 p3=8

    WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=931 p3=8

    WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=939 p3=8

    WAIT #1: nam='db file sequential read' ela= 0 p1=17 p2=947 p3=1

    WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=1657 p3=8

    WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1665 p3=8

       "ela="指示了延迟,从9i开始以微秒为单位,之前为cs

 

使用等待事件信息提高系统性能

       首先使用Statspack收集,然后查看Top 5 Wait Events等待事件,并对这些主要事件进行研究。其中的一个快照的截图如下:

        Top 5 Wait Events

        ~~~~~~~~~~~~~~~~~                      Wait         % Total

        Event                         Waits         Time (cs)     Wt Time

        -------------------------------------------- ------------     ------------     -------

        buffer busy waits               1,962,372    1,278,649   50.03

        db file sequential read           1,336,870    1,050,878   41.12

        db file scattered read            47,717       49,326      1.93

        direct path write                8,070       40,574       1.59

        latch free                     38,220       31,012       1.21

       我们可以看到最主要的两个事件是buffer busy waitsdb file sequential read。然后我们就会决定对这些事件进行研究:

       查询v$event_name查看相关的参数:

    SQL> SELECT * FROM v$event_name WHERE name = 'buffer busy waits';

    EVENT#       NAME                PARAMETER1  PARAMETER2    PARAMETER3

    ----------      -----------------      ---------            ---------------          ---------------

    75                 buffer busy waits  file#              block#             id

       参数信息能够为你指明正确的研究方向。

       通过以下查询可以知道所有等待特定事件的会话等待的时间统计:

    SELECT   sid, event, state, seconds_in_wait, wait_time, p1, p2, p3

    FROM     v$session_wait

    WHERE    event = 'buffer busy waits'

    ORDER BY sid;

    SID EVENT      STATE  SECONDS_IN_WAIT WAIT_TIME  P1  P2     P3

    ---- -----------------   -----   ---------------         --------- ------  ------ -----    ---

    12 buffer busy waits WAITE      1             0           30   62157  130

    31 buffer busy waits WAITE      1             0           30   23558  130

       由于瞬时性的特征,通常要执行几次得到结果。

 

       然后使用以下查询以p1p2做参数:

    SELECT owner, segment_name, segment_type

    FROM   dba_extents

    WHERE  file_id = &absolute_file_number

    AND    &block_number BETWEEN block_id AND block_id + blocks -1;

       这个等待事件的p3原因码metalink上的解释如下:

Reason Code (Id) P3

Reason

<=8.0.6

>=8.1.6

0

0

A block is being read

1003

100

We want to NEW the block but the block is currently being read by another session (most likely for undo).

1007

200

We want to NEW the block but someone else has is using the current copy so we have to wait for them to finish.

1010

230

Trying to get a buffer in CR/CRX mode , but a modification has started on the buffer that has not yet been completed.

1012

-

A modification is happening on a SCUR or XCUR buffer, but has not yet completed

1012 (duplicate)

231

CR/CRX scan found the CURRENT block, but a modification has started on the buffer that has not yet been completed.

1013

130

Block is being read by another session and no other suitable block image was found, so we wait until the read is completed. This may also occur after a buffer cache assumed deadlock. The kernel can't get a buffer in a certain amount of time and assumes a deadlock. Therefor it will read the CR version of the block.

1014

110

We want the CURRENT block either shared or exclusive but the Block is being read into cache by another session, so we have to wait until their read() is completed.

1014 (duplicate)

120

We want to get the block in current mode but someone else is currently reading it into the cache. Wait for them to complete the read. This occurs during buffer lookup.

1016

210

The session wants the block in SCUR or XCUR mode. If this is a buffer exchange or the session is in discrete TX mode, the session waits for the first time and the second time escalates the block as a deadlock and so does not show up as waiting very long. In this case the statistic: "exchange deadlocks" is incremented and we yield the CPU for the "buffer deadlock" wait event.

1016 (duplicate)

220

During buffer lookup for a CURRENT copy of a buffer we have found the buffer but someone holds it in an incompatible mode so we have to wait.

 

       接下来按照相同的方法需要检查db file sequential read事件,

    SELECT   sid, event, state, seconds_in_wait, wait_time, p1, p2, p3

    FROM     v$session_wait

    WHERE    event = 'db file sequential read'

    ORDER BY sid;

       通常这两个事件是有一定的关联的,session A等待单块磁盘I/O将数据从磁盘读入缓冲,而与此同时呢,session B也希望读取相同的块,由于块不在缓存中但是已经有会话开始读取了,因此session B等待buffer busy waits而此时session A正在等待db file sequential read

       然后可以采取长期和短期的措施进行最大幅度的提高。

 

 

Buffer Busy Waits+ Latch Contention

       之前的例子说明了并发读的情况,再看多个Oracle进程尝试同时写缓冲缓存中数据块的多个副本的情况。buffer busy waitscontention将会同时发生。

       通常是发生了很严重的热缓冲,由于需要同时更新的数据在一个块中。

    SQL> SELECT * FROM v$event_name WHERE name = 'latch free';

    EVENT#   NAME    PARAMETER1  PARAMETER2    PARAMETER3

    ----------    ------------- --------------------  ---------------             ---------------

    3         latch free   address         number                 tries

       WAIT #2: nam='latch free' ela= 47004 p1=15113593728 p2=97 p3=0

    WAIT #2: nam='latch free' ela= 14629 p1=15113593728 p2=97 p3=1

    WAIT #2: nam='latch free' ela= 20652 p1=15113593728 p2=97 p3=2

    WAIT #2: nam='latch free' ela= 37737 p1=15113593728 p2=97 p3=3

       然后查询如下:

    SQL> SELECT name

    2  FROM   v$latch

    3  WHERE  latch# = 97;

    NAME

    ----------------------------------------------------------------

    cache buffers chains

Log File Waits

       之前提到过,对于批处理,如果发现性能相当低,通常使用以下脚本进行快速诊断:

        DROP TABLE previous_events;

 

        CREATE TABLE previous_events

        AS

        SELECT SYSDATE timestamp, v$system_event.*

        FROM   v$system_event;

 

        EXECUTE dbms_lock.sleep (30);

 

        SELECT   A.event,

                 A.total_waits - NVL (B.total_waits, 0) total_waits,

                 A.time_waited - NVL (B.time_waited, 0) time_waited

        FROM     v$system_event A, previous_events B

        WHERE    B.event (+) = A.event

        AND      A.event NOT IN

                 (

                 'client message', 'dispatcher timer', 'gcs for action',

                 'gcs remote message', 'ges remote message', 'i/o slave wait',

                 'jobq slave wait', 'lock manager wait for remote message',

                 'null event', 'Null event', 'parallel query dequeue', 'pipe get',

                 'PL/SQL lock timer', 'pmon timer', 'PX Deq Credit: need buffer',

                 'PX Deq Credit: send blkd', 'PX Deq: Execute Reply',

                 'PX Deq: Execution Msg', 'PX Deq: Signal ACK',

                 'PX Deq: Table Q Normal', 'PX Deque Wait', 'PX Idle Wait',

                 'queue messages', 'rdbms ipc message', 'slave wait',

                 'smon timer', 'SQL*Net message to client',

                 'SQL*Net message from client', 'SQL*Net more data from client',

                 'virtual circuit status', 'wakeup time manager'

                 )

        ORDER BY time_waited;

       然后进行解决。

 

Direct Path Read and Write Waits

SQL> SELECT   sid, event, state, seconds_in_wait, wait_time, p1, p2, p3

2  FROM     v$session_wait

3  WHERE    event = 'direct path write'

4  ORDER BY sid;

SID EVENT   STATE SECONDS_IN_WAIT WAIT_TIME     P1     P2     P3

---- ----------------- ----- --------------- --------- ------ ------ ------

39 direct path write WAITI               0         0    201     65      7

47 direct path write WAITI               0         0    201   2248      7

      

SQL> SELECT * FROM v$event_name WHERE name = 'direct path write';

EVENT# NAME                   PARAMETER1      PARAMETER2             PARAMETER3

---------- --------------------   ----------- --------------- ---------------

115               direct path write    file number        first dba                  block cnt

      

       然后查询会话等待:

SQL> SELECT tablespace_name, file_id "AFN", relative_fno "RFN"

2  FROM   dba_data_files

3  WHERE  file_id = 201;

       如果没有返回任何行,则需要查询临时表状况:

SQL> SELECT tablespace_name, file_id "AFN", relative_fno "RFN"

2  FROM   dba_data_files

3  WHERE  file_id = 201;

       并且由于直接读绕过缓存,因此直接读的数量不计入逻辑读的数量。因此直接读降低缓存命中率,而出现缓存命中率负值的情况。

 

数据库链接等待情况

       "SQL*Net message from dblink"事件表明一个本地进程提交了一个查询到远程服务器,并等待响应。对于大数量的分布式查询而言,通常不适合于相关查询,如EXISTS,而是一次将数据从本地传送到远程并返回,或者从远程提取到本地再处理以减少网络轮回。

 

阅读(3370) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~