Chinaunix首页 | 论坛 | 博客
  • 博客访问: 2625385
  • 博文数量: 323
  • 博客积分: 10211
  • 博客等级: 上将
  • 技术积分: 4934
  • 用 户 组: 普通用户
  • 注册时间: 2006-08-27 14:56
文章分类

全部博文(323)

文章存档

2012年(5)

2011年(3)

2010年(6)

2009年(140)

2008年(169)

分类: Oracle

2009-05-07 14:34:32

  早上一上班就收到事业部同事的邮件。邮件中提到最近的计划数据收集由原来的0.5小时增加到现在的2小时ASCP的调度计划受到了影响,需要马上解决这个问题。我把最近的计划数据收集请求(实质上是一个请求集,里面包含两个请求:计划数据提取和阶段装入计划 ODS)的日志看了一下,发现从5月1号开始,计划数据收集的时间不断增长,其中计划数据提取这个请求耗时最长。通过metalink的资料得知,当计划数据提取这个请求提交的时候会激活另外一个叫刷新集合快照的请求,等这个请求完成后,数据提取的动作才开始。对比了一下日志发现确实如此。再查看一下刷新集合快照请求完成的时间,很耗时。对比几个请求的时间可以得出这样的结论:计划数据收集请求集依赖于刷新集合快照这个请求的完成,也就是说问题不是计划数据提取慢而是刷新集合快照出现了性能问题。实际上计划数据提取的过程仅仅几分钟而已。
  问题已经定位。那么是什么导致刷新集合快照这个请求出现了性能问题呢?首先有必要对这个请求进行trace。在EBS中找到这个请求的定义,勾上跟踪选项,然后单独提交这个请求,请求跑了一个小时(在测试环境)。tkprof trace文件来看看输出的结果,这里节选了一些关键的输出:
 
Rows     Row Source Operation
-------  ---------------------------------------------------
  19508  HASH JOIN 
  19508   VIEW 
  19508    SORT UNIQUE
  70916     TABLE ACCESS FULL MLOG$_BOM_COMPONENTS_B
 154376   TABLE ACCESS FULL BOM_COMPONENTS_B

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: CHOOSE
  19508   HASH JOIN
  19508    VIEW
  19508     SORT (UNIQUE)
  70916      TABLE ACCESS   GOAL: ANALYZED (FULL) OF
                 'MLOG$_BOM_COMPONENTS_B'
 154376    TABLE ACCESS   GOAL: ANALYZED (FULL) OF 'BOM_COMPONENTS_B'

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                       142        0.05          0.78
  db file scattered read                     122054        0.60        580.18
  latch free                                      5        0.00          0.01
********************************************************************************
SELECT DISTINCT M_ROW$$
FROM
 (SELECT M_ROW$$ FROM "BOM"."MLOG$_BOM_COMPONENTS_B" MLOG$ WHERE "SNAPTIME$$"
  > :1 AND ("DMLTYPE$$" != 'I')) LOG$ WHERE (M_ROW$$) NOT IN (SELECT ROWID
  FROM "BOM"."BOM_COMPONENTS_B" "MAS_TAB$" WHERE ("MAS_TAB$"."OBJ_NAME" IS
  NULL) AND MAS_TAB$.ROWID = LOG$.M_ROW$$)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.02          0          0          0           0
Fetch        1     26.44     458.04     966569    1005840          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3     26.44     458.07     966569    1005840          0           0
 
...省略
 
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     3808      1.06      17.63        264       3461          3           0
Execute 210342     96.03    1214.96    2543420    2880409    1032826      407498
Fetch   126796     98.94    1896.00    2557081    3730664     448058      209002
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   340946    196.03    3128.60    5100765    6614534    1480887      616500
Misses in library cache during parse: 389
Misses in library cache during execute: 25
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                      11        0.00          0.00
  SQL*Net message from client                    11        0.01          0.02
  db file sequential read                     18975        0.76        122.26
  process startup                                12        0.02          0.22
  PX Deq: Join ACK                               51        0.01          0.03
  PX Deq Credit: send blkd                       80        1.58          1.73
  PX Deq: Parse Reply                            47        0.04          0.14
  PX Deq: Execute Reply                         878        0.09          3.15
  PX Deq: Signal ACK                             93        0.09          2.07
  db file scattered read                     637890        1.39       2716.48
  enqueue                                        30        0.91          0.94
  rdbms ipc reply                                24        0.00          0.01
  buffer busy waits                             171        0.03          0.13
  undo segment extension                       4110        0.00          0.00
  latch free                                     13        0.00          0.01
  log file switch completion                      2        0.12          0.17
  log file sync                                   3        0.04          0.05
  PX Deq Credit: need buffer                     51        0.00          0.00
  PX Deq: Table Q Normal                          1        0.00          0.00
  327  user  SQL statements in session.
 3513  internal SQL statements in session.
 3840  SQL statements in session.
   85  statements EXPLAINed in this session.
********************************************************************************
Trace file: dev_ora_741426_SJH_CR1436888.trc
Trace file compatibility: 9.02.00
Sort options: fchela 
       0  session in tracefile.
     327  user  SQL statements in trace file.
    3513  internal SQL statements in trace file.
    3840  SQL statements in trace file.
     412  unique SQL statements in trace file.
      85  SQL statements EXPLAINed using schema:
           SYSTEM.prof$plan_table
             Default table was used.
             Table was created.
             Table was dropped.
 1030844  lines in trace file.

--如果你是ORACLE DBA这里的数据已经说明了问题,我就不啰嗦了。MLOG$_* 导致了性能问题。

在metalink上找了一些资料,对这个问题有了一个清晰的思路。继续在测试机上测试。单独提交刷新集合快照的请求,这次参数设置的不一样。具体是这样的:Refresh Method:Automatic Refresh
Threshold for Truncating Logs :0  这次运行10几分钟就完了。再次运行计划数据提取,6分钟搞定!这里需要注意的是,你需要单独提交个请求而非由计划数据提取请求去带动。尽管你把刷新集合快照请求的默认参数都设置好了,但是它在计划数据提取请求里依然不起作用。我在这里耽误了一点时间。OK,现在到生产环境做同样的事情。刷新集合快照在生产环境用了20分钟完成。在跑这个请求之前我记录下了表空间的使用情况,因为我知道MLOG$_*表所在的数据文件肯定会减小。最后对比发现果然是这样,其中变化最大的表空间是APPS_TS_SUMMARY,减小了10G。当用户再次跑计划数据收集请求的时候只用了24分钟,刷新集合快照只用了6分钟。你一定想知道这其中的原理吧?OK,看完了以下的介绍,你就会明白了。以下说明来自metalink。
 
Goal

Checked or relevance 26-APR-2008


The Refresh Collection Snapshots can be run as a standalone concurrent request using the following steps

This is recommended as a first step when launching Data Collections for the first time on a new or cloned instance.

Solution

The request is available under the Material Planner responsibility.
OR in 11.5.10 and above in Advanced Supply Chain Planner.
OR using System Administrator, it can be added to the Request Group OM Concurrent Programs
so that it is visible when for ATP Data Collections from Order Management

For Distributed installations with separate Instances for Transactions (ERP Source) and Planning (APS Destination) , this is ALWAYS run on the ERP Source Instance

NAV: Material Planner (or ASCP) / View / Requests / Submit a new Request

Parameters:
1. Refresh Method - Fast / Complete / Automatic 
Fast Refresh will only process changes since the last run of the Refresh Snapshots or Data Collections. 
Complete Refresh will process all the snapshots and completely refresh the snapshot(s). 
Automatic Refresh will truncate the snapshot log and then run a complete refresh of the snapshot(s).
    Note: Automatic Refresh does NOT check the profile MSC: Source Setup Required, so if running for the first time after a patch or in an newly cloned instance, then use Complete Refresh

Note: When running Data Collections request set, Refresh Snapshots always uses a Fast Refresh for performance reasons. This is usually not a problem, unless the snapshot logs have grown very large. Then corrective steps are required.

2. Snapshot Name - Use the Default 'All Snapshots' or choose a snapshot from the list

3. Threshold for Truncating Logs - Used only in Automatic Refresh.
Enter the number of lines to check for in the MLOG$ table.
For example, a value of 50 will truncate the log if 50 or more lines are in the MLOG$ snapshot log table and then perform a complete refresh of the snapshot.
Use 0 when running this to help resolve performance issues.

4. Degree - Refers to the degree of Parallelism to be used by the database to process the request.
Maximum value is 10. The higher the degree of parallelism, the faster the process will complete.
Note: Entering a number higher than 10 will result in the use of 10 for this value.

Also see Data Collections Fails - First Steps to understand the Setup Requests that are run by the Refresh Snapshots concurrent program when MSC: Source Setup required = Y

For Performance or Errors with Refresh Snapshots standalone or as part of the Data Collection request set

  1. First step is to run standalone for All Snapshots with Automatic Refresh and Threshold = 0. Then run Data Collections with a Complete Refresh.
    This step is highly recommended for new or cloned instances before Data Collections request set is launched for the first time.
    The standalone request with Automatic refresh may take a long time to complete if the snapshot logs have become very large. But this should result in better performance when running the Data Collections request set.
  2. IF errors with tables or tablespace occur (ORA-1654, ORA-1653, ORA-1652 ORA-1655, ORA-1651), THEN you will need to add space to the system.  should be helpful for information about managing space requirements for APS.
  3. IF snapshot errors occur (ORA-12008, ORA-12057), THEN run Automatic refresh for the individual snapshot listed in the log file to attempt to correct the problem.
    -- Sometimes the next Data Collections set will error on a different snapshot. IF this occurs, THEN run Automatic refresh for ALL Snapshots to resolve the issue.
  4. An ORA-12008 or ORA-12057 may be followed by a rollback error (ORA-1555, ORA-30036).
    A lot of rollback is required for large data sets. Try adding more space to the rollback
    OF IF rollback (undo) space is already very large (>= 20 GB), and running Automatic or Complete Refresh for ALL Snapshots fails, THEN run the Refresh Snapshots for EACH snapshot individually using Automatic Refresh 
    may also be helpful with information about how to overcome these errors and setting up rollback.
  5. This request can be useful in controlling the growth of the MLOG$ Snapshot Logs that are maintained on these snapshots. Please review MLOG Management
  6. For more information on the latest Data Collections patches please review List of High Priority Patches for the APS Suite .

 
阅读(3014) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~