2009-09-10 11:54:07
Doc ID: 146599.1 | Type: TROUBLESHOOTING
Last Revision Date: 03-SEP-2008 | Status: PUBLISHED
The purpose of this document is to provide an easy to use, step by step guide to resolving ORA-04031 errors.
DIAGNOSING AND RESOLVING ORA-04031 ERROR
When any attempt to allocate a large piece of contiguous memory in the shared pool fails, Oracle first flushes all objects that are not currently in use from the pool, and the resulting free memory chunks are merged. If there is still not a single chunk large enough to satisfy the request, the ORA-04031 error is returned. NOTE: These errors can occur on an ASM instance as well. The default shared_pool_size should be sufficient in most environments, but it can be increased if you are experiencing ORA-04031 errors.
The message that you will get when this error appears is the following:
1. Instance parameters related to the Shared Pool
Ideally, this parameter (SHARED_POOL_RESERVED_SIZE) should be large enough to satisfy any request scanning for memory on the reserved list without flushing objects from the shared pool. Since operating system memory may constrain the size of the shared pool, in general you should set this parameter to 10% of the SHARED_POOL_SIZE parameter.
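As a rough check of the 10% guideline, the current ratio can be read from V$PARAMETER. This is a minimal sketch, assuming both parameters are set explicitly; if SHARED_POOL_SIZE is reported as 0 (for example when the SGA is tuned automatically), the percentage comes back as NULL rather than raising a divide-by-zero error.

```sql
-- Reserved area as a percentage of the shared pool.
-- V$PARAMETER stores values as text, so they are converted with TO_NUMBER.
SELECT r.value AS reserved_bytes,
       s.value AS shared_pool_bytes,
       ROUND(100 * TO_NUMBER(r.value) / NULLIF(TO_NUMBER(s.value), 0), 1) AS reserved_pct
FROM   v$parameter r,
       v$parameter s
WHERE  r.name = 'shared_pool_reserved_size'
AND    s.name = 'shared_pool_size';
```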
Inadequate Sizing: The first thing is determining if the ORA-04031 error is a result of lack of contiguous space in the library cache by verifying the following from V$SHARED_POOL_RESERVED:
REQUEST_FAILURES is > 0 and LAST_FAILURE_SIZE is < SHARED_POOL_RESERVED_MIN_ALLOC
or
REQUEST_FAILURES is 0 and LAST_FAILURE_SIZE is < SHARED_POOL_RESERVED_MIN_ALLOC
If this is the case, consider lowering SHARED_POOL_RESERVED_MIN_ALLOC to allow the database to put more objects into the shared pool reserved space, and then increase SHARED_POOL_SIZE if the problem is not resolved.
NOTE: A bug was discovered where LAST_FAILURE_SIZE can be wrong in cases where multiple pools are used. The value in LAST_FAILURE_SIZE can be a sum of failure sizes across all pools. This is fixed as of 9.2.0.7, 10.1.0.4, and 10.2.x.
Fragmentation: If this is not the case, then you must determine if the ORA-04031 was a result of fragmentation in the library cache or in the shared pool reserved space by following this rule:
REQUEST_FAILURES is > 0 and LAST_FAILURE_SIZE is >
SHARED_POOL_RESERVED_MIN_ALLOC.
To resolve this consider increasing SHARED_POOL_RESERVED_MIN_ALLOC to lower
the number of objects being cached into the shared pool reserved space and
increase SHARED_POOL_RESERVED_SIZE and SHARED_POOL_SIZE to increase the
available memory in the shared pool reserved space.
Another consideration: pre-9i, changing OPTIMIZER_MAX_PERMUTATIONS to 2000 can reduce shared pool space pressure.
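The two rules above can be sketched as a single SQL*Plus query against V$SHARED_POOL_RESERVED. This is only an illustration: `min_alloc` is a SQL*Plus substitution variable introduced here, and you must supply the SHARED_POOL_RESERVED_MIN_ALLOC value in effect on your instance when prompted. The labels in the DIAGNOSIS column are this sketch's own wording.

```sql
-- Classify the reserved-pool statistics using the rules stated in the note.
-- &&min_alloc: value of SHARED_POOL_RESERVED_MIN_ALLOC, entered once at the prompt.
SELECT request_failures,
       last_failure_size,
       free_space,
       max_free_size,
       CASE
         WHEN request_failures >= 0 AND last_failure_size < &&min_alloc
           THEN 'inadequate sizing'
         WHEN request_failures > 0 AND last_failure_size > &&min_alloc
           THEN 'fragmentation'
         ELSE 'no rule matched'
       END AS diagnosis
FROM   v$shared_pool_reserved;
```

Because the first rule also matches an instance that has never had a failure (both columns at 0), the result is only meaningful after an ORA-04031 has actually been raised.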
- Oracle BUGs
Oracle recommends applying the latest patchset available for your platform. Most of the ORA-4031 errors related to BUGs can be avoided by applying these patchsets. The following table summarizes the most common BUGs related to this error, the possible workaround and the patchset that fixes the problem.
BUG | Description | Workaround | Fixed
 | ORA-4031 / SGA leak of PERMANENT memory occurs for buffer handles. | _db_handles_cached = 0 | 8172, 901
 | ORA-4031 due to leak / cache buffer chain contention from AND-EQUAL access | Not available | 8171, 901
Bug:1318267 (Not Public) | INSERT AS SELECT statements may not be shared when they should be if TIMED_STATISTICS. It can lead to ORA-4031 | _SQLEXEC_PROGRESSION_COST=0 | 8171, 8200
Bug:1193003 (Not Public) | Cursors may not be shared in 8.1 when they should be | Not available | 8162, 8170, 901
 | ORA-4031/excessive "miscellaneous" shared pool usage possible. (many PINS) | None-> This is known to affect the XML parser. | 8174, 9013, 9201
 | KGLHDDEP PROBLEM IN RAC. Slow SGA memory leak in internal permanent space (KGL handle). Backports are available on various platforms and release levels | Restart problem node at intervals (flushing shared pool doesn't clear permanent structures). | 9207, 10105, 10201
 | Several BUGs related to ORA-4031 errors were fixed in the 9.2.0.5 patchset | N/A | 9205
- ORA-4031 when compiling Java code:
If you run out of memory while compiling Java code (within loadjava or deployejb), you should see an error:
A SQL exception occurred while compiling: : ORA-04031: unable to allocate bytes of shared memory ("shared pool","unknown object","joxlod: init h", "JOX: ioc_allocate_pal")
The solution is to shut down the database and set JAVA_POOL_SIZE to a larger value. The mention of "shared pool" in the error message is a misleading reference to running out of memory in the "Shared Global Area". It does not mean you should increase your SHARED_POOL_SIZE. Instead, you must increase your JAVA_POOL_SIZE, restart your server, and try again.
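Before changing the parameter, it can help to look at what the Java pool currently holds. A minimal sketch in SQL*Plus, assuming a Java pool is configured on the instance (otherwise the second query returns no rows):

```sql
-- Current setting of the parameter (SQL*Plus command)
SHOW PARAMETER java_pool_size

-- Memory currently accounted to the Java pool, including any free memory
SELECT pool, name, bytes
FROM   v$sgastat
WHERE  pool = 'java pool';
```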
- Small shared pool size
In many cases, a small shared pool can be the cause of the ORA-04031 error. The following information will help you to adjust the size of the shared pool:
- Library Cache Hit Ratio
The hit ratio helps to measure the usage of the shared pool based on how many times a SQL/PLSQL statement needed to be parsed instead of being reused. The following SQL statement helps you to calculate the library cache hit ratio:
SELECT SUM(PINS) "EXECUTIONS",
SUM(RELOADS) "CACHE MISSES WHILE EXECUTING"
FROM V$LIBRARYCACHE;
If the ratio of misses to executions is more than 1%, then try to reduce the library cache misses by increasing the shared pool size.
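The 1% threshold is easier to read if the ratio is computed in the query itself. The variant below is a sketch rather than part of the original note; the column aliases are illustrative, and NULLIF guards against a divide-by-zero on an instance that has not executed anything yet.

```sql
-- Misses (reloads) as a percentage of executions (pins)
SELECT SUM(pins)    AS executions,
       SUM(reloads) AS cache_misses,
       ROUND(100 * SUM(reloads) / NULLIF(SUM(pins), 0), 2) AS miss_pct
FROM   v$librarycache;
```

A MISS_PCT above 1 corresponds to the "more than 1%" condition described above.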
Scripts used to be available for calculating a "best" size for the Shared Pool. Problems arose over time due to changes to internal memory structures and, in some cases, those older scripts could account for certain memory areas twice, presenting you with percentages greater than 100%. There are some adjusted scripts available at:
ORA-4031 Common Analysis/Diagnostic Scripts
However, with the introduction of the memory advisors (with 9.2.x) and auto-tuning (with 10g Release 2), these estimation scripts are not as useful. See:
PERFORMANCE TUNING USING 10g ADVISORS AND MANAGEABILITY FEATURES
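The note does not name the advisor views. Assuming the shared pool advisor exposed through V$SHARED_POOL_ADVICE is the one meant, a query along these lines shows the estimated effect of different shared pool sizes; the advisor is only populated when statistics collection is enabled on the instance.

```sql
-- Estimated library cache parse time saved at each candidate shared pool size.
-- SHARED_POOL_SIZE_FACTOR = 1 is the current size.
SELECT shared_pool_size_for_estimate,
       shared_pool_size_factor,
       estd_lc_time_saved,
       estd_lc_time_saved_factor
FROM   v$shared_pool_advice
ORDER  BY shared_pool_size_for_estimate;
```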
- Shared Pool Fragmentation:
Every time a SQL or PL/SQL statement needs to be executed, the parse representation is loaded in the library cache, requiring a specific amount of free contiguous space. The first resource the database scans is the free memory available in the shared pool. Once the free memory is exhausted, the database looks to reuse an already allocated piece that is not in use. If a chunk with the exact size is not available, the scan continues looking for space based on the following criteria:
- The chunk size is larger than the required size
- The space is contiguous
- The chunk is available (not in use)
Then that chunk is split and the remaining free space is added to the appropriate free space list. When the database operates in this way for a certain period of time, the shared pool structure becomes fragmented.
When the shared pool is suffering fragmentation, ORA-04031 errors (when the database cannot find a contiguous piece of free memory) may occur. Also, as a consequence, the allocation of a piece of free space takes more time and performance may be affected (the "chunk allocation" is protected by a single latch called the "shared pool latch", which is held during the whole operation). However, ORA-4031 errors don't always affect the performance of the database.
If the SHARED_POOL_SIZE is large enough, most ORA-04031 errors are a result of dynamic sql fragmenting the shared pool. This can be caused by:
o Not sharing SQL
o Making unnecessary parse calls (soft)
o Setting session_cached_cursors too high
o Not using bind variables
To reduce fragmentation you will need to address one or more of the causes described above. In general, to reduce fragmentation you must analyze how the application is using the shared pool and maximize the use of sharable cursors.
Please refer to , which describes these options in greater detail. This note contains as well further detail on how the shared pool works.
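To make the "not using bind variables" cause concrete, here is a small SQL*Plus illustration. The `emp` table and its columns are hypothetical stand-ins for an application table; the point is only the difference in statement text.

```sql
-- Literal values: each statement has different text, so each one
-- is parsed separately and takes its own space in the library cache.
SELECT ename FROM emp WHERE empno = 7369;
SELECT ename FROM emp WHERE empno = 7499;

-- Bind variable: the statement text stays identical across executions,
-- so the same shared cursor can be reused.
VARIABLE empno NUMBER
EXEC :empno := 7369
SELECT ename FROM emp WHERE empno = :empno;
EXEC :empno := 7499
SELECT ename FROM emp WHERE empno = :empno;
```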
The following views will help you to identify non-sharable versions of SQL/PLSQL text in the shared pool:
- V$SQLAREA View
This view keeps information of every SQL statement and PL/SQL block executed in the database. The following SQL can show you statements with literal values or candidates to include bind variables:
SELECT substr(sql_text,1,40) "SQL",
count(*) ,
sum(executions) "TotExecs"
FROM v$sqlarea
WHERE executions < 5
GROUP BY substr(sql_text,1,40)
HAVING count(*) > 30
ORDER BY 2;
Note: The number "30" in the HAVING clause of the statement can be adjusted as needed to get more detailed information.
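A possible variant of the same query, offered as a sketch, also sums SHARABLE_MEM so the candidates can be ranked by how much shared pool memory their near-duplicate copies occupy. The thresholds are carried over from the statement above and can be adjusted in the same way.

```sql
-- Same grouping as above, ranked by shared pool memory consumed
SELECT SUBSTR(sql_text, 1, 40) "SQL",
       COUNT(*)                "Copies",
       SUM(sharable_mem)       "SharableMem"
FROM   v$sqlarea
WHERE  executions < 5
GROUP  BY SUBSTR(sql_text, 1, 40)
HAVING COUNT(*) > 30
ORDER  BY 3 DESC;
```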
- X$KSMLRU View
There is a fixed table called x$ksmlru that tracks allocations in the shared pool that cause other objects in the shared pool to be aged out. This fixed table can be used to identify what is causing the large allocation.
If many objects are being periodically flushed from the shared pool then this will cause response time problems and will likely cause library cache latch contention problems when the objects are reloaded into the shared pool.
One unusual thing about the x$ksmlru fixed table is that the contents of the fixed table are erased whenever someone selects from the fixed table. This is done since the fixed table stores only the largest allocations that have occurred. The values are reset after being selected so that subsequent large allocations can be noted even if they were not quite as large as others that occurred previously. Because of this resetting, the output of selecting from this table should be carefully kept since it cannot be retrieved back after the query is issued.
To monitor this fixed table just run the following:
SELECT * FROM X$KSMLRU WHERE ksmlrsiz > 0;
This fixed table can only be queried when connected as SYS.
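Since the contents are erased by the act of selecting, one simple way to keep the output is to spool it to a file in SQL*Plus. This is a minimal sketch; the file name is a hypothetical example.

```sql
-- Run as SYS. Everything returned by the query is written to the spool file,
-- so the rows survive even though X$KSMLRU is reset by the SELECT.
SPOOL ksmlru_snapshot.log
SELECT * FROM x$ksmlru WHERE ksmlrsiz > 0;
SPOOL OFF
```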
- X$KSMSP View (Similar to Heapdump Information)
Using this view you will be able to find out how the free space is currently allocated, which will be helpful to understand the level of fragmentation of the shared pool. As described before, the first place to find a chunk big enough for the cursor allocation is the free list. The following SQL shows the chunks available in the free list:
select '0 (<140)' BUCKET, KSMCHCLS, KSMCHIDX, 10*trunc(KSMCHSIZ/10) "From",
count(*) "Count" , max(KSMCHSIZ) "Biggest",
trunc(avg(KSMCHSIZ)) "AvgSize", trunc(sum(KSMCHSIZ)) "Total"
from x$ksmsp
where KSMCHSIZ<140
and KSMCHCLS='free'
group by KSMCHCLS, KSMCHIDX, 10*trunc(KSMCHSIZ/10)
UNION ALL
select '1 (140-267)' BUCKET, KSMCHCLS, KSMCHIDX,20*trunc(KSMCHSIZ/20) ,
count(*) , max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize", trunc(sum(KSMCHSIZ)) "Total"
from x$ksmsp
where KSMCHSIZ between 140 and 267
and KSMCHCLS='free'
group by KSMCHCLS, KSMCHIDX, 20*trunc(KSMCHSIZ/20)
UNION ALL
select '2 (268-523)' BUCKET, KSMCHCLS, KSMCHIDX, 50*trunc(KSMCHSIZ/50) ,
count(*) , max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize", trunc(sum(KSMCHSIZ)) "Total"
from x$ksmsp
where KSMCHSIZ between 268 and 523
and KSMCHCLS='free'
group by KSMCHCLS, KSMCHIDX, 50*trunc(KSMCHSIZ/50)
UNION ALL
select '3-5 (524-4107)' BUCKET, KSMCHCLS, KSMCHIDX, 500*trunc(KSMCHSIZ/500) ,
count(*) , max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize", trunc(sum(KSMCHSIZ)) "Total"
from x$ksmsp
where KSMCHSIZ between 524 and 4107
and KSMCHCLS='free'
group by KSMCHCLS, KSMCHIDX, 500*trunc(KSMCHSIZ/500)
UNION ALL
select '6+ (4108+)' BUCKET, KSMCHCLS, KSMCHIDX, 1000*trunc(KSMCHSIZ/1000) ,
count(*) , max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize", trunc(sum(KSMCHSIZ)) "Total"
from x$ksmsp
where KSMCHSIZ >= 4108
and KSMCHCLS='free'
group by KSMCHCLS, KSMCHIDX, 1000*trunc(KSMCHSIZ/1000);
Note: The information available in this view is the same that is generated as part of a HEAPDUMP level 2.
Also be aware that running this query too often is likely to cause other memory issues in the shared pool.
There is also a port specific bug filed on HP and 10g where running queries on x$ksmsp will hang the database.
If the result of the above query shows that most of the space available is on the top part of the list (meaning available only in very small chunks), it is very likely that the error is due to heavy fragmentation.
You can also use this view as follows to review overall memory usage in the SGA:
SQL> SELECT KSMCHCLS CLASS, COUNT(KSMCHCLS) NUM, SUM(KSMCHSIZ) SIZ,
To_char( ((SUM(KSMCHSIZ)/COUNT(KSMCHCLS)/1024)),'999,999.00')||'k' "AVG SIZE"
FROM X$KSMSP GROUP BY KSMCHCLS;
CLASS           NUM        SIZ AVG SIZE
-------- ---------- ---------- ------------
R-free           12    8059200      655.86k <= Reserved List
R-freea          24        960         .04k <= Reserved List
free            331  151736448      447.67k <= Free Memory
freeabl        4768    7514504        1.54k <= Memory for user / system processing
perm              2   30765848   15,022.39k <= Memory allocated to the system
recr           3577    3248864         .89k <= Memory for user / system processing
a) If free memory (SIZ) is low (less than 5 MB or so), you may need to increase the shared_pool_size and shared_pool_reserved_size.
b) If perm continually grows, then it is possible you are seeing a system memory leak.
c) If freeabl and recr are always huge, this indicates that you have lots of cursor information stored that is not being released.
d) If free is huge but you are still getting 4031 errors, fragmentation is the likely cause (you can correlate that with the reloads and invalidations causing fragmentation).
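For case d), the reloads and invalidations can be read per namespace from V$LIBRARYCACHE. The query below is a sketch of one way to do that correlation; what counts as "high" depends on the workload, so it is best compared between snapshots taken some time apart.

```sql
-- Namespaces with the most reloads and invalidations first
SELECT namespace,
       pins,
       reloads,
       invalidations
FROM   v$librarycache
ORDER  BY reloads DESC, invalidations DESC;
```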
4. ORA-04031 error and Large Pool
The Large pool is an optional memory area that can be configured to provide large memory allocations for one of the following operations:
- Session memory for the multi-threaded server and the Oracle XA interface.
- The memory (buffers) for Oracle backup and restore operations and for I/O server processes.
- Parallel Execution messaging buffers.
The Large pool does not have an LRU list. It is different from reserved space in the shared pool, which uses the same LRU list as other memory allocated from the shared pool.
Chunks of memory are never aged out of the large pool; memory has to be explicitly allocated and freed by each session.
If there is no free memory left when a request is made, then an ORA-4031 will be signalled similar to this:
ORA-04031: unable to allocate XXXX bytes of shared memory
("large pool","unknown object","session heap","frame")
A few things can be checked when this error occurs:
1- Check V$SGASTAT and see how much memory is used and free using the following SQL statement:
SELECT pool,name,bytes FROM v$sgastat where pool = 'large pool';
2- You can also take a heapdump level 32 to dump the large pool heap and check free chunk sizes.
Memory is allocated from the large pool in chunks of LARGE_POOL_MIN_ALLOC bytes to help avoid fragmentation. Any request to allocate a chunk smaller than LARGE_POOL_MIN_ALLOC will be allocated with a size of LARGE_POOL_MIN_ALLOC. In general you may see more memory usage when using the Large Pool compared to the Shared Pool.
Usually, to resolve an ORA-4031 in the large pool, LARGE_POOL_SIZE must be increased.
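The heapdump and the resize mentioned in this section might look as follows. This is a sketch under assumptions: the 64M value is a hypothetical example, not a recommendation, and SCOPE = SPFILE assumes a 9i or later instance started with a server parameter file (on older releases the parameter is edited in init.ora instead).

```sql
-- Dump the large pool heap to a trace file in user_dump_dest
ALTER SESSION SET EVENTS 'immediate trace name heapdump level 32';

-- Record a larger large pool in the spfile; it takes effect at the next startup
ALTER SYSTEM SET large_pool_size = 64M SCOPE = SPFILE;
```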
5. ORA-04031 and SHARED POOL FLUSHING
There are several techniques to increase cursor sharability so that shared pool fragmentation is reduced, as well as the likelihood of ORA-4031 errors. The best way is to modify the application to use bind variables. Another workaround, when the application cannot be modified, is setting CURSOR_SHARING to a value other than EXACT (be aware that this may cause changes in execution plans, so it is advisable to test the application first). When none of the above techniques can be used and fragmentation in the system is considerably heavy, flushing the shared pool might help alleviate the fragmentation. However, some considerations must be taken into account:
- Flushing the shared pool removes all cursors that are not in use from the library cache. Therefore, just after the shared pool flush is issued, most of the SQL and PL/SQL cursors will have to be hard parsed. This will increase the CPU usage of the system and will also increase the latch activity.
- When applications don't use bind variables and many users frequently perform similar operations (as in OLTP systems), it is common that the fragmentation is back in place soon after the flush is issued. So be advised that flushing the shared pool is not always the solution for a bad application.
- For a large shared pool, flushing the shared pool may cause a halt of the system, especially when the instance is very active. It is recommended to flush the shared pool during off-peak hours.
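For reference, the two workarounds discussed in this section correspond to the statements below. This is a sketch: FORCE is used only as an example of a value other than EXACT, and trying it at session level first is one cautious way to test the effect on execution plans.

```sql
-- Try literal replacement for the current session only
ALTER SESSION SET cursor_sharing = FORCE;

-- Apply it instance-wide once the application has been tested
ALTER SYSTEM SET cursor_sharing = FORCE;

-- Flush unused cursors from the shared pool (preferably during off-peak hours)
ALTER SYSTEM FLUSH SHARED_POOL;
```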
6. Advanced analysis of ORA-04031 error
If none of the techniques provided resolves the occurrence of ORA-04031 errors, additional tracing may be needed to get a snapshot of the shared pool when the problem is in place.
Modify the init.ora parameters to add the following events to get a trace file with additional information about the problem:
event = "4031 trace name errorstack level 3"
event = "4031 trace name HEAPDUMP level 2"
Note: This parameter will not take effect unless the instance is bounced.
Starting with 9.2.0.5, instead of requesting heapdump level 1, 2, 3 or 32, you can use those same levels plus 536870912.
This will generate the 5 largest subheaps AND the 5 largest heap areas within each of those.
If the problem is reproducible, the event can be set at session level using the following statement before the execution of the faulty SQL statement:
SQL> alter session set events '4031 trace name errorstack level 3';
SQL> alter session set events '4031 trace name HEAPDUMP level 536870914';
This trace file should be sent to Oracle Support for troubleshooting.
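The level 536870914 used here is heapdump level 2 plus the 536870912 offset described earlier, and 536870912 is 2 to the power of 29. As a small worked example, the combined values for each base level can be computed directly:

```sql
-- Base heapdump levels 1, 2, 3 and 32 combined with the 2^29 offset
SELECT POWER(2, 29)      AS offset_value,
       POWER(2, 29) + 1  AS level_1,
       POWER(2, 29) + 2  AS level_2,
       POWER(2, 29) + 3  AS level_3,
       POWER(2, 29) + 32 AS level_32
FROM   dual;
```

This gives 536870912 for the offset and 536870913, 536870914, 536870915 and 536870944 for levels 1, 2, 3 and 32 respectively.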
Important Note: In Oracle 9.2.0.5 and Oracle 10g releases a trace file is generated BY DEFAULT every time an ORA-4031 error occurs, and can be located in the user_dump_dest directory. If your database version is one of these, you don't need to follow the steps described before to generate additional tracing.
RELATED DOCUMENTS
FAQ: ORA-4031
How to Calculate Your Shared Pool Size
Understanding and Tuning the Shared Pool
Resolving Shared Pool Fragmentation In Oracle7
Tuning Library Cache Latch Contention
ORA-4031 / Continuous Growth of 'miscellaneous' in v$sgastat when STATISTICS_LEVEL is set to TYPICAL or ALL
ORA-4031 with calls to ksfd_alloc_sgabuffer, ksfd_alloc_contig_buffer, ksfd_get_contig_buffer