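The symptom on the system was: "the mmon process holds locks on some SYS objects, and then that process's CPU usage climbs to 100%".
After debugging the process, the trace file contained the following: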
*** ACTION NAME:(Remote-Flush Slave Action) 2011-10-25 20:00:08.996
*** MODULE NAME:(MMON_SLAVE) 2011-10-25 20:00:08.996
*** SERVICE NAME:(SYS$BACKGROUND) 2011-10-25 20:00:08.996
*** SESSION ID:(2553.18657) 2011-10-25 20:00:08.996
WARNING:io_submit failed due to kernel limitations MAXAIO for process=0 pending aio=0
WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65483
WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=0
ksfdgo()+1488<-ksfdaio1()+9848<-kfkUfsIO()+594<-kfkDoIO()+631<-kfkIOPriv()+616<-kfdIOPriv()+95<-kfioSubmitIO()+503<-kfioRequestPriv()+166<-kfioRequest()+689<-ksfd_osmgo()+1286<-ksfdgo()+1488<-ksfdaio1()+9848<-ksfqwr()+335<-kcflfi()+670<-kcvrsz()+1131<-ktfbfcsz()+657
<-ktfbfxtnd()+237<-ktfbtgex1()+2461<-ktsxs_add()+1480<-ktspnr_next()+1206<-ktrsexec()+437<-ktspbmphwm()+1229<-ktspmvhwm()+49<-ktsp_bump_hwm()+191<-ktspgsp_cbk()+983<-kdisnew()+304<-kdisnewle()+125<-kdisle()+4556<-kdiins0()+26993<-kauxsin()+3965<-insidx()+2509
<-insflush()+466<-insrow()+933<-insdrv()+589<-inscovexe()+399<-insExecStmtExecIniEngine()+85<-insexe()+384<-opiexe()+9334<-kpoal8()+2295<-opiodr()+1184<-kpoodrc()+38<-rpiswu2()+409<-kpoodr()+554<-upirtrc()+2101<-kpurcsc()+125<-kpuexecv8()+1705<-kpuexec()+2643
<-OCIStmtExecute()+41ssd_unwind_bp: unhandled instruction at 0x14fdbdf instr=6a
ssd_unwind_bp: unhandled instruction at 0x14fc333 instr=68
<-kewrose_oci_stmt_exec()+62<-kewrgwxf1_gwrsql_exft_1()+284<-kewrgwxf_gwrsql_exft()+451<-kewrews_execute_wr_sql()+52<-kewrftbs_flush_table_by_sql()+188<-kewrft_flush_table()+223<-kewrftec_flush_table_ehdlcx()+805<-kewrfat_flush_all_tables()+1243<-kewrfsr_flush_snapshot_r()+173
<-kewrrfs_remote_flush_slave()+1002<-kebm_slave_main()+221<-ksvrdp()+1159<-opirip()+748<-opidrv()+583<-sou2o()+114<-opimai_real()+317<-main()+116<-__libc_start_main()+219<-_start()+42
*** 2011-10-25 23:20:17.038
ssd_unwind_bp: unhandled instruction at 0x14fdbdf instr=6a
ssd_unwind_bp: unhandled instruction at 0x14fc333 instr=68
*** 2011-10-26 08:48:54.726
Received ORADEBUG command 'dump errorstack 3' from process Unix process pid: 1591, image:
*** 2011-10-26 08:48:54.726
ksedmp: internal or fatal error
Current SQL statement for this session:
insert into wrh$_sysstat (snap_id, dbid, instance_number, stat_id, value) select :snap_id, :dbid, :instance_number, stat_id, value from v$sysstat order by stat_id
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+31 call ksedst1() 000000000 ? 000000001 ?
7FBFFD6590 ? 7FBFFD65F0 ?
7FBFFD6530 ? 000000000 ?
ksedmp()+610 call ksedst() 000000000 ? 000000001 ?
7FBFFD6590 ? 7FBFFD65F0 ?
7FBFFD6530 ? 000000000 ?
ksdxfdmp()+1153 call ksedmp() 000000003 ? 000000001 ?
7FBFFD6590 ? 7FBFFD65F0 ?
7FBFFD6530 ? 000000000 ?
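The three WARNING lines near the top of the trace (shown in bold in the original post) tell most of the story: the system has run out of AIO resources, with AIO-NR=65483 against a limit of AIO-MAX-NR=65536.
The session's waits look like this: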
SO: 0x159d85068, type: 4, owner: 0x15f94e478, flag: INIT/-/-/0x00
(session) sid: 2553 trans: (nil), creator: 0x15f94e478, flag: (100051) USR/- BSY/-/-/-/-/-
DID: 0002-02E5-00000030, short-term DID: 0000-0000-00000000
txn branch: (nil)
oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS
service name: SYS$BACKGROUND
last wait for 'Data file init write' wait_time=0.000016 sec, seconds since wait started=46124
count=1, intr=100, timeout=ffffffff
blocking sess=0x(nil) seq=224
Dumping Session Wait History
for 'Data file init write' count=1 wait_time=0.000016 sec
count=1, intr=100, timeout=ffffffff
for 'Data file init write' count=1 wait_time=0.000016 sec
count=1, intr=100, timeout=ffffffff
for 'Data file init write' count=1 wait_time=0.000035 sec
count=1, intr=100, timeout=ffffffff
for 'Data file init write' count=1 wait_time=0.614215 sec
count=1, intr=100, timeout=ffffffff
for 'CSS operation: action' count=1 wait_time=0.000080 sec
function_id=41, =0, =0
for 'CSS initialization' count=1 wait_time=0.000004 sec
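The AIO exhaustion reported by the WARNING lines can be checked directly on a Linux host by comparing the kernel counters /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr. The following is a minimal sketch, not part of the original diagnosis; the function name aio_headroom and its optional file arguments are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Report how close the host is to the kernel-wide async I/O limit.
# aio_headroom is a hypothetical helper name. Both arguments are optional
# and default to the standard Linux /proc files; they are accepted only so
# the function can be pointed at other files.
aio_headroom() {
    nr_file="${1:-/proc/sys/fs/aio-nr}"
    max_file="${2:-/proc/sys/fs/aio-max-nr}"
    nr=$(cat "$nr_file") || return 1
    max=$(cat "$max_file") || return 1
    # free = number of AIO request slots still available system-wide
    echo "aio-nr=$nr aio-max-nr=$max free=$((max - nr))"
}

# Only query the live counters when they exist (Linux).
if [ -r /proc/sys/fs/aio-nr ] && [ -r /proc/sys/fs/aio-max-nr ]; then
    aio_headroom
fi
```

Fed the two values from the trace above (65483 and 65536), the function reports free=53, which is consistent with io_submit failing for a process that needs more slots than remain.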
The fix is also simple:
Increase the value of fs.aio-max-nr. In this case, raising it to fs.aio-max-nr = 1048576 resolved the problem.
See Metalink notes 1313555.1 and 9949948.8.
The problem is attributed to Bug 9949948.
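One way to apply the change is sketched below. It assumes the setting lives in /etc/sysctl.conf, which varies by distribution, and the helper name set_sysctl_conf is hypothetical. The function rewrites a single key in a sysctl-style file without duplicating it; the commands that actually touch the system need root and are left commented out.

```shell
#!/bin/sh
# set_sysctl_conf KEY VALUE [FILE]
# Hypothetical helper: ensure FILE (default /etc/sysctl.conf, an assumption)
# contains exactly one "KEY = VALUE" line, replacing any earlier assignment.
set_sysctl_conf() {
    key="$1"; value="$2"; conf="${3:-/etc/sysctl.conf}"
    tmp=$(mktemp) || return 1
    # Escape the dots in the key so grep matches them literally.
    pattern="^[[:space:]]*$(printf '%s' "$key" | sed 's/\./\\./g')[[:space:]]*="
    # Keep every line that does not already assign this key.
    grep -v "$pattern" "$conf" > "$tmp" 2>/dev/null
    echo "$key = $value" >> "$tmp"
    # Write back through the original file so its ownership and mode are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# As root, using the value from this case:
# set_sysctl_conf fs.aio-max-nr 1048576   # persist across reboots
# sysctl -p                               # load /etc/sysctl.conf into the running kernel
# sysctl fs.aio-max-nr                    # confirm the new limit
```

Using sysctl -w fs.aio-max-nr=1048576 instead changes only the running kernel, so the setting would be lost at the next reboot unless it is also written to the configuration file.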