Oracle's Shared Memory Segments - HZdT7e4 - ChinaUnix Blog

Oracle's Shared Memory Segments

2008-10-28 18:10:43

I recently came across a thread on ITPUB that I found interesting, so I am recording it here as a lesson to learn from.

The poster's Apache and Oracle run on the same host:

The platform is Red Hat AS 3 with Oracle 9204.
The other applications are Apache, Resin, and so on.

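I had found earlier that after Apache runs for a long time it fails with what looks like a shortage of shared memory. The exact error messages are: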

[Fri Apr 13 06:00:03 2007] [error] shm.create(): error creating shm 2 No such file or directory
[Fri Apr 13 06:00:03 2007] [error] shm.create(): error creating shm /home/apache/logs/shm.file
[Fri Apr 13 06:00:03 2007] [warn] pid file /home/apache/logs/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
[Fri Apr 13 06:00:03 2007] [emerg] (28)No space left on device: Couldn't create accept lock
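
A note on the last log line: error 28 (ENOSPC) when creating the accept lock usually points to an exhausted System V semaphore limit rather than a full disk. The sketch below assumes a Linux host, where the limits can be read from /proc/sys/kernel/sem as four integers (SEMMSL, SEMMNS, SEMOPM, SEMMNI). The function names and the sample values are illustrative and do not come from the original thread.

```python
def parse_sem_limits(text):
    """Parse the contents of /proc/sys/kernel/sem into named limits."""
    names = ("SEMMSL", "SEMMNS", "SEMOPM", "SEMMNI")
    values = [int(v) for v in text.split()]
    if len(values) != len(names):
        raise ValueError("expected 4 integers, got %d" % len(values))
    return dict(zip(names, values))


def sets_remaining(limits, sets_in_use):
    """Number of semaphore sets that can still be created before SEMMNI is hit."""
    return max(limits["SEMMNI"] - sets_in_use, 0)


if __name__ == "__main__":
    # On a live Linux system this text would be read from /proc/sys/kernel/sem.
    sample = "250 32000 32 128"  # illustrative values, not taken from the post
    limits = parse_sem_limits(sample)
    print(limits, sets_remaining(limits, sets_in_use=18))
```

Comparing SEMMNI with the number of rows printed by ipcs -s gives a rough idea of how close the host is to the limit.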

To work around this, the poster did the following:

So I wrote a script that checks for this at regular intervals and cleans up. It had always worked well.

Once Apache and Oracle were running on the same host, the bug in this script surfaced:

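Some time ago we launched a small new application, also Apache-based. There was nowhere else to put it, so it went onto the Oracle machine, and it ran well.
This morning I was told that the new Apache application had stopped. I opened the log and saw the shared memory problem again. Without a second thought I ran the old script on that system and restarted Apache. OK, the service was back.
A few minutes later came a much bigger problem: I was told the Oracle service was down. I checked at once, and ps -ef | grep oracle showed that all the Oracle processes were gone.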

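Because the script lacked the necessary check, Oracle's shared IPC resources were removed as well (the semop errors below show that its semaphore sets, at least, were gone), so the Oracle database went down too. The alert log recorded the following: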

Errors in file /opt/oracle/admin/sc1/bdump/sc1_reco_5195.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/bdump/sc1_smon_5193.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Fri Apr 13 10:10:46 2007
RECO: terminating instance due to error 27157
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/udump/sc1_ora_23824.trc:
ORA-27153: wait operation failed
ORA-27300: OS system dependent operation:semop failed with status: 22
ORA-27301: OS failure message: Invalid argument
ORA-27302: failure occurred at: sskgpwwait2
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/bdump/sc1_lgwr_5189.trc:
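
The status values in the ORA-27300 lines are operating system errno codes. On Linux, 43 is EIDRM ("Identifier removed", matching the ORA-27301 text), which semop returns when the semaphore set is removed while a process is waiting on it, and 22 is EINVAL, which semop returns once the set no longer exists. The helper below is a minimal sketch for decoding such codes. The name describe_errno is made up for illustration, and the numeric values are platform specific, so the mapping of 43 to EIDRM is assumed to hold only on Linux.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME: message' for an errno value on the current platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%s: %s" % (name, os.strerror(code))


if __name__ == "__main__":
    # 43 and 22 are the semop statuses from the alert log; 28 is the code
    # shown in the Apache error log. Meanings differ on non-Linux systems.
    for code in (43, 22, 28):
        print(code, describe_errno(code))
```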

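An Oracle database has to allocate shared memory segments, and semaphores along with them, on the system. This is basic knowledge, yet the poster only remembered it after the failure: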

[root@oracle]# ipcs -s

------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 4849664 nobody 600 1
0x00000000 4882433 nobody 600 1
0x00000000 4915202 nobody 600 1
0x00000000 4947971 nobody 600 1
0x00000000 4980740 nobody 600 1
0xbeae576c 5111813 oracle 640 201
0xbeae576d 5144582 oracle 640 201
0xbeae576e 5177351 oracle 640 201
0xbeae576f 5210120 oracle 640 201
0xbeae5770 5242889 oracle 640 201
0x00000000 5275658 nobody 600 1
0x00000000 5308427 nobody 600 1
0x00000000 5341196 nobody 600 1
0x00000000 5373965 nobody 600 1
0x00000000 5406734 nobody 600 1
0x00000000 5439503 nobody 600 1
0x00000000 5472272 nobody 600 1
0x00000000 5505041 nobody 600 1

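Sure enough, Oracle's IPC resources are there (semaphore sets, in this listing), and my script did not check for them. If you only want to remove the ones that belong to the apache user, you can do this: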

ipcs -s | grep apache | perl -e 'while (<STDIN>) { @a=split(/\s+/); print `ipcrm sem $a[1]`}'

If anyone has a setup similar to mine, please be careful.
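
One weakness of the grep approach is that it matches the word anywhere on the line, and it relies on knowing the real owner name. In the listing above the single-semaphore sets are owned by nobody, which is presumably the account Apache runs under on that host (an assumption, since the thread does not say). As a hedged sketch, the following parses ipcs -s output and selects semaphore ids by an exact match on the owner column. The function names and the shortened sample are illustrative and are not the poster's script.

```python
def parse_ipcs_semaphores(output):
    """Parse `ipcs -s` text into (key, semid, owner) tuples, skipping headers."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have five columns: key semid owner perms nsems.
        if len(fields) == 5 and fields[0].startswith("0x") and fields[1].isdigit():
            entries.append((fields[0], int(fields[1]), fields[2]))
    return entries


def semids_owned_by(output, owner):
    """Return the semids whose owner column equals `owner` exactly."""
    return [semid for _key, semid, who in parse_ipcs_semaphores(output) if who == owner]


# Shortened sample in the same layout as the listing quoted in the post.
SAMPLE = """\
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 4849664 nobody 600 1
0xbeae576c 5111813 oracle 640 201
0x00000000 5275658 nobody 600 1
"""

if __name__ == "__main__":
    print(semids_owned_by(SAMPLE, "nobody"))
    print(semids_owned_by(SAMPLE, "apache"))
```

The selected ids can then be reviewed by hand before anything is passed to ipcrm.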

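In truth this was a low-level, avoidable failure. First, when we run the same script in a different environment, the rigorous approach is to check and test it there to confirm that it behaves correctly. A script that has not been tested and is trusted on guesswork alone does not deserve that trust.
Second, as another aspect of rigor, the permissions and the user identity a script runs under must be made explicit. Any operation performed as root is dangerous and calls for great caution. I discussed this in my article "Some Habits a DBA Needs to Develop".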
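
Both points can be built into the cleanup itself. The sketch below is one possible shape, not the poster's script: it refuses to run as root unless explicitly allowed, rejects owners on a deny list, and only returns the ipcrm commands it would run so they can be reviewed first. The names plan_removals and PROTECTED_OWNERS and the contents of the deny list are hypothetical.

```python
import os

# Hypothetical deny list: owners whose semaphore sets must never be removed.
PROTECTED_OWNERS = frozenset({"oracle", "root"})


def plan_removals(entries, target_owner, euid=None, allow_root=False):
    """Return the ipcrm commands for semaphore sets owned by target_owner.

    entries is an iterable of (semid, owner) pairs. Nothing is executed here;
    the caller reviews the returned plan before running it.
    """
    if euid is None:
        euid = os.geteuid()  # Unix only
    if euid == 0 and not allow_root:
        raise PermissionError("refusing to run as root; run as the owning user")
    if target_owner in PROTECTED_OWNERS:
        raise ValueError("owner %r is protected" % target_owner)
    return [["ipcrm", "-s", str(semid)]
            for semid, owner in entries if owner == target_owner]


if __name__ == "__main__":
    entries = [(4849664, "nobody"), (5111813, "oracle")]
    # euid=99 is an arbitrary non-root uid used for illustration.
    for command in plan_removals(entries, "nobody", euid=99):
        print(" ".join(command))
```

Running the cleanup as the unprivileged owner adds a second safeguard, because the kernel will then refuse to remove semaphore sets that belong to another user.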

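That said, if this had been an important business database, a failure caused by an operation like this would have been truly frightening (although on important systems such mistakes basically do not happen). A DBA should therefore think carefully, and then think again, before acting.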

-The End-

-----

[Editor: Youping]
