Oracle数据库的故障解决
1. ORA-00257
sqlplus /nolog
SQL*Plus: Release 10.2.0.1.0 - Production on 星期二 7月 25 10:44:18 2006
Copyright (c) 1982, 2005, Oracle. All rights reserved.
SQL> connect / as sysdba
已连接。
SQL> select * from v$log;
GROUP# THREAD# SEQUENCE# BYTES MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIME
---------- ---------- ---------- ---------- ---------- --- --------------------------------------- --------------
1 1 101 52428800 1 NO CURRENT 3621973 24-7月 -06
2 1 99 52428800 1 NO INACTIVE 3600145 24-7月 -06
3 1 100 52428800 1 NO INACTIVE 3611932 24-7月 -06
发现ARC状态为NO,表示系统没法自动做归档
SQL> select * from v$recovery_file_dest;
NAME SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES
------------------------------------------------------------------------------------------------------------------
/oracle/flash_recovery_area 2147483648 2134212608 0 35
SQL> select * from v$flash_recovery_area_usage;
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------- -------------- -------------- -------------
CONTROLFILE 0 0 0
ONLINELOG 0 0 0
ARCHIVELOG 69.97 0 40
BACKUPPIECE 30.01 0 2
IMAGECOPY 0 0 0
FLASHBACKLOG 0 0 0
已选择6行。
发现ARCHIVELOG占近70%,BACKUPPIRCR占了30%,这样FLASH_RECOVERY_AREA空间的空间已经被完全占据了
根据数据库目前可用存储空间为200GB、FLASH_RECOVERY_AREA空间为2GB的实际情况,把FLASH_RECOVERY_AREA的空间修改为20GB。
SQL> alter system set DB_RECOVERY_FILE_DEST_SIZE=20g;
系统已更改。
SQL> select * from v$recovery_file_dest;
------------------------------------------------------- ---------- -----------------------------------
NAME SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES
----------- ---------- ----------------- ------------- -------------- ---------- ---------- ------------
/oracle/flash_recovery_area 2.1475E+10 2264587776 0 38
这时再查看日志的状态,发现REDO LOG处于正常的归档状态。
SQL> select * from v$log;
GROUP# THREAD# SEQUENCE# BYTES MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIME
---------- ---------- ---------- ---------- ---------- --- -------------------------------------------- --------------
1 1 101 52428800 1 YES ACTIVE 3621973 24-7月 -06
2 1 102 52428800 1 NO CURRENT 3650399 25-7月 -06
3 1 100 52428800 1 YES INACTIVE 3611932 24-7月 -06
SQL> select * from v$flash_recovery_area_usage;
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE 0 0 0
ONLINELOG 0 0 0
ARCHIVELOG 7.6 0 43
BACKUPPIECE 4.21 0 2
IMAGECOPY 0 0 0
FLASHBACKLOG 0 0 0
已选择6行。
SQL>
2. ORA-600
日志文件中错误信息:
Mon Apr 16 14:37:52 2007
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
SCN scheme 2
Using log_archive_dest parameter default value
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 9.2.0.1.0.
System parameters with non-default values:
processes = 150
timed_statistics = TRUE
shared_pool_size = 50331648
large_pool_size = 8388608
java_pool_size = 33554432
control_files = f:\ora\oradata\gzsb\control01.ctl, f:\ora\oradata\gzsb\control02.ctl, f:\ora\oradata\gzsb\control03.ctl
db_block_size = 8192
db_cache_size = 25165824
compatible = 9.2.0.0.0
db_file_multiblock_read_count= 16
fast_start_mttr_target = 300
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 10800
remote_login_passwordfile= EXCLUSIVE
db_domain =
instance_name = xxxxxx
dispatchers = (PROTOCOL=TCP) (SERVICE=gzsbXDB)
job_queue_processes = 10
hash_join_enabled = TRUE
background_dump_dest = c:\oracle\admin\gzsb\bdump
user_dump_dest = c:\oracle\admin\gzsb\udump
core_dump_dest = c:\oracle\admin\gzsb\cdump
sort_area_size = 524288
db_name = xxxxx
open_cursors = 300
star_transformation_enabled= FALSE
query_rewrite_enabled = FALSE
pga_aggregate_target = 25165824
aq_tm_processes = 1
PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
RECO started with pid=7
CJQ0 started with pid=8
QMN0 started with pid=9
Mon Apr 16 14:37:54 2007
starting up 1 shared server(s) ...
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Mon Apr 16 14:37:55 2007
ALTER DATABASE MOUNT
Mon Apr 16 14:38:00 2007
Successful mount of redo thread 1, with mount id 1403199684.
Mon Apr 16 14:38:00 2007
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE MOUNT
Mon Apr 16 14:38:00 2007
ALTER DATABASE OPEN
Mon Apr 16 14:38:00 2007
Beginning crash recovery of 1 threads
Mon Apr 16 14:38:00 2007
Started first pass scan
Mon Apr 16 14:38:00 2007
Completed first pass scan
62 redo blocks read, 4 data blocks need recovery
Mon Apr 16 14:38:00 2007
Started recovery at
Thread 1: logseq 43, block 3, scn 0.0
Recovery of Online Redo Log: Thread 1 Group 1 Seq 43 Reading mem 0
Mem# 0 errs 0: F:\ORA\ORADATA\GZSB\REDO01.LOG
Mon Apr 16 14:38:00 2007
Ended recovery at
Thread 1: logseq 43, block 65, scn 0.2970218
4 data blocks read, 4 data blocks written, 62 redo blocks read
Crash recovery completed successfully
Mon Apr 16 14:38:00 2007
Thread 1 advanced to log sequence 44
Thread 1 opened at log sequence 44
Current log# 2 seq# 44 mem# 0: F:\ORA\ORADATA\GZSB\REDO02.LOG
Successful open of redo thread 1.
Mon Apr 16 14:38:00 2007
SMON: enabling cache recovery
Mon Apr 16 14:38:00 2007
Undo Segment 1 Onlined
Undo Segment 2 Onlined
Undo Segment 3 Onlined
Undo Segment 4 Onlined
Undo Segment 5 Onlined
Undo Segment 6 Onlined
Undo Segment 7 Onlined
Undo Segment 8 Onlined
Undo Segment 9 Onlined
Undo Segment 10 Onlined
Successfully onlined Undo Tablespace 1.
Mon Apr 16 14:38:00 2007
SMON: enabling tx recovery
Mon Apr 16 14:38:00 2007
Database Characterset is ZHS16GBK
Mon Apr 16 14:38:00 2007
Errors in file c:\oracle\admin\gzsb\bdump\gzsb_smon_5196.trc:
ORA-00600: internal error code, arguments: [4194], [98], [75], [], [], [], [], []
Mon Apr 16 14:38:01 2007
Errors in file c:\oracle\admin\gzsb\udump\gzsb_ora_3648.trc:
ORA-00600: 内部错误代码,参数: [4194], [92], [84], [], [], [], [], []
Mon Apr 16 14:39:02 2007
Recovery of Online Redo Log: Thread 1 Group 2 Seq 44 Reading mem 0
Mem# 0 errs 0: F:\ORA\ORADATA\GZSB\REDO02.LOG
Recovery of Online Redo Log: Thread 1 Group 2 Seq 44 Reading mem 0
Mem# 0 errs 0: F:\ORA\ORADATA\GZSB\REDO02.LOG
Mon Apr 16 14:39:02 2007
Errors in file c:\oracle\admin\gzsb\udump\gzsb_ora_3648.trc:
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [92], [84], [], [], [], [], []
Error 607 happened during db open, shutting down database
USER: terminating instance due to error 607
Instance terminated by USER, pid = 3648
ORA-1092 signalled during: ALTER DATABASE OPEN...
查看资料得到:
Would seem that the error is:
ORA-600 [4194] "Undo Record Number Mismatch While Adding Undo Record".
Refer to Metalink note 39283.1.
In future use Metalink note 153788.1 (Subject: Troubleshoot an ORA-600
Error Using the ORA-600 Argument Lookup Tool).
由于数据库只能在MOUNT状态所以
select * from v$rollname;
select * from undo$;
select * from v$tablespace;都不能使用
1、通过错误信息可以确定当前回滚段是1-10,所以修改PFILE文件将下面这个隐含参数加到文件中去:
._corrupted_rollback_segments='_SYSSMU1$','_SYSSMU2$','_SYSSMU3$','_SYSSMU4$','_SYSSMU5$','_SYSSMU6$','_SYSSMU7$','_SYSSMU8$','_SYSSMU9$','_SYSSMU10$'
STARTUP PFILE='XXXX'
可以看到数据库已经正常启动
2、create undo tablespace undotbs2 datafile 'F:\ORA\ORADATA\GZSB\undotbs2.dbf';
alter system set undo_tablespace=undotbs2;
drop tablespace undotbs2;
3、修改参数文件,变更undo表空间,并取消_corrupted_rollback_segments设置:
*.undo_tablespace='UNDOTBS2'
4、startup pfile='xxxxx'
create spfile from pfile;
shutdown immediate
statup
故障解决,进行全库备份
3. 10201上一个严重的BUG
环境 10201,AIX53
但据ORACLE解释,在任何操作系统版本都有此问题。
现象:监听器启动后,隔一段时间(长短不定),就会出现无法连接: 若是用10201版本的SQLPLUS,则会出现 NO LISTENER。
9207 版本的SQLPLUS,则会出现:没反应,HANG住。
原因:10201 版本上的一个BUG:4518443。其会自动创建一个子监听器,当出现此情况时,监听器将会挂起。
/opt/oracle/product/10g/network/log/listener.log有如下语句:
WARNING: Subscription for node down event still pending
检查是否真因为此BUG造成此现象:
$ ps -ef | grep tnslsnr
ora10g 8909 1 0 Sep 15 ? 902:44 /u05/10GHOME/DBHOME/bin/tnslsnr sales -inherit
ora10g 22685 8909 0 14:19:23 ? 0:00 /u05/10GHOME/DBHOME/bin/tnslsnr sales –inherit
正常情况只有一个监听器,而此BUG则会出现两个监听器。
解决方法:打补丁4518443 或者在listener.ora 文件里加入:
SUBSCRIBE_FOR_NODE_DOWN_EVENT_=OFF
其中, 是数据库的监听器的名称。如:默认情况下,监听器名为:LISTENER 。则语句就是:
SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER=OFF
重启监听程序:
lsnrctl stop
lncrctl start
4. ORA-01031: insufficient privileges的解决方法
1。检查sqlnet.ora 文件.
sqlnet.ora 文件损坏或格式不对可以导致出现该问题。
sqlnet.ora 文件可能存放路径为
$TNS_ADMIN/sqlnet.ora
如果没有设置$TNS_ADMIN默认在$ORACLE_HOME/network/admin/sqlnet.ora
或
$HOME/sqlnet.ora
(1). 可以从别的机器拷贝一个文件过来,注意备份原来的sqlnet.ora。
---检查sqlnet.ora 文件内容
(2). 检查SQLNET.AUTHENTICATION_SERVICES
如果没有使用dblink.检查该行并设置
SQLNET.AUTHENTICATION_SERVICES = (BEQ,NONE)
(3). SQLNET.CRYPTO_SEED
在unix 下不需要该参数。如果存在该行,注释掉或删掉
(4).AUTOMATIC_IPC
如果该参数为 ON,将强制使用"TWO_TASK" 连接
最好设置为OFF
AUTOMATIC_IPC = OFF
2.检查相关文件的权限配置。
找到$TNS_ADMIN/*
$ cd $TNS_ADMIN
$ chmod 644 sqlnet.ora tnsnames.ora listener.ora
$ ls -l sqlnet.ora tnsnames.ora listener.ora
-rw-r--r-- 1 oracle dba 1628 Jul 12 15:25 listener.ora
-rw-r--r-- 1 oracle dba 586 Jun 1 12:07 sqlnet.ora
-rw-r--r-- 1 oracle dba 82274 Jul 12 15:23 tnsnames.ora
3.检查操作系统相关设置。
(1). $ORACLE_HOME环境变量是否设置正确
% cd $ORACLE_HOME
% pwd
如果错误,请重新设置:
sh or ksh: ----------
$ ORACLE_HOME=
$ export ORACLE_HOME
Example:
$ ORACLE_HOME=/u01/app/oracle/product/7.3.3
$ export ORACLE_HOME
csh: ----
% setenv ORACLE_HOME Example:
% setenv ORACLE_HOME /u01/app/oracle/product/7.3.3
另外$ORACLE_HOME路径应为实际路径,不应是目录连接(ln -s)
(2) $ORACLE_SID是否设置正确;
% echo $ORACLE_SID
(3).确信没有设置$TWO_TASK
检查 "TWO_TASK" 是否设置:
sh, ksh or on HP/UX only csh:
-----------------------------------
env |grep -i two
- or -
echo $TWO_TASK
csh:
----
setenv |grep -i two
如果有返回行比如:
TWO_TASK=
- or -
TWO_TASK=PROD
就需要取消着这些环境变量设置 :
sh or ksh:
----------
unset TWO_TASK
csh:
----
unsetenv TWO_TASK
(4) 检查oracle 文件的权限:
% cd $ORACLE_HOME/bin
% ls -l oracle
权限应为:rwsr-s--x, or 6751.
如果不是:
% chmod 6751 oracle
(5). 检查当前所连接的操作系统用户是否是"osdba" 并且已经定义在:
"$ORACLE_HOME/rdbms/lib/config.s"
or
"$ORACLE_HOME/rdbms/lib/config.c".
通常应为dba
% id uid=1030(oracle) gid=1030(dba)
可以如果"gid" 是 "dba" , "config.s" or "config.c"
里面应该有: /* 0x0008 15 */ .ascii "dba\0"
如果没有添加目前的操作系统用户到dba 组,或则手工编辑更改config.c并且:%relink oracle
(6).所需要的文件系统是否正确mount
%mount
(7) 目前身份是否是"root" 并且操作系统环境变量 "USER", "USERNAME", and "LOGNAME" 没有设置成"root".
root用户是特例,除非当前组是dba 组,否则不能connect internal.
把root用户当前组改为dba组:
# newgrp dba
-----最好不要以root管理数据库;
(8).检查"/etc/group" :
是否存在重复行
% grep dba /etc/group
dba::1010:
dba::1100:
如果有,删掉没有用的。
(9).确信停掉的instance没有占用内存资源
比如:ipcs -b
T ID KEY MODE OWNER GROUP SEGSZ
Shared Memory:
m 0 0x50000ffe --rw-r--r-- root root 68
m 1601 0x0eedcdb8 --rw-r----- oracle dba 4530176
可以看到1601 被oracle 使用,删掉.
-------注意是否启动了多个instance
% ipcrm -m 1601
(10).如果同时还有ora-12705 错误检查一下环境变量:
"ORA_NLS", "ORA_NLS32", "ORA_NLS33" ,"NLS_LANG".
(11).检查 "ORACLE_HOME" and "LD_LIBRARY_PATH 环境变量:
$ LD_LIBRARY_PATH=$ORACLE_HOME/lib
$ export LD_LIBRARY_PATH
$ ORACLE_HOME=/u01/app/oracle/product/8.0.4
$ export ORACLE_HOME
(12).当前的instance 所再的磁盘是否有足够的磁盘空间
df -k
(13).用户对/etc/passwd 是否有读权限。
(14).如果使用mts 方式,确信你的连接使用dedicade server 方式。
(15).安装ORACLE所需操作系统补丁是否打全。ORACLE 是否已经补丁到最新