ORACLE数据文件被删除，相关操作是否还能继续？-TOMSYAN-ChinaUnix博客

美丽人生yanshoupeng.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

TOMSYAN

博客访问： 2896181
博文数量： 599
博客积分： 16398
博客等级：上将
技术积分： 6875
用户组：普通用户
注册时间： 2009-11-30 12:04

个人简介

WINDOWS下的程序员出身，偶尔也写一些linux平台下小程序，后转行数据库行业，专注于ORACLE和DB2的运维和优化。同时也是ios移动开发者。欢迎志同道合的朋友一起研究技术。数据库技术交流群：58308065，23618606

文章分类

全部博文（599）

移动开发（2）
防火墙（1）
脚本（6）
HADOOP（6）
透明网关（5）
工具（2）
DSI（2）
热点话题讨论（2）
oracle internal（28）
Jonathan Lewis（2）
ASKTOM（12）
OLAP（1）
C/Pro*c/c++（23）
PL/SQL（10）
ORACLE全文索引（5）
DB2（28）
METALINK（27）
AIX（4）
DELPHI（5）
性能优化（19）
分布式数据库管理（1）
书籍笔记（4）
备份与恢复（7）
数据库安装及升级（18）
资料下载（43）
Oracle高可用（32）
Linux（7）
Oracle（244）
未分配的博文（53）

文章存档

2014年（12）

2013年（56）

2012年（199）

2011年（105）

2010年（128）

2009年（99）

我的朋友

相关博文

ORACLE数据文件被删除，相关操作是否还能继续？

分类： Oracle

2012-05-17 10:52:28

前几天和朋友一起吃饭的时候，聊到如果一个数据文件被删除了,那么在这个文件中相关表是否还能操作。

在UNIX平台上,打开的文件都有一个文件句柄与之关联,一般的文件被删除,相应的文件句柄并没有释放。
此时我们是可以继续对这个文件进行操作的，直到文件句柄被释放。

看如下一个例子：

[oracle@dbtest ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Thu May 17 10:16:48 2012

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> select * from scott.t ;

        ID NAME
---------- ------------------------------------------------------------
         1 cpic
         2 huateng
         3 yanshoupeng

SQL> select dbms_rowid.rowid_relative_fno(rowid) file# from scott.t;

     FILE#
----------
         4
         4
         4

SQL> select name from v$dbfile where file#=4;

NAME
--------------------------------------------------------------------------------
/test/orcl/orcl/users01.dbf

SCOTT下的表T存放到了文件 /test/orcl/orcl/users01.dbf 中。

SQL> insert into scott.t values(4,'a');

1 row created.

SQL> commit;

Commit complete.

SQL> alter system checkpoint ;

System altered.

SQL> select * from scott.t ;

        ID NAME
---------- ------------------------------------------------------------
         4 a
         1 cpic
         2 huateng
         3 yanshoupeng

日常的操作是没有任何问题的。

此时我们删除这个文件。

SQL> ! rm -rf /test/orcl/orcl/users01.dbf

SQL> ! ls -ltr /test/orcl/orcl/
total 2625104
-rw-r--r--    1 oracle   oinstall        373 May 16 10:52 log.bbd
-rw-r-----    1 oracle   oinstall   52429312 May 16 14:21 redo01.log
-rw-r-----    1 oracle   oinstall   20979712 May 16 22:00 temp01.dbf
-rw-r-----    1 oracle   oinstall   52429312 May 17 05:00 redo02.log
-rw-r-----    1 oracle   oinstall   31465472 May 17 10:18 undotbs01.dbf
-rw-r-----    1 oracle   oinstall 513810432 May 17 10:18 system01.dbf
-rw-r-----    1 oracle   oinstall 314580992 May 17 10:18 sysaux01.dbf
-rw-r-----    1 oracle   oinstall   96477184 May 17 10:18 abcd_2.dbf
-rw-r-----    1 oracle   oinstall   96477184 May 17 10:18 abcd_1.dbf
-rw-r-----    1 oracle   oinstall   96477184 May 17 10:18 abcd_0.dbf
-rw-r-----    1 oracle   oinstall   52429312 May 17 10:20 redo03.log
-rw-r-----    1 oracle   oinstall    5488640 May 17 10:20 control03.ctl
-rw-r-----    1 oracle   oinstall    5488640 May 17 10:20 control02.ctl
-rw-r-----    1 oracle   oinstall    5488640 May 17 10:20 control01.ctl

SQL> select * from scott.t ;

        ID NAME
---------- ------------------------------------------------------------
         4 a
         1 cpic
         2 huateng
         3 yanshoupeng

SQL> delete from scott.t;

4 rows deleted.

SQL> commit;

Commit complete.

SQL> insert into scott.t values(5,'b');

1 row created.

SQL> commit;

Commit complete.

SQL> select * from scott.t;

ID NAME
---------- ------------------------------------------------------------
5 b

SQL> alter system checkpoint;

System altered.

SQL> select * from scott.t;

ID NAME
---------- ------------------------------------------------------------
5 b

可以看到数据库还能继续操作的，像没有删除一样。

而且数据库日志中一点错误也没有。

虽然文件被删除了，但是相应的文件句柄并没有释放,ORACLE相关的进程通过这个文件句柄还能继续对文件进行操作，就像没有被删除一样。
以前的时候很多朋友都不理解，为何当前的日志文件被删除，ORACLE没有crash掉，还能继续运行，其实是一个原因。

[oracle@dbtest ~]$ ps -ef | grep ora_dbw
oracle 884830       1   0 14:21:27      - 0:01 ora_dbw2_orcl
oracle 758204 406292   0 10:21:29 pts/7 0:00 grep ora_dbw
oracle 275238       1   0 14:21:27      - 0:01 ora_dbw3_orcl
oracle 709866       1   0 14:21:27      - 0:01 ora_dbw1_orcl
oracle 395236       1   0 14:21:27      - 0:01 ora_dbw0_orcl
[oracle@dbtest ~]$ cd /proc/395236
[oracle@dbtest 395236]$ ls
as      cred    ctl     cwd     fd      lwp     map     object psinfo sigact status sysent
[oracle@dbtest 395236]$ cd fd
[oracle@dbtest fd]$ ls
0   1   10 12 13 14 15 16 2   3   4   5   6   7   8   9
[oracle@dbtest fd]$ ls -ltr
total 3976
-r--r--r--    1 oracle   oinstall     849408 May 10 2008 16
-r--r--r--    1 oracle   oinstall     849408 May 10 2008 12
p---------    0 oracle   oinstall          0 May 16 10:54 9
-rw-rw----    0 oracle   oinstall          0 May 16 14:21 7
--w-------    1 oracle   oinstall        858 May 16 14:21 5
--w-------    1 oracle   oinstall     157667 May 17 05:00 8
--w-------    1 oracle   oinstall     157667 May 17 05:00 6
-rw-rw----    1 oracle   oinstall       1544 May 17 10:19 15
-rw-rw----    1 oracle   oinstall       1544 May 17 10:19 10
c---------    1 oracle   oinstall      2, 2 May 17 10:21 4
c---------    1 oracle   oinstall      2, 2 May 17 10:21 3
c---------    1 oracle   oinstall      2, 2 May 17 10:21 2
c---------    1 oracle   oinstall      2, 2 May 17 10:21 1
c---------    1 oracle   oinstall      2, 2 May 17 10:21 0
-rw-rw-rw-    1 oracle   oinstall          0 May 17 10:21 14
-rw-rw-rw-    1 oracle   oinstall          0 May 17 10:21 13

如果平台是LINUX平台，那么进入目录fd的时候，你将会看到文件和文件句柄的对应关系，及其那些文件被删除了。
我的平台是AIX的，和LINUX平台不太一样，需要借助lsof命令来查看。
遗憾的是我这边的lsof命令不能用，不能给大家演示。

假设被删除的文件的句柄是10，那么我们可以通过如下命令将文件COPY回去：

cp 10 /test/orcl/orcl/users01.dbf

然后进行一下recover就行了，在没有备份的情况下，如果能及时的把误删文件COPY回去，就能及时挽救我们的数据库。

大家可以参考这篇文章：

lsof aix 使用 lsof 查找打开的文件

如果打开这个文件的进程被关闭了，相应的文件句柄将会释放，此时就再也看不到这个文件了。

我们SHUTDOWN数据库，然后再启动就会报错：

SQL> select * from v$recover_file;

no rows selected

SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 1610612736 bytes
Fixed Size                  2084400 bytes
Variable Size             385876432 bytes
Database Buffers         1207959552 bytes
Redo Buffers               14692352 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: '/test/orcl/orcl/users01.dbf'

SQL> select * from v$recover_file;

     FILE# ONLINE         ONLINE_STATUS
---------- -------------- --------------
ERROR                                                                CHANGE#
----------------------------------------------------------------- ----------
TIME
-------------------
         4 ONLINE         ONLINE
FILE NOT FOUND                                                             0

日志文件中有如下错误：

[oracle@dbtest ~]$ tail -f /u01/oracle/admin/orcl/bdump/alert_orcl.log
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Wed May 16 14:32:04 2012
OS Pid: 516540 executed alter system set events 'immediate trace name heapdump level 10'
Wed May 16 14:58:51 2012
System State dumped to trace file /u01/oracle/admin/orcl/udump/orcl_ora_516540.trc
Thu May 17 05:00:33 2012
Thread 1 advanced to log sequence 18 (LGWR switch)
Current log# 3 seq# 18 mem# 0: /test/orcl/orcl/redo03.log
Thu May 17 10:47:56 2012
Shutting down instance (abort)
License high water mark = 5
Instance terminated by USER, pid = 394304
Thu May 17 10:48:03 2012
Starting ORACLE instance (normal)
sskgpgetexecname failed to get name
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 en1 10.192.14.0 configured from OCR for use as a cluster interconnect
Interface type 1 en0 10.192.39.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
IMODE=BR
ILAT =18
LICENSE_MAX_USERS = 0
SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.4.0.
System parameters with non-default values:
processes                = 150
sessions                 = 170
__shared_pool_size       = 352321536
__large_pool_size        = 16777216
__java_pool_size         = 16777216
__streams_pool_size      = 0
sga_target               = 1610612736
control_files            = /test/orcl/orcl/control01.ctl, /test/orcl/orcl/control02.ctl,

/test/orcl/orcl/control03.ctl
db_block_size            = 8192
__db_cache_size          = 1207959552
compatible               = 10.2.0.3.0
log_archive_dest_1       = LOCATION=/test/orcl/arch
log_archive_format       = %t_%s_%r.arc
db_file_multiblock_read_count= 16
db_recovery_file_dest    = /test/orcl/flashback
db_recovery_file_dest_size= 2147483648
undo_management          = AUTO
undo_tablespace          = UNDOTBS1
remote_login_passwordfile= EXCLUSIVE
db_domain                =
dispatchers              = (PROTOCOL=TCP) (SERVICE=orclXDB)
job_queue_processes      = 10
background_dump_dest     = /u01/oracle/admin/orcl/bdump
user_dump_dest           = /u01/oracle/admin/orcl/udump
core_dump_dest           = /u01/oracle/admin/orcl/cdump
audit_file_dest          = /u01/oracle/admin/orcl/adump
db_name                  = orcl
open_cursors             = 300
pga_aggregate_target     = 1646264320
Thu May 17 10:48:04 2012
Oracle instance running with ODM: Veritas 5.0 ODM Library, Version 1.1
cluster interconnect IPC version:
        VERITAS IPC '5.0.3.400' 04:49:57 Apr 16 2010
IPC Vendor 86 proto 76
Version 1.0
PMON started with pid=2, OS id=632182
DIAG started with pid=3, OS id=705994
PSP0 started with pid=4, OS id=570926
LMON started with pid=5, OS id=472866
LMD0 started with pid=6, OS id=640160
MMAN started with pid=7, OS id=884854
DBW0 started with pid=8, OS id=741972
DBW1 started with pid=9, OS id=373816
DBW2 started with pid=10, OS id=525912
DBW3 started with pid=11, OS id=557894
LGWR started with pid=12, OS id=395244
CKPT started with pid=13, OS id=366402
SMON started with pid=14, OS id=697382
RECO started with pid=15, OS id=668764
CJQ0 started with pid=16, OS id=415276
MMON started with pid=17, OS id=480782
Thu May 17 10:48:04 2012
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
MMNL started with pid=18, OS id=456092
Thu May 17 10:48:04 2012
starting up 1 shared server(s) ...
Thu May 17 10:48:04 2012
lmon registered with NM - instance id 1 (internal mem no 0)
Thu May 17 10:48:05 2012
Reconfiguration started (old inc 0, new inc 2)
List of nodes:
0
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Resources and enqueues cleaned out
Resources remastered 0
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Reconfiguration complete
Thu May 17 10:48:05 2012
ALTER DATABASE   MOUNT
Thu May 17 10:48:09 2012
Setting recovery target incarnation to 2
Thu May 17 10:48:09 2012
Successful mount of redo thread 1, with mount id 1311324261
Thu May 17 10:48:09 2012
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu May 17 10:48:09 2012
ALTER DATABASE OPEN
Thu May 17 10:48:09 2012
Errors in file /u01/oracle/admin/orcl/bdump/orcl_dbw0_741972.trc:
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: '/test/orcl/orcl/users01.dbf'
ORA-17503: ksfdopn:4 Failed to open file /test/orcl/orcl/users01.dbf
ORA-17500: ODM err:File does not exist
ORA-1157 signalled during: ALTER DATABASE OPEN...

此时我们就只能通过备份恢复数据库了。

SQL> ! rman target /

Recovery Manager: Release 10.2.0.4.0 - Production on Thu May 17 10:49:10 2012

connected to target database: ORCL (DBID=1301704889, not open)

RMAN> restore datafile 4;

Starting restore at 2012-05-17 10:49:14
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=149 devtype=DISK

channel ORA_DISK_1: restoring datafile 00004
input datafile copy recid=1 stamp=782931126 filename=/tmp/users01.dbf
destination for restore of datafile 00004: /test/orcl/orcl/users01.dbf
channel ORA_DISK_1: copied datafile copy of datafile 00004
output filename=/test/orcl/orcl/users01.dbf recid=5 stamp=783514156
Finished restore at 2012-05-17 10:49:17

RMAN> recover datafile 4;

Starting recover at 2012-05-17 10:49:20
using channel ORA_DISK_1

starting media recovery

archive log thread 1 sequence 7 is already on disk as file /test/orcl/arch/1_7_773924155.arc
archive log thread 1 sequence 8 is already on disk as file /test/orcl/arch/1_8_773924155.arc
archive log thread 1 sequence 9 is already on disk as file /test/orcl/arch/1_9_773924155.arc
archive log thread 1 sequence 10 is already on disk as file /test/orcl/arch/1_10_773924155.arc
archive log thread 1 sequence 11 is already on disk as file /test/orcl/arch/1_11_773924155.arc
archive log thread 1 sequence 12 is already on disk as file /test/orcl/arch/1_12_773924155.arc
archive log thread 1 sequence 13 is already on disk as file /test/orcl/arch/1_13_773924155.arc
archive log thread 1 sequence 14 is already on disk as file /test/orcl/arch/1_14_773924155.arc
archive log thread 1 sequence 15 is already on disk as file /test/orcl/arch/1_15_773924155.arc
archive log thread 1 sequence 16 is already on disk as file /test/orcl/arch/1_16_773924155.arc
archive log thread 1 sequence 17 is already on disk as file /test/orcl/arch/1_17_773924155.arc
archive log filename=/test/orcl/arch/1_7_773924155.arc thread=1 sequence=7
archive log filename=/test/orcl/arch/1_8_773924155.arc thread=1 sequence=8
archive log filename=/test/orcl/arch/1_9_773924155.arc thread=1 sequence=9
archive log filename=/test/orcl/arch/1_10_773924155.arc thread=1 sequence=10
archive log filename=/test/orcl/arch/1_11_773924155.arc thread=1 sequence=11
archive log filename=/test/orcl/arch/1_12_773924155.arc thread=1 sequence=12
archive log filename=/test/orcl/arch/1_13_773924155.arc thread=1 sequence=13
archive log filename=/test/orcl/arch/1_14_773924155.arc thread=1 sequence=14
archive log filename=/test/orcl/arch/1_15_773924155.arc thread=1 sequence=15
media recovery complete, elapsed time: 00:00:14
Finished recover at 2012-05-17 10:49:37

RMAN> exit

Recovery Manager complete.

SQL> alter database open;

Database altered.

SQL>

阅读(2454) | 评论(0) | 转发(0) |

上一篇：DSI401e:Advanced Support Skills

下一篇：(内附总结的讨论结果）大DML最佳实践讨论，参与即有机会获赠oracle rac日记一本！

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6