Instance Recovery - Partial Availability State - 云中的二舅 - ChinaUnix Blog

Instance Recovery: the Partial Availability State

Category: Oracle

2013-01-07 09:33:33

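SMON's duties also include Instance Recovery in a RAC environment. Note that although the term reads literally as "instance recovery", it is not what people usually mean by that phrase in everyday speech, where it mostly refers to Crash Recovery. The two are different. Recovery after a single instance fails, or after all nodes in a RAC cluster fail, is called Crash Recovery. When one node in a RAC cluster fails and a surviving instance applies the redo from the failed node's thread, that is called Instance Recovery. For more on Crash Recovery, see the article <还原真实的cache recovery>.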
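The distinction above can be written as a small decision rule. The Python sketch below is illustrative only: the function name `classify_recovery` and its parameters are hypothetical and not part of any Oracle API. It also assumes that any partial failure with at least one surviving instance counts as Instance Recovery, which generalizes the single-node case described in the text.

```python
def classify_recovery(total_instances: int, failed_instances: int) -> str:
    """Classify the recovery type using the definitions in the text.

    Hypothetical helper for illustration; Oracle exposes no such function.
    """
    if total_instances < 1:
        raise ValueError("total_instances must be at least 1")
    if not 0 <= failed_instances <= total_instances:
        raise ValueError("failed_instances must be between 0 and total_instances")
    if failed_instances == 0:
        return "none"
    if failed_instances == total_instances:
        # Single instance, or every node of the RAC cluster failed.
        return "crash recovery"
    # Assumption: at least one surviving instance remains, so it can
    # apply the redo of the failed thread(s).
    return "instance recovery"
```

For example, `classify_recovery(1, 1)` and `classify_recovery(2, 2)` both fall under Crash Recovery, while `classify_recovery(2, 1)` is Instance Recovery.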

Observed behavior

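Instance Recovery consists of two recovery stages, cache recovery and ges/gcs remaster, and the two stages run concurrently. Cache recovery is driven by the SMON process on the surviving node, which distributes redo to slave processes. The ges/gcs remaster is carried out by LMON, a RAC-specific process.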

The overall Reconfiguration process is shown in the figure below:

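Note that when the crash is detected (Crash Detected), the database enters the Partial Availability state. From Freeze Lockdb onward it is in None Availability, and it returns to Partial Availability once IR applies redo, that is, during roll forward. Rollback is performed after roll forward completes, but by that point the database is already in the Full Availability state, as shown in the figure below: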

The graphic illustrates the degree of database availability during each step of Oracle instance recovery:

A. Real Application Clusters is running on multiple nodes.

B. Node failure is detected.

C. The enqueue part of the GRD is reconfigured; resource management is redistributed to the surviving nodes. This operation occurs relatively quickly.

D. The cache part of the GRD is reconfigured and SMON reads the redo log of the failed instance to identify the database blocks that it needs to recover.

E. SMON issues the GRD requests to obtain all the database blocks it needs for recovery. After the requests are complete, all other blocks are accessible.

F. The Oracle server performs roll forward recovery. Redo logs of the failed threads are applied to the database, and blocks are available right after their recovery is completed.

G. The Oracle server performs rollback recovery. Undo blocks are applied to the database for all uncommitted transactions.

H. Instance recovery is complete and all data is accessible.

Note: The dashed line represents the blocks identified in step 2 in the previous slide. Also, the dotted steps represent the ones identified in the previous slide.
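The availability transitions described before the step list can be laid out as an ordered timeline. The Python sketch below is a minimal model for illustration only: the `Availability` enum, the `TIMELINE` list, and the event labels are hypothetical names chosen here, and the "normal operation" entry is an assumption corresponding to step A. It does not attempt to map availability onto each of the steps A through H, since the original graphic is not available.

```python
from enum import Enum


class Availability(Enum):
    FULL = "full"
    PARTIAL = "partial"
    NONE = "none"


# Ordered (event, availability after the event) pairs, following the
# paragraph that introduces the figure. Event labels are illustrative.
TIMELINE = [
    ("normal operation", Availability.FULL),        # assumption: step A
    ("crash detected", Availability.PARTIAL),
    ("freeze lockdb", Availability.NONE),
    ("IR applies redo (roll forward)", Availability.PARTIAL),
    ("roll forward complete, rollback starts", Availability.FULL),
]


def availability_after(event: str) -> Availability:
    """Return the availability level in effect once `event` has occurred."""
    for name, level in TIMELINE:
        if name == event:
            return level
    raise KeyError(event)
```

Reading the list top to bottom gives the sequence full, partial, none, partial, full, which matches the point of the text: the database is fully available again before rollback has finished.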

Let's observe the Instance Recovery process in practice:
INST 1:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------

SQL> alter system set event='10426 trace name context forever,level 12' scope=spfile; -- 10426 event Reconfiguration trace event
