Category: Oracle

2012-11-02 14:25:47

Oracle startup proceeds in three stages: nomount, mount, and open.

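1. nomount stage
SQL> startup nomount
This command uses $ORACLE_SID to locate the parameter file, then allocates memory according to the parameters set in it (sga_target, sga_max_size, pga_aggregate_target, shared_pool_size, db_cache_size, and so on) and starts the background processes, such as DBWn, LGWR, CKPT, PMON, and SMON.

Reaching nomount therefore requires only the parameter file (pfile or spfile), and the stage does nothing more than allocate memory and start the background processes. The minimum requirement is a pfile/spfile that contains a single parameter, db_name.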

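The parameter file is searched for in this order:
$ORACLE_HOME/dbs/spfile<SID>.ora; if that does not exist, $ORACLE_HOME/dbs/spfile.ora; if that does not exist either, $ORACLE_HOME/dbs/init<SID>.ora.
If none of them exists, startup fails with a file-not-found error (LRM-00109 names init<SID>.ora, the last candidate tried).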
[oracle@redhat4 dbs]$ ls
hc_jiagulun.dat  initdw.ora               init.ora.backup  lkJULIA        spfilejiagulun.ora.backup
hc_julia.dat     initjiagulun.ora.backup  lkJIAGULUN       orapwjiagulun  spfilejiagulun.ora.backup2
[oracle@redhat4 dbs]$ sqlplus /nolog

SQL*Plus: Release 10.2.0.1.0 - Production on Fri Nov 2 14:53:06 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

idle> conn / as sysdba;
Connected to an idle instance.
idle> startup nomount;
ORA-01078: failure in processing system parameters
LRM-00109: could not open parameter file '/u01/app/oracle/product/10.2.0/db_1/dbs/initjiagulun.ora'
idle> !echo "db_name=jiagulun" > /u01/app/oracle/product/10.2.0/db_1/dbs/initjiagulun.ora

idle> startup nomount;
ORACLE instance started.

Total System Global Area  113246208 bytes
Fixed Size                  1218004 bytes
Variable Size              58722860 bytes
Database Buffers           50331648 bytes
Redo Buffers                2973696 bytes
idle> ! cat initjiagulun.ora
db_name=jiagulun

idle> ! ps -ef | grep oracle
oracle   18597 18596  0 14:56 pts/6    00:00:00 sqlplus
oracle   18600     1  0 14:56 ?        00:00:00 ora_pmon_jiagulun
oracle   18602     1  0 14:56 ?        00:00:00 ora_psp0_jiagulun
oracle   18604     1  0 14:56 ?        00:00:00 ora_mman_jiagulun
oracle   18606     1  0 14:56 ?        00:00:00 ora_dbw0_jiagulun
oracle   18608     1  0 14:56 ?        00:00:00 ora_lgwr_jiagulun
oracle   18610     1  0 14:56 ?        00:00:00 ora_ckpt_jiagulun
oracle   18612     1  0 14:56 ?        00:00:00 ora_smon_jiagulun
oracle   18614     1  0 14:56 ?        00:00:00 ora_reco_jiagulun
oracle   18616     1  0 14:56 ?        00:00:00 ora_mmon_jiagulun
oracle   18618     1  0 14:56 ?        00:00:00 ora_mmnl_jiagulun
oracle   18619 18597  0 14:56 ?        00:00:00 oraclejiagulun (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

idle> !ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x12900534 589834     oracle    640        115343360  10
0x36010028 786444     oracle    640        115343360  11

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x9c2dffd4 229376     oracle    640        44
0x7df2e688 1015809    oracle    640        44

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

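As shown, db_name=jiagulun alone is enough to reach the nomount stage: the background processes are started and memory is allocated.
Because nomount needs so little, a failure to reach it usually points to an OS-level problem, such as incorrect OS settings or missing OS patches.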
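
The parameter file lookup described earlier in this section can be sketched in a few lines of Python. This is only an illustration of the search order, assuming the candidates are spfile<SID>.ora, spfile.ora, and init<SID>.ora under $ORACLE_HOME/dbs; the function name find_parameter_file is made up for this sketch and is not part of any Oracle tool.

```python
import os
from typing import Optional


def find_parameter_file(oracle_home: str, oracle_sid: str) -> Optional[str]:
    """Return the first existing candidate in the assumed lookup order, else None."""
    dbs = os.path.join(oracle_home, "dbs")
    # Assumed order: SID-specific spfile, generic spfile, SID-specific pfile.
    candidates = [
        f"spfile{oracle_sid}.ora",
        "spfile.ora",
        f"init{oracle_sid}.ora",
    ]
    for name in candidates:
        path = os.path.join(dbs, name)
        if os.path.isfile(path):
            return path
    # Nothing found: this corresponds to the ORA-01078 / LRM-00109 case.
    return None


if __name__ == "__main__":
    home = os.environ.get("ORACLE_HOME", "/u01/app/oracle/product/10.2.0/db_1")
    sid = os.environ.get("ORACLE_SID", "jiagulun")
    print(find_parameter_file(home, sid))
```

Run against the dbs directory listed above, where only backup copies remain, the function would return None, which matches the error seen in the first startup nomount attempt.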

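2. mount stage
The parameter file records the locations of the control files. Moving from nomount to mount means using that information to find the control files. Because the control files record the locations of all data files, the mount stage also checks that those data files exist.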

[oracle@redhat4 dbs]$ cat initjiagulun.ora
jiagulun.__db_cache_size=130023424
jiagulun.__java_pool_size=4194304
jiagulun.__large_pool_size=4194304
jiagulun.__shared_pool_size=75497472
jiagulun.__streams_pool_size=0
*.audit_file_dest='/u01/app/oracle/admin/jiagulun/adump'
*.background_dump_dest='/u01/app/oracle/admin/jiagulun/bdump'
*.compatible='10.2.0.1.0'
*.control_files='/u01/app/oracle/oradata/jiagulun/control01.ctl','/u01/app/oracle/oradata/jiagulun/control02.ctl','/u01/app/oracle/oradata/jiagulun/control03.ctl','/u01/app/oracle/oradata/jiagulun/control04.ctl'
*.core_dump_dest='/u01/app/oracle/admin/jiagulun/cdump'
*.db_block_size=8192
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_name='jiagulun'
*.db_recovery_file_dest='/u01/app/oracle/flash_recovery_area'
*.db_recovery_file_dest_size=2147483648
*.db_writer_processes=1
*.dispatchers='(PROTOCOL=TCP) (SERVICE=jiagulunXDB)'
*.fast_start_mttr_target=1200
*.job_queue_processes=20
*.nls_language='SIMPLIFIED CHINESE'
*.nls_territory='CHINA'
*.open_cursors=300
*.pga_aggregate_target=71303168
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.service_names='jiagulun,jgl'
*.sga_target=216006656
*.sql_trace=TRUE
*.timed_statistics=true
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/u01/app/oracle/admin/jiagulun/udump'

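The parameter file contains control_files, audit_file_dest, background_dump_dest, core_dump_dest, db_recovery_file_dest, user_dump_dest, and similar settings. Errors raised during the nomount stage can also be written to these dump destinations, which makes diagnosis easier.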
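
To make the role of control_files concrete, here is a small Python sketch that reads a text pfile like the one above and pulls out the control file paths. It is a simplified illustration, not Oracle's parser: parse_pfile and control_files are hypothetical helper names, and the sketch assumes one name=value pair per line and no commas inside quoted paths.

```python
def parse_pfile(text: str) -> dict:
    """Parse 'name=value' lines of a text pfile into a dict keyed by parameter name."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Drop the '*.' (all instances) or '<sid>.' prefix in front of the name.
        name = key.strip().split(".", 1)[-1].lower()
        params[name] = value.strip()
    return params


def control_files(params: dict) -> list:
    """Split the control_files value into a list of unquoted paths."""
    raw = params.get("control_files", "")
    return [p.strip().strip("'\"") for p in raw.split(",") if p.strip()]


sample = """\
*.control_files='/u01/app/oracle/oradata/jiagulun/control01.ctl','/u01/app/oracle/oradata/jiagulun/control02.ctl'
*.db_name='jiagulun'
*.user_dump_dest='/u01/app/oracle/admin/jiagulun/udump'
"""

for path in control_files(parse_pfile(sample)):
    print(path)
```

The list this returns is what the instance needs in order to move from nomount to mount.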

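So moving from nomount to mount involves the following steps:
1> Locate the control files using control_files from the parameter file.
2> Using the information in the control file, verify that the data files exist. If one is missing, this is recorded in the alert log and the file shows up in the dynamic view v$recover_file. Opening the database then fails with a file-not-found error. After the file is restored from a backup, the next open asks for media recovery. Media recovery and instance recovery therefore both start before the database is open: the roll-forward phase of instance recovery happens before open, and only the rollback phase happens after open.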
idle> select status from v$instance;
STATUS
------------
MOUNTED
idle> select name from v$datafile;
NAME
------------------------------------------------
/u01/app/oracle/oradata/jiagulun/system01.dbf
/u01/app/oracle/oradata/jiagulun/undotbs01.dbf
/u01/app/oracle/oradata/jiagulun/sysaux01.dbf
/u01/app/oracle/oradata/jiagulun/users01.dbf
/u01/app/oracle/oradata/jiagulun/example01.dbf

idle> !mv /u01/app/oracle/oradata/jiagulun/users01.dbf /u01/app/oracle/oradata/jiagulun/users01.dbf.backup

idle> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.

idle> startup mount;
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.

Total System Global Area  218103808 bytes
Fixed Size                  1218604 bytes
Variable Size              88082388 bytes
Database Buffers          125829120 bytes
Redo Buffers                2973696 bytes
Database mounted.
idle> alter database open;
alter database open
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: '/u01/app/oracle/oradata/jiagulun/users01.dbf'

idle> select status from v$instance;
STATUS
------------
MOUNTED
idle> select * from v$recover_file;
     FILE# ONLINE  ONLINE_ ERROR               CHANGE# TIME
---------- ------- ------- ---------------- ---------- ------------
         4 ONLINE  ONLINE  FILE NOT FOUND            0

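3> The heartbeat in the control file
At mount time the database records the mount id in the control file and starts the heartbeat, updating the heartbeat value in the control file every 3 seconds. The alert log also records the mount id: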
Fri Nov  2 15:28:30 2012
ALTER DATABASE   MOUNT
MMNL started with pid=12, OS id=18943
Fri Nov  2 15:28:34 2012
Setting recovery target incarnation to 2
Fri Nov  2 15:28:34 2012
Successful mount of redo thread 1, with mount id 2714457118
Fri Nov  2 15:28:34 2012
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT

Now let's look at the heartbeat:
[oracle@redhat4 dbs]$ sqlplus / as sysdba;
SQL*Plus: Release 10.2.0.1.0 - Production on Fri Nov 2 16:11:43 2012
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

idle> alter session set events 'immediate trace name CONTROLF level 8';
Session altered.
idle> select status from v$instance;
STATUS
------------
MOUNTED

Then, from a second session:
[oracle@redhat4 udump]$ sqlplus / as sysdba;
SQL*Plus: Release 10.2.0.1.0 - Production on Fri Nov 2 16:11:57 2012
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

idle> alter session set events 'immediate trace name CONTROLF level 8';
Session altered.

[oracle@redhat4 udump]$ diff jiagulun_ora_19030.trc jiagulun_ora_19035.trc > diff.txt
[oracle@redhat4 udump]$ grep heartbeat diff.txt
< heartbeat: 798304597 mount id: 2714439388
> heartbeat: 798304601 mount id: 2714439388
The mount id is unchanged, while the heartbeat has advanced. The heartbeat value can also be queried from the database:
idle> select cphbt from x$kcccp;
     CPHBT
----------
 798304725
         0
         0
         0
         0
         0
         0
         0

8 rows selected.
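
The comparison done with diff and grep above can also be scripted. The Python sketch below assumes the dump contains a line of the form "heartbeat: N mount id: M", as in the grep output shown earlier; read_heartbeat is a hypothetical helper name, and real dumps may differ between versions.

```python
import re

# Pattern for the line seen in the CONTROLF dump excerpt above.
HEARTBEAT_RE = re.compile(r"heartbeat:\s*(\d+)\s+mount id:\s*(\d+)")


def read_heartbeat(dump_text: str):
    """Return (heartbeat, mount_id) from the first matching line, or None."""
    match = HEARTBEAT_RE.search(dump_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The two lines taken from the diff output above.
first = read_heartbeat("heartbeat: 798304597 mount id: 2714439388")
second = read_heartbeat("heartbeat: 798304601 mount id: 2714439388")

same_mount = first[1] == second[1]   # mount id stays fixed while mounted
ticks = second[0] - first[0]         # heartbeat keeps advancing
print(same_mount, ticks)             # prints: True 4
```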

4> Other startup-related files
[oracle@redhat4 dbs]$ pwd
/u01/app/oracle/product/10.2.0/db_1/dbs
[oracle@redhat4 dbs]$ ls
hc_jiagulun.dat  initjiagulun.ora  lkJIAGULUN     spfilejiagulun.ora
initdw.ora       init.ora.backup   orapwjiagulun  spfilejiagulun.ora.backup
[oracle@redhat4 dbs]$ strings hc_jiagulun.dat
DO NOT DELETE OR OVERWRITE THIS FILE!!!
jiagulun
[oracle@redhat4 dbs]$ strings orapwjiagulun
]\[Z
ORACLE Remote Password file
INTERNAL
6702E8C80DAEF7F8
E07F85E08B8337CA
[oracle@redhat4 dbs]$
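Almost every file in $ORACLE_HOME/dbs is related to database startup.
a) orapw<SID> is the password file, used to authenticate remote logins by SYSDBA/SYSOPER users. It is tied to the remote_login_passwordfile parameter in the parameter file; for details see: http://blog.chinaunix.net/uid-25909722-id-3395279.html
b) spfile / pfile are the startup parameter files.
c) The lk file (lkJIAGULUN in the listing above) is the lock file for database startup; lk stands for lock, and the OS uses it to lock the database. The lock is acquired at startup and released at shutdown. After an abnormal shutdown the lock may therefore not have been released, and the next startup fails with an error. Rebooting the OS clears it.
d) hc_<SID>.dat, where hc stands for health check: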

1. What is the $ORACLE_HOME/dbs/hc_<SID>.dat file?

   The $ORACLE_HOME/dbs/hc_<SID>.dat file is created for instance health check monitoring. It contains information used to monitor the instance health and to determine why it went down if the instance isn't up. The file will be recreated at every instance startup.

2. What happens if the $ORACLE_HOME/dbs/hc_<SID>.dat file is deleted?

    If you replace the file with an empty "dummy" copy, you will get an ORA-7445 error. Therefore, if the file gets deleted on the fly while the database is up, or if the file is replaced with a 0 byte file, simply delete the file and restart the database. The file will be correctly recreated at the next database startup.

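3. open stage
The open stage performs several very important checks:
1> First, it checks whether the checkpoint count (Checkpoint CNT) in each data file header matches the checkpoint count in the control file. Every checkpoint increments the Checkpoint CNT in the control file, the data file headers, and so on.
We can dump the control file with alter session set events 'immediate trace name controlf level 8' and inspect the Checkpoint CNT.
2> Next, it checks whether the SCN in each data file header matches the stop SCN recorded for that file in the control file, which determines whether the file needs recovery.
Once all data files have passed these checks, the database is opened, the data files are locked, and the stop SCN of every data file in the control file is set to infinity.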
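
The two open-time checks can be pictured as a simple comparison. The Python sketch below is a conceptual model only: FileHeader, ControlRecord, and open_check are invented names, the real structures hold far more fields, and representing an "infinite" stop SCN as None (and treating it as "needs recovery") is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileHeader:
    """Hypothetical view of what a data file header records."""
    checkpoint_cnt: int
    checkpoint_scn: int


@dataclass
class ControlRecord:
    """Hypothetical view of the control file's entry for one data file."""
    checkpoint_cnt: int
    stop_scn: Optional[int]  # None models "infinity": the file was left open


def open_check(header: FileHeader, record: ControlRecord) -> str:
    # Check 1: the checkpoint counts must agree.
    if header.checkpoint_cnt != record.checkpoint_cnt:
        return "checkpoint count mismatch"
    # Check 2: the header SCN must equal the stop SCN in the control file.
    if record.stop_scn is None or header.checkpoint_scn != record.stop_scn:
        return "needs recovery"
    return "ok"


print(open_check(FileHeader(10, 500), ControlRecord(10, 500)))
print(open_check(FileHeader(10, 500), ControlRecord(10, None)))
```

In this model a clean shutdown leaves a finite stop SCN equal to the header SCN, so the first call passes both checks, while the second call models a file whose stop SCN was never reset from infinity.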

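4. Shutting down the database
Shutdown goes through three stages: close, dismount, and shutdown.
There are four shutdown modes:
1> shutdown normal: waits for all users to disconnect, then shuts the database down;
2> shutdown transactional: waits for all transactions to finish, then shuts the database down;
3> shutdown immediate: immediately rolls back all active transactions, writes all buffered data back to disk, then shuts the database down;
4> shutdown abort: shuts the database down at once, equivalent to a power failure, with no cleanup at all;
The first three modes leave the database in a consistent state, so the next startup needs no instance recovery.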
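
The four modes can be summarized as a small lookup table. The Python sketch below only restates the list above as data; ShutdownMode, MODES, and needs_instance_recovery are names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShutdownMode:
    name: str
    waits_for: str            # what the mode waits for before closing
    rolls_back_active: bool   # actively rolls back in-flight transactions
    clean: bool               # leaves the database in a consistent state


MODES = {
    "normal": ShutdownMode("normal", "all users to disconnect", False, True),
    "transactional": ShutdownMode("transactional", "all transactions to finish", False, True),
    "immediate": ShutdownMode("immediate", "nothing", True, True),
    "abort": ShutdownMode("abort", "nothing", False, False),
}


def needs_instance_recovery(mode: str) -> bool:
    """Only a shutdown that is not clean requires instance recovery at next startup."""
    return not MODES[mode.lower()].clean


for name in MODES:
    print(name, needs_instance_recovery(name))
```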

(These are reading notes on the book 《深入解析Oracle》.)