Category: Oracle

2014-02-15 01:23:46

Environment:
OS: Red Hat Linux AS 5
DB: 10.2.0.5

  After the RAC deployment was finished, I tried to export the OCR, but the export failed with the following error.

[root@node1 ~]# /u01/app/oracle/product/10.2.0/crs_1/bin/ocrconfig -export /u01/app/oracle/ocr_export140210_8.bak
PROT-4: Failed to retrieve data from the cluster registry

node1-> ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :    1043916
         Used space (kbytes)      :       5408
         Available space (kbytes) :    1038508
         ID                       : 1855713603
         Device/File Name         : /dev/raw/raw1
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/raw/raw3
                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded

node1-> cluvfy comp ocr -n all

Verifying OCR integrity

Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.

Uniqueness check for OCR device passed.

Checking the version of OCR...
OCR of correct Version "2" exists.

Checking data integrity of OCR...
Data integrity check for OCR passed.

OCR integrity check passed.

Verification of OCR integrity was successful.
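
A check like the ocrcheck run above can be condensed into a single status line for scripting. The following is only a sketch: the function name `ocr_summary` is made up for illustration, and it assumes the output format shown in this post. As this post shows, a passing integrity check does not guarantee that `ocrconfig -export` will succeed.

```shell
#!/bin/bash
# Summarize `ocrcheck` output read from stdin.
# Prints "OK <n>" when the cluster registry check succeeded and every listed
# device passed its integrity check, otherwise "FAIL <n>".
# <n> is the number of devices that passed.
# Assumption: the output format matches the 10.2 sample shown in this post.
ocr_summary() {
    awk '
        /Device\/File Name/                          { devices++ }
        /Device\/File integrity check succeeded/     { passed++ }
        /Cluster registry integrity check succeeded/ { cluster_ok = 1 }
        END {
            status = (cluster_ok && devices > 0 && passed == devices) ? "OK" : "FAIL"
            printf "%s %d\n", status, passed
        }
    '
}

# Hypothetical usage on a cluster node:
#   ocrcheck | ocr_summary
```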

The OCR checks show no problems, and the crsd log contained nothing useful, so I planned to rebuild the OCR. The rough steps are as follows:


1. Stop CRS on both nodes
[root@node1 ~]# /u01/app/oracle/product/10.2.0/crs_1/bin/crsctl stop crs
Stopping resources. This could take several minutes.
Error while stopping resources. Possible cause: CRSD is down.

[root@node2 ~]# /u01/app/oracle/product/10.2.0/crs_1/bin/crsctl stop crs
Stopping resources. This could take several minutes.
Successfully stopped CRS resources.
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

2. Run the following script on each node (as root)
[root@node1 10.2.0]# /u01/app/oracle/product/10.2.0/crs_1/install/rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Feb 13 04:41:13.568 | INF | daemon shutting down
Stopping resources. This could take several minutes.
Error while stopping resources. Possible cause: CRSD is down.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
Cleaning up Network socket directories
[root@node1 10.2.0]#

[root@node2 ~]# /u01/app/oracle/product/10.2.0/crs_1/install/rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources. This could take several minutes.
Error while stopping resources. Possible cause: CRSD is down.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
Cleaning up Network socket directories

3. Run rootdeinstall.sh on the primary node
The primary node here is the one from which the CRS installation was originally run; in my case that is node 1.
[root@node1 10.2.0]# /u01/app/oracle/product/10.2.0/crs_1/install/rootdeinstall.sh
Removing contents from OCR mirror device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 1.46619 seconds, 7.2 MB/s
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 2.48259 seconds, 4.2 MB/s
[root@node1 10.2.0]#

4. Run root.sh on the primary node, that is, the same node where step 3 was run.
[root@node1 crs_1]# /u01/app/oracle/product/10.2.0/crs_1/root.sh
WARNING: directory '/u01/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
No value set for the CRS parameter CRS_OCR_LOCATIONS. Using Values in paramfile.crs
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: node1 node1-priv node1
node 2: node2 node2-priv node2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw2
Format of 1 voting devices complete.
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.

Failure at final check of Oracle CRS stack.
10
CRS on this node fails to start. Ignore that for now and continue with the steps below.

5. Run root.sh on the other node
[root@node2 ~]# /u01/app/oracle/product/10.2.0/crs_1/root.sh
WARNING: directory '/u01/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
No value set for the CRS parameter CRS_OCR_LOCATIONS. Using Values in paramfile.crs
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: node1 node1-priv node1
node 2: node2 node2-priv node2
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        node1
        node2
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Invalid interface "255.255.255.0/eth0" entered in an input argument.

CRS on node 2 turned out to have a problem as well. The crsd error log reads:
2014-02-14 03:28:28.510: [ CSSCLNT][1176720]clsssInitNative: connect failed, rc 9
2014-02-14 03:28:28.510: [  CRSRTI][1176720]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-02-14 03:28:29.603: [ COMMCRS][40778640]clsc_connect: (0x88a7d00) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_node1_crs))
2014-02-14 03:28:29.603: [ CSSCLNT][1176720]clsssInitNative: connect failed, rc 9
2014-02-14 03:28:29.603: [  CRSRTI][1176720]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-02-14 03:28:30.707: [ COMMCRS][40778640]clsc_connect: (0x88a7d00) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_node1_crs))
2014-02-14 03:28:30.707: [ CSSCLNT][1176720]clsssInitNative: connect failed, rc 9
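Most reports of this problem online attribute it to inter-node communication, but I verified that communication between the two nodes was fine. Repeating the steps above gave the same result, so in the end I decided to remove CRS completely and reinstall the clusterware.
For the steps to remove CRS completely, see: http://blog.chinaunix.net/uid-77311-id-3298250.html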

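6. Reinstall the clusterware
The clusterware installation media is 10.2.0.1, while the previous clusterware had been upgraded to 10.2.0.5. So install the 10.2.0.1 clusterware first without running vipca, then upgrade to 10.2.0.5, and only then run vipca.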


7. Configure ONS
[root@node1 ~]# /u01/app/oracle/product/10.2.0/crs_1/bin/racgons add_config node1:6200 node2:6200
WARNING: node1:6200 already configured.
WARNING: node2:6200 already configured.

[root@node1 ~]# /u01/app/oracle/product/10.2.0/crs_1/bin/onsctl ping
Number of configuration nodes retrieved: 2
0: {node = node1, port = 6200}
Adding remote host node1:6200
1: {node = node2, port = 6200}
Adding remote host node2:6200
ons is not running ...


8. Configure the cluster network interfaces
Run this on node 1
node1-> $ORA_CRS_HOME/bin/oifcfg iflist
eth0  192.168.1.0  -- public interface
eth1  10.10.10.0   -- private interconnect interface

node1-> $ORA_CRS_HOME/bin/oifcfg setif -global eth0/192.168.1.0:public
node1-> $ORA_CRS_HOME/bin/oifcfg setif -global eth1/10.10.10.0:cluster_interconnect

node1-> $ORA_CRS_HOME/bin/oifcfg getif
eth0  192.168.1.0  global  public
eth1  10.10.10.0  global  cluster_interconnect
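
To confirm that both interface roles were registered, the `oifcfg getif` output can be checked mechanically. This is a minimal sketch that assumes the four-column output shown above; `check_getif` is a hypothetical helper name, not part of the Oracle tooling.

```shell
#!/bin/bash
# Read `oifcfg getif` output on stdin and confirm that at least one
# interface is registered as public and one as cluster_interconnect.
# Assumption: the interface type is the last whitespace-separated field.
check_getif() {
    awk '
        $NF == "public"               { pub = 1 }
        $NF == "cluster_interconnect" { priv = 1 }
        END {
            if (pub && priv) { print "ok"; exit 0 }
            if (!pub)  print "missing: public"
            if (!priv) print "missing: cluster_interconnect"
            exit 1
        }
    '
}

# Hypothetical usage on a cluster node:
#   $ORA_CRS_HOME/bin/oifcfg getif | check_getif
```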


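9. Configure the listeners with netca

On node 1 and node 2 respectively, move the previous listener file to a temporary directory
node1-> mv $ORACLE_HOME/network/admin/listener.ora /tmp/listener.ora.original_node1
node2-> mv $ORACLE_HOME/network/admin/listener.ora /tmp/listener.ora.original_node2
Then use netca on one of the nodes to add the listeners. Once they are added, the listener resources show up as registered in the OCR.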
node1-> crs_stat -t
Name           Type           Target    State     Host       
------------------------------------------------------------
ora....E1.lsnr application    ONLINE    ONLINE    node1      
ora.node1.gsd  application    ONLINE    ONLINE    node1      
ora.node1.ons  application    ONLINE    ONLINE    node1      
ora.node1.vip  application    ONLINE    ONLINE    node1      
ora....E2.lsnr application    ONLINE    ONLINE    node2      
ora.node2.gsd  application    ONLINE    ONLINE    node2      
ora.node2.ons  application    ONLINE    ONLINE    node2      
ora.node2.vip  application    ONLINE    ONLINE    node2  

10. Add the resources to the OCR.
Add the ASM instances (the names are case-sensitive). Run this on one node only.
node1-> $ORA_CRS_HOME/bin/srvctl add asm -i +ASM1 -n node1 -o /u01/app/oracle/product/10.2.0/db_1
node1-> $ORA_CRS_HOME/bin/srvctl add asm -i +ASM2 -n node2 -o /u01/app/oracle/product/10.2.0/db_1

Add the database
node1-> $ORA_CRS_HOME/bin/srvctl add database -d racdb -o /u01/app/oracle/product/10.2.0/db_1

Add the instances
node1-> $ORA_CRS_HOME/bin/srvctl add instance -d racdb -i racdb1 -n node1
node1-> $ORA_CRS_HOME/bin/srvctl add instance -d racdb -i racdb2 -n node2

Add the services the database had before
node1-> $ORA_CRS_HOME/bin/srvctl add service -d racdb -s s1 -r racdb1 -a racdb2 -P BASIC
node1-> $ORA_CRS_HOME/bin/srvctl add service -d racdb -s s2 -r racdb2 -a racdb1 -P BASIC
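
With more nodes, the registration commands above become repetitive. The sketch below only prints the ASM, database, and instance commands so they can be reviewed before running. It assumes the naming used in this post (instances `<db><n>`, ASM instances `+ASM<n>`, numbered in node order); `print_srvctl_adds` is a hypothetical helper, and services are left out because their preferred and available instances are site-specific.

```shell
#!/bin/bash
# Print (do not run) srvctl registration commands for a RAC database.
# Arguments: database name, Oracle home, then the node names in order.
# Assumption: instance n on the n-th node is named <db><n>, and its ASM
# instance is +ASM<n>, as in this post. Adjust if your naming differs.
print_srvctl_adds() {
    local db=$1 home=$2
    shift 2
    local i=1 node
    for node in "$@"; do
        echo "srvctl add asm -i +ASM$i -n $node -o $home"
        i=$((i + 1))
    done
    echo "srvctl add database -d $db -o $home"
    i=1
    for node in "$@"; do
        echo "srvctl add instance -d $db -i $db$i -n $node"
        i=$((i + 1))
    done
}

# Example: print_srvctl_adds racdb /u01/app/oracle/product/10.2.0/db_1 node1 node2
```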


After adding them, check the resource status
node1-> crs_stat -t
Name           Type           Target    State     Host       
------------------------------------------------------------
ora....SM1.asm application    OFFLINE   OFFLINE              
ora....E1.lsnr application    ONLINE    ONLINE    node1      
ora.node1.gsd  application    ONLINE    ONLINE    node1      
ora.node1.ons  application    ONLINE    ONLINE    node1      
ora.node1.vip  application    ONLINE    ONLINE    node1      
ora....SM2.asm application    OFFLINE   OFFLINE              
ora....E2.lsnr application    ONLINE    ONLINE    node2      
ora.node2.gsd  application    ONLINE    ONLINE    node2      
ora.node2.ons  application    ONLINE    ONLINE    node2      
ora.node2.vip  application    ONLINE    ONLINE    node2      
ora.racdb.db   application    OFFLINE   OFFLINE              
ora....b1.inst application    OFFLINE   OFFLINE              
ora....b2.inst application    OFFLINE   OFFLINE              
ora....b.s1.cs application    OFFLINE   OFFLINE              
ora....db1.srv application    OFFLINE   OFFLINE              
ora....b.s2.cs application    OFFLINE   OFFLINE              
ora....db2.srv application    OFFLINE   OFFLINE


node1-> srvctl start asm -n node1
node1-> srvctl start asm -n node2
node1-> srvctl start database -d racdb
node1-> srvctl start service -d racdb

Now check that the resources are running
node1-> crs_stat -t
Name           Type           Target    State     Host       
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    node1      
ora....E1.lsnr application    ONLINE    ONLINE    node1      
ora.node1.gsd  application    ONLINE    ONLINE    node1      
ora.node1.ons  application    ONLINE    ONLINE    node1      
ora.node1.vip  application    ONLINE    ONLINE    node1      
ora....SM2.asm application    ONLINE    ONLINE    node2      
ora....E2.lsnr application    ONLINE    ONLINE    node2      
ora.node2.gsd  application    ONLINE    ONLINE    node2      
ora.node2.ons  application    ONLINE    ONLINE    node2      
ora.node2.vip  application    ONLINE    ONLINE    node2      
ora.racdb.db   application    ONLINE    ONLINE    node1      
ora....b1.inst application    ONLINE    ONLINE    node1      
ora....b2.inst application    ONLINE    ONLINE    node2      
ora....b.s1.cs application    ONLINE    ONLINE    node1      
ora....db1.srv application    ONLINE    ONLINE    node1      
ora....b.s2.cs application    ONLINE    ONLINE    node2      
ora....db2.srv application    ONLINE    ONLINE    node2
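
Scanning a long `crs_stat -t` listing by eye is error-prone, so a small filter can list only the resources that are not ONLINE. This is a sketch that assumes the five-column layout shown above, with State as the fourth column; `offline_resources` is a hypothetical helper name.

```shell
#!/bin/bash
# Read `crs_stat -t` output on stdin and print the name of every resource
# whose State column (4th field) is not ONLINE.
# Returns non-zero if at least one such resource is found.
offline_resources() {
    awk '
        $1 == "Name" || /^-+$/ { next }    # skip the header and dashed rule
        NF >= 4 && $4 != "ONLINE" { print $1; bad = 1 }
        END { exit bad + 0 }
    '
}

# Hypothetical usage on a cluster node:
#   crs_stat -t | offline_resources
```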

node1-> cluvfy stage -post crsinst -n node1,node2

Performing post-checks for cluster services setup

Checking node reachability...
Node reachability check passed from node "node1".


Checking user equivalence...
User equivalence check passed for user "oracle".

Checking Cluster manager integrity...


Checking CSS daemon...
Daemon status check passed for "CSS daemon".

Cluster manager integrity check passed.

Checking cluster integrity...


Cluster integrity check passed


Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.

Uniqueness check for OCR device passed.

Checking the version of OCR...
OCR of correct Version "2" exists.

Checking data integrity of OCR...
Data integrity check for OCR passed.

OCR integrity check passed.

Checking CRS integrity...

Checking daemon liveness...
Liveness check passed for "CRS daemon".

Checking daemon liveness...
Liveness check passed for "CSS daemon".

Checking daemon liveness...
Liveness check passed for "EVM daemon".

Checking CRS health...
CRS health check passed.

CRS integrity check passed.

Checking node application existence...


Checking existence of VIP node application (required)
Check passed.

Checking existence of ONS node application (optional)
Check passed.

Checking existence of GSD node application (optional)
Check passed.


Post-check for cluster services setup was successful.

At this point the OCR rebuild is complete, and re-running the earlier export now works.

[root@node1 logs]# /u01/app/oracle/product/10.2.0/crs_1/bin/ocrconfig -export /u01/app/oracle/ocr_export140210_8.bak
[root@node1 logs]#

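Notes:
I had long mistakenly believed that the ASM instance's parameters are stored in the OCR and that rebuilding it would wipe them. In fact, in 10g the ASM instance's parameter file is stored at /u01/app/oracle/admin/+ASM/pfile/init.ora. Once the ASM instance resource is registered, starting the instance reads that file automatically (so do not delete that file when removing CRS completely).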
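
Since the ASM parameter file has to survive the CRS removal, it may be worth copying it aside first. The sketch below is one way to do that; `backup_pfile` and the backup directory are hypothetical, and the source path is the one from this post.

```shell
#!/bin/bash
# Copy a parameter file into a backup directory before removing CRS.
# Prints the path of the copy; fails if the source file is missing.
backup_pfile() {
    local src=$1 dest_dir=$2
    [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    local copy="$dest_dir/$(basename "$src").bak"
    cp -p "$src" "$copy" && echo "$copy"
}

# Hypothetical usage (the backup directory is an assumption):
#   backup_pfile /u01/app/oracle/admin/+ASM/pfile/init.ora /tmp/asm_pfile_backup
```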


-- The End --