
OS: RedHat Linux AS3     Software to install: Oracle9i 9.2.0.4
Server configuration: dual 2.0 GHz CPUs, 2 GB RAM

Basic parameters:
Hostnames and addresses:
orasrv1   IP:  rac1pub 192.168.0.220    rac1prv 10.0.0.1   rac1als
orasrv2   IP:  rac2pub 192.168.0.221    rac2prv 10.0.0.2   rac2als



For this article I used the following Oracle setup:
RAC node          Database Name   SID     $ORACLE_BASE   Oracle Datafile Directory
---------------   -------------   ------  ------------   -----------------------------
rac1als/rac1prv   orcl            orcl1   /opt/oracle/   /var/opt/oracle/oradata/orcl
rac2als/rac2prv   orcl            orcl2   /opt/oracle/   /var/opt/oracle/oradata/orcl


Installation procedure:
System requirements for installing RAC on Linux
Required patches:
p3006854_9204_LINUX.zip  - apply before running runInstaller.
p3119415_9204_LINUX.zip  - apply after installing Oracle.
p2617419_210_GENERIC.zip - patch required in order to apply patch 3119415.

1. Check the system kernel (this step can be skipped)

[root @orasrv1 root] # uname -a

Linux orasrv1 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux
binutils requirement
binutils must be 2.11.90.0.8-12 or later, for example:
[root @orasrv1 root] # rpm -qa | grep -i binutils
binutils-2.14.90.0.4-26
Shared disk requirement
A multi-node installation needs a shared disk subsystem; it can be raw devices, the OCFS file system, an NFS network file system, and so on.
Here we use raw devices on a disk array.
2. Pre-installation preparation
Tune the Linux kernel parameters by adding the following to /etc/sysctl.conf (adjust to match your server configuration):
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=0
net.ipv4.ip_local_port_range = 1024 65000  
kernel.sem = 500 64000 100 128
kernel.shmmax = 2147483648
kernel.shmmni = 4096  
kernel.shmall = 2097152  
fs.file-max = 65536  

The values above may vary for different environments; they can be loaded without a reboot as sketched below.
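As a quick, optional check (not part of the original steps), you can load the new settings without rebooting and spot-check a few values; sysctl -p simply re-reads /etc/sysctl.conf:

[root @orasrv1 root] # sysctl -p                  # reload /etc/sysctl.conf
[root @orasrv1 root] # sysctl kernel.shmmax       # confirm the shared memory maximum
[root @orasrv1 root] # cat /proc/sys/kernel/sem   # another way to read a value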

3. Load the hangcheck-timer (system status check) module. This module is built into AS2.1-E16 and later kernels and into the 3.0 kernel, so it does not need to be installed separately. It replaces the watchdog used by database release 9.2.0.4, so no watchdog configuration is needed; if the OS kernel is too old, upgrade it first.
You can check whether the module exists as follows (this check can be skipped):
[root @orasrv1 root] # find /lib/modules -name "hangcheck-timer.o"
/lib/modules/2.4.21-15.EL/kernel/drivers/char/hangcheck-timer.o
/lib/modules/2.4.21-15.ELsmp/kernel/drivers/char/hangcheck-timer.o
You can load the module and then check the log messages:
[root @orasrv1 root] # /sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
Add the following to /etc/rc.local (a verification sketch follows the block):
#!/bin/sh
touch /var/lock/subsys/local
/sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
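A quick, optional check (not in the original text) that the module actually loaded; the exact log wording can vary by kernel build:

[root @orasrv1 root] # lsmod | grep hangcheck
[root @orasrv1 root] # grep -i hangcheck /var/log/messages | tail -2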




4. Network node configuration

Server 1 (rac1pub)
Device        IP Address        Subnet        Purpose
eth0        192.168.0.220        255.255.255.0        Connects rac1pub to the public network
eth1        10.0.0.1        255.255.0.0        Connects rac1pub (interconnect) to rac2pub (rac2prv)
/etc/hosts
127.0.0.1         ciqa.server      ciqaora1
192.168.0.220    rac1pub.ciqa.server   rac1als
10.0.0.1          rac1prv
192.168.0.221    rac2pub.ciqa.server   rac2als
10.0.0.2          rac2prv


Server 2 (rac2pub)
Device        IP Address        Subnet        Purpose
eth0        192.168.0.221        255.255.255.0        Connects rac2pub to the public network
eth1        10.0.0.2        255.255.0.0        Connects rac2pub (interconnect) to rac1pub (rac1prv)
/etc/hosts
127.0.0.1         ciqa.server      ciqaora2
192.168.0.220    rac1pub.ciqa.server   rac1als
10.0.0.1         rac1prv
192.168.0.221    rac2pub.ciqa.server   rac2als
10.0.0.2          rac2prv


In the network settings for eth0 and eth1, check "Activate device when computer starts".
If you are simulating RAC on a single node, the /etc/hosts file can look like the following:
[oracle@orasrv1 oracle]$ vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 huiheng.localdomain    orasrv1(orasrv2)
10.0.0.1    rac1prv
10.0.0.2    rac2prv
192.168.0.220   rac1pub.huiheng.localdomain rac1als
192.168.0.221   rac2pub.huiheng.localdomain rac2als

After configuring, restart the network service:
[root @orasrv1 root] # service network restart

1. Check node one using the following procedure:
a. Ping node 2 using the private hostname.
b. Ping node 2 using the public hostname.
c. Ping node 2 using the private IP address.
d. Ping node 2 using the public IP address.

2. Check node two using the following procedure:
a. Ping node 1 using the private hostname.
b. Ping node 1 using the public hostname.
c. Ping node 1 using the private IP address.
d. Ping node 1 using the public IP address.

The checks above use both the public and the private node names. The public node name is the IP address configured on NIC 1 and is the channel external applications connect through; the private node name is the IP address configured on NIC 2 and is dedicated to communication between the nodes. A loop that runs these ping checks is sketched below.


5. Create the oracle user and groups
[root @orasrv1 root] # groupadd oinstall   (used as the UNIX Group Name "oinstall" when installing Oracle)
[root @orasrv1 root] # groupadd dba  
[root @orasrv1 root] # useradd -g oinstall -G dba oracle  
[root @orasrv1 root] # passwd oracle    (password used here: passracle)

Set the oracle user's file limits on all RAC nodes.
Edit /etc/security/limits.conf and add the following lines (a quick verification sketch follows the list):
          oracle    soft    nofile    65536  
          oracle    hard    nofile   65536  
          oracle    soft    nproc    16384  
          oracle    hard    nproc    16384  
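A quick way to confirm the limits take effect for new oracle login sessions (this assumes pam_limits is active for login sessions, which is the default on RHEL AS3):

[root @orasrv1 root] # su - oracle -c 'ulimit -n; ulimit -u'    # expect 65536 and 16384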




6. Prepare the directory structure
[root @orasrv1 root] # su - root
[root @orasrv1 root] # chmod -R 0777 /var/opt
[root @orasrv1 root] # chmod -R 0777 /opt
[root @orasrv1 root] # mkdir -p /var/opt/oracle/oradata/orcl
[root @orasrv1 root] # chmod -R 775 /var/opt/oracle
[root @orasrv1 root] # su - oracle
[oracle @orasrv1 oracle] $ export ORACLE_BASE=/opt/ora9
[oracle @orasrv1 oracle] $ export ORACLE_HOME=/opt/ora9/product/9.2
[oracle @orasrv1 oracle] $ mkdir -p /opt/ora9/product/9.2
[oracle @orasrv1 oracle] $ cd $ORACLE_BASE
[oracle @orasrv1 oracle] $ mkdir -p admin/orcl           # configuration files
[oracle @orasrv1 oracle] $ cd admin/orcl
[oracle @orasrv1 oracle] $ mkdir bdump cdump udump createdblog
[oracle @orasrv1 oracle] $ cd $ORACLE_BASE
[oracle @orasrv1 oracle] $ mkdir -p oradata/orcl         # data files
[oracle @orasrv1 oracle] $ su - root
[root @orasrv1 root] # chown oracle.dba /var/opt/oracle
[root @orasrv1 root] # chown -R oracle.dba /opt/ora9
Note: perform these steps on both nodes.

7. Set the node environment variables
The following settings are common to both nodes.
As the oracle user, open .bash_profile and add the content below
(command: [oracle @orasrv1 oracle] $ vi $HOME/.bash_profile ; press i to edit, :w to save, :q to quit):
#oracle 9i
export DISPLAY="127.0.0.1:0.0"
export ORACLE_BASE=/opt/ora9
export ORACLE_HOME=/opt/ora9/product/9.2
export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/Apache/Apache/bin:$PATH
export ORACLE_OWNER=oracle
export ORACLE_SID=orcl1        # orcl2 on the second node (instance SID)
export ORACLE_TERM=xterm       # xterm window mode; vt100 for terminal debugging
export LD_ASSUME_KERNEL=2.4.1
export THREADS_FLAG=native
export LD_LIBRARY_PATH=/opt/ora9/product/9.2/lib:$LD_LIBRARY_PATH
export PATH=/opt/ora9/product/9.2/bin:$PATH
export NLS_LANG=AMERICAN_AMERICA.zhs16gbk       # language AMERICAN (English), character set ZHS16GBK
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
PATH=$PATH:$HOME/bin:$ORACLE_HOME/bin

Save and exit, then run: source .bash_profile   (check with: set | more)
Then log out and log back in; the oracle environment is now in effect (a quick spot-check follows).
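An optional spot-check that the new environment is active in a fresh oracle session:

[oracle @orasrv1 oracle] $ env | grep -E 'ORACLE|NLS|LD_'
[oracle @orasrv1 oracle] $ echo $ORACLE_HOME
/opt/ora9/product/9.2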


As the oracle user, create the log directories:
# For Cluster Manager
[oracle@orasrv1 oracle] $ mkdir -p $ORACLE_HOME/oracm/log
# For SQL*Net Listener
[oracle@orasrv1 oracle] $ mkdir -p $ORACLE_HOME/network/log
[oracle@orasrv1 oracle] $ mkdir -p $ORACLE_HOME/network/trace

# For database instances
[oracle@orasrv1 oracle] $ mkdir -p $ORACLE_HOME/rdbms/log
[oracle@orasrv1 oracle] $ mkdir -p $ORACLE_HOME/rdbms/audit

# For Oracle Intelligent Agent
[oracle@orasrv1 oracle] $ mkdir -p $ORACLE_HOME/network/agent/log
[oracle@orasrv1 oracle] $ mkdir -p $ORACLE_HOME/network/agent/reco

8. Raw devices
First a series of partitions must be created. Note that a single device cannot hold more than 15 partitions, and Linux supports at most 255 raw devices in total. Raw devices are generally used for shared disk storage. They can be bound as follows; create the partitions on the disk array (SCSI disks by default under Linux):
[oracle@orasrv1 oracle] $ su - root
[root @orasrv1 root] # raw /dev/raw/raw1 /dev/sdb5 # Cluster Manager Quorum File
[root @orasrv1 root] # raw /dev/raw/raw2 /dev/sdb6 # Shared Configuration file for srvctl
[root @orasrv1 root] # raw /dev/raw/raw3 /dev/sdb7 # spfileorcl.ora
[root @orasrv1 root] # raw /dev/raw/raw4 /dev/sdb8 # control01.ctl
[root @orasrv1 root] # raw /dev/raw/raw5 /dev/sdb9 # control02.ctl
[root @orasrv1 root] # raw /dev/raw/raw6 /dev/sdb10 # indx01.dbf
[root @orasrv1 root] # raw /dev/raw/raw7 /dev/sdb11 # system01.dbf
[root @orasrv1 root] # raw /dev/raw/raw8 /dev/sdb12 # temp01.dbf
[root @orasrv1 root] # raw /dev/raw/raw9 /dev/sdb13 # tools01.dbf
[root @orasrv1 root] # raw /dev/raw/raw10 /dev/sdb14 # undotbs01.dbf
[root @orasrv1 root] # raw /dev/raw/raw11 /dev/sdb15 # undotbs02.dbf
[root @orasrv1 root] # raw /dev/raw/raw12 /dev/sdc5 # undotbs03.dbf
[root @orasrv1 root] # raw /dev/raw/raw13 /dev/sdc6 # users01.dbf
[root @orasrv1 root] # raw /dev/raw/raw14 /dev/sdc7 # redo01.log (Group# 1 Thread# 1)
[root @orasrv1 root] # raw /dev/raw/raw15 /dev/sdc8 # redo02.log (Group# 2 Thread# 1)
[root @orasrv1 root] # raw /dev/raw/raw16 /dev/sdc9 # redo03.log (Group# 3 Thread# 2)
[root @orasrv1 root] # raw /dev/raw/raw17 /dev/sdc10 # orcl_redo2_2.log (Group# 4 Thread# 2)
[root @orasrv1 root] # raw /dev/raw/raw18 /dev/sdc11 # orcl_redo3_1.log (Group# 5 Thread# 3)
[root @orasrv1 root] # raw /dev/raw/raw19 /dev/sdc12 # orcl_redo3_2.log (Group# 6 Thread# 3)
[root @orasrv1 root] # raw /dev/raw/raw20 /dev/sdc13 #examdb.dbf
[root @orasrv1 root] # raw /dev/raw/raw21 /dev/sdc14 #examtemp.dbf
. . . . . . and add these commands to the rc.local file as well.

To check the bindings, use the following command:
[root @orasrv1 root] # raw -qa
To have them bound automatically at boot, add the above commands to /etc/rc.local:
[root @orasrv1 root] # vi /etc/rc.local
raw /dev/raw/raw1 /dev/sdb5
raw /dev/raw/raw2 /dev/sdb6
……
Then create symbolic links as follows, so the raw devices can be used just like files in a file system. First set ownership and permissions on each raw device (a loop covering all of them is sketched after this block):
[root @orasrv1 root] # chmod 660 /dev/raw/raw1
[root @orasrv1 root] # chown oracle.dba /dev/raw/raw1
                  ……………………
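A shortcut (not in the original text) that sets ownership and permissions on all 21 raw devices bound above in one go:

[root @orasrv1 root] # for i in $(seq 1 21); do
>   chmod 660 /dev/raw/raw$i
>   chown oracle.dba /dev/raw/raw$i
> done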
[root @orasrv1 root] # su - oracle
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw1 /var/opt/oracle/oradata/orcl/CMQuorumFile
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw2 /var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw3 /var/opt/oracle/oradata/orcl/spfileorcl.ora
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw4 /var/opt/oracle/oradata/orcl/control01.ctl
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw5 /var/opt/oracle/oradata/orcl/control02.ctl
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw6 /var/opt/oracle/oradata/orcl/indx01.dbf
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw7 /var/opt/oracle/oradata/orcl/system01.dbf
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw8 /var/opt/oracle/oradata/orcl/temp01.dbf
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw9 /var/opt/oracle/oradata/orcl/tools01.dbf
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw10 /var/opt/oracle/oradata/orcl/undotbs01.dbf
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw11 /var/opt/oracle/oradata/orcl/undotbs02.dbf
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw12 /var/opt/oracle/oradata/orcl/undotbs03.dbf
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw13 /var/opt/oracle/oradata/orcl/users01.dbf

[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw14 /var/opt/oracle/oradata/orcl/redo01.log
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw15 /var/opt/oracle/oradata/orcl/redo02.log
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw16 /var/opt/oracle/oradata/orcl/redo03.log
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw17 /var/opt/oracle/oradata/orcl/orcl_redo2_2.log
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw18 /var/opt/oracle/oradata/orcl/orcl_redo3_1.log
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw19 /var/opt/oracle/oradata/orcl/orcl_redo3_2.log
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw20 /var/opt/oracle/oradata/orcl/examdb.dbf
[oracle @orasrv1 oracle] $ ln -s /dev/raw/raw21 /var/opt/oracle/oradata/orcl/examtemp.dbf

Check that the links were created correctly:
[oracle @orasrv1 oracle] $ ls -l /var/opt/oracle/oradata/orcl/CMQuorumFile

After you have finished creating the partitions, I recommend rebooting all RAC nodes to make sure every partition is recognized by the kernel on every node:
[oracle @orasrv1 oracle] $ su - root
[root @orasrv1 root] # reboot


# lvscan
$ ls -l ~oracle/oradata/orcl

9. Configure remote access (user equivalence)

When you run the Oracle Installer on a RAC node, it uses rsh to copy the Oracle software to the other RAC nodes. Therefore, the oracle account on the RAC node where runInstaller is launched must be trusted by all other RAC nodes. This means you should be able to run rsh, rcp, and rlogin from this RAC node against the other RAC nodes without a password. The rsh daemon validates users using the /etc/hosts.equiv file and the .rhosts file in the user's (oracle's) home directory. Unfortunately, SSH is not supported.
The following steps show how I set up a trusted environment for the "oracle" account on all RAC nodes.
First make sure the "rsh" RPMs are installed on all RAC nodes:
First make sure the "rsh" RPMs are installed on all RAC nodes:   

As the oracle user, check whether the rsh packages are installed:

[oracle @orasrv1 oracle] $ rpm -q rsh rsh-server

If rsh is not installed, run the following command:
[root @orasrv1 root] # su - root
[root @orasrv1 root] # rpm -ivh rsh-0.17-5.i386.rpm rsh-server-0.17-5.i386.rpm


To enable the "rsh" service, the "disable" attribute in the /etc/xinetd.d/rsh file must be set to "no" and xinetd must be refreshed. This can be done by running the following commands:
That is, set disable = no in /etc/xinetd.d/rsh (and rlogin), then reload xinetd:

[oracle @orasrv1 oracle] $ su - root
[root @orasrv1 root] # chkconfig rsh on
[root @orasrv1 root] # chkconfig rlogin on
[root @orasrv1 root] # service xinetd reload


To allow the "oracle" user account to be trusted among the RAC nodes, create the /etc/hosts.equiv file:
[root @orasrv1 root] # su - oracle
[oracle @orasrv1 oracle] $ su
[root @orasrv1 root] # touch /etc/hosts.equiv
[root @orasrv1 root] # chmod 600 /etc/hosts.equiv
[root @orasrv1 root] # chown root.root /etc/hosts.equiv


The commands above create an empty hosts.equiv file in /etc; use vi to add the following entries:
[root @orasrv1 root] # vi /etc/hosts.equiv
+rac1prv oracle
+rac2prv oracle
+rac1als oracle
+rac2als oracle

[root @orasrv1 root] # cat /etc/hosts.equiv    

Verify that the entries were added.
In the preceding example, the second field permits only the oracle user account to run rsh commands on the specified nodes. For security reasons, the /etc/hosts.equiv file should be owned by root and its permissions set to 600. In fact, some systems will only honor the content of this file if it is owned by root and its permissions are 600.

Now you should be able to run rsh against each RAC node without having to provide the password for the oracle account. Run the following as the oracle user (a loop over all nodes is sketched below the example output):

[oracle @orasrv1 oracle] $ rsh rac1prv ls -l /etc/hosts.equiv
-rw-------    1 root     root           49 Oct 19 13:18 /etc/hosts.equiv
[oracle @orasrv1 oracle] $ rsh rac2prv ls -l /etc/hosts.equiv
-rw-------    1 root     root           49 Oct 19 14:39 /etc/hosts.equiv
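A small loop (node names as defined in /etc/hosts above) to confirm passwordless rsh works against every node before continuing:

[oracle @orasrv1 oracle] $ for node in rac1prv rac2prv rac1als rac2als; do
>   rsh $node hostname
> done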

The following changes have to be done on ALL RAC nodes.

1. Create a CM quorum file
If you are simulating RAC on a single node with a local file system, you can emulate the quorum device with the following command:
[root @orasrv1 root] # su - oracle
[oracle @orasrv1 oracle] $ dd if=/dev/zero of=/opt/ora9/oradata/rac/RacQuorumDisk bs=1024 count=1024


Use dd to create the file and place it on the prepared shared disk device; a size of 1 MB is sufficient.

2. Install the OCM (Oracle Cluster Manager) software
If you are installing 9.2.0.4 for Linux, simply select the 9.2.0.4 OCM component in the installer.
If you are installing on AS 3.0, perform the following steps before installing.
First re-link gcc:
[oracle @orasrv1 oracle] $ su - root
[root @orasrv1 root] # mv /usr/bin/gcc /usr/bin/gcc323
[root @orasrv1 root] # ln -s /usr/bin/gcc296 /usr/bin/gcc
[root @orasrv1 root] # mv /usr/bin/g++ /usr/bin/g++323
# if g++ doesn't exist, then gcc-c++ was not installed
[root @orasrv1 root] # ln -s /usr/bin/g++296 /usr/bin/g++


Then apply patch 3006854; copy the patch to a temporary directory first.
[root @orasrv1 root] # cd $ORACLE_HOME/bin
[root @orasrv1 bin] # unzip p3006854_9204_LINUX.zip
[root @orasrv1 bin] # cd 3006854
[root @orasrv1 bin] # sh rhel3_pre_install.sh


If the local X server refuses the graphical connection, remember to allow it:
[root @orasrv1 root] # xhost +<local hostname or IP>
Before installing OCM, confirm that the two nodes can ping each other, check rsh, and use rcp to verify that files can be copied between the nodes, for example as sketched below.
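For example, a simple rcp round trip (the file name here is only for illustration):

[oracle @orasrv1 oracle] $ touch /tmp/rcp_test
[oracle @orasrv1 oracle] $ rcp /tmp/rcp_test rac2prv:/tmp/rcp_test
[oracle @orasrv1 oracle] $ rsh rac2prv ls -l /tmp/rcp_test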

3.Installing Oracle9i Cluster Manager 9.2.0.4.0

To install the Oracle Cluster Manager, insert the Oracle 9i R2 Disk 1 and launch /mnt/cdrom/runInstaller. These steps only need to be performed on one RAC node, the node you are installing from.
Alternatively, extract the Oracle9i R2 Disk 1 files into a temporary directory and run runInstaller from there. This step only needs to be performed on one RAC node.

[root @orasrv1 root] # su - oracle
[oracle @orasrv1 oracle] $ cd /tmp
[oracle @orasrv1 tmp] $ gunzip p3006854_9204_LINUX.zip
[oracle @orasrv1 tmp] $ cpio -idmv < /tmp/9204_lnx32_release.cpio
[oracle @orasrv1 oracle] $ unset LANG
[oracle @orasrv1 oracle] $ /tmp/Disk1/runInstaller


- Welcome Screen:          Click Next
- Inventory Location:       Click OK     
- Unix Group Name:        Use "oinstall".
- Root Script Window:      Open another window, login as root, and run
[root @orasrv1 root] # sh /tmp/orainstRoot.sh
                             on the node where you are running this installation (runInstaller).
                             After you run the script, click Continue.
- File Locations:          Check the defaults. I used the default values and clicked Next.
- Available Products:      Select "Oracle Cluster Manager 9.2.0.4.0"
- Public Node Information:
     Public Node 1:         rac1pub
     Public Node 2:         rac2pub
                             Click Next.
- Private Node Information:
     Private Node 1:        rac1prv
     Private Node 2:        rac2prv
                             Click Next.
- WatchDog Parameter:      Accept the default value and click Next. We won't use the Watchdog.
- Quorum Disk Information: /var/opt/oracle/oradata/orcl/CMQuorumFile
                              Click Next.
- Summary:                 Click Install
- When installation has completed, click Exit.


Applying Oracle9i Cluster Manager 9.2.0.4.0 Patch Set
(on one RAC node only)

To patch the Oracle Cluster Manager, launch the installer either from /mnt/cdrom/runInstaller or from $ORACLE_HOME/bin/runInstaller
[oracle @orasrv1 oracle] $ unset LANG
[oracle @orasrv1 oracle] $ /tmp/Disk1/runInstaller


- Welcome Screen:          Click Next
- Inventory Location:       Click OK     
- File Locations:          Check the defaults. I used the default values and clicked Next.
- Available Products:      Select "Oracle Cluster Manager 9.2.0.4.0"
- Public Node Information:
     Public Node 1:         rac1pub
     Public Node 2:         rac2pub
                             Click Next.
- Private Node Information:
     Private Node 1:        rac1prv
     Private Node 2:        rac2prv
                             Click Next.
- WatchDog Parameter:      Accept the default value and click Next. We won't use the Watchdog.
- Quorum Disk Information: /var/opt/oracle/oradata/orcl/CMQuorumFile
                              Click Next.
- Summary:                 Click Install
- When installation has completed, click Exit

Node 1 will copy the necessary files to node 2 over rsh/rcp.

4. Configuring Oracle9i Cluster Manager

The ocmargs.ora configuration file
Remove or comment out the line(s) containing "watchdogd" in the $ORACLE_HOME/oracm/admin/ocmargs.ora file:

[oracle @orasrv1 oracle] $ more $ORACLE_HOME/oracm/admin/ocmargs.ora
[oracle @orasrv1 oracle] $ vi $ORACLE_HOME/oracm/admin/ocmargs.ora
#watchdogd
oracm
norestart 1800


The cmcfg.ora configuration file
Adjust the value of the MissCount parameter in the $ORACLE_HOME/oracm/admin/cmcfg.ora file based on the sum of the hangcheck_tick and hangcheck_margin values. MissCount must be at least 60 and must be greater than hangcheck_tick + hangcheck_margin. In this example hangcheck_tick + hangcheck_margin is 210, so I set MissCount in $ORACLE_HOME/oracm/admin/cmcfg.ora to 215.
[oracle @orasrv1 oracle] $ vi $ORACLE_HOME/oracm/admin/cmcfg.ora    (also comment out all Watchdog lines; an illustrative file is sketched below)
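For reference, this is roughly what cmcfg.ora might look like for this two-node setup; treat it as an illustration only (the port, cluster name, and exact parameter set are assumptions; verify against the file generated by your own installation):

ClusterName=Oracle Cluster Manager, version 9i
MissCount=215
PrivateNodeNames=rac1prv rac2prv
PublicNodeNames=rac1pub rac2pub
ServicePort=9998
CmDiskFile=/var/opt/oracle/oradata/orcl/CMQuorumFile
KernelModuleName=hangcheck-timer
HostName=rac1prv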
Start OCM on all RAC nodes:
[oracle @orasrv1 oracle] $ cd $ORACLE_HOME/oracm/bin
[oracle @orasrv1 bin] $ su
[oracle @orasrv1 bin] # ./ocmstart.sh

After starting, check the processes with ps -ef | grep oracm:
root      4389     1  0 15:14 ?        00:00:00 oracm
root      4391  4389  0 15:14 ?        00:00:00 oracm
root      4392  4391  0 15:14 ?        00:00:03 oracm
root      4393  4391  0 15:14 ?        00:00:00 oracm
root      4394  4391  0 15:14 ?        00:00:03 oracm
root      4395  4391  0 15:14 ?        00:00:00 oracm
root      4396  4391  0 15:14 ?        00:00:00 oracm
root      4397  4391  0 15:14 ?        00:00:00 oracm
root      4398  4391  0 15:14 ?        00:00:00 oracm
root      4401  4391  0 15:14 ?        00:00:01 oracm
root      4449  4391  0 15:14 ?        00:00:00 oracm
root      4491  4391  0 15:14 ?        00:00:00 oracm
root      9494  4391  0 17:48 ?        00:00:00 oracm
root      9514  4391  0 17:48 ?        00:00:01 oracm
root      9519  4391  0 17:48 ?        00:00:00 oracm
root      9520  4391  0 17:48 ?        00:00:00 oracm
root      9521  4391  0 17:48 ?        00:00:00 oracm
root      9522  4391  0 17:48 ?        00:00:00 oracm
root      9526  4391  0 17:49 ?        00:00:00 oracm
oracle   12000 11685  0 18:22 pts/4    00:00:00 grep oracm


Note: sometimes Cluster Manager will not start and reports the following error:
ocmstart.sh: Error: Restart is too frequent
ocmstart.sh: Info: Check the system configuration and fix the problem.
ocmstart.sh: Info: After you fixed the problem, remove the timestamp file
ocmstart.sh: Info: "/opt/oracle/product/9.2.0/oracm/log/ocmstart.ts"
This happens because Cluster Manager refuses to be restarted too frequently. To restart it immediately, do the following:
[oracle @orasrv1 oracle] $ cd $ORACLE_HOME/oracm/log
[oracle @orasrv1 oracle] $ rm *.ts
[oracle @orasrv1 oracle] $ sh ./ocmstart.sh



5.Installing Oracle9i 9.2.0.4.0 Database

To install the Oracle9i Real Application Cluster 9.2.0.1.0 software, insert the Oracle9iR2 Disk 1 and launch runInstaller. These steps only need to be performed on one node, the node you are installing from.
[oracle @orasrv1 oracle] $ unset LANG
[oracle @orasrv1 oracle] $ /tmp/Disk1/runInstaller


- Welcome Screen:         Click Next
- Cluster Node Selection: Select/Highlight all RAC nodes using the shift key and the left mouse button.
                           Click Next
     Note: If not all RAC nodes are showing up, or if the Node Selection Screen
     does not appear, then the Oracle Cluster Manager (Node Monitor) oracm is probably not
running on all RAC nodes. See Starting and Stopping Oracle 9i Cluster Manager for more information.
- File Locations:         Click Next
- Available Products:     Select "Oracle9i Database 9.2.0.4.0" and click Next
- Installation Types:     Select "Enterprise Edition" and click Next
- Database Configuration: Select "Software Only" and click Next
-        Shared Configuration File Name:
                          Enter the name of an OCFS shared configuration file or the name of
                           the raw device name.
                           Select "/var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile" and click Next
- Summary:                Click Install.
                           When installation has completed, click Exit.


I once ran into an insufficient disk space problem here: there was not enough space under /opt/oracle, probably because of how the disks had been partitioned (/opt/oracle had ended up on the swap area). The fix was to mount /opt/oracle on sda5, a 50 GB extended partition created earlier, and update /etc/fstab:
vi /etc/fstab    (add the corresponding entry, e.g. /dev/sda5  /opt/oracle  ext3  ...)
cp -r /opt/oracle/ /temp
mount /dev/sda5 /opt/oracle
cp -r /temp /opt/oracle/
If you see the error "You do not have sufficient privileges to write to the specified path. In component Database Configuration Assistant 9.2.0.0, installation cannot continue for this component",
set the owner of /opt/ora9/oradata to oracle.dba.

Note: pay close attention to the prompts during the installation and do not rely on single-instance Oracle experience; some operations must be performed on all nodes together. Make sure inter-node communication is working throughout, because the installer copies all of the Oracle9i files from one node to the other nodes over the network. The installation itself only needs to be run on one node, but Cluster Manager must be running properly on every node; check its status on each node with ps -ef | grep oracm.


Applying the Oracle9i patches
Patches 3119415 and 2617419 fix the ins_oemagent.mk relink error:
[oracle @orasrv1 oracle] $ cd $ORACLE_HOME/bin
[oracle @orasrv1 oracle] $ cp p2617419_220_GENERIC.zip /tmp
[oracle @orasrv1 oracle] $ unzip p2617419_220_GENERIC.zip
[oracle @orasrv1 oracle] $ unzip p3119415_9204_LINUX.zip
[oracle @orasrv1 oracle] $ cd 3119415
[oracle @orasrv1 oracle] $ export PATH=$PATH:/opt/ora9/product/9.2/bin/OPatch
[oracle @orasrv1 oracle] $ export PATH=$PATH:/sbin

[oracle @orasrv1 oracle] $ which opatch
/opt/ora9/product/9.2/bin/OPatch
[oracle @orasrv1 oracle] $ opatch apply  
[oracle @orasrv1 oracle] $ cd $ORACLE_BASE/oui/bin/linux
[oracle @orasrv1 oracle] $ ln -s libclntsh.so.9.0 libclntsh.so



Initialize the shared configuration file

After the installation completes, create the configuration file:
su - root
[root @orasrv1 root] # mkdir -p /var/opt/oracle
[root @orasrv1 root] # touch /var/opt/oracle/srvConfig.loc
[root @orasrv1 root] # chown oracle.dba /var/opt/oracle/srvConfig.loc
[root @orasrv1 root] # chmod 755 /var/opt/oracle/srvConfig.loc


Add the srvconfig_loc parameter to srvConfig.loc as follows:
srvconfig_loc=/var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile
Then create the srvConfig.dbf file. If you are using a shared device, it must be created on that device (an OCFS file system or a raw partition), in which case the file name above will differ accordingly.
Initialize the Shared Configuration File
Before attempting to initialize the shared configuration file, make sure that the Oracle Global Services daemon (gsd) is NOT running, by using the following command:
# su - oracle
[oracle @orasrv1 oracle] $ gsdctl stat


[root @orasrv1 root] # su - oracle
[oracle @orasrv1 oracle] $ srvconfig -init


NOTE: If you receive a PRKR-1025 error when attempting to run the srvconfig -init command, check that you have a valid "srvconfig_loc" entry in your /var/opt/oracle/srvConfig.loc file and that the file is owned by "oracle". This entry gets created by root.sh.
If you receive a PRKR-1064 error when attempting to run the srvconfig -init command, check that the /var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile file is accessible from all RAC nodes:

If there is an error, verify that srvConfig.loc is owned by oracle, and check SharedSrvctlConfigFile on every node:
[oracle @orasrv1 oracle] $ cd ~oracle/oradata/orcl
[oracle @orasrv1 oracle] $ ls -l SharedSrvctlConfigFile
lrwxrwxrwx  1 oracle  dba   13 May  2 20:17 SharedSrvctlConfigFile -> /dev/raw/raw2


If you are using raw devices and the shared raw partition is too small, increase its size and try again.

Start Oracle Global Services
After initializing the Shared Configuration File, you will need to manually start the Oracle Global Services daemon (gsd) to ensure that it works. At this point in the installation, the Global Services daemon should be down. To confirm this, run the following command:
[oracle @orasrv1 oracle] $ gsdctl stat

GSD is not running on the local node
Let's manually start the Global Services daemon (gsd) by running the following command on all nodes in the RAC cluster:
[oracle @orasrv1 oracle] $ gsdctl start

Successfully started GSD on local node

If gsd fails to start on one of the nodes, check it as described below; if gsd is not running on all nodes, dbca will report errors later.
Check Node Name and Node Number Mappings
In most cases, the Oracle Global Services daemon (gsd) will start successfully on all nodes in the RAC cluster. There are cases, however, where the node name and node number mappings in the cmcfg.ora file on node 2 are not correct. This does not happen very often, but it has happened to me on at least one occasion.
If the node name and node number mappings are not correct, the problem will not show up until you attempt to run the Database Configuration Assistant (dbca), the assistant we will use later to create our cluster database. The error reported by the DBCA will say something to the effect of "gsd daemon has not been started on node 2".
To check that the node name and number mappings are correct on your cluster, run the following command on both your nodes:

Listing for node1:
[oracle @orasrv1 oracle] $ lsnodes -n
rac1als     0
rac2als     1
Listing for node2:
[oracle @orasrv1 oracle] $ lsnodes -n
rac2als    1
rac1als    0

Once everything starts without errors, add the following lines to /etc/rc.local:
. ~oracle/.bash_profile
rm -rf $ORACLE_HOME/oracm/log/*.ts
$ORACLE_HOME/oracm/bin/ocmstart.sh
su - oracle -c "gsdctl start"
su - oracle -c "lsnrctl start"


Create the Oracle Database

Before creating the database, check the permissions of every directory under /opt/ora9/ (admin, oui, product, including subdirectories).
[root @orasrv1 root] # xhost +127.0.0.1
[oracle @orasrv1 oracle] $ unset LANG
[oracle @orasrv1 oracle] $ dbca


Screen Name                    Response
Type of Database               Select "Oracle Cluster Database" and click "Next".
Operations                     Select "Create a database" and click "Next".
Node Selection                 Click the "Select All" button on the right. If not all of the nodes in your RAC
                               cluster are showing up, or if the Node Selection screen does not appear, then the
                               Oracle Cluster Manager (Node Manager) oracm is probably not running on all RAC
                               nodes. For more information, see Starting and Stopping Oracle9i Cluster Manager
                               under the "Installing Oracle9i Cluster Manager" section.
Database Templates             Select "New Database" and click "Next".
Database Identification        Global Database Name: orcl
                               SID Prefix: orcl
Database Features              You can keep all database features selected (I typically do), or clear any of the
                               boxes to leave that feature out of the new database. Click "Next" when finished.
Database Connection Options    Select "Dedicated Server Mode" and click "Next".
Initialization Parameters      Click "Next".
Database Storage               If you have followed this article and created all of the symbolic links, the
                               datafiles for all tablespaces should match what DBCA proposes. I do, however,
                               change the initial size of each tablespace: navigate through the tree, adjust the
                               sizes as needed, then click "Next".
Creation Options               Choose the creation options you want for the cluster database. When you are ready
                               to start the database creation process, click "Finish".
Summary                        Click "OK".

1. SGA size
SGA = log_buffer + large_pool_size + java_pool_size + shared_pool_size + data buffer
The SGA should be no more than half of physical RAM. If the SGA is too small, Oracle performance suffers; if it is too large, it interferes with normal operation of the OS.
log_buffer (redo log buffer): usually set to 1 MB.
large_pool_size (large pool): 20-30 MB is recommended.
java_pool_size (Java pool): if the database does not use Java, 10-30 MB is recommended.
shared_pool_size (shared pool): this parameter has a large impact on performance; usually about 10% of physical RAM.
Data buffer (buffer cache): this also has a large impact on performance. Once the SGA size is decided and the pools above are allocated, the remainder can go to the data buffer; here it is set to 300 MB. (A worked example follows below.)
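A worked sizing sketch for this 2 GB server using the guidelines above (values are illustrative; db_cache_size is the 9i parameter for the data buffer):

log_buffer        = 1048576     # 1 MB redo log buffer
large_pool_size   = 25M
java_pool_size    = 20M
shared_pool_size  = 200M        # roughly 10% of 2 GB physical RAM
db_cache_size     = 300M        # data buffer
# total SGA is roughly 1 + 25 + 20 + 200 + 300 = 546 MB, well under half of 2 GB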
2. PGA size
On Oracle9iR2 the PGA size is evaluated automatically, so we use the default of 24 MB. The PGA mainly affects Oracle's sort performance.
3. db_block_size
The data block size, i.e. the unit of every database read and write. If it is too small, the database will read and write the disks too frequently and performance drops. The default is 8 KB; we set it to 16 KB.
4. processes
The number of concurrent processes. Oracle evaluates this automatically; the default is 150. Do not set it too large, or Oracle will spend a great deal of memory managing the many concurrent processes, possibly running out of memory and crashing. We use the default.
5. sessions
The number of concurrent sessions; the default is 38. The recommended setting is processes * 1.1; we set it to 170.
6. max_enabled_roles
The maximum number of enabled roles. This parameter is unrelated to performance; it only matters for long-term database planning, so a larger value is recommended. The default is 30, we set it to 145, and the maximum allowed is 148.
7. lock_sga = true
This parameter locks the SGA in physical memory so it is never swapped out to virtual memory, reducing paging and improving performance. Note that this parameter cannot be used on Windows.
Starting with 9iR2, the database no longer starts from a pfile by default; it uses an spfile, which makes it easier to change initialization parameters. The statements to view and change parameters are:
To view:
SQL> show parameters <parameter_name>
To change:
SQL> alter system set <parameter_name> = <value>;
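For example (the parameter used here is only for illustration; with an spfile in RAC, sid='*' applies the change to all instances):

SQL> show parameter db_block_size
SQL> alter system set db_cache_size = 300M scope=spfile sid='*';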
Verifying the RAC cluster / database configuration
When DBCA completes, you will have a fully functional, running Oracle RAC cluster.
This section provides several commands and SQL queries that can be used to verify your Oracle9i RAC configuration.
--Look for Oracle Cluster Manager--
$ ps -ef | grep oracm | grep -v 'grep'
$ gsdctl stat

GSD is running on the local node
--Using srvctl--
$ srvctl status database -d orcl
Instance orcl1 is running on node rac1als
Instance orcl2 is running on node rac2als
$ srvctl config database -d orcl
rac1als orcl1 /opt/ora9/product/9.2
rac2als orcl2 /opt/ora9/product/9.2
  
Query gv$instance
SELECT
inst_id
, instance_number inst_no
, instance_name inst_name
, parallel
, status
, database_status db_status
, active_state state
, host_name host
FROM gv$instance
ORDER BY inst_id;


INST_ID INST_NO INST_NAME PAR STATUS DB_STATUS STATE HOST
-------- -------- ---------- --- ------- ----------- ------- -------
1 1 orcl1 YES OPEN ACTIVE NORMAL rac1als
2 2 orcl2 YES OPEN ACTIVE NORMAL rac2als

Starting and stopping the cluster
This section describes the commands used to start up and shut down the instances in the Oracle9i RAC cluster. Make sure you are logged in as the "oracle" UNIX user:
# su - oracle
Starting the cluster
Start all registered instances:
$ srvctl start database -d orcl
Start only the orcl2 instance:
$ srvctl start instance -d orcl -i orcl2
Stopping the cluster
Shut down all registered instances:
$ srvctl stop database -d orcl
Shut down the orcl2 instance with the immediate option:
$ srvctl stop instance -d orcl -i orcl2 -o immediate
Shut down the orcl2 instance with the abort option:
$ srvctl stop instance -d orcl -i orcl2 -o abort


Transparent Application Failover (TAF)
It is common for businesses to demand 99.99% or even 99.999% availability for their enterprise applications. Consider what it would cost to guarantee no more than half an hour of downtime per year, or no downtime at all. To meet these high-availability requirements, businesses invest in mechanisms that provide automatic failover when a participating system fails. For Oracle database availability, Oracle9i RAC provides a superior solution with an advanced failover mechanism. Oracle9i RAC includes all of the components required to work in a clustered configuration and provide continuous availability: when one participating system in the cluster fails, users are automatically migrated to the other available systems.
A major component of Oracle9i RAC responsible for failover processing is the Transparent Application Failover (TAF) option. All database connections (and processes) that lose their connection are reconnected to another node in the cluster, and the failover is completely transparent to the user.
This final section gives a brief explanation of how automatic failover works in Oracle9i RAC. Note that a complete discussion of failover in Oracle9i RAC would be an article in itself; the goal here is a concise overview and example of how it works.
One important point before continuing: TAF happens automatically within the OCI libraries, so your application (client) code does not need to change in order to use TAF. Some configuration steps are, however, required in the Oracle TNS file, tnsnames.ora.
listener.ora configuration

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.220)(PORT = 1521))
      )
    )
  )

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /opt/ora9/product/9.2)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = orcl)
      (ORACLE_HOME = /opt/ora9/product/9.2)
      (SID_NAME = orcl1)
    )
  )
Note: keep in mind that the Java thin client cannot participate in TAF, because it never reads the tnsnames.ora file.


Setting up the tnsnames.ora file
Before demonstrating TAF, we need to configure a tnsnames.ora file on a non-RAC client machine (a Windows machine will do, if you have one). Make sure the Oracle RDBMS software is installed; in fact, only the Oracle client software is needed on that machine.
Here are the entries I added to %ORACLE_HOME%\network\admin\tnsnames.ora on the Windows client in order to connect to the new Oracle cluster database:
ORCL =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.220)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.221)(PORT = 1521))
      (LOAD_BALANCE = on)
      (FAILOVER = on)
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
      (FAILOVER_MODE =
        (TYPE = session)
        (METHOD = basic)
      )
    )
  )
ORCL2 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.221)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SID = orcl2)
      (SERVER = DEDICATED)
    )
  )
ORCL1 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.220)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SID = orcl1)
      (SERVER = DEDICATED)
    )
  )


Note: it is strongly recommended that the client's tnsnames.ora be based on the server's tnsnames.ora settings; that is guaranteed to work. Do not blindly generate your own tnsnames.ora from the template above, because version differences can keep clients from connecting to the database reliably, so make sure the client configuration matches the server configuration.
Strong recommendation:
When starting the cluster database, use the standard cluster database control commands to start and stop the database on each node.
For example, node 1's instance is named orcl1 and node 2's instance is named orcl2.
The procedure to restart the database is:
1. Run cluster manager normally on each node.
2. Run gsdctl start successfully on each node.
3. Start the first node:   srvctl start instance -d orcl -i orcl1
4. Start the second node:  srvctl start instance -d orcl -i orcl2
5. Check the status of all nodes:  srvctl status database -d orcl
Note: steps 3 and 4 can also be replaced by srvctl start database -d orcl.
The output
Instance orcl1 is running on node rac1pub
Instance orcl2 is running on node rac2pub
indicates that everything is normal.
To shut down one node's instance: srvctl stop instance -d orcl -i orcl1
Do not use the startup command to start, or the shutdown command to stop, the database on individual nodes; these are single-instance commands. They may appear to work, but the officially recommended srvctl commands should be used instead. Experience shows that controlling a cluster (RAC) database with the single-instance startup/shutdown commands can sometimes crash the database.
Appendix:
Reference tnsnames.ora configuration for clients:
ORCL =
  (DESCRIPTION =
    (LOAD_BALANCE = yes)
    (FAILOVER = ON)
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac1pub)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac2pub)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
      (FAILOVER_MODE =
        (type = select)
        (method = basic)
      )
    )
  )
ORCL1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac1pub)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
      (INSTANCE_NAME = orcl1)
    )
  )

ORCL2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac2pub)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
      (INSTANCE_NAME = orcl2)
    )
  )

From our Windows (or other non-RAC) client, log in to the cluster database (orcl) as the SYSTEM user:
$ sqlplus system/oracle@orcl
SQL*Plus: Release 9.2.0.3.0 - Production on Mon May 10 21:17:07 2004
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
orcl1  
      
Do not log out of the SQL*Plus session above!
Now that the query has been run, we will shut down the orcl1 instance on rac1als. First check which instances are running, using the srvctl command-line utility:
# su - oracle
$ srvctl status database -d orcl

Instance orcl1 is running on node rac1als
Instance orcl2 is running on node rac2als


SQL> shutdown immediate
Now return to the SQL*Plus session on the client and re-run the SQL statement from the buffer:
SQL> connect system/oracle@orcl
SQL> select instance_name from v$instance;

INSTANCE_NAME
----------------
orcl2     
   
SQL> exit
The demonstration above shows that the session has been failed over to instance orcl2 on rac2als.
Note that there is a delay while the session reconnects.