For this article I used the following Oracle setup:

RAC node          Database Name  SID    $ORACLE_BASE  Oracle Datafile Directory
---------------   -------------  -----  ------------  ----------------------------
rac1als/rac1prv   orcl           orcl1  /opt/oracle/  /var/opt/oracle/oradata/orcl
rac2als/rac2prv   orcl           orcl2  /opt/oracle/  /var/opt/oracle/oradata/orcl
Server 1 (rac1pub)

Device  IP Address     Subnet         Purpose
eth0    192.168.0.220  255.255.255.0  Connects rac1pub to the public network
eth1    10.0.0.1       255.255.0.0    Connects rac1pub (interconnect) to rac2pub (rac2prv)

/etc/hosts
127.0.0.1      ciqa.server ciqaora1
192.168.0.220  rac1pub.ciqa.server rac1als
10.0.0.1       rac1prv
192.168.0.221  rac2pub.ciqa.server rac2als
10.0.0.2       rac2prv
Server 2 (rac2pub)

Device  IP Address     Subnet         Purpose
eth0    192.168.0.221  255.255.255.0  Connects rac2pub to the public network
eth1    10.0.0.2       255.255.0.0    Connects rac2pub (interconnect) to rac1pub (rac1prv)

/etc/hosts
127.0.0.1      ciqa.server ciqaora2
192.168.0.220  rac1pub.ciqa.server rac1als
10.0.0.1       rac1prv
192.168.0.221  rac2pub.ciqa.server rac2als
10.0.0.2       rac2prv
In the network settings, configure both eth0 and eth1 to "Activate device when computer starts".

If you are simulating RAC on a single node, the /etc/hosts file can look like the following:

[oracle@orasrv1 oracle]$ vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1      huiheng.localdomain orasrv1    # orasrv2 on the second node
10.0.0.1       rac1prv
10.0.0.2       rac2prv
192.168.0.220  rac1pub.huiheng.localdomain rac1als
192.168.0.221  rac2pub.huiheng.localdomain rac2als

After making these changes, restart the network service:

[root@orasrv1 root]# service network restart
1. Check node one using the following procedure:
   a. Ping node 2 using the private hostname.
   b. Ping node 2 using the public hostname.
   c. Ping node 2 using the private IP address.
   d. Ping node 2 using the public IP address.

2. Check node two using the following procedure:
   a. Ping node 1 using the private hostname.
   b. Ping node 1 using the public hostname.
   c. Ping node 1 using the private IP address.
   d. Ping node 1 using the public IP address.
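For example, the checks for node 1 can be run as follows; this is a minimal sketch using the names and addresses from the /etc/hosts files above (-c 2 simply limits each test to two probes):

[oracle@orasrv1 oracle]$ ping -c 2 rac2prv          # private hostname
[oracle@orasrv1 oracle]$ ping -c 2 rac2als          # public hostname (alias)
[oracle@orasrv1 oracle]$ ping -c 2 10.0.0.2         # private IP address
[oracle@orasrv1 oracle]$ ping -c 2 192.168.0.221    # public IP address

Repeat the same four pings from node 2 against rac1prv, rac1als, 10.0.0.1, and 192.168.0.220.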
Check that the quorum file link is in place:

[oracle@orasrv1 oracle]$ ls -l /var/opt/oracle/oradata/orcl/CMQuorumFile

After you have finished creating the partitions, I recommend that you reboot all RAC nodes to make sure all partitions are recognized by the kernel on every node:

[oracle@orasrv1 oracle]$ su - root
[root@orasrv1 root]# reboot
After the reboot, check that the logical volumes and the symbolic links are still visible:

# lvscan
$ ls -l ~oracle/oradata/orcl
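If the links point at raw devices, the listing will look roughly like the following; the device numbers and timestamps here are only illustrative and depend on how you bound your shared storage:

[oracle@orasrv1 oracle]$ ls -l ~oracle/oradata/orcl
lrwxrwxrwx    1 oracle   dba            13 Oct 19 13:02 CMQuorumFile -> /dev/raw/raw1
lrwxrwxrwx    1 oracle   dba            13 Oct 19 13:02 SharedSrvctlConfigFile -> /dev/raw/raw2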
Configuring Remote Access Permissions

When you run the Oracle Installer on a RAC node, it uses rsh to copy the Oracle software to the other RAC nodes. Therefore, the oracle account on the RAC node where runInstaller is launched must be trusted by all other RAC nodes. This means that you should be able to run rsh, rcp, and rlogin on this RAC node against the other RAC nodes without a password. The rsh daemon validates users using the /etc/hosts.equiv file and the .rhosts file found in the user's (oracle's) home directory. Unfortunately, SSH is not supported. The following steps show how I set up a trusted environment for the "oracle" account on all RAC nodes. First, make sure the "rsh" RPMs are installed on all RAC nodes.
As the oracle user, check whether the rsh packages are installed:
[oracle@orasrv1 oracle]$ rpm -q rsh rsh-server
If rsh is not installed, run the following commands:

[oracle@orasrv1 oracle]$ su - root
[root@orasrv1 root]# rpm -ivh rsh-0.17-5.i386.rpm rsh-server-0.17-5.i386.rpm
To enable the "rsh" service, the "disable" attribute in the /etc/xinetd.d/rsh file must be set to "no" and xinetd must be refreshed. This can be done by running the following commands: 将/etc/xinetd.d/rsh变为disable不可执行
[oracle@orasrv1 oracle]$ su - root
[root@orasrv1 root]# chkconfig rsh on
[root@orasrv1 root]# chkconfig rlogin on
[root@orasrv1 root]# service xinetd reload
To allow the "oracle" user account to be trusted among the RAC nodes, create the /etc/hosts.equiv file: [root @orasrv1 root] # su - oracle [oracle @orasrv1 oracle] $ su [root @orasrv1 root] # touch /etc/hosts.equiv [root @orasrv1 root] # chmod 600 /etc/hosts.equiv [root @orasrv1 root] # chown root.root /etc/hosts.equiv
Verify that the entries were added. In the preceding example, the second field permits only the oracle user account to run rsh commands on the specified nodes. For security reasons, the /etc/hosts.equiv file should be owned by root and the permissions should be set to 600. In fact, some systems will only honor the content of this file if it is owned by root and its permissions are set to 600.
Now you should be able to run rsh against each RAC node without having to provide the password for the oracle account. Run the following as the oracle user:
[oracle@orasrv1 oracle]$ rsh rac1prv ls -l /etc/hosts.equiv
-rw-------    1 root     root           49 Oct 19 13:18 /etc/hosts.equiv
[oracle@orasrv1 oracle]$ rsh rac2prv ls -l /etc/hosts.equiv
-rw-------    1 root     root           49 Oct 19 14:39 /etc/hosts.equiv
The following changes have to be done on ALL RAC nodes.
To install the Oracle Cluster Manager, insert the Oracle9i R2 Disk 1 and launch /mnt/cdrom/runInstaller (alternatively, extract the Disk 1 files to a temporary directory and run runInstaller from there). These steps only need to be performed on one RAC node, the node you are installing from.
- Welcome Screen: Click Next
- Inventory Location: Click OK
- Unix Group Name: Use "oinstall".
- Root Script Window: Open another window, log in as root, and run
  [root@orasrv1 root]# sh /tmp/orainstRoot.sh
  on the node where you are running this installation (runInstaller). After you run the script, click Continue.
- File Locations: Check the defaults. I used the default values and clicked Next.
- Available Products: Select "Oracle Cluster Manager 9.2.0.4.0"
- Public Node Information:
  Public Node 1: rac1pub
  Public Node 2: rac2pub
  Click Next.
- Private Node Information:
  Private Node 1: rac1prv
  Private Node 2: rac2prv
  Click Next.
- WatchDog Parameter: Accept the default value and click Next. We won't use the Watchdog.
- Quorum Disk Information: /var/opt/oracle/oradata/orcl/CMQuorumFile
  Click Next.
- Summary: Click Install
- When installation has completed, click Exit.
Applying the Oracle9i Cluster Manager 9.2.0.4.0 Patch Set on one RAC node

To patch the Oracle Cluster Manager, launch runInstaller from the directory where the patch set was extracted (here /tmp/Disk1) or from $ORACLE_HOME/bin/runInstaller:

[oracle@orasrv1 oracle]$ unset LANG
[oracle@orasrv1 oracle]$ /tmp/Disk1/runInstaller
- Welcome Screen: Click Next
- Inventory Location: Click OK
- File Locations: Check the defaults. I used the default values and clicked Next.
- Available Products: Select "Oracle Cluster Manager 9.2.0.4.0"
- Public Node Information:
  Public Node 1: rac1pub
  Public Node 2: rac2pub
  Click Next.
- Private Node Information:
  Private Node 1: rac1prv
  Private Node 2: rac2prv
  Click Next.
- WatchDog Parameter: Accept the default value and click Next. We won't use the Watchdog.
- Quorum Disk Information: /var/opt/oracle/oradata/orcl/CMQuorumFile
  Click Next.
- Summary: Click Install
- When installation has completed, click Exit.

Node 1 will copy the required files to node 2 over rsh.
4. Configuring Oracle 9i Cluster Manager
The ocmargs.ora configuration file

REMOVE or comment out the watchdogd line in the $ORACLE_HOME/oracm/admin/ocmargs.ora file:

[oracle@orasrv1 oracle]$ more $ORACLE_HOME/oracm/admin/ocmargs.ora
[oracle@orasrv1 oracle]$ vi $ORACLE_HOME/oracm/admin/ocmargs.ora

After the edit the file should contain:

#watchdogd
oracm
norestart 1800
The cmcfg.ora configuration file

ADJUST the value of the MissCount parameter in the $ORACLE_HOME/oracm/admin/cmcfg.ora file based on the sum of the hangcheck_tick and hangcheck_margin values. The MissCount parameter must be set to at least 60 and it must be greater than hangcheck_tick + hangcheck_margin. In my example, hangcheck_tick + hangcheck_margin is 210, so I set MissCount in $ORACLE_HOME/oracm/admin/cmcfg.ora to 215. Also comment out all Watchdog lines:

[oracle@orasrv1 oracle]$ vi $ORACLE_HOME/oracm/admin/cmcfg.ora

(A sample cmcfg.ora is sketched after the process listing below.)

Start the Oracle Cluster Manager (OCM) on all RAC nodes:

[oracle@orasrv1 oracle]$ cd $ORACLE_HOME/oracm/bin
[oracle@orasrv1 bin]$ su
[root@orasrv1 bin]# ./ocmstart.sh

Once it has started, check the processes with ps -ef | grep oracm:

root      4389     1  0 15:14 ?        00:00:00 oracm
root      4391  4389  0 15:14 ?        00:00:00 oracm
root      4392  4391  0 15:14 ?        00:00:03 oracm
root      4393  4391  0 15:14 ?        00:00:00 oracm
root      4394  4391  0 15:14 ?        00:00:03 oracm
root      4395  4391  0 15:14 ?        00:00:00 oracm
root      4396  4391  0 15:14 ?        00:00:00 oracm
root      4397  4391  0 15:14 ?        00:00:00 oracm
root      4398  4391  0 15:14 ?        00:00:00 oracm
root      4401  4391  0 15:14 ?        00:00:01 oracm
root      4449  4391  0 15:14 ?        00:00:00 oracm
root      4491  4391  0 15:14 ?        00:00:00 oracm
root      9494  4391  0 17:48 ?        00:00:00 oracm
root      9514  4391  0 17:48 ?        00:00:01 oracm
root      9519  4391  0 17:48 ?        00:00:00 oracm
root      9520  4391  0 17:48 ?        00:00:00 oracm
root      9521  4391  0 17:48 ?        00:00:00 oracm
root      9522  4391  0 17:48 ?        00:00:00 oracm
root      9526  4391  0 17:49 ?        00:00:00 oracm
oracle   12000 11685  0 18:22 pts/4    00:00:00 grep oracm
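For reference, a cmcfg.ora for node 1 of this setup could look roughly like the following. This is only a sketch built from the node names, quorum file, and MissCount value used in this article; the HeartBeat, PollInterval, and ServicePort values shown are common defaults rather than requirements, and the HostName line differs on node 2:

HeartBeat=15000
ClusterName=Oracle Cluster Manager, version 9i
PollInterval=1000
MissCount=215
PrivateNodeNames=rac1prv rac2prv
PublicNodeNames=rac1pub rac2pub
ServicePort=9998
CmDiskFile=/var/opt/oracle/oradata/orcl/CMQuorumFile
HostName=rac1prv
#WatchdogSafetyMargin=5000
#WatchdogTimerMargin=60000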
Note: The Cluster Manager sometimes refuses to start and reports the following error:

ocmstart.sh: Error: Restart is too frequent
ocmstart.sh: Info: Check the system configuration and fix the problem.
ocmstart.sh: Info: After you fixed the problem, remove the timestamp file
ocmstart.sh: Info: "/opt/oracle/product/9.2.0/oracm/log/ocmstart.ts"

This is because the Cluster Manager will not allow itself to be restarted too frequently. To restart it immediately, remove the timestamp file and start it again:

[oracle@orasrv1 oracle]$ cd $ORACLE_HOME/oracm/log
[oracle@orasrv1 log]$ rm *.ts
[oracle@orasrv1 log]$ sh ./ocmstart.sh
5. Installing Oracle9i 9.2.0.4.0 Database
To install the Oracle9i Real Application Clusters 9.2.0.1.0 software, insert the Oracle9i R2 Disk 1 and launch runInstaller. These steps only need to be performed on one node, the node you are installing from.

[oracle@orasrv1 oracle]$ unset LANG
[oracle@orasrv1 oracle]$ /tmp/Disk1/runInstaller
- Welcome Screen: Click Next
- Cluster Node Selection: Select/highlight all RAC nodes using the shift key and the left mouse button, then click Next.
  Note: If not all RAC nodes are showing up, or if the Node Selection screen does not appear, then the Oracle Cluster Manager (Node Monitor) oracm is probably not running on all RAC nodes. See Starting and Stopping Oracle 9i Cluster Manager for more information.
- File Locations: Click Next
- Available Products: Select "Oracle9i Database 9.2.0.4.0" and click Next
- Installation Types: Select "Enterprise Edition" and click Next
- Database Configuration: Select "Software Only" and click Next
- Shared Configuration File Name: Enter the name of an OCFS shared configuration file or the name of a raw device. I entered "/var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile" and clicked Next.
- Summary: Click Install. When installation has completed, click Exit.
I once ran into a disk-space problem at this point: there was not enough free space under /opt/oracle, probably because of how the disk had been partitioned originally (/opt/oracle had ended up on the wrong partition). I moved /opt/oracle onto /dev/sda5, a 50 GB extended partition created earlier, and added the corresponding entry to /etc/fstab:

vi /etc/fstab
    /dev/sda5  /opt/oracle  ext3  ...

cp -r /opt/oracle /temp
mount /dev/sda5 /opt/oracle
cp -r /temp/oracle/* /opt/oracle/

If you get the error "you do not have sufficient privileges to write to the specified path. In component Database Configuration Assistant 9.2.0.0, installation cannot continue for this component", change the ownership of /opt/ora9/oradata to oracle.dba.
Add the srvconfig_loc parameter to the srvConfig.loc file:

srvconfig_loc=/var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile

Then create the shared configuration file (srvConfig.dbf). If you are using shared storage, it must be created on the shared device, for example on an OCFS file system or on a raw partition; in that case the file name above will differ accordingly.

Starting Oracle Global Services

Initialize the Shared Configuration File

Before attempting to initialize the Shared Configuration File, make sure that the Oracle Global Services daemon (gsd) is NOT running:

# su - oracle
[oracle@orasrv1 oracle]$ gsdctl stat
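Once gsdctl stat confirms that GSD is down, initialize the shared configuration file; this is the srvconfig -init command referred to in the note below:

[oracle@orasrv1 oracle]$ srvconfig -init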
NOTE: If you receive a PRKR-1025 error when attempting to run the srvconfig -init command, check that you have a valid entry for "srvconfig_loc" in your /var/opt/oracle/srvConfig.loc file and that the file is owned by "oracle". This entry is created by root.sh. If you receive a PRKR-1064 error when attempting to run the srvconfig -init command, check that the /var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile file is accessible from all RAC nodes:
If you get errors, check that srvConfig.loc is owned by oracle, and check the SharedSrvctlConfigFile on all nodes:

[oracle@orasrv1 oracle]$ cd ~oracle/oradata/orcl
[oracle@orasrv1 orcl]$ ls -l SharedSrvctlConfigFile
lrwxrwxrwx    1 oracle   dba            13 May  2 20:17 SharedSrvctlConfigFile -> /dev/raw/raw2
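To confirm that the file is reachable from every node, the rsh access configured earlier can be reused, for example:

[oracle@orasrv1 oracle]$ rsh rac1prv ls -l /var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile
[oracle@orasrv1 oracle]$ rsh rac2prv ls -l /var/opt/oracle/oradata/orcl/SharedSrvctlConfigFile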
If you are using raw devices and your shared raw disk is too small, increase its size and try again.
Start Oracle Global Services

After initializing the Shared Configuration File, you will need to manually start the Oracle Global Services daemon (gsd) to ensure that it works. At this point in the installation, the Global Services daemon should still be down. To confirm this, run the following command:

[oracle@orasrv1 oracle]$ gsdctl stat
GSD is not running on the local node

Let's manually start the Global Services daemon (gsd) by running the following command on all nodes in the RAC cluster:

[oracle@orasrv1 oracle]$ gsdctl start
Successfully started GSD on local node

If GSD did not start on one of the nodes, check it using the method below; if it is not running on all nodes, DBCA will fail later with an error.

Check Node Name and Node Number Mappings

In most cases, the Oracle Global Services daemon (gsd) starts successfully on all nodes in the RAC cluster. There are cases, however, where the node name and node number mappings are not correct in the cmcfg.ora file on node 2. This does not happen very often, but it has happened to me on at least one occasion. If the node name and node number mappings are not correct, the problem will not show up until you attempt to run the Database Configuration Assistant (dbca), the assistant we will use later to create our cluster database. The error reported by DBCA will say something to the effect of "gsd daemon has not been started on node 2". To check that the node name and number mappings are correct on your cluster, run the following command on both of your nodes:
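One way to check the mapping (a suggestion on my part, not necessarily the tool used in the original walkthrough) is the lsnodes utility that ships with the RAC installation:

[oracle@orasrv1 oracle]$ $ORACLE_HOME/bin/lsnodes -n

Both private node names should be listed with their node numbers, and the output should be identical on both nodes and consistent with the PrivateNodeNames order in cmcfg.ora.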
Creating the cluster database with the Database Configuration Assistant (dbca):

- Type of Database: Select "Oracle Cluster Database" and click "Next".
- Operations: Select "Create a database" and click "Next".
- Node Selection: Click the "Select All" button to the right. If all of the nodes in your RAC cluster are not showing up, or if the Node Selection screen does not appear, then the Oracle Cluster Manager (Node Manager) oracm is probably not running on all RAC nodes. For more information, see Starting and Stopping Oracle9i Cluster Manager under the "Installing Oracle9i Cluster Manager" section.
- Database Templates: Select "New Database" and click "Next".
- Database Identification: Global Database Name: orcl; SID Prefix: orcl.
- Database Features: For your new database you can keep all database features selected; I typically do. If you want to, however, you can clear any of the boxes so that the feature is not installed in your new database. Click "Next" when finished.
- Database Connection Options: Select "Dedicated Server Mode" and click "Next".
- Initialization Parameters: Click "Next". (See the notes on the individual parameters below.)
- Database Storage: If you have followed this article and created all the symbolic links, then the datafiles for all tablespaces should already match what DBCA proposes. I do, however, change the initial size of each tablespace; to do this, navigate through the tree and adjust the sizes as needed, then select the appropriate files and click "Next".
- Creation Options: When you are ready to start the database creation process, click "Finish".
- Summary: Click "OK".

Notes on the initialization parameters:

1. SGA size
   SGA = log_buffer + large_pool_size + java_pool_size + shared_pool_size + data buffer cache
   The SGA should be no more than half of physical RAM. If it is too small, Oracle performance suffers; if it is too large, it interferes with normal operation of the operating system.
   - log_buffer (redo log buffer): usually set to about 1 MB.
   - large_pool_size: 20-30 MB is recommended.
   - java_pool_size: 10-30 MB is enough if the database does not use Java.
   - shared_pool_size: has a large impact on performance; usually about 10% of physical RAM.
   - Data buffer cache: also has a large impact on performance. Once the total SGA size is fixed and the pools above are allocated, the remainder can be given to the buffer cache; here it is set to 300 MB.

2. PGA size
   Oracle9i R2 estimates a suitable PGA size automatically, so we keep the default of 24 MB. The PGA mainly affects Oracle's sort performance.

3. db_block_size
   The block size used for every database read and write. If it is too small, disk I/O becomes frequent and performance drops. The default is 8 KB; we set it to 16 KB.

4. processes
   The number of concurrent server processes. Oracle estimates this automatically; the default is 150. Do not set it too high, or Oracle will spend a large amount of memory managing the processes and may even run out of memory and crash. We used the default.

5. sessions
   The number of concurrent sessions; the default is 38. A sensible setting is roughly processes * 1.1; we set it to 170.

6. max_enabled_roles
   The maximum number of enabled roles. This parameter does not affect performance; it only matters for long-term growth of the database. A larger value is recommended: the default is 30, we set it to 145, and the maximum is 148.

7. lock_sga=true
   This parameter locks the SGA in physical memory so that it is never swapped out to virtual memory, which reduces paging and improves performance. Note that this parameter cannot be used on Windows.

Starting with 9i R2, the database is by default started with an spfile rather than a pfile, which makes it much easier to change initialization parameters. To view and change a parameter (a concrete example is sketched after the verification commands below):

View:    SQL> show parameters <parameter_name>
Change:  SQL> alter system set <parameter_name> = <value>;

Verifying the RAC Cluster / Database Configuration

When DBCA completes, you will have a fully functional, running Oracle RAC cluster. This section provides several commands and SQL queries that you can use to verify your Oracle9i RAC configuration.

-- Look for the Oracle Cluster Manager --
$ ps -ef | grep oracm | grep -v 'grep'

$ gsdctl stat
GSD is running on the local node

-- Using srvctl --
$ srvctl status database -d orcl
Instance orcl1 is running on node rac1als
Instance orcl2 is running on node rac2als

$ srvctl config database -d orcl
rac1als orcl1 /opt/ora9/product/9.2
rac2als orcl2 /opt/ora9/product/9.2
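Returning to the spfile note above, here is a concrete instance of the view/change pattern; the parameter and value are purely illustrative, and static parameters such as processes only take effect after the instances are restarted:

SQL> show parameter processes
SQL> alter system set processes=200 scope=spfile sid='*';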
Query gv$instance:

SELECT
    inst_id
  , instance_number  inst_no
  , instance_name    inst_name
  , parallel
  , status
  , database_status  db_status
  , active_state     state
  , host_name        host
FROM gv$instance
ORDER BY inst_id;
 INST_ID  INST_NO INST_NAME  PAR STATUS  DB_STATUS STATE   HOST
-------- -------- ---------- --- ------- --------- ------- -------
       1        1 orcl1      YES OPEN    ACTIVE    NORMAL  rac1als
       2        2 orcl2      YES OPEN    ACTIVE    NORMAL  rac2als

Starting and Stopping the Cluster

This section details the various methods and commands needed to start and shut down the instances in an Oracle9i RAC cluster. Make sure you are logged in as the "oracle" UNIX user:

# su - oracle

Starting the cluster

Start all registered instances:
$ srvctl start database -d orcl

Start the orcl2 instance:
$ srvctl start instance -d orcl -i orcl2

Stopping the cluster

Shut down all registered instances:
$ srvctl stop database -d orcl

Shut down the orcl2 instance with the immediate option:
$ srvctl stop instance -d orcl -i orcl2 -o immediate

Shut down the orcl2 instance with the abort option:
$ srvctl stop instance -d orcl -i orcl2 -o abort
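Putting the pieces together, a complete stop/start cycle of the whole stack might look roughly like the following. This is only a sketch based on the commands used throughout this article; gsdctl and ocmstart.sh must be run on every node, and ocmstart.sh must be run as root:

Stop:
$ srvctl stop database -d orcl          <- stop all instances (run once)
$ gsdctl stop                           <- run on every node

Start:
# $ORACLE_HOME/oracm/bin/ocmstart.sh    <- as root, on every node
$ gsdctl start                          <- as oracle, on every node
$ srvctl start database -d orcl         <- as oracle, run once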