I. System environment preparation
[root@RHEL5 ~]# uname -r
2.6.18-92.el5
ASM:
oracleasm-support-2.0.4-1.el5.i386.rpm
oracleasmlib-2.0.3-1.el5.i386.rpm
oracleasm-2.6.18-92.el5-2.0.4-2.el5.i686.rpm
Oracle Clusterware:
10.2.0.1 (can be upgraded to 10.2.0.4)
Oracle Database:
10.2.0.1 (can be upgraded to 10.2.0.4)
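ASM download location:
When downloading, check whether the package is for x86 or x86_64, and make sure it matches your exact kernel version, including the minor release number.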
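A minimal sketch of that version check, assuming the kernel-driver package follows the naming pattern seen above (oracleasm-<kernel release>-<driver version>.<arch>.rpm). The function name asm_pkg_matches_kernel is hypothetical and is not part of any Oracle tool.

```shell
# Hypothetical helper: does an oracleasm kernel-driver RPM file name
# match a given kernel release (as printed by `uname -r`)?
asm_pkg_matches_kernel() {
    kernel=$1   # e.g. 2.6.18-92.el5
    pkg=$2      # e.g. oracleasm-2.6.18-92.el5-2.0.4-2.el5.i686.rpm
    case "$pkg" in
        "oracleasm-$kernel-"*) echo "match" ;;
        *)                     echo "mismatch" ;;
    esac
}
```

On a real node it could be called as asm_pkg_matches_kernel "$(uname -r)" oracleasm-2.6.18-92.el5-2.0.4-2.el5.i686.rpm before running rpm -ivh.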
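Clusterware download location:
RAC01 system installation
Note: when installing the first system, create two hard disks, and allocate all space for the second disk immediately.
For a minimal installation, keep the following package groups: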
#####################################
1 Desktop Environments:
GNOME Desktop Environment
2 Applications:
Editors
3 Development:
Development Libraries
Development Tools
GNOME Software Development
4 Servers:
Windows File Server
5 Base System:
Base
X Window System
6 Cluster Storage:
do not install
7 Clustering:
do not install
8 Virtualization:
do not install
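II. Hardware environment
RAC02
Add a new hard disk -> use an existing virtual disk -> specify the disk file, pointing to the disk preallocated on rac01.
Note: to use a shared disk, edit the file Red Hat Enterprise Linux 5.vmx on both nodes and add the following parameters: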
disk.locking = "false"
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
III. Configure the Linux environment
1. Create users and groups
[root@rac01 ~]# groupadd -g 1001 dba
[root@rac01 ~]# groupadd -g 1002 oinstall
[root@rac01 ~]# useradd -u 1001 -g oinstall -G dba oracle
[root@rac01 ~]# passwd oracle
[root@rac01 ~]# id nobody    # check whether the nobody user exists; create it if it does not
2. Configure the network
Network plan
Hostname    Type     IP address     Registered in
rac01       public   192.168.7.244  /etc/hosts
rac02       public   192.168.7.225  /etc/hosts
rac01-vip   virtual  192.168.7.220  /etc/hosts
rac02-vip   virtual  192.168.7.226  /etc/hosts
rac01-priv  private  10.10.10.1     /etc/hosts
rac02-priv  private  10.10.10.2     /etc/hosts
Configure hostname resolution
vi /etc/hosts
192.168.7.244 rac01
192.168.7.220 rac01-vip
10.10.10.1 rac01-priv
192.168.7.225 rac02
192.168.7.226 rac02-vip
10.10.10.2 rac02-priv
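Important: make sure the RAC node names do not appear on the loopback address line.
Also pay attention to the case of the hostnames.
The loopback entry in /etc/hosts must not contain a node name (rac01 or rac02). If the machine name appears in the loopback entry, like this:
127.0.0.1 rac01 localhost.localdomain localhost
remove it so that the entry reads:
127.0.0.1 localhost.localdomain localhost
If a RAC node name appears on the loopback address, you will receive one of the following errors during the RAC installation:
ORA-00603: ORACLE server session terminated by fatal error
or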
ORA-29702: error occurred in Cluster Group Service operation
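The loopback rule can be checked mechanically before starting the installer. This is only a sketch: loopback_has_node is a hypothetical helper name, and it only inspects lines whose first field is exactly 127.0.0.1.

```shell
# Hypothetical helper: print "bad" if node name $2 appears on a
# 127.0.0.1 line of the hosts file $1, otherwise print "ok".
loopback_has_node() {
    awk -v node="$2" '
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break                       # stop at a trailing comment
                if (tolower($i) == tolower(node)) found = 1  # hostnames compared case-insensitively
            }
        }
        END { print (found ? "bad" : "ok") }
    ' "$1"
}
```

For example, loopback_has_node /etc/hosts rac01 could be run on rac01, and the same with rac02 on the second node.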
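3. Synchronize time between the two servers
rac01
[root@rac01 ~]# vi /etc/ntp.conf
Add:
restrict 192.168.7.0 mask 255.255.255.0 nomodify notrap
Start the service:
[root@rac01 ~]# service ntpd start
rac02 (the cron entry below runs at minute 5 of every hour)
[root@rac02 ~]# crontab -e
5 * * * * /usr/sbin/ntpdate 192.168.7.244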
4. Configure SSH
Generate a key pair on both servers, as the oracle user:
su - oracle
ssh-keygen -t rsa    # generate the key pair; use an empty passphrase
Run the following on one server only, as oracle from the home directory:
[oracle@rac01 ~]$ ssh 192.168.7.244 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
[oracle@rac01 ~]$ ssh 192.168.7.225 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
[oracle@rac01 ~]$ scp authorized_keys 192.168.7.225:/home/oracle/.ssh/
[oracle@rac01 ~]$ scp authorized_keys 192.168.7.244:/home/oracle/.ssh/
Run the following on both servers:
[oracle@rac01/02 ~]$ chmod 600 /home/oracle/.ssh/authorized_keys
Verify:
[oracle@rac01/02 ~]$ ssh 192.168.7.225 date
[oracle@rac01/02 ~]$ ssh 192.168.7.244 date
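The verification can be scripted so that a password prompt counts as a failure instead of hanging. A sketch under assumptions: check_ssh_equivalence and the SSH_BIN variable are hypothetical names; BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# SSH_BIN is an assumption: it lets the ssh client be swapped out (for testing).
SSH_BIN=${SSH_BIN:-ssh}

# Hypothetical helper: try a non-interactive login to every host given.
check_ssh_equivalence() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail rather than ask for a password
        if $SSH_BIN -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}
```

Run as the oracle user on each node, for example check_ssh_equivalence 192.168.7.244 192.168.7.225. A host only passes once its key has already been accepted into known_hosts.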
5. Install the required RPM packages
compat-db-4.2.52-5.1.i386.rpm
compat-gcc-34-3.4.6-4.i386.rpm
compat-gcc-34-c++-3.4.6-4.i386.rpm
compat-libgcc-296-2.96-138.i386.rpm
compat-libstdc++-296-2.96-138.i386.rpm
compat-libstdc++-33-3.2.3-61.i386.rpm
fontconfig-devel-2.4.1-7.el5.i386.rpm
freetype-devel-2.2.1-19.el5.i386.rpm
install10g_for_RHEL5.sh
libaio-0.3.106-3.2.i386.rpm
libaio-devel-0.3.106-3.2.i386.rpm
libXp-1.0.0-8.1.el5.i386.rpm
openmotif22-2.2.3-18.i386.rpm
openmotif-2.3.0-0.5.el5.i386.rpm
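To see which of these packages are still missing on a node, something like the following could be used. It is a sketch: missing_packages and PKG_QUERY are hypothetical names, and the default query command rpm -q is given package names without the version suffix.

```shell
# PKG_QUERY is an assumption: the command that tests whether a package is installed.
PKG_QUERY=${PKG_QUERY:-"rpm -q"}

# Hypothetical helper: print every package name that the query reports as absent.
missing_packages() {
    for pkg in "$@"; do
        # $PKG_QUERY is deliberately unquoted so that "rpm -q" splits into two words
        if ! $PKG_QUERY "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}
```

For example: missing_packages compat-db compat-gcc-34 compat-gcc-34-c++ libaio libaio-devel libXp openmotif22. An empty result means all of the listed packages are installed.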
6. Configure kernel parameters
Edit /etc/sysctl.conf and add:
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 2097152
kernel.shmmax = 2147483648
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
Apply the parameters: sysctl -p
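A typo in a key name is easy to miss in this file, so a quick check of the key names can help. This is a sketch only: missing_sysctl_keys and REQUIRED_KEYS are hypothetical names, the key list is simply the one from this step, and the check looks at names, not values.

```shell
# Key names this guide expects in /etc/sysctl.conf (taken from the list above).
REQUIRED_KEYS="kernel.sem kernel.shmmni kernel.shmall kernel.shmmax
net.ipv4.ip_local_port_range net.core.rmem_default net.core.rmem_max
net.core.wmem_default net.core.wmem_max"

# Hypothetical helper: print each expected key that file $1 does not define.
missing_sysctl_keys() {
    for key in $REQUIRED_KEYS; do
        awk -F= -v k="$key" '
            { name = $1; gsub(/[ \t]/, "", name); if (name == k) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$1" || echo "$key"
    done
}
```

Usage: missing_sysctl_keys /etc/sysctl.conf. No output means every expected key name is present.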
7. Set environment variables
rac01
[oracle@rac01 ~]$ cat >> /home/oracle/.bashrc <<'EOF'
export ORACLE_BASE=/orac/orahome
export ORACLE_HOME=/orac/orahome/product/10g
export ORACLE_SID=orcl1 #####
export NLS_LANG=american_america.ZHS16GBK
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib:$LD_LIBRARY_PATH
export PATH=$PATH:/usr/sbin:/etc:$ORACLE_HOME/bin
export LANG=en_US
export THREADS_FLAG=native
EOF
rac02
[oracle@rac02 ~]$ cat >> /home/oracle/.bashrc <<'EOF'
export ORACLE_BASE=/orac/orahome
export ORACLE_HOME=/orac/orahome/product/10g
export ORACLE_SID=orcl2 #####
export NLS_LANG=american_america.ZHS16GBK
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib:$LD_LIBRARY_PATH
export PATH=$PATH:/usr/sbin:/etc:$ORACLE_HOME/bin
export LANG=en_US
export THREADS_FLAG=native
EOF
8. Configure shell limits for the oracle user
[root@rac01/02 ~]# cat >> /etc/security/limits.conf <<EOF
#For Oracle
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
oracle soft memlock 3145728
oracle hard memlock 3145728
EOF
These limits only take effect when the pam_limits module is enabled:
[root@rac01/02 ~]# cat >> /etc/pam.d/login <<EOF
#For Oracle
session required /lib/security/pam_limits.so
EOF
9. Configure the hangcheck timer
[root@rac01/02 ~]# cat >> /etc/rc.d/rc.local <<EOF
modprobe hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
EOF
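As I understand the hangcheck-timer module, the node is reset when a hang lasts longer than hangcheck_tick plus hangcheck_margin seconds. A small worked calculation for the values used here (the function name is hypothetical):

```shell
# Hypothetical helper: longest hang (in seconds) tolerated before a node reset,
# assuming the threshold is hangcheck_tick + hangcheck_margin.
hangcheck_reset_threshold() {
    echo $(( $1 + $2 ))
}

hangcheck_reset_threshold 30 180   # prints 210
```

With tick=30 and margin=180 the node would therefore be reset after a hang of more than 210 seconds.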
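IV. Configure disks (2 raw devices, 4 ASM disks)
1. Create directories
[root@rac01 ~]# mkdir -p /orac/orahome    # Oracle software installation directory
[root@rac01 ~]# mkdir -p /orac/crs    # Clusterware (CRS) directory
Note: the following two files do not need to be created. Once the raw devices exist, simply link them to these names.
/orac/crs/ocr.crs (cluster registry): the OCR needs at least about 100 MB
/orac/crs/vote.crs (voting disk): the voting disk needs at least about 20 MB
[root@rac01 ~]# chown -R oracle:oinstall /orac
[root@rac01 ~]# chmod -R 775 /orac
Partition the shared disk with fdisk. This only needs to be done on one node. Afterwards, reboot both nodes and check that the newly created partitions are visible on each of them.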
Device Boot Start End Blocks Id System
/dev/sdc1 1 25 200781 83 Linux
/dev/sdc2 26 50 200812+ 83 Linux
/dev/sdc3 51 318 2152710 83 Linux
/dev/sdc4 319 913 4779337+ 5 Extended
/dev/sdc5 319 586 2152678+ 83 Linux
/dev/sdc6 587 854 2152678+ 83 Linux
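To compare the partitions against the OCR and voting disk size requirements, the Blocks column can be converted to MB. A sketch, assuming fdisk reports blocks in 1 KiB units; blocks_to_mb and size_ok are hypothetical helper names.

```shell
# Hypothetical helper: convert an fdisk "Blocks" value (1 KiB units) to whole MB.
blocks_to_mb() {
    blocks=${1%+}              # strip the trailing "+" fdisk prints for odd sizes
    echo $(( blocks / 1024 ))
}

# Hypothetical helper: is a partition of $1 blocks at least $2 MB?
size_ok() {
    if [ "$(blocks_to_mb "$1")" -ge "$2" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}
```

For the table above, blocks_to_mb 200781 gives 196, so /dev/sdc1 and /dev/sdc2 are both large enough for the OCR (about 100 MB) and the voting disk (about 20 MB).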
2. Configure raw devices (LVM + raw cannot be used)
[root@rac01 ~]# raw /dev/raw/raw1 /dev/sdc1
[root@rac01 ~]# raw /dev/raw/raw2 /dev/sdc2
To remove a raw device binding, rebind the raw device to 0 0:
[root@rac01 ~]# raw /dev/raw/raw1 0 0
Set ownership and permissions on the raw devices:
[root@rac01 ~]# chown oracle:oinstall /dev/raw/raw[1-2]
[root@rac01 ~]# chmod 660 /dev/raw/raw[1-2]
Create the links as the oracle user:
[oracle@rac01 ~]$ ln -s /dev/raw/raw1 /orac/crs/ocr.crs
[oracle@rac01 ~]$ ln -s /dev/raw/raw2 /orac/crs/vote.crs
Bind the raw devices automatically at boot:
cat >> /etc/rc.d/rc.local <<EOF
raw /dev/raw/raw1 /dev/sdc1
raw /dev/raw/raw2 /dev/sdc2
chown oracle:oinstall /dev/raw/raw[1-2]
chmod 660 /dev/raw/raw[1-2]
EOF
3. Configure ASM devices
[root@rac01 ~]# rpm -ivh oracleasm-support-2.0.4-1.el5.i386.rpm
[root@rac01 ~]# rpm -ivh oracleasm-2.6.18-92.el5-2.0.4-2.el5.i686.rpm
[root@rac01 ~]# rpm -ivh oracleasmlib-2.0.3-1.el5.i386.rpm
Configure ASMLib
(run the following on both nodes)
[root@rac01 ~]# /etc/init.d/oracleasm configure
Default user to own the driver interface [oracle]: oracle
Default group to own the driver interface [oinstall]: oinstall
Start Oracle ASM library driver on boot (y/n) [y]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: [ OK ]
Loading module "oracleasm": [ OK ]
Mounting ASMlib driver filesystem: [ OK ]
Scanning system for ASM disks: [ OK ]
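The steps above load the oracleasm driver module and mount the ASM filesystem. ASMLib can also be unloaded and loaded manually with the following commands:
[root@rac01 ~]# /etc/init.d/oracleasm disable    # on error, check /var/log/messages to see which kernel version the driver needs to be updated for
[root@rac01 ~]# /etc/init.d/oracleasm enable
Add init links so that ASMLib is loaded automatically at system startup:
[oracle@rac01 ~]$ su - root
[root@rac01 ~]# cd /etc/rc3.d
[root@rac01 ~]# ln -s ../init.d/oracleasm S99oracleasm
[root@rac01 ~]# ln -s ../init.d/oracleasm K01oracleasm
Reboot the system and confirm that ASMLib is loaded automatically:
[root@rac01 ~]# lsmod | grep oracleasm
The following only needs to be done on one node.
Create the ASM disks (note: createdisk works on partitions, not whole disks, so the disk must be partitioned first):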
[root@rac01 ~]# /etc/init.d/oracleasm createdisk VOL1 /dev/sdc5
[root@rac01 ~]# /etc/init.d/oracleasm createdisk VOL2 /dev/sdc6
[root@rac01 ~]# /etc/init.d/oracleasm createdisk VOL3 /dev/sdc7
Run the following on both nodes:
[root@rac01 ~]# /etc/init.d/oracleasm scandisks
[root@rac01 ~]# /etc/init.d/oracleasm listdisks    # list the ASM disks
VOL1
VOL2
VOL3
VOL4
To delete an ASM disk, use the following command:
[root@rac01 ~]# /etc/init.d/oracleasm deletedisk VOL4
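Note:
In a RAC environment, if an ASM disk is added on one node, scandisks must be run on the other nodes to pick up the change:
[root@rac01 ~]# /etc/init.d/oracleasm scandisks
The physical basis for creating the ASM instance is now in place. Next comes the software installation.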
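V. Install Clusterware
Note: time synchronization is critical, otherwise the installation fails. Because of VMware clock drift, when the installer reached the step that copies files to rac02 I had to synchronize the time every 2 seconds, or the installation failed.
Unless stated otherwise, all operations in this section are performed as the oracle user.
1. Check the installation environment
Before installing CRS, it is advisable to check the pre-installation environment with CVU (Cluster Verification Utility). This requires cvuqdisk-1.0.1-1.rpm, found under clusterware/rpm/, to be installed first.
[oracle@rac01 cluvfy]$ /opt/clusterware/cluvfy/runcluvfy.sh stage -pre crsinst -n rac01,rac02 -verbose
Several of the errors it reports can be ignored:
. An error about finding a suitable set of interfaces for the VIPs:
ERROR:
Could not find a suitable set of interfaces for VIPs.
This is a bug described in detail in Metalink doc ID 338924.1. As stated there, the error can be ignored.
. A number of package checks fail, reporting that a package is either missing or the wrong version. For example: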
Check: Package existence for "compat-gcc-7.3-2.96.128"
Node Name Status Comment
------------------------------ ------------------------------ ----------------
node2 missing failed
node1 missing failed
Result: Package existence check failed for "compat-gcc-7.3-2.96.128".
Check: Package existence for "compat-gcc-c++-7.3-2.96.128"
Node Name Status Comment
------------------------------ ------------------------------ ----------------
node2 missing failed
node1 missing failed
Result: Package existence check failed for "compat-gcc-c++-7.3-2.96.128".
...........
...........
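and so on.
These can be ignored. This is also a bug; just make sure the correct versions of the compat-* packages are installed on all nodes.
. The memory check fails
This error can be ignored. Being slightly short of memory does not prevent the CRS installation; it only makes it slower.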
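2. Start the CRS installation
If you install with a Chinese locale, the garbled characters in the installer mainly come from the JRE bundled with Oracle, so first install your own Java JRE and then run:
./runInstaller -jreLoc /usr/java/jdk1.5.0_15/jre
A RAC installation has no strict master/slave relationship. The node on which the installer is run is usually regarded as the master, although that is not strictly true: the installer automatically copies the files to the other nodes.
Here we run the installation from rac01.
About disk redundancy
Specify whether the OCR uses external redundancy or internal (normal) redundancy. If the storage is already on RAID, internal redundancy is not needed; choose external redundancy.
Specify the voting disk (the voting disk is really just a file), again with external redundancy.
3. If you run into this error: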
/orac/jdk/jre//bin/java: error while loading shared libraries: libpthread.so.0: cannot open shared object file: No such file or directory
It can be resolved as follows:
===============================
Edit the vipca file (on both nodes)
[root@rac02 opt]# vi /opt/ora10g/product/10.2.0/crs_1/bin/vipca
Find the following:
#Remove this workaround when the bug 3937317 is fixed
arch=`uname -m`
if [ "$arch" = "i686" -o "$arch" = "ia64" ]
then
LD_ASSUME_KERNEL=2.4.19
export LD_ASSUME_KERNEL
fi
#End workaround
Add a new line after fi:
unset LD_ASSUME_KERNEL
Then edit the srvctl file as well (on both nodes)
[root@rac02 opt]# vi /opt/ora10g/product/10.2.0/crs_1/bin/srvctl
Find the following:
LD_ASSUME_KERNEL=2.4.19
export LD_ASSUME_KERNEL
Likewise, add a new line after it:
unset LD_ASSUME_KERNEL
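Save and exit, then re-run root.sh on rac02.
Once it succeeds, return to the graphical installer and click OK; the environment configuration then begins.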
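The same edit can be made without opening an editor. The following is a sketch rather than the exact manual steps: patch_ld_assume_kernel is a hypothetical helper that prints a patched copy with unset LD_ASSUME_KERNEL added directly after each export LD_ASSUME_KERNEL line. For vipca this places the unset inside the if block instead of after fi, which should have the same effect, since the variable is cleared either way.

```shell
# Hypothetical helper: print file $1 with "unset LD_ASSUME_KERNEL" added
# after every "export LD_ASSUME_KERNEL" line. An already patched file is
# printed unchanged, so the helper can be re-run safely.
patch_ld_assume_kernel() {
    if grep -q 'unset LD_ASSUME_KERNEL' "$1"; then
        cat "$1"
        return 0
    fi
    awk '
        { print }
        /^[ \t]*export[ \t]+LD_ASSUME_KERNEL[ \t]*$/ { print "unset LD_ASSUME_KERNEL" }
    ' "$1"
}
```

A cautious way to use it is to write the output to a new file (for example patch_ld_assume_kernel srvctl > srvctl.new), compare the two with diff, and replace the original only after keeping a backup and restoring the execute permission.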
4. The last step, Oracle Cluster Verification Utility, may fail:
Checking existence of VIP node application (required)
Check failed.
Check failed on nodes:
rac2,rac1
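It can be resolved as follows:
===============================
Reconfigure the VIPs manually
As root, on rac01 only (a graphical session is required):
[root@rac01 bin]# /orac/crs/bin/vipca
Next
Select eth0
Enter the VIP addresses of rac01 and rac02, for example:
rac01 rac01-vip 192.168.7.220 255.255.255.0
rac02 rac02-vip 192.168.7.226 255.255.255.0
Next
vipca starts the automatic configuration.
If starting the VIP fails, check the log /orac/crs/log/rac01/racg/ora.rac01.vip.log:
Default gateway is not defined
Set the default gateway and retry; it then succeeds.
If starting ONS fails, check the log /orac/crs/log/rac01/racg/ora.rac01.ons.log:
Failed to get IP for localhost
vi /etc/hosts
Add: 127.0.0.1 rac01 localhost.localdomain localhost
Retry; it then succeeds.
Once configured, there is a virtual interface eth0:1 whose IP address is the VIP.
Re-run the Oracle Cluster Verification Utility; it now passes.
This completes the Clusterware installation.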
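VI. Install the Database software
1. When selecting the installation nodes, select all nodes.
2. When choosing the configuration option, install the database software only; the database itself can be configured later.
Choose: install database software only
3. Again, watch out for time synchronization.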
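VII. Create the database and the ASM instances
1. Run DBCA as the oracle user
[oracle@rac01 ~]$ dbca
2. Choose to create a RAC database
3. Choose the operation: create a database
4. Select all nodes
5. Choose a custom database rather than a template: custom database
6. Specify the database identifier: racdb.
There are two fields here: the global name and a SID prefix. Note that it is a prefix; Oracle then assigns a SID to each node automatically.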
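To make the prefix behaviour concrete: DBCA appends the instance number to the SID prefix, so a two-node cluster with prefix racdb gets racdb1 and racdb2. A sketch of that naming rule (instance_sids is a hypothetical helper, not an Oracle tool):

```shell
# Hypothetical helper: list the instance SIDs derived from a SID prefix
# ($1) for a cluster of $2 nodes, numbering instances from 1.
instance_sids() {
    prefix=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "${prefix}${i}"
        i=$((i + 1))
    done
}
```

The ORACLE_SID exported in each node's .bashrc needs to match the SID that ends up on that node.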
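7. Choose whether to enable EM: do not enable it
8. Set passwords for the administrator accounts
9. Choose the storage: ASM
10. You then need to set some ASM parameters in order to create the ASM instances
a) Set the password of the ASM instance's sys user and choose the type of initialization parameter file
For the initialization parameter file choose: IFILE
b) Click Next to start creating the ASM instances
No listener was detected; create one automatically? Answer YES.
c) Select the disk groups available to ASM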
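I was unclear about how the RAC virtual IP address works.
Why does using a virtual IP address make failover transparent?
Because it is only a virtual IP, when a node goes down CRS can bind that IP to a network interface on another node, which is transparent to users.
In RAC the VIP is a resource. If clients connect to the physical IP and a node fails, client connections wait until the OS timeout before they learn that the node is no longer available.
A VIP can fail over to another node. When a connection request is sent to that VIP, the VIP immediately returns a NAK, and then ...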
Startup
As root: /etc/init.d/init.crs start
crs_stat
Check the CRS status
/orac/crs/bin/crs_stat -t
/orac/crs/bin/srvctl start asm -n rac02 -i +ASM2
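When checking the status, it can help to list only the resources that are not ONLINE. This is a sketch that assumes the usual tabular layout of crs_stat -t (a header line, a dashed separator, then the columns Name, Type, Target, State, Host); offline_resources is a hypothetical helper and the layout should be checked against real output first.

```shell
# Hypothetical helper: read `crs_stat -t` style output on stdin and print
# the names of resources whose State column (4th field) is not ONLINE.
offline_resources() {
    awk '
        $1 == "Name" || $1 ~ /^-+$/ { next }     # skip header and separator lines
        NF >= 4 && $4 != "ONLINE"   { print $1 }
    '
}
```

Usage: /orac/crs/bin/crs_stat -t | offline_resources. Note that crs_stat -t abbreviates long resource names, so the names printed may be truncated.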
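ORA_CRS_HOME
That is: first unregister the resource with crs_unregister, then export it to a cap file in the appropriate directory with crs_stat -p, and finally register the service again with crs_register. I have used this myself; three steps.
This is how the official Oracle documentation describes it: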
/orac/crs/bin/crsctl get css misscount
/orac/crs/bin/crsctl set css misscount 360
/orac/crs/bin/crsctl set css misscount 120
Oracle Clusterware and its management components:
Oracle Clusterware is designed for, and tightly integrated with, Oracle RAC.
When you create an Oracle RAC database using any of the management tools, the database is registered with and managed by Oracle Clusterware, along with the other Oracle processes such as Virtual Internet Protocol (VIP) address, Global Services Daemon (GSD), the Oracle Notification Service (ONS), and the Oracle Net listeners. These resources are automatically started when Oracle Clusterware starts the node and automatically restarted if they fail. The Oracle Clusterware daemons run on each node.
Oracle Clusterware process components:
Cluster Synchronization Services (CSS)--Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using third-party clusterware, then the css process interfaces with your clusterware to manage node membership information.
Cluster Ready Services (CRS)--The primary program for managing high availability operations within a cluster. Anything that the crs process manages is known as a cluster resource which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on. The crs process manages cluster resources based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. The crs process generates events when a resource status changes. When you have installed Oracle RAC, crs monitors the Oracle instance, Listener, and so on, and automatically restarts these components when a failure occurs. By default, the crs process makes five attempts to restart a resource and then does not make further restart attempts if the resource does not restart.
Event Management (EVM)--A background process that publishes events that crs creates.
Oracle Notification Service (ONS)--A publish and subscribe service for communicating Fast Application Notification (FAN) events.
RACG--Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
Process Monitor Daemon (OPROCD)--This process is locked in memory to monitor the cluster and provide I/O fencing. OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
Oracle Clusterware background processes on Unix systems:
crsd--Performs high availability recovery and management operations such as maintaining the OCR and managing application resources. This process runs as the root user, or by a user in the admin group on Mac OS X-based systems. This process restarts automatically upon failure.
evmd--Event manager daemon. This process also starts the racgevt process to manage FAN server callouts.
ocssd--Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
oprocd--Process monitor for the cluster. Note that this process only appears on platforms that do not use vendor clusterware with Oracle Clusterware.
Cache Fusion in Oracle RAC:
Oracle RAC databases have two or more database instances that each contain memory structures and background processes.
Each instance has a buffer cache in its System Global Area (SGA). Using Cache Fusion, Oracle RAC environments logically combine each instance's buffer cache to enable the instances to process data as if the data resided on a logically combined, single cache.
The SGA size requirements for Oracle RAC are greater than the SGA requirements for single-instance Oracle databases due to Cache Fusion.
The roles of the GCS and GES services and the GRD:
To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances, which effectively increases the size of the SGA for an Oracle RAC instance.
How Cache Fusion is implemented:
After one instance caches data, any other instance within the same cluster database can acquire a block image from another instance in the same database faster than by reading the block from disk. Therefore, Cache Fusion moves current blocks between instances rather than re-reading the blocks from disk. When a consistent block is needed or a changed block is required on another instance, Cache Fusion transfers the block image directly between the affected instances. Oracle RAC uses the private interconnect for interinstance communication and block transfers. The GES Monitor and the Instance Enqueue Process manage access to Cache Fusion resources and enqueue recovery processing.
Background processes that implement RAC:
The Oracle RAC processes and their identifiers are as follows:
LMS--Global Cache Service Process
LMD--Global Enqueue Service Daemon
LMON--Global Enqueue Service Monitor
LCK0--Instance Enqueue Process