Category: Oracle

2007-12-20 21:08:15

Configure RAC Nodes for Remote Access using SSH


  Perform the following configuration procedures on both Oracle RAC nodes in the cluster!

Before you can install Oracle RAC 11g, you must configure secure shell (SSH) for the UNIX user account you plan to use to install Oracle Clusterware 11g and the Oracle Database 11g software. The installation and configuration tasks described in this section will need to be performed on both Oracle RAC nodes. As configured earlier in this article, the software owner for Oracle Clusterware 11g and the Oracle Database 11g software will be "oracle".

The goal here is to set up user equivalence for the oracle UNIX user account. User equivalence enables the oracle UNIX user account to access all other nodes in the cluster (running commands and copying files) without the need for a password. Oracle added support in 10g Release 1 for using the SSH tool suite for setting up user equivalence. Before Oracle Database 10g, user equivalence had to be configured using remote shell (RSH).

  The SSH configuration described in this article uses SSH1. If SSH is not available, then OUI attempts to use rsh and rcp instead. These services, however, are disabled by default on most Linux systems. The use of RSH will not be discussed in this article.

You need either an RSA or a DSA key for the SSH protocol. RSA is used with the SSH 1.5 protocol, while DSA is the default for the SSH 2.0 protocol. With OpenSSH, you can use either RSA or DSA. For the purpose of this article, we will configure SSH using SSH1.

If you have an SSH2 installation, and you cannot use SSH1, then refer to your SSH distribution documentation to configure SSH1 compatibility or to configure SSH2 with DSA. This type of configuration is beyond the scope of this article and will not be discussed.

So, why do we have to set up user equivalence? Oracle Clusterware and the Oracle Database software are installed from only one node in a RAC cluster. When the Oracle Universal Installer (OUI) runs on that node, it uses the ssh and scp commands to run remote commands on, and copy files (the Oracle software) to, all other nodes within the RAC cluster. The oracle UNIX user account on the node running the OUI (runInstaller) must be trusted by all other nodes in your RAC cluster. This means that you must be able to run the secure shell commands (ssh or scp) from the Linux server running the OUI against all other Linux servers in the cluster without being prompted for a password.

  Please note that the use of secure shell is not required for normal RAC operation. This configuration, however, must be enabled for RAC and patchset installations as well as for creating the clustered database.

The methods required for configuring SSH1, an RSA key, and user equivalence are described in the following sections.


Configuring the Secure Shell

To determine if SSH is installed and running, enter the following command:
# pgrep sshd
3797
If SSH is running, then the response to this command is a list of process ID number(s). Run this command on both Oracle RAC nodes in the cluster to verify the SSH daemons are installed and running!

  To find out more about SSH, refer to the man page:
# man ssh


Creating the RSA Keys on Both Oracle RAC Nodes

The first step in configuring SSH is to create an RSA public/private key pair on both Oracle RAC nodes in the cluster. The command to do this will create a public and private key for RSA (for a total of two keys per node). The content of the RSA public keys will then need to be copied into an authorized key file which is then distributed to both Oracle RAC nodes in the cluster.

Use the following steps to create the RSA key pair. Please note that these steps will need to be completed on both Oracle RAC nodes in the cluster:

  1. Log on as the "oracle" UNIX user account.
    # su - oracle

  2. If necessary, create the .ssh directory in the "oracle" user's home directory and set the correct permissions on it:
    $ mkdir -p ~/.ssh
    $ chmod 700 ~/.ssh

  3. Enter the following command to generate an RSA key pair (public and private key) for the SSH protocol:
    $ /usr/bin/ssh-keygen -t rsa
    At the prompts:
    • Accept the default location for the key files.
    • Enter and confirm a pass phrase. This should be different from the "oracle" UNIX user account password; however, this is not a requirement.

    This command will write the public key to the ~/.ssh/id_rsa.pub file and the private key to the ~/.ssh/id_rsa file. Note that you should never distribute the private key to anyone!

  4. Repeat the above steps for each Oracle RAC node in the cluster.

Now that both Oracle RAC nodes contain a public and private key for RSA, you will need to create an authorized key file on one of the nodes. An authorized key file is nothing more than a single file that contains a copy of everyone's (every node's) RSA public key. Once the authorized key file contains all of the public keys, it is then distributed to all other nodes in the cluster.

Complete the following steps on one of the nodes in the cluster to create and then distribute the authorized key file. For the purpose of this article, I am using linux1:

  1. First, determine if an authorized key file already exists on the node (~/.ssh/authorized_keys). In most cases this will not exist since this article assumes you are working with a new install. If the file doesn't exist, create it now:
    $ touch ~/.ssh/authorized_keys
    $ cd ~/.ssh
    $ ls -l *.pub
    -rw-r--r-- 1 oracle oinstall 395 Dec 13 12:32 id_rsa.pub
    NOTE: The listing above should show the id_rsa.pub public key created in the previous section.

  2. In this step, copy the content of the ~/.ssh/id_rsa.pub public key from both Oracle RAC nodes in the cluster to the authorized key file just created (~/.ssh/authorized_keys). The example here uses ssh to append each key directly; SCP (Secure Copy) or SFTP (Secure FTP) can also be used to fetch the files. Again, this will be done from linux1. You will be prompted for the oracle UNIX user account password for both Oracle RAC nodes accessed.

    The following example is being run from linux1 and assumes a two-node cluster, with nodes linux1 and linux2:

    $ ssh linux1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    The authenticity of host 'linux1 (192.168.1.100)' can't be established.
    RSA key fingerprint is 7a:68:1a:ab:7c:58:6f:ac:29:50:dd:22:2f:2e:c3:fd.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'linux1,192.168.1.100' (RSA) to the list of known hosts.
    oracle@linux1's password: xxxxx
    
    $ ssh linux2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    The authenticity of host 'linux2 (192.168.1.101)' can't be established.
    RSA key fingerprint is 16:b8:8f:8b:9b:16:34:3f:be:8d:87:1f:c4:2b:45:51.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'linux2,192.168.1.101' (RSA) to the list of known hosts.
    oracle@linux2's password: xxxxx

      The first time you use SSH to connect to a node from a particular system, you will see a message similar to the following:
    The authenticity of host 'linux1 (192.168.1.100)' can't be established.
    RSA key fingerprint is 7a:68:1a:ab:7c:58:6f:ac:29:50:dd:22:2f:2e:c3:fd.
    Are you sure you want to continue connecting (yes/no)? yes
    Enter yes at the prompt to continue. You should not see this message again when you connect from this system to the same node.

  3. At this point, we have the RSA public key from every node in the cluster in the authorized key file (~/.ssh/authorized_keys) on linux1. We now need to copy it to the remaining nodes in the cluster. In our two-node cluster example, the only remaining node is linux2. Use the scp command to copy the authorized key file to all remaining nodes in the RAC cluster:
    $ scp ~/.ssh/authorized_keys linux2:.ssh/authorized_keys
    oracle@linux2's password: xxxxx
    authorized_keys                             100%  790     0.8KB/s   00:00

  4. Change the permission of the authorized key file for both Oracle RAC nodes in the cluster by logging into the node and running the following:
    $ chmod 600 ~/.ssh/authorized_keys

  5. At this point, if you use ssh to log in to or run a command on another node, you are prompted for the pass phrase that you specified when you created the RSA key. For example, test the following from linux1:
    $ ssh linux1 hostname
    Enter passphrase for key '/home/oracle/.ssh/id_rsa': xxxxx
    linux1
    
    $ ssh linux2 hostname
    Enter passphrase for key '/home/oracle/.ssh/id_rsa': xxxxx
    linux2

      If you see any other messages or text, apart from the host name, then the Oracle installation can fail. Make any changes required to ensure that only the host name is displayed when you enter these commands. You should ensure that any parts of login scripts that generate output, or ask questions, are modified so that they act only when the shell is an interactive shell.
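
The collection step above amounts to concatenating each node's public key into one file and restricting that file's permissions. A minimal sketch, assuming the public keys have already been fetched into local files; the function name build_authorized_keys and the file names in the usage line are hypothetical and are not part of the original procedure:

```shell
# Sketch: merge public key files into one authorized_keys file.
# build_authorized_keys is a hypothetical helper, not an OpenSSH tool.
build_authorized_keys() {
    bak_out=$1
    shift
    # Create the file if needed and restrict it to the owner before writing.
    touch "$bak_out" && chmod 600 "$bak_out" || return 1
    for bak_pub in "$@"; do
        # The "|| [ -n ... ]" part handles a last line without a newline.
        while IFS= read -r bak_line || [ -n "$bak_line" ]; do
            [ -n "$bak_line" ] || continue
            # Append only keys that are not already present, so reruns
            # do not create duplicate entries.
            grep -qxF -- "$bak_line" "$bak_out" ||
                printf '%s\n' "$bak_line" >> "$bak_out"
        done < "$bak_pub"
    done
}
```

For example, build_authorized_keys ~/.ssh/authorized_keys linux1.pub linux2.pub would produce a file equivalent to the one created by the two ssh commands in step 2, already set to mode 600.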


Enabling SSH User Equivalency for the Current Shell Session

The OUI needs to run the secure shell tool commands (ssh and scp) without being prompted for a pass phrase. Even though SSH is configured on both Oracle RAC nodes in the cluster, using the secure shell tool commands will still prompt for a pass phrase. Before running the OUI, you need to enable user equivalence for the terminal session you plan to run the OUI from. For the purpose of this article, all Oracle installations will be performed from linux1.

User equivalence will need to be enabled on any new terminal shell session before attempting to run the OUI. If you log out and log back in to the node you will be performing the Oracle installation from, you must enable user equivalence for the terminal shell session as this is not done by default.

To enable user equivalence for the current terminal shell session, perform the following steps:

  1. Log on to the node from which you want to run the OUI (linux1) as the "oracle" UNIX user account.
    # su - oracle

  2. Enter the following commands:
    $ exec /usr/bin/ssh-agent $SHELL
    $ /usr/bin/ssh-add
    Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
    Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
    At the prompts, enter the pass phrase for each key that you generated.

  3. If SSH is configured correctly, you will be able to use the ssh and scp commands without being prompted for a password or pass phrase from this terminal session:
    $ ssh linux1 "date;hostname"
    Thu Dec 13 13:45:40 EST 2007
    linux1
    
    $ ssh linux2 "date;hostname"
    Thu Dec 13 13:46:13 EST 2007
    linux2

      The commands above should display the date set on each Oracle RAC node along with its hostname. If any of the nodes prompt for a password or pass phrase then verify that the ~/.ssh/authorized_keys file on that node contains the correct public keys.

    Also, if you see any other messages or text, apart from the date and hostname, then the Oracle installation can fail. Make any changes required to ensure that only the date and hostname are displayed when you enter these commands. You should ensure that any parts of login scripts that generate output, or ask questions, are modified so that they act only when the shell is an interactive shell.

  4. The Oracle Universal Installer is a graphical application and requires the use of an X Server. From the terminal session enabled for user equivalence (the node you will be performing the Oracle installations from), set the environment variable DISPLAY to a valid X Windows display:

    Bourne, Korn, and Bash shells:

    $ DISPLAY=:0
    $ export DISPLAY
    C shell:
    $ setenv DISPLAY :0
    After setting the DISPLAY variable to a valid X Windows display, you should perform another test of the current terminal session to ensure that X11 forwarding is not enabled:
    $ ssh linux1 hostname
    linux1
    
    $ ssh linux2 hostname
    linux2

      If you are using a remote client to connect to the node performing the installation, and you see a message similar to: "Warning: No xauth data; using fake authentication data for X11 forwarding." then this means that your authorized keys file is configured correctly; however, your SSH configuration has X11 forwarding enabled. For example:
    $ export DISPLAY=melody:0
    $ ssh linux2 hostname
    Warning: No xauth data; using fake authentication data for X11 forwarding.
    linux2
    Note that having X11 Forwarding enabled will cause the Oracle installation to fail. To correct this problem, create a user-level SSH client configuration file for the "oracle" UNIX user account that disables X11 Forwarding:

    • Using a text editor, edit or create the file ~/.ssh/config
    • Make sure that the ForwardX11 attribute is set to no. For example, insert the following into the ~/.ssh/config file:
      Host *
              ForwardX11 no

  5. You must run the Oracle Universal Installer from this terminal session or remember to repeat the steps to enable user equivalence (steps 2, 3, and 4 from this section) before you start the Oracle Universal Installer from a different terminal session.
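
The manual checks in steps 3 and 4 can be repeated for every node in one pass. The following is a rough sketch rather than part of the original procedure: check_equivalence is a hypothetical helper, and SSH_CMD is a hypothetical override that exists only so the function can be exercised without a real cluster. BatchMode is a standard OpenSSH client option that makes ssh fail instead of prompting.

```shell
# Sketch: confirm that ssh to each node works without a prompt and
# prints nothing except the hostname.
# check_equivalence and SSH_CMD are hypothetical names.
check_equivalence() {
    ce_rc=0
    for ce_node in "$@"; do
        # stderr is captured too, so warnings such as the X11 forwarding
        # message count as extra output.
        if ! ce_out=$(${SSH_CMD:-ssh} -o BatchMode=yes "$ce_node" hostname 2>&1); then
            echo "$ce_node: FAILED (ssh could not log in without a prompt)"
            ce_rc=1
        elif [ "$(printf '%s\n' "$ce_out" | wc -l)" -ne 1 ]; then
            echo "$ce_node: FAILED (extra output from ssh or login scripts)"
            ce_rc=1
        else
            echo "$ce_node: OK ($ce_out)"
        fi
    done
    return "$ce_rc"
}
```

Run from the terminal session prepared above, check_equivalence linux1 linux2 should report OK for both nodes. A FAILED line points back to the authorized_keys, login script, or ForwardX11 checks described in this section.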


Remove any stty Commands

When installing the Oracle software, any hidden files on the system (e.g. .bashrc, .cshrc, .profile) will cause the installation process to fail if they contain stty commands.

To avoid this problem, you must modify these files so that stty runs only when standard input is a terminal, which keeps it from writing errors to STDERR during a non-interactive remote shell session, as in the following examples:

  • Bourne, Bash, or Korn shell:
    if [ -t 0 ]; then
        stty intr ^C
    fi

  • C shell:
    test -t 0
    if ($status == 0) then
        stty intr ^C
    endif

  If there are hidden files that contain stty commands that are loaded by the remote shell, then OUI indicates an error and stops the installation.
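
To find the lines that need this treatment, the startup files can be searched for stty. A small sketch, assuming the usual startup file names; find_stty is a hypothetical helper, and it only lists candidates for review without checking whether they are already guarded:

```shell
# Sketch: list stty commands in shell startup files so that each one can
# be reviewed and wrapped in a terminal test.
# find_stty is a hypothetical helper name.
find_stty() {
    for fs_file in "$@"; do
        [ -f "$fs_file" ] || continue
        # -n adds line numbers, -w matches stty as a whole word;
        # the second grep drops lines that are only comments.
        grep -nw stty "$fs_file" |
            grep -v '^[0-9]*:[[:space:]]*#' |
            while IFS= read -r fs_line; do
                printf '%s:%s\n' "$fs_file" "$fs_line"
            done
    done
}
```

For example, find_stty ~/.bashrc ~/.bash_profile ~/.profile ~/.cshrc prints one file:line:text entry per stty command found.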



All Startup Commands for Both Oracle RAC Nodes


  Verify that the following startup commands are included on both of the Oracle RAC nodes in the cluster!

Up to this point, we have talked in great detail about the parameters and resources that need to be configured on both nodes in the Oracle RAC 11g configuration. This section will review those parameters, commands, and entries (in previous sections of this document) that need to occur on both Oracle RAC nodes when the machine is booted.

In this section, I provide all of the commands, parameters, and entries that have been discussed so far that will need to be included in the startup scripts for each Linux node in the RAC cluster. Each of the startup files below shows the entries that should be included in order to provide a successful RAC node.


/etc/sysctl.conf

We wanted to adjust the default and maximum send buffer size as well as the default and maximum receive buffer size for the interconnect. This file also contains those parameters responsible for configuring shared memory, semaphores, file handles, and local IP range used by the Oracle instance.

/etc/sysctl.conf
# Kernel sysctl configuration file for Oracle Enterprise Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Controls the default maximum size of a message queue, in bytes
kernel.msgmnb = 65536

# Controls the maximum size of a single message, in bytes
kernel.msgmax = 65536


# +---------------------------------------------------------+
# | ADJUSTING NETWORK SETTINGS                              |
# +---------------------------------------------------------+
# | With Oracle 9.2.0.1 and onwards, Oracle now makes use   |
# | of UDP as the default protocol on Linux for             |
# | inter-process communication (IPC), such as Cache Fusion |
# | and Cluster Manager buffer transfers between instances  |
# | within the RAC cluster. Oracle strongly suggests to     |
# | adjust the default and maximum receive buffer size      |
# | (SO_RCVBUF socket option) to 4 MB, and the default and  |
# | maximum send buffer size (SO_SNDBUF socket option) to   |
# | 256 KB. The receive buffers are used by TCP and UDP to  |
# | hold received data until it is read by the application. |
# | With TCP, the receive buffer cannot overflow because    |
# | the peer may not send data beyond the window size.      |
# | With UDP, datagrams are discarded if they do not fit    |
# | in the socket receive buffer, so a fast sender can      |
# | overwhelm the receiver.                                 |
# +---------------------------------------------------------+

# +---------------------------------------------------------+
# | Default setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option.  |
# +---------------------------------------------------------+
net.core.rmem_default=4194304

# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option.  |
# +---------------------------------------------------------+
net.core.rmem_max=4194304

# +---------------------------------------------------------+
# | Default setting in bytes of the socket "send" buffer    |
# | which may be set by using the SO_SNDBUF socket option.  |
# +---------------------------------------------------------+
net.core.wmem_default=262144

# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "send" buffer    |
# | which may be set by using the SO_SNDBUF socket option.  |
# +---------------------------------------------------------+
net.core.wmem_max=262144


# +---------------------------------------------------------+
# | ADJUSTING ADDITIONAL KERNEL PARAMETERS FOR ORACLE       |
# +---------------------------------------------------------+
# | Configure the kernel parameters for all Oracle Linux    |
# | servers by setting shared memory and semaphores,        |
# | setting the maximum amount of file handles, and setting |
# | the IP local port range.                                |
# +---------------------------------------------------------+

# +---------------------------------------------------------+
# | SHARED MEMORY                                           |
# +---------------------------------------------------------+
kernel.shmmax=1073741823

# +---------------------------------------------------------+
# | SEMAPHORES                                              |
# | ----------                                              |
# |                                                         |
# | SEMMSL_value  SEMMNS_value  SEMOPM_value  SEMMNI_value  |
# |                                                         |
# +---------------------------------------------------------+
kernel.sem=250 32000 100 128

# +---------------------------------------------------------+
# | FILE HANDLES                                            |
# +---------------------------------------------------------+
fs.file-max=65536

# +---------------------------------------------------------+
# | LOCAL IP RANGE                                          |
# +---------------------------------------------------------+
net.ipv4.ip_local_port_range=1024 65000

  Verify that each of the required kernel parameters (above) is configured in the /etc/sysctl.conf file. Then, ensure that each of these parameters is truly in effect by running the following command on both Oracle RAC nodes in the cluster:
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
net.core.rmem_default = 4194304
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 262144
kernel.shmmax = 1073741823
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
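
Comparing the required settings with the running values by eye is error prone, so the comparison can be scripted. A minimal sketch, assuming the current values have been saved to a file first (for example with sysctl -a); normalize_sysctl and check_sysctl are hypothetical helper names:

```shell
# Sketch: report required kernel parameters that are not in effect.
# normalize_sysctl and check_sysctl are hypothetical helper names.

# Print "key=value" lines with comments, blank lines, and spacing removed.
normalize_sysctl() {
    sed -e 's/#.*//' "$1" | awk '
        index($0, "=") == 0 { next }
        {
            key = substr($0, 1, index($0, "=") - 1)
            val = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            gsub(/[ \t]+/, " ", val)   # multi-value settings may be tab-separated
            print key "=" val
        }'
}

# $1 = file of required settings, $2 = file of current settings
check_sysctl() {
    cs_want=$(mktemp) && cs_have=$(mktemp) || return 2
    normalize_sysctl "$1" | sort > "$cs_want"
    normalize_sysctl "$2" | sort > "$cs_have"
    # comm -23 keeps lines that appear only in the first file.
    cs_missing=$(comm -23 "$cs_want" "$cs_have")
    rm -f "$cs_want" "$cs_have"
    if [ -n "$cs_missing" ]; then
        printf 'Not in effect:\n%s\n' "$cs_missing"
        return 1
    fi
    echo "All required settings are in effect"
}
```

As root, something like sysctl -a > /tmp/sysctl.live followed by check_sysctl /etc/sysctl.conf /tmp/sysctl.live would list any entry from the file whose running value differs.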


/etc/hosts

All machine/IP entries for nodes in the RAC cluster.

/etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.

127.0.0.1        localhost.localdomain   localhost

# Public Network - (eth0)
192.168.1.100    linux1
192.168.1.101    linux2

# Private Interconnect - (eth1)
192.168.2.100    linux1-priv
192.168.2.101    linux2-priv

# Public Virtual IP (VIP) addresses - (eth0)
192.168.1.200    linux1-vip
192.168.1.201    linux2-vip

# Private Storage Network for Openfiler - (eth1)
192.168.1.195    openfiler1
192.168.2.195    openfiler1-priv

192.168.1.106    melody
192.168.1.102    alex
192.168.1.105    bartman
192.168.1.120    cartman
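
Because the same entries must exist on both nodes, it can help to confirm that every cluster name is present in the file. A short sketch under that assumption; check_hosts is a hypothetical helper that takes the hosts file followed by the names to look for:

```shell
# Sketch: confirm that every expected host name appears in a hosts file.
# check_hosts is a hypothetical helper name.
check_hosts() {
    ch_file=$1
    shift
    ch_rc=0
    for ch_name in "$@"; do
        # Strip comments, then look for the name in any field after the
        # address and print the address of the first match.
        ch_ip=$(awk -v n="$ch_name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
        ' "$ch_file")
        if [ -n "$ch_ip" ]; then
            echo "$ch_name -> $ch_ip"
        else
            echo "$ch_name -> MISSING"
            ch_rc=1
        fi
    done
    return "$ch_rc"
}
```

For the file above, check_hosts /etc/hosts linux1 linux2 linux1-priv linux2-priv linux1-vip linux2-vip openfiler1 openfiler1-priv would print one name and address pair per line and return a non-zero status if any name is missing.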


/etc/udev/rules.d/55-openiscsi.rules

/etc/udev/rules.d/55-openiscsi.rules
# /etc/udev/rules.d/55-openiscsi.rules
KERNEL=="sd*", BUS=="scsi", PROGRAM="/etc/udev/scripts/iscsidev.sh %b",SYMLINK+="iscsi/%c/part%n"


/etc/udev/scripts/iscsidev.sh

/etc/udev/scripts/iscsidev.sh
#!/bin/sh

# FILE: /etc/udev/scripts/iscsidev.sh

BUS=${1}
HOST=${BUS%%:*}

[ -e /sys/class/iscsi_host ] || exit 1

file="/sys/class/iscsi_host/host${HOST}/device/session*/iscsi_session*/targetname"

target_name=$(cat ${file})

# This is not an open-iscsi drive
if [ -z "${target_name}" ]; then
   exit 1
fi

echo "${target_name##*.}"
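
The script relies on two shell parameter expansions that are easy to misread. The sketch below applies them to sample values; both the bus ID and the target name are made-up inputs chosen for illustration, not values taken from a running system:

```shell
# Sketch: the two parameter expansions used by iscsidev.sh, applied to
# sample values (both input strings below are assumptions).
BUS="3:0:0:0"                 # sample bus ID of the kind udev passes as %b
HOST=${BUS%%:*}               # remove the longest suffix that starts at ":"

target_name="iqn.2006-01.com.openfiler:racdb.crs"   # sample iSCSI target name
short_name=${target_name##*.} # remove the longest prefix that ends at "."

echo "host=${HOST} name=${short_name}"   # prints: host=3 name=crs
```

The value printed as name is what the udev rule substitutes for %c, so with these sample inputs the symbolic link would be created under iscsi/crs/.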



Install and Configure Oracle Cluster File System (OCFS2)


  Most of the configuration procedures in this section should be performed on both Oracle RAC nodes in the cluster! Creating the OCFS2 filesystem, however, should only be executed on one of the nodes in the RAC cluster.


Overview

It is now time to install the Oracle Cluster File System, Release 2 (OCFS2). OCFS2, developed by Oracle Corporation, is a Cluster File System which allows all nodes in a cluster to concurrently access a device via the standard file system interface. This allows for easy management of applications that need to run across a cluster.

OCFS (Release 1) was released in December 2002 to enable Oracle Real Application Cluster (RAC) users to run the clustered database without having to deal with RAW devices. The file system was designed to store database related files, such as data files, control files, redo logs, archive logs, etc. OCFS2 is the next generation of the Oracle Cluster File System. It has been designed to be a general purpose cluster file system. With it, one can store not only database related files on a shared disk, but also store Oracle binaries and configuration files (shared Oracle Home) making management of RAC even easier.

In this article, I will be using the latest release of OCFS2 ( at the time of this writing) to store the two files that are required to be shared by the Oracle Clusterware software. Along with these two files, I will also be using this space to store the shared ASM SPFILE for all Oracle RAC instances.

See the following page for more information on OCFS2 (including Installation Notes) for Linux:

  OCFS2 Project Documentation


Download OCFS2

First, let's download the latest OCFS2 distribution. The OCFS2 distribution comprises two sets of RPMs; namely, the kernel module and the tools. The latest kernel module is available for download from and the tools from .

Download the appropriate RPMs starting with the latest OCFS2 kernel module (the driver). With CentOS 5.1, I am using kernel release 2.6.18-53.el5. The appropriate OCFS2 kernel module was found in the latest release of OCFS2 at the time of this writing (). The available OCFS2 kernel modules for Linux kernel 2.6.18-53.el5 are listed below. Always download the latest OCFS2 kernel module that matches the distribution, platform, kernel version and the kernel flavor (default kernel, PAE kernel, or xen kernel).

  - (for default kernel)
  - (for PAE kernel)
  - (for xen kernel)
For the tools, simply match the platform and distribution. You should download both the OCFS2 tools and the OCFS2 console applications.
  - (OCFS2 tools)
  - (OCFS2 console)

  The OCFS2 Console is optional but highly recommended. The ocfs2console application requires e2fsprogs, glib2 2.5-12 or later, vte 0.14 or later, pygtk2 (EL5) or python-gtk (SLES9) 1.99.16 or later, python 2.4 or later and ocfs2-tools.

  If you were curious as to which OCFS2 driver release you need, use the OCFS2 release that matches your kernel version. To determine your kernel release:
$ uname -a
Linux linux1 2.6.18-53.el5 #1 SMP Mon Nov 12 02:22:48 EST 2007 i686 i686 i386 GNU/Linux
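
The kernel module package name embeds the kernel release, so the name to look for can be derived from uname. This is only a sketch: ocfs2_module_rpm is a hypothetical helper, and the naming pattern is inferred from the package installed in this article, so it may not hold for other releases or distributions.

```shell
# Sketch: build the expected OCFS2 kernel module package name.
# ocfs2_module_rpm is a hypothetical helper; the pattern
# ocfs2-<kernel release>-<OCFS2 release>.<arch>.rpm is an assumption
# based on the package name used in this article.
ocfs2_module_rpm() {
    # $1 = kernel release (uname -r)
    # $2 = OCFS2 release, including the distribution tag
    # $3 = architecture (uname -m)
    printf 'ocfs2-%s-%s.%s.rpm\n' "$1" "$2" "$3"
}
```

On the node shown above, ocfs2_module_rpm "$(uname -r)" 1.2.7-1.el5 "$(uname -m)" yields ocfs2-2.6.18-53.el5-1.2.7-1.el5.i686.rpm, which is the kernel module package installed in the next section.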


Install OCFS2

I will be installing the OCFS2 files onto two single-processor machines. The installation process is simply a matter of running the following command on both Oracle RAC nodes in the cluster as the root user account:
$ su -
# rpm -Uvh ocfs2-2.6.18-53.el5-1.2.7-1.el5.i686.rpm \
       ocfs2console-1.2.7-1.el5.i386.rpm \
       ocfs2-tools-1.2.7-1.el5.i386.rpm
Preparing...                ########################################### [100%]
   1:ocfs2-tools            ########################################### [ 33%]
   2:ocfs2-2.6.18-53.el5    ########################################### [ 67%]
   3:ocfs2console           ########################################### [100%]


Disable SELinux (RHEL4 U2 and higher)

Users of RHEL4 U2 and higher (CentOS 5.1 is based on RHEL5 U1) are advised that OCFS2 currently does not work with SELinux enabled. If you are using RHEL4 U2 or higher (which includes us since we are using CentOS 5.1), you will need to disable SELinux (using the system-config-securitylevel tool) to get the O2CB service to execute.

  A ticket has been logged with Red Hat on this issue.

During the installation of CentOS, we disabled SELinux on the SELinux screen. If, however, you did not disable SELinux during the installation phase (or if you simply want to verify it is truly disabled), you can use the GUI utility system-config-securitylevel to disable SELinux.

# /usr/bin/system-config-securitylevel &


This will bring up the following screen:


Figure 13: Security Level Configuration Opening Screen


Now, click the SELinux tab and select the "Disabled" option. After clicking [OK], you will be presented with a warning dialog. Simply acknowledge this warning by clicking "Yes". Your screen should now look like the following after disabling the SELinux option:


Figure 14: SELinux Disabled


If you needed to disable SELinux in this section on any of the nodes, those nodes will need to be rebooted to implement the change. SELinux must be disabled before you can continue with configuring OCFS2!

# init 6
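
The setting can also be checked from the command line without opening the GUI. A minimal sketch, assuming the standard RHEL/CentOS configuration file /etc/selinux/config; selinux_configured_mode is a hypothetical helper name:

```shell
# Sketch: read the configured SELinux mode from a config file.
# selinux_configured_mode is a hypothetical helper name; the default
# path assumes the standard RHEL/CentOS location.
selinux_configured_mode() {
    # Print the value of the last uncommented SELINUX= line.
    # SELINUXTYPE= lines do not match because "=" must follow SELINUX.
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' \
        "${1:-/etc/selinux/config}" | tail -n 1
}
```

After the reboot, selinux_configured_mode should print disabled on each node. The getenforce command reports the mode the running system is actually using.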


Configure OCFS2

The next step is to generate and configure the /etc/ocfs2/cluster.conf file on both Oracle RAC nodes in the cluster. The easiest way to accomplish this is to run the GUI tool ocfs2console. In this section, we will not only create and configure the /etc/ocfs2/cluster.conf file using ocfs2console, but will also create and start the cluster stack O2CB. When the /etc/ocfs2/cluster.conf file is not present (as will be the case in our example), the ocfs2console tool will create this file along with a new cluster stack service (O2CB) with a default cluster name of ocfs2. This will need to be done on both Oracle RAC nodes in the cluster as the root user account:

  Note that OCFS2 will be configured to use the private network (192.168.2.0) for all of its network traffic as recommended by Oracle. While OCFS2 does not take much bandwidth, it does require the nodes to be alive on the network and sends regular keepalive packets to ensure that they are. To avoid a network delay being interpreted as a node disappearing from the network, which could lead to node self-fencing, a private interconnect is recommended. It is safe to use the same private interconnect for both Oracle RAC and OCFS2.

A popular question then is what node name should be used and should it be related to the IP address? The node name needs to match the hostname of the machine. The IP address need not be the one associated with that hostname. In other words, any valid IP address on that node can be used. OCFS2 will not attempt to match the node name (hostname) with the specified IP address.

$ su -
# ocfs2console &
This will bring up the GUI as shown below:


Figure 15: ocfs2console Screen


Using the ocfs2console GUI tool, perform the following steps:

  1. Select [Cluster] -> [Configure Nodes...]. This will start the OCFS2 Cluster Stack (Figure 16). Acknowledge this Information dialog box by clicking [Close]. You will then be presented with the "Node Configuration" dialog.
  2. On the "Node Configuration" dialog, click the [Add] button.
    • This will bring up the "Add Node" dialog.
    • In the "Add Node" dialog, enter the Host name and IP address for the first node in the cluster. Leave the IP Port set to its default value of 7777. In my example, I added both nodes using linux1 / 192.168.2.100 for the first node and linux2 / 192.168.2.101 for the second node.
      Note: The node name you enter "must" match the hostname of the machine and the IP addresses will use the private interconnect.
    • Click [Apply] on the "Node Configuration" dialog - All nodes should now be "Active" as shown in Figure 17.
    • Click [Close] on the "Node Configuration" dialog.
  3. After verifying all values are correct, exit the application using [File] -> [Quit]. This needs to be performed on both Oracle RAC nodes in the cluster.



Figure 16: Starting the OCFS2 Cluster Stack


The following dialog shows the OCFS2 settings I used for the node linux1 and linux2:


Figure 17: Configuring Nodes for OCFS2


  See the Troubleshooting section if you get the error:
o2cb_ctl: Unable to access cluster service while creating node


After exiting the ocfs2console, you will have a /etc/ocfs2/cluster.conf similar to the following. This process needs to be completed on both Oracle RAC nodes in the cluster and the OCFS2 configuration file should be exactly the same for both of the nodes:

/etc/ocfs2/cluster.conf
node:
        ip_port = 7777
        ip_address = 192.168.2.100
        number = 0
        name = linux1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.2.101
        number = 1
        name = linux2
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2
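
Since the file must be identical on both nodes, a compact summary of its node stanzas makes differences easier to spot. The following is a sketch only; list_ocfs2_nodes is a hypothetical helper and is not one of the OCFS2 tools:

```shell
# Sketch: summarize the node stanzas of an OCFS2 cluster.conf file as
# "number name address:port" lines.
# list_ocfs2_nodes is a hypothetical helper name.
list_ocfs2_nodes() {
    awk '
        # Print the stanza collected so far if it was a node stanza.
        function emit() {
            if (stanza == "node")
                print f["number"], f["name"], f["ip_address"] ":" f["ip_port"]
            stanza = ""
            split("", f)          # clear the collected fields
        }
        /^[a-z]+:[ \t]*$/ { emit(); stanza = $1; sub(/:$/, "", stanza); next }
        $2 == "="         { f[$1] = $3 }
        END               { emit() }
    ' "${1:-/etc/ocfs2/cluster.conf}"
}
```

For the file shown above this prints 0 linux1 192.168.2.100:7777 and 1 linux2 192.168.2.101:7777. Running it on both nodes and comparing the output is a quick consistency check.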


O2CB Cluster Service

Before we can do anything with OCFS2 like formatting or mounting the file system, we need to first have OCFS2's cluster stack, O2CB, running (which it will be as a result of the configuration process performed above). The stack includes the following services:

  • NM: Node Manager that keeps track of all the nodes in the cluster.conf
  • HB: Heart beat service that issues up/down notifications when nodes join or leave the cluster
  • TCP: Handles communication between the nodes
  • DLM: Distributed lock manager that keeps track of all locks, their owners, and status
  • CONFIGFS: User space driven configuration file system mounted at /config
  • DLMFS: User space interface to the kernel space DLM

All of the above cluster services have been packaged in the o2cb system service (/etc/init.d/o2cb). Here is a short listing of some of the more useful commands and options for the o2cb system service.

  The following commands are for documentation purposes only and do not need to be run when installing and configuring OCFS2 for this article!

  • /etc/init.d/o2cb status
    Module "configfs": Loaded
    Filesystem "configfs": Mounted
    Module "ocfs2_nodemanager": Loaded
    Module "ocfs2_dlm": Loaded
    Module "ocfs2_dlmfs": Loaded
    Filesystem "ocfs2_dlmfs": Mounted
    Checking O2CB cluster ocfs2: Online
      Heartbeat dead threshold: 31
      Network idle timeout: 30000
      Network keepalive delay: 2000
      Network reconnect delay: 2000
    Checking O2CB heartbeat: Not active

  • /etc/init.d/o2cb offline ocfs2
    Stopping O2CB cluster ocfs2: OK
    The above command will offline the cluster we created, ocfs2.

  • /etc/init.d/o2cb unload
    Unmounting ocfs2_dlmfs filesystem: OK
    Unloading module "ocfs2_dlmfs": OK
    Unmounting configfs filesystem: OK
    Unloading module "configfs": OK
    The above command will unload all OCFS2 modules.

  • /etc/init.d/o2cb load
    Loading module "configfs": OK
    Mounting configfs filesystem at /config: OK
    Loading module "ocfs2_nodemanager": OK
    Loading module "ocfs2_dlm": OK
    Loading module "ocfs2_dlmfs": OK
    Mounting ocfs2_dlmfs filesystem at /dlm: OK
    Loads all OCFS2 modules.

  • /etc/init.d/o2cb online ocfs2
    Starting O2CB cluster ocfs2: OK
    The above command will online the cluster we created, ocfs2.
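
If you want to test the cluster state from a script rather than by reading the status output, something like the following could be used. It is only a sketch: it assumes the "Checking O2CB cluster <name>: Online" line format shown in the status listing above (other OCFS2 tool releases may word it differently), and o2cb_cluster_online is a hypothetical helper name.

```shell
# Read "/etc/init.d/o2cb status" output on stdin and succeed only if the
# named cluster is reported as Online.
# o2cb_cluster_online is a hypothetical helper name; the matched text is
# taken from the status listing in this article.
o2cb_cluster_online() {
    grep -q "Checking O2CB cluster $1: Online"
}

# Example (run as root on a cluster node):
#   /etc/init.d/o2cb status | o2cb_cluster_online ocfs2 && echo "ocfs2 is online"
```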


Configure O2CB to Start on Boot and Adjust O2CB Heartbeat Threshold

You now need to configure the on-boot properties of the O2CB driver so that the cluster stack services will start on each boot. You will also be adjusting the OCFS2 Heartbeat Threshold from its default setting of 31 to 61. Perform the following on both Oracle RAC nodes in the cluster:

  With releases of OCFS2 prior to 1.2.1, a bug existed where the driver would not get loaded on each boot even after configuring the on-boot properties to do so. This bug was fixed in release 1.2.1 of OCFS2 and does not need to be addressed in this article. If however you are using a release of OCFS2 prior to 1.2.1, please see the Troubleshooting section for a workaround to this bug.

Set the on-boot properties as follows:

# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure
Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot.  The current values will be shown in brackets ('[]').  Hitting
<ENTER> without typing an answer will keep that current value.  Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [n]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Specify heartbeat dead threshold (>=7) [31]: 61
Specify network idle timeout in ms (>=5000) [30000]: 30000
Specify network keepalive delay in ms (>=1000) [2000]: 2000
Specify network reconnect delay in ms (>=2000) [2000]: 2000
Writing O2CB configuration: OK
Loading module "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK
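
To put the threshold values in perspective: the OCFS2 documentation describes the dead threshold as a count of two-second heartbeat iterations, so the time a node may go without a heartbeat before being declared dead works out to (threshold - 1) * 2 seconds. A small sketch of that arithmetic (hb_timeout_seconds is a hypothetical helper name):

```shell
# Convert an O2CB heartbeat dead threshold into the approximate number of
# seconds before a silent node is declared dead: (threshold - 1) * 2.
# hb_timeout_seconds is a hypothetical helper name.
hb_timeout_seconds() {
    echo $(( ($1 - 1) * 2 ))
}

# hb_timeout_seconds 31   # default threshold
# hb_timeout_seconds 61   # threshold used in this article
```

With this formula the default of 31 corresponds to 60 seconds and the value of 61 used here corresponds to 120 seconds.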


Format the OCFS2 File System

  Unlike the other tasks in this section, creating the OCFS2 file system should only be executed on one of the nodes in the RAC cluster. I will be executing all commands in this section from linux1 only.

We can now start to make use of the iSCSI volume we partitioned for OCFS2 in the section "Create Partitions on iSCSI Volumes".

If the O2CB cluster is offline, start it. The format operation needs the cluster to be online, as it needs to ensure that the volume is not mounted on some other node in the cluster.

Earlier in this document, we created the directory /u02 under the section Create Mount Point for OCFS2 / Clusterware which will be used as the mount point for the OCFS2 cluster file system. This section contains the commands to create and mount the file system to be used for the Cluster Manager.

  Note that it is possible to create and mount the OCFS2 file system using either the GUI tool ocfs2console or the command-line tool mkfs.ocfs2. From the ocfs2console utility, use the menu [Tasks] - [Format].

See the instructions below on how to create the OCFS2 file system using the command-line tool mkfs.ocfs2.

To create the file system, we can use the Oracle executable mkfs.ocfs2. For the purpose of this example, I run the following command only from linux1 as the root user account using the local SCSI device name mapped to the iSCSI volume for crs — /dev/iscsi/crs/part1. Also note that I specified a label named "oracrsfiles" which will be referred to when mounting or un-mounting the volume:

$ su -
# mkfs.ocfs2 -b 4K -C 32K -N 4 -L oracrsfiles /dev/iscsi/crs/part1

mkfs.ocfs2 1.2.7
Filesystem label=oracrsfiles
Block size=4096 (bits=12)
Cluster size=32768 (bits=15)
Volume size=2145943552 (65489 clusters) (523912 blocks)
3 cluster groups (tail covers 977 clusters, rest cover 32256 clusters)
Journal size=67108864
Initial number of node slots: 4
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 1 block(s)
Formatting Journals: done
Writing lost+found: done
mkfs.ocfs2 successful
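
In the command above, -b sets the block size, -C the cluster size, -N the number of node slots, and -L the volume label. The cluster and block counts that mkfs.ocfs2 reports follow directly from the volume size, which can be checked with a little shell arithmetic. This is only an illustrative sketch and units_in_volume is a hypothetical helper name:

```shell
# Print how many allocation units of a given size fit into a volume
# (integer division). units_in_volume is a hypothetical helper name.
units_in_volume() {
    volume_bytes=$1
    unit_bytes=$2
    echo $(( volume_bytes / unit_bytes ))
}

# units_in_volume 2145943552 32768   # number of 32K clusters
# units_in_volume 2145943552 4096    # number of 4K blocks
```

For the 2145943552-byte volume above this gives 65489 clusters and 523912 blocks, matching the "Volume size" line in the output.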


Mount the OCFS2 File System

Now that the file system is created, we can mount it. Let's first do it using the command-line, then I'll show how to include it in the /etc/fstab to have it mount on each boot.

  Mounting the file system will need to be performed on both nodes in the Oracle RAC cluster as the root user account using the OCFS2 label oracrsfiles!

First, here is how to manually mount the OCFS2 file system from the command-line. Remember that this needs to be performed as the root user account:

$ su -
# mount -t ocfs2 -o datavolume,nointr -L "oracrsfiles" /u02
If the mount was successful, you will simply get your prompt back. We should, however, run the following checks to ensure the file system is mounted correctly. Let's use the mount command to ensure that the new file system is really mounted. This should be performed on both nodes in the RAC cluster:
# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
cartman:SHARE2 on /cartman type nfs (rw,addr=192.168.1.120)
configfs on /sys/kernel/config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
/dev/sda1 on /u02 type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
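
Rather than scanning the mount listing by eye on each node, the check can be scripted. The sketch below assumes the "device on mountpoint type fstype (options)" layout shown above; ocfs2_datavolume_mounted is a hypothetical helper name.

```shell
# Read "mount" output on stdin; succeed if the given mount point is an
# ocfs2 file system mounted with the datavolume option.
# ocfs2_datavolume_mounted is a hypothetical helper name.
ocfs2_datavolume_mounted() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp && $5 == "ocfs2" && $6 ~ /[(,]datavolume[,)]/ { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example:
#   mount | ocfs2_datavolume_mounted /u02 && echo "/u02 is mounted correctly"
```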

  Please take note of the datavolume option I am using to mount the new file system. Oracle database users must mount any volume that will contain the Voting Disk file, Cluster Registry (OCR), Data files, Redo logs, Archive logs and Control files with the datavolume mount option so as to ensure that the Oracle processes open the files with the O_DIRECT flag. The nointr option ensures that the I/Os are not interrupted by signals.

Any other type of volume, including an Oracle home (which I will not be using for this article), should not be mounted with this mount option.

  Why does it take so much time to mount the volume? It takes around 5 seconds for a volume to mount. It does so as to let the heartbeat thread stabilize. In a later release, Oracle plans to add support for a global heartbeat, which will make most mounts instant.


Configure OCFS2 to Mount Automatically at Startup

Let's take a look at what you have done so far. You installed the OCFS2 software packages which will be used to store the shared files needed by Cluster Manager. After going through the install, you loaded the OCFS2 module into the kernel and then formatted the clustered file system. Finally, you mounted the newly created file system using the OCFS2 label "oracrsfiles". This section walks through the steps for mounting the new OCFS2 file system by its label each time the machines are booted.

We start by adding the following line to the /etc/fstab file on both nodes in the RAC cluster:

LABEL=oracrsfiles     /u02           ocfs2   _netdev,datavolume,nointr     0 0

  Notice the "_netdev" option for mounting this file system. The _netdev mount option is a must for OCFS2 volumes. This mount option indicates that the volume is to be mounted after the network is started and dismounted before the network is shut down.
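
Because this edit has to be repeated on both nodes, it can be convenient to script it so that running it twice does not produce a duplicate entry. The following is a minimal sketch, not a required step; add_ocfs2_fstab_entry is a hypothetical helper name, and it takes the file to edit as an argument so it can be tried on a copy of /etc/fstab first.

```shell
# Append the OCFS2 entry to an fstab-style file unless an entry for the
# same label is already present. add_ocfs2_fstab_entry is a hypothetical
# helper; the file argument defaults to /etc/fstab.
add_ocfs2_fstab_entry() {
    fstab=${1:-/etc/fstab}
    entry='LABEL=oracrsfiles     /u02           ocfs2   _netdev,datavolume,nointr     0 0'
    if grep -q '^LABEL=oracrsfiles[[:space:]]' "$fstab"; then
        echo "entry already present in $fstab"
    else
        printf '%s\n' "$entry" >> "$fstab"
        echo "entry added to $fstab"
    fi
}

# Example (as root, after making a backup):
#   cp /etc/fstab /etc/fstab.bak && add_ocfs2_fstab_entry /etc/fstab
```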

Now, let's make sure that the ocfs2.ko kernel module is being loaded and that the file system will be mounted during the boot process.

If you have been following along with the examples in this article, the actions to load the kernel module and mount the OCFS2 file system should already be enabled. However, we should still check those options by running the following on both nodes in the RAC cluster as the root user account:

$ su -
# chkconfig --list o2cb
o2cb            0:off   1:off   2:on    3:on    4:on    5:on    6:off
The runlevel flags shown as "on" in the listing above should also be set to "on" on your system.
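
The same check can be done in a script by parsing the chkconfig line. In this sketch the runlevels to test are passed as arguments; using 3 and 5 in the example is my assumption (the usual multi-user and graphical runlevels on this platform), and service_on_in_runlevels is a hypothetical helper name.

```shell
# Read one line of "chkconfig --list <service>" output on stdin and
# succeed only if the service is "on" in every runlevel given as an
# argument. service_on_in_runlevels is a hypothetical helper name.
service_on_in_runlevels() {
    line=$(cat)
    for level in "$@"; do
        # The unquoted echo collapses tabs and runs of spaces to single spaces.
        case " $(echo $line) " in
            *" $level:on "*) ;;
            *) return 1 ;;
        esac
    done
}

# Example:
#   chkconfig --list o2cb | service_on_in_runlevels 3 5 && echo "o2cb starts on boot"
```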


Check Permissions on New OCFS2 File System

Use the ls command to check ownership. The permissions should be set to 0775 with owner "oracle" and group "oinstall".

The following tasks only need to be executed on one of the nodes in the RAC cluster. I will be executing all commands in this section from linux1 only.

Let's first check the permissions:

# ls -ld /u02
drwxr-xr-x 3 root root 4096 Dec 13 15:17 /u02
As we can see from the listing above, the oracle user account (and the oinstall group) will not be able to write to this directory. Let's fix that:
# chown oracle:oinstall /u02
# chmod 775 /u02
Let's now go back and re-check that the permissions are correct for both Oracle RAC nodes in the cluster:
# ls -ld /u02
drwxrwxr-x 3 oracle oinstall 4096 Dec 13 15:17 /u02
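
For a scripted version of this check, the mode string and ownership can be pulled out of the ls output and compared against the expected values. This is a sketch only; perm_summary and perm_is are hypothetical helper names.

```shell
# Print "<mode string> <owner>:<group>" for a path, based on "ls -ld".
# Only the first 10 characters of the mode field are kept so that a
# trailing ACL or SELinux marker does not affect the comparison.
perm_summary() {
    ls -ld "$1" | awk '{ print substr($1, 1, 10), $3 ":" $4 }'
}

# Succeed only if the path matches the expected summary string.
perm_is() {
    [ "$(perm_summary "$1")" = "$2" ]
}

# Example:
#   perm_is /u02 'drwxrwxr-x oracle:oinstall' && echo "/u02 is ready"
```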


Create Directory for Oracle Clusterware Files

The last mandatory task is to create the appropriate directory on the new OCFS2 file system that will be used for the Oracle Clusterware shared files. We will also modify the permissions of this new directory to allow the "oracle" owner and group "oinstall" read/write access.

The following tasks only need to be executed on one of the nodes in the RAC cluster. I will be executing all commands in this section from linux1 only.

# mkdir -p /u02/oradata/orcl
# chown -R oracle:oinstall /u02/oradata
# chmod -R 775 /u02/oradata
# ls -l /u02/oradata
total 4
drwxrwxr-x 2 oracle oinstall 4096 Dec 13 16:00 orcl


Reboot Both Nodes

Before starting the next section, this would be a good place to reboot both of the nodes in the RAC cluster. When the machines come up, ensure that the cluster stack services are being loaded and the new OCFS2 file system is being mounted:
# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
cartman:SHARE2 on /cartman type nfs (rw,addr=192.168.1.120)
configfs on /sys/kernel/config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
/dev/sda1 on /u02 type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)


If you modified the O2CB heartbeat threshold, you should verify that it is set correctly:

# cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
61


How to Determine OCFS2 Version

To determine which version of OCFS2 is running, use:
# cat /proc/fs/ocfs2/version
OCFS2 1.2.7 Mon Nov 12 15:50:25 PST 2007 (build d443ce77532cea8d1e167ab2de51b8c8)



Install and Configure Automatic Storage Management (ASMLib 2.0)


  Most of the installation and configuration procedures should be performed on both of the Oracle RAC nodes in the cluster! Creating the ASM disks, however, will only need to be performed on a single node within the cluster.


Introduction

In this section, we will configure Automatic Storage Management (ASM) to be used as the file system / volume manager for all Oracle physical database files (data, online redo logs, control files, archived redo logs) and a Flash Recovery Area.

ASM was introduced in Oracle10g Release 1 and is used to relieve the DBA of having to manage individual files and drives. ASM is built into the Oracle kernel and provides the DBA with a way to manage thousands of disk drives 24x7 for both single and clustered instances of Oracle. All of the files and directories to be used for Oracle will be contained in a disk group. ASM automatically performs load balancing in parallel across all available disk drives to prevent hot spots and maximize performance, even with rapidly changing data usage patterns.

There are two different methods to configure ASM on Linux:

  • ASM with ASMLib I/O: This method creates all Oracle database files on raw block devices managed by ASM using ASMLib calls. RAW devices are not required with this method as ASMLib works with block devices.

  • ASM with Standard Linux I/O: This method creates all Oracle database files on raw character devices managed by ASM using standard Linux I/O system calls. You will be required to create RAW devices for all disk partitions used by ASM.

In this article, I will be using the "ASM with ASMLib I/O" method. Oracle states (in Metalink Note 275315.1) that "ASMLib was provided to enable ASM I/O to Linux disks without the limitations of the standard UNIX I/O API". I plan on performing several tests in the future to identify the performance gains in using ASMLib. Those performance metrics and testing details are out of scope of this article and therefore will not be discussed.

We start this section by first downloading the ASMLib drivers (ASMLib Release 2.0) specific to our Linux kernel. We will then install and configure the ASMLib 2.0 drivers while finishing off the section with a demonstration of how to create the ASM disks.

If you would like to learn more about Oracle ASMLib 2.0, visit


Download the ASMLib 2.0 Packages

We start this section by downloading the latest ASMLib 2.0 libraries and the driver from OTN. At the time of this writing, the latest release of the ASMLib driver was . Like the Oracle Cluster File System, we need to download the version for the Linux kernel and number of processors on the machine. We are using kernel 2.6.18-53.el5 while the machines I am using are both single processor machines:
# uname -a
Linux linux1 2.6.18-53.el5 #1 SMP Mon Nov 12 02:22:48 EST 2007 i686 i686 i386 GNU/Linux

  If you do not currently have an account with Oracle OTN, you will need to create one. This is a FREE account!


 

  - (for default kernel)
  - (for PAE kernel)
  - (for xen kernel)
You will also need to download the following ASMLib tools:
  - (Userspace library)
  - (Driver support files)


Install ASMLib 2.0 Packages

This installation needs to be performed on both nodes in the RAC cluster as the root user account:
$ su -
# rpm -Uvh oracleasm-2.6.18-53.el5-2.0.4-1.el5.i686.rpm \
       oracleasmlib-2.0.3-1.el5.i386.rpm \
       oracleasm-support-2.0.4-1.el5.i386.rpm
Preparing...                ########################################### [100%]
   1:oracleasm-support      ########################################### [ 33%]
   2:oracleasm-2.6.18-53.el5########################################### [ 67%]
   3:oracleasmlib           ########################################### [100%]


Configuring and Loading the ASMLib 2.0 Packages

Now that we downloaded and installed the ASMLib 2.0 Packages for Linux, we need to configure and load the ASM kernel module. This task needs to be run on both nodes in the RAC cluster as the root user account:
$ su -
# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library
driver.  The following questions will determine whether the driver is
loaded on boot and what permissions it will have.  The current values
will be shown in brackets ('[]').  Hitting <ENTER> without typing an
answer will keep that current value.  Ctrl-C will abort.

Default user to own the driver interface []: oracle
Default group to own the driver interface []: oinstall
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration:  [  OK  ]
Creating /dev/oracleasm mount point: [  OK  ]
Loading module "oracleasm": [  OK  ]
Mounting ASMlib driver filesystem: [  OK  ]
Scanning system for ASM disks: [  OK  ]


Create ASM Disks for Oracle

  Creating the ASM disks only needs to be done on one node in the RAC cluster as the root user account. I will be running these commands on linux1. On the other Oracle RAC node, you will need to perform a scandisk to recognize the new volumes. When that is complete, you should then run the oracleasm listdisks command on both Oracle RAC nodes to verify that all ASM disks were created and are available.

In the section "Create Partitions on iSCSI Volumes", we configured (partitioned) four iSCSI volumes to be used by ASM. ASM will be used for storing Oracle database files like online redo logs, database files, control files, archived redo log files, and the flash recovery area. Use the local device names that were created by udev when configuring the four ASM volumes.

  If you are repeating this article using the same hardware (actually, the same shared logical drives), you may get a failure when attempting to create the ASM disks. If you do receive a failure, try listing all ASM disks using:
# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
As you can see, the results show that I have four ASM volumes already defined. If you have the four volumes already defined from a previous run, go ahead and remove them using the following commands. After removing the previously created volumes, use the "oracleasm createdisk" commands (below) to create the new volumes.
# /etc/init.d/oracleasm deletedisk VOL1
Removing ASM disk "VOL1" [  OK  ]
# /etc/init.d/oracleasm deletedisk VOL2
Removing ASM disk "VOL2" [  OK  ]
# /etc/init.d/oracleasm deletedisk VOL3
Removing ASM disk "VOL3" [  OK  ]
# /etc/init.d/oracleasm deletedisk VOL4
Removing ASM disk "VOL4" [  OK  ]

To create the ASM disks using the iSCSI target names to local SCSI device name mappings (above), type the following:

$ su -
# /etc/init.d/oracleasm createdisk VOL1 /dev/iscsi/asm1/part1
Marking disk "/dev/iscsi/asm1/part1" as an ASM disk [  OK  ]

# /etc/init.d/oracleasm createdisk VOL2 /dev/iscsi/asm2/part1
Marking disk "/dev/iscsi/asm2/part1" as an ASM disk [  OK  ] 

# /etc/init.d/oracleasm createdisk VOL3 /dev/iscsi/asm3/part1
Marking disk "/dev/iscsi/asm3/part1" as an ASM disk [  OK  ]

# /etc/init.d/oracleasm createdisk VOL4 /dev/iscsi/asm4/part1
Marking disk "/dev/iscsi/asm4/part1" as an ASM disk [  OK  ]


On all other nodes in the RAC cluster, you must perform a scandisk to recognize the new volumes:

# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [  OK  ]


We can now test that the ASM disks were successfully created by using the following command on both nodes in the RAC cluster as the root user account:

# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
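
When this has to be repeated on several nodes, comparing the listdisks output against the expected labels can be scripted. The following is a small sketch; missing_asm_disks is a hypothetical helper name and the labels are the four used in this article.

```shell
# Read "oracleasm listdisks" output on stdin and print every expected
# disk label (given as arguments) that does not appear in the list.
# missing_asm_disks is a hypothetical helper name.
missing_asm_disks() {
    listed=$(cat)
    for label in "$@"; do
        printf '%s\n' "$listed" | grep -qx "$label" || echo "$label"
    done
}

# Example (as root on each node):
#   /etc/init.d/oracleasm listdisks | missing_asm_disks VOL1 VOL2 VOL3 VOL4
```

No output means every expected disk is visible on that node. If a label is printed, running scandisks again on that node is the first thing to try.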



Download Oracle RAC 11g Software


  The following download procedures only need to be performed on one node in the cluster!


Overview

The next logical step is to install Oracle Clusterware 11g Release 1 (11.1.0.6.0), Oracle Database 11g Release 1 (11.1.0.6.0), and optionally the Oracle Database 11g Examples for Linux x86 software. However, we must first download and extract the required Oracle software packages from the Oracle Technology Network (OTN).

  If you do not currently have an account with Oracle OTN, you will need to create one. This is a FREE account!

In this section, we will be downloading and extracting the required software from Oracle to only one of the Linux nodes in the RAC cluster, namely linux1. This is the machine from which I will be performing all of the Oracle installs. The Oracle installer will copy the required software packages to all other nodes in the RAC configuration using the remote access method we set up in the section "Configure RAC Nodes for Remote Access using SSH".

Log in to the node that you will be performing all of the Oracle installations from (linux1) as the "oracle" user account. In this example, I will be downloading the required Oracle software to linux1 and saving them to "~oracle/orainstall".


Oracle Clusterware 11g Release 1 (11.1.0.6.0) for Linux x86

First, download the Oracle Clusterware.

 

  •   (244,654,895 bytes)


Oracle Database 11g Release 1 (11.1.0.6.0) for Linux x86

Next, we need to download the Oracle Database software. This can be downloaded from the same page used to download the Oracle Clusterware software:

 

  •   (1,844,533,232 bytes)


Oracle Database 11g Examples (formerly Companion)

Finally, we should download the Oracle Database 11g Examples software. This can be downloaded from the same page used to download the Oracle Clusterware software:

 

  •   (457,209,469 bytes)


As the "oracle" user account, extract the three packages you downloaded to a temporary directory. In this example, I will use "~oracle/orainstall".

Extract the Oracle Clusterware 11g package as follows:

# su - oracle
$ mkdir -p ~oracle/orainstall
$ cd ~oracle/orainstall
$ unzip linux_11gR1_clusterware.zip

Then extract the Oracle Database 11g software:

$ cd ~oracle/orainstall
$ unzip linux_11gR1_database.zip

Finally, extract the Oracle Database 11g Examples software:

$ cd ~oracle/orainstall
$ unzip linux_11gR1_examples.zip
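
Given the size of these archives, it can be worth confirming that a download is complete before unzipping it by comparing its size with the byte count quoted on the download page. The sketch below is illustrative only: size_matches is a hypothetical helper name, and pairing a particular file name with a particular byte count is an assumption you should confirm against the download page itself.

```shell
# Succeed only if the file is exactly the expected number of bytes.
# size_matches is a hypothetical helper name.
size_matches() {
    # The arithmetic expansion strips any padding that wc may print.
    actual=$(( $(wc -c < "$1") ))
    [ "$actual" -eq "$2" ]
}

# Example (assuming the Clusterware archive is the 244,654,895 byte download):
#   size_matches linux_11gR1_clusterware.zip 244654895 || echo "size mismatch"
```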



Pre-Installation Tasks for Oracle Clusterware 11g


  Perform the following checks on both Oracle RAC nodes in the cluster!


Before installing the Oracle Clusterware and Oracle RAC software, it is highly recommended to run the Cluster Verification Utility (CVU) to verify the hardware and software configuration. CVU is a command-line utility provided on the Oracle Clusterware installation media. It is responsible for performing various system checks to assist you with confirming the Oracle RAC nodes are properly configured for Oracle Clusterware and Oracle Real Application Clusters installation. The CVU only needs to be run from the node you will be performing the Oracle installations from (linux1 in this article). Note that the CVU is also run automatically at the end of the Oracle Clusterware installation as part of the Configuration Assistants process.

  CentOS Users!

The Cluster Verification Utility (CVU) included with Oracle Clusterware 11g will fail to run on the CentOS platform. This includes manually running the CVU at the command-line as well as Oracle Clusterware automatically running it at the end of the Oracle Clusterware installation (as part of the Configuration Assistants process). Although running the CVU is not required and many of the errors that result from running the CVU can be safely ignored, it is still highly recommended to obtain a successful run.

Please see the section "Install redhat-release Stub Package" for instructions on how to install the "redhat-release" stub package. This package is used to resolve the operating system identification errors encountered while running Oracle CVU 11g on the CentOS 5 platform.


Prerequisites for Using Cluster Verification Utility

Install redhat-release Stub Package (CentOS Users Only!)

When the Cluster Verification Utility (CVU) for Oracle 11g is run on CentOS 5 (or any of the other RHEL clones), the following error message is displayed:
ERROR:
Cannot identify the operating system. Ensure that correct software is being executed for this operating system.
Verification cannot proceed.

The CVU included with Oracle Clusterware 11g will attempt to verify that the host operating system is supported. At the time of this writing, Oracle Clusterware 11g is certified on the following 32-bit Linux platforms:

Linux x86 (32-bit) Operating System Requirements

Linux Distribution                      Requirements                                                              *-release Package Requirements
--------------------------------------  ------------------------------------------------------------------------  ------------------------------
Asianux Distributions                   Asianux 2, kernel 2.6.9 or later                                          asianux-release
                                        Asianux 3, kernel 2.6.18 or later
Enterprise Linux Distributions          Enterprise Linux 4 Update 4 (Oracle distribution), kernel 2.6.9 or later  enterprise-release
                                        Enterprise Linux 5 (Oracle distribution), kernel 2.6.9 or later
Red Hat Enterprise Linux Distributions  Red Hat Enterprise Linux 4 Update 4, kernel 2.6.9 or later                redhat-release
                                        Red Hat Enterprise Linux 5, kernel 2.6.9 or later
SUSE Enterprise Linux Distributions     SUSE 10, kernel 2.6.16.21 or later                                        sles-release

    Notice that CentOS is not included in the list of supported platforms!

    The CVU included with Oracle Clusterware 11g uses the "rpm" command to query the "<distribution>-release" package in order to verify the host operating system, where <distribution> is the name of the distribution. Please refer to the column labeled "*-release Package Requirements" in the above table to determine the name of the package CVU will query for each of the supported distributions.

    For example, with Red Hat Enterprise Linux, the CVU will look for the existence of the "redhat-release" package and its version:

    # /bin/rpm -q --qf %{version} redhat-release
    For users running CentOS or other Red Hat clones, this will cause the CVU to fail. CentOS, for example, will install the /etc/redhat-release text file, but will name the package "centos-release". When Oracle attempts to verify the existence of the redhat-release package, it will fail:
    # /bin/rpm -q --qf %{version} redhat-release
    package redhat-release is not installed
    To resolve this, I created a simple "stub" package with the correct package name and version so that CVU 11g will succeed while attempting to obtain a "Unique Distribution ID". Please note that this package does not contain, nor will it install, any files to your system. It merely exists for the purpose of allowing the CVU to detect the presence of the redhat-release package and the correct version.

    Simply download and install the following RPM to both Oracle RAC nodes in the cluster to allow CVU to perform its checks.

    Download redhat-release stub package

      redhat-release-5-1.0.el5.centos.1.i386.rpm

    Install redhat-release stub package

    # rpm -i redhat-release-5-1.0.el5.centos.1.i386.rpm

    Test redhat-release stub package

    # rpm -q --qf %{version} redhat-release
    5#

    Install cvuqdisk RPM (RHEL/CentOS Users Only!)

    The second prerequisite for running the CVU pertains to users running Oracle Enterprise Linux, Red Hat Linux, CentOS, and SuSE. If you are using any of the above listed operating systems, then you must download and install the package cvuqdisk to both of the Oracle RAC nodes in the cluster. This means you will need to install the cvuqdisk RPM to both linux1 and linux2. Without cvuqdisk, CVU will be unable to discover shared disks and you will receive the error message "Package cvuqdisk not installed" when you run CVU.

    The cvuqdisk RPM can be found on the Oracle Clusterware installation media in the rpm directory. For the purpose of this article, the Oracle Clusterware media was extracted to the /home/oracle/orainstall/clusterware directory on linux1. Note that before installing the cvuqdisk RPM, we need to set an environment variable named CVUQDISK_GRP to point to the group that will own the cvuqdisk utility. The default group is oinstall which is the group we are using for the oracle UNIX user account in this article.

    Locate and copy the cvuqdisk RPM from linux1 to linux2 as the "oracle" user account:

    $ ssh linux2 "mkdir -p /home/oracle/orainstall/clusterware/rpm"
    $ scp /home/oracle/orainstall/clusterware/rpm/cvuqdisk-1.0.1-1.rpm linux2:/home/oracle/orainstall/clusterware/rpm
    Perform the following steps as the "root" user account on both Oracle RAC nodes to install the cvuqdisk RPM:
    $ su -
    # cd /home/oracle/orainstall/clusterware/rpm
    # CVUQDISK_GRP=oinstall; export CVUQDISK_GRP
    
    # rpm -iv cvuqdisk-1.0.1-1.rpm
    Preparing packages for installation...
    cvuqdisk-1.0.1-1
    
    # ls -l /usr/sbin/cvuqdisk
    -rwsr-x--- 1 root oinstall 4168 Jun  2  2005 /usr/sbin/cvuqdisk

    Verify Remote Access / User Equivalence

    The CVU should be run from linux1 — the node we will be performing all of the Oracle installations from. Before running CVU, login as the oracle user account and verify remote access / user equivalence is configured to all nodes in the cluster. When using the secure shell method, user equivalence will need to be enabled for the terminal shell session before attempting to run the CVU. To enable user equivalence for the current terminal shell session, perform the following steps remembering to enter the pass phrase for each key that you generated when prompted:
    # su - oracle
    $ exec /usr/bin/ssh-agent $SHELL
    $ /usr/bin/ssh-add
    Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
    Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)

    Verifying Oracle Clusterware Requirements with CVU

    Once all prerequisites for running the CVU utility have been met, we can now check that all pre-installation tasks for Oracle Clusterware are completed by executing the following command as the "oracle" UNIX user account from linux1:
    $ cd /home/oracle/orainstall/clusterware
    $ ./runcluvfy.sh stage -pre crsinst -n linux1,linux2 -verbose
    Review the CVU report. Note that all of the checks performed by CVU should be reported as passed before continuing with the Oracle Clusterware installation.

    If your system only has 1GB of RAM, you may receive an error during the "Total memory" check:

    Check: Total memory
      Node Name     Available                 Required                  Comment
      ------------  ------------------------  ------------------------  ----------
      linux2        1009.65MB (1033880KB)     1GB (1048576KB)           failed
      linux1        1009.65MB (1033880KB)     1GB (1048576KB)           failed
    Result: Total memory check failed.
    As you can see from the output above, the requirement is for 1GB of memory (1048576 KB). Although your system may have 1GB of memory installed in each of the Oracle RAC nodes, the Linux kernel is calculating it to be 1033880 KB which comes out to be 14696 KB short. This can be considered close enough and safe to continue with the installation. As I mentioned earlier in this article, I highly recommend both Oracle RAC nodes have 2GB of RAM for performance reasons.
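
The shortfall quoted above is simply the required figure minus the available one. As a small sketch of that arithmetic (mem_shortfall_kb is a hypothetical helper name, and reading MemTotal from /proc/meminfo in the example is my assumption about where a comparable figure can be found, not a statement about how CVU obtains it):

```shell
# Print how many KB a node falls short of the 1 GB (1048576 KB) total
# memory requirement; prints 0 when the requirement is met.
# mem_shortfall_kb is a hypothetical helper name.
mem_shortfall_kb() {
    required=1048576
    if [ "$1" -ge "$required" ]; then
        echo 0
    else
        echo $(( required - $1 ))
    fi
}

# Example on a live node (assumes MemTotal is a comparable figure):
#   mem_shortfall_kb "$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
```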

    Verifying the Hardware and Operating System Setup with CVU

    The next CVU check to run will verify the hardware and operating system setup. Again, run the following as the "oracle" UNIX user account from linux1:
    $ cd /home/oracle/orainstall/clusterware
    $ ./runcluvfy.sh stage -post hwos -n linux1,linux2 -verbose
    Review the CVU report. As with the previous checks (Verifying Oracle Clusterware Requirements with CVU), all of the checks performed by CVU should be reported as passed before continuing with the Oracle Clusterware installation.

    Note the warnings you will receive in the "Checking shared storage accessibility..." portion of the report:

    Checking shared storage accessibility...
    
    WARNING:
    Unable to determine the sharedness of /dev/sda on nodes:
            linux2,linux1
    
    WARNING:
    Unable to determine the sharedness of /dev/sdb on nodes:
            linux2,linux1
    
    WARNING:
    Unable to determine the sharedness of /dev/sdc on nodes:
            linux2,linux1
    
    WARNING:
    Unable to determine the sharedness of /dev/sdd on nodes:
            linux2,linux1
    
    WARNING:
    Unable to determine the sharedness of /dev/sde on nodes:
            linux2,linux1
    These warnings can be safely ignored. It is worth noting that these warnings were considered an error in Oracle 10g RAC. Although we know the disks are visible and shared from both of our Oracle RAC nodes in the cluster, the check itself still fails. Several reasons for this have been documented. The first came from Metalink indicating that cluvfy currently does not work with devices other than SCSI devices. This would include devices like EMC PowerPath and volume groups like those from Openfiler. At the time of this writing, no workaround exists other than to use manual methods for detecting and verifying shared devices. Another reason for this warning was documented by Bane Radulovic at Oracle Corporation. His research shows that CVU calls smartctl on Linux, and the problem is that smartctl does not return the serial number from our iSCSI devices. For example, a check against /dev/sde shows:
    # /usr/sbin/smartctl -i /dev/sde
    smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
    Home page is 
    
    Device: Openfile Virtual disk     Version: 0
    Serial number:
    Device type: disk
    Local Time is: Fri Oct 12 01:37:09 2007 EDT
    Device supports SMART and is Disabled
    Temperature Warning Disabled or Not Supported
    At the time of this writing, it is unknown if the Openfiler developers have plans to fix this.
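
To see which of your own devices are affected, the smartctl output can be checked for an empty serial number field. This is a minimal sketch based on the output format shown above; has_serial_number is a hypothetical helper name, and the match is case-insensitive because the exact label may vary between smartctl releases and device types.

```shell
# Read "smartctl -i" output on stdin; succeed only if a non-empty serial
# number is reported. has_serial_number is a hypothetical helper name.
has_serial_number() {
    awk -F: '
        tolower($0) ~ /^[ \t]*serial number:/ {
            gsub(/[ \t]/, "", $2)
            if ($2 != "") ok = 1
        }
        END { exit ok ? 0 : 1 }
    '
}

# Example (as root):
#   /usr/sbin/smartctl -i /dev/sde | has_serial_number || echo "no serial number reported"
```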