从事IT基础架构多年,发现自己原来更合适去当老师……喜欢关注新鲜事物,不仅限于IT领域。
分类: Oracle
2009-11-02 18:26:24
: | 259301.1 | 类型: | BULLETIN | |
上次修订日期: | 09-FEB-2009 | 状态: | PUBLISHED |
PURPOSE ------- This document is to provide additional information on CRS (Cluster Ready Services) in 10g Real Application Clusters. SCOPE & APPLICATION ------------------- This document is intended for RAC Database Administrators and Oracle support enginneers. CRS and 10g REAL APPLICATION CLUSTERS ------------------------------------- CRS (Cluster Ready Services) is a new feature for 10g Real Application Clusters that provides a standard cluster interface on all platforms and performs new high availability operations not available in previous versions. CRS KEY FACTS ------------- Prior to installing CRS and 10g RAC, there are some key points to remember about CRS and 10g RAC: - CRS is REQUIRED to be installed and running prior to installing 10g RAC. - CRS can either run on top of the vendor clusterware (such as Sun Cluster, HP Serviceguard, IBM HACMP, TruCluster, Veritas Cluster, Fujitsu Primecluster, etc...) or can run without the vendor clusterware. The vendor clusterware was required in 9i RAC but is optional in 10g RAC. - The CRS HOME and ORACLE_HOME must be installed in DIFFERENT locations. - Shared Location(s) or devices for the Voting File and OCR (Oracle Configuration Repository) file must be available PRIOR to installing CRS. The voting file should be at least 20MB and the OCR file should be at least 100MB. - CRS and RAC require that the following network interfaces be configured prior to installing CRS or RAC: - Public Interface - Private Interface - Virtual (Public) Interface For more information on this, see - The root.sh script at the end of the CRS installation starts the CRS stack. If your CRS stack does not start, see - Only one set of CRS daemons can be running per RAC node. - On Unix, the CRS stack is run from entries in /etc/inittab with "respawn". - If there is a network split (nodes lose communication with each other). One or more nodes may reboot automatically to prevent data corruption. - The supported method to start CRS is booting the machine. MANUAL STARTUP OF THE CRS STACK IS NOT SUPPORTED UNTIL 10.1.0.4 OR HIGHER. - The supported method to stop is shutdown the machine or use "init.crs stop". - Killing CRS daemons is not supported unless you are removing the CRS installation via because flag files can become mismatched. - For maintenance, go to single user mode at the OS. Once the stack is started, you should be able to see all of the daemon processes with a ps -ef command: [rac1]/u01/home/beta> ps -ef | grep crs oracle 1363 999 0 11:23:21 ? 0:00 /u01/crs_home/bin/evmlogger.bin -o /u01 oracle 999 1 0 11:21:39 ? 0:01 /u01/crs_home/bin/evmd.bin root 1003 1 0 11:21:39 ? 0:01 /u01/crs_home/bin/crsd.bin oracle 1002 1 0 11:21:39 ? 0:01 /u01/crs_home/bin/ocssd.bin CRS DAEMON FUNCTIONALITY ------------------------ Here is a short description of each of the CRS daemon processes: CRSD: - Engine for HA operation - Manages 'application resources' - Starts, stops, and fails 'application resources' over - Spawns separate 'actions' to start/stop/check application resources - Maintains configuration profiles in the OCR (Oracle Configuration Repository) - Stores current known state in the OCR. - Runs as root - Is restarted automatically on failure OCSSD: - OCSSD is part of RAC and Single Instance with ASM - Provides access to node membership - Provides group services - Provides basic cluster locking - Integrates with existing vendor clusteware, when present - Can also runs without integration to vendor clustware - Runs as Oracle. - Failure exit causes machine reboot. --- This is a feature to prevent data corruption in event of a split brain. EVMD: - Generates events when things happen - Spawns a permanent child evmlogger - Evmlogger, on demand, spawns children - Scans callout directory and invokes callouts. - Runs as Oracle. - Restarted automatically on failure CRS LOG DIRECTORIES ------------------- When troubleshooting CRS problems, it is important to review the directories under the CRS Home. $ORA_CRS_HOME/crs/log - This directory includes traces for CRS resources that are joining, leaving, restarting, and relocating as identified by CRS. $ORA_CRS_HOME/crs/init - Any core dumps for the crsd.bin daemon should be written here. could be used to debug these. $ORA_CRS_HOME/css/log - The css logs indicate all actions such as reconfigurations, missed checkins , connects, and disconnects from the client CSS listener . In some cases the logger logs messages with the category of (auth.crit) for the reboots done by oracle. This could be used for checking the exact time when the reboot occured. $ORA_CRS_HOME/css/init - Core dumps from the ocssd primarily and the pid for the css daemon whose death is treated as fatal are located here. If there are abnormal restarts for css then the core files will have the formats of core.. could be used to debug these. $ORA_CRS_HOME/evm/log - Log files for the evm and evmlogger daemons. Not used as often for debugging as the CRS and CSS directories. $ORA_CRS_HOME/evm/init - Pid and lock files for EVM. Core files for EVM should also be written here. could be used to debug these. $ORA_CRS_HOME/srvm/log - Log files for OCR. STATUS FOR CRS RESOURCES ------------------------ After installing RAC and running the VIPCA (Virtual IP Configuration Assistant) launched with the RAC root.sh, you should be able to see all of your CRS resources with crs_stat. Example: cd $ORA_CRS_HOME/bin ./crs_stat NAME=ora.rac1.gsd TYPE=application TARGET=ONLINE STATE=ONLINE NAME=ora.rac1.oem TYPE=application TARGET=ONLINE STATE=ONLINE NAME=ora.rac1.ons TYPE=application TARGET=ONLINE STATE=ONLINE NAME=ora.rac1.vip TYPE=application TARGET=ONLINE STATE=ONLINE NAME=ora.rac2.gsd TYPE=application TARGET=ONLINE STATE=ONLINE NAME=ora.rac2.oem TYPE=application TARGET=ONLINE STATE=ONLINE NAME=ora.rac2.ons TYPE=application TARGET=ONLINE STATE=ONLINE NAME=ora.rac2.vip TYPE=application TARGET=ONLINE STATE=ONLINE There is also a script available to view CRS resources in a format that is easier to read. Just create a shell script with: --------------------------- Begin Shell Script ------------------------------- #!/usr/bin/ksh # # Sample 10g CRS resource status query script # # Description: # - Returns formatted version of crs_stat -t, in tabular # format, with the complete rsc names and filtering keywords # - The argument, $RSC_KEY, is optional and if passed to the script, will # limit the output to HA resources whose names match $RSC_KEY. # Requirements: # - $ORA_CRS_HOME should be set in your environment RSC_KEY=$1 QSTAT=-u AWK=/usr/xpg4/bin/awk # if not available use /usr/bin/awk # Table header:echo "" $AWK \ 'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State"; printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}' # Table body: $ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \ 'BEGIN { FS="="; state = 0; } $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1}; state == 0 {next;} $1~/TARGET/ && state == 1 {apptarget = $2; state=2;} $1~/STATE/ && state == 2 {appstate = $2; state=3;} state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state=0;}' --------------------------- End Shell Script ------------------------------- Example output: [opcbsol1]/u01/home/usupport> ./crsstat HA Resource Target State ----------- ------ ----- ora.V10SN.V10SN1.inst ONLINE ONLINE on opcbsol1 ora.V10SN.V10SN2.inst ONLINE ONLINE on opcbsol2 ora.V10SN.db ONLINE ONLINE on opcbsol2 ora.opcbsol1.ASM1.asm ONLINE ONLINE on opcbsol1 ora.opcbsol1.LISTENER_OPCBSOL1.lsnr ONLINE ONLINE on opcbsol1 ora.opcbsol1.gsd ONLINE ONLINE on opcbsol1 ora.opcbsol1.ons ONLINE ONLINE on opcbsol1 ora.opcbsol1.vip ONLINE ONLINE on opcbsol1 ora.opcbsol2.ASM2.asm ONLINE ONLINE on opcbsol2 ora.opcbsol2.LISTENER_OPCBSOL2.lsnr ONLINE ONLINE on opcbsol2 ora.opcbsol2.gsd ONLINE ONLINE on opcbsol2 ora.opcbsol2.ons ONLINE ONLINE on opcbsol2 ora.opcbsol2.vip ONLINE ONLINE on opcbsol2 CRS RESOURCE ADMINISTRATION --------------------------- You can use srvctl to manage these resources. Below are syntax and examples. ------------------------------------------------------------------------------- CRS RESOURCE STATUS srvctl status database -d [-f] [-v] [-S ] srvctl status instance -d -i >[, ] [-f] [-v] [-S ] srvctl status service -d -s [, ] [-f] [-v] [-S ] srvctl status nodeapps [-n ] srvctl status asm -n EXAMPLES: Status of the database, all instances and all services. srvctl status database -d ORACLE -v Status of named instances with their current services. srvctl status instance -d ORACLE -i RAC01, RAC02 -v Status of a named services. srvctl status service -d ORACLE -s ERP -v Status of all nodes supporting database applications. srvctl status node ------------------------------------------------------------------------------- START CRS RESOURCES srvctl start database -d [-o < start-options>] [-c | -q] srvctl start instance -d -i [, ] [-o ] [-c | -q] srvctl start service -d [-s [, ]] [-i ] [-o ] [-c | -q] srvctl start nodeapps -n srvctl start asm -n [-i ] [-o ] EXAMPLES: Start the database with all enabled instances. srvctl start database -d ORACLE Start named instances. srvctl start instance -d ORACLE -i RAC03, RAC04 Start named services. Dependent instances are started as needed. srvctl start service -d ORACLE -s CRM Start a service at the named instance. srvctl start service -d ORACLE -s CRM -i RAC04 Start node applications. srvctl start nodeapps -n myclust-4 ------------------------------------------------------------------------------- STOP CRS RESOURCES srvctl stop database -d [-o ] [-c | -q] srvctl stop instance -d -i [, ] [-o ][-c | -q] srvctl stop service -d [-s [, ]] [-i ][-c | -q] [-f] srvctl stop nodeapps -n srvctl stop asm -n [-i ] [-o ] EXAMPLES: Stop the database, all instances and all services. srvctl stop database -d ORACLE Stop named instances, first relocating all existing services. srvctl stop instance -d ORACLE -i RAC03,RAC04 Stop the service. srvctl stop service -d ORACLE -s CRM Stop the service at the named instances. srvctl stop service -d ORACLE -s CRM -i RAC04 Stop node applications. Note that instances and services also stop. srvctl stop nodeapps -n myclust-4 ------------------------------------------------------------------------------- ADD CRS RESOURCES srvctl add database -d -o [-m ] [-p ] [-A /netmask] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY}] [-s ] [-n ] srvctl add instance -d -i -n srvctl add service -d -s -r [-a ] [-P ] [-u] srvctl add nodeapps -n -o [-A /netmask[/if1[|if2|...]]] srvctl add asm -n -i -o OPTIONS: -A vip range, node, and database, address specification. The format of address string is: [ ]/ / [/ ] [,] [ ]/ / [/ ] -a for services, list of available instances, this list cannot include preferred instances -m domain name with the format “us.mydomain.com” -n node name that will support one or more instances -o $ORACLE_HOME to locate Oracle binaries -P for services, TAF preconnect policy - NONE, PRECONNECT -r for services, list of preferred instances, this list cannot include available instances. -s spfile name -u updates the preferred or available list for the service to support the specified instance. Only one instance may be specified with the -u switch. Instances that already support the service should not be included. EXAMPLES: Add a new node: srvctl add nodeapps -n myclust-1 -o $ORACLE_HOME –A 139.184.201.1/255.255.255.0/hme0 Add a new database. srvctl add database -d ORACLE -o $ORACLE_HOME Add named instances to an existing database. srvctl add instance -d ORACLE -i RAC01 -n myclust-1 srvctl add instance -d ORACLE -i RAC02 -n myclust-2 srvctl add instance -d ORACLE -i RAC03 -n myclust-3 Add a service to an existing database with preferred instances (-r) and available instances (-a). Use basic failover to the available instances. srvctl add service -d ORACLE -s STD_BATCH -r RAC01,RAC02 -a RAC03,RAC04 Add a service to an existing database with preferred instances in list one and available instances in list two. Use preconnect at the available instances. srvctl add service -d ORACLE -s STD_BATCH -r RAC01,RAC02 -a RAC03,RAC04 -P PRECONNECT ------------------------------------------------------------------------------- REMOVE CRS RESOURCES srvctl remove database -d srvctl remove instance -d [-i ] srvctl remove service -d -s [-i ] srvctl remove nodeapps -n EXAMPLES: Remove the applications for a database. srvctl remove database -d ORACLE Remove the applications for named instances of an existing database. srvctl remove instance -d ORACLE -i RAC03 srvctl remove instance -d ORACLE -i RAC04 Remove the service. srvctl remove service -d ORACLE -s STD_BATCH Remove the service from the instances. srvctl remove service -d ORACLE -s STD_BATCH -i RAC03,RAC04 Remove all node applications from a node. srvctl remove nodeapps -n myclust-4 ------------------------------------------------------------------------------- MODIFY CRS RESOURCES srvctl modify database -d [-n ] [-m ] [-p ] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY}] [-s ] srvctl modify instance -d -i -n srvctl modify instance -d -i {-s | -r} srvctl modify service -d -s -i -t [-f] srvctl modify service -d -s -i -r [-f] srvctl modify nodeapps -n [-A ] [-x] OPTIONS: -i -t the instance name (-i) is replaced by the instance name (-t) -i -r the named instance is modified to be a preferred instance -A address-list for VIP application, at node level -s add or remove ASM dependency EXAMPLES: Modify an instance to execute on another node. srvctl modify instance -d ORACLE -n myclust-4 Modify a service to execute on another node. srvctl modify service -d ORACLE -s HOT_BATCH -i RAC01 -t RAC02 Modify an instance to be a preferred instance for a service. srvctl modify service -d ORACLE -s HOT_BATCH -i RAC02 –r ------------------------------------------------------------------------------- RELOCATE SERVICES srvctl relocate service -d -s [-i ]-t [-f] EXAMPLES: Relocate a service from one instance to another srvctl relocate service -d ORACLE -s CRM -i RAC04 -t RAC01 ------------------------------------------------------------------------------- ENABLE CRS RESOURCES (The resource may be up or down to use this function) srvctl enable database -d srvctl enable instance -d -i [, ] srvctl enable service -d -s ] [, ] [-i ] EXAMPLES: Enable the database. srvctl enable database -d ORACLE Enable the named instances. srvctl enable instance -d ORACLE -i RAC01, RAC02 Enable the service. srvctl enable service -d ORACLE -s ERP,CRM Enable the service at the named instance. srvctl enable service -d ORACLE -s CRM -i RAC03 ------------------------------------------------------------------------------- DISABLE CRS RESOURCES (The resource must be down to use this function) srvctl disable database -d srvctl disable instance -d -i [, ] srvctl disable service -d -s ] [, ] [-i ] EXAMPLES: Disable the database globally. srvctl disable database -d ORACLE Disable the named instances. srvctl disable instance -d ORACLE -i RAC01, RAC02 Disable the service globally. srvctl disable service -d ORACLE -s ERP,CRM Disable the service at the named instance. srvctl disable service -d ORACLE -s CRM -i RAC03,RAC04 ------------------------------------------------------------------------------- For more information on this see the Oracle10g Real Application Clusters Administrator’s Guide - Appendix B RELATED DOCUMENTS ----------------- Oracle10g Real Application Clusters Installation and Configuration Oracle10g Real Application Clusters Administrator’s Guide ------------------------------------------------------------------------ To learn about Oracle University offerings related to Real Application Clusters, read .