2006-04-12 22:01:02
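I wrote this over the last few days. It is still a first draft and I am still revising it. If you spot any problems, please point them out and I will fix them as quickly as I can, so that newcomers are not misled. I am sharing it so that beginners can learn from it and we can exchange experience.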
Contents
4.4.4 HACMP resource group attributes
4.5 Defining non-IP network interfaces and persistent IP
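XXXXXXXXXX currently has two P670 servers, each divided into seven partitions. Two of these partitions, across the two machines, are paired as a two-node cluster and run HACMP. The operating system is already installed.
System: AIX 5.2, ML: 5200-04
Storage: rootvg: local SCSI disks.
datavg: SAN ESS shared disks.
datavg holds two LVs, sna400 and exmbill. They contain the scripts that the system crontab invokes automatically, so that the system carries out the functions those scripts provide without manual intervention.
Network interfaces: ent2 (U1.5-P2-I9/E1) and ent3 (U1.5-P2-I2/Q1) are on the same physical network, inside the firewall. ent1 is on a different physical network, outside the firewall.
Serial port: sa1: U1.5-P2-I9/Q1
See the figure below for details:
Given XXXXXXXX's environment, two resource groups are configured in HA. The implementation is described below.
The operating system is already installed; confirm that the following software packages are installed.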
lslpp -l | grep lppname
bos.clvm.enh
bos.data
rsct.basic (rsct.basic.hacmp, rsct.basic.rte, rsct.basic.sp)
rsct.compat.basic (rsct.compat.basic.hacmp, rsct.compat.basic.rte, rsct.compat.basic.sp)
rsct.compat.clients (rsct.compat.clients.hacmp, rsct.compat.clients.rte, rsct.compat.clients.sp)
bos.perf.tools
perfagent.tools
bos.adt.syscalls
bos.adt.libm
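Checking the list by eye with grep gets tedious on two nodes, so the check could be scripted roughly as follows. This is only a sketch: check_filesets is a hypothetical helper name, and it assumes that lslpp -l exits non-zero for a fileset that is not installed.

```shell
# check_filesets (hypothetical helper): report every fileset in the
# argument list that "lslpp -l" cannot find on this node.
check_filesets() {
    for fs in "$@"; do
        # Assumption: lslpp -l returns non-zero when the fileset is absent.
        if ! lslpp -l "$fs" >/dev/null 2>&1; then
            echo "MISSING: $fs"
        fi
    done
}
```

Called as check_filesets bos.clvm.enh bos.data bos.perf.tools perfagent.tools bos.adt.syscalls bos.adt.libm on each node, it prints nothing when everything is present.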
Extend the /usr file system in preparation for the ML upgrade.
Resize /tmp, /var, and / as appropriate:
chfs -a size=+nM /
smitty chlicense Change Number of Licensed Users
smitty chtz Change Time Zone
smitty chps Change pagespaces
smitty chaio Change Characteristics of Asynchronous I/O
smitty chgsys Change Characteristics of Operating System
Maximum number of PROCESSES allowed per user
Maximum number of pages in block I/O BUFFER CACHE
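Download maintenance level ML04 and upgrade the system. Download the maintenance package 520004.tar.gz to /tmp or /usr/sys/inst.images, run the following commands in order, and then reboot the system:
gzip -d -c 520004.tar.gz | tar -xvf -
inutoc /usr/sys/inst.images
installp -acgXd /usr/sys/inst.images bos.rte.install
smit update_all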
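After the reboot it is worth confirming that the node really reports the expected maintenance level. A minimal sketch, assuming the level string has the form printed by oslevel -r (for example 5200-04); ml_at_least is a hypothetical helper name.

```shell
# ml_at_least (hypothetical helper): succeed when the maintenance level
# string in $1 (e.g. the output of "oslevel -r") belongs to the same
# release as $2 and its ML number is at least that of $2.
ml_at_least() {
    cur_rel=${1%%-*}; cur_ml=${1#*-}
    req_rel=${2%%-*}; req_ml=${2#*-}
    [ "$cur_rel" = "$req_rel" ] && [ "$cur_ml" -ge "$req_ml" ]
}
```

On a node this might be used as: ml_at_least "$(oslevel -r)" 5200-04 || echo "ML04 not applied".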
1. Set the IP addresses on both machines
(1) On p
A. Configure the first boot adapter
->#smitty
->Communications Applications and Services
->TCP/IP
->Minimum Configuration & Startup
select en1
* HOSTNAME [p
* Internet ADDRESS (dotted decimal) [
Network MASK (dotted decimal) [255.255.255.0]
* Network INTERFACE en1
NAMESERVER
Internet ADDRESS (dotted decimal) []
DOMAIN Name []
Default GATEWAY Address []
B. Configure the second boot adapter
->#smitty
->Communications Applications and Services
->TCP/IP
->Minimum Configuration & Startup
select en3
* HOSTNAME [p
* Internet ADDRESS (dotted decimal) [192.168.10.35]
Network MASK (dotted decimal) [255.255.255.0]
* Network INTERFACE en3
NAMESERVER
Internet ADDRESS (dotted decimal) []
DOMAIN Name []
Default GATEWAY Address []
(2) On p670B:
A. Configure the first boot adapter
->#smitty
->Communications Applications and Services
->TCP/IP
->Minimum Configuration & Startup
select en1
* HOSTNAME [p670B_CSPS]
* Internet ADDRESS (dotted decimal) [
Network MASK (dotted decimal) [255.255.255.0]
* Network INTERFACE en1
NAMESERVER
Internet ADDRESS (dotted decimal) []
DOMAIN Name []
Default GATEWAY Address []
B. Configure the second boot adapter
->#smitty
->Communications Applications and Services
->TCP/IP
->Minimum Configuration & Startup
select en3
* HOSTNAME [p670B_CSPS]
* Internet ADDRESS (dotted decimal) [192.168.10.45]
Network MASK (dotted decimal) [255.255.255.0]
* Network INTERFACE en3
NAMESERVER
Internet ADDRESS (dotted decimal) []
DOMAIN Name []
Default GATEWAY Address []
2. Edit /etc/hosts so that it contains the following:
127.0.0.1 loopback localhost p
192.168.10.35 P
192.168.3.45 P
192.168.4.20 P
192.168.10.45 P670B_CSPS_boot2
192.168.4.21 P670B_per
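The entries below are added only when the second application is added. This keeps HACMP from auto-discovering extra network interfaces while the first application is being configured, which would interfere with that configuration.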
#172.16.10.1 P
#172.16.10.2 P670B_CSPS_boot3
#21.136.67.77 P670B_CSPS
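Because HACMP relies on these labels resolving identically on both nodes, a quick sanity check of the file can catch copy-and-paste slips. The sketch below only looks for an address that appears on more than one active line; dup_host_ips is a hypothetical helper, and the labels used in any test data are made up.

```shell
# dup_host_ips (hypothetical helper): print each IP address that appears on
# more than one active (non-comment) line of the hosts file given as $1.
dup_host_ips() {
    awk '!/^[ \t]*#/ && NF >= 2 { seen[$1]++ }
         END { for (ip in seen) if (seen[ip] > 1) print ip }' "$1" | sort
}
```

Running dup_host_ips /etc/hosts on each node should print nothing; commented-out entries such as the three above are ignored.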
Log in as root
smitty tty
Select "Add a TTY"
Move cursor to desired item and press Enter.
List All Defined TTYs
Add a TTY
Move a TTY to Another Port
Change / Show Characteristics of a TTY
Remove a TTY
Configure a Defined TTY
Generate Error Report
Trace a TTY
Add a TTY
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[TOP] [Entry Fields]
TTY type tty
TTY interface rs232
Description Asynchronous Terminal
Parent adapter sa1
* PORT number [0] +
Enable LOGIN disable +
BAUD rate [9600]
PARITY [none] +
BITS per character [8] +
Number of STOP BITS [1] +
TIME before advancing to next port setting [0] +
TERMINAL type [dumb]
FLOW CONTROL to be used [xon]
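Do the same configuration on the other system.
On both hosts, run lsdev -Cc tty to see the newly added serial port; the output looks similar to this:
tty1 Available 00-00-S1-00 Asynchronous Terminal
No. | Host | Action
1 | Host 1 | stty
2 | Host 2 | stty. Output should now appear on the command line of both hosts; otherwise the tty configuration has failed. Example: speed 9600 baud; -parity hupcl eol2 = ^? brkint -inpck -istrip icrnl -ixany ixoff onlcr tab3 echo echoe echok
3 | Host 1 | cat /etc/hosts > /dev/tty1
4 | Host 2 | cat < /dev/tty1. The contents of host 1's /etc/hosts should now appear on host 2's command line; otherwise the tty configuration has failed.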
Confirm that RSCT is installed on the system.
The following packages must also be installed:
• bos.adt.lib
• bos.adt.libm
• bos.adt.syscalls
• bos.net.tcp.client
• bos.net.tcp.server
• bos.rte.SRC
• bos.rte.libc
• bos.rte.libcfg
• bos.rte.libcur
• bos.rte.libpthreads
• bos.rte.odm
If you intend to configure concurrent resource groups, also install the following packages:
• bos.rte.lvm.rte
• bos.clvm.enh
smitty installp
->Install Software
->INPUT device / directory for software [/dev/cd0]
SOFTWARE to install [do not select cluster.haview,
cluster.hativoli,
cluster.man.en_US.haview]
ACCEPT new license agreements? Yes
Install the software. As a rule, install every HACMP package except the haview and NetView (Tivoli) ones.
3 Apply patches
The latest HACMP patch at present is:
IY53044 - Latest HACMP for AIX R510 Fixes as of January 2004
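Note: do not skip patching HACMP. Patches matter a great deal for HACMP, because many known defects have already been fixed in them. Sometimes, even after HACMP has been installed and configured strictly by the correct procedure, takeover fails, IP takeover misbehaves, machines go down on their own, or other odd problems appear, and these are often patch-related. So pay close attention to this step.
4 Reboot the machine
In HACMP 5.1, for security reasons, the /.rhosts file is no longer used to control the exchange of commands and data between the two machines. A new daemon, clcomd, was introduced instead. A line is added at the end of /etc/inittab: clcomdES:2:once:startsrc -s clcomdES >/dev/console 2>&1 .
After the reboot, ps -ef | grep clcomd therefore shows: root 12908 6478 0 Apr 12 - 0:21 /usr/es/sbin/cluster/clcomd -d , which confirms that the process has started.
HACMP 5.1 uses the file /usr/es/sbin/cluster/etc/rhosts in place of /.rhosts.
Note: if communication between the two nodes runs into problems, check the rhosts file, or edit it to add the network information of both nodes.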
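When inter-node communication fails, a first check is whether every boot address is actually listed in the rhosts file that clcomd reads. A small sketch under that assumption; missing_rhosts_entries is a hypothetical helper and the addresses in the usage line are the boot2 addresses configured earlier.

```shell
# missing_rhosts_entries (hypothetical helper): $1 is the rhosts file used by
# clcomd; the remaining arguments are addresses that should be listed in it.
# Prints each address that has no exact-match line in the file.
missing_rhosts_entries() {
    file=$1; shift
    for addr in "$@"; do
        grep -Fqx "$addr" "$file" || echo "$addr"
    done
}
```

For example: missing_rhosts_entries /usr/es/sbin/cluster/etc/rhosts 192.168.10.35 192.168.10.45. The daemon itself can be checked with lssrc -s clcomdES.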
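Before HA was configured, the customer's machines were already in production. The two partitions on the two P670s were already attached to the ESS storage, the LUNs had been carved out, and the volume group datavg with its logical volumes and file systems had been created.
To suit HACMP, configure the shared storage as follows:
Ø On each host, run ls -l /dev/datavg and check whether the VG's major and minor numbers are the same on both sides
Ø On each host, run lvlstmajor to find the lowest major number that is free on both sides
Ø Export the shared volume group on both hosts
Ø On each host, run importvg -V [major] -y datavg hdisk (-V takes only the major number)
Ø On each host, run chvg -a n datavg
Ø Vary off the shared volume group on each host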
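The major-number comparison in the first step can be made less error-prone by extracting the number instead of reading it off the screen. A minimal sketch; dev_major is a hypothetical helper that parses one ls -l line, where the major number is the field written with a trailing comma.

```shell
# dev_major (hypothetical helper): read one "ls -l" line for a device file on
# stdin and print its major number, i.e. the field written as "NN,".
dev_major() {
    awk '{ for (i = 1; i <= NF; i++)
               if ($i ~ /^[0-9]+,$/) { sub(/,$/, "", $i); print $i; exit } }'
}
```

Running ls -l /dev/datavg | dev_major on each node should print the same value on both; if it does not, re-import the volume group with a major number that lvlstmajor reports as free on both nodes.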
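On both application partitions, create the script files startapp.sh and stopapp.sh under /usr/sbin/cluster/script, then make them executable with chmod +x startapp.sh stopapp.sh.
For debugging purposes these scripts contain only a banner statement for now. Once debugging is finished, the script contents will be edited to reflect the real application.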
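Such placeholder scripts could be generated like this. It is only a sketch: make_stub_script is a hypothetical helper, and the #!/bin/ksh line and the exit 0 are my assumptions about what a harmless stub should contain.

```shell
# make_stub_script (hypothetical helper): write a placeholder start/stop
# script to path $1 that only prints message $2 with banner, then make it
# executable.
make_stub_script() {
    cat > "$1" <<EOF
#!/bin/ksh
# Placeholder used while testing failover; replace with the real commands.
banner "$2"
exit 0
EOF
    chmod +x "$1"
}
```

For example: make_stub_script /usr/sbin/cluster/script/startapp.sh "app start", and likewise for stopapp.sh, on both nodes.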
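As of HA 5.1, /.rhosts is no longer used for this control; the clcomd daemon is used instead. Once the HACMP software is installed, /etc/inittab contains the added line: clcomdES:2:once:startsrc -s clcomdES >/dev/console 2>&1. If the daemon is not running, either reboot the OS or start it by hand.
Use mktcpip to configure the IP addresses of the two adapters on each node.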
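For reference, the command-line form that corresponds to the Minimum Configuration & Startup panel can be previewed before it is run. The sketch below only prints the command; mktcpip_cmd is a hypothetical helper, and the values in the usage line are the ones entered for en3 on p670B_CSPS.

```shell
# mktcpip_cmd (hypothetical helper): print, without running it, the mktcpip
# command for a hostname, address, netmask and interface.
mktcpip_cmd() {
    echo "mktcpip -h $1 -a $2 -m $3 -i $4"
}
```

mktcpip_cmd p670B_CSPS 192.168.10.45 255.255.255.0 en3 prints the command line to review; run the printed command as root once it looks right.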
Run a ping test against each of P
After the adapters are configured, check /etc/hosts again.
Use smitty hacmp to add the cluster and the nodes.
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Add Nodes to an HACMP Cluster
Configure Resources to Make Highly Available
Configure HACMP Resource Groups
Verify and Synchronize HACMP Configuration
Display HACMP Configuration
* Cluster Name [cluster1]
New Nodes (via selected communication paths) [P
Currently Configured Node(s)
Press Enter to run it. The system discovers the HACMP resources on its own and displays the following:
...
IP Network Discovery completed normally
Current cluster configuration:
No resource groups defined
Cluster Description of Cluster: app1
Cluster Security Level: Standard
There are 2 node(s) and 1 network(s) defined
NODE P
Network net_ether_01
P
P
NODE P670B_CSPS_:
Network net_ether_01
P670B_CSPS_boot1
P670B_CSPS_boot2 192.168.10.45
...
#smitty hacmp
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Add Nodes to an HACMP Cluster
Configure Resources to Make Highly Available
Configure HACMP Resource Groups
Verify and Synchronize HACMP Configuration
Display HACMP Configuration
Configure Service IP Labels/Addresses
Configure Application Servers
Configure Volume Groups, Logical Volumes and Filesystems
Configure Concurrent Volume Groups and Logical Volumes
Add a Service IP Label/Address
Change/Show a Service IP Label/Address
Remove Service IP Label(s)/Address(es)
* IP Label/Address [P
Network Name [net_ether_01]
#smitty hacmp
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Add Nodes to an HACMP Cluster
Configure Resources to Make Highly Available
Configure HACMP Resource Groups
Verify and Synchronize HACMP Configuration
Display HACMP Configuration
Configure Service IP Labels/Addresses
Configure Application Servers
Configure Volume Groups, Logical Volumes and Filesystems
Configure Concurrent Volume Groups and Logical Volumes
Add an Application Server
Change/Show an Application Server
Remove an Application Server
* Server Name [app1]
* Start Script [/usr/sbin/cluster/script/startapp.sh]
* Stop Script [/usr/sbin/cluster/script/stopapp.sh]
#smitty hacmp
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Add Nodes to an HACMP Cluster
Configure Resources to Make Highly Available
Configure HACMP Resource Groups
Verify and Synchronize HACMP Configuration
Display HACMP Configuration
Add a Resource Group
Change/Show a Resource Group
Remove a Resource Group
Change/Show Resources for a Resource Group (standard)
In the pop-up menu select rotating, then continue to the next step
* Resource Group Name [serv1]
* Participating Node Names / Default Node Priority [P
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Add Nodes to an HACMP Cluster
Configure Resources to Make Highly Available
Configure HACMP Resource Groups
Verify and Synchronize HACMP Configuration
Display HACMP Configuration
Add a Resource Group
Change/Show a Resource Group
Remove a Resource Group
Change/Show Resources for a Resource Group (standard)
Resource Group Name serv1
Node Relationship rotating
Site Relationship ignore
Participating Node Names / Default Node Priority P
Dynamic Node Priority []
Service IP label [P
Filesystems (default is All) []
Filesystems Consistency Check fsck
Filesystems Recovery Method sequential
Filesystems/Directories to Export []
Filesystems/Directories to NFS mount []
Network For NFS Mount []
Volume Groups [datavg]
Concurrent Volume groups []
Raw Disk PVIDs []
Connections Services []
Fast Connect Services []
Tape Resources []
Application Servers [app1]
Communication Links []
Primary Workload Manager Class []
Secondary Workload Manager Class []
Miscellaneous Data []
Automatically Import Volume Groups false
Run smit hacmp to configure the tty:
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Discover HACMP-related Information from Configured Nodes
Extended Topology Configuration
Extended Resource Configuration
Extended Event Configuration
Extended Performance Tuning Parameters Configuration
Security and Users Configuration
Snapshot Configuration
Extended Verification and Synchronization
Configure an HACMP Cluster
Configure HACMP Nodes
Configure HACMP Sites
Configure HACMP Networks
Configure HACMP Communication Interfaces/Devices
Configure HACMP Persistent Node IP Label/Addresses
Configure HACMP Global Networks
Configure HACMP Network Modules
Configure Topology Services and Group Services
Show HACMP Topology
Select rs232 -> Discovered Communication Device -> select the tty1 device of both partitions, then press Enter to add them.
Run smit hacmp to configure the persistent IP:
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Discover HACMP-related Information from Configured Nodes
Extended Topology Configuration
Extended Resource Configuration
Extended Event Configuration
Extended Performance Tuning Parameters Configuration
Security and Users Configuration
Snapshot Configuration
Extended Verification and Synchronization
Configure an HACMP Cluster
Configure HACMP Nodes
Configure HACMP Sites
Configure HACMP Networks
Configure HACMP Communication Interfaces/Devices
Configure HACMP Persistent Node IP Label/Addresses
Configure HACMP Global Networks
Configure HACMP Network Modules
Configure Topology Services and Group Services
Show HACMP Topology
Select Add a Persistent Node IP Label/Address
* Node Name p
* Network Name net_ether_01
Node IP Label/Address p
* Node Name p670B_CSPS
* Network Name net_ether_01
Node IP Label/Address p670B_per
Review the output carefully and confirm that it is correct:
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Add Nodes to an HACMP Cluster
Configure Resources to Make Highly Available
Configure HACMP Resource Groups
Verify and Synchronize HACMP Configuration
Display HACMP Configuration
Cluster Description of Cluster: app1
Cluster Security Level: Standard
There are 2 node(s) and 1 network(s) defined
NODE P
Network net_ether_01
P
P
P
P
NODE P670B_CSPS:
Network net_ether_01
P670B_CSPS 192.168.3.45
P670B_CSPS_boot2 192.168.10.45
P670B_CSPS_boot1
P
Resource Group serv1
Behavior rotating
Participating Nodes P
Service IP Label P
Run the synchronization:
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Add Nodes to an HACMP Cluster
Configure Resources to Make Highly Available
Configure HACMP Resource Groups
Verify and Synchronize HACMP Configuration
Display HACMP Configuration
Start HACMP:
#smitty clstart
* Start now, on system restart or both [now]
Broadcast message from (tty) at ...true
...
Check whether the cluster services are running:
lssrc -g cluster
ps -eaf | grep cluster
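To turn the lssrc listing into a yes/no answer, its output can be filtered for subsystems that are not active. A minimal sketch, assuming the usual lssrc layout with one header line and the status in the last column; inactive_subsystems is a hypothetical helper.

```shell
# inactive_subsystems (hypothetical helper): read "lssrc -g <group>" output on
# stdin, skip the header line, and print every subsystem whose last column
# (Status) is not "active".
inactive_subsystems() {
    awk 'NR > 1 && NF > 0 && $NF != "active" { print $1 }'
}
```

lssrc -g cluster | inactive_subsystems prints nothing when every subsystem in the group is active.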
View the HA startup log:
tail -f /tmp/hacmp.out
Run /usr/es/sbin/cluster/clstat to view the cluster status; every state was reported as normal.
clstat - HACMP Cluster Status Monitor
Cluster: cluster1
Thurs Oct 28 11:28:41 BEIDT 2004
State: UP Nodes: 2
SubState: STABLE
Node: P
Interface: P
State: UP
Interface: P
State: UP
Interface: P
State: UP
Interface: P
State: UP
Resource Group: serv1 State: On line
Node: P670B_CSPS State: UP
Interface: P670B_CSPS_boot1 (1) Address:
State: UP
Interface: P670B_CSPS_boot2 (1) Address: 192.168.10.45
State: UP
Interface: P670B_per Address:192.168.3.21
State: UP
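On this basis I went on to run some basic tests of the cluster functions, and the results fully met expectations.
Ø Starting and stopping HACMP on the primary and standby systems: the resource groups start and stop normally
Ø Resource group switchover from the primary application system, taking over to the standby application system: test passed
Ø Resource group switchover from the standby application system, taking over to the primary application system: test passed
Ø Abnormal shutdown of the primary and of the standby with reboot -q: clstat shows that the applications and resource groups on the failed node moved to the other node successfully, and after HACMP is restarted on the failed node the resources fall back normally
Ø Unplugging the fibre cable from the service adapter on each host: the service IP automatically moves to the other boot adapter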
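The shared volume group datavg in the first resource group contains two LVs and file systems, sna400 and exmbill, which are used by crontab scripts. The IP for this service cannot be bound to the same adapter as the first service IP, because the two adapters sit on opposite sides of the firewall. So here we add another resource group that only moves a service IP address. The crontab scripts that run in the background can simply be copied identically to both hosts.
First back up the file systems above to tape with a backup tool such as tar or backup:
Then, in rootvg on the standby machine, create LVs of the same size as the ones above and create file systems on them.
smitty mklv
smitty crjfs
Then restore the data into the newly created file systems.
Once the application has switched over to machine B, create the same LVs and file systems on machine A and restore the backed-up crontab scripts into them.
Finally, delete the two now-redundant LVs from datavg.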
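The backup and restore steps might look like the sketch below when done with tar. backup_fs and restore_fs are hypothetical helpers; the archive is assumed to be given as an absolute path, which could be an ordinary file or a tape device.

```shell
# backup_fs / restore_fs (hypothetical helpers): archive the contents of a
# mounted file system with tar and unpack them into another directory.
# The archive path must be absolute because both helpers change directory.
backup_fs() {   # $1 = source directory, $2 = archive
    (cd "$1" && tar -cf "$2" .)
}
restore_fs() {  # $1 = archive, $2 = target directory
    (cd "$2" && tar -xf "$1")
}
```

Archiving relative to the mount point keeps the paths inside the archive independent of where the new file system is mounted.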
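With the HACMP configuration above complete, the first service starts and switches over normally. The service address is 192.168.3.35. The persistent management IPs are 192.168.4.20 on machine A and 192.168.4.21 on machine B.
Next we add the second application, configuring the second network and resource group as required.
It is best to stop HACMP while adding the second network and resource group (although doing it online should also be possible).
Steps for adding the second network:
(1) On both hosts, uncomment the /etc/hosts entries for the second network's IPs, then run network discovery: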
#172.16.10.1 P
#172.16.10.2 P670B_CSPS_boot3
#21.136.67.77 P670B_CSPS
smit hacmp --> Extended Configuration --> Discover HACMP-related
Information from Configured Nodes
(2) Add the network:
smit hacmp --> Extended Configuration --> Extended Topology
Configuration --> Configure HACMP Networks --> Add a Network to the
HACMP Cluster --> Select a Network Type: under # Discovered IP-based Network
Types choose ether --> press Enter to create net_ether_02.
(3) Add the interfaces:
smit hacmp --> Extended Configuration --> Extended Topology
Configuration --> Configure HACMP Communication Interfaces/Devices -->
Add Communication Interfaces/Devices --> Select a category: choose Add
Discovered Communication Interface and Devices --> Select a category: choose
Communication Interfaces --> Select a Network Name: choose net_ether_02 -->
select one of the interfaces
Add the other interface in the same way.
(4) Add the service IP
smit hacmp --> Extended Configuration --> Extended Resource
Configuration --> HACMP Extended Resources Configuration --> Configure
HACMP Service IP Labels/Addresses --> Add a Service IP Label/Address -->
Select a Service IP Label/Address type: choose Configurable on Multiple Nodes
--> Network Name: choose net_ether_02 (21.136.67.0/24) --> select the second service IP
(5) Create a resource group
smit hacmp --> Extended Configuration --> Extended Resource
Configuration --> HACMP Extended Resource Group Configuration --> Add a
Resource Group --> fill in the settings and create the resource group
smit hacmp --> Extended Configuration --> Extended Resource
Configuration --> HACMP Extended Resource Group Configuration -->
Change/Show Resources and Attributes for a Resource Group --> select the new
resource group and add the second service IP to it
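(6) This step is optional:
In /usr/es/sbin/cluster/netmon.cf on every node, add a reliable address on the same
subnet as the boot addresses of the second network. For example, if the boot subnet is
192.168.10.x, add a line containing 192.168.10.100 (a reliable IP address on that subnet) to netmon.cf.
(7) When the configuration is complete, run one final synchronization with Extended Verification and Synchronization.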
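The netmon.cf edit from step (6) has to be repeated on every node, so an idempotent helper avoids duplicate lines if it is run twice. A minimal sketch; add_netmon_target is a hypothetical helper, and 192.168.10.100 is the example address from the text.

```shell
# add_netmon_target (hypothetical helper): append address $2 to the
# netmon.cf file $1 unless an identical line is already present.
add_netmon_target() {
    touch "$1"
    grep -Fqx "$2" "$1" || echo "$2" >> "$1"
}
```

For example: add_netmon_target /usr/es/sbin/cluster/netmon.cf 192.168.10.100, run on each node before the final synchronization.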
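Run /usr/es/sbin/cluster/clstat again to view the cluster status. Every state is normal, and the second service IP, 21.136.67.77 P670B-CSPS, starts normally.
The switchover test succeeded. http://netyu.cublog.cn
If anything above touches on legal issues or company confidentiality, please contact me promptly and I will amend or remove it.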