2006-09-26 14:11:22
Using the Disk Bus as a Heartbeat Network in HACMP 5.1
In certain situations RS232, tmssa, and tmscsi connections are considered too costly or complex to set up. Heartbeating via disk (diskhb) provides users with:
A point-to-point network type that is very easy to configure
Additional protection against cluster partitioning
A point-to-point network type that can use any disk-type to form a data path
A setup that does not require additional hardware; it can use a disk that is also used for data and included in a resource group
In order to support SSA concurrent VGs, there is a small space reserved on every disk for use in clvmd communication. Enhanced concurrent VGs do not use the reserved space for communication; instead, they use the RSCT group services.
Disk heartbeating uses a reserved disk sector (that has been reserved for SSA concurrent mode VGs) as a zone where nodes can exchange keep alive messages.
Any disk that is part of an enhanced concurrent VG can be used for a diskhb network, including those used for data storage. Moreover, the VG that contains the disk used for a diskhb network does not have to be varied on.
Any disk type may be configured as part of an enhanced concurrent VG, making this network type extremely flexible. For more information on configuring a disk heartbeat network, see Chapter 3, "Planning Cluster Network Connectivity", in the HACMP for AIX
1. Because the disk is accessed concurrently, the volume group must be created as concurrent capable. AIX 5.2 supports only Enhanced Concurrent volume groups:
# mkvg -C -n -y datavg hdisk1
0516-1335 mkvg: This system does not support enhanced
concurrent capable volume groups.
The real cause of the error is that bos.clvm.enh is not installed:
# lslpp -L bos.clvm.enh
Fileset Level State Type Description (Uninstaller)
----------------------------------------------------------------------------
bos.clvm.enh
After installing the fileset, rerunning mkvg succeeds.
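The error above can be avoided by checking for the fileset before creating the volume group. The following is a minimal sketch, not a tested procedure: the function name create_ecm_vg is hypothetical, and it assumes that lslpp -L exits with a non-zero status when the fileset is absent. The mkvg flags are the ones used in this article.

```shell
# Hypothetical helper: create an enhanced concurrent VG only if bos.clvm.enh is present.
# Assumption: "lslpp -L <fileset>" returns non-zero when the fileset is not installed.
create_ecm_vg() {
    vg_name=$1
    disk=$2
    if lslpp -L bos.clvm.enh >/dev/null 2>&1; then
        # Same flags as the mkvg command shown in this article.
        mkvg -C -n -y "$vg_name" "$disk"
    else
        echo "bos.clvm.enh is not installed; install it before running mkvg -C" >&2
        return 1
    fi
}
```

It would be called as create_ecm_vg datavg hdisk1.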
hdisk1 000234ff7d008e
2. Configuring the disk heartbeat
Notes:
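When configuring HACMP, you can set up heartbeat networks not only on TCP/IP networks but also on other kinds of networks, such as serial networks and the disk bus. A heartbeat network over the disk bus gives HACMP nodes an additional means of communication when TCP/IP network resources are limited, and it keeps the nodes from losing contact with each other because of a problem in the TCP/IP software. This way, even without a serial network, the TCP/IP software does not become a single point of failure.
When the disk bus is used as a heartbeat network, a small non-data area on an HACMP shared disk serves as the communication medium. Each of the two HACMP nodes writes messages into its own assigned region of this non-data area and reads messages from the other node's region.
In versions before HACMP 5.1, signaling over the disk bus was limited to HACMP clusters that used Target Mode SCSI or Target Mode SSA. Starting with HACMP 5.1, any type of shared disk device supported by HACMP can be used to form a heartbeat network.
Each heartbeat path in a disk heartbeat network must be configured individually in HACMP. A heartbeat path consists of two HACMP nodes and one shared disk.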
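A shared disk that joins a disk heartbeat network must be configured in Enhanced Concurrent Mode so that both HACMP nodes can access it at the same time.
Enhanced Concurrent Mode is a concurrent disk access mode introduced in AIX 5.1. Any shared disk device supported by HACMP can be configured in Enhanced Concurrent Mode, which provides all the functions that SSA concurrent mode provided. In Enhanced Concurrent Mode, CLVM coordinates each node's access to the shared disks through the RSCT component of AIX.
Every concurrent volume group created on AIX 5.1 or later is automatically created in Enhanced Concurrent Mode. An existing SSA concurrent volume group can be converted to Enhanced Concurrent Mode with the chvg -C command. When the 64-bit kernel is running, Enhanced Concurrent Mode is the only concurrent mode that HACMP supports.
Before configuring a disk heartbeat network, set all of the shared volume groups involved to Enhanced Concurrent Mode.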
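Once the basic HACMP configuration is complete, run smitty hacmp to configure the disk heartbeat network. Select Extended Configuration-> Extended Topology Configuration->Configure HACMP Networks->Add a Network to the HACMP Cluster, then use the up and down arrow keys to select diskhb: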
Configure HACMP Networks
│ Select a Network Type │
│ │
│ Move cursor to desired item and press Enter. │
│ │
│ [TOP] │
│ # Discovery last performed: (Apr 05 16:46) │
│ # Discovered IP-based Network Types │
│ │
│ # Discovered Serial Device Types │
│ diskhb │
│ rs232 │
│ │
│ # Pre-defined IP-based Network Types │
│ atm │
│ ether │
│ [MORE...9] │
│ │
│ F1=Help F2=Refresh F3=Cancel │
│ F8=Image F10=Exit Enter=Do │
│ /=Find n=Find Next │
└─────────────────────────────────────────
Next, enter a name for the network or press Enter to accept the system default:
Add a Serial Network to the HACMP Cluster
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
* Network Name [net_diskhb_02]
* Network Type diskhb
Then go to the menu Extended Configuration-> Extended Topology Configuration->Configure HACMP Communication Interfaces/Devices->Add Communication Interfaces/Devices, and select Add Discovered Communication Interface and Devices->Communication Devices. The following screen appears. Use the arrow keys and F7 to select the same shared disk device on both nodes (as shown), then press Enter to continue.
Select Point-to-Point Pair of Discovered Communication Devices to Add │
│ │
│ Move cursor to desired item and press F7. Use arrow keys to scroll. │
│ ONE OR MORE items can be selected. │
│ Press Enter AFTER making all selections. │
│ │
│ # Node Device Device Path Pvid │
│> test
│> test650b1 hdisk5 /dev/hdisk5
│ test
│ test650b1 tty0 /dev/tty0 │
│ test
│ test650b1 tty1 /dev/tty1 │
│ │
│ F1=Help F2=Refresh F3=Cancel │
│ F7=Select F8=Image F10=Exit │
│ Enter=Do /=Find n=Find Next │
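Next, synchronize the HACMP topology and resource settings. After HACMP starts, you can monitor the status of the disk heartbeat network with tools such as clstat.
3. Testing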
To test a disk heartbeat network, do the following steps:
Ensure that the PVID of the disk is identical on both nodes of the connection
Ensure that the disk is defined as a member of an enhanced concurrent volume group on both cluster nodes
Verify that you have installed the correct version of the bos.clvm.enh and RSCT filesets.
Run the command /usr/es/sbin/cluster/utilities/clrsctinfo -cp cllsif|grep diskhb and verify that the nodes' synchronization was successful. We used hdisk1 to define the disk heartbeat network between cluster nodes one and two, as the following sample output shows:
//////Clrsctinfo output sample
F80/#clrsctinfo -cp cllsif|grep diskhb
F80_hdisk1_01:service:net_diskhb_01:diskhb:serial:F80:/dev/rhdisk1::hdisk1::
b50_hdisk1_01:service:net_diskhb_01:diskhb:serial:b50:/dev/rhdisk1::hdisk1::
b50/#clrsctinfo -cp cllsif|grep diskhb
F80_hdisk1_01:service:net_diskhb_01:diskhb:serial:F80:/dev/rhdisk1::hdisk1::
b50_hdisk1_01:service:net_diskhb_01:diskhb:serial:b50:/dev/rhdisk1::hdisk1::
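The check on this output can be scripted. The sketch below reads the colon-separated lines from standard input and reports whether each diskhb network has exactly two devices, as a point-to-point path requires. The function name check_diskhb_pairs is hypothetical, and the field positions (field 3 as the network name, field 4 as the network type) are an assumption read off the sample output above, not taken from documentation.

```shell
# Hypothetical helper: verify that every diskhb network lists exactly two devices.
# Assumption: field 3 is the network name and field 4 the network type,
# as the sample clrsctinfo output in this article suggests.
check_diskhb_pairs() {
    awk -F: '
        $4 == "diskhb" { count[$3]++ }
        END {
            seen = 0
            bad = 0
            for (net in count) {
                seen = 1
                if (count[net] == 2) {
                    print net " OK"
                } else {
                    print net " has " count[net] " device(s), expected 2"
                    bad = 1
                }
            }
            # Fail when no diskhb network was found or any network is incomplete.
            if (!seen || bad) exit 1
        }
    '
}
```

On a cluster node it could be fed directly, for example clrsctinfo -cp cllsif | check_diskhb_pairs.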
On node 1, run the command /usr/sbin/rsct/bin/dhb_read -p /dev/hdisk1 -r. Your result should be similar to the following:
//////Disk heartbeat receive
F80/#dhb_read -p /dev/hdisk1 -r
Receive Mode:
Waiting for response . . .
Link operating normally
On node 2, run the command /usr/sbin/rsct/bin/dhb_read -p /dev/hdisk1 -t
Your result should be similar to the following:
//////
Disk heartbeat transmit
b50/#dhb_read -p /dev/hdisk1 -t
Transmit Mode:
Detected remote utility in receive mode. Waiting for response . . .
Link operating normally
Repeat the test in the opposite direction.
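When the test is repeated in both directions, the result line can be checked in a script instead of by eye. This is a small sketch under one assumption: that a successful run prints the text "Link operating normally", as in the sample output above. The function name link_is_normal is hypothetical.

```shell
# Hypothetical helper: succeed if dhb_read output on stdin reports a healthy link.
# Assumption: the success message is exactly the one shown in this article's samples.
link_is_normal() {
    grep -q 'Link operating normally'
}

# Possible usage on the receiving node (start the transmit side on the other node):
#   /usr/sbin/rsct/bin/dhb_read -p /dev/hdisk1 -r | link_is_normal && echo "diskhb link OK"
```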
Go to the directory /var/ha/log
If your cluster is named longcredit and you used disk 1, run the command tail -f nim.topsvcs.rhdisk1.longcredit
The output of your command should be similar to the following:
/////
Example 4-15 Sample log for heartbeat over disk
#F80/var/ha/log#tail -f /var/ha/log/nim.topsvcs.rhdisk1.longcredit
04/24 23:45:12.614: Received a SEND MSG command. Dst: .
04/24 23:45:27.080: Received a SEND MSG command. Dst: .
04/24 23:45:43.670: Received a SEND MSG command. Dst: .
04/24 23:46:00.212: Received a SEND MSG command. Dst: .
04/24 23:46:18.806: Received a SEND MSG command. Dst: .
04/24 23:46:39.452: Received a SEND MSG command. Dst: .
04/24 23:47:00.104: Received a SEND MSG command. Dst: .
04/24 23:47:22.808: Received a SEND MSG command. Dst: .
04/24 23:47:45.524: Received a SEND MSG command. Dst: .
04/24 23:48:10.311: Received a SEND MSG command. Dst: .
04/24 23:48:35.097: Received a SEND MSG command. Dst: .
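The first check in the procedure above, that the PVID of the disk is identical on both nodes, can also be sketched in shell. The helper below extracts the PVID of one disk from lspv-style output on standard input. The function name pvid_of is hypothetical, and it assumes the disk name is in the first column and the PVID in the second.

```shell
# Hypothetical helper: print the PVID of the named disk from lspv-style output on stdin.
# Assumption: column 1 is the disk name and column 2 is the PVID.
pvid_of() {
    awk -v disk="$1" '$1 == disk { print $2; exit }'
}

# Possible comparison between two nodes (remote shell access is an assumption):
#   local_pvid=$(lspv | pvid_of hdisk1)
#   remote_pvid=$(ssh b50 lspv | pvid_of hdisk1)
#   [ -n "$local_pvid" ] && [ "$local_pvid" = "$remote_pvid" ] && echo "PVIDs match"
```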