Web high availability with pacemaker + corosync + crmsh
Environment:
node1:192.168.85.144
node2:192.168.85.145
nfs:192.168.85.143
VIP:192.168.85.128
Cluster resources: VIP, httpd
Resource agents: OCF IPaddr, LSB httpd
I. Environment preparation (details omitted; a sketch of these steps follows the list)
1. All hosts can reach each other;
2. Node hostnames are set;
3. The nodes can reach each other by hostname;
4. Passwordless SSH login between the two nodes;
5. Time is synchronized;
6. The NFS server is configured.
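A minimal sketch of the preparation, run on both nodes unless noted; the hostnames and addresses come from the environment above, the NTP server is a placeholder:
[root@node1 ~]# hostname node1.a.com # and make it persistent in /etc/sysconfig/network
[root@node1 ~]# cat >> /etc/hosts << EOF
192.168.85.144 node1.a.com node1
192.168.85.145 node2.a.com node2
192.168.85.143 nfs
EOF
[root@node1 ~]# ssh-keygen -t rsa # accept the defaults
[root@node1 ~]# ssh-copy-id root@node2 # repeat from node2 towards node1
[root@node1 ~]# ntpdate <ntp-server> # or any other time-sync method
The NFS export itself is sketched in section VI.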
II. Install pacemaker and corosync
1. Install httpd on both nodes, give each a test page, and make sure each node can fetch the other's page (a sketch follows). Then stop httpd and disable it at boot, since the cluster will manage it:
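A sketch of this step, assuming each node's page simply contains its own hostname (which matches the curl output below):
[root@node1 ~]# yum install httpd
[root@node1 ~]# echo "node1.a.com" > /var/www/html/index.html # on node2: echo "node2.a.com" > /var/www/html/index.html
[root@node1 ~]# service httpd start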
[root@node1 ~]# curl
node2.a.com
[root@node1 ~]# service httpd stop
Stopping httpd: [ OK ]
[root@node1 ~]# chkconfig httpd off
[root@node2 ~]# curl
node1.a.com
[root@node2 ~]# service httpd stop
Stopping httpd: [ OK ]
[root@node2 ~]# chkconfig httpd off
2. Install the packages (on both nodes)
[root@node1 ~]# yum install pacemaker corosync
III. Configuration
1. Configure corosync on node1
[root@node1 ~]# cd /etc/corosync/
[root@node1 corosync]# cp -p corosync.conf.example corosync.conf
[root@node1 corosync]# cat corosync.conf | egrep -v '#'
compatibility: whitetank # stay compatible with versions prior to 0.8
totem { # totem: defines how the cluster nodes communicate and with which parameters
version: 2 # totem protocol version, do not change
secauth: on # enable authentication of cluster traffic
threads: 0 # number of threads used for authentication; 0 means the default
interface { # sub-block: the interface used for heartbeat traffic
ringnumber: 0 # redundant-ring number; give each ring its own number when a node has several NICs, so heartbeat traffic does not loop
bindnetaddr: 192.168.85.0 # network the heartbeat packets are sent on
mcastaddr: 239.245.4.1 # multicast address used for the heartbeat packets
mcastport: 5405 # multicast port used for the heartbeat packets
ttl: 1 # time to live: 1 means the packets are not forwarded past one hop; raise it (up to 255) if the nodes sit behind routers, otherwise the other side never receives them
}
}
logging { # logging options
fileline: off # include file and line information in log messages
to_stderr: no # send log output to standard error (the screen)
to_logfile: yes # write to a log file
logfile: /var/log/cluster/corosync.log # log file location
to_syslog: no # also log to syslog; usually one of to_logfile/to_syslog is enough
debug: off # enable debug output
timestamp: on # prefix log entries with a timestamp
logger_subsys { # logging sub-system
subsys: AMF # log the AMF subsystem
debug: off # AMF debug output
}
}
service {
name: pacemaker # load the pacemaker plugin
ver: 1 # with ver: 1 corosync does NOT start pacemaker itself; pacemaker is started separately below (ver: 0 would start it together with corosync)
}
#aisexec { # user and group the AIS components run as; the default is root:root, so this block can stay commented out
#user: root
#group: root
#}
2. On node1, generate the authentication key used for inter-node communication
corosync-keygen reads its random data from /dev/random, so a freshly installed, mostly idle system may not have enough entropy; normally you just keep hammering the keyboard until enough entropy has been gathered.
This lab box does not have enough entropy, so a shortcut is used here instead:
[root@node1 corosync]# mv /dev/random /dev/lw
[root@node1 corosync]# ln /dev/urandom /dev/random # point /dev/random at the pseudo-random generator
[root@node1 corosync]# corosync-keygen # generate the key file
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
[root@node1 corosync]# rm -rf /dev/random # remove the link
[root@node1 corosync]# mv /dev/lw /dev/random # restore the real random device
[root@node1 corosync]# ll
-r-------- 1 root root 128 Nov 5 17:34 authkey # note the 400 permissions
-rw-r--r-- 1 root root 2764 Nov 5 17:23 corosync.conf
-rw-r--r-- 1 root root 1073 Jul 24 07:10 corosync.conf.example.udpu
drwxr-xr-x 2 root root 4096 Jul 24 07:10 service.d
drwxr-xr-x 2 root root 4096 Jul 24 07:10 uidgid.d
3. Copy the key file and the configuration file to node2
[root@node1 corosync]# scp -p authkey corosync.conf node2:/etc/corosync/
authkey 100% 128 0.1KB/s 00:00
corosync.conf 100% 2764 2.7KB/s 00:00
4. Install crmsh, the command-line configuration front-end for pacemaker
Starting with RHEL 6.4, the crmsh command-line configuration tool is no longer shipped and pcs is used instead. To keep using the crm command, the packages have to be downloaded and installed manually. crmsh depends on pssh, so download that as well; since there are further dependencies, install everything through yum.
[root@node1 ~]# ll *.rpm
-rw-r--r-- 1 root root 616532 Nov 5 20:24 crmsh-2.1-1.6.i686.rpm
-rw-r--r-- 1 root root 27652 Nov 5 20:25 crmsh-debuginfo-2.1-1.6.i686.rpm
-rw-r--r-- 1 root root 109972 Nov 5 20:25 crmsh-test-2.1-1.6.i686.rpm
[root@node1 ~]# yum install pssh *.rpm
Package download link:
IV. Verify the cluster installation
1. Verify the corosync installation
1.1 Start corosync on the first node:
[root@node1 ~]# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
1.2 Check that the corosync engine started:
[root@node1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Nov 05 21:32:49 corosync [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Nov 05 21:32:49 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
1.3 Check that the initial membership notifications went out
[root@node1 ~]# grep TOTEM /var/log/cluster/corosync.log
Nov 05 21:32:49 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Nov 05 21:32:49 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Nov 05 21:32:49 corosync [TOTEM ] The network interface [192.168.85.144] is now up.
Nov 05 21:32:49 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Note: if the last message ("A processor joined or left the membership ...") appears only once, corosync can now be started on the other node.
Start corosync on node2:
[root@node1 ~]# ssh node2 '/etc/init.d/corosync start'
Starting Corosync Cluster Engine (corosync): [ OK ]
Check that the cluster membership formed correctly:
[root@node1 ~]# grep TOTEM /var/log/cluster/corosync.log
Nov 05 21:32:49 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Nov 05 21:32:49 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Nov 05 21:32:49 corosync [TOTEM ] The network interface [192.168.85.144] is now up.
Nov 05 21:32:49 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 05 21:34:06 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
2. Verify the pacemaker installation
2.1 Now that corosync is confirmed to be working, check the remaining pieces.
[root@node1 ~]# grep pcmk_startup /var/log/cluster/corosync.log
Nov 05 21:32:49 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
Nov 05 21:32:49 corosync [pcmk ] Logging: Initialized pcmk_startup
Nov 05 21:32:49 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 4294967295
Nov 05 21:32:49 corosync [pcmk ] info: pcmk_startup: Service: 9
Nov 05 21:32:49 corosync [pcmk ] info: pcmk_startup: Local hostname: node1.a.com
2.2 Start pacemaker and check the processes it launches
[root@node1 ~]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager[ OK ]
Check that pacemaker started its child daemons:
[root@node1 ~]# grep -e pacemakerd.*get_config_opt -e pacemakerd.*start_child -e "Starting Pacemaker" /var/log/cluster/corosync.log
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: get_config_opt: Found 'no' for option: to_syslog
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: notice: main: Starting Pacemaker 1.1.11 (Build: 97629de): generated-manpages agent-manpages ascii-docs ncurses libqb-logging libqb-ipc nagios corosync-plugin cman acls
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Using uid=189 and group=189 for process cib
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Forked child 30944 for process cib
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Forked child 30945 for process stonith-ng
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Forked child 30946 for process lrmd
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Using uid=189 and group=189 for process attrd
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Forked child 30947 for process attrd
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Using uid=189 and group=189 for process pengine
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Forked child 30948 for process pengine
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Using uid=189 and group=189 for process crmd
Nov 05 21:35:12 [30938] node1.a.com pacemakerd: info: start_child: Forked child 30949 for process crmd
[root@node1 ~]# ps axf
............
30938 pts/1 S 0:00 pacemakerd
30944 ? Ss 0:00 \_ /usr/libexec/pacemaker/cib
30945 ? Ss 0:00 \_ /usr/libexec/pacemaker/stonithd
30946 ? Ss 0:00 \_ /usr/libexec/pacemaker/lrmd
30947 ? Ss 0:00 \_ /usr/libexec/pacemaker/attrd
30948 ? Ss 0:00 \_ /usr/libexec/pacemaker/pengine
30949 ? Ss 0:00 \_ /usr/libexec/pacemaker/crmd
2.3 Check whether errors were produced during startup (the two errors below can be ignored)
[root@node1 ~]# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources
Nov 05 21:32:49 corosync [pcmk ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
Nov 05 21:32:49 corosync [pcmk ] ERROR: process_ais_conf: Please see Chapter 8 of 'Clusters from Scratch' () for details on using Pacemaker with CMAN
2.4 Start pacemaker on the other node
[root@node1 ~]# ssh node2 '/etc/init.d/pacemaker start'
Starting Pacemaker Cluster Manager[ OK ]
3. Check the cluster status
[root@node1 ~]# crm_mon
Last updated: Thu Nov 5 21:39:24 2015
Last change: Thu Nov 5 21:39:09 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
0 Resources configured
Online: [ node1.a.com node2.a.com ]
Both nodes are online and node1 is the DC.
If something goes wrong, see:
http://blog.chinaunix.net/uid-30212356-id-5345348.html
V. Managing web high-availability resources on top of corosync
On using the crm command:
http://blog.chinaunix.net/uid-30212356-id-5345399.html
Overview of the resource concepts used below:
http://blog.chinaunix.net/uid-30212356-id-5333561.html
1. Adjusting the stonith parameter
Disable stonith. pacemaker enables stonith by default, but this setup has no stonith device; if resources are configured anyway, verify reports an error and the configuration cannot be committed.
[root@node1 ~]# crm configure
crm(live)configure# property stonith-enabled=false
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.a.com
node node2.a.com
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false
2. Configure the web cluster
2.1 Define the VIP
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.85.128
This defines a primitive resource named webip. ocf:heartbeat:IPaddr means the resource agent class is ocf, the provider is heartbeat and the agent is IPaddr; params passes the agent parameter ip=192.168.85.128, i.e. the VIP.
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov 6 18:34:52 2015
Last change: Fri Nov 6 18:34:27 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ node1.a.com node2.a.com ]
webip (ocf::heartbeat:IPaddr): Started node1.a.com
2.2 Verify
[root@node1 ~]# ip -o -f inet addr show
1: lo inet 127.0.0.1/8 scope host lo
2: eth1 inet 192.168.85.144/24 brd 192.168.85.255 scope global eth1
2: eth1 inet 192.168.85.128/24 brd 192.168.85.255 scope global secondary eth1
2.3 Configure the httpd resource
[root@node1 ~]# crm configure
crm(live)configure# primitive webserver lsb:httpd
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov 6 18:46:09 2015
Last change: Fri Nov 6 18:45:37 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
Online: [ node1.a.com node2.a.com ]
webip (ocf::heartbeat:IPaddr): Started node1.a.com
webserver (lsb:httpd): Started node2.a.com
The two resources are running on different nodes: by default the cluster balances resources, trying to run different resources on different nodes whenever possible.
3. Define resource constraints
To keep several resources on the same node, either put them into a group or define a colocation constraint.
Constraints specify on which cluster nodes resources may run, in what order they are started, and which resources depend on which other resources.
pacemaker provides three kinds of constraints (a quick syntax sketch follows the list):
1) Resource Location: defines on which nodes a resource may, may not, or preferably should run;
2) Resource Colocation: defines which resources may or may not run together on the same node;
3) Resource Order: defines the order in which resources are started on a node;
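For reference, the three constraint types look roughly like this in crmsh; the ids and scores here are only illustrative, and the constraints actually used in this cluster are created in the subsections below:
crm(live)configure# location loc_webip_on_node1 webip 100: node1.a.com
crm(live)configure# colocation col_webserver_with_webip inf: webserver webip
crm(live)configure# order ord_webip_before_webserver Mandatory: webip webserver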
3.1 Define a group to keep the resources on the same node
Define a resource group named webcluster whose members are the resources listed after the name:
crm(live)# configure
crm(live)configure# group webcluster webip webserver
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.a.com
node node2.a.com
primitive webip IPaddr \
params ip=192.168.85.128
primitive webserver lsb:httpd
group webcluster webip webserver
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov 6 18:55:18 2015
Last change: Fri Nov 6 18:55:04 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
Online: [ node1.a.com node2.a.com ]
Resource Group: webcluster
webip (ocf::heartbeat:IPaddr): Started node1.a.com
webserver (lsb:httpd): Started node1.a.com
3.2 Test in a browser
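The same check can be done from the shell by fetching the VIP; it should return the test page of whichever node currently runs the group (node1.a.com at this point):
[root@node1 ~]# curl http://192.168.85.128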
3.3 Take node1 offline (standby) and check whether the resources move
crm(live)# node
crm(live)node# standby node1.a.com
crm(live)node# cd
crm(live)# status
Last updated: Fri Nov 6 18:57:58 2015
Last change: Fri Nov 6 18:57:38 2015
Stack: classic openais (with plugin)
Current DC: node2.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
Node node1.a.com: standby
Online: [ node2.a.com ]
Resource Group: webcluster
webip (ocf::heartbeat:IPaddr): Started node2.a.com
webserver (lsb:httpd): Started node2.a.com
The resources have moved to node2.
Refresh the browser to confirm.
Bring node1 back online:
crm(live)# node
crm(live)node# online node1.a.com
crm(live)node# cd
crm(live)# status
Last updated: Fri Nov 6 20:41:13 2015
Last change: Fri Nov 6 20:41:02 2015
Stack: classic openais (with plugin)
Current DC: node2.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
Online: [ node1.a.com node2.a.com ]
# node1 is online again; the DC simply stays on node2
Resource Group: webcluster
webip (ocf::heartbeat:IPaddr): Started node2.a.com
webserver (lsb:httpd): Started node2.a.com
4. Define a colocation constraint
4.1 First delete the group
[root@node2 ~]# crm resource
crm(live)resource# stop webcluster
crm(live)resource# cd
crm(live)# configure
crm(live)configure# delete webcluster
crm(live)configure# show
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov 6 19:21:33 2015
Last change: Fri Nov 6 19:21:21 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
Online: [ node1.a.com node2.a.com ]
webip (ocf::heartbeat:IPaddr): Started node1.a.com
webserver (lsb:httpd): Started node2.a.com
The resources are spread across the nodes again.
4.2 Define the colocation constraint
role (optional): each resource can take on different roles; a resource agent manages a resource through actions such as start, stop and promote. role is mostly used in master/slave setups, for example to require that the master side be started before the slave side may start.
crm(live)# configure
Keep webserver together with webip:
crm(live)configure# colocation webserver_and_webip inf: webserver webip
(note: there must be a space after "inf:", otherwise the command is rejected)
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
5. Define an order constraint
For these resources the startup order should be webip first, then webserver, so give them an order constraint.
kind can be Mandatory, Optional or Serialize:
crm(live)configure# order webip_before_webserver mandatory: webip webserver
crm(live)configure# commit
crm(live)configure# show
node node1.a.com \
attributes standby=off
node node2.a.com
primitive webip IPaddr \
params ip=192.168.85.128
primitive webserver lsb:httpd
colocation webserver_and_webip inf: webserver webip # the colocation constraint
order webip_before_webserver Mandatory: webip webserver # the order constraint
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false
6. Define a location constraint
6.1 Define the location constraint
Give node1 a preference score of 100:
crm(live)configure# location webip_on_node1 webip 100: node1.a.com
crm(live)configure# verify
crm(live)configure# commit
If the resources are currently on node2, node1 now has a score of 100 and node2 a score of 0, so the resources move to node1.
If node1 is then stopped, the resources do not move to node2 and are no longer shown; instead the status reads "Current DC: node1.a.com - partition WITHOUT QUORUM", meaning node1 went down and node2 does not have quorum. The default behaviour of a partition without quorum is suicide or stop (with suicide, all resources are killed and nothing is shown as running).
6.2 Define a global property
A two-node cluster is a special case: when node1 dies we still want the resources to move to node2, so set a global property telling the cluster to ignore loss of quorum instead of stopping the resources, keeping them running on the surviving node.
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.a.com \
attributes standby=off
node node2.a.com
primitive webip IPaddr \
params ip=192.168.85.128
primitive webserver lsb:httpd
location webip_on_node1 webip 100: node1.a.com
colocation webserver_and_webip inf: webserver webip
order webip_before_webserver Mandatory: webip webserver
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false \
no-quorum-policy=ignore
Now, when node1 is stopped the resources move to node2, and when node1 comes back online they move back to node1.
When multiple preferences apply to the nodes, the final placement is decided by the sum of all scores each node accumulates (for example, two constraints of 100 and 50 for the same node add up to 150).
Delete the location constraint:
crm(live)configure# edit
In the editor, delete the line containing the location constraint, then save and quit.
crm(live)configure# commit
crm(live)configure# show
node node1.a.com \
attributes standby=off
node node2.a.com
primitive webip IPaddr \
params ip=192.168.85.128
primitive webserver lsb:httpd
colocation webserver_and_webip inf: webserver webip
order webip_before_webserver Mandatory: webip webserver
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false \
no-quorum-policy=ignore
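As a side note, the same result can be had without the editor by deleting the constraint by its id (a sketch; webip_on_node1 is the id defined above):
crm(live)configure# delete webip_on_node1
crm(live)configure# commit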
7. Define default resource stickiness
Stickiness takes effect on the node a resource is currently running on; it is not tied to any particular node.
Stickiness expresses how strongly a resource prefers to stay where it is: a positive value means it prefers to stay, a negative value means it prefers to leave, inf means it stays on the node whenever possible, and -inf means it leaves whenever possible (though if the only node left has -inf, the resource has no choice but to stay there).
[root@node1 ~]# crm configure
crm(live)configure# rsc_defaults resource-stickiness=100
crm(live)configure# show
node node1.a.com \
attributes standby=off
node node2.a.com
primitive webip IPaddr \
params ip=192.168.85.128
primitive webserver lsb:httpd
colocation webserver_and_webip inf: webserver webip
order webip_before_webserver Mandatory: webip webserver
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false \
no-quorum-policy=ignore
rsc_defaults rsc-options: \
resource-stickiness=100
8. Define resource monitoring
The monitor operation is what actually watches a resource; by default pacemaker does not monitor any resource. Why monitor at all?
If a node fails, corosync stops receiving its heartbeat, declares the node dead, and the resources are moved to the surviving node. But what if only the service dies?
If httpd crashes rather than the node, and the resource is not monitored, nothing is moved: from the cluster's point of view the node is fine and it never notices that the service is gone, so the web site simply stops responding. That is why resources should be defined with monitoring.
To monitor a resource, add a monitor operation when defining it. If the service dies unexpectedly, the cluster first tries to restart it in place and, if that fails, moves it to another node; interval is how often the check runs, timeout is how long a check may take.
For example:
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.85.143 op monitor interval=30s timeout=20s on-fail=restart
crm(live)configure# delete vip # remove the example definition again
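Following the same pattern, the webserver resource used in this cluster could also have been defined with monitoring (a sketch only; the walkthrough above keeps the unmonitored definition):
crm(live)configure# primitive webserver lsb:httpd op monitor interval=30s timeout=20s on-fail=restart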
VI. Web high availability with NFS shared storage
1. Check the exported directory
[root@nfs ~]# showmount -e 192.168.85.143
Export list for 192.168.85.143:
/mysqldata 192.168.85.0/24
[root@node1 ~]# showmount -e 192.168.85.143
Export list for 192.168.85.143:
/mysqldata 192.168.85.0/24
[root@node2 ~]# showmount -e 192.168.85.143
Export list for 192.168.85.143:
/mysqldata 192.168.85.0/24
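The export itself belongs to the environment preparation; a minimal /etc/exports entry matching the showmount output above would look like this (rw and sync are assumptions, the path and network come from the output):
[root@nfs ~]# cat /etc/exports
/mysqldata 192.168.85.0/24(rw,sync)
[root@nfs ~]# exportfs -r # re-export after editing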
2. Put a test page on the NFS host
[root@nfs ~]# cat /mysqldata/index.html
NFS Server
3. Configure the NFS resource:
crm(live)configure# primitive mynfs ocf:heartbeat:Filesystem params device=192.168.85.143:/mysqldata directory=/var/www/html fstype=nfs op start timeout=60s op stop timeout=60s
Note: the first attempt at this command omitted op start timeout=60s op stop timeout=60s, which produced the warnings below and prevented the commit; adding the timeouts fixes it. If strange errors still show up after adding them, delete the mynfs resource and define it again; leftovers from the previous attempt may conflict with the new configuration.
WARNING: mynfs: default timeout 20s for start is smaller than the advised 60
WARNING: mynfs: default timeout 20s for stop is smaller than the advised 60
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov 6 21:53:55 2015
Last change: Fri Nov 6 21:53:51 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node1.a.com node2.a.com ]
webip (ocf::heartbeat:IPaddr): Started node1.a.com
webserver (lsb:httpd): Started node1.a.com
mynfs (ocf::heartbeat:Filesystem): Started node1.a.com
4. Delete the resource group:
crm(live)# resource
crm(live)resource# stop webcluster
crm(live)resource# cd
crm(live)# configure
crm(live)configure# delete webcluster
5. Recreate the resource group:
crm(live)# configure
crm(live)configure# group webcluster webip mynfs webserver
crm(live)configure# verify
crm(live)configure# commit
crm(live)# status
Last updated: Sat Nov 7 11:31:22 2015
Last change: Sat Nov 7 11:31:10 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node1.a.com node2.a.com ]
Resource Group: webcluster
webip (ocf::heartbeat:IPaddr): Started node1.a.com
mynfs (ocf::heartbeat:Filesystem): Started node1.a.com
webserver (lsb:httpd): Started node1.a.com
6. Create an order constraint so that the startup order is VIP, NFS, httpd
crm(live)# configure
crm(live)configure# order webip_before_mynfs_before_webserver mandatory: webip mynfs webserver
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.a.com \
attributes standby=off
node node2.a.com
primitive mynfs Filesystem \
params device="192.168.85.143:/mysqldata" directory="/var/www/html" fstype=nfs \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0
primitive webip IPaddr \
params ip=192.168.85.128
primitive webserver lsb:httpd
group webcluster webip mynfs webserver \
meta target-role=Started
order webip_before_mynfs_before_webserver Mandatory: webip mynfs webserver
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false \
no-quorum-policy=ignore
7. Test
crm(live)# status
Last updated: Sat Nov 7 11:34:50 2015
Last change: Sat Nov 7 11:33:44 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
# Note: in this test, the page could not be displayed when the DC and the resources were not on the same node.
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node1.a.com node2.a.com ]
Resource Group: webcluster
webip (ocf::heartbeat:IPaddr): Started node1.a.com
mynfs (ocf::heartbeat:Filesystem): Started node1.a.com
webserver (lsb:httpd): Started node1.a.com
7.1 Open a browser and test
7.2 Put node1 in standby and check whether the resources move
crm(live)# node
crm(live)node# standby node1.a.com
crm(live)node# show
node1.a.com: normal
standby=on
node2.a.com: normal
crm(live)# status
Last updated: Sat Nov 7 11:38:13 2015
Last change: Sat Nov 7 11:37:55 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Node node1.a.com: standby
Online: [ node2.a.com ]
Resource Group: webcluster
webip (ocf::heartbeat:IPaddr): Started node2.a.com
mynfs (ocf::heartbeat:Filesystem): Started node2.a.com
webserver (lsb:httpd): Started node2.a.com
Test in the browser:
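Or from the shell, fetching the VIP (the expected content is the NFS test page created earlier):
[root@node1 ~]# curl http://192.168.85.128 # should print: NFS Server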
7.3 Bring node1 back online
crm(live)node# online node1.a.com
crm(live)node# show
node1.a.com: normal
standby=off
node2.a.com: normal
crm(live)# status
Last updated: Sat Nov 7 11:39:31 2015
Last change: Sat Nov 7 11:38:58 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node1.a.com node2.a.com ]
Resource Group: webcluster
webip (ocf::heartbeat:IPaddr): Started node2.a.com
mynfs (ocf::heartbeat:Filesystem): Started node2.a.com
webserver (lsb:httpd): Started node2.a.com
node1 is back online; the current DC is node2.
That completes the NFS-backed web high-availability setup. Some features are still missing and will be added once the crm command is more familiar.