Web high availability with pacemaker + corosync + crmsh
Environment:
node1: 192.168.85.144
node2: 192.168.85.145
nfs:  192.168.85.143
VIP:  192.168.85.128
Cluster resources: VIP, httpd
Resource agents: OCF IPaddr, LSB httpd

I. Environment setup (details omitted)
1. All hosts can reach each other over the network;
2. Node hostnames are set;
3. Nodes can reach each other by hostname;
4. Passwordless SSH login between the two nodes;
5. Time is synchronized;
NFS is configured (the export itself is sketched in section VI). A minimal sketch of these prerequisites follows.
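The steps above are only summarized in the original post. The sketch below shows roughly what they could look like on CentOS 6; the hostnames and IPs are the ones used in this article, while the NTP server and the exact options are illustrative assumptions:

[root@node1 ~]# cat >> /etc/hosts << EOF        # name resolution, same entries on both nodes
192.168.85.144 node1.a.com node1
192.168.85.145 node2.a.com node2
192.168.85.143 nfs
EOF
[root@node1 ~]# hostname node1.a.com            # also set HOSTNAME= in /etc/sysconfig/network
[root@node1 ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa   # passwordless SSH; repeat from node2 to node1
[root@node1 ~]# ssh-copy-id root@node2
[root@node1 ~]# ntpdate pool.ntp.org            # any reachable NTP server, run on both nodes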

II. Install pacemaker and corosync
1. Install the httpd service on both nodes and create a test page, making sure each node can fetch the other's page (a sketch of this step follows the checks below)
[root@node1 ~]# curl

node2.a.com


[root@node1 ~]# service httpd stop
Stopping httpd: [  OK  ]
[root@node1 ~]# chkconfig httpd off

[root@node2 ~]# curl

node1.a.com


[root@node2 ~]# service httpd stop
Stopping httpd: [  OK  ]
[root@node2 ~]# chkconfig httpd off
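The httpd installation and test page from step 1 are not shown in the original post; a minimal sketch, with each page simply echoing its node's hostname as the curl output above suggests:

[root@node1 ~]# yum install httpd
[root@node1 ~]# echo "node1.a.com" > /var/www/html/index.html   # on node2: echo "node2.a.com" > /var/www/html/index.html
[root@node1 ~]# service httpd start                             # started only for the cross-check, then stopped again as above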

2. Install (on both nodes)
[root@node1 ~]# yum install pacemaker corosync

III. Configuration
1. Configure corosync on node1
[root@node1 ~]# cd /etc/corosync/

[root@node1 corosync]# cp -p corosync.conf.example corosync.conf
[root@node1 corosync]# cat  corosync.conf  | egrep -v '#'

compatibility: whitetank # whether to stay compatible with versions before 0.8
totem { # totem: defines how the cluster nodes communicate and with which parameters
        version: 2 # totem protocol version, do not change
        secauth: on # whether to enable secure authentication
        threads: 0 # number of threads used for authentication; 0 means the default
        interface { # sub-block: which interface heartbeat messages are sent on
                ringnumber: 0 # redundant ring number; useful when a node has several NICs, to keep heartbeats from looping
                bindnetaddr: 192.168.85.0 # network address the heartbeat traffic binds to
                mcastaddr: 239.245.4.1 # multicast address used for heartbeat messages
                mcastport: 5405 # multicast port used for heartbeats
                ttl: 1 # time to live; 1 means the packets are not forwarded beyond the local segment. Increase it when crossing layer-3 devices, otherwise the other side never receives them; the maximum is 255
        }
}

logging { # logging settings
        fileline: off # whether to print file and line information
        to_stderr: no # whether to send log output to standard error (the screen)
        to_logfile: yes # whether to write to a log file
        logfile: /var/log/cluster/corosync.log # log file location
        to_syslog: no # whether to also log to syslog; normally one of the two destinations is enough
        debug: off # whether to enable debugging
        timestamp: on # whether to prefix log messages with a timestamp
        logger_subsys { # logging subsystem
                subsys: AMF # log the AMF subsystem
                debug: off # whether to enable AMF debug output
        }
}
service {
        name: pacemaker # load the pacemaker plugin; with ver: 1 the pacemaker daemons are started separately (as done below), whereas ver: 0 would have corosync start them
        ver: 1
}
#aisexec { # user and group the AIS services run as; the default is root:root, so this block can be left out
        #user: root
        #group: root
#}

2. Generate the authentication key used for inter-node communication on node1
corosync-keygen reads random data from /dev/random, so a freshly installed system that has seen little activity may not have enough entropy; in that case just keep hammering random keys on the keyboard until enough entropy has been gathered.
Since the demo machine does not have enough entropy, a shortcut is used here:
[root@node1 corosync]# mv /dev/random /dev/lw

[root@node1 corosync]# ln /dev/urandom /dev/random # point the random device at the pseudo-random generator

[root@node1 corosync]# corosync-keygen # generate the key file
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.

[root@node1 corosync]# rm -rf /dev/random # remove the link

[root@node1 corosync]# mv /dev/lw /dev/random # restore the real random device

[root@node1 corosync]# ll
-r-------- 1 root root  128 Nov  5 17:34 authkey # permissions are 400
-rw-r--r-- 1 root root 2764 Nov  5 17:23 corosync.conf
-rw-r--r-- 1 root root 1073 Jul 24 07:10 corosync.conf.example.udpu
drwxr-xr-x 2 root root 4096 Jul 24 07:10 service.d
drwxr-xr-x 2 root root 4096 Jul 24 07:10 uidgid.d
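As an aside, an alternative to temporarily swapping /dev/random out is to feed the kernel entropy pool with rngd from the rng-tools package. The article does not do this; it is just another option:

[root@node1 corosync]# yum install rng-tools
[root@node1 corosync]# rngd -r /dev/urandom -o /dev/random   # feed the entropy pool, then run corosync-keygen normally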

3. Copy the key and the configuration file to node2
[root@node1 corosync]# scp -p authkey corosync.conf node2:/etc/corosync/
authkey                                               100%  128     0.1KB/s   00:00    
corosync.conf                                       100% 2764     2.7KB/s   00:00    

4. Install crmsh, the configuration front end for pacemaker
Starting with RHEL 6.4 the command-line cluster configuration tool crmsh is no longer shipped; pcs is used instead. To keep using the crm command you have to download the packages and install them yourself. crmsh depends on pssh, so download it as well; there are further dependencies, which is why the installation is done with yum.
[root@node1 ~]# ll *.rpm
-rw-r--r-- 1 root root 616532 Nov  5 20:24 crmsh-2.1-1.6.i686.rpm
-rw-r--r-- 1 root root  27652 Nov  5 20:25 crmsh-debuginfo-2.1-1.6.i686.rpm
-rw-r--r-- 1 root root 109972 Nov  5 20:25 crmsh-test-2.1-1.6.i686.rpm
[root@node1 ~]# yum install pssh *.rpm
Software download address:

IV. Verify the cluster installation
1. Verify the corosync installation
1.1 Start Corosync on the first node:
[root@node1 ~]# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [  OK  ]

1.2 Check whether the corosync engine started:
[root@node1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Nov 05 21:32:49 corosync [MAIN  ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Nov 05 21:32:49 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

1.3 Check that the initial membership notifications went out correctly
[root@node1 ~]# grep  TOTEM  /var/log/cluster/corosync.log
Nov 05 21:32:49 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Nov 05 21:32:49 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Nov 05 21:32:49 corosync [TOTEM ] The network interface [192.168.85.144] is now up.
Nov 05 21:32:49 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Note: if the last message appears only once, corosync can now be started on the other node.
Start corosync on node2:
[root@node1 ~]# ssh node2 '/etc/init.d/corosync start'
Starting Corosync Cluster Engine (corosync): [  OK  ]

Check that the cluster membership was formed correctly:
[root@node1 ~]# grep TOTEM /var/log/cluster/corosync.log
Nov 05 21:32:49 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Nov 05 21:32:49 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Nov 05 21:32:49 corosync [TOTEM ] The network interface [192.168.85.144] is now up.
Nov 05 21:32:49 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 05 21:34:06 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.

2. Verify the pacemaker installation
2.1 Now that Corosync is confirmed to be working, check whether the other components are fine as well.
[root@node1 ~]# grep pcmk_startup /var/log/cluster/corosync.log 
Nov 05 21:32:49 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
Nov 05 21:32:49 corosync [pcmk  ] Logging: Initialized pcmk_startup
Nov 05 21:32:49 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 4294967295
Nov 05 21:32:49 corosync [pcmk  ] info: pcmk_startup: Service: 9
Nov 05 21:32:49 corosync [pcmk  ] info: pcmk_startup: Local hostname: node1.a.com

2.2 Start pacemaker and check the processes it launched
[root@node1 ~]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager[  OK  ]

Check whether pacemaker started its child processes:
[root@node1 ~]# grep -e pacemakerd.*get_config_opt -e pacemakerd.*start_child -e "Starting Pacemaker" /var/log/cluster/corosync.log 
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: get_config_opt:       Found 'no' for option: to_syslog
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: get_config_opt:       Defaulting to 'daemon' for option: syslog_facility
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:   notice: main:         Starting Pacemaker 1.1.11 (Build: 97629de):  generated-manpages agent-manpages ascii-docs ncurses libqb-logging libqb-ipc nagios  corosync-plugin cman acls
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Using uid=189 and group=189 for process cib
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Forked child 30944 for process cib
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Forked child 30945 for process stonith-ng
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Forked child 30946 for process lrmd
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Using uid=189 and group=189 for process attrd
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Forked child 30947 for process attrd
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Using uid=189 and group=189 for process pengine
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Forked child 30948 for process pengine
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Using uid=189 and group=189 for process crmd
Nov 05 21:35:12 [30938] node1.a.com pacemakerd:     info: start_child:  Forked child 30949 for process crmd

[root@node1 ~]# ps axf
............
30938 pts/1    S      0:00 pacemakerd
30944 ?        Ss     0:00  \_ /usr/libexec/pacemaker/cib
30945 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
30946 ?        Ss     0:00  \_ /usr/libexec/pacemaker/lrmd
30947 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
30948 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
30949 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd

2.3 Check whether any errors occurred during startup (these two errors can be ignored)
[root@node1 ~]# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources
Nov 05 21:32:49 corosync [pcmk  ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
Nov 05 21:32:49 corosync [pcmk  ] ERROR: process_ais_conf:  Please see Chapter 8 of 'Clusters from Scratch' () for details on using Pacemaker with CMAN

2.4 Start pacemaker on the other node
[root@node1 ~]# ssh node2 '/etc/init.d/pacemaker start'
Starting Pacemaker Cluster Manager[  OK  ]
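Optionally, if the cluster stack should come back automatically after a reboot, both daemons can also be enabled at boot on both nodes; the article does not do this, so treat it as an optional extra:

[root@node1 ~]# chkconfig corosync on
[root@node1 ~]# chkconfig pacemaker on
[root@node1 ~]# ssh node2 'chkconfig corosync on; chkconfig pacemaker on'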

3. Check the cluster status
[root@node1 ~]# crm_mon
Last updated: Thu Nov  5 21:39:24 2015
Last change: Thu Nov  5 21:39:09 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
0 Resources configured
Online: [ node1.a.com node2.a.com ]
As shown above, both nodes are online and node1 is the DC.
If you run into errors, see: http://blog.chinaunix.net/uid-30212356-id-5345348.html

V. Managing the web high-availability resources on top of corosync
On using the crm command: http://blog.chinaunix.net/uid-30212356-id-5345399.html
An overview of the resource concepts used below: http://blog.chinaunix.net/uid-30212356-id-5333561.html

1. Adjusting the stonith parameter
Disable stonith. corosync/pacemaker enables it by default, and since there is no stonith device, configuring resources right away makes verify complain and the commit fail;
[root@node1 ~]# crm configure
crm(live)configure# property stonith-enabled=false
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.a.com
node node2.a.com
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false

2. Configure the web cluster
2.1 Define the VIP
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.85.128
This defines a primitive resource named webip. ocf:heartbeat:IPaddr means the resource agent class is ocf, the provider is heartbeat, and the agent is IPaddr; params passes the agent parameter ip="192.168.85.128", i.e. the VIP is 192.168.85.128.

crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov  6 18:34:52 2015
Last change: Fri Nov  6 18:34:27 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ node1.a.com node2.a.com ]
 webip  (ocf::heartbeat:IPaddr):        Started node1.a.com

2.2 Verify
[root@node1 ~]# ip -o -f inet addr show
1: lo    inet 127.0.0.1/8 scope host lo
2: eth1    inet 192.168.85.144/24 brd 192.168.85.255 scope global eth1
2: eth1    inet 192.168.85.128/24 brd 192.168.85.255 scope global secondary eth1
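From the other node (or any host in the subnet) the address itself can also be checked quickly; httpd is not under cluster control yet at this point, so only the VIP is tested:

[root@node2 ~]# ping -c 2 192.168.85.128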

2.3 Configure the httpd resource
[root@node1 ~]# crm configure
crm(live)configure# primitive webserver lsb:httpd
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov  6 18:46:09 2015
Last change: Fri Nov  6 18:45:37 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured

Online: [ node1.a.com node2.a.com ]

 webip  (ocf::heartbeat:IPaddr):        Started node1.a.com 
 webserver      (lsb:httpd):    Started node2.a.com
The two resources are running on different nodes: by default the cluster balances the load and tries to place different resources on different nodes.

3. Define resource constraints
To run several resources on the same node, either put them into a group or define a colocation constraint.
Resource constraints specify on which cluster nodes resources may run, in which order they are started, and which other resources they depend on.
Pacemaker provides three kinds of resource constraints (a short crmsh sketch follows this list):
        1) Resource Location: defines on which nodes a resource may, may not, or preferably should run;

        2) Resource Colocation: defines whether cluster resources may or may not run together on the same node;

        3) Resource Order: defines the order in which cluster resources are started on a node;
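As a quick reference, the crmsh syntax for the three constraint types looks roughly like this; the names here are illustrative, and the constraints actually used in this setup are created step by step below:

crm(live)configure# location loc_webip_node1 webip 100: node1.a.com       # location: prefer node1 with a score of 100
crm(live)configure# colocation webserver_with_webip inf: webserver webip  # colocation: keep webserver on the same node as webip
crm(live)configure# order webip_then_webserver Mandatory: webip webserver # order: start webip before webserver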

3.1 Define a group so the resources run on the same node
Define the resource group:
Create the group webcluster; its members are the resources that follow its name.
crm(live)# configure
crm(live)configure# group webcluster webip webserver
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.a.com
node node2.a.com
primitive webip IPaddr \
        params ip=192.168.85.128
primitive webserver lsb:httpd
group webcluster webip webserver
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov  6 18:55:18 2015
Last change: Fri Nov  6 18:55:04 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured

Online: [ node1.a.com node2.a.com ]

 Resource Group: webcluster
     webip      (ocf::heartbeat:IPaddr):        Started node1.a.com 
     webserver  (lsb:httpd):    Started node1.a.com
 
3.2 Test in a browser
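In place of the screenshot (not included in this copy of the post), the same check can be done with curl from another host; while the group runs on node1 the returned page should be node1's test page:

[root@nfs ~]# curl http://192.168.85.128     # expect node1.a.com while the group is on node1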


3.3 Take node1 offline and test whether the resources fail over
crm(live)# node
crm(live)node# standby node1.a.com
crm(live)node# cd
crm(live)# status
Last updated: Fri Nov  6 18:57:58 2015
Last change: Fri Nov  6 18:57:38 2015
Stack: classic openais (with plugin)
Current DC: node2.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured

Node node1.a.com: standby
Online: [ node2.a.com ]

 Resource Group: webcluster
     webip      (ocf::heartbeat:IPaddr):        Started node2.a.com 
     webserver  (lsb:httpd):    Started node2.a.com
The resources have moved to node2.
Refresh the browser to test.

Bring node1 back online:
crm(live)# node
crm(live)node# online node1.a.com
crm(live)node# cd
crm(live)# status
Last updated: Fri Nov  6 20:41:13 2015
Last change: Fri Nov  6 20:41:02 2015
Stack: classic openais (with plugin)
Current DC: node2.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured

Online: [ node1.a.com node2.a.com ] # node1 is back online, only the DC is no longer on node1
 Resource Group: webcluster
     webip      (ocf::heartbeat:IPaddr):        Started node2.a.com 
     webserver  (lsb:httpd):    Started node2.a.com
 
4. Define a colocation constraint
4.1 First delete the group
[root@node2 ~]# crm resource
crm(live)resource# stop webcluster
crm(live)resource# cd
crm(live)# configure
crm(live)configure# delete webcluster
crm(live)configure# show
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov  6 19:21:33 2015
Last change: Fri Nov  6 19:21:21 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured

Online: [ node1.a.com node2.a.com ]

 webip  (ocf::heartbeat:IPaddr):        Started node1.a.com 
 webserver      (lsb:httpd):    Started node2.a.com
The resources are spread across the nodes again at this point;

4.2 Define the colocation constraint
role (optional): each resource can have roles. A resource agent takes a resource through several states: being promoted to master is promote, running is start, stopped is stop. role is mostly used in master/slave setups, for example to require that the slave may only start after the master has been started.

crm(live)# configure
Keep webserver together with webip:
crm(live)configure# colocation webserver_and_webip inf: webserver webip  (note the space after "inf:", otherwise an error is reported)
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show

5. Define an order constraint
For the resources defined here the start order should be webip first and then webserver, so give them an order constraint;
kind can be: Mandatory, Optional, or Serialize;

crm(live)configure# order webip_before_webserver mandatory: webip webserver 
crm(live)configure# commit
crm(live)configure# show
node node1.a.com \
        attributes standby=off
node node2.a.com
primitive webip IPaddr \
        params ip=192.168.85.128
primitive webserver lsb:httpd
colocation webserver_and_webip inf: webserver webip    # colocation constraint
order webip_before_webserver Mandatory: webip webserver    # order constraint
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false

6. Define a location constraint
6.1 Define the location constraint
Give node1 a preference score of 100:
crm(live)configure# location webip_on_node1 webip 100: node1.a.com 
crm(live)configure# verify
crm(live)configure# commit
If the resources are currently on node2, node1's score is 100 and node2's is 0, so the resources are moved to node1;

If node1 is then stopped, the resources do not move to node2 and are no longer displayed; instead the status shows: Current DC: node1.a.com - partition WITHOUT QUORUM. This means node1 is down and node2 does not hold quorum; the default action when quorum is lost is suicide or stop (with suicide all resources are taken down and nothing is shown).
   
6.2 Define a global property
A two-node cluster is a special case: when node1 fails we obviously want the resources to move to node2. That requires a global property telling the cluster to ignore the loss of quorum rather than stop the resources, so they keep running on the surviving node;
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.a.com \
        attributes standby=off
node node2.a.com
primitive webip IPaddr \
        params ip=192.168.85.128
primitive webserver lsb:httpd
location webip_on_node1 webip 100: node1.a.com
colocation webserver_and_webip inf: webserver webip
order webip_before_webserver Mandatory: webip webserver
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore
Now when node1 is stopped the resources move to node2, and once node1 comes back online they move back to node1;
When two nodes are equally preferred, the final placement is decided by the sum of all scores;

Delete the location constraint:
crm(live)configure# edit
    delete the line containing the location constraint, then save and exit
crm(live)configure# commit
crm(live)configure# show
node node1.a.com \
        attributes standby=off
node node2.a.com
primitive webip IPaddr \
        params ip=192.168.85.128
primitive webserver lsb:httpd
colocation webserver_and_webip inf: webserver webip
order webip_before_webserver Mandatory: webip webserver
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore

7. Define the default resource stickiness
Resource stickiness takes effect on the node the resource is currently running on: wherever the resource runs, that is where the stickiness applies.

The stickiness value is not tied to any particular node; it only applies to the node currently running the resource.

Stickiness expresses whether a resource prefers to stay on its current node: a positive value means it prefers to stay, a negative value means it prefers to leave, inf means it stays whenever possible, and -inf means it leaves whenever possible (if a node with -inf is the only one left, the resource has no choice but to stay there).
[root@node1 ~]# crm configure
crm(live)configure# rsc_defaults resource-stickiness=100
crm(live)configure# show
node node1.a.com \
        attributes standby=off
node node2.a.com
primitive webip IPaddr \
        params ip=192.168.85.128
primitive webserver lsb:httpd
colocation webserver_and_webip inf: webserver webip
order webip_before_webserver Mandatory: webip webserver
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore
rsc_defaults rsc-options: \
        resource-stickiness=100
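To illustrate how stickiness interacts with location scores (a hypothetical calculation, not taken from the article): suppose webcluster is running on node2 with resource-stickiness=100 and a location constraint prefers node1 with a score of 50. node2's total of 100 (stickiness) beats node1's 50, so the resources stay on node2. If the location score were raised to 200, node1's 200 would outweigh node2's 100 and the resources would move to node1.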

8. Define resource monitoring
The monitor operation watches a resource. By default pacemaker does not monitor any resource, so why monitor at all?
If a node fails, corosync stops receiving its heartbeats, declares the node dead, and the resources move to the standby node. But what if the service is shut down abnormally while the node stays up?
If it is the httpd service that dies rather than the node, and the resource is not monitored, nothing is moved: from the cluster's point of view the node is fine, and it never notices that the service has stopped. That means the web site simply stops responding. So resources should be monitored when they are defined.

To monitor a resource, add an op monitor operation when defining it. If the service dies abnormally the cluster first tries to restart it, and only if the restart fails is the resource moved to another node. interval is how often the check runs and timeout is how long a single check may take;
For example:
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.85.143 op monitor interval=30s timeout=20s on-fail=restart

crm(live)configure# delete vip              # remove the definition above again
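Applied to the resources in this article, a monitored version of the httpd resource might look like the sketch below (the article itself keeps the unmonitored definition; an existing webserver primitive would first have to be deleted or changed with crm configure edit):

crm(live)configure# primitive webserver lsb:httpd op monitor interval=30s timeout=20s on-fail=restart
crm(live)configure# verify
crm(live)configure# commit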

VI. Web high availability with NFS shared storage, based on corosync
1. Check the exported directory
[root@nfs ~]# showmount -e 192.168.85.143
Export list for 192.168.85.143:
/mysqldata 192.168.85.0/24

[root@node1 ~]# showmount -e 192.168.85.143
Export list for 192.168.85.143:
/mysqldata 192.168.85.0/24

[root@node2 ~]# showmount -e 192.168.85.143
Export list for 192.168.85.143:
/mysqldata 192.168.85.0/24
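The NFS server side of this export (sharing /mysqldata with the 192.168.85.0/24 network) is not shown in the article; a minimal sketch on the nfs host, with the (rw) option being an assumption, might be:

[root@nfs ~]# mkdir -p /mysqldata
[root@nfs ~]# echo "/mysqldata 192.168.85.0/24(rw)" >> /etc/exports
[root@nfs ~]# service nfs start
[root@nfs ~]# exportfs -arv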

2. Provide a test page on the NFS host
[root@nfs ~]# cat /mysqldata/index.html 

NFS Server



3. Configure the NFS resource:
crm(live)configure# primitive mynfs ocf:heartbeat:Filesystem params device=192.168.85.143:/mysqldata directory=/var/www/html fstype=nfs op start timeout=60s op stop timeout=60s

Note: when this command was first run without the op start timeout=60s op stop timeout=60s part, the warnings below were raised and the commit could not go through; adding it fixed the problem. If you still get odd errors after adding it, delete the mynfs resource and create it again; leftovers from a previous attempt may be conflicting with the new configuration.
WARNING: mynfs: default timeout 20s for start is smaller than the advised 60
WARNING: mynfs: default timeout 20s for stop is smaller than the advised 60

crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Nov  6 21:53:55 2015
Last change: Fri Nov  6 21:53:51 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured

Online: [ node1.a.com node2.a.com ]

 webip  (ocf::heartbeat:IPaddr):        Started node1.a.com 
 webserver      (lsb:httpd):    Started node1.a.com 
 mynfs  (ocf::heartbeat:Filesystem):    Started node1.a.com
 
4. Delete the resource group:
crm(live)# resource
crm(live)resource# stop  webcluster
crm(live)configure# delete webcluster

5. Recreate the resource group:
crm(live)# configure
crm(live)configure# group webcluster webip mynfs webserver
crm(live)configure# verify
crm(live)configure# commit
crm(live)# status
Last updated: Sat Nov  7 11:31:22 2015
Last change: Sat Nov  7 11:31:10 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured

Online: [ node1.a.com node2.a.com ]

 Resource Group: webcluster
     webip      (ocf::heartbeat:IPaddr):        Started node1.a.com 
     mynfs      (ocf::heartbeat:Filesystem):    Started node1.a.com 
     webserver  (lsb:httpd):    Started node1.a.com

6. Create an order constraint so the start order is VIP, NFS, httpd
crm(live)# configure
crm(live)configure# order webip_before_mynfs_before_webserver mandatory: webip mynfs webserver
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.a.com \
        attributes standby=off
node node2.a.com
primitive mynfs Filesystem \
        params device="192.168.85.143:/mysqldata" directory="/var/www/html" fstype=nfs \
        op start timeout=60s interval=0 \
        op stop timeout=60s interval=0
primitive webip IPaddr \
        params ip=192.168.85.128
primitive webserver lsb:httpd
group webcluster webip mynfs webserver \
        meta target-role=Started
order webip_before_mynfs_before_webserver Mandatory: webip mynfs webserver
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore

7. Test
crm(live)# status
Last updated: Sat Nov  7 11:34:50 2015
Last change: Sat Nov  7 11:33:44 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum # note: if the DC and the resources are not on the same node, the page will not display
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured

Online: [ node1.a.com node2.a.com ]

 Resource Group: webcluster
     webip      (ocf::heartbeat:IPaddr):        Started node1.a.com 
     mynfs      (ocf::heartbeat:Filesystem):    Started node1.a.com 
     webserver  (lsb:httpd):    Started node1.a.com
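Before the browser test, the placement can also be verified from the command line on node1 (paths and addresses as configured above):

[root@node1 ~]# ip -o -f inet addr show eth1 | grep 192.168.85.128   # the VIP is configured on eth1
[root@node1 ~]# mount | grep /var/www/html                           # the NFS export is mounted
[root@node1 ~]# curl http://192.168.85.128                           # should return the NFS test page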
7.1 Test in a browser

7.2 Put node1 into standby and test whether the resources fail over
crm(live)# node
crm(live)node# standby node1.a.com
crm(live)node# show
node1.a.com: normal
        standby=on
node2.a.com: normal

crm(live)# status
Last updated: Sat Nov  7 11:38:13 2015
Last change: Sat Nov  7 11:37:55 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured

Node node1.a.com: standby
Online: [ node2.a.com ]

 Resource Group: webcluster
     webip      (ocf::heartbeat:IPaddr):        Started node2.a.com 
     mynfs      (ocf::heartbeat:Filesystem):    Started node2.a.com 
     webserver  (lsb:httpd):    Started node2.a.com

Test in the browser:


7.3 Bring node1 back online
crm(live)node# online node1.a.com
crm(live)node# show
node1.a.com: normal
        standby=off
node2.a.com: normal

crm(live)# status
Last updated: Sat Nov  7 11:39:31 2015
Last change: Sat Nov  7 11:38:58 2015
Stack: classic openais (with plugin)
Current DC: node1.a.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured

Online: [ node1.a.com node2.a.com ]

 Resource Group: webcluster
     webip      (ocf::heartbeat:IPaddr):        Started node2.a.com 
     mynfs      (ocf::heartbeat:Filesystem):    Started node2.a.com 
     webserver  (lsb:httpd):    Started node2.a.com
node1 is back online, and the resources remain on node2.

That completes the NFS-based web high-availability setup. A few features are still missing; they will be added after getting more familiar with the crm command.
