Corosync Configuration:
Preparation: prepare two machines, node1.a.org and node2.a.org, with IP addresses 192.168.0.134 and 192.168.0.3 respectively. The clustered service will be Apache's httpd.
I: Edit the /etc/hosts file on both nodes and add the following:
192.168.0.134 node1.a.org node1
192.168.0.3 node2.a.org node2
1. On node1 and node2, set the hostname with the hostname command, or edit /etc/sysconfig/network directly to change it.
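For example, a minimal sketch for node1 (the hostname value comes from the hosts entries above; run the equivalent on node2):
# hostname node1.a.org
# sed -i 's/^HOSTNAME=.*/HOSTNAME=node1.a.org/' /etc/sysconfig/network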
2. Set up key-based SSH authentication between the two nodes:
node1:
# ssh-keygen -t rsa
# ssh-copy-id -i /root/.ssh/id_rsa.pub node2
node2:
# ssh-keygen -t rsa
# ssh-copy-id -i /root/.ssh/id_rsa.pub node1
Install Apache on node1 and node2 via yum. For testing, create an index.html containing 'node1.a.org' on node1 and one containing 'node2.a.org' on node2 (as sketched after these commands), and make sure the service can start. Then stop it and disable it at boot, since the cluster will manage it:
# yum install httpd -y
# service httpd stop
# chkconfig httpd off
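The test pages mentioned above can be created like this, assuming the default DocumentRoot of /var/www/html (run the matching command on its node):
# echo "node1.a.org" > /var/www/html/index.html        (on node1)
# echo "node2.a.org" > /var/www/html/index.html        (on node2)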
II: Install the software packages.
Dependencies: libibverbs, librdmacm, lm_sensors, libtool-ltdl, openhpi-libs, openhpi, perl-TimeDate.
1. Place the corosync/pacemaker RPMs and the dependencies listed above in /root/cluster, then install them:
# cd /root/cluster
# yum -y localinstall *.rpm --nogpgcheck
2. Edit the corosync configuration file:
# cd /etc/corosync
# cp corosync.conf.example corosync.conf
Add the following to the file:

service {
        ver: 0
        name: pacemaker
}

aisexec {
        user: root
        group: root
}

Also change the bindnetaddr line to match the network:
bindnetaddr: 192.168.0.0
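For reference, bindnetaddr sits inside the interface sub-section of the totem stanza; in the stock sample file that stanza looks roughly like this (the multicast address and port shown are the sample defaults, not values this setup prescribes):
totem {
        version: 2
        secauth: off
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.0.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}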
3. Generate the authentication key used for inter-node communication (corosync-keygen reads from /dev/random, so it may take a while to gather enough entropy), copy the key to node2, and create the log directory:
# corosync-keygen
# scp -p authkey node2:/etc/corosync
# mkdir /var/log/cluster
4. Start the service:
# /etc/init.d/corosync start
Note: the steps above were performed on node1. Repeat the same steps on node2, then start node2's service from node1:
# ssh node2 '/etc/init.d/corosync start'
Verify that corosync started correctly.
Check whether the Corosync engine started normally:
# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/messages
Jun 14 19:02:08 node1 corosync[5103]: [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Jun 14 19:02:08 node1 corosync[5103]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Jun 14 19:02:08 node1 corosync[5103]: [MAIN ] Corosync Cluster Engine exiting with status 8 at main.c:1397.
Jun 14 19:03:49 node1 corosync[5120]: [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Jun 14 19:03:49 node1 corosync[5120]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Check that the initial membership notifications were sent correctly:
# grep TOTEM /var/log/messages
Jun 14 19:03:49 node1 corosync[5120]: [TOTEM ] Initializing transport (UDP/IP).
Jun 14 19:03:49 node1 corosync[5120]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 14 19:03:50 node1 corosync[5120]: [TOTEM ] The network interface [192.168.0.5] is now up.
Jun 14 19:03:50 node1 corosync[5120]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Check whether any errors occurred during startup:
# grep ERROR: /var/log/messages | grep -v unpack_resources
Check whether pacemaker started normally:
# grep pcmk_startup /var/log/messages
Jun 14 19:03:50 node1 corosync[5120]: [pcmk ] info: pcmk_startup: CRM: Initialized
Jun 14 19:03:50 node1 corosync[5120]: [pcmk ] Logging: Initialized pcmk_startup
Jun 14 19:03:50 node1 corosync[5120]: [pcmk ] info: pcmk_startup: Maximum core file size is: 4294967295
Jun 14 19:03:50 node1 corosync[5120]: [pcmk ] info: pcmk_startup: Service: 9
Jun 14 19:03:50 node1 corosync[5120]: [pcmk ] info: pcmk_startup: Local hostname: node1.a.org
Configure the cluster service:
Create an IP address resource for the web cluster:
# crm configure primitive WebIP ocf:heartbeat:IPaddr params ip=192.168.0.99
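Note that the constraints configured below reference a WebSite resource for httpd that this walkthrough never defines explicitly; a minimal sketch using the LSB init script (the name WebSite is chosen to match the constraints that follow):
# crm configure primitive WebSite lsb:httpd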
Ignore cluster status checks when quorum cannot be satisfied (required in a two-node cluster):
# crm configure property no-quorum-policy=ignore
Set a default stickiness value for resources:
# crm configure rsc_defaults resource-stickiness=100
Disable STONITH, since this test setup has no fencing devices:
# crm configure property stonith-enabled=false
WebIP and WebSite could otherwise run on different nodes; solve this with a colocation constraint:
# crm configure colocation website-with-ip INFINITY: WebSite WebIP
Ensure WebIP is started before WebSite:
# crm configure order httpd-after-ip mandatory: WebIP WebSite
Set a location constraint so WebSite prefers node1:
# crm configure location prefer-node1 WebSite 200: node1.a.org
Start the corosync service on node1 and node2.
Visit 192.168.0.99 in a browser to confirm the page is served, then stop the service on either node and visit again to verify failover:
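One way to simulate the failure is to put a node in standby rather than killing the service (a sketch using the crm node commands introduced later in this article):
# crm node standby        take the current node out of service; resources should move to the other node
# crm status              confirm WebIP and WebSite now run on the surviving node
# crm node online         bring the node back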
This completes the OpenAIS/corosync configuration.
DRBD Configuration
Before configuring, add a disk to both node1 and node2 and create a partition on it (a non-interactive sketch follows):
# fdisk /dev/sdb
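fdisk is interactive; the sketch below creates one primary partition spanning the disk by piping the usual keystrokes (n, p, 1, two defaults, w) into it. Double-check interactively if unsure:
# echo -e "n\np\n1\n\n\nw" | fdisk /dev/sdb
# partprobe /dev/sdb        have the kernel re-read the partition table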
Install the software packages:
DRBD consists of two parts: a kernel module and the userspace management tools. The DRBD kernel module was merged into the mainline Linux kernel as of version 2.6.33, so if your kernel is at or above that version you only need to install the management tools; otherwise you must install both the kernel module package and the management tools, and their version numbers must match. Download the packages and install them, doing the same on node1 and node2:
# yum -y --nogpgcheck localinstall drbd83-8.3.8-1.el5.centos.i386.rpm kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
Configure DRBD:
The main file is /etc/drbd.conf; copy the sample into place:
# cp /usr/share/doc/drbd83-8.3.8/drbd.conf /etc
Then configure /etc/drbd.d/global_common.conf:
global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
}

common {
        protocol C;

        handlers {
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }

        startup {
                wfc-timeout 120;
                degr-wfc-timeout 120;
        }

        disk {
                on-io-error detach;
                fencing resource-only;
        }

        net {
                cram-hmac-alg "sha1";
                shared-secret "mydrbdlab";
        }

        syncer {
                rate 100M;
        }
}
3. Define a resource in /etc/drbd.d/web.res with the following content:
resource web {
        on node1.a.org {
                device  /dev/drbd0;
                disk    /dev/sdb1;
                address 192.168.0.134:7789;
                meta-disk internal;
        }
        on node2.a.org {
                device  /dev/drbd0;
                disk    /dev/sdb1;
                address 192.168.0.3:7789;
                meta-disk internal;
        }
}
Initialize the resource (run on both nodes) and start the service:
# drbdadm create-md web
Start the service on node1 and node2:
# /etc/init.d/drbd start
Set the primary node (run on node1 only):
# drbdsetup /dev/drbd0 primary -o
or:
# drbdadm -- --overwrite-data-of-peer primary web
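The initial full sync then begins; a simple way to watch its progress is to re-run the status command every second:
# watch -n 1 'cat /proc/drbd'        wait for ds:UpToDate/UpToDate before creating a filesystem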
Create the filesystem; formatting and mounting must be done on the Primary node:
# mke2fs -j -L DRBD /dev/drbd0
# mkdir /web
# mount /dev/drbd0 /web
Verify DRBD:
On the primary node (node1), copy some files into /web, then unmount it and demote the node to secondary:
# umount /web
# drbdadm secondary web
On node2, promote it to primary and mount the device; the files copied on node1 should be visible:
# drbdadm primary web
# mount /dev/drbd0 /web
Overview of common corosync and DRBD commands
Common corosync/pacemaker commands:
corosync-keygen - generate the authentication key
crm status - show the cluster status
crm_verify -L - check the live cluster for problems
classes (in ra mode) - list the resource agent classes
crm_attribute - modify a node or cluster-wide attribute
crm_node - node-related operations
crm_node -q - show the quorum vote count
cibadmin - tool for modifying the cluster configuration
Common options: -Q dump the CIB document, -E erase the CIB contents, -R replace the CIB, -D delete an object, -d delete all matching objects
Example: dump the CIB with cibadmin -Q > /tmp/qq.xml, edit qq.xml, then replace the CIB with it: cibadmin -R -x /tmp/qq.xml
To delete a resource: use crm(live)configure# edit to edit directly, use delete in that mode, or use cibadmin
crm_shadow - work on a shadow copy of the CIB
crm(live)ra# list ocf heartbeat - list the ocf:heartbeat resource agents (for example, the Filesystem agent)
Resource constraints:
- Location: which node a resource prefers to stay on
  help location - view the help
  Example: location Web_on_node1 Web 500: node1.a.org
- Order: define the startup order of resources
  help order - view the help
  Example: order WebServer_after_WebIP mandatory: WebIP WebServer:start
- Colocation: whether resources must run together on the same node
  help colocation - view the help
Common DRBD commands:
# drbd-overview - show the primary/secondary roles
# cat /proc/drbd - show the DRBD status
Introduction to crm interactive mode:
Type crm in the shell to enter interactive mode:
[root@node1 ~]# crm
crm(live)# help - show the help
This is the CRM command line interface program.

Available commands:

        cib              manage shadow CIBs
        resource         resources management
        node             nodes management
        options          user preferences
        configure        CRM cluster configuration
        ra               resource agents information center
        status           show cluster status
        quit,bye,exit    exit the program
        help             show help
        end,cd,up        go back one level

crm(live)#
Type configure to enter configuration mode:
crm(live)configure#
crm(live)configure# cd - go back up a level
crm(live)# status - show the status
============
Last updated: Wed Sep 14 22:09:13 2011
Stack: openais
Current DC: node1.a.org - partition WITHOUT quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ node1.a.org ]
OFFLINE: [ node2.a.org ]

Master/Slave Set: MS_Webdrbd
        Slaves: [ node1.a.org ]
        Stopped: [ webdrbd:1 ]
ra mode shows the resource agent classes:
crm(live)ra# classes
heartbeat
lsb
ocf / heartbeat linbit pacemaker
stonith
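Also handy in ra mode: info prints an agent's parameters and actions, for example:
crm(live)ra# info ocf:heartbeat:IPaddr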
After making changes in configure mode, you must run commit for them to be saved and take effect.
crm node standby - run on a node to put it in standby, simulating a failure
crm node online - bring that node back online
DRBD + Pacemaker Configuration:
DRBD is configured as above; next, configure pacemaker:
[root@node1 ~]# crm configure show
node node1.a.org
node node2.a.org
property $id="cib-bootstrap-options" \
        dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \        (make sure this entry is present)
        stonith-enabled="false"            (make sure this entry is present)
[root@node1 ~]# /etc/init.d/drbd stop      (stop drbd on both node1 and node2)
Stopping all DRBD resources: .
[root@node1 ~]# chkconfig drbd off
Configure the DRBD resource:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive webdrbd ocf:heartbeat:drbd params drbd_resource=web op monitor role=Master interval=50s timeout=30s op monitor role=Slave interval=60s timeout=30s
WARNING: webdrbd: default timeout 20s for start is smaller than the advised 240
WARNING: webdrbd: default timeout 20s for stop is smaller than the advised 100
crm(live)configure# master MS_Webdrbd webdrbd meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
crm(live)configure# show webdrbd
primitive webdrbd ocf:heartbeat:drbd \
        params drbd_resource="web" \
        op monitor interval="50s" role="Master" timeout="30s" \
        op monitor interval="60s" role="Slave" timeout="30s"
crm(live)configure# show MS_Webdrbd
ms MS_Webdrbd webdrbd \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
crm(live)configure# commit
crm(live)configure#
On node2, check whether a host has become the Primary node:
# drbdadm role web
Primary/Secondary
Create a cluster service that automatically mounts the web resource's filesystem on the Primary node, as sketched below.
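The original walkthrough stops here; a minimal sketch of that remaining step in the same crm style used above (the resource name WebFS, the ext3 fstype, and the constraint names are assumptions, not from the original text):
crm(live)configure# primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/web" fstype="ext3"
crm(live)configure# colocation WebFS_on_MS_Webdrbd inf: WebFS MS_Webdrbd:Master
crm(live)configure# order WebFS_after_MS_Webdrbd inf: MS_Webdrbd:promote WebFS:start
crm(live)configure# commit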