DRBD + NFS + HEARTBEAT - xinyv - ChinaUnix blog

DRBD + NFS + HEARTBEAT

Category: LINUX

2007-12-17 09:37:12

DRBD 8.0 Configuration


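DRBD is a block device that can be used in high-availability (HA) setups. It works like a network RAID-1:
when you write data to the local file system, the data is also sent to another host on the network and
recorded there in the same form. The data on the local host (primary node) and the remote host
(secondary node) is kept in sync in real time. If the local system fails, the remote host still holds
an identical copy of the data and can carry on.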

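In an HA setup DRBD can replace a shared disk array. Because the data exists on both the local and the
remote host, on failover the remote host simply uses its own copy of the data and continues to provide service.
The diagram below shows how DRBD works: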

        +--------------+
        | File system  |
        +--------------+
               |
               V
        +--------------+
        | Block device |
        | (/dev/drbd1) |
        +--------------+
          |          |
          |          |
          V          V
 +-------------+  +-------------+
 | Local disk  |  | Remote disk |
 | (/dev/hdb1) |  | (/dev/hdb1) |
 +-------------+  +-------------+

Download the source package from the official website:

We use drbd-8.0.4 on Turbolinux 10.5. Install DRBD on each of the two hosts:

# tar zxf drbd-8.0.4.tar.gz
# cd drbd-8.0.4
# make
# make install

After make install:
drbd.ko is installed under /lib/modules/$KernelVersion/kernel/drivers/block.
The DRBD tools (drbdadm, drbdsetup) are installed under /sbin.
A drbd init script is created in /etc/init.d/.

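You need to set aside one disk partition for DRBD on the local host and one on the remote host. The two partitions must be the same size.
Here we use /dev/hdb1 on both hosts as the DRBD partition. Both partitions are 300 MB.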
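
Because the two backing partitions must be the same size, it can be worth comparing them before going any further. The following is only a minimal sketch: the helper name same_size is hypothetical, and it assumes blockdev (from util-linux) is installed and that ssh access to the peer exists, neither of which the original setup mentions.

```shell
#!/bin/sh
# same_size A B: succeed only if both byte counts are non-empty and equal.
# The byte counts would come from `blockdev --getsize64 /dev/hdb1`,
# run as root on each host (an assumption, not part of the original setup).
same_size() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example use (ssh access to the peer is assumed):
#   local_bytes=$(blockdev --getsize64 /dev/hdb1)
#   remote_bytes=$(ssh g105-2 blockdev --getsize64 /dev/hdb1)
#   same_size "$local_bytes" "$remote_bytes" || echo "partition sizes differ"
```

The comparison is kept separate from the measurement so that it can be checked without touching a real disk.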

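When DRBD runs, it reads the configuration file /etc/drbd.conf. This file describes the mapping between DRBD devices
and disk partitions, along with a number of DRBD parameters.
Below is a simple example of a drbd.conf file for these two hosts:
<Primary> Turbolinux 10.5, host name g105-1, IP address 10.0.1.2, DRBD partition /dev/hdb1.
<Secondary> Turbolinux 10.5, host name g105-2, IP address 10.0.2.2, DRBD partition /dev/hdb1.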

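       # Whether to take part in the DRBD usage statistics. The default is yes.
       global { usage-count yes; }
       # Maximum network rate for resynchronization between the nodes, in bytes per second.
       common { syncer { rate 1M; } }
       # One DRBD device (i.e. /dev/drbdX) is called a "resource". It holds the
       # information about the primary and secondary nodes of that device.
       #
       resource r0 {
            # Use protocol C: a write counts as complete once the remote host has confirmed it.
            protocol C; 
            net {
                 # Message digest algorithm used to authenticate the two nodes to each other.
                 cram-hmac-alg sha1;
                 shared-secret "FooFunFactory";
            }
            # Each host section starts with "on" followed by the host name; the {} block holds that host's settings.
            on g105-1 {
                 # /dev/drbd1 is backed by the disk partition /dev/hdb1
                 device    /dev/drbd1;
                 disk      /dev/hdb1;
                 # Address and port DRBD listens on to talk to the other host
                 address   10.0.1.2:7898;
                 meta-disk  internal;
            }
            on g105-2 {
                 device    /dev/drbd1;
                 disk      /dev/hdb1;
                 address   10.0.2.2:7898;
                 meta-disk  internal;
            }
       }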

Then copy this drbd.conf file into /etc on both hosts.
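
The host name after each "on" keyword has to match what `uname -n` prints on that machine, so a quick check after copying the file can catch a mismatch early. This is a rough sketch only; the helper names conf_hosts and host_in_conf are hypothetical and the parsing is deliberately naive (it looks only at lines whose first word is "on").

```shell
#!/bin/sh
# conf_hosts: read drbd.conf-style text on stdin and print the host name
# of every "on <host> {" section.
conf_hosts() {
    awk '$1 == "on" { print $2 }'
}

# host_in_conf FILE NAME: succeed if NAME is listed as an "on" host in FILE.
host_in_conf() {
    conf_hosts < "$1" | grep -qxF -- "$2"
}

# Example use on either node:
#   host_in_conf /etc/drbd.conf "$(uname -n)" || echo "this host is not in drbd.conf"
```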

Before starting DRBD, you need to create the metadata area in which DRBD records its own information on the hdb1
partition of each host. Run on both hosts:

[root@g105-1 /]# drbdadm create-md r0
[root@g105-2 /]# drbdadm create-md r0

"r0" is the resource name we defined in drbd.conf.
Now we can start DRBD. Run on both hosts:

[root@g105-1 /]# /etc/init.d/drbd start
[root@g105-2 /]# /etc/init.d/drbd start

Now check the DRBD status. On g105-1 run:

[root@g105-1 /]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@g105-1, 2007-07-28 07:22:30

 1: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

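/proc/drbd shows the current DRBD state. In the first line, st gives the roles of the two hosts: both are "Secondary".
ds is the disk state: both are "Inconsistent".
This is because DRBD cannot tell which side is the primary, that is, whose disk data should be taken as the reference.
So we have to designate an initial primary. On g105-1 run: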

[root@g105-1 /]# drbdsetup /dev/drbd1 primary -o

Now look at the DRBD status on g105-1 again:

[root@g105-1 /]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@g105-1, 2007-07-28 07:22:30

 1: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:42688 nr:0 dw:0 dr:42688 al:0 bm:2 lo:4 pe:0 ua:4 ap:0
        [==>.................] sync'ed: 14.7% (262464/305152)K
        finish: 0:02:58 speed: 1,440 (1,292) K/sec
        resync: used:1/31 hits:2669 misses:3 starving:0 dirty:0 changed:3
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

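The roles are now "Primary/Secondary"; the primary's disk state is "UpToDate" and the secondary's is "Inconsistent".
The third line shows that a sync is in progress: the primary is sending its disk data to the secondary. Progress so far is 14.7%.
Now look at the DRBD status on g105-2: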

[root@g105-2 /]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@g105-2, 2007-07-28 07:13:14

 1: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate C r---
    ns:0 nr:56608 dw:56608 dr:0 al:0 bm:3 lo:0 pe:0 ua:0 ap:0
        [===>................] sync'ed: 20.0% (248544/305152)K
        finish: 0:02:57 speed: 1,368 (1,284) K/sec
        resync: used:0/31 hits:3534 misses:4 starving:0 dirty:0 changed:4
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
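
Instead of reading the progress bar by eye, the percentage can be pulled out of the status text. A small sketch, assuming the "sync'ed: NN.N%" wording shown in the output above; the function name sync_percent is made up for this example.

```shell
#!/bin/sh
# sync_percent: read /proc/drbd-style text on stdin and print the resync
# percentage (for example "14.7"). Prints nothing when no resync is running.
sync_percent() {
    sed -n "s/.*sync'ed: *\([0-9.]*\)%.*/\1/p"
}

# Example use:
#   sync_percent < /proc/drbd
```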

Wait a while until the sync has finished, then check the DRBD status on g105-1 again:

[root@g105-1 /]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@g105-1, 2007-07-28 07:22:30

 1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:305152 nr:0 dw:0 dr:305152 al:0 bm:19 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:19053 misses:19 starving:0 dirty:0 changed:19
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

Both disk states are "UpToDate", which means the sync is complete.
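
The cs, st and ds fields can also be read by a script, for instance to wait until both disks are UpToDate before continuing. A minimal sketch based on the DRBD 8.0 output format shown above; drbd_field and fully_synced are hypothetical helper names, and the parsing assumes a single DRBD device.

```shell
#!/bin/sh
# drbd_field NAME: read /proc/drbd-style text on stdin and print the value
# of the first "NAME:value" token (NAME is cs, st or ds).
drbd_field() {
    tr ' ' '\n' | sed -n "s/^$1://p" | head -n 1
}

# fully_synced: succeed if both disks are reported as UpToDate.
fully_synced() {
    [ "$(drbd_field ds)" = "UpToDate/UpToDate" ]
}

# Example use: block until the initial sync is done.
#   until fully_synced < /proc/drbd; do sleep 10; done
```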

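You can now mount the DRBD device on the primary and use it. The DRBD device on the secondary cannot be mounted,
because it is receiving the primary's data and is managed by DRBD itself.
If the device does not carry a file system yet, create one on /dev/drbd1 on the primary first (for example with mkfs.ext3).
On g105-1 run: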

[root@g105-1 /]# mount /dev/drbd1 /mnt/drbd1
[root@g105-1 /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda3             9.3G  6.5G  2.4G  73% /
/dev/hda1              99M  9.3M   85M  10% /boot
none                  249M     0  249M   0% /dev/shm
/dev/drbd1            289M   11M  264M   4% /mnt/drbd1

Now create a 200 MB file in the /mnt/drbd1 directory:

[root@g105-1 /]# dd if=/dev/zero of=/mnt/drbd1/tempfile1.tmp bs=104857600 count=2

Once that has finished, switch to g105-2 (the secondary).
First stop DRBD:

[root@g105-2 /]# /etc/init.d/drbd stop

Now hdb1 can be mounted directly:

[root@g105-2 /]# mount /dev/hdb1 /mnt/drbd1
[root@g105-2 /]# ls /mnt/drbd1 -hl
total 201M
drwx------  2 root root  12K Jul 28 23:44 lost+found
-rw-r--r--  1 root root 200M Jul 29 00:20 tempfile1.tmp
[root@g105-2 /]# umount /mnt/drbd1

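As you can see, the file tempfile1.tmp created on the primary g105-1 has also been stored completely on the DRBD partition of the secondary g105-2.
This is DRBD's network RAID-1 function: every operation on the primary is replicated to the matching disk partition on the secondary, which gives you a data backup.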

Sometimes you need to swap the DRBD primary and secondary. This can be done as follows.
On the primary, first unmount the DRBD device:

[root@g105-1 /]# umount /mnt/drbd1

Demote the primary to "Secondary":

[root@g105-1 /]# drbdadm secondary r0
[root@g105-1 /]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@g105-1, 2007-07-28 07:13:14

 1: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---
    ns:0 nr:5 dw:5 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

Now both hosts are "Secondary".
On the secondary g105-2, promote it to "Primary":

[root@g105-2 /]# drbdadm primary r0
[root@g105-2 /]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@g105-2, 2007-07-28 07:13:14

 1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:0 nr:5 dw:5 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

g105-2 is now the "Primary". You can mount and use its /dev/drbd1. As before, the data will be
replicated to g105-1.
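
The manual swap above can be wrapped in two small functions, one for each side. This is a sketch only, using the same drbdadm, mount and umount calls as the walkthrough; the wrapper names run, demote and promote and the DRY_RUN switch are assumptions added for illustration. By default the commands are only printed.

```shell
#!/bin/sh
# run CMD...: print the command, or execute it when DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# demote RESOURCE MOUNTPOINT: run on the current primary.
demote() {
    run umount "$2" && run drbdadm secondary "$1"
}

# promote RESOURCE DEVICE MOUNTPOINT: run on the peer afterwards.
promote() {
    run drbdadm primary "$1" && run mount "$2" "$3"
}

# Example use:
#   on g105-1:  DRY_RUN=0 demote r0 /mnt/drbd1
#   on g105-2:  DRY_RUN=0 promote r0 /dev/drbd1 /mnt/drbd1
```

Printing first makes it easy to review the order of operations: the device must be unmounted before it is demoted, and promoted before it is mounted.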

Article reposted from http://mdjhaitao.blog.163.com/blog/static/101431212007922111554577/

#----------------------------------------------------------the end-------#

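I originally meant to write this up myself, but in the end I was too lazy, so I reposted an article from the web. Here I only describe the problems I ran into with it. For NFS configuration, see the other article I reposted.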

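The article above only sets up drbd, nfs and heartbeat separately. As for integrating them for production use, I believe its author never did that.
Below are the problems I hit and how I solved them.
The DRBD source tree contains a file scripts/drbddisk. This is the script heartbeat uses to switch a node between Primary and Secondary. Copy it to /etc/ha.d/resource.d to get automatic switching.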
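
The copy step can be written as a small function. This is a sketch under assumptions: the function name install_drbddisk is invented here, and the source tree location in the usage line is only an example path, since the text does not say where the tarball was unpacked.

```shell
#!/bin/sh
# install_drbddisk SRC_TREE RESOURCE_DIR: copy scripts/drbddisk from the
# DRBD source tree into heartbeat's resource.d directory and make it
# executable.
install_drbddisk() {
    src="$1/scripts/drbddisk"
    [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
    mkdir -p "$2" && cp "$src" "$2/drbddisk" && chmod 755 "$2/drbddisk"
}

# Example use (the source tree path is an assumption):
#   install_drbddisk /usr/src/drbd-8.0.4 /etc/ha.d/resource.d
```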
Below are my heartbeat configuration files.
#-------------------------------------------#
authkeys
auth 2
2 sha1 drbd_test
ha.cf
keepalive 2
deadtime 10
warntime 3
bcast   bond0           # Linux
auto_failback off
node    test01
node    test02
ping 192.16.1.100
respawn hacluster /usr/local/heartbeat/lib/heartbeat/ipfail
use_logd yes
haresources
test01 my-shell
The official documentation uses this form instead:
host-a drbddisk::drbd-resource-0 \
        Filesystem::/dev/drbd0::/share/spool0/data::reiserfs \
        killnfsd \
        nfs-common \
        nfs-kernel-server \
        Delay::3::0 \
        IPaddr::10.0.0.30/24/eth0
#-------------------------------------------#
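bond0 is used above because, on an NFS server, read throughput is limited by the network card and write throughput by the disk. Bonding several network cards so that they serve traffic together removes the network bottleneck.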
#-------------------------------------------#
bonding configuration files
Add to modprobe.conf:
alias bond0 bonding
Add under /etc/sysconfig/network-scripts/:
ifcfg-bond0
DEVICE=bond0
BOOTPROTO="static"
BROADCAST="10.0.0.255"
IPADDR="10.0.0.254"
NETMASK="255.255.255.0"
NETWORK="10.0.0.0"
MACADDR=01:02:03:04:05:06
STARTMODE="onboot"
BONDING_MASTER="yes"
BONDING_MODULE_OPTS="mode=5 miimon=50"
BONDING_SLAVE0="eth0"
BONDING_SLAVE1="eth1"
... ...
ONBOOT=yes
Change the contents of the eth0 ... ethN files to:
DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
#-------------------------------------------#
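
Once the bond is up, the kernel reports each slave's link state in /proc/net/bonding/bond0. The sketch below summarizes that file; the function name bond_slaves is hypothetical, and it relies on the "Slave Interface:" and "MII Status:" lines that the bonding driver prints.

```shell
#!/bin/sh
# bond_slaves: read /proc/net/bonding/<bond> text on stdin and print
# "<interface> <mii-status>" for every slave. The bond-level MII Status
# line (which comes before any slave) is skipped.
bond_slaves() {
    awk -F': ' '
        $1 == "Slave Interface" { slave = $2 }
        $1 == "MII Status" && slave != "" { print slave, $2; slave = "" }
    '
}

# Example use:
#   bond_slaves < /proc/net/bonding/bond0
```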
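Since this is meant to be highly available, the secondary must be able to take over smoothly when the primary server goes down, and data consistency must be preserved.
Let's look at how the official heartbeat example does it, and I'll walk through it.
The official script does four things: 1. switch DRBD to the Primary state, 2. mount the file system, 3. start the NFS service, 4. bring up the cluster IP.
The problem I ran into with this approach is the following. When the primary node goes down or its network cable is pulled, the secondary takes over without trouble and the service stays available. But if you then write data to the DRBD disk of the new primary (the former secondary), a "split brain" occurs as soon as the other machine reboots or its cable is plugged back in. At that point the DRBD data is no longer in sync, and it has to be recovered by hand. Oddly, this rarely happens when the DRBD device is not mounted. The result feels like a half-automatic HA setup: we have to keep watching whether the link has broken, and even rebooting one machine often causes a "split brain". I read the DRBD documentation, which lists many policies that should avoid this situation, but all my experiments failed. I hope someone who got it working can give me some pointers.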
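
One way to notice a split brain without watching the machines by hand is to scan the kernel log for DRBD's message. This is a sketch under assumptions: the exact wording ("Split-Brain detected ...") and the log file location are taken from typical DRBD 8.x systems rather than from this setup, and the function name split_brain_logged is made up.

```shell
#!/bin/sh
# split_brain_logged: read log text on stdin and succeed if it contains
# a DRBD split-brain message. The match is case-insensitive because the
# wording is assumed, not confirmed for this DRBD version.
split_brain_logged() {
    grep -qi 'split-brain detected'
}

# Example use (the log path is an assumption):
#   split_brain_logged < /var/log/messages && echo "split brain seen in log"
```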
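My English is not very good, so I'm pasting this part of the help text here. Those with better English can read it for themselves.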
       -A, --after-sb-0pri asb-0p-policy
              possible policies are:

              disconnect
                     No automatic resynchronisation, simply disconnect.

              discard-younger-primary
                     Auto sync from the node that was primary before the split-brain situation occurred.

              discard-older-primary
                     Auto sync from the node that became primary as second during the split-brain situation.

              discard-zero-changes
                     In case one node did not write anything since the split brain became evident, sync from the node that wrote something to the node that did
                     not write anything. In case none wrote anything this policy uses a random decision to perform a "resync" of 0 blocks. In case both have
                     written something this policy disconnects the nodes.

              discard-least-changes
                     Auto sync from the node that touched more blocks during the split brain situation.

              discard-node-NODENAME
                     Auto sync to the named node.
       -B, --after-sb-1pri asb-1p-policy
              possible policies are:

              disconnect
                     No automatic resynchronisation, simply disconnect.

              consensus
                     Discard the version of the secondary if the outcome of the after-sb-0pri algorithm would also destroy the current secondary’s data.
                     Otherwise disconnect.

              discard-secondary
                     Discard the secondary’s version.

              call-pri-lost-after-sb
                     Always honour the outcome of the after-sb-0pri algorithm. In case it decides the current secondary has the right data, call the
                     pri-lost-after-sb on the current primary.

              violently-as0p
                     Always honour the outcome of the after-sb-0pri algorithm. In case it decides the current secondary has the right data, accept a
                     possible instantaneous change of the primary’s data.

       -C, --after-sb-2pri asb-2p-policy
              possible policies are:

              disconnect
                     No automatic resynchronisation, simply disconnect.

              call-pri-lost-after-sb
                     Always honour the outcome of the after-sb-0pri algorithm. In case it decides the current secondary has the right data, call the
                     pri-lost-after-sb on the current primary.

              violently-as0p
                     Always honour the outcome of the after-sb-0pri algorithm. In case it decides the current secondary has the right data, accept a
                     possible instantaneous change of the primary’s data.

As I understand it, "after-sb" simply means "after split brain" (I don't know whether you will read it the same way...).
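
For reference, these policies can also be set in the net section of drbd.conf. The sketch below only prints one possible combination, built from the policy names in the help text above. It is an untested assumption, not a known-good setting for this cluster, and any automatic policy other than disconnect throws away one node's changes. The function name print_after_sb_net is hypothetical.

```shell
#!/bin/sh
# print_after_sb_net: print a drbd.conf "net" section with one possible
# combination of the after-sb policies (untested here).
print_after_sb_net() {
    cat <<'EOF'
net {
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
}
EOF
}

# Example use: review the fragment before merging it into a resource by hand.
#   print_after_sb_net
```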
My workaround is a script that detects the split brain and decides automatically how to resync.
shell.1, placed in the startup group:
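#!/bin/sh
PATH=$PATH:/sbin:/usr/sbin:/usr/local/bin
[ -f /proc/drbd ] || exit 1
# Secondary/Unknown means this node has lost its connection to the peer.
if grep -q 'Secondary/Unknown' /proc/drbd; then
        # Drop this node's own changes and reconnect so it resyncs from the peer.
        drbdadm disconnect all
        drbdadm -- --discard-my-data connect all
        # Log in to the peer as user "drbd" (password "drbd"); that user's
        # login shell is drbdsh, which makes the peer reconnect.
        (sleep 1;echo 'drbd';sleep 2;echo 'drbd';sleep 3)|telnet 192.16.1.22
fi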
shell.2, run by heartbeat:
#! /bin/bash
#
# chkconfig: 345 15 88
# description: Linux High availability services .

# Source function library.
. /etc/init.d/functions
[ ! -f /etc/sysconfig/network ] && exit 1
. /etc/sysconfig/network

# Check that networking is up.
[ "${NETWORKING}" = "no" ] && exit 0

# if the ip configuration utility isn't around we can't function.
[ -x /sbin/ip ] || exit 1
[ -f /proc/drbd ] || exit 1
# Wait while this node is still the sync source for an inconsistent peer.
while grep -qE "SyncSource.*Inconsistent" /proc/drbd 2>/dev/null
   do
        sleep 10
   done

start () {
        sleep 5
        ip addr add 192.16.1.20/24 brd 192.16.1.255 dev bond0
        /etc/init.d/portmap start
        drbdadm primary all
        mount -t ext3 -o rw /dev/drbd0 /mnt/disk0
        mount -t ext3 -o rw /dev/drbd1 /mnt/disk1
        /etc/init.d/nfs start
        /etc/init.d/nfslock start
        exportfs -avr
        return $RETVAL
}
stop () {
        ip addr del 192.16.1.20/24 brd 192.16.1.255 dev bond0
        /etc/init.d/nfs stop
        /etc/init.d/nfslock stop
        umount /mnt/disk0
        umount /mnt/disk1
        drbdadm secondary all
        /etc/init.d/portmap stop
        if ( grep 'Secondary/Unknown' /proc/drbd );then
        exec /etc/rc.d/my-shell.sh;fi
        return $RETVAL
}

# See how we were called.
case "$1" in
start|stop)
        $1
        ;;
restart|reload)
        stop
        start
        ;;
*)
        echo $"Usage: $0 {start|stop|restart|reload}"
        exit 1
esac

exit 0
Add the user drbd
passwd
drbd:x:105:105:DRBD:/home/drbd:/sbin/drbdsh
The drbdsh file
#!/bin/bash
# Variables and Function definition
PATH=$PATH:/sbin:/usr/sbin:/usr/local/bin

#Program Main
[ -f /proc/drbd ] || exit 1
TEMP=$(drbdadm state all)
D_STATE=(${TEMP//\// })
if ( echo ${D_STATE[@]}|grep Primary >/dev/null 2>&1 ) && \
   ( echo ${D_STATE[@]}|grep Unknown >/dev/null 2>&1 );then
        drbdadm connect all
        else exit 1
fi
exit 0
Enable telnet
Add to hosts.deny: in.telnetd: ALL EXCEPT 192.16.1.22
This ensures the data is brought back in sync after every boot.
Below is part of my configuration:
drbd.conf
global { usage-count yes; }
common { syncer { rate 10M; } }
resource r0 {
        protocol C;
        handlers { pri-on-incon-degr "halt -f"; }
        disk { on-io-error detach; }
        net {
                cram-hmac-alg sha1;
                shared-secret "800hr_disk_0";
        }
        on test01 {
                device          /dev/drbd0;
                disk            /dev/sda6;
                address         192.16.1.21:7789;
                meta-disk       internal;
        }
        on test02 {
                device          /dev/drbd0;
                disk            /dev/sda6;
                address         192.16.1.22:7789;
                meta-disk       internal;
        }
}
