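1. Introduction
This document describes how to combine LVS and keepalived to give a MySQL Cluster both high availability and load balancing. It builds on the first document in this series (two-node high availability) by adding LVS and keepalived, and with minor changes it can be adapted to clusters with more nodes.
Environment and software packages: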
vmware workstation 5.5.3
mysql-5.2.3-falcon-alpha.tar.gz
gentoo 2006.1
ipvsadm-1.24.tar.gz
keepalived-1.1.13.tar.gz
linux-2.6.20.3.tar.bz2
iproute2-2.6.15-060110.tar.gz
Server1: 192.168.1.111 (ndb_mgmd, id=1)
Server2: 192.168.1.110 (ndb_mgmd,id=2)
2. Installing MySQL on Server1 and Server2
Perform the following steps on both Server1 and Server2.
# mkdir -p /tmp/package
# mv mysql-5.2.3-falcon-alpha.tar.gz /tmp/package
# cd /tmp/package
# groupadd mysql
# useradd -g mysql mysql
# tar -zxvf mysql-5.2.3-falcon-alpha.tar.gz
# rm -f mysql-5.2.3-falcon-alpha.tar.gz
# mv mysql-5.2.3-falcon-alpha mysql
# cd mysql
# ./configure --prefix=/usr --with-extra-charsets=complex --with-plugin-ndbcluster --with-plugin-partition --with-plugin-innobase
# make && make install
# ln -s /usr/libexec/ndbd /usr/bin
# ln -s /usr/libexec/ndb_mgmd /usr/bin
# ln -s /usr/libexec/ndb_cpcd /usr/bin
# ln -s /usr/libexec/mysqld /usr/bin
# ln -s /usr/libexec/mysqlmanager /usr/bin
# mysql_install_db --user=mysql
3. Installing and configuring the nodes
Perform the following steps on both Server1 and Server2.
Create the management node configuration file:
# mkdir /var/lib/mysql-cluster
# cd /var/lib/mysql-cluster
# vi config.ini
Add the following to config.ini:
[ndbd default]
NoOfReplicas= 2
MaxNoOfConcurrentOperations= 10000
DataMemory= 80M
IndexMemory= 24M
TimeBetweenWatchDogCheck= 30000
DataDir= /var/lib/mysql-cluster
MaxNoOfOrderedIndexes= 512
StartPartialTimeout=100
StartPartitionedTimeout=100
ArbitrationTimeout=5000
TransactionDeadlockDetectionTimeout=5000
HeartbeatIntervalDbDb=5000
StopOnError=0
[ndb_mgmd default]
DataDir= /var/lib/mysql-cluster
[ndb_mgmd]
Id=1
HostName= 192.168.1.111
[ndb_mgmd]
Id=2
HostName= 192.168.1.110
[ndbd]
Id= 3
HostName= 192.168.1.111
[ndbd]
Id= 4
HostName= 192.168.1.110
[mysqld]
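# Very important: with this setting the SQL nodes can compete as arbitrators, so when the other machine goes down, the surviving machine still has a node that can confirm its partition is complete.
ArbitrationRank=2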
[mysqld]
ArbitrationRank=2
[tcp default]
PortNumber= 63132
Create a common my.cnf file; mysqld, ndbd, and ndb_mgmd all read it.
# vi /etc/my.cnf
Add the following to my.cnf:
[mysqld]
# Make NDBCLUSTER the default engine so SQL statements do not need ENGINE=NDBCLUSTER.
default-storage-engine=ndbcluster
ndbcluster
ndb-connectstring=192.168.1.111,192.168.1.110
[ndbd]
connect-string=192.168.1.111,192.168.1.110
[ndb_mgm]
connect-string=192.168.1.111,192.168.1.110
[ndb_mgmd]
config-file=/var/lib/mysql-cluster/config.ini
[mysql_cluster]
ndb-connectstring= 192.168.1.111,192.168.1.110
Save and exit, then start the management node on Server1:
# ndb_mgmd --ndb_nodeid=1
Start the management node on Server2:
# ndb_mgmd --ndb_nodeid=2
A warning is printed at startup:
Cluster configuration warning:
arbitrator with id 1 and db node with id 3 on same host 192.168.1.111
arbitrator with id 2 and db node with id 4 on same host 192.168.1.110
Running arbitrator on the same host as a database node may
cause complete cluster shutdown in case of host failure.
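The warning says that arbitrator 1 shares a host with data node 3, and arbitrator 2 with data node 4, so a host failure could shut down the whole cluster. In this two-machine setup it can be ignored.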
4. Initializing the cluster
On Server1:
# ndbd --ndb_nodeid=3 --initial
On Server2:
# ndbd --ndb_nodeid=4 --initial
Note: the --initial option is needed only the first time ndbd is started, or after config.ini has been changed.
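The first half of the note above can be partly automated. The following is a minimal sketch, not part of the original procedure: the function name ndbd_args is hypothetical, and it assumes that ndbd keeps its local file system in a directory named ndb_<id>_fs under DataDir, so a missing directory means the node has never been started. It does not detect changes to config.ini; that case still has to be handled by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the ndbd arguments for a data node.
# Assumption: ndbd stores its files in <DataDir>/ndb_<id>_fs, so if
# that directory is missing this is the node's first start.
ndbd_args() {
    node_id=$1
    data_dir=${2:-/var/lib/mysql-cluster}
    if [ -d "$data_dir/ndb_${node_id}_fs" ]; then
        # The node has been started before: a normal start keeps its data.
        echo "--ndb_nodeid=$node_id"
    else
        # First start: ask ndbd to initialize its file system.
        echo "--ndb_nodeid=$node_id --initial"
    fi
}
```

On Server1 it could be used as ndbd $(ndbd_args 3). Because --initial discards the node's local data files, printing the arguments first and reviewing them is the safer habit.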
5. Checking the cluster status
Start the management client on either machine:
# ndb_mgm
Enter the show command to see the current status (sample output below):
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 192.168.1.111:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 @192.168.1.111 (Version: 5.2.3, Nodegroup: 0, Master)
id=4 @192.168.1.110 (Version: 5.2.3, Nodegroup: 0)
[ndb_mgmd(MGM)] 2 node(s)
id=1 @192.168.1.111 (Version: 5.2.3)
id=2 @192.168.1.110 (Version: 5.2.3)
[mysqld(API)] 2 node(s)
id=5 (not connected, accepting connect from any host)
id=6 (not connected, accepting connect from any host)
ndb_mgm>
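If everything above looks right, add the mysqld (API) nodes.
Note: this document does not set a MySQL root password; you should set one yourself on both Server1 and Server2.
On Server1:
# mysqld_safe --ndb_nodeid=5 --user=mysql &
On Server2:
# mysqld_safe --ndb_nodeid=6 --user=mysql &
Then check the status again:
# ndb_mgm -e show
The output should look like this: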
Connected to Management Server at: 192.168.1.111:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 @192.168.1.111 (Version: 5.2.3, Nodegroup: 0, Master)
id=4 @192.168.1.110 (Version: 5.2.3, Nodegroup: 0)
[ndb_mgmd(MGM)] 2 node(s)
id=1 @192.168.1.111 (Version: 5.2.3)
id=2 @192.168.1.110 (Version: 5.2.3)
[mysqld(API)] 2 node(s)
id=5 @192.168.1.111 (Version: 5.2.3)
id=6 @192.168.1.110 (Version: 5.2.3)
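Reading the status listing by eye works for two machines, but the same check can be scripted. The sketch below is an illustration only: the function names count_disconnected and cluster_ready are hypothetical, and it relies solely on the "not connected" wording visible in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: count the nodes reported as "not connected"
# in `ndb_mgm -e show` output read from standard input.
count_disconnected() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so the exit status is neutralized with "|| true".
    grep -c 'not connected' || true
}

# Hypothetical helper: succeed only if every node in the listing
# read from standard input is connected.
cluster_ready() {
    [ "$(count_disconnected)" -eq 0 ]
}
```

A possible use is ndb_mgm -e show | cluster_ready && echo "all nodes connected", for example before starting the SQL tests in the next step.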
Now the cluster can be tested.
On Server1:
# mysql -u root -p
> create database aa;
> use aa;
> CREATE TABLE ctest (i INT) ;
> INSERT INTO ctest () VALUES (1);
> SELECT * FROM ctest;
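The SELECT should report one row returned (the value 1).
If that works, switch to Server2 and run the same SELECT. If it succeeds, run an INSERT on Server2, then go back to Server1 and check that the new row is visible.
If all of this works, congratulations: the cluster is working!
6. Destructive testing
Unplug the network cable of Server1 or Server2 (or simulate it with ifconfig eth0 down) and check that the other cluster server keeps working (a SELECT query is enough). When the test is done, plug the cable back in.
Note: this test is not meaningful until the cluster has handled at least one read or write. Right after initialization there are only a few empty directories under /var/lib/mysql-cluster/ and the nodes are not yet cooperating, so all data nodes will shut down.
Alternatively, test it this way on Server1 or Server2: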
# ps aux | grep ndbd
This lists all ndbd processes:
root 5578 0.0 0.3 6220 1964 ? S 03:14 0:00 ndbd
root 5579 0.0 20.4 492072 102828 ? R 03:14 0:04 ndbd
root 23532 0.0 0.1 3680 684 pts/1 S 07:59 0:00 grep ndbd
Then kill the ndbd processes (both PIDs listed above) to simulate a failure of this cluster server:
# kill -9 5578 5579
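Then run a SELECT query on the other cluster server. The show command in the management client will also report the state of the server that was taken down.
After the test, simply restart ndbd on the affected server: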
# ndbd --ndb_nodeid=<id of this node>
Note: as stated earlier, do not add the --initial option here!
The two-machine MySQL cluster is now configured. (The three documents repeat some material so that each one can be used on its own.)
7. Installing the kernel (linux-2.6.20.3.tar.bz2)
Perform the following steps on both Server1 and Server2.
# tar xvjf linux-2.6.20.3.tar.bz2 -C /usr/src
# cd /usr/src/linux-2.6.20.3
# zcat /proc/config.gz > .config
# make menuconfig
Select [*] Network packet filtering framework (Netfilter) --->. Once it is enabled, the following entry appears below
[ ] TCP: MD5 Signature Option support (RFC2385) (EXPERIMENTAL):
IP: Virtual Server Configuration --->
Choose the Netfilter and Virtual Server options according to your own needs.
Also select [*] IP: advanced router
Choose IP: FIB lookup algorithm (choose FIB_HASH if unsure) (FIB_HASH) --->
[*] IP: policy routing
# make all && make modules_install && make install
# vi /boot/grub.conf and add:
title=2.6.20.3
kernel /vmlinuz-2.6.20.3 root=/your-root-device
# reboot (boot into the new kernel)
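Before building, or after booting the new kernel, it can help to confirm that the options chosen in menuconfig actually ended up in the kernel configuration. The sketch below is an assumption-laden illustration: the function names are hypothetical, the default path is the source tree used above, and the option names (CONFIG_NETFILTER, CONFIG_IP_VS, CONFIG_IP_ADVANCED_ROUTER, CONFIG_IP_MULTIPLE_TABLES) are the kernel symbols behind the menu entries listed in this section.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the option is built in (=y) or
# built as a module (=m) in the given kernel .config file.
kconfig_enabled() {
    option=$1
    config_file=${2:-/usr/src/linux-2.6.20.3/.config}
    grep -Eq "^${option}=(y|m)$" "$config_file"
}

# Hypothetical helper: report the state of the options that this
# section asks for (Netfilter, IPVS, advanced router, policy routing).
check_lvs_kconfig() {
    config_file=${1:-/usr/src/linux-2.6.20.3/.config}
    for opt in CONFIG_NETFILTER CONFIG_IP_VS \
               CONFIG_IP_ADVANCED_ROUTER CONFIG_IP_MULTIPLE_TABLES; do
        if kconfig_enabled "$opt" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: MISSING"
        fi
    done
}
```

Running check_lvs_kconfig with no argument inspects /usr/src/linux-2.6.20.3/.config; any line ending in MISSING points at a menuconfig entry that still needs to be selected.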
8. Installing ipvsadm and keepalived
# tar -zxvf ipvsadm-1.24.tar.gz -C /tmp/package
# cd /tmp/package/ipvsadm-1.24
# make && make install
# tar -zxvf keepalived-1.1.13.tar.gz -C /tmp/package
# cd /tmp/package/keepalived-1.1.13
# vi keepalived/vrrp/vrrp_arp.c
Change lines 26-31 from
26 #include
27
28 /* local includes */
29 #include "vrrp_arp.h"
30 #include "memory.h"
31 #include "utils.h"
to
26 /* local includes */
27 #include "vrrp_arp.h"
28 #include "memory.h"
29 #include "utils.h"
30 #include
31
In other words, move the #include line from line 26 to below the local includes.
# ./configure --prefix=/usr --with-kernel-dir=/usr/src/linux-2.6.20.3
# make && make install
# vi /etc/init.d/keepalived and add the following:
#!/sbin/runscript
# Copyright 1999-2004 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/sys-cluster/keepalived/files/init-keepalived,v 1.3 2004/07/15 00:55:17 agriffis Exp $
depend() {
    use logger
    need net
}
checkconfig() {
    if [ ! -e /etc/keepalived/keepalived.conf ] ; then
        eerror "You need an /etc/keepalived/keepalived.conf file to run keepalived"
        return 1
    fi
}
start() {
    checkconfig || return 1
    ebegin "Starting Keepalived"
    start-stop-daemon --start --quiet --pidfile /var/run/keepalived.pid \
        --startas /usr/sbin/keepalived
    eend $?
}
stop() {
    ebegin "Stopping Keepalived"
    start-stop-daemon --stop --quiet --pidfile /var/run/keepalived.pid
    eend $?
}
This is the Gentoo init script for keepalived.
# chmod 755 /etc/init.d/keepalived
# rc-update add keepalived default
# vi /etc/keepalived/keepalived.conf and add:
! Configuration File for keepalived
global_defs {
    router_id mysql_cluster
}
# High-availability (VRRP) part
vrrp_sync_group VG1 {
    group {
        VI_1
    }
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    lvs_sync_daemon_interface eth0
    # This setup uses 1 on Server1 and 2 on Server2.
    # Note: VRRP peers that should fail over to each other normally share the same virtual_router_id.
    virtual_router_id 1
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass mysqlcluster
    }
    virtual_ipaddress {
        192.168.1.120
    }
}
# Load-balancing part, using direct routing (DR)
virtual_server 192.168.1.120 3306 {
    delay_loop 6
    lvs_sched wlc
    lvs_method DR
    persistence_timeout 60
    ha_suspend
    protocol TCP
    real_server 192.168.1.110 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
        }
    }
    real_server 192.168.1.111 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
        }
    }
}
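9. Starting keepalived
# /etc/init.d/keepalived start
# ip addr list (if iproute2 is not installed this command is missing; install it with emerge iproute2. Note that emerge is a Gentoo command.)
Output similar to the following appears: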
eth0: mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0c:29:6f:f9:21 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.111/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.120/32 scope global eth0 (this line shows that the virtual IP is active)
inet6 fe80::20c:29ff:fe6f:f921/64 scope link
valid_lft forever preferred_lft forever
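The check for the virtual IP in the listing above can also be scripted, which is handy when testing failover between the two servers. This is a minimal sketch under stated assumptions: the function name has_vip is hypothetical, and it only looks for an "inet <address>/" entry like the one in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given IPv4 address appears as
# an "inet" entry in `ip addr` output read from standard input.
has_vip() {
    vip=$1
    # Escape the dots so they match literally in the regular expression.
    pattern=$(printf '%s' "$vip" | sed 's/\./\\./g')
    # The trailing "/" keeps 192.168.1.12 from matching 192.168.1.120.
    grep -Eq "inet ${pattern}/"
}
```

For example, ip addr list dev eth0 | has_vip 192.168.1.120 && echo "virtual IP is active on this host". The LVS side can be inspected separately with ipvsadm -L -n, which lists the virtual server and its real servers.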
# tail /var/log/messages (shows more details)
The log looks similar to this:
Keepalived: Starting Keepalived v1.1.13 (03/26,2007)
Keepalived_healthcheckers: Using LinkWatch kernel netlink reflector...
Keepalived_healthcheckers: Registering Kernel netlink reflector
Keepalived_healthcheckers: Registering Kernel netlink command channel
Keepalived_healthcheckers: Configuration is using : 9997 Bytes
Keepalived: Starting Healthcheck child process, pid=27738
Keepalived_vrrp: Using LinkWatch kernel netlink reflector...
Keepalived_vrrp: Registering Kernel netlink reflector
Keepalived_vrrp: Registering Kernel netlink command channel
Keepalived_vrrp: Registering gratutious ARP shared channel
Keepalived_vrrp: Configuration is using : 36549 Bytes
Keepalived: Starting VRRP child process, pid=27740
Keepalived_healthcheckers: Activating healtchecker for service [192.168.1.110:3306]
Keepalived_healthcheckers: Activating healtchecker for service [192.168.1.111:3306]
IPVS: sync thread started: state = MASTER, mcast_ifn = eth0, syncid = 2
Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Keepalived_vrrp: VRRP_Group(VG1) Syncing instances to MASTER state
Keepalived_vrrp: Netlink: skipping nl_cmd msg...
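10. Conclusion
These three documents look at how to make better use of MySQL, Linux, and related tools from the perspective of MySQL Cluster applications. Corrections for any omissions or mistakes are welcome.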