Deployment and Principles of MHA, a MySQL High-Availability Solution - huangxifeng607 - ChinaUnix blog

Category: MySQL/PostgreSQL

2021-08-05 18:04:18

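Cluster information

Role              IP address       Server ID  Type
Master            192.168.138.128  1          Writes
Candidate master  192.168.138.129  2          Reads
Slave             192.168.138.130  3          Reads
Monitor host      192.168.138.131  -          Monitors the cluster

Note: all hosts run RHEL 7.

The master serves writes, and both the candidate master and the slave serve reads. If the master goes down, the candidate master is promoted to be the new master and the slave is repointed to it.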

I. Install MHA Node on all nodes

1. Install the Perl modules required by MHA Node (DBD::mysql) on the MySQL servers

# yum install perl-DBD-MySQL -y
# yum install perl-ExtUtils-MakeMaker -y
# yum install perl-CPAN -y

2. Install MHA Node on all nodes

Download mha4mysql-node-0.56.tar.gz, then unpack, build, and install it:

# tar xvf mha4mysql-node-0.56.tar.gz

# cd mha4mysql-node-0.56

# perl Makefile.PL
# make

# make install

MHA Node is now installed. The following scripts are created under /usr/local/bin:

# ll /usr/local/bin/
total 44
-r-xr-xr-x 1 root root 16367 Jul 20 07:00 apply_diff_relay_logs
-r-xr-xr-x 1 root root 4807 Jul 20 07:00 filter_mysqlbinlog
-r-xr-xr-x 1 root root 8261 Jul 20 07:00 purge_relay_logs
-r-xr-xr-x 1 root root 7525 Jul 20 07:00 save_binary_logs
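
A quick way to confirm the installation on each node is to check that all four scripts exist and are executable. The following is only a minimal sketch: check_mha_node_tools is a hypothetical helper name, not something shipped with MHA, and the directory defaults to /usr/local/bin as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MHA): report every MHA Node script that is
# missing or not executable in the given directory.
check_mha_node_tools() {
    dir="${1:-/usr/local/bin}"
    missing=0
    for tool in apply_diff_relay_logs filter_mysqlbinlog purge_relay_logs save_binary_logs; do
        if [ ! -x "$dir/$tool" ]; then
            echo "missing: $dir/$tool"
            missing=1
        fi
    done
    # Exit status 0 means all four scripts were found.
    return "$missing"
}
```

For example, run check_mha_node_tools && echo "MHA Node looks installed" on each node.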

II. Deploy MHA Manager on the monitor host

# tar xvf mha4mysql-manager-0.56.tar.gz

# cd mha4mysql-manager-0.56

# perl Makefile.PL

# make

# make install

When this finishes, the following files are added under /usr/local/bin:
# ll /usr/local/bin/
total 40
-r-xr-xr-x 1 root root 1991 Jul 20 00:50 masterha_check_repl
-r-xr-xr-x 1 root root 1775 Jul 20 00:50 masterha_check_ssh
-r-xr-xr-x 1 root root 1861 Jul 20 00:50 masterha_check_status
-r-xr-xr-x 1 root root 3197 Jul 20 00:50 masterha_conf_host
-r-xr-xr-x 1 root root 2513 Jul 20 00:50 masterha_manager
-r-xr-xr-x 1 root root 2161 Jul 20 00:50 masterha_master_monitor
-r-xr-xr-x 1 root root 2369 Jul 20 00:50 masterha_master_switch
-r-xr-xr-x 1 root root 5167 Jul 20 00:50 masterha_secondary_check
-r-xr-xr-x 1 root root 1735 Jul 20 00:50 masterha_stop

III. Configure passwordless SSH login

1. On the manager, set up passwordless login to all node hosts

# ssh-keygen

Press Enter at every prompt.

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.128

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.129

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.130

2. On the master (192.168.138.128)

# ssh-keygen

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.129

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.130

3. On the candidate master (192.168.138.129)

# ssh-keygen

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.128

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.130

4. On the slave (192.168.138.130)

# ssh-keygen

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.128

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.138.129
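
The pattern above is that every host copies its key to each MySQL node other than itself. As a rough sketch, the commands for one host can be generated rather than typed by hand. The helper name print_copy_id_commands is hypothetical, and the node list is taken from the cluster table in this article; nothing is executed, the commands are only printed for review.

```shell
#!/bin/sh
# MySQL nodes from the cluster table; adjust for your environment.
NODES="192.168.138.128 192.168.138.129 192.168.138.130"

# Hypothetical helper: print the ssh-copy-id commands that the host given as
# $1 has to run, one per node other than itself. The monitor host is not in
# NODES, so it gets a command for every node.
print_copy_id_commands() {
    self="$1"
    for node in $NODES; do
        if [ "$node" != "$self" ]; then
            echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$node"
        fi
    done
}
```

Calling print_copy_id_commands 192.168.138.131 prints three commands for the monitor host, while print_copy_id_commands 192.168.138.128 prints the two commands for the master.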

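IV. Set up master-slave replication

1. Take a backup on the master

# mysqldump --master-data=2 --single-transaction -R --triggers -A > all.sql

Here, -R includes stored procedures and functions, --triggers includes triggers, and -A dumps all databases. --master-data=2 writes the binary log coordinates into the dump as a commented-out CHANGE MASTER statement.

2. Create the replication user on the master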

mysql> grant replication slave on *.* to 'repl'@'192.168.%' identified by 'repl';
Query OK, 0 rows affected (0.09 sec)

3. Look up the CHANGE MASTER statement in the backup file all.sql

# head -n 30 all.sql

-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=120;
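
Instead of reading the coordinates off the screen, they can be pulled out of the dump with sed. This is a minimal sketch that assumes the commented line has exactly the form shown above; binlog_coordinates is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "<binlog file> <position>" taken from the
# commented-out CHANGE MASTER line that --master-data=2 writes into the dump.
# Only the first matching line of the file given as $1 is used.
binlog_coordinates() {
    sed -n "s/^-- CHANGE MASTER TO MASTER_LOG_FILE='\([^']*\)', MASTER_LOG_POS=\([0-9]*\);.*/\1 \2/p" "$1" | head -n 1
}
```

For the dump in this article, binlog_coordinates all.sql prints mysql-bin.000002 120, which are the values used in the CHANGE MASTER TO statement in step 5.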

4. Copy the backup file to the candidate master and the slave

# scp all.sql 192.168.138.129:/root/

# scp all.sql 192.168.138.130:/root/

5. Set up replication on the candidate master

# mysql < all.sql

mysql> CHANGE MASTER TO
-> MASTER_HOST='192.168.138.128',
-> MASTER_USER='repl',
-> MASTER_PASSWORD='repl',
-> MASTER_LOG_FILE='mysql-bin.000002',
-> MASTER_LOG_POS=120;
Query OK, 0 rows affected, 2 warnings (0.19 sec)

mysql> start slave;
Query OK, 0 rows affected (0.02 sec)

mysql> show slave status\G

6. Set up replication on the slave

Repeat the commands from step 5 on 192.168.138.130.

7. Set the slave servers to read-only

mysql> set global read_only=1;
Query OK, 0 rows affected (0.04 sec)

8. Create the monitoring user on the master

mysql> grant all privileges on *.* to 'monitor'@'%' identified by 'monitor123';
Query OK, 0 rows affected (0.07 sec)

V. Configure MHA

1. On the monitor host (192.168.138.131), create the MHA directories and the configuration file

# mkdir -p /etc/masterha
# mkdir -p /masterha/app1

# vim /etc/masterha/app1.cnf

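[server default]
manager_log=/masterha/app1/manager.log //log file of the manager
manager_workdir=/masterha/app1 //working directory of the manager
master_binlog_dir=/var/lib/mysql //directory where the master stores its binlogs, so that MHA can find them
master_ip_failover_script= /usr/local/bin/master_ip_failover //switchover script run during automatic failover
master_ip_online_change_script= /usr/local/bin/master_ip_online_change //switchover script run during a manual switch
user=monitor //monitoring user
password=monitor123 //password of the monitoring user
ping_interval=1 //interval in seconds between pings to the master (default 3); automatic failover starts after three attempts without a response
remote_workdir=/tmp //directory on the remote MySQL hosts where binlogs are saved during a switch
repl_user=repl //replication user of the replication setup
repl_password=repl //password of the replication user
report_script=/usr/local/bin/send_report //script that sends an alert after a switch
secondary_check_script= /usr/local/bin/masterha_secondary_check -s 192.168.138.129 -s 192.168.138.130 --user=root --master_host=192.168.138.128 --master_ip=192.168.138.128 --master_port=3306 //if monitoring from the MHA Manager to the master fails, the manager checks whether the other two slaves can still connect to port 3306 on master_ip
shutdown_script="" //script that shuts down the failed host after a failure (its main purpose is to prevent split-brain)
ssh_user=root //SSH login user

[server1]
hostname=192.168.138.128
port=3306

[server2]
hostname=192.168.138.129
port=3306
candidate_master=1 //marks this host as a candidate master: after a failover this slave is promoted to master even if it is not the most up-to-date slave in the cluster
check_repl_delay=0 //by default MHA does not pick a slave as the new master if it is more than 100 MB of relay logs behind the master, because recovering it would take a long time; with check_repl_delay=0 MHA ignores replication delay when choosing the new master, which is very useful on a host with candidate_master=1 because it ensures that the candidate is the one promoted during the switch

[server3]
hostname=192.168.138.130
port=3306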

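Notes:

1> When editing this file, be sure to delete the trailing annotations; MHA does not treat the text after // as a comment.

2> The configuration references master_ip_failover_script, secondary_check_script, master_ip_online_change_script, and report_script; the corresponding files are given at the end of the article.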
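
If the annotated listing was pasted as is, the annotations can be stripped mechanically instead of by hand. The following is a rough sketch that assumes every annotation starts with whitespace followed by //, as in the listing above; strip_annotations and the input file name app1.cnf.annotated are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the file given as $1 with every trailing
# "<whitespace>//<text>" annotation removed. Paths such as /usr/local/bin are
# left alone because they contain no "//" preceded by whitespace.
strip_annotations() {
    sed 's|[[:space:]][[:space:]]*//.*$||' "$1"
}
```

For example: strip_annotations app1.cnf.annotated > /etc/masterha/app1.cnf. It is worth reading the result once before using it, since any value that legitimately contains whitespace followed by // would be cut as well.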

2. Configure relay log purging (on every slave)

mysql> set global relay_log_purge=0;
Query OK, 0 rows affected (0.00 sec)

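During a switch, MHA relies on the relay logs to recover the slaves, so automatic relay log purging has to be turned OFF and the relay logs purged manually instead.

By default, a slave deletes its relay logs automatically once the SQL thread has executed them. In an MHA environment those relay logs may still be needed to recover other slaves, so automatic purging must be disabled, and the relay logs that the SQL thread has already applied are purged by hand at regular intervals.

On an ext3 filesystem, deleting a large file takes a noticeable amount of time, which can cause serious replication delay. On Linux, large files are therefore usually deleted by way of hard links.

3. Schedule a periodic relay log cleanup script

MHA Node ships the purge_relay_logs script. It creates hard links to the relay logs, executes SET GLOBAL relay_log_purge=1, waits a few seconds so that the SQL thread can switch to a new relay log, and then executes SET GLOBAL relay_log_purge=0.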
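
The hard-link behaviour this relies on can be seen with ordinary files: removing one name of a file only drops a link, and the data stays reachable through the other name until that one is removed too. The following is a demonstration only, with made-up file names in a temporary directory; it does not touch any MySQL files and is not how purge_relay_logs itself is implemented.

```shell
#!/bin/sh
# Demonstration only: file names are made up and everything happens in a
# temporary directory, not in the MySQL data directory.
demo_hardlink() (
    dir=$(mktemp -d) || exit 1
    echo "relay log contents" > "$dir/relay-bin.000001"
    # Second name for the same inode; both names must be on one filesystem.
    ln "$dir/relay-bin.000001" "$dir/relay-bin.000001.link"
    # Removing the original name only drops one link; the data is still there.
    rm "$dir/relay-bin.000001"
    cat "$dir/relay-bin.000001.link"
    rm -rf "$dir"
)
```

Running demo_hardlink prints relay log contents even though the original name has already been removed.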

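The script is used as follows:

# purge_relay_logs --user=monitor --password=monitor123 --disable_relay_log_purge --workdir=/tmp/

where

--user: MySQL user name

--password: password of the MySQL user

--host: address of the MySQL server

--workdir: directory in which the hard links to the relay logs are created; the default is /var/tmp. Hard links cannot be created across partitions, so specify a directory on the same partition as the relay logs.

--disable_relay_log_purge: by default, the script exits without doing anything if relay_log_purge=1. With this option the script does not exit: it purges the relay logs and then sets relay_log_purge to 0.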

Set up crontab to purge the relay logs periodically

MHA invokes the mysqlbinlog command directly during a switch, so the directory containing mysqlbinlog must be on PATH (see step 4).

# vim /etc/cron.d/purge_relay_logs

0 4 * * * root /usr/local/bin/purge_relay_logs --user=monitor --password=monitor123 --disable_relay_log_purge --workdir=/tmp/ >> /tmp/purge_relay_logs.log 2>&1

Note: it is best to run this job at a different time on each slave.
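
One simple way to stagger the job is to derive the start time from a per-slave index. This is only a sketch: purge_cron_line is a hypothetical helper, and the 20-minute spacing starting at 04:00 is an arbitrary assumption. Entries in /etc/cron.d carry a user field, which is root here.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/cron.d line for the N-th slave
# (0, 1, 2, ...), shifting the start time by 20 minutes per slave from 04:00.
# Intended for a handful of slaves; the hour is not wrapped past 23.
purge_cron_line() {
    offset=$(( $1 * 20 ))
    hour=$(( 4 + offset / 60 ))
    minute=$(( offset % 60 ))
    echo "$minute $hour * * * root /usr/local/bin/purge_relay_logs --user=monitor --password=monitor123 --disable_relay_log_purge --workdir=/tmp/ >> /tmp/purge_relay_logs.log 2>&1"
}
```

With this spacing, index 0 runs at 04:00, index 1 at 04:20, and index 3 at 05:00. On the second slave, for example: purge_cron_line 1 > /etc/cron.d/purge_relay_logs.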

4. Add the mysqlbinlog path to the PATH environment variable
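
The article gives no commands for this step. A minimal sketch of one way to check it follows; require_on_path is a hypothetical helper, and the MySQL install location used in the usage note is an assumption that has to be adapted.

```shell
#!/bin/sh
# Hypothetical helper: print the resolved path and succeed if the command
# given as $1 can be found on PATH; otherwise print a hint and fail.
require_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}
```

For example, on each MySQL node: require_on_path mysqlbinlog || ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog, where /usr/local/mysql/bin stands for wherever MySQL is actually installed. A symlink into a standard directory is used here rather than a profile edit because commands started over non-interactive SSH may not read the login profile.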

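VI. Check the SSH configuration

Run on the monitor host:

# masterha_check_ssh --conf=/etc/masterha/app1.cnf

VII. Check the replication state of the whole cluster

Run on the monitor host:

# masterha_check_repl --conf=/etc/masterha/app1.cnf

VIII. Check the state of MHA Manager

# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).

When the manager is running normally the command reports "PING_OK"; otherwise it reports "NOT_RUNNING", which means MHA monitoring has not been started yet.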

IX. Start MHA Manager monitoring

# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 2>&1 &

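where

remove_dead_master_conf: after a failover, the entry (IP) of the old master is removed from the configuration file.

ignore_last_failover: by default, after a failover MHA creates the file app1.failover.complete under /masterha/app1. If that file still exists at the next failover and less than 8 hours have passed since the previous one, the failover is refused, unless the file was removed by hand after the first failover (rm -rf /masterha/app1/app1.failover.complete). This option tells MHA to ignore the file left by the previous failover.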
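
When running without ignore_last_failover, it can be useful to know in advance whether such a marker would block the next failover. The following is a rough sketch of that check, not MHA's own logic: recent_failover_marker is a hypothetical helper, and it relies on the -maxdepth and -mmin options of find as available on RHEL.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a *.failover.complete marker younger than
# 8 hours (480 minutes) exists directly in the manager working directory.
recent_failover_marker() {
    workdir="${1:-/masterha/app1}"
    [ -n "$(find "$workdir" -maxdepth 1 -name '*.failover.complete' -mmin -480 2>/dev/null)" ]
}
```

For example: recent_failover_marker /masterha/app1 && echo "a recent failover marker is present".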

Check that MHA Manager monitoring is running normally:

# masterha_check_status --conf=/etc/masterha/app1.cnf 
app1 (pid:1873) is running(0:PING_OK), master:192.168.138.128
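
For use in scripts, the state name can be extracted from this output. This is a minimal sketch that assumes the output keeps the "(<code>:<STATE>)" form seen in the two samples in this article; mha_state is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read masterha_check_status output on stdin and print
# the state name found inside "(<code>:<STATE>)", for example PING_OK.
mha_state() {
    sed -n 's/.*([0-9][0-9]*:\([A-Z_]*\)).*/\1/p' | head -n 1
}
```

For example: masterha_check_status --conf=/etc/masterha/app1.cnf | mha_state prints PING_OK for the running manager shown above and NOT_RUNNING for the stopped one.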

X. Stop MHA Manager monitoring

# masterha_stop --conf=/etc/masterha/app1.cnf 
Stopped app1 successfully.
[1]+  Exit 1 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 2>&1

The MHA part of the configuration is now complete. Next, configure the VIP.

XI. VIP configuration

2. Manage the VIP with a script

Edit /usr/local/bin/master_ip_failover

#!/usr/bin/env perl

# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);

# VIP, interface alias key, and the commands run over SSH to move the VIP.
# ens33 is the network interface used here; change it to match your hosts.
my $vip           = '192.168.138.20';
my $key           = "2";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip/32";
my $ssh_stop_vip  = "/sbin/ifconfig ens33:$key down";
my $ssh_send_garp = "/sbin/arping -U $vip -I ens33 -c 1";

GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
      print "Disabling the VIP on old master: $orig_master_host \n";
      &stop_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {

      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      # print "Creating app user on the new master..\n";
      # FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      print "Enabling the VIP $vip on the new master: $new_master_host \n";
      &start_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

# Bring the VIP up on the new master and announce it with a gratuitous ARP.
sub start_vip() {
  `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
  `ssh $ssh_user\@$new_master_host \" $ssh_send_garp \"`;
}

# Take the VIP down on the old master; skipped when no SSH user was passed.
sub stop_vip() {
  return 0 unless ($ssh_user);
  `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

In production, managing the VIP this way is recommended, since it can effectively prevent split-brain.
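
Before relying on the script during a real failover, it helps to see exactly which commands it would send over SSH for a given VIP. The following is only a sketch that mirrors the three command strings defined at the top of the Perl script; vip_commands is a hypothetical helper and it prints the commands without executing anything.

```shell
#!/bin/sh
# Hypothetical helper: print the commands master_ip_failover would run over
# SSH for the given VIP ($1), alias key ($2) and interface ($3).
# Nothing is executed.
vip_commands() {
    vip="$1"
    key="$2"
    iface="$3"
    echo "start: /sbin/ifconfig $iface:$key $vip/32"
    echo "garp:  /sbin/arping -U $vip -I $iface -c 1"
    echo "stop:  /sbin/ifconfig $iface:$key down"
}
```

For the values in this article, call vip_commands 192.168.138.20 2 ens33. The printed start and stop commands can then be tried by hand on a test host to confirm that the interface name and VIP are right.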

The MHA high-availability environment is now basically set up.

Common MHA operations include automatic failover, manual failover, and online switchover.
