Chinaunix首页 | 论坛 | 认证专区 | 博客 登录 | 注册
  • 博客访问: 102283
  • 博文数量: 6
  • 博客积分: 172
  • 博客等级: 入伍新兵
  • 技术积分: 77
  • 用 户 组: 普通用户
  • 注册时间: 2011-12-25 17:24
文章分类

全部博文(6)

文章存档

2012年(5)

2011年(1)

我的朋友
微信关注

IT168企业级官微



微信号:IT168qiye



系统架构师大会



微信号:SACC2013

订阅
热词专题

分类: Mysql/postgreSQL

1.背景

         mysql的replication运行过程中有时会因为网络故障等一些原因,主从停止同步。但此时,show slave status显示IO/SQL线程都是正常的。常规的check_mysql监控方式根本发现不了问题。

2. 解决方案

       利用replication中主从的日志位置差判断是否同步。

       Master端: show master status中的position  显然,这是master端写binlog的位置

       Slave端: show slave stuatus中的exec_master_log_pos  是slave端sql线程读取、执行的位置
       这样如果IO/SQL线程不正常,或者同步有异常都能检测出来。

3. 实施

        自己写了个shell脚本对比两个值监控nagios同步情况。后来,偶然在网上发现一个perl脚本,写的很不错,和大家共享。

#!/usr/bin/perl -w
use strict;
use Getopt::Long;
use DBI;

my $options = { 'master' => 'server1', 'slave' => 'server2', 'crit' => 1000000000,
                'warn' => 10000000 , 'dbuser' => 'repl_test', 'dbpass' => 'secret',
                'mport' => '3306', 'sport' => '3306'};
GetOptions($options, "master=s", "slave=s", "dbuser=s", "dbpass=s", "crit=i", "warn=i", "mport=i", "sport=i", "help");
my $max_binlog;

if (defined $options->{'help'}) {
    print <$0: check replication between mysql databases

check_replication.pl [ --master ] [ --slave ]
[ --crit ] [ --warn ] [ --dbuser ]
[ --dbpass ]

  --master     - MySQL instance running as a master server
  --mport      - Port on the master server to connect to
  --slave      - MySQL instance running as a slave server
  --sport      - Port on the slave server to connect to
  --crit - Number of binlog positions for critical state
  --warn - Number of binlog positions for warning state
  --dbuser     - Username with File and Process privs to check status
  --dbpass     - Password for above user
  --help             - This help page

The user that is testing must be the same on all instances, eg:
  GRANT File, Process on *.* TO repl_test\@192.168.0.% IDENTIFIED BY

Note: Any mysqldump tables (for backups) may lock large tables for a long 
time. If you dump from your slave for this, then your master will gallop
away from your slave, and the difference will become large. The trick is to
set crit above this differnce and warn below.

(c) 2004 Fotango. James Bromberger .
FOO
exit;
}

sub get_status {
    my $host = shift;
    my $role = shift;
    my $port = "";
    if (lc ($role) eq 'master') {
        $port = $options->{'mport'};
    }
    if (lc ($role) eq 'slave') {
        $port = $options->{'sport'};
    }
    my $dbh = DBI->connect("DBI:mysql:host=$host;port=$port", $options->{'dbuser'},
    $options->{'dbpass'});
    if (not $dbh) {
        print "UNKNOWN: cannot connect to $host";
        exit 3;
    }

    if (lc ($role) eq 'master') {
        my $sql1 = "show variables like 'max_binlog_size'";
        my $sth1 = $dbh->prepare($sql1);
        my $res1 = $sth1->execute;
        my $ref1 = $sth1->fetchrow_hashref;
        $max_binlog = $ref1->{'Value'};
    }
    my $sql = sprintf "SHOW %s STATUS", $role;
    my $sth = $dbh->prepare($sql);
    my $res = $sth->execute;
    if (not $res) {
        die "No results";
    }
    my $ref = $sth->fetchrow_hashref;
    $sth->finish;
    #print "$host:\n";
    #print join (', ', map { sprintf " %s: %s", $_, $ref->{$_} } keys %{$ref}) . "\n";
    $dbh->disconnect;
    return $ref;
}

sub compare_status {
    my ($a, $b) = @_;
    my ($master, $slave);
    if (defined($a->{'File'})) {
        $master = $a;
        $slave = $b;
    } elsif (defined($b->{'File'})) {
        $master = $b;
        $slave = $a;
    }
    $master->{'File_No'} = $1 if ($master->{'File'} =~ /(\d+)$/);
    $slave->{'File_No'} = $1 if ($slave->{'Master_Log_File'} =~ /(\d+)$/);


    my $diff = ($master->{'File_No'} - $slave->{'File_No'}) * $max_binlog;
    #printf "Master: %d Slave: %d\n", $master->{'Position'}, $slave->{'Pos'};
    $diff+= $master->{'Position'} - $slave->{'Exec_Master_Log_Pos'};
    my $state = sprintf "Master: %d/%d  Slave: %d/%d  Diff: %d/%d\n",
        $master->{'File_No'}, $master->{'Position'}, $slave->{'File_No'},
        $slave->{'Exec_Master_Log_Pos'}, ($diff/$max_binlog), ($diff % $max_binlog);
    if ($diff >= $options->{'crit'}) {
        print "CRITICAL: $state";
        exit 2;
    } elsif ($diff >= $options->{'warn'}) {
        print "WARN: $state";
        exit 1;
    }
    print "OK: $state";
    exit 0;
}

compare_status(get_status($options->{'slave'}, 'slave'), get_status($options->{'master'}, 'master'));

Have fun!
 
Refs:
阅读(681) | 评论(1) | 转发(0) |
0

上一篇:用jstack诊断java应用故障

下一篇:没有了

给主人留下些什么吧!~~

坏坏小丸子2012-03-23 06:41:36

网络故障会导致主从停止同步?

评论热议
请登录后评论。

登录 注册