Category: MySQL/PostgreSQL

2014-03-03 08:24:44

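1. Host plan
Host         Package to install
10.10.54.26  MHA mha4mysql-manager-0.53.tar.gz
10.10.54.27  MHA mha4mysql-node-0.53.tar.gz
10.10.54.25  MHA mha4mysql-node-0.53.tar.gz
10.10.54.22  MHA mha4mysql-node-0.53.tar.gz
2. Download and install the software
Required dependency
 perl-DBD-MySQL.x86_64 (installing it from the distribution's package repository is enough)
Source packages (several file names are cut off in the original listing)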
List-MoreUtils-0.12.tar.gz
Log-Dispatch-2.28.tar.gz
Module-Build-0.3601.tar.gz
Module-CoreList-3.07.tar.gz
Module-Implementation-0.07.t
Module-Install-1.06.tar.gz
Module-Metadata-1.000019.tar
Module-Runtime-0.014.tar.gz
Module-ScanDeps-1.13.tar.gz
Parallel-ForkManager-1.06.ta
Params-Validate-1.08.tar.gz
PAR-Dist-0.49.tar.gz
Parse-CPAN-Meta-1.4413.tar.g
Perl-OSType-1.007.tar.gz
Software-License-0.103009.ta
Test-Fatal-0.013.tar.gz
yaml-0.1.4-7.ram0.98.x86_64.
YAML-0.90.tar.gz
YAML-Tiny-1.61.tar.gz


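MHA software

mha4mysql-node-0.53.tar.gz

On the manager host (install the node package first, since the manager's Makefile.PL checks for MHA::NodeConst):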
[root@wang mha4mysql-node-0.53]# perl Makefile.PL 

*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.609)
- DBD::mysql ...loaded. (4.013)
*** Module::AutoInstall configuration finished.
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::node
[root@wang mha4mysql-node-0.53]# make  && make install

[root@wang mha4mysql-manager-0.53]# perl Makefile.PL 

*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.609)
- DBD::mysql ...loaded. (4.013)
- Time::HiRes ...loaded. (1.9726)
- Config::Tiny ...loaded. (2.20)
- Log::Dispatch ...loaded. (2.28)
- Parallel::ForkManager ...loaded. (1.06)
- MHA::NodeConst ...loaded. (0.53)
*** Module::AutoInstall configuration finished.
Writing Makefile for mha4mysql::manager
[root@wang mha4mysql-manager-0.53]# make && make install

On each MySQL node (node package only):
[root@wang mha4mysql-node-0.53]# perl Makefile.PL 
[root@wang mha4mysql-node-0.53]# make  && make install
3. Configure and check passwordless SSH
Create the configuration file:
[root@wang home]# vi /etc/app1.cnf 

[server default]
# mysql user and password
user=root
password=123
ssh_user=root
# working directory on the manager (created automatically if it does not exist)
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/app1.log
# working directory on MySQL servers
remote_workdir=/var/log/masterha/app1
master_binlog_dir=/var/lib/mysql/
check_repl_delay=0
[server1]
hostname=10.10.54.27
candidate_master=1
[server2]
hostname=10.10.54.22
candidate_master=1
[server3]
hostname=10.10.54.25
no_master=1
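
MHA loads this file through Config::Tiny (it appears in the Makefile.PL dependency check above), and the masterha_check_repl output later in this post shows a trailing "#" comment being taken as part of the manager_workdir value. The sketch below is one way to spot such lines before running the checks. The find_inline_comments helper and the sample file are made up for illustration; they are not part of MHA.

```shell
#!/bin/sh
# Print "line-number:line" for every key=value line that carries a trailing
# "#" comment. Heuristic only: a value that legitimately contains "#"
# (a password, for example) is flagged as well.
find_inline_comments() {
    grep -n '^[^[#;][^=]*=.*#' "$1"
}

# Demo on a throwaway copy of two settings from the file above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[server default]
# working directory on the manager
manager_workdir=/var/log/masterha/app1 # created automatically
manager_log=/var/log/masterha/app1/app1.log
EOF
find_inline_comments "$sample"
rm -f "$sample"
```

Against the real file this would be called as find_inline_comments /etc/app1.cnf; any line it reports is worth checking by hand.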

-------------------------
On the manager (10.10.54.26), set up passwordless login to every node:
[root@wang home]# ssh-keygen -t rsa
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.22
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.25
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.27


On node 10.10.54.22:
[root@wang home]# ssh-keygen -t rsa
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.25
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.27
On node 10.10.54.25:
[root@wang home]# ssh-keygen -t rsa
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.22
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.27
On node 10.10.54.27:
[root@wang home]# ssh-keygen -t rsa
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.22
[root@wang home]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.10.54.25
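
The per-node commands above follow one pattern: each node copies its public key to every other node. As a rough sketch, the snippet below only prints those commands for review and runs nothing over the network. The NODES list and the print_copy_cmds helper are illustrative names, not part of MHA.

```shell
#!/bin/sh
# Nodes from the host plan; adjust to your environment.
NODES="10.10.54.22 10.10.54.25 10.10.54.27"

# Print (do not run) the ssh-copy-id commands that host $1 needs
# in order to reach every other node without a password.
print_copy_cmds() {
    src=$1
    for dst in $NODES; do
        if [ "$dst" != "$src" ]; then
            echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$dst"
        fi
    done
}

for src in $NODES; do
    echo "# on $src:"
    print_copy_cmds "$src"
done
```

Each node still needs its own key pair (ssh-keygen -t rsa) before the printed commands are run on it.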


Check SSH:
[root@wang home]# masterha_check_ssh --conf=/etc/app1.cnf 

Fri Feb 28 10:44:31 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Feb 28 10:44:31 2014 - [info] Reading application default configurations from /etc/app1.cnf..
Fri Feb 28 10:44:31 2014 - [info] Reading server configurations from /etc/app1.cnf..
Fri Feb 28 10:44:31 2014 - [info] Starting SSH connection tests..
Fri Feb 28 10:44:31 2014 - [debug]
Fri Feb 28 10:44:31 2014 - [debug] Connecting via SSH from root@10.10.54.27(10.10.54.27:22) to root@10.10.54.22(10.10.54.22:22)..
Fri Feb 28 10:44:31 2014 - [debug] ok.
Fri Feb 28 10:44:31 2014 - [debug] Connecting via SSH from root@10.10.54.27(10.10.54.27:22) to root@10.10.54.25(10.10.54.25:22)..
Fri Feb 28 10:44:31 2014 - [debug] ok.
Fri Feb 28 10:44:32 2014 - [debug]
Fri Feb 28 10:44:31 2014 - [debug] Connecting via SSH from root@10.10.54.22(10.10.54.22:22) to root@10.10.54.27(10.10.54.27:22)..
Fri Feb 28 10:44:31 2014 - [debug] ok.
Fri Feb 28 10:44:31 2014 - [debug] Connecting via SSH from root@10.10.54.22(10.10.54.22:22) to root@10.10.54.25(10.10.54.25:22)..
Fri Feb 28 10:44:31 2014 - [debug] ok.
Fri Feb 28 10:44:32 2014 - [debug]
Fri Feb 28 10:44:32 2014 - [debug] Connecting via SSH from root@10.10.54.25(10.10.54.25:22) to root@10.10.54.27(10.10.54.27:22)..
Fri Feb 28 10:44:32 2014 - [debug] ok.
Fri Feb 28 10:44:32 2014 - [debug] Connecting via SSH from root@10.10.54.25(10.10.54.25:22) to root@10.10.54.22(10.10.54.22:22)..
Fri Feb 28 10:44:32 2014 - [debug] ok.
Fri Feb 28 10:44:32 2014 - [info] All SSH connection tests passed successfully.

4. Check replication status
[root@wang mha4mysql-manager-0.53]# masterha_check_repl --conf=/etc/app1.cnf

Fri Feb 28 10:50:40 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Feb 28 10:50:40 2014 - [info] Reading application default configurations from /etc/app1.cnf..
Fri Feb 28 10:50:40 2014 - [info] Reading server configurations from /etc/app1.cnf..
Fri Feb 28 10:50:40 2014 - [info] MHA::MasterMonitor version 0.53.
Creating directory /var/log/masterha/app1 #不存在Node会自动创建.. done.
Fri Feb 28 10:50:40 2014 - [info] Dead Servers:
Fri Feb 28 10:50:40 2014 - [info] 10.10.54.25(10.10.54.25:3306)
Fri Feb 28 10:50:40 2014 - [info] Alive Servers:
Fri Feb 28 10:50:40 2014 - [info] 10.10.54.27(10.10.54.27:3306)
Fri Feb 28 10:50:40 2014 - [info] 10.10.54.22(10.10.54.22:3306)
Fri Feb 28 10:50:40 2014 - [info] Alive Slaves:
Fri Feb 28 10:50:40 2014 - [info] 10.10.54.27(10.10.54.27:3306) Version=5.5.35-ndb-7.2.15-cluster-gpl-log (oldest major version between slaves) log-bin:enabled

Note the "Creating directory" line: the text after "#" ("created automatically if it does not exist") is a comment that was written on the same line as manager_workdir in /etc/app1.cnf, and it ended up as part of the directory name. Keep comments in the configuration file on their own lines.

Troubleshooting:
1. Fri Feb 28 11:24:33 2014 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln193] There is no alive slave. We can't do failover
Fix:
Change server-id on 10.10.54.22 to 10 (every server in a replication topology needs a unique server-id),
then set up 10.10.54.22 and 10.10.54.27 as a master-master (mutual standby) pair.


2. Fri Feb 28 11:34:39 2014 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln732] Multi-master configuration is detected, but two or more masters are either writable (read-only is not set) or dead! Check configurations for details. Master configurations are as below:
Fix:
The topology is now master-master, so only one master may be writable. Keep candidate master 10.10.54.22 writable and set the other master (10.10.54.27) and the remaining slaves to read_only.


3. Fri Feb 28 13:24:46 2014 - [info]   Connecting to root@10.10.54.27(10.10.54.27:22).. 
Can't exec "mysqlbinlog": No such file or directory at /usr/local/share/perl5/MHA/BinlogManager.pm line 99.
mysqlbinlog version not found!

Fix:
1) Create symlinks to the MySQL tools in /usr/local/bin:
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql
2) Let SSH sessions pick up a PATH that includes the MySQL bin directory. In /etc/ssh/sshd_config set:
PermitUserEnvironment yes
then put the PATH into ~/.ssh/environment:
vim ~/.ssh/environment
PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/mysql/bin
and restart the sshd service.
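
The error above came from a command MHA ran over SSH, so what matters is whether mysqlbinlog and mysql resolve in a non-interactive SSH session, not in a login shell. A small sketch for checking that follows; missing_tools is a made-up helper name, and the tool list simply mirrors the two binaries linked above.

```shell
#!/bin/sh
# Print the name of every tool from the argument list that cannot be
# found in the current PATH; print nothing when all of them resolve.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The two binaries this post links into /usr/local/bin.
missing_tools mysqlbinlog mysql
```

Saved to a file (say check_tools.sh, a hypothetical name), it can be run on a node through the same kind of session MHA uses: ssh root@10.10.54.27 sh -s < check_tools.sh. Empty output means both tools were found.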

Output once everything is configured correctly:

Fri Feb 28 13:57:14 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Feb 28 13:57:14 2014 - [info] Reading application default configurations from /etc/app1.cnf..
Fri Feb 28 13:57:14 2014 - [info] Reading server configurations from /etc/app1.cnf..
Fri Feb 28 13:57:14 2014 - [info] MHA::MasterMonitor version 0.53.
Fri Feb 28 13:57:14 2014 - [info] Multi-master configuration is detected. Current primary(writable) master is 10.10.54.22(10.10.54.22:3306)
Fri Feb 28 13:57:14 2014 - [info] Master configurations are as below:
Master 10.10.54.27(10.10.54.27:3306), replicating from 10.10.54.22(10.10.54.22:3306), read-only
Master 10.10.54.22(10.10.54.22:3306), replicating from 10.10.54.27(10.10.54.27:3306)

Fri Feb 28 13:57:14 2014 - [info] Dead Servers:
Fri Feb 28 13:57:14 2014 - [info] Alive Servers:
Fri Feb 28 13:57:14 2014 - [info] 10.10.54.27(10.10.54.27:3306)
Fri Feb 28 13:57:14 2014 - [info] 10.10.54.22(10.10.54.22:3306)
Fri Feb 28 13:57:14 2014 - [info] 10.10.54.25(10.10.54.25:3306)
Fri Feb 28 13:57:14 2014 - [info] Alive Slaves:
Fri Feb 28 13:57:14 2014 - [info] 10.10.54.27(10.10.54.27:3306) Version=5.5.35-ndb-7.2.15-cluster-gpl-log (oldest major version between slaves) log-bin:enabled
Fri Feb 28 13:57:14 2014 - [info] Replicating from 10.10.54.22(10.10.54.22:3306)
Fri Feb 28 13:57:14 2014 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Feb 28 13:57:14 2014 - [info] 10.10.54.25(10.10.54.25:3306) Version=5.5.35-ndb-7.2.15-cluster-gpl-log (oldest major version between slaves) log-bin:enabled
Fri Feb 28 13:57:14 2014 - [info] Replicating from 10.10.54.22(10.10.54.22:3306)
Fri Feb 28 13:57:14 2014 - [info] Not candidate for the new Master (no_master is set)
Fri Feb 28 13:57:14 2014 - [info] Current Alive Master: 10.10.54.22(10.10.54.22:3306)
Fri Feb 28 13:57:14 2014 - [info] Checking slave configurations..
Fri Feb 28 13:57:14 2014 - [info] Checking replication filtering settings..
Fri Feb 28 13:57:14 2014 - [info] binlog_do_db= , binlog_ignore_db=
Fri Feb 28 13:57:14 2014 - [info] Replication filtering check ok.
Fri Feb 28 13:57:14 2014 - [info] Starting SSH connection tests..
Fri Feb 28 13:57:16 2014 - [info] All SSH connection tests passed successfully.
Fri Feb 28 13:57:16 2014 - [info] Checking MHA Node version..
Fri Feb 28 13:57:16 2014 - [info] Version check ok.
Fri Feb 28 13:57:16 2014 - [info] Checking SSH publickey authentication settings on the current master..
Fri Feb 28 13:57:16 2014 - [info] HealthCheck: SSH to 10.10.54.22 is reachable.
Fri Feb 28 13:57:16 2014 - [info] Master MHA Node version is 0.53.
Fri Feb 28 13:57:16 2014 - [info] Checking recovery script configurations on the current master..
Fri Feb 28 13:57:16 2014 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/ndb --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.53 --start_file=slave22-bin.000007
Fri Feb 28 13:57:16 2014 - [info] Connecting to root@10.10.54.22(10.10.54.22)..
  Creating /var/log/masterha/app1 if not exists.. ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /data/ndb, up to slave22-bin.000007
Fri Feb 28 13:57:17 2014 - [info] Master setting check done.
Fri Feb 28 13:57:17 2014 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Fri Feb 28 13:57:17 2014 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=10.10.54.27 --slave_ip=10.10.54.27 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.5.35-ndb-7.2.15-cluster-gpl-log --manager_version=0.53 --relay_log_info=/data/ndb/relay-log.info --relay_dir=/data/ndb/ --slave_pass=xxx
Fri Feb 28 13:57:17 2014 - [info] Connecting to root@10.10.54.27(10.10.54.27:22)..
  Checking slave recovery environment settings..
    Opening /data/ndb/relay-log.info ... ok.
    Relay log found at /data/ndb, up to relay-log.000002
    Temporary relay log file is /data/ndb/relay-log.000002
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Feb 28 13:57:17 2014 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=10.10.54.25 --slave_ip=10.10.54.25 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.5.35-ndb-7.2.15-cluster-gpl-log --manager_version=0.53 --relay_log_info=/data/ndb/relay-log.info --relay_dir=/data/ndb/ --slave_pass=xxx
Fri Feb 28 13:57:17 2014 - [info] Connecting to root@10.10.54.25(10.10.54.25:22)..
  Checking slave recovery environment settings..
    Opening /data/ndb/relay-log.info ... ok.
    Relay log found at /data/ndb, up to wang-relay-bin.000012
    Temporary relay log file is /data/ndb/wang-relay-bin.000012
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Feb 28 13:57:17 2014 - [info] Slaves settings check done.
Fri Feb 28 13:57:17 2014 - [info]
10.10.54.22 (current master)
 +--10.10.54.27
 +--10.10.54.25

Fri Feb 28 13:57:17 2014 - [info] Checking replication health on 10.10.54.27..
Fri Feb 28 13:57:17 2014 - [info] ok.
Fri Feb 28 13:57:17 2014 - [info] Checking replication health on 10.10.54.25..
Fri Feb 28 13:57:17 2014 - [info] ok.
Fri Feb 28 13:57:17 2014 - [warning] master_ip_failover_script is not defined.
Fri Feb 28 13:57:17 2014 - [warning] shutdown_script is not defined.
Fri Feb 28 13:57:17 2014 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

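The topology is master-master, so keep one candidate master (10.10.54.22) writable and set the other master and the remaining slaves to read_only: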

mysql> set global read_only=1;
Query OK, 0 rows affected (0.00 sec)


mysql> show variables like 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | ON    |
+---------------+-------+
1 row in set (0.00 sec)


Set relay_log_purge to 0 on the master and on the slaves:


mysql>  set global relay_log_purge=0;
Query OK, 0 rows affected (0.00 sec)


mysql>  show variables like 'relay_log_purge';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| relay_log_purge | OFF   |
+-----------------+-------+
1 row in set (0.00 sec)
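
The two SET GLOBAL statements above have to be repeated on several hosts. As a sketch, the snippet below prints one mysql invocation per host for review instead of running anything. It assumes 10.10.54.22 stays the writable master, as in the check output above, and print_set_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Print (do not run) a mysql client command that applies SQL $2 on host $1.
# -p makes the client prompt for the password instead of exposing it
# on the command line.
print_set_cmd() {
    echo "mysql -h $1 -u root -p -e \"$2\""
}

# Read-only slaves: both settings.
for host in 10.10.54.27 10.10.54.25; do
    print_set_cmd "$host" "SET GLOBAL read_only=1; SET GLOBAL relay_log_purge=0;"
done

# Writable master: relay log setting only.
print_set_cmd 10.10.54.22 "SET GLOBAL relay_log_purge=0;"
```

SET GLOBAL does not survive a mysqld restart, so the same variables (read_only, relay_log_purge) would also need to go into each server's my.cnf to make the change permanent.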


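With relay_log_purge=0 the relay logs are no longer deleted automatically, so they have to be purged on a schedule:
vi /etc/cron.d/purge_relay_logs 
# purge relay logs at 5am  
0 5 * * * app /usr/bin/purge_relay_logs --user=root --password=123 --disable_relay_log_purge >> /var/log/masterha/purge_relay_logs.log 2>&1
/etc/init.d/crond restart
In this entry the sixth field (app) is the account the job runs as, so that user must exist. Adjust /usr/bin/purge_relay_logs if the script was installed elsewhere; the error traces in this post show MHA's scripts under /usr/local/bin.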


5. Test run
1) Start the MHA manager:
[root@wang ~]#  masterha_manager --conf=/etc/app1.cnf --remove_dead_master_conf
Fri Feb 28 14:12:32 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Feb 28 14:12:32 2014 - [info] Reading application default configurations from /etc/app1.cnf..
Fri Feb 28 14:12:32 2014 - [info] Reading server configurations from /etc/app1.cnf..

It can also be run in the background:
[root@wang ~]# nohup masterha_manager --conf=/etc/app1.cnf --remove_dead_master_conf < /dev/null > /var/log/masterha/app1/app1.log 2>&1 &

2) Check the manager status:
[root@wang ~]# masterha_check_status --conf=/etc/app1.cnf
app1 (pid:1169) is running(0:PING_OK), master:10.10.54.22
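
For scripted monitoring it can be handy to pull the status code and name out of that line. The sketch below parses only the text format seen in this post and makes no claim about the command's exit code. parse_status is a made-up helper name.

```shell
#!/bin/sh
# Read masterha_check_status output on stdin and print "<code> <state>",
# i.e. the two parts inside the parentheses of "running(0:PING_OK)".
# Lines without such a "(code:STATE)" group produce no output.
parse_status() {
    sed -n 's/.*(\([0-9][0-9]*\):\([A-Z_][A-Z_]*\)).*/\1 \2/p'
}

# The two status lines that appear in this post, used as sample input.
echo 'app1 (pid:1169) is running(0:PING_OK), master:10.10.54.22' | parse_status
echo 'MHA Manager is not running on app1(2:NOT_RUNNING).' | parse_status
```

On the manager host it would be used as masterha_check_status --conf=/etc/app1.cnf | parse_status.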

3) Manual switch test
Method 1: manual failover from a dead master
If the MHA manager finds no dead server, it reports an error and aborts the failover:
[root@wang ~]# masterha_master_switch --master_state=dead --conf=/etc/app1.cnf --dead_master_host=10.10.54.28 --new_master_host=10.10.54.27
Fri Feb 28 14:16:29 2014 - [info] Dead Servers:
Fri Feb 28 14:16:29 2014 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln181] None of server is dead. Stop failover.
Fri Feb 28 14:16:29 2014 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln178] Got ERROR:  at /usr/local/bin/masterha_master_switch 

Method 2: switchover while the current master is running normally:
[root@wang ~]# masterha_master_switch --conf=/etc/app1.cnf --master_state=alive --new_master_host=10.10.54.27


Error:
Fri Feb 28 14:18:41 2014 - [error][/usr/local/share/perl5/MHA/MasterRotate.pm, ln142] Getting advisory lock failed on the current master. MHA Monitor runs on the current master. Stop MHA Manager/Monitor and try again.
Fix (stop the MHA manager first, then retry the switch):
[root@wang ~]# vi /etc/app1.cnf 
[root@wang ~]# masterha_stop --conf=/etc/app1.cnf
Stopped app1 successfully.
[1]+  Exit 1                  nohup masterha_manager --conf=/etc/app1.cnf --remove_dead_master_conf < /dev/null > /var/log/masterha/app1/app1.log 2>&1
[root@wang ~]# masterha_master_switch --conf=/etc/app1.cnf --master_state=alive --new_master_host=10.10.54.27

Partial output:

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 10.10.54.22(10.10.54.22:3306)? (YES/no): yes
Fri Feb 28 14:24:46 2014 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Fri Feb 28 14:24:46 2014 - [info] ok.
Fri Feb 28 14:24:46 2014 - [info] Checking MHA is not monitoring or doing failover..
Fri Feb 28 14:24:46 2014 - [info] Checking replication health on 10.10.54.27..
Fri Feb 28 14:24:46 2014 - [info] ok.
Fri Feb 28 14:24:46 2014 - [info] Checking replication health on 10.10.54.25..
Fri Feb 28 14:24:46 2014 - [info] ok.
Fri Feb 28 14:24:46 2014 - [info] 10.10.54.27 can be new master.
Fri Feb 28 14:24:46 2014 - [info]
From:
10.10.54.22 (current master)
 +--10.10.54.27
 +--10.10.54.25
To:
10.10.54.27 (new master)
 +--10.10.54.25
Fri Feb 28 14:24:56 2014 - [info] 10.10.54.27: Resetting slave info succeeded.
Fri Feb 28 14:24:56 2014 - [info] Switching master to 10.10.54.27(10.10.54.27:3306) completed successfully.

4) Stop the manager without stopping the MySQL servers
[root@wang ~]# masterha_stop --conf=/etc/app1.cnf
MHA Manager is not running on app1(2:NOT_RUNNING).



