mysql主从失败的解决参考-ubuntuer-ChinaUnix博客

人生如逆旅，我亦是行人！江湖人称wsjjeremy.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

ubuntuer

博客访问： 4900113
博文数量： 930
博客积分： 12070
博客等级：上将
技术积分： 11448
用户组：普通用户
注册时间： 2008-08-15 16:57

文章分类

全部博文（930）

html5（0）
python（1）
google_gnu fans（8）
高品位（2）
perl（4）
mobile_dev（2）
openssl（1）
libcurl（2）
windows内核安全（5）
自己的C_LIB（5）
高性能MySQL学习（94）
多线程（4）
ldd学习笔记（3）
netfilter（3）
笔试题（5）
师徒之言传身教（1）
转载（15）
work（146）
introduction to （9）
debug（3）

intern（3）
mobile ip（0）
毕业设计（2）
linux防火墙（10）
c++（16）
database（13）
CentOS（11）
data structure（5）
kernel（50）
DIY（4）
酷软（19）
iptables（9）
linux c（105）

string（19）
APUE学习笔记（7）
facetea（13）
shell（68）
tcp_ip（23）
apache（3）
linux（258）

正则表达式（5）
未分配的博文（1）

文章存档

2011年（60）

2010年（220）

2009年（371）

2008年（279）

我的朋友

相关博文

mysql主从失败的解决参考

分类： Mysql/postgreSQL

2011-01-06 22:41:12

星期一上班，就听到开发说一台mysql数据库down掉（此台数据库只做备份用）。连上系统，用ps -ef |grep mysql查看下进程，果真以个进程都没有。那就重启mysql服务，启动不起来。就查看mysql的错误日志。
InnoDB: Doing recovery: scanned up to log sequence number 0 48155
InnoDB: Last MySQL binlog file position 0 79, file name /var/log/mysql/updatelog.000006
100621 5:36:33 InnoDB: Flushing modified pages from the buffer pool...
100621 5:36:33 InnoDB: Started; log sequence number 0 48155
100621 5:36:34 [ERROR] /mysql/libexec/mysqld: Error writing file '/data/db2.ihome.com.pid' (Errcode: 28)
100621 5:36:34 [ERROR] Can't start server: can't create PID file: No space left on device

Number of processes running now: 0
100621 05:36:34 mysqld restarted
/mysql/libexec/mysqld: Error writing file './db2-slow.log' (Errcode: 28)
100621 5:36:34 [ERROR] Could not use db2-slow.log for logging (error 28). Turning logging off for the whole duration of the MySQL server process. To turn it on again: fix the cause, shutdown the MySQL server and restart it.
100621 5:36:34 InnoDB: Database was not shut down normally!

      哦原来磁盘空间满了，使用df -h 查看下，存放data的目录已经使用100%.进入data目录查看。原来是一个备份数据库文件在里面占用的了大量的磁盘空间。查看备份脚本原来是备份脚本里面的路径写错了。修改路径，把备份文件移动到备份目录。启动mysql服务正常启动。

    使用命令登陆上mysql，发现主从已经不同步了。查看mysql的错误日志得到
100621 9:32:21 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000023' at position 211845, relay log './db2-relay-bin.000013' position: 217010
100621 9:32:21 [ERROR] Error in Log_event::read_log_event(): 'read error', data_len: 112, event_type: 2
100621 9:32:21 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
100621 9:32:21 [ERROR] Slave: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 0
100621 9:32:21 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000023' position 211845

首先想到尝试着手动来解决主从同步。连接上数据库

[root@db2 data]# mysql -u root -p
Enter password:      输入密码
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 55 to server version: 4.1.12-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>stop slave ; #停掉slave 进程，手动解决mysql主从时，需要先结束掉mysql slave。
mysql> change master to
mysql->MASTER_LOG_POS=211845
mysql ->MASTER_LOG_FILE = 'mysql-bin.000023';
mysql>start slave;
mysql>show slave status\G;
查看同步依然是不成功。继续查看mysql 错误日志
100621 9:41:35 [Note] Slave SQL thread initialized, starting replication in log 'binglog mysql-bin.000023' at position 211845, relay log './db2-relay-bin.000001' position: 4
100621 9:41:35 [Note] Slave I/O thread: connected to master , replication started in log 'binglog mysql-bin.000023' at position 211845
100621 9:41:35 [ERROR] Error reading packet from server: Could not find first log file name in binary log index file (server_errno=1236)
100621 9:41:35 [ERROR] Got fatal error 1236: 'Could not find first log file name in binary log index file' from master when reading data from binary log
100621 9:41:35 [ERROR] Slave I/O thread exiting, read up to log 'binglog mysql-bin.000023', position 211845

删除掉data目录下文件
db2-relay-bin.000001
db2-relay-bin.index
master.info
relay-log.info # 这些文件里面存放着同步时的信息。
     然后重启下mysql 服务，查看同步信息。问题依旧。然后查看上次错误日志与此次的错误日志。找不到 log文件，连接上主的mysql数据库。查看mysql信息一切正常。
#mysqlbinlog --start-position=211845 mysql-bin.000023 > a.sql
把从 211845到现在不同步的二进制文件导入到一个文本文件里面。然后使用scp 传到从数据库上
scp a.sql     输入密码自动copy到mysql 从服务器上的根目录下。
打开a.sql 查看最后一行的
然后到从数据库上停止mysql slave 服务。stop slave
使用#mysql -uuser -p passwd database < /a.sql #手动导入没有同步的数据
导入完成之后删除掉 db2-relay-bin.000001 db2-relay-bin.index master.info relay-log.info 这些文件。使用
mysql>stop slave ;
mysql>change master to MASTER_LOG_POS = 155383248 MASTER_LOG_FILE = 'mysql-bin.000023';
Query OK, 0 rows affected (0.01 sec)
然后重启mysql服务，service mysqld restart
启动slave
mysql>start slave;
查看mysql主从状态mysql>show slave status\G;
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
看到这两行状态为Yes我心甚慰。看到这两行为Yes表示已经开始同步数据了。
   下面是CHANGE MASTER TO语法
CHANGE MASTER TO master_def[,master_def] ...

master_def:
      MASTER_HOST = 'host_name'
    | MASTER_USER = 'user_name'
    | MASTER_PASSWORD = 'password'
    | MASTER_PORT =port_num
    | MASTER_CONNECT_RETRY =count
    | MASTER_LOG_FILE = 'master_log_name'
    | MASTER_LOG_POS =master_log_pos
    | RELAY_LOG_FILE = 'relay_log_name'
    | RELAY_LOG_POS =relay_log_pos
    | MASTER_SSL = {0|1}
    | MASTER_SSL_CA = 'ca_file_name'
    | MASTER_SSL_CAPATH = 'ca_directory_name'
    | MASTER_SSL_CERT = 'cert_file_name'
    | MASTER_SSL_KEY = 'key_file_name'
    | MASTER_SSL_CIPHER = 'cipher_list'
可以更改从属服务器用于与主服务器进行连接和通讯的参数。
MASTER_USER,MASTER_PASSWORD,MASTER_SSL,MASTER_SSL_CA,MASTER_SSL_CAPATH,MASTER_SSL_CERT,MASTER_SSL_KEY和MASTER_SSL_CIPHER用于向从属服务器提供有关如何与主服务器连接的信息

检查从服务器一般使用show slave status命令来检查

mysql > SHOW SLAVE STATUS\G
*************************** 1 . row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168 . 0.100
Master_User: root
Master_Port: 3306
Connect_Retry: 3
Master_Log_File: mysql - bin. 003
Read_Master_Log_Pos: 79
Relay_Log_File: mysql - relay - bin. 003
Relay_Log_Pos: 548
Relay_Master_Log_File: mysql - bin. 003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Last_Errno: 0
…..

在上面这些信息中我们主要关注的是Slave_IO_Running和Slave_SQL_Running
Slave_IO_Running:从服务器正从主服务器上读取BINLOG日志,并写入从服务器的中继日志
Slave_SQL_Running:进程正在读取从服务器的BINLOG中继日志，并转化为SQL执行
以前有一个进程是no状态，表示复制的进程停止，在Last_Errno会看到是什么情况

有时候因为主服务器的更新过于频繁,造成了从服务器更新速度较慢,当然问题是多种多样,有可能是网络搭建的结构不好或者硬件的性能较差,从而使得主从服务器之间的差距越来越大,最终对某些应用产生了影响,在这种情况下,我们需要定期进行主从服务器的数据同步,具体步骤如下
在主服务器上

mysql > FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected ( 0.03 sec)
mysql > show master status\G;
*************************** 1 . row ***************************
File : mysql - bin. 000004
Position: 102
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set ( 0.00 sec)

记录出日志的名字和偏移量,这些是从服务器复制的目的目标

在从服务器上,使用MASTER_POS_WAIT()函数得到复制坐标值

mysql > select master_pos_wait( ' mysql-bin.000004 ' , ' 102 ' );
+ -- -----------------------------------------+
| master_pos_wait( ' mysql-bin.000004 ' , ' 102 ' ) |
+ -- -----------------------------------------+
| 0 |
+ -- -----------------------------------------+
1 row in set ( 0.00 sec)

这个select 语句会阻塞直到从服务器达到指定日志文件和偏移量后,返回0,如果是-1,则表示超时推出,查询是0时,表示从服务器与主服务器已经同步

在某些情况下,会出现从服务器更新失败,首先需要确定是否从服务器的表与主服务器的不同造成的,如果是表结构造成的,则需要修改从服务器的表和主服务器一致,然后重新运行start slave
如果不是表结构不同造成的更新失败，则需要确认手动更新是否安全，然后忽视来自主服务器的更新失败语句，跳过来来自主服务器的语句，命令为SET GLOBAL SQL_SLAVE_SKIP_COUNTER=n,其中,n=1表示来自主服务器的更新语句不使用AUTO_INCREMENT或LAST_INSERT_ID(),n=2时则反之,原因是使用AUTO_INCREMENT或LAST_INSERT_ID的语句需要从二进制日志中取得两个事件.

阅读(2202) | 评论(0) | 转发(0) |

上一篇：给source insight添加.cc的C++文件后缀识别

下一篇：mysql sql_cache缓存使用

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6