Category: MySQL/PostgreSQL
2013-08-30 07:36:05
1. Hurricane Sandy caused a power outage at the NYC data center. After the restart, the slave's error log showed the following:
121101 16:35:25 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position', Error_code: 1236
121101 16:35:25 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.014497', position 38542146
121101 16:41:36 [Note] Error reading relay log event: slave SQL thread was killed
The end of the binlog on the master then looked like this:
# at 38539267
#121101 13:11:04 server id 1 end_log_pos 38539294 Xid = 934362432
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
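Because the master went down abruptly, the last position in the master's binlog is smaller than the position recorded in the slave's error log (38542146). The last position in mysql-bin.014497 is end_log_pos 38539294, but that event was not committed, so the previous position, 38539267, is the one to use. Simply set the slave's position to the master's last valid position: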
change master to master_log_file='mysql-bin.014497',master_log_pos=38539267;
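The opposite situation is also possible: the slave may have lost some data or be lagging behind. In that case, move the position backwards and retry until replication resumes.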
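The repositioning described in item 1 could be carried out roughly as follows. This is a minimal sketch that reuses the file name and offset from the logs above; the SHOW BINLOG EVENTS check is an added precaution rather than part of the original procedure, and LIMIT 5 is an arbitrary choice.

```sql
-- On the master: confirm that the chosen offset is the start of a real event
-- in the binlog file the slave was reading.
SHOW BINLOG EVENTS IN 'mysql-bin.014497' FROM 38539267 LIMIT 5;

-- On the slave: stop replication, point it at the chosen position, restart.
STOP SLAVE;
CHANGE MASTER TO
  MASTER_LOG_FILE = 'mysql-bin.014497',
  MASTER_LOG_POS  = 38539267;
START SLAVE;

-- On the slave: check Slave_IO_Running, Slave_SQL_Running and Last_IO_Error.
SHOW SLAVE STATUS\G
```

If the I/O thread still stops with error 1236, the offset is probably not on an event boundary. Repeat the steps with another offset taken from a "# at" line in the mysqlbinlog output.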
2. Got fatal error 1236: 'Could not find first log file name in binary log index file' from master when reading data from binary log
The log file name contains a space, or the corresponding log has been deleted on the master.
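One way to tell the two causes in item 2 apart is to compare the file the slave is asking for with the files the master still has. The statements below are a sketch of that check, not a complete recovery procedure.

```sql
-- On the slave: note Master_Log_File, the binlog file the I/O thread requests.
-- Look for stray whitespace in the name.
SHOW SLAVE STATUS\G

-- On the master: list the binary logs that still exist.
SHOW BINARY LOGS;

-- On the master: check whether old binlogs are purged automatically
-- (0 means no automatic expiry).
SHOW VARIABLES LIKE 'expire_logs_days';
```

If the requested file appears in the master's list, a malformed file name in the slave's configuration is the more likely cause. If it does not appear, the log has been purged or removed on the master.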
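3. If SHOW PROCESSLIST shows many connections in the Sleep state, the application code is probably not calling close() to release the connection after it finishes its queries. Such connections stay open until the server-side timeout drops them. If that timeout is set too low, however, clients run into "MySQL server has gone away" errors.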
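To see where the sleeping connections in item 3 come from and which timeouts currently apply, something like the following can be run on the server. It is a sketch that assumes a MySQL version providing information_schema.PROCESSLIST; the value 600 in the last statement is purely illustrative, not a recommendation.

```sql
-- Count idle connections per user and client host.
-- The HOST column has the form 'host:port', so the port is stripped first.
SELECT user,
       SUBSTRING_INDEX(host, ':', 1) AS client_host,
       COUNT(*) AS sleeping
FROM information_schema.PROCESSLIST
WHERE command = 'Sleep'
GROUP BY user, client_host
ORDER BY sleeping DESC;

-- Current idle timeouts, in seconds.
SHOW VARIABLES LIKE 'wait_timeout';
SHOW VARIABLES LIKE 'interactive_timeout';

-- Illustrative only: change the idle timeout for connections opened from now on.
SET GLOBAL wait_timeout = 600;
```

SET GLOBAL affects only connections opened after the change; existing sessions keep the value they started with.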