First, let's look at how this part of the logic is handled.
We start with the code of flush_master_info:
int flush_master_info(Master_info* mi, bool force)
{
  DBUG_ENTER("flush_master_info");
  DBUG_ASSERT(mi != NULL && mi->rli != NULL);
  /*
    The previous implementation was not acquiring locks.
    We do the same here. However, this is quite strange.
  */
  /*
    With the appropriate recovery process, we will not need to flush
    the content of the current log.

    For now, we flush the relay log BEFORE the master.info file, because
    if we crash, we will get a duplicate event in the relay log at restart.
    If we change the order, there might be missing events.

    If we don't do this and the slave server dies when the relay log has
    some parts (its last kilobytes) in memory only, with, say, from master's
    position 100 to 150 in memory only (not on disk), and with position 150
    in master.info, there will be missing information. When the slave restarts,
    the I/O thread will fetch binlogs from 150, so in the relay log we will
    have "[0, 100] U [150, infinity[" and nobody will notice it, so the SQL
    thread will jump from 100 to 150, and replication will silently break.
  */
  mysql_mutex_t *log_lock= mi->rli->relay_log.get_log_lock();

  mysql_mutex_lock(log_lock);

  int err= (mi->rli->flush_current_log() ||
            mi->flush_info(force));

  mysql_mutex_unlock(log_lock);

  DBUG_RETURN (err);
}
The comment makes it clear that the relay log must be flushed before the master_info file is flushed, so that all relay log data still held in the CACHE has been written to the file.
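Otherwise the following can happen:
1. tx1 is written to the relay log and reaches disk.
2. tx2 and tx3 are written to the cache only and have not reached disk, while master_info already records the position after tx3.
3. mysqld crashes, and tx2 and tx3 are lost.
4. mysqld restarts, and the IO THREAD resumes from the position in master_info and receives tx4.
5. The relay log ends up containing only tx1 and tx4, so tx2 and tx3 are missing.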
Therefore, every flush of the master_info file flushes the relay log first, which guarantees that no data is lost.
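The effect of the flush order can be illustrated with a small toy model. This is only a sketch: ReplicaModel, simulate_crash and the integer event ids are hypothetical names invented for illustration and do not correspond to MySQL's real classes. It assumes the crash happens after the first of the two flushes and before the second.

```cpp
#include <cassert>
#include <vector>

// Toy model of the replica's I/O thread state (hypothetical, not MySQL code).
struct ReplicaModel {
  std::vector<int> relay_disk;   // events durably stored in the relay log
  std::vector<int> relay_cache;  // events that exist only in memory
  int received_pos = 0;          // last event id received from the master
  int saved_pos = 0;             // position persisted in master.info

  void receive(int event_id) {
    relay_cache.push_back(event_id);
    received_pos = event_id;
  }

  void flush_relay_log() {
    relay_disk.insert(relay_disk.end(), relay_cache.begin(), relay_cache.end());
    relay_cache.clear();
  }

  void flush_position() { saved_pos = received_pos; }

  // A crash discards everything that was only in memory.
  void crash_and_restart() {
    relay_cache.clear();
    received_pos = saved_pos;
  }
};

// Receives events 1..3, performs only the first of the two flushes for
// events 2 and 3, crashes, then fetches from saved_pos + 1 up to event 4.
// Returns the relay log content that ends up on disk.
std::vector<int> simulate_crash(bool relay_log_first) {
  ReplicaModel r;
  r.receive(1);
  r.flush_relay_log();
  r.flush_position();
  r.receive(2);
  r.receive(3);
  if (relay_log_first) {
    r.flush_relay_log();   // order used by flush_master_info
  } else {
    r.flush_position();    // reversed order
  }
  r.crash_and_restart();
  for (int id = r.saved_pos + 1; id <= 4; ++id) r.receive(id);
  r.flush_relay_log();
  return r.relay_disk;
}
```

In this model, simulate_crash(true) leaves duplicated events in the relay log, while simulate_crash(false) leaves a gap. Duplicates can at least be detected, whereas the gap goes unnoticed.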
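Of course, this approach is not perfect either.
If the relay log is written successfully but the flush of master_info fails, duplicate data may be written to the relay log and then executed a second time by the SQL THREAD.
Fortunately, version 5.6 provides crash-safe tables to cover this case: the positions are stored in InnoDB tables instead of files, which keeps them consistent. If the master info write fails while the relay log write succeeds,
crash recovery uses the relay_log_info table to rebuild the data of the master_info table.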
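The recovery idea can be sketched as follows. This is a simplified model under stated assumptions, not the server's implementation: CrashSafeReplica and replay_after_crash are hypothetical names, and the model assumes that the applied position is stored atomically with the applied data, and that recovery discards the unapplied relay log and restarts the I/O position from the SQL thread's position.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model; names do not correspond to MySQL internals.
struct CrashSafeReplica {
  std::vector<int> relay_log;  // events fetched by the I/O thread
  std::vector<int> applied;    // events committed by the SQL thread
  int io_pos = 0;              // position kept in the master_info repository
  int sql_pos = 0;             // position kept in the relay_log_info repository

  // The relay log write always succeeds here; the position save may fail.
  void fetch(int event_id, bool position_saved) {
    relay_log.push_back(event_id);
    if (position_saved) io_pos = event_id;
  }

  // Data and position change together, as within one transaction.
  void apply(int event_id) {
    applied.push_back(event_id);
    sql_pos = event_id;
  }

  // Recovery: drop the relay log and trust only the applied position.
  void recover() {
    relay_log.clear();
    io_pos = sql_pos;
  }
};

// Events 2 and 3 reach the relay log but their position is never saved.
// The SQL thread has applied 1 and 2 when the crash happens.
std::vector<int> replay_after_crash() {
  CrashSafeReplica r;
  r.fetch(1, true);
  r.fetch(2, false);
  r.fetch(3, false);
  r.apply(1);
  r.apply(2);
  r.recover();  // the crash happens here; io_pos is rebuilt from sql_pos
  for (int id = r.io_pos + 1; id <= 4; ++id) {
    r.fetch(id, true);
    r.apply(id);
  }
  return r.applied;
}
```

In this model every event is applied exactly once, because after the crash the only position that is trusted is the one committed together with the data.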
That said, it is best to set the following parameters:
#crash safe options
relay_log_recovery = 1
master_info_repository = TABLE
relay_log_info_repository = TABLE
For why tables are used to guarantee crash safety, see:
http://blog.booking.com/better_crash_safe_replication_for_mysql.html