How does flush_master_info in Percona 5.6 guarantee data consistency? - zxszcaijin - ChinaUnix blog

How does flush_master_info in Percona 5.6 guarantee data consistency?

Category: MySQL/PostgreSQL

2015-06-23 13:32:36

First, let's look at how this part of the logic works.
Here is the code of flush_master_info:


int flush_master_info(Master_info* mi, bool force)
{
  DBUG_ENTER("flush_master_info");
  DBUG_ASSERT(mi != NULL && mi->rli != NULL);
  /*
    The previous implementation was not acquiring locks.
    We do the same here. However, this is quite strange.
  */
  /*
    With the appropriate recovery process, we will not need to flush
    the content of the current log.

    For now, we flush the relay log BEFORE the master.info file, because
    if we crash, we will get a duplicate event in the relay log at restart.
    If we change the order, there might be missing events.

    If we don't do this and the slave server dies when the relay log has
    some parts (its last kilobytes) in memory only, with, say, from master's
    position 100 to 150 in memory only (not on disk), and with position 150
    in master.info, there will be missing information. When the slave restarts,
    the I/O thread will fetch binlogs from 150, so in the relay log we will
    have "[0, 100] U [150, infinity[" and nobody will notice it, so the SQL
    thread will jump from 100 to 150, and replication will silently break.
  */
  mysql_mutex_t *log_lock= mi->rli->relay_log.get_log_lock();

  mysql_mutex_lock(log_lock);

  int err= (mi->rli->flush_current_log() ||
            mi->flush_info(force));

  mysql_mutex_unlock(log_lock);

  DBUG_RETURN (err);
}

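The comment makes the rule clear: before the master_info data is flushed, the relay log must be flushed first, so that all relay log data still sitting in the cache has been written to the file.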
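The shape of that function (take the log lock, flush the relay log, then persist the master position) can be sketched outside the server. This is only an illustration: `FakeReplica`, `flush_in_order`, and the counters are hypothetical stand-ins, not MySQL or Percona types, and `std::mutex` replaces `mysql_mutex_t`.

```cpp
#include <mutex>

// Hypothetical stand-in for the relay log and master.info writers.
// Like the originals, each flush returns 0 on success, non-zero on failure.
struct FakeReplica {
  std::mutex log_lock;
  int relay_flushes = 0;
  int info_flushes = 0;
  bool relay_flush_fails = false;

  int flush_current_log() {
    if (relay_flush_fails) return 1;
    ++relay_flushes;
    return 0;
  }

  int flush_info() {
    ++info_flushes;
    return 0;
  }
};

// Same ordering as flush_master_info: relay log first, master position second.
int flush_in_order(FakeReplica& r) {
  std::lock_guard<std::mutex> guard(r.log_lock);
  // || short-circuits, so flush_info() is skipped when the relay flush fails.
  int err = (r.flush_current_log() || r.flush_info());
  return err;
}
```

Because of the short-circuit, the master position is never persisted ahead of relay log data that failed to reach the file.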

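Otherwise the following can happen:

1. tx1 is written to the relay log file.
2. tx2 and tx3 are written to the cache only, not yet to disk, while master_info already records the position after tx3.
3. mysqld crashes, and tx2 and tx3 are lost along with the cache.
4. mysqld restarts, and the IO thread resumes from the position in master_info and receives tx4.
5. The relay log ends up containing only tx1 and tx4, so tx2 and tx3 are lost.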

So every flush of the master_info file is preceded by a flush of the relay log, which guarantees that no data is lost.
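A toy simulation makes the difference between the two orderings visible. It is a sketch under simplifying assumptions: events are plain integers, a "position" is the number of the next event to fetch, and `Replica` and `run_scenario` are hypothetical names rather than anything from the server source.

```cpp
#include <vector>

// Toy model of the I/O thread's state; not a MySQL type.
struct Replica {
  std::vector<int> relay_on_disk;  // events durably in the relay log file
  std::vector<int> relay_cache;    // events received but not yet flushed
  int next_in_memory = 1;          // next master event to fetch
  int master_info_on_disk = 1;     // position persisted in master_info

  void receive() { relay_cache.push_back(next_in_memory++); }

  void flush_relay_log() {
    relay_on_disk.insert(relay_on_disk.end(),
                         relay_cache.begin(), relay_cache.end());
    relay_cache.clear();
  }

  void flush_master_info() { master_info_on_disk = next_in_memory; }

  // A crash loses everything that lived only in memory.
  void crash_and_restart() {
    relay_cache.clear();
    next_in_memory = master_info_on_disk;
  }
};

// Returns the relay log on disk after tx1 is durable, tx2 and tx3 arrive,
// the position is persisted, the server crashes, and one more event arrives.
std::vector<int> run_scenario(bool flush_relay_first) {
  Replica r;
  r.receive();              // tx1
  r.flush_relay_log();
  r.flush_master_info();
  r.receive();              // tx2
  r.receive();              // tx3
  if (flush_relay_first) r.flush_relay_log();
  r.flush_master_info();    // persisted position now points past tx3
  r.crash_and_restart();
  r.receive();              // fetches whatever the persisted position says
  r.flush_relay_log();
  return r.relay_on_disk;
}
```

With `flush_relay_first` set to false the relay log jumps from tx1 straight to tx4; with it set to true the log holds tx1 through tx4 with no gap.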

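Of course, this approach is not perfect either.
If the relay log write succeeds but the flush of master_info fails, duplicate events may be written to the relay log and then executed a second time by the SQL thread.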
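The duplicate case can be traced the same way. This is a minimal sketch with made-up event numbers; `relay_after_crash_between_flushes` is a hypothetical helper, and a position again means the number of the next event to fetch.

```cpp
#include <vector>

// Toy trace of a crash that lands between the two flushes.
std::vector<int> relay_after_crash_between_flushes() {
  std::vector<int> relay_on_disk;
  relay_on_disk.push_back(1);   // tx1 is durable
  int master_info_on_disk = 2;  // persisted position: fetch tx2 next

  // tx2 and tx3 arrive and the relay log flush succeeds...
  relay_on_disk.push_back(2);
  relay_on_disk.push_back(3);
  // ...but the server dies before master_info is advanced to 4.

  // After restart the I/O thread resumes from the stale persisted position
  // and fetches tx2, tx3, and then tx4.
  for (int tx = master_info_on_disk; tx <= 4; ++tx) {
    relay_on_disk.push_back(tx);
  }
  return relay_on_disk;  // tx2 and tx3 now appear twice
}
```

Nothing is missing from the result, but tx2 and tx3 are present twice, which is the trade-off the source comment accepts when it chooses this flush order.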

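Fortunately, 5.6 provides crash-safe tables to cover this case. Instead of writing to files, the positions are written to InnoDB tables, which keeps the data consistent. If the master info write fails while the relay log write succeeded, crash recovery uses the relay_log_info table to rebuild the master_info table data.
It is best, though, to set the following parameters: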


# crash-safe options
relay_log_recovery = 1
master_info_repository = TABLE
relay_log_info_repository = TABLE
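As a rough mental model of what these settings buy: on startup, relay log recovery stops trusting the old relay log and restarts the I/O thread from the SQL thread's last applied position. The sketch below assumes that behavior and uses hypothetical names (`RecoveredState`, `recover`, integer event numbers); it is not the server's recovery code.

```cpp
#include <vector>

// Hypothetical result of recovery in the toy model.
struct RecoveredState {
  std::vector<int> relay_log;  // relay log contents after recovery
  int io_fetch_from;           // next master event the I/O thread requests
};

// applied_up_to: last master event the SQL thread committed, which the
// relay_log_info table records in the same transaction as the data.
RecoveredState recover(const std::vector<int>& old_relay_log,
                       int applied_up_to) {
  (void)old_relay_log;  // deliberately ignored: it may hold gaps or duplicates
  RecoveredState s;
  s.io_fetch_from = applied_up_to + 1;  // re-fetch everything not yet applied
  return s;                             // relay_log starts out empty
}
```

In this model a relay log that contains duplicates or a gap no longer matters, because nothing in it is replayed; only the applied position survives the crash.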

For why tables are used to make replication crash safe, see:
http://blog.booking.com/better_crash_safe_replication_for_mysql.html
