MySQL DBA
分类: Mysql/postgreSQL
2013-03-07 13:11:02
innodb insert buffer(翻译)
Despite being standard Innodb feature forever Insert Buffers remains some kind of mysterious thing for a lot of people, so let me try to explain thing a little bit.
Innodb uses insert buffer to "cheat" and not to update index leaf pages when at once but "buffer" such updates so several updates to the same page can be performed with single sweep. Insert buffer only can work for non-unique keys because until the merge is performed it is impossible to check if the value is unique.
innodb 不会立刻更新索引页面而是将这些操作缓存起来,然后将在同一页面的update操作一起进行(节省了IO操作)。在合并未提交前,无法去判断是否是唯一索引,所以insert buffer只对非唯一索引有效。
Insert buffer is allocated in the Innodb system table space. Even though it is called "buffer" similar to "doublewrite buffer" it is really the space in the tablepace. Though it can be cached in the buffer pool same as other pages. This property allows insert buffer to survive transaction commits and even MySQL restarts. Really it may take weeks before the given index page is merged, though usually it is much sooner than that.
和doublewrite一样,insert buffer 实际存放在表空间中,同时也可加载到缓存中。而合并操作可能会花很长时间来进行。(在事务提交和mysql重启后,合并操作任然会进行)。
There are two ways of insert buffer merge is happening. First is on demand merge – if accessed page contains unmerged records in insert buffer the merge is performed before page is made available. This means insert buffer can slow down read operations.
合并操作的触发有2中方式:第一种是当查询的页面包含未合并的记录时,合并操作将执行,所以,insert buffer会降低查询的效率。
The other way insert buffer is merged is by background thread. There are very little merges happening if system is loaded and merge process becomes more active if system is idle. This behavior can cause interesting results, like you had system lightly used and have very little IO activity, but when you remove the load from the system completely you see high IO load which goes for hours even after all buffer pool dirty pages are completed. This can be very surprising.
另一种方式是后台线程来执行合并。如果系统很闲,合并线程会频繁活动,反之,合并线程会在系统繁忙的时候缓慢进行。这样带来一个奇怪的现象:当你移除数据库的负载后,即使buffer pool dirty pages已经全部完成,却发现长时间的高IO负载。(这个时候其实是合并在执行)
Stats about Innodb Insert Merge Buffer are available in SHOW INNODB STATUS output:
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 7545, free list len 3790, seg size 11336,
8075308 inserts, 7540969 merged recs, 2246304 merges
The "seg size" is a full allocated size of segment in pages. So in this case it is about 180MB
The "free list" is number of pages which are free – containing no unmerged records. The "size" is size (in pages) of insert buffer which is not merged.
seg size 为分段的总大小(页),这里为180M左右,free list 为空闲页数量,包含还未合并的记录。size 为不进行合并的insert buffer大小(页)
The fact size is in pages is not really helpful because depending on the row size there can be different number of rows in the insert buffer – and it is rows we see in performance stats, for example to understand when insert buffer merge will be completed.
在insert buffer中,不同的行的大小不一致,状态值得数据是未判断合并是否结束,而buffer的总大小没有什么意义。
The "inserts" is number of inserts to insert buffer since start and number of merged records is number of records which were merged to their appropriate page locations since start. So we know in this case insert buffer has grown 534339 records since start. There is a temptation to use this number as count of unmerged rows in insert buffer but this would not be correct – insert buffer may not be empty at the start. So you can only tell insert buffer has at least this number of records. For the same reason do not get scared if you see more merged records than inserted.
inserts 为从服务启动inserts操作的记录总数,merged rec为从服务器启动合并到同一页面的记录数, merge为合并的次数。由于服务启动后insert buffer 可能不为空,则当前insert buffer值大于inserts-merged rec。
The value of 2246304 merges shows us there was about 3 records merged for each merge operation, meaning insert buffer could in theory reduce IO needed to update leaf pages 3 times.
As I mentioned Insert buffer merge can take quite a while – with 100 records merged per second we're looking at least 5343 seconds or 1.5 hours on this server… and there are insert buffers which are 10x and 100x larger than this.
merges=2246304 体现了每次合并操作合并了3条记录,理论上insert buffer 节省了3倍的io。由于合并的时间较长,如果每秒合并100条记录,哪在这个服务器上需要1.5 个小时来完成合并,如果insert buffer更大,时间将更长。
Innodb unfortunately offers no control for insert buffer while it surely would be benefiting for different workloads and hardware configuration. For example there is very good question if insert buffer really makes sense for SSD because saving random IO is not so much needed for these devices.
innodb未提供insert buffer 的控制,虽然在不同的应用场景他发挥的作用不一样。
insert buffer
手册翻译:
一般情况下,对顺序增长的唯一主键(clustered index)的插入效率很高,可以从磁盘上顺序读取数据页,避免随机访问。
二级索引通常不是唯一的,对二级索引的插入带来大量的磁盘IO。innodb处理机制为:检查二级索引所在页面是否在缓存中,如果是,innodb直接执行insert操作;如果没有在缓存中,innodb将记录很快地写入insert buffer。insert buffer 将会周期性地刷新到磁盘中,这样可以把在相同页面的insert 一并写入。insert buffer能提升15倍查询速度。
insert buffer 合并在事务提交完成、服务重启后会继续进行。
个人理解:insert buffer 加速了插入速度,同时有几个疑问:
1.buffer的merge(合并)过程效率如何,能否检查到状态?
2.在合并未完成的状态下,未刷新到磁盘的数据能查询到吗?
3.在关闭mysql后,show innodb status 中得insert buffer行,inserts值为0,但如果有未刷新的buffer将继续刷新,如何知道他是否结束?
4.在重启后会继续刷新,而这段buffer存放在缓存中,在服务器关闭过程中,这些数据写入磁盘文件?写在何处?
2011-12-31解答:
1. 服务器未重启的情况下,show innodb status 中可观察到合并的进行状态。
2. 未结束的记录存放在缓存或者表空间文件中
3. 无法控制或检测merge的执行进度,但可以确定写入缓存的记录最终会写入索引页中
4. 同doublewritebuffer一样,实际存储在表空间文件中,可加载到缓存中