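In MySQL, table design is a critical step: mistakes made early are costly to fix later. Counter tables come up often, and in a concurrent environment a poorly designed counter table can cause heavy lock contention and, with it, performance problems. Web applications use them for things like counting a user's friends or the number of times a file has been downloaded. High Performance MySQL, 3rd Edition has a classic treatment of counter table design.
For example, suppose we design a counter table like this: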
mysql> CREATE TABLE hit_counter (
-> cnt int unsigned not null
-> ) ENGINE=InnoDB;
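The table holds a single row, so updates to it are necessarily serialized: every transaction that increments the counter has to wait for the lock on that one row. In a high-concurrency environment, each hit runs: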
mysql> UPDATE hit_counter SET cnt = cnt + 1;
This frequently leads to heavy lock contention. Now let's look at a better design:
mysql> CREATE TABLE hit_counter (
-> slot tinyint unsigned not null primary key,
-> cnt int unsigned not null
-> ) ENGINE=InnoDB;
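Suppose we pre-create 100 rows, with slot values 0 through 99. Each update then picks one of them at random. The slot is computed once into a user variable, because RAND() in a WHERE clause is re-evaluated for every row and returns a non-integer value, so comparing it directly with slot would not reliably hit exactly one row:
mysql> SET @slot = FLOOR(RAND() * 100);
mysql> UPDATE hit_counter SET cnt = cnt + 1 WHERE slot = @slot;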
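The UPDATE only increments a row that already exists, so the 100 slot rows have to be created up front. Here is a minimal sketch of one way to seed them, assuming slots numbered 0 through 99 and starting counts of zero. The derived-table aliases `ones` and `tens` and the column `d` are illustrative names, not from the book:

```sql
-- Seed hit_counter with slots 0..99, each starting at cnt = 0.
-- Two derived tables of the digits 0..9 are cross-joined to produce
-- 10 * 10 = 100 distinct slot numbers (tens digit * 10 + ones digit).
INSERT INTO hit_counter (slot, cnt)
SELECT tens.d * 10 + ones.d, 0
FROM (
    SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
    UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
    UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) AS ones
CROSS JOIN (
    SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
    UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
    UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) AS tens;

-- Sanity check: should report 100 rows with slots ranging from 0 to 99.
SELECT COUNT(*) AS slot_rows, MIN(slot) AS min_slot, MAX(slot) AS max_slot
FROM hit_counter;
```

Run this once, right after creating the table. If a slot row is missing, an update aimed at it simply matches nothing and that hit is silently lost.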
This spreads the contention that used to fall on a single row across 100 rows. To get the total, we simply run:
mysql> SELECT SUM(cnt) FROM hit_counter;
For more complex cases, we can again turn to High Performance MySQL, 3rd Edition:
A common requirement is to start new counters every so often (for example, once a
day). If you need to do this, you can change the schema slightly:
mysql> CREATE TABLE daily_hit_counter (
-> day date not null,
-> slot tinyint unsigned not null,
-> cnt int unsigned not null,
-> primary key(day, slot)
-> ) ENGINE=InnoDB;
You don’t want to pregenerate rows for this scenario. Instead, you can use ON DUPLICATE
KEY UPDATE:
mysql> INSERT INTO daily_hit_counter(day, slot, cnt)
-> VALUES(CURRENT_DATE, RAND() * 100, 1)
-> ON DUPLICATE KEY UPDATE cnt = cnt + 1;
If you want to reduce the number of rows to keep the table smaller, you can write a
periodic job that merges all the results into slot 0 and deletes every other slot:
mysql> UPDATE daily_hit_counter as c
-> INNER JOIN (
-> SELECT day, SUM(cnt) AS cnt, MIN(slot) AS mslot
-> FROM daily_hit_counter
-> GROUP BY day
-> ) AS x USING(day)
-> SET c.cnt = IF(c.slot = x.mslot, x.cnt, 0),
-> c.slot = IF(c.slot = x.mslot, 0, c.slot);
mysql> DELETE FROM daily_hit_counter WHERE slot <> 0 AND cnt = 0;
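Reading the daily counters back works the same way as with the single-day table, just grouped or filtered by day. The queries below are a small sketch of that, not taken from the book; the column aliases `hits` and `hits_today` are made up for illustration:

```sql
-- Total hits per day, summed across however many slots exist for that day.
-- This gives the same answer before and after the merge job has run,
-- because merging only moves counts into slot 0 without changing the sum.
SELECT day, SUM(cnt) AS hits
FROM daily_hit_counter
GROUP BY day
ORDER BY day;

-- Total for today only. COALESCE turns the NULL that SUM returns
-- when no rows exist yet for today into 0.
SELECT COALESCE(SUM(cnt), 0) AS hits_today
FROM daily_hit_counter
WHERE day = CURRENT_DATE;
```

Because the primary key starts with day, the second query only needs to touch today's rows.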
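In short, the second design takes the updates that would otherwise all hit a single row each day and spreads them across N rows, then aggregates those rows into the total you need, which reduces the lock contention caused by DML. If your tables often need this kind of real-time counting, I think this is a good approach.
Spreading the lock on hot data across N rows really is a good technique in some application environments. If you have run into a similar situation, this divide-and-conquer approach may be a good choice!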