Category: MySQL/PostgreSQL

2008-10-23 15:45:22

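On optimizing pagination.
As we know, pagination in MySQL is simple: just use LIMIT offset, page_size, where offset = (page_no - 1) * page_size.
But as the number of rows grows, it stops working so well.
Here we create a summary table that records the mapping between page numbers and the rows of the source table.
Below is the test data.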

Source table:
CREATE TABLE `t_group` (
  `id` int(11) NOT NULL auto_increment,
  `money` decimal(10,2) NOT NULL,
  `user_name` varchar(20) NOT NULL,
  `create_time` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  PRIMARY KEY  (`id`),
  KEY `idx_combination1` (`user_name`,`money`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Total number of rows in the source table:
mysql> select count(*) from t_group;
+----------+
| count(*) |
+----------+
| 10485760 |
+----------+
1 row in set (0.00 sec)
Paging table:


CREATE TABLE `t_group_ids` (
  `id` int(11) NOT NULL,
  `group_id` int(11) NOT NULL,
  PRIMARY KEY  (`id`,`group_id`),
  KEY `idx_id` (`id`),
  KEY `idx_group_id` (`group_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


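Populate the paging table. Note that ceil(id/20) puts exactly 20 rows on each page only when the id values have no gaps. And of course, if your table's primary key is not an id like this one, you will have to find your own way to generate the paging table's data. That is easy to do, so I won't cover it here.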
mysql> insert into t_group_ids select ceil(id/20),id from t_group;
Query OK, 10485760 rows affected (2 min 56.19 sec)
Records: 10485760  Duplicates: 0  Warnings: 0
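
For the case where the ids have gaps or the key is not a plain id, one option is to number the rows by their position in key order and derive the page from that position. What follows is only a minimal sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, the tables are cut down to the columns needed, and the helper name build_page_map is hypothetical.

```python
import sqlite3

PAGE_SIZE = 20  # rows per page, as in this post


def build_page_map(conn, page_size=PAGE_SIZE):
    """Rebuild t_group_ids from row positions instead of raw id values.

    build_page_map is a hypothetical helper, not part of the original post.
    """
    conn.execute("DELETE FROM t_group_ids")
    ids = [row[0] for row in conn.execute("SELECT id FROM t_group ORDER BY id")]
    conn.executemany(
        "INSERT INTO t_group_ids (id, group_id) VALUES (?, ?)",
        # pos is 0-based, so positions 0..19 land on page 1, 20..39 on page 2, ...
        [(pos // page_size + 1, gid) for pos, gid in enumerate(ids)],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_group (id INTEGER PRIMARY KEY, user_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE t_group_ids ("
    " id INTEGER NOT NULL,"
    " group_id INTEGER NOT NULL,"
    " PRIMARY KEY (id, group_id))"
)

# 100 rows whose ids skip every multiple of 3, to simulate deleted rows.
conn.executemany(
    "INSERT INTO t_group (id, user_name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 151) if i % 3 != 0],
)

build_page_map(conn)

# Every page holds exactly PAGE_SIZE rows despite the gaps in id.
page_sizes = conn.execute(
    "SELECT id, COUNT(*) FROM t_group_ids GROUP BY id ORDER BY id"
).fetchall()
print(page_sizes)
```

The price is that the map has to be rebuilt (or at least the pages after the change renumbered) whenever rows are inserted or deleted in the middle of the table.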

Now let's look at the comparison.

Pagination with plain LIMIT.

mysql> select * from t_group where 1 limit 20;
+----+--------+-----------+---------------------+
| id | money  | user_name | create_time         |
+----+--------+-----------+---------------------+
|  1 |  50.23 | david     | 2008-10-23 12:55:49 |
|  2 |  55.23 | livia     | 2008-10-23 10:02:09 |
|  3 | 100.83 | leo       | 2008-10-23 10:02:22 |
|  4 |  99.99 | lucy      | 2008-10-23 10:02:39 |
|  5 | 299.99 | simon     | 2008-10-23 10:02:52 |
|  6 | 599.99 | sony      | 2008-10-23 10:03:03 |
|  7 | 599.99 | rick      | 2008-10-23 10:03:12 |
|  8 |   9.99 | anne      | 2008-10-23 10:03:47 |
|  9 |   9.99 | sarah     | 2008-10-23 10:04:31 |
| 10 | 900.99 | john      | 2008-10-23 10:04:50 |
| 11 |   0.23 | david     | 2008-10-23 10:05:31 |
| 12 |   5.23 | livia     | 2008-10-23 10:05:31 |
| 13 |  50.83 | leo       | 2008-10-23 10:05:31 |
| 14 |  49.99 | lucy      | 2008-10-23 10:05:31 |
| 15 | 249.99 | simon     | 2008-10-23 10:05:31 |
| 16 | 549.99 | sony      | 2008-10-23 10:05:31 |
| 17 | 549.99 | rick      | 2008-10-23 10:05:31 |
| 18 | -40.01 | anne      | 2008-10-23 10:05:31 |
| 19 | -40.01 | sarah     | 2008-10-23 10:05:31 |
| 20 | 850.99 | john      | 2008-10-23 10:05:31 |
+----+--------+-----------+---------------------+
20 rows in set (0.01 sec)
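
The offsets and page numbers used in the queries in this post all follow from the page size of 20. The small sketch below just works through that arithmetic; the function names are my own and are not part of the original test.

```python
PAGE_SIZE = 20  # rows per page, as in the queries in this post


def limit_offset(page_no, page_size=PAGE_SIZE):
    """Offset for `LIMIT offset, page_size` when fetching 1-based page page_no."""
    if page_no < 1:
        raise ValueError("page_no starts at 1")
    return (page_no - 1) * page_size


def page_of_id(row_id, page_size=PAGE_SIZE):
    """Page that ceil(id / page_size) assigns to a row (integer arithmetic)."""
    return (row_id + page_size - 1) // page_size


def page_count(total_rows, page_size=PAGE_SIZE):
    """Number of pages needed to hold total_rows rows."""
    return (total_rows + page_size - 1) // page_size


print(limit_offset(500000))    # 9999980, offset for page 500,000
print(page_count(10485760))    # 524288, number of the last page
print(limit_offset(524288))    # 10485740, offset for the last page
print(page_of_id(9999981))     # 500000, first row of page 500,000
```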


Pagination with the paging table.

mysql> select a.* from t_group as a inner join t_group_ids as b where a.id = b.group_id and b.id = 1;
+----+--------+-----------+---------------------+
| id | money  | user_name | create_time         |
+----+--------+-----------+---------------------+
|  1 |  50.23 | david     | 2008-10-23 12:55:49 |
|  2 |  55.23 | livia     | 2008-10-23 10:02:09 |
|  3 | 100.83 | leo       | 2008-10-23 10:02:22 |
|  4 |  99.99 | lucy      | 2008-10-23 10:02:39 |
|  5 | 299.99 | simon     | 2008-10-23 10:02:52 |
|  6 | 599.99 | sony      | 2008-10-23 10:03:03 |
|  7 | 599.99 | rick      | 2008-10-23 10:03:12 |
|  8 |   9.99 | anne      | 2008-10-23 10:03:47 |
|  9 |   9.99 | sarah     | 2008-10-23 10:04:31 |
| 10 | 900.99 | john      | 2008-10-23 10:04:50 |
| 11 |   0.23 | david     | 2008-10-23 10:05:31 |
| 12 |   5.23 | livia     | 2008-10-23 10:05:31 |
| 13 |  50.83 | leo       | 2008-10-23 10:05:31 |
| 14 |  49.99 | lucy      | 2008-10-23 10:05:31 |
| 15 | 249.99 | simon     | 2008-10-23 10:05:31 |
| 16 | 549.99 | sony      | 2008-10-23 10:05:31 |
| 17 | 549.99 | rick      | 2008-10-23 10:05:31 |
| 18 | -40.01 | anne      | 2008-10-23 10:05:31 |
| 19 | -40.01 | sarah     | 2008-10-23 10:05:31 |
| 20 | 850.99 | john      | 2008-10-23 10:05:31 |
+----+--------+-----------+---------------------+
20 rows in set (0.00 sec)
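
The two ways of fetching a page can be tried side by side at small scale. The sketch below is an approximation under stated assumptions: it uses Python's built-in sqlite3 module instead of MySQL, only 1,000 made-up rows, and two hypothetical helpers, page_via_limit and page_via_map. It also adds ORDER BY, which the queries in this post leave out, so that the row order does not depend on how the engine happens to scan the table.

```python
import sqlite3

PAGE_SIZE = 20  # rows per page, as in this post

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_group ("
    " id INTEGER PRIMARY KEY,"
    " money REAL NOT NULL,"
    " user_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE t_group_ids ("
    " id INTEGER NOT NULL,"
    " group_id INTEGER NOT NULL,"
    " PRIMARY KEY (id, group_id))"
)

# Made-up sample data: 1,000 rows with contiguous ids.
conn.executemany(
    "INSERT INTO t_group (id, money, user_name) VALUES (?, ?, ?)",
    [(i, i * 1.5, f"user{i}") for i in range(1, 1001)],
)

# Integer form of ceil(id / 20): maps ids 1..20 to page 1, 21..40 to page 2, ...
conn.execute(
    "INSERT INTO t_group_ids SELECT (id + ? - 1) / ?, id FROM t_group",
    (PAGE_SIZE, PAGE_SIZE),
)


def page_via_limit(conn, page_no):
    """Fetch a page by skipping (page_no - 1) * PAGE_SIZE rows."""
    offset = (page_no - 1) * PAGE_SIZE
    return conn.execute(
        "SELECT id, money, user_name FROM t_group"
        " ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()


def page_via_map(conn, page_no):
    """Fetch a page by looking up its ids in the paging table."""
    return conn.execute(
        "SELECT a.id, a.money, a.user_name"
        " FROM t_group AS a"
        " INNER JOIN t_group_ids AS b ON a.id = b.group_id"
        " WHERE b.id = ?"
        " ORDER BY a.id",
        (page_no,),
    ).fetchall()


# Both approaches return the same rows for the same page.
assert page_via_limit(conn, 50) == page_via_map(conn, 50)
print(page_via_map(conn, 50)[0])
```

At this size both queries are instant; the sketch only shows that they return the same rows, not the timing difference measured in this post.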



Fetch page 500,000.
Source table:

mysql> select * from t_group where 1 limit 9999980,20;
+----------+---------+-----------+---------------------+
| id       | money   | user_name | create_time         |
+----------+---------+-----------+---------------------+
|  9999981 |  810.13 | david     | 2008-10-23 10:09:24 |
|  9999982 |  815.13 | livia     | 2008-10-23 10:09:24 |
|  9999983 |  860.73 | leo       | 2008-10-23 10:09:24 |
|  9999984 |  859.89 | lucy      | 2008-10-23 10:09:24 |
|  9999985 | 1059.89 | simon     | 2008-10-23 10:09:24 |
|  9999986 | 1359.89 | sony      | 2008-10-23 10:09:24 |
|  9999987 | 1359.89 | rick      | 2008-10-23 10:09:24 |
|  9999988 |  769.89 | anne      | 2008-10-23 10:09:24 |
|  9999989 |  769.89 | sarah     | 2008-10-23 10:09:24 |
|  9999990 | 1660.89 | john      | 2008-10-23 10:09:24 |
|  9999991 |  760.13 | david     | 2008-10-23 10:09:24 |
|  9999992 |  765.13 | livia     | 2008-10-23 10:09:24 |
|  9999993 |  810.73 | leo       | 2008-10-23 10:09:24 |
|  9999994 |  809.89 | lucy      | 2008-10-23 10:09:24 |
|  9999995 | 1009.89 | simon     | 2008-10-23 10:09:24 |
|  9999996 | 1309.89 | sony      | 2008-10-23 10:09:24 |
|  9999997 | 1309.89 | rick      | 2008-10-23 10:09:24 |
|  9999998 |  719.89 | anne      | 2008-10-23 10:09:24 |
|  9999999 |  719.89 | sarah     | 2008-10-23 10:09:24 |
| 10000000 | 1610.89 | john      | 2008-10-23 10:09:24 |
+----------+---------+-----------+---------------------+
20 rows in set (4.21 sec)

Paging table:

mysql> select a.* from t_group as a inner join t_group_ids as b where a.id = b.group_id and b.id = 500000;
+----------+---------+-----------+---------------------+
| id       | money   | user_name | create_time         |
+----------+---------+-----------+---------------------+
|  9999981 |  810.13 | david     | 2008-10-23 10:09:24 |
|  9999982 |  815.13 | livia     | 2008-10-23 10:09:24 |
|  9999983 |  860.73 | leo       | 2008-10-23 10:09:24 |
|  9999984 |  859.89 | lucy      | 2008-10-23 10:09:24 |
|  9999985 | 1059.89 | simon     | 2008-10-23 10:09:24 |
|  9999986 | 1359.89 | sony      | 2008-10-23 10:09:24 |
|  9999987 | 1359.89 | rick      | 2008-10-23 10:09:24 |
|  9999988 |  769.89 | anne      | 2008-10-23 10:09:24 |
|  9999989 |  769.89 | sarah     | 2008-10-23 10:09:24 |
|  9999990 | 1660.89 | john      | 2008-10-23 10:09:24 |
|  9999991 |  760.13 | david     | 2008-10-23 10:09:24 |
|  9999992 |  765.13 | livia     | 2008-10-23 10:09:24 |
|  9999993 |  810.73 | leo       | 2008-10-23 10:09:24 |
|  9999994 |  809.89 | lucy      | 2008-10-23 10:09:24 |
|  9999995 | 1009.89 | simon     | 2008-10-23 10:09:24 |
|  9999996 | 1309.89 | sony      | 2008-10-23 10:09:24 |
|  9999997 | 1309.89 | rick      | 2008-10-23 10:09:24 |
|  9999998 |  719.89 | anne      | 2008-10-23 10:09:24 |
|  9999999 |  719.89 | sarah     | 2008-10-23 10:09:24 |
| 10000000 | 1610.89 | john      | 2008-10-23 10:09:24 |
+----------+---------+-----------+---------------------+
20 rows in set (0.03 sec)


Now let's fetch the last page.
Source table:


mysql> select * from t_group where 1 limit 10485740,20;
+----------+---------+-----------+---------------------+
| id       | money   | user_name | create_time         |
+----------+---------+-----------+---------------------+
| 10485741 | 1935.42 | david     | 2008-10-23 10:09:24 |
| 10485742 | 1955.42 | livia     | 2008-10-23 10:09:24 |
| 10485743 | 2137.82 | leo       | 2008-10-23 10:09:24 |
| 10485744 | 2134.46 | lucy      | 2008-10-23 10:09:24 |
| 10485745 | 2934.46 | simon     | 2008-10-23 10:09:24 |
| 10485746 | 4134.46 | sony      | 2008-10-23 10:09:24 |
| 10485747 | 4134.46 | rick      | 2008-10-23 10:09:24 |
| 10485748 | 1774.46 | anne      | 2008-10-23 10:09:24 |
| 10485749 | 1774.46 | sarah     | 2008-10-23 10:09:24 |
| 10485750 | 5338.46 | john      | 2008-10-23 10:09:24 |
| 10485751 | 1735.42 | david     | 2008-10-23 10:09:24 |
| 10485752 | 1755.42 | livia     | 2008-10-23 10:09:24 |
| 10485753 | 1937.82 | leo       | 2008-10-23 10:09:24 |
| 10485754 | 1934.46 | lucy      | 2008-10-23 10:09:24 |
| 10485755 | 2734.46 | simon     | 2008-10-23 10:09:24 |
| 10485756 | 3934.46 | sony      | 2008-10-23 10:09:24 |
| 10485757 | 3934.46 | rick      | 2008-10-23 10:09:24 |
| 10485758 | 1574.46 | anne      | 2008-10-23 10:09:24 |
| 10485759 | 1574.46 | sarah     | 2008-10-23 10:09:24 |
| 10485760 | 5138.46 | john      | 2008-10-23 10:09:24 |
+----------+---------+-----------+---------------------+
20 rows in set (4.88 sec)

Paging table:

mysql> select a.* from t_group as a inner join t_group_ids as b where a.id = b.group_id and b.id = 524288;
+----------+---------+-----------+---------------------+
| id       | money   | user_name | create_time         |
+----------+---------+-----------+---------------------+
| 10485741 | 1935.42 | david     | 2008-10-23 10:09:24 |
| 10485742 | 1955.42 | livia     | 2008-10-23 10:09:24 |
| 10485743 | 2137.82 | leo       | 2008-10-23 10:09:24 |
| 10485744 | 2134.46 | lucy      | 2008-10-23 10:09:24 |
| 10485745 | 2934.46 | simon     | 2008-10-23 10:09:24 |
| 10485746 | 4134.46 | sony      | 2008-10-23 10:09:24 |
| 10485747 | 4134.46 | rick      | 2008-10-23 10:09:24 |
| 10485748 | 1774.46 | anne      | 2008-10-23 10:09:24 |
| 10485749 | 1774.46 | sarah     | 2008-10-23 10:09:24 |
| 10485750 | 5338.46 | john      | 2008-10-23 10:09:24 |
| 10485751 | 1735.42 | david     | 2008-10-23 10:09:24 |
| 10485752 | 1755.42 | livia     | 2008-10-23 10:09:24 |
| 10485753 | 1937.82 | leo       | 2008-10-23 10:09:24 |
| 10485754 | 1934.46 | lucy      | 2008-10-23 10:09:24 |
| 10485755 | 2734.46 | simon     | 2008-10-23 10:09:24 |
| 10485756 | 3934.46 | sony      | 2008-10-23 10:09:24 |
| 10485757 | 3934.46 | rick      | 2008-10-23 10:09:24 |
| 10485758 | 1574.46 | anne      | 2008-10-23 10:09:24 |
| 10485759 | 1574.46 | sarah     | 2008-10-23 10:09:24 |
| 10485760 | 5138.46 | john      | 2008-10-23 10:09:24 |
+----------+---------+-----------+---------------------+
20 rows in set (0.01 sec)


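Summary: as the offset grows, the time taken by LIMIT grows roughly linearly, because the skipped rows still have to be read: page 500,000 took 4.21 sec and the last page took 4.88 sec. Once we store the mapping between page numbers and primary keys, the same pages come back in 0.03 sec and 0.01 sec.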

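ChinaUnix user 2011-07-19 10:04:13

The idiots raising doubts here are nowhere near the author's level. The author can't even be bothered to answer you. I support the author; this is a good idea.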

ChinaUnix user 2010-05-11 15:38:32

Worth borrowing!

goto999 2009-02-23 09:53:52

A very nice idea, bookmarked.

yueliangdao0608 2008-10-28 07:37:45

That way you would have to list out every paging condition, wouldn't you?

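ChinaUnix user 2008-10-27 20:46:26

Generally speaking, this approach only suits static tables. For a table that changes, the maintenance cost really is considerable. Even with other mechanisms, such as having the application insert into both tables or using a database trigger to do the insert, there is a lot of added cost, and it is easy to end up with inconsistent data. Pagination can also be optimized in other ways, for example by creating an index that contains the primary key plus the conditions needed for paging, and then using an approach similar to the one above. That also carries some extra cost, but maintenance and data consistency are taken care of.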
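
One possible reading of that last suggestion is a "deferred join": let an index on the paging condition plus the primary key find the ids for the requested page, then join back to fetch the full rows. The sketch below is my interpretation, not something the commenter spelled out. It uses Python's built-in sqlite3 module as a stand-in for MySQL, made-up data, and a hypothetical index idx_user_name_id and helper fetch_page.

```python
import sqlite3

PAGE_SIZE = 20
NAMES = ["david", "livia", "leo", "lucy", "simon"]  # made-up sample users

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_group ("
    " id INTEGER PRIMARY KEY,"
    " money REAL NOT NULL,"
    " user_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO t_group (id, money, user_name) VALUES (?, ?, ?)",
    [(i, float(i), NAMES[i % len(NAMES)]) for i in range(1, 1001)],
)

# Hypothetical index: the paging condition (user_name) plus the primary key.
conn.execute("CREATE INDEX idx_user_name_id ON t_group (user_name, id)")


def fetch_page(conn, user_name, page_no, page_size=PAGE_SIZE):
    """Pick the page's ids from the index first, then fetch the full rows."""
    offset = (page_no - 1) * page_size
    return conn.execute(
        """
        SELECT a.id, a.money, a.user_name
        FROM t_group AS a
        INNER JOIN (
            SELECT id
            FROM t_group
            WHERE user_name = ?
            ORDER BY id
            LIMIT ? OFFSET ?
        ) AS b ON a.id = b.id
        ORDER BY a.id
        """,
        (user_name, page_size, offset),
    ).fetchall()


rows = fetch_page(conn, "leo", 3)
print(len(rows), rows[0][0], rows[-1][0])
```

The inner query still has to step over the skipped entries, but it only touches the narrow index rather than full rows, and there is no second table to keep in sync.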