Chinaunix首页 | 论坛 | 博客
  • 博客访问: 428599
  • 博文数量: 137
  • 博客积分: 5190
  • 博客等级: 大校
  • 技术积分: 997
  • 用 户 组: 普通用户
  • 注册时间: 2010-02-21 16:19
文章存档

2011年(17)

2010年(120)

我的朋友

分类: Mysql/postgreSQL

2010-02-28 19:02:21

 

错误的按日期分区例子

最直观的方法,就是直接用年月日这种日期格式来进行常规的分区:

CODE:
  1. mysql>  create table rms (d date)
  2.     ->  partition by range (d)
  3.     -> (partition p0 values less than ('1995-01-01'),
  4.     ->  partition p1 VALUES LESS THAN ('2010-01-01'));

上面的例子中,就是直接用"Y-m-d"的格式来对一个table进行分区,可惜想当然往往不能奏效,会得到一个错误信息:

ERROR 1064 (42000): VALUES value must be of same type as partition function near '),
partition p1 VALUES LESS THAN ('2010-01-01'))' at line 3

上述分区方式没有成功,而且明显的不经济,老练的DBA会用整型数值来进行分区:

CODE:
  1. mysql> CREATE TABLE part_date1
  2.     ->      (  c1 int default NULL,
  3.     ->  c2 varchar(30) default NULL,
  4.     ->  c3 date default NULL) engine=myisam
  5.     ->      partition by range (cast(date_format(c3,'%Y%m%d') as signed))
  6.     -> (PARTITION p0 VALUES LESS THAN (19950101),
  7.     -> PARTITION p1 VALUES LESS THAN (19960101) ,
  8.     -> PARTITION p2 VALUES LESS THAN (19970101) ,
  9.     -> PARTITION p3 VALUES LESS THAN (19980101) ,
  10.     -> PARTITION p4 VALUES LESS THAN (19990101) ,
  11.     -> PARTITION p5 VALUES LESS THAN (20000101) ,
  12.     -> PARTITION p6 VALUES LESS THAN (20010101) ,
  13.     -> PARTITION p7 VALUES LESS THAN (20020101) ,
  14.     -> PARTITION p8 VALUES LESS THAN (20030101) ,
  15.     -> PARTITION p9 VALUES LESS THAN (20040101) ,
  16.     -> PARTITION p10 VALUES LESS THAN (20100101),
  17.     -> PARTITION p11 VALUES LESS THAN MAXVALUE );
  18. Query OK, 0 rows affected (0.01 sec)

搞定?接着往下分析

CODE:
  1. mysql> explain partitions
  2.     -> select count(*) from part_date1 where
  3.     ->      c3> date '1995-01-01' and c3 '1995-12-31'\G
  4. *************************** 1. row ***************************
  5.            id: 1
  6.   select_type: SIMPLE
  7.         table: part_date1
  8.    partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11
  9.          type: ALL
  10. possible_keys: NULL
  11.           key: NULL
  12.       key_len: NULL
  13.           ref: NULL
  14.          rows: 8100000
  15.         Extra: Using where
  16. 1 row in set (0.00 sec)

万恶的mysql居然对上面的sql使用全表扫描,而不是按照我们的日期分区分块查询。原文中解释到的优化器并不认这种日期形式的分区,花了大量的篇幅来引诱俺走上歧路,过分。

正确的日期分区例子

mysql优化器支持以下两种内置的日期函数进行分区:

  • TO_DAYS()
  • YEAR()

看个例子:

CODE:
  1. mysql> CREATE TABLE part_date3
  2.     ->      (  c1 int default NULL,
  3.     ->  c2 varchar(30) default NULL,
  4.     ->  c3 date default NULL) engine=myisam
  5.     ->      partition by range (to_days(c3))
  6.     -> (PARTITION p0 VALUES LESS THAN (to_days('1995-01-01')),
  7.     -> PARTITION p1 VALUES LESS THAN (to_days('1996-01-01')) ,
  8.     -> PARTITION p2 VALUES LESS THAN (to_days('1997-01-01')) ,
  9.     -> PARTITION p3 VALUES LESS THAN (to_days('1998-01-01')) ,
  10.     -> PARTITION p4 VALUES LESS THAN (to_days('1999-01-01')) ,
  11.     -> PARTITION p5 VALUES LESS THAN (to_days('2000-01-01')) ,
  12.     -> PARTITION p6 VALUES LESS THAN (to_days('2001-01-01')) ,
  13.     -> PARTITION p7 VALUES LESS THAN (to_days('2002-01-01')) ,
  14.     -> PARTITION p8 VALUES LESS THAN (to_days('2003-01-01')) ,
  15.     -> PARTITION p9 VALUES LESS THAN (to_days('2004-01-01')) ,
  16.     -> PARTITION p10 VALUES LESS THAN (to_days('2010-01-01')),
  17.     -> PARTITION p11 VALUES LESS THAN MAXVALUE );
  18. Query OK, 0 rows affected (0.00 sec)

以to_days()函数分区成功,我们分析一下看看:

CODE:
  1. mysql> explain partitions
  2.     -> select count(*) from part_date3 where
  3.     ->      c3> date '1995-01-01' and c3 '1995-12-31'\G
  4. *************************** 1. row ***************************
  5.            id: 1
  6.   select_type: SIMPLE
  7.         table: part_date3
  8.    partitions: p1
  9.          type: ALL
  10. possible_keys: NULL
  11.           key: NULL
  12.       key_len: NULL
  13.           ref: NULL
  14.          rows: 808431
  15.         Extra: Using where
  16. 1 row in set (0.00 sec)

可以看到,优化器这次不负众望,仅仅在p1分区进行查询。在这种情况下查询,真的能够带来提升查询效率么?下面分别对这次建立的part_date3和之前分区失败的part_date1做一个查询对比:

CODE:
  1. mysql> select count(*) from part_date3 where
  2.     ->      c3> date '1995-01-01' and c3 '1995-12-31';
  3. +----------+
  4. | count(*) |
  5. +----------+
  6. |   805114 |
  7. +----------+
  8. 1 row in set (4.11 sec)
  9.  
  10. mysql> select count(*) from part_date1 where
  11.     ->      c3> date '1995-01-01' and c3 '1995-12-31';
  12. +----------+
  13. | count(*) |
  14. +----------+
  15. |   805114 |
  16. +----------+
  17. 1 row in set (40.33 sec)

可以看到,分区正确的话query花费时间为4秒,而分区错误则花费时间40秒(相当于没有分区),效率有90%的提升!所以我们千万要正确的使用分区功能,分区后务必用explain验证,这样才能获得真正的性能提升。

热切期待msyql 5.1稳定版发布!

版权声明:可以任意转载,转载时请务必以超链接形式标明文章原始出处和作者信息及本声明

阅读(2707) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~