关于关系表的设计归根结底有两个方面。
第一,就是完全按照范式理论去设计,一般来说达到第三范式就可以了,或者你可以划分的更细到达更上一层次。比如第四,第五,第六等等。这种设计有自己的可读性很强,但是有一点,在检索数据的时候增加了多张关系表来做关联的开销。
第二,就是在范式理论上适当的做些反范式,有的东西还是不要太剥离的好。(窄表以及宽表) 这点和软件设计中的紧耦合松耦合理论一致。
下面我就以常用的LOG表来做下演示,其中有两种表的实际,一种是窄表,一种是稍微宽一点的表。
窄表:log_ytt
-
mysql> show create table log_ytt;
-
+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-
| Table | Create Table |
-
+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-
| log_ytt | CREATE TABLE `log_ytt` (
-
`ids` bigint(20) DEFAULT NULL,
-
`log_time` datetime DEFAULT NULL,
-
KEY `idx_u1` (`ids`,`log_time`)
-
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
-
+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-
1 row in set (0.00 sec)
表记录数
-
mysql> select * from log_ytt where ids > '4875000001'; +------------+---------------------+
-
| ids | log_time |
-
+------------+---------------------+
-
| 7110000001 | 2014-05-20 21:56:42 |
-
| 6300000001 | 2014-05-20 21:56:42 |
-
| 6750000001 | 2014-05-20 21:56:42 |
-
| 5310000001 | 2014-05-20 21:56:42 |
-
| 7200000001 | 2014-05-20 21:56:42 |
-
| 7380000001 | 2014-05-20 21:56:42 |
-
| 5760000001 | 2014-05-20 21:56:42 |
-
| 6930000001 | 2014-05-20 21:56:42 |
-
| 6660000001 | 2014-05-20 21:56:42 |
-
| 5670000001 | 2014-05-20 21:56:42 |
-
| 6210000001 | 2014-05-20 21:56:42 |
-
| 5850000001 | 2014-05-20 21:56:42 |
-
| 6570000001 | 2014-05-20 21:56:42 |
-
| 5580000001 | 2014-05-20 21:56:42 |
-
| 5130000001 | 2014-05-20 21:56:42 |
-
| 7290000001 | 2014-05-20 21:56:42 |
-
| 6390000001 | 2014-05-20 21:56:42 |
-
| 5490000001 | 2014-05-20 21:56:42 |
-
| 5220000001 | 2014-05-20 21:56:42 |
-
| 7560000001 | 2014-05-20 21:56:42 |
-
| 7470000001 | 2014-05-20 21:56:42 |
-
| 7020000001 | 2014-05-20 21:56:42 |
-
| 6840000001 | 2014-05-20 21:56:42 |
-
| 6030000001 | 2014-05-20 21:56:42 |
-
| 6480000001 | 2014-05-20 21:56:42 |
-
| 7650000001 | 2014-05-20 21:56:42 |
-
| 5940000001 | 2014-05-20 21:56:42 |
-
| 6120000001 | 2014-05-20 21:56:42 |
-
| 7740000001 | 2014-05-20 21:56:42 |
-
| 5400000001 | 2014-05-20 21:56:42 |
-
| 5760000001 | 2014-05-21 03:19:07 |
-
| 6840000001 | 2014-05-21 03:19:17 |
-
| 7020000001 | 2014-05-21 03:19:32 |
-
| 7200000001 | 2014-05-21 03:19:45 |
-
| 7110000001 | 2014-05-21 03:19:46 |
-
| 7380000001 | 2014-05-21 03:19:48 |
-
| 5670000001 | 2014-05-21 03:19:58 |
-
| 6930000001 | 2014-05-21 03:19:59 |
-
| 6030000001 | 2014-05-21 03:20:00 |
-
| 5940000001 | 2014-05-21 03:20:00 |
-
| 7290000001 | 2014-05-21 03:20:02 |
-
| 6120000001 | 2014-05-21 03:20:09 |
-
| 5850000001 | 2014-05-21 03:20:18 |
-
| 5580000001 | 2014-05-21 03:20:24 |
-
| 6480000001 | 2014-05-21 03:25:05 |
-
| 6390000001 | 2014-05-21 03:25:37 |
-
| 6210000001 | 2014-05-21 03:25:45 |
-
| 7470000001 | 2014-05-21 03:26:14 |
-
| 6750000001 | 2014-05-21 03:27:17 |
-
| 5310000001 | 2014-05-21 03:27:33 |
-
| 5130000001 | 2014-05-21 03:27:34 |
-
| 6570000001 | 2014-05-21 03:27:34 |
-
| 7560000001 | 2014-05-21 03:27:45 |
-
| 5220000001 | 2014-05-21 03:27:45 |
-
| 5400000001 | 2014-05-21 03:27:53 |
-
| 5490000001 | 2014-05-21 03:27:55 |
-
| 6660000001 | 2014-05-21 03:28:07 |
-
| 6300000001 | 2014-05-21 03:28:13 |
-
| 7740000001 | 2014-05-21 03:28:26 |
-
| 7650000001 | 2014-05-21 03:28:37 |
-
+------------+---------------------+
-
60 rows in set (0.00 sec)
接下来,我们要检索所有IDS的平均时间。 有以下两种方式:
第一, 对表进行了两次访问,并且有GROUP BY 操作,不可取。
-
mysql> select sec_to_time(avg(timestampdiff(second,a.times,b.times))) as 'running'
-
-> from
-
-> (select ids,min(log_time) as times from log_ytt where 1 group by ids ) as a,
-
-> (select ids,max(log_time) as times from log_ytt where 1 group by ids) as b where a.ids = b.ids;
-
+---------------+
-
| running |
-
+---------------+
-
| 05:27:08.8333 |
-
+---------------+
-
1 row in set (0.00 sec)
第二,虽然对表进行了最少的访问,但是也有一次GROUP BY 操作。也没办法,表设计如此。
-
mysql> SELECT SEC_TO_TIME(AVG(times)) AS 'Running' FROM
-
-> (
-
-> SELECT TIMESTAMPDIFF(SECOND,MIN(log_time),MAX(log_time)) AS times FROM log_ytt GROUP BY ids
-
-> ) AS T;
-
+---------------+
-
| Running |
-
+---------------+
-
| 05:27:08.8333 |
-
+---------------+
-
1 row in set (0.00 sec)
宽表:log_ytt_horizontal.
-
mysql> show create table log_ytt_horizontal;
-
+------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-
| Table | Create Table |
-
+------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-
| log_ytt_horizontal | CREATE TABLE `log_ytt_horizontal` (
-
`ids` bigint(20) NOT NULL,
-
`start_time` datetime DEFAULT NULL,
-
`end_time` datetime DEFAULT NULL,
-
PRIMARY KEY (`ids`)
-
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
-
+------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-
1 row in set (0.00 sec)
表记录数:
-
mysql> select * from log_ytt_horizontal;
-
+------------+---------------------+---------------------+
-
| ids | start_time | end_time |
-
+------------+---------------------+---------------------+
-
| 5130000001 | 2014-05-20 21:56:42 | 2014-05-21 03:27:34 |
-
| 5220000001 | 2014-05-20 21:56:42 | 2014-05-21 03:27:45 |
-
| 5310000001 | 2014-05-20 21:56:42 | 2014-05-21 03:27:33 |
-
| 5400000001 | 2014-05-20 21:56:42 | 2014-05-21 03:27:53 |
-
| 5490000001 | 2014-05-20 21:56:42 | 2014-05-21 03:27:55 |
-
| 5580000001 | 2014-05-20 21:56:42 | 2014-05-21 03:20:24 |
-
| 5670000001 | 2014-05-20 21:56:42 | 2014-05-21 03:19:58 |
-
| 5760000001 | 2014-05-20 21:56:42 | 2014-05-21 03:19:07 |
-
| 5850000001 | 2014-05-20 21:56:42 | 2014-05-21 03:20:18 |
-
| 5940000001 | 2014-05-20 21:56:42 | 2014-05-21 03:20:00 |
-
| 6030000001 | 2014-05-20 21:56:42 | 2014-05-21 03:20:00 |
-
| 6120000001 | 2014-05-20 21:56:42 | 2014-05-21 03:20:09 |
-
| 6210000001 | 2014-05-20 21:56:42 | 2014-05-21 03:25:45 |
-
| 6300000001 | 2014-05-20 21:56:42 | 2014-05-21 03:28:13 |
-
| 6390000001 | 2014-05-20 21:56:42 | 2014-05-21 03:25:37 |
-
| 6480000001 | 2014-05-20 21:56:42 | 2014-05-21 03:25:05 |
-
| 6570000001 | 2014-05-20 21:56:42 | 2014-05-21 03:27:34 |
-
| 6660000001 | 2014-05-20 21:56:42 | 2014-05-21 03:28:07 |
-
| 6750000001 | 2014-05-20 21:56:42 | 2014-05-21 03:27:17 |
-
| 6840000001 | 2014-05-20 21:56:42 | 2014-05-21 03:19:17 |
-
| 6930000001 | 2014-05-20 21:56:42 | 2014-05-21 03:19:59 |
-
| 7020000001 | 2014-05-20 21:56:42 | 2014-05-21 03:19:32 |
-
| 7110000001 | 2014-05-20 21:56:42 | 2014-05-21 03:19:46 |
-
| 7200000001 | 2014-05-20 21:56:42 | 2014-05-21 03:19:45 |
-
| 7290000001 | 2014-05-20 21:56:42 | 2014-05-21 03:20:02 |
-
| 7380000001 | 2014-05-20 21:56:42 | 2014-05-21 03:19:48 |
-
| 7470000001 | 2014-05-20 21:56:42 | 2014-05-21 03:26:14 |
-
| 7560000001 | 2014-05-20 21:56:42 | 2014-05-21 03:27:45 |
-
| 7650000001 | 2014-05-20 21:56:42 | 2014-05-21 03:28:37 |
-
| 7740000001 | 2014-05-20 21:56:42 | 2014-05-21 03:28:26 |
-
+------------+---------------------+---------------------+
-
30 rows in set (0.00 sec)
如果对这种稍微冗余一些的表来进行查询,那么对表的访问以及CPU的资源占用都达到了最低。
-
mysql> select sec_to_time(avg(timestampdiff(second,start_time,end_time))) as 'Running' from log_ytt_horizontal;
-
+---------------+
-
| Running |
-
+---------------+
-
| 05:27:08.8333 |
-
+---------------+
-
1 row in set (0.00 sec)
阅读(6907) | 评论(0) | 转发(0) |