阿里巴巴DBA,原去哪儿网DBA。专注于MySQL源码研究、DBA运维、CGroup虚拟化及Linux Kernel源码研究等。 github:https://github.com/HengWang/ Email:king_wangheng@163.com 微博 :@王恒-Henry QQ :506437736
分类: Mysql/postgreSQL
2012-08-25 14:19:09
目的
基于之前对MySQL查询优化器源码的分析,本文对查询优化器进行进一步的实验分析。一方面便于DBA和开发人员在设计SQL语句时,了解SQL查询和优化的策略,以及生成查询计划的过程,进一步有助于SQL语句的设计。另一方面,对现有的SQL查询语句进行优化,提高查询效率,减少慢查询。
测试方案
本文基于数据库学习过程中最经典的模型:学生选课数据库样例,展开SQL语句的实验测试分析。SQL测试语句尽量提供各种可能的情况,从普通查询,复杂join连接查询,子查询,嵌套join查询等方面进行实验测试,分析查询优化和生成查询计划的过程。
1、数据表设计
数据表包括学生表(student)、课程表(course)、选课表(std_cur),具体数据表设计如下:
学生表(student)如下所示:
Field |
Type |
Key |
Comment |
std_id |
int |
primary key |
学号 |
std_name |
varchar(20) |
|
姓名 |
std_spec |
varchar(20) |
|
专业 |
std_*** |
tinyint |
|
性别 |
std_age |
tinyint unsigned |
|
年龄 |
课程表(course)如下所示:
Field |
Type |
Key |
Comment |
cur_id |
int |
primary key |
课程号 |
cur_name |
varchar(20) |
|
课程名 |
cur_credit |
tinyint unsigned |
|
课程学分 |
cur_hours |
smallint unsigned |
|
课程学时 |
学生选课表(std_cur)如下所示:
Field |
Type |
Key |
Comment |
std_id |
int |
primary key |
学号 |
cur_id |
int |
primary key |
课程号 |
score |
tinyint unsigned |
|
分数 |
2、数据表定义
student表定义如下所示:
CREATE TABLE `student` ( `std_id` int(11) NOT NULL, `std_name` varchar(20) NOT NULL DEFAULT '""', `std_spec` varchar(20) NOT NULL DEFAULT '""', `std_***` tinyint(4) NOT NULL DEFAULT '0', `std_age` tinyint(3) unsigned NOT NULL DEFAULT '0', PRIMARY KEY (`std_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; |
course表定义如下所示:
CREATE TABLE `course` ( `cur_id` int(11) NOT NULL, `cur_name` varchar(20) NOT NULL DEFAULT '""', `cur_credit` tinyint(4) NOT NULL DEFAULT '0', `cur_hours` smallint(6) NOT NULL DEFAULT '0', PRIMARY KEY (`cur_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; |
std_cur表定义如下所示:
CREATE TABLE `std_cur` ( `std_id` int(11) NOT NULL, `cur_id` int(11) NOT NULL, `score` tinyint(4) NOT NULL DEFAULT '0', PRIMARY KEY (`std_id`,`cur_id`), KEY `std_cur_ibfk_2` (`cur_id`), CONSTRAINT `std_cur_ibfk_1` FOREIGN KEY (`std_id`) REFERENCES `student` (`std_id`) ON DELETE NO ACTION ON UPDATE CASCADE, CONSTRAINT `std_cur_ibfk_2` FOREIGN KEY (`cur_id`) REFERENCES `course` (`cur_id`) ON DELETE NO ACTION ON UPDATE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8; |
3、测试数据设计
其中插入的测试数据描述如下:
1、student添加测试数据26条;
2、course添加测试数据20条;
3、std_cur添加测试数据520条。
4、SQL语句设计
测试的SQL语句分为普通查询测试语句、复杂join查询测试语句、子查询测试语句、嵌套join查询测试语句等测试。其中,很多测试SQL语句是等价的,之所以如此设计,是考虑到对不同形式,研究MySQL查询优化器的工作原理。具体如下所示:
单表操作 1)查询student中的所有字段 SELECT * FROM student; 2)查询student中的指定字段 SELECT std_id, std_name, std_spec, std_***, std_age FROM student; 3)查询student中没有重复的字段 SELECT DISTINCT std_spec, std_***, std_age FROM student; select std_spec, std_***, std_age FROM student GROUP BY std_spec, std_***, std_age; 4)查询student中主键字段 SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_id = 2012072306; 5)查询student中普通字段 SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_name="fff"; 注意:为std_name添加索引,查看查询计划。 6)查询student中多个普通字段(包含常量条件) SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_name="fff" AND std_age > 20 AND 1=1; 7)查询student中limit条件 SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_spec = "computer" LIMIT 5; 8)查询student中in条件(索引和普通字段) SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_name in ("bbb","ccc","ddd"); SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_spec in ("math","information"); 9)查询student中group by、order by、having联合条件 SELECT std_spec, COUNT(std_spec) cnt FROM student GROUP BY std_spec HAVING cnt > 3 ORDER BY cnt; 10)查询student中union条件 SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_id=2012072306 UNION SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_name="bbb" UNION SELECT std_id, std_name, std_spec, std_***, std_age FROM student WHERE std_spec="math";
复合查询 1)多表联合查询 SELECT student.std_id, std_name, std_spec, std_***, std_age, cur_name, cur_credit, cur_hours, score FROM student, course, std_cur WHERE student.std_id = std_cur.std_id AND course.cur_id = std_cur.cur_id AND course.cur_id = 101; 2)Join查询 SELECT student.std_id, std_name, std_spec, std_***, std_age, cur_name, cur_credit, cur_hours, score FROM student JOIN(course, std_cur) ON (student.std_id=std_cur.std_id AND std_cur.cur_id=course.cur_id AND course.cur_id = 101); 3)Join嵌套查询 SELECT student.std_id, std_name, std_spec, std_***, std_age, cur_name, cur_credit, cur_hours, score FROM student JOIN std_cur ON student.std_id = std_cur.std_id JOIN course ON std_cur.cur_id=course.cur_id WHERE course.cur_id = 101; 4)Join嵌套查询(普通字段) SELECT student.std_id, std_name, std_spec, std_***, std_age, cur_name, cur_credit, cur_hours, score FROM student JOIN (std_cur JOIN course ON std_cur.cur_id=course.cur_id) ON student.std_id = std_cur.std_id WHERE course.cur_name = 'PHP'; 5)Left join嵌套(join)查询 SELECT student.std_id, std_name, std_spec, std_***, std_age, cur_name, cur_credit, cur_hours, score FROM student LEFT JOIN(course JOIN std_cur ON std_cur.cur_id=course.cur_id) ON (student.std_id=std_cur.std_id) WHERE course.cur_name = 'C'; 6)Natural join查询 SELECT student.std_id, std_name, std_spec, std_***, std_age, cur_name, cur_credit, cur_hours, score FROM student NATURAL JOIN std_cur NATURAL JOIN course WHERE course.cur_id = 101; 7)Straight_join查询 SELECT student.std_id, std_name, std_spec, std_***, std_age, cur_name, cur_credit, cur_hours, score FROM student STRAIGHT_JOIN(course STRAIGHT_JOIN std_cur ON std_cur.cur_id=course.cur_id) ON (student.std_id=std_cur.std_id) WHERE course.cur_name = 'C'; SELECT student.std_id, std_name, std_spec, std_***, std_age, cur_name, cur_credit, cur_hours, score FROM student STRAIGHT_JOIN(std_cur STRAIGHT_JOIN course ON std_cur.cur_id=course.cur_id) ON (student.std_id=std_cur.std_id) WHERE course.cur_name = 'C'; 8)子查询 SELECT student.std_id, std_name, student.std_spec, std_***, std_age, SUM(cur_credit) total FROM student, course, std_cur, (SELECT std_spec, AVG(score) savg FROM student LEFT JOIN std_cur ON (student.std_id = std_cur.std_id) GROUP BY std_spec) tmp WHERE student.std_id = std_cur.std_id AND course.cur_id = std_cur.cur_id AND student.std_spec = tmp.std_spec AND score > tmp.savg GROUP BY std_id ORDER BY total; |
具体的测试过程将在接下来的《MySQL 查询优化器实验分析(二)》文档中给出。