今天在和网友Mablevi探讨连接查询使用索引的情况,在改变连接顺序后,用EXPLAIN查看执行计划,都是小表在前面,大表在后面。我们知道MYSQL执行连接查询的时候,使用的叫“内循环”的技术,即依次取出外层循环的表(驱动表)的记录,然后到内层循环表(被驱动表)去查找符合条件的记录。驱动表是依次循环,全表扫描的,被驱动表可以通过索引查找,提高效率。
同时我们知道,MYSQL的执行计划是会去计算开销的,一个SQL查询可能会产生好几百种执行计划,根据开销最小原则,EXPLAIN出来的,就是经过挑选后产生的最优开销。下面我们来实际操作验证一下MYSQL的执行技术是否真的够“聪明”。
先看看表记录数:
mysql> select count(*) from terminal; +----------+ | count(*) | +----------+ | 1579 | +----------+ 1 row in set (0.06 sec)
mysql> select count(*) from area; +----------+ | count(*) | +----------+ | 3350 | +----------+ 1 row in set (0.09 sec)
mysql>
|
在看看索引情况
mysql> show index from terminal; +----------+------------+---------------+--------------+-------------+---------- -+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +----------+------------+---------------+--------------+-------------+---------- -+-------------+----------+--------+------+------------+---------+ | terminal | 0 | PRIMARY | 1 | terminal_id | A | 1565 | NULL | NULL | | BTREE | | | terminal | 1 | index_version | 1 | equ_version | A | 5 | NULL | NULL | YES | BTREE | | +----------+------------+---------------+--------------+-------------+---------- -+-------------+----------+--------+------+------------+---------+ 2 rows in set (0.08 sec)
mysql> show index from area; +-------+------------+----------+--------------+-------------+-----------+------ -------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardi nality | Sub_part | Packed | Null | Index_type | Comment | +-------+------------+----------+--------------+-------------+-----------+------ -------+----------+--------+------+------------+---------+ | area | 0 | PRIMARY | 1 | area_id | A | 3811 | NULL | NULL | | BTREE | | +-------+------------+----------+--------------+-------------+-----------+------ -------+----------+--------+------+------------+---------+ 1 row in set (0.09 sec)
mysql>
|
terminal的字段area_center_id是area表的外键。
1:正常情况下,用小表驱动大表查询看看执行的开销
mysql> select sql_no_cache count(*) from terminal inner join area on terminal.area_center_id=area.area_id; +----------+ | count(*) | +----------+ | 1577 | +----------+ 1 row in set (0.13 sec)
mysql> explain select sql_no_cache count(*) from terminal inner join area on terminal.area_center_id=area.area_id; +----+-------------+----------+--------+---------------+---------+---------+-----------------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+--------+---------------+---------+---------+-----------------------------+------+-------------+ | 1 | SIMPLE | terminal | ALL | NULL | NULL | NULL | NULL | 1565 | | | 1 | SIMPLE | area | eq_ref | PRIMARY | PRIMARY | 4 | cpc.terminal.area_center_id | 1 | Using index | +----+-------------+----------+--------+---------------+---------+---------+-----------------------------+------+-------------+ 2 rows in set (0.09 sec)
mysql> show status like '%last%'; +-----------------+-------------+ | Variable_name | Value | +-----------------+-------------+ | Last_query_cost | 1906.999000 | +-----------------+-------------+ 1 row in set (0.11 sec)
mysql>
|
很明显,首先查询的是TERMINAL表。在这个执行计划下,一共开销1906.999次随机IO读取。
现在换过来,用大表驱动小表,看看执行计划和开销
mysql> explain select sql_no_cache count(*) from area inner join terminal on terminal.area_center_id=area.area_id; +----+-------------+----------+--------+---------------+---------+---------+-----------------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+--------+---------------+---------+---------+-----------------------------+------+-------------+ | 1 | SIMPLE | terminal | ALL | NULL | NULL | NULL | NULL | 1565 | | | 1 | SIMPLE | area | eq_ref | PRIMARY | PRIMARY | 4 | cpc.terminal.area_center_id | 1 | Using index | +----+-------------+----------+--------+---------------+---------+---------+-----------------------------+------+-------------+ 2 rows in set (0.47 sec)
|
执行计划还是自动选择先读小表,再读大表。
我们再强制执行计划安装查询语句定义的顺序读取表,看看开销
mysql> select sql_no_cache count(*) from area straight_join terminal on terminal.area_center_id=area.area_id; +----------+ | count(*) | +----------+ | 1577 | +----------+ 1 row in set (0.38 sec)
mysql> show status like '%last%'; +-----------------+----------------+ | Variable_name | Value | +-----------------+----------------+ | Last_query_cost | 1192884.999000 | +-----------------+----------------+ 1 row in set (0.06 sec)
mysql> explain select sql_no_cache count(*) from area straight_join terminal on terminal.area_center_id=area.area_id; +----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------------+ | 1 | SIMPLE | area | index | PRIMARY | PRIMARY | 4 | NULL | 3811 | Using index | | 1 | SIMPLE | terminal | ALL | NULL | NULL | NULL | NULL | 1565 | Using where; Using join buffer | +----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------------+ 2 rows in set (0.06 sec)
mysql>
|
执行开销猛增到1192884.999000。是原计划开销的625倍。
这也难怪MYSQL会无视我们的查询语句,而选择自己更优的执行计划了。
阅读(1230) | 评论(1) | 转发(0) |