Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1466046
  • 博文数量: 180
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 3926
  • 用 户 组: 普通用户
  • 注册时间: 2011-03-30 13:00
个人简介

About me:Oracle ACE pro,optimistic,passionate and harmonious. Focus on ORACLE,MySQL and other database programming,peformance tuning,db design, j2ee,Linux/AIX,Architecture tech,etc

文章分类

全部博文(180)

文章存档

2025年(8)

2024年(27)

2023年(28)

2022年(43)

2020年(62)

2014年(3)

2013年(9)

分类: Mysql/postgreSQL

2025-03-16 18:26:51

在MySQL里,如果order by + limit,where或join条件还有可选索引,有可能走错索引,默认的是偏向于走order by 列索引。

结构如下:

点击(此处)折叠或打开

  1. show create table empm\G 这个可以看索引创建的先后顺序
  2. *************************** 1. row ***************************
  3.        Table: empm
  4. Create Table: CREATE TABLE `empm` (
  5.   `emp_no` int NOT NULL,
  6.   `birth_date` date NOT NULL,
  7.   `first_name` varchar(100) CHARACTER SET utf8mb3 NOT NULL,
  8.   `last_name` varchar(16) CHARACTER SET utf8mb3 DEFAULT NULL,
  9.   `gender` enum('M','F') CHARACTER SET utf8mb3 NOT NULL,
  10.   `hire_date` date NOT NULL,
  11.   `upper_first_name` varchar(100) CHARACTER SET utf8mb3 DEFAULT NULL,
  12.   PRIMARY KEY (`emp_no`),
  13.   KEY `idx_empm` (`first_name`,`last_name`),
  14.   KEY `idx1_empm` (`first_name`),
  15.   KEY `idx2_empm` (`first_name`,`last_name`,`hire_date`),
  16.   KEY `idx3_empm` (`first_name`,`hire_date`)
  17. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
  18. 1 row in set (0.01 sec)


  19. *************************** 1. row ***************************
  20.            Name: empm
  21.          Engine: InnoDB
  22.         Version: 10
  23.      Row_format: Dynamic
  24.            Rows: 199422
  25.  Avg_row_length: 65
  26.     Data_length: 13107200
  27. Max_data_length: 0
  28.    Index_length: 23920640
  29.       Data_free: 5242880
  30.  Auto_increment: NULL
  31.     Create_time: 2024-12-18 17:15:17
  32.     Update_time: NULL
  33.      Check_time: NULL
  34.       Collation: utf8mb4_0900_ai_ci
  35.        Checksum: NULL
  36.  Create_options:
  37.         Comment:
  38. 1 row in set (0.00 sec)

  39. 索引如下:
  40.  show index from empm; 这个可以看索引创建的先后顺序
  41. +-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
  42. | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
  43. +-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
  44. | empm | 0 | PRIMARY | 1 | emp_no | A | 199422 | NULL | NULL | | BTREE | | | YES | NULL |
  45. | empm | 1 | idx_empm | 1 | first_name | A | 1260 | NULL | NULL | | BTREE | | | YES | NULL |
  46. | empm | 1 | idx_empm | 2 | last_name | A | 147521 | NULL | NULL | YES | BTREE | | | YES | NULL |
  47. | empm | 1 | idx1_empm | 1 | first_name | A | 1334 | NULL | NULL | | BTREE | | | YES | NULL |
  48. | empm | 1 | idx2_empm | 1 | first_name | A | 1299 | NULL | NULL | | BTREE | | | YES | NULL |
  49. | empm | 1 | idx2_empm | 2 | last_name | A | 143264 | NULL | NULL | YES | BTREE | | | YES | NULL |
  50. | empm | 1 | idx2_empm | 3 | hire_date | A | 199422 | NULL | NULL | | BTREE | | | YES | NULL |
  51. | empm | 1 | idx3_empm | 1 | first_name | A | 1269 | NULL | NULL | | BTREE | | | YES | NULL |
  52. | empm | 1 | idx3_empm | 2 | hire_date | A | 156396 | NULL | NULL | | BTREE | | | YES | NULL |
  53. +-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
  54. 9 rows in set (0.00 sec)

  55. 这个不能看索引创建的先后顺序:
  56. select * from mysql.innodb_index_stats where table_name='empm';
  57. +---------------+------------+------------+---------------------+--------------+------------+-------------+---------------------------------------+
  58. | database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |
  59. +---------------+------------+------------+---------------------+--------------+------------+-------------+---------------------------------------+
  60. | employees | empm | PRIMARY | 2024-12-18 17:15:23 | n_diff_pfx01 | 199422 | 20 | emp_no |
  61. | employees | empm | PRIMARY | 2024-12-18 17:15:23 | n_leaf_pages | 691 | NULL | Number of leaf pages in the index |
  62. | employees | empm | PRIMARY | 2024-12-18 17:15:23 | size | 800 | NULL | Number of pages in the index |
  63. | employees | empm | idx1_empm | 2024-12-18 17:15:23 | n_diff_pfx01 | 1334 | 20 | first_name |
  64. | employees | empm | idx1_empm | 2024-12-18 17:15:23 | n_diff_pfx02 | 189886 | 20 | first_name,emp_no |
  65. | employees | empm | idx1_empm | 2024-12-18 17:15:23 | n_leaf_pages | 203 | NULL | Number of leaf pages in the index |
  66. | employees | empm | idx1_empm | 2024-12-18 17:15:23 | size | 288 | NULL | Number of pages in the index |
  67. | employees | empm | idx2_empm | 2024-12-18 17:15:23 | n_diff_pfx01 | 1299 | 20 | first_name |
  68. | employees | empm | idx2_empm | 2024-12-18 17:15:23 | n_diff_pfx02 | 143264 | 20 | first_name,last_name |
  69. | employees | empm | idx2_empm | 2024-12-18 17:15:23 | n_diff_pfx03 | 200569 | 20 | first_name,last_name,hire_date |
  70. | employees | empm | idx2_empm | 2024-12-18 17:15:23 | n_diff_pfx04 | 199408 | 20 | first_name,last_name,hire_date,emp_no |
  71. | employees | empm | idx2_empm | 2024-12-18 17:15:23 | n_leaf_pages | 352 | NULL | Number of leaf pages in the index |
  72. | employees | empm | idx2_empm | 2024-12-18 17:15:23 | size | 417 | NULL | Number of pages in the index |
  73. | employees | empm | idx3_empm | 2024-12-18 17:15:23 | n_diff_pfx01 | 1269 | 20 | first_name |
  74. | employees | empm | idx3_empm | 2024-12-18 17:15:23 | n_diff_pfx02 | 156396 | 20 | first_name,hire_date |
  75. | employees | empm | idx3_empm | 2024-12-18 17:15:23 | n_diff_pfx03 | 199752 | 20 | first_name,hire_date,emp_no |
  76. | employees | empm | idx3_empm | 2024-12-18 17:15:23 | n_leaf_pages | 240 | NULL | Number of leaf pages in the index |
  77. | employees | empm | idx3_empm | 2024-12-18 17:15:23 | size | 289 | NULL | Number of pages in the index |
  78. | employees | empm | idx_empm | 2024-12-18 17:15:23 | n_diff_pfx01 | 1260 | 20 | first_name |
  79. | employees | empm | idx_empm | 2024-12-18 17:15:23 | n_diff_pfx02 | 147521 | 20 | first_name,last_name |
  80. | employees | empm | idx_empm | 2024-12-18 17:15:23 | n_diff_pfx03 | 207457 | 20 | first_name,last_name,emp_no |
  81. | employees | empm | idx_empm | 2024-12-18 17:15:23 | n_leaf_pages | 342 | NULL | Number of leaf pages in the index |
  82. | employees | empm | idx_empm | 2024-12-18 17:15:23 | size | 466 | NULL | Number of pages in the index |
  83. +---------------+------------+------------+---------------------+--------------+------------+-------------+---------------------------------------+
  84. 23 rows in set (0.00 sec)


问题如下
SQL1竟然默认走idx_empm(first_name,last_name),那么order by emp_no desc这部分就走不了索引,因为没有last_name。
应该走SQL2的所有idx1_empm(first_name)才对,因为所以默认包含主键emp_no,前导列等值,可以利用reverse扫描,消除排序。


从估算的cost看,两条SQL cost一样,都是cost=10291.95,所以按照索引谁先创建的走谁。 idx_empm先于idx1_emp创建,所以走idx_empm,
为什么cost一样呢?因为比较访问路径时候不考虑排序,可以从optimizer trace看出,这个和limit分页里丢掉order by 列是一样的问题,是BUG
###############
这个BUG主要是order by pk或order by pk desc上,在索引访问路径选择的时候没有考虑cost,然后计算order by的时候丢掉了排序列,有undefined。


单表,使用index dive计算选择率。


###########
mysql不计算访问路径不考虑排序列,等到排序优化,看能不能通过索引消除排序,直接启发式选择。

点击(此处)折叠或打开

  1. SQL1:
  2. explain analyze select * from empm where first_name = 'Aamer' order by emp_no desc\G
  3. *************************** 1. row ***************************
  4. EXPLAIN: -> Sort: empm.emp_no DESC (cost=10291.95 rows=96912) (actual time=136.452..143.969 rows=50106 loops=1)
  5.     -> Index lookup on empm using idx_empm (first_name='Aamer') (actual time=0.070..113.065 rows=50106 loops=1)

  6. 1 row in set (0.15 sec)

  7. SQL2:
  8. explain analyze select/*+index(empm idx1_empm)*/ * from empm where first_name = 'Aamer' order by emp_no desc\G
  9. *************************** 1. row ***************************
  10. EXPLAIN: -> Index lookup on empm using idx1_empm (first_name='Aamer') (reverse) (cost=10291.95 rows=96912) (actual time=1.158..85.386 rows=50106 loops=1)

  11. 1 row in set (0.10 sec)

看看trace:
可以看到,因为先比较访问路径时候,没有考虑排序成本,然后idx_empm先计算的,和idx1_empm cost一样,所以选择了idx_empm,
{BANNED}最佳后考虑索引排序,发现排序字段是undefined,中间过程丢掉了,这个是mysql源代码部分丢掉了中间过程,是代码BUG。

这个BUG主要是order by pk或order by pk desc上,order by非主键不会有这个问题


点击(此处)折叠或打开

  1. set optimizer_trace=1;
  2. explain analyze select * from empm where first_name = 'Aamer' order by emp_no desc\G
  3. select * from information_schema.optimizer_trace\G


  4.  {
  5.             "rows_estimation": [
  6.               {
  7.                 "table": "`empm`",
  8.                 "range_analysis": {
  9.                   "table_scan": {
  10.                     "rows": 193825,
  11.                     "cost": 19584.8 --- 全表扫描cost
  12.                   },
  13.                   "potential_range_indexes": [
  14.                     {
  15.                       "index": "PRIMARY",
  16.                       "usable": false,
  17.                       "cause": "not_applicable"
  18.                     },
  19.                     {
  20.                       "index": "idx_empm",
  21.                       "usable": true,
  22.                       "key_parts": [
  23.                         "first_name",
  24.                         "last_name",
  25.                         "emp_no"
  26.                       ]
  27.                     },
  28.                     {
  29.                       "index": "idx1_empm",
  30.                       "usable": true,
  31.                       "key_parts": [
  32.                         "first_name",
  33.                         "emp_no"
  34.                       ]
  35.                     },
  36.                     {
  37.                       "index": "idx2_empm",
  38.                       "usable": true,
  39.                       "key_parts": [
  40.                         "first_name",
  41.                         "hire_date",
  42.                         "emp_no"
  43.                       ]
  44.                     }
  45.                   ],
  46.                   "setup_range_conditions": [
  47.                   ],
  48.                   "group_index_range": {
  49.                     "chosen": false,
  50.                     "cause": "not_group_by_or_distinct"
  51.                   },
  52.                   "skip_scan_range": {
  53.                     "potential_skip_scan_indexes": [
  54.                       {
  55.                         "index": "idx_empm",
  56.                         "usable": false,
  57.                         "cause": "query_references_nonkey_column"
  58.                       },
  59.                       {
  60.                         "index": "idx1_empm",
  61.                         "usable": false,
  62.                         "cause": "query_references_nonkey_column"
  63.                       },
  64.                       {
  65.                         "index": "idx2_empm",
  66.                         "usable": false,
  67.                         "cause": "query_references_nonkey_column"
  68.                       }
  69.                     ]
  70.                   },
  71.                   "analyzing_range_alternatives": {
  72.                     "range_scan_alternatives": [
  73.                       {
  74.                         "index": "idx_empm",
  75.                         "ranges": [
  76.                           "first_name = 'Aamer'"
  77.                         ],
  78.                         "index_dives_for_eq_ranges": true,
  79.                         "rowid_ordered": false,
  80.                         "using_mrr": false,
  81.                         "index_only": false,
  82.                         "in_memory": 1,
  83.                         "rows": 96912,
  84.                         "cost": 33919.5, -- 索引扫描cost >全表扫描的cost
  85.                         "chosen": false, --不选择
  86.                         "cause": "cost"
  87.                       },
  88.                       {
  89.                         "index": "idx1_empm",
  90.                         "ranges": [
  91.                           "first_name = 'Aamer'"
  92.                         ],
  93.                         "index_dives_for_eq_ranges": true,
  94.                         "rowid_ordered": true,
  95.                         "using_mrr": false,
  96.                         "index_only": false,
  97.                         "in_memory": 0.349754,
  98.                         "rows": 96912,
  99.                         "cost": 33919.5, -- 索引扫描cost >全表扫描的cost
  100.                         "chosen": false, --不选择
  101.                         "cause": "cost"
  102.                       },
  103.                       {
  104.                         "index": "idx2_empm",
  105.                         "ranges": [
  106.                           "first_name = 'Aamer'"
  107.                         ],
  108.                         "index_dives_for_eq_ranges": true,
  109.                         "rowid_ordered": false,
  110.                         "using_mrr": false,
  111.                         "index_only": false,
  112.                         "in_memory": 0,
  113.                         "rows": 96912,
  114.                         "cost": 33919.5,
  115.                         "chosen": false,
  116.                         "cause": "cost"
  117.                       }
  118.                     ],
  119.                     "analyzing_roworder_intersect": {
  120.                       "usable": false, -- range scan没有被选择
  121.                       "cause": "too_few_roworder_scans"
  122.                     }
  123.                   }
  124.                 }
  125.               }
  126.             ]
  127.           },
  128.           {
  129.             "considered_execution_plans": [
  130.               {
  131.                 "plan_prefix": [
  132.                 ],
  133.                 "table": "`empm`",
  134.                 "best_access_path": {
  135.                   "considered_access_paths": [
  136.                     {
  137.                       "access_type": "ref", --- 计算ref
  138.                       "index": "idx_empm",
  139.                       "rows": 96912,
  140.                       "cost": 10292, -- 先计算的idx_empm,cost=10292,比全表扫描的小,所以走idx_empm的ref扫描
  141.                       "chosen": true -- 选择这个访问路径,问题是排序成本没有加上去
  142.                     },
  143.                     {
  144.                       "access_type": "ref",
  145.                       "index": "idx1_empm",
  146.                       "rows": 96912,
  147.                       "cost": 10292, -- idx1_empm的ref扫描,cost和idx_empm一样,所以不用,这里其实因为idx_empm ref,
  148.                       "chosen": false --问题是排序成本没有加上去
  149.                     },
  150.                     {
  151.                       "access_type": "ref",
  152.                       "index": "idx2_empm",
  153.                       "rows": 96912,
  154.                       "cost": 10292,
  155.                       "chosen": false
  156.                     },
  157.                     {
  158.                       "rows_to_scan": 193825,
  159.                       "filtering_effect": [
  160.                       ],
  161.                       "final_filtering_effect": 0.499997,
  162.                       "access_type": "scan",
  163.                       "resulting_rows": 96912,
  164.                       "cost": 19582.8,
  165.                       "chosen": false
  166.                     }
  167.                   ]
  168.                 },
  169.                 "condition_filtering_pct": 100,
  170.                 "rows_for_plan": 96912,
  171.                 "cost_for_plan": 10292,
  172.                 "chosen": true
  173.               }
  174.             ]
  175.           },
  176.           {
  177.             "attaching_conditions_to_tables": {
  178.               "original_condition": "(`empm`.`first_name` = 'Aamer')",
  179.               "attached_conditions_computation": [
  180.               ],
  181.               "attached_conditions_summary": [
  182.                 {
  183.                   "table": "`empm`",
  184.                   "attached": "(`empm`.`first_name` = 'Aamer')"
  185.                 }
  186.               ]
  187.             }
  188.           },
  189.           {
  190.             "optimizing_distinct_group_by_order_by": {
  191.               "simplifying_order_by": {
  192.                 "original_clause": "`empm`.`emp_no` desc",
  193.                 "items": [
  194.                   {
  195.                     "item": "`empm`.`emp_no`"
  196.                   }
  197.                 ],
  198.                 "resulting_clause_is_simple": true,
  199.                 "resulting_clause": "`empm`.`emp_no` desc"
  200.               }
  201.             }
  202.           },
  203.           {
  204.             "reconsidering_access_paths_for_index_ordering": { --- 使用索引的排序。。。
  205.               "clause": "ORDER BY",
  206.               "steps": [
  207.               ],
  208.               "index_order_summary": {
  209.                 "table": "`empm`",
  210.                 "index_provides_order": false,
  211.                 "order_direction": "undefined", -- 发现这部分是undefined,中间order by部分丢掉了,所以这个索引cost计算丢掉了排序成本
  212.                 "index": "idx_empm",
  213.                 "plan_changed": false
  214.               }
  215.             }
  216.           },
  217.           {
  218.             "finalizing_table_conditions": [
  219.               {
  220.                 "table": "`empm`",
  221.                 "original_table_condition": "(`empm`.`first_name` = 'Aamer')",
  222.                 "final_table_condition ": null
  223.               }
  224.             ]
  225.           },
  226.           {
  227.             "refine_plan": [
  228.               {
  229.                 "table": "`empm`"
  230.               }
  231.             ]
  232.           },
  233.           {
  234.             "considering_tmp_tables": [
  235.               {
  236.                 "adding_sort_to_table": "empm"
  237.               }
  238.             ]
  239.           }
  240.         ]
  241.       }
  242.     },
  243.     {
  244.       "join_execution": {
  245.         "select#": 1,
  246.         "steps": [
  247.           {
  248.             "sorting_table": "empm",
  249.             "filesort_information": [
  250.               {
  251.                 "direction": "desc",
  252.                 "expression": "`empm`.`emp_no`"
  253.               }
  254.             ],
  255.             "filesort_priority_queue_optimization": {
  256.               "usable": false,
  257.               "cause": "not applicable (no LIMIT)"
  258.             },
  259.             "filesort_execution": [
  260.             ],
  261.             "filesort_summary": {
  262.               "memory_available": 262144,
  263.               "key_size": 8,
  264.               "row_size": 677,
  265.               "max_rows_per_buffer": 387,
  266.               "num_rows_estimate": 18446744073709551615,
  267.               "num_rows_found": 50106,
  268.               "num_initial_chunks_spilled_to_disk": 12,
  269.               "peak_memory_used": 294912,
  270.               "sort_algorithm": "std::stable_sort",
  271.               "sort_mode": ""
  272.             }
  273.           }
  274.         ]
  275.       }
  276.     }
  277.   ]
  278. }





阅读(40) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~