Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1364941
  • 博文数量: 172
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 3831
  • 用 户 组: 普通用户
  • 注册时间: 2011-03-30 13:00
个人简介

About me:Oracle ACE pro,optimistic,passionate and harmonious. Focus on ORACLE,MySQL and other database programming,peformance tuning,db design, j2ee,Linux/AIX,Architecture tech,etc

文章分类

全部博文(172)

文章存档

2024年(27)

2023年(28)

2022年(43)

2020年(62)

2014年(3)

2013年(9)

分类: Oracle

2024-05-17 16:46:26

接:ORACLE CBO 默认选择率及其问题解决方案_PART1

本节研究下列无统计信息情况下的默认选择率。

2.列无统计信息情况下的选择率计算


列无统计信息,常发生在统计信息自动采集关闭,手动或定时收集统计信息时,采用:

点击(此处)折叠或打开

  1. exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'sel_test',estimate_percent=>10,-
  2. method_opt=>'for all indexed columns size skewonly',no_invalidate=>false);

也即只对索引列收集统计信息其它列未收集统计信息,这个主要是DBA对统计信息了解的不够,
认为只要收集索引列统计信息即可,
其实非索引列统计信息也很重要,它的选择性影响过滤条件计算返回的cardinality,从而影响join method和join order的选择。


所以收集统计信息还是要把索引列,非索引列统计信息都要收集,从12c开始,建议estimate_percent用默认的,才可能收集
hybirid histogram和top n histogram。

如果一个表已经生产使用过,method_opt也可以考虑使用默认值,会根据column_usage$自动决定是否收集直方图,使用默认值的可以去掉estimate_percent和method_opt也可以手动写上。

当然如果表很大,需要控制收集比例,也可以写estimate_percent,但是不会收集hybirid histogram和top n histogram。

建议收集语句如下:

点击(此处)折叠或打开

  1. exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'sel_test',estimate_percent=>dbms_stats.auto_sample_size,-
  2. method_opt=>'for all columns size auto',no_invalidate=>false);

如果分区表,建议增加partname,避免无数据的分区num_rows=0.

另外method_opt支持多个条件混合收集,如下:

点击(此处)折叠或打开

  1. FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
  2. FOR COLUMNS [size clause] column [size_clause] [,column [size_clause]...]
比如:

点击(此处)折叠或打开

  1. exec DBMS_STATS.GATHER_TABLE_STATS(ownname=>'xxx',tabname=>'T',estimate_percent=>10,-
  2. method_opt=>'FOR ALL INDEXED COLUMNS SIZE SKEWONLY FOR COLUMNS SIZE 1 OBJECT_NAME,OBJECT_TYPE,CREATED,-
  3. LAST_DDL_TIME SIZE 254',no_invalidate=>false,cascade=>true,degree => 10);
for all indexed columns size skewonly for all columns size 1这种不行,只会收集索引列。
如果所有列有倾斜的都要收集直方图,不考虑col_usage$,可以用
method_opt=>'for all columns size skewonly'

列无统计信息,就不考虑1/num_distinct,density,num_nulls等,
比较的值不管是否是具体值,都直接按照对应的默认选择率计算,因为列无统计信息,
无法按照选择率计算公式计算选择率,

其实和有统计信息,绑定变量无具体值类似(除了等值有索引是0.004)。

1)列无统计信息,有索引且条件是等值,走索引范围扫描 ix_sel=0.004,ix_sel_with_filters=0.01

先删除object_name列统计信息

点击(此处)折叠或打开

  1. exec dbms_stats.delete_column_stats(ownname=>user,tabname=>'sel_test',colname=>'object_name',no_invalidate=>false);

  2. select column_name,num_distinct,num_nulls,density,histogram
  3. from dba_tab_col_statistics
  4. where table_name='SEL_TEST'
  5. and column_name='OBJECT_NAME';
  6. no rows selected

  7. Elapsed: 00:00:00.25
现在object_name列无统计信息,不知道列的num_distinct和density,num_nulls,按照前面的公式就不行了,这时候就要采用
默认选择率。

列有索引且是等值,因为列无统计信息,
则不能按照1/num_distinct等计算,默认选择率0.004,回表也不是按照0.05计算,而是按照0.01
计算。
有没有具体值都一样。
这个默认选择率还是很小的,偏向于走索引。

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name = :var;

  4. select * from table(dbms_xplan.display);


  5. PLAN_TABLE_OUTPUT
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. Plan hash value: 310337252

  8. ----------------------------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. ----------------------------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 728 | 96096 | 149 (0)| 00:00:01 |
  12. | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| SEL_TEST | 728 | 96096 | 149 (0)| 00:00:01 |
  13. |* 2 | INDEX RANGE SCAN | IDX_SEL_TEST | 291 | | 3 (0)| 00:00:01 |
  14. ----------------------------------------------------------------------------------------------------

  15. Predicate Information (identified by operation id):
  16. ---------------------------------------------------

  17.    2 - access("OBJECT_NAME"=:VAR)

对应ix_sel=291/72800=.003997253=0.004
对应回表选择率sel=728/72800=0.01

传入具体值一样:

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name = 'TA';

  4. select * from table(dbms_xplan.display);
  5. PLAN_TABLE_OUTPUT
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. Plan hash value: 310337252

  8. ----------------------------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. ----------------------------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 728 | 96096 | 149 (0)| 00:00:01 |
  12. | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| SEL_TEST | 728 | 96096 | 149 (0)| 00:00:01 |
  13. |* 2 | INDEX RANGE SCAN | IDX_SEL_TEST | 291 | | 3 (0)| 00:00:01 |
  14. ----------------------------------------------------------------------------------------------------

  15. Predicate Information (identified by operation id):
  16. ---------------------------------------------------

  17.    2 - access("OBJECT_NAME"='TA')

2)等值无统计信息,列无索引列默认选择率sel=0.01,包括绑定变量无具体值,函数条件的等值查询,不管有没有统计信息都是1%。


通过1)计算的回表选择率是0.01也可以看出,因为回表利用的就是无索引选择率,删除object_name索引测试:
有无具体值,选择率一样。

点击(此处)折叠或打开

  1. drop index idx_sel_test;

  2. explain plan for
  3. select * from sel_test
  4. where object_name = :var;

  5. select * from table(dbms_xplan.display);

  6. PLAN_TABLE_OUTPUT
  7. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  8. Plan hash value: 747827318

  9. ------------------------------------------------------------------------------
  10. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  11. ------------------------------------------------------------------------------
  12. | 0 | SELECT STATEMENT | | 728 | 96096 | 395 (1)| 00:00:01 |
  13. |* 1 | TABLE ACCESS FULL| SEL_TEST | 728 | 96096 | 395 (1)| 00:00:01 |
  14. ------------------------------------------------------------------------------

  15. Predicate Information (identified by operation id):
  16. ---------------------------------------------------

  17.    1 - filter("OBJECT_NAME"=:VAR)
很显然列无统计信息,且无索引默认选择率是1%= 728/72800=0.01.

传入具体值一样:

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name = 'TA';

  4. select * from table(dbms_xplan.display);

  5. PLAN_TABLE_OUTPUT
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. Plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. ------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 728 | 96096 | 395 (1)| 00:00:01 |
  12. |* 1 | TABLE ACCESS FULL| SEL_TEST | 728 | 96096 | 395 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. Predicate Information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter("OBJECT_NAME"='TA')
函数等值和前面有统计信息的一样,都是1%。

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where SUBSTR(object_name,1,1) = 'TA';
  4. select * from table(dbms_xplan.display);

  5. PLAN_TABLE_OUTPUT
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. Plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. ------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 728 | 96096 | 396 (1)| 00:00:01 |
  12. |* 1 | TABLE ACCESS FULL| SEL_TEST | 728 | 96096 | 396 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. Predicate Information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter(SUBSTR("OBJECT_NAME",1,1)='TA')

有没有注意到SQL条件:SUBSTR(object_name,1,1) = 'TA',因为substr条件只匹配一个字符,而值是2个字符,
实际上,这个条件是impossible的,但是这里CBO却没有考虑这些,这就是既定规则的默认选择率,所以,默认选择率很多时候是不准确的。

3)列无统计信息,对于非索引列范围查询,和有统计信息无具体绑定变量值一样,非索引列
(包括函数)是5%


object_name列索引已经删除:


点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name like :var;

  4. select * from table(dbms_xplan.display);
  5. PLAN_TABLE_OUTPUT
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. Plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. ------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 3640 | 469K| 395 (1)| 00:00:01 |
  12. |* 1 | TABLE ACCESS FULL| SEL_TEST | 3640 | 469K| 395 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. Predicate Information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter("OBJECT_NAME" LIKE :VAR)
函数范围也是5%,不管是否有具体值:


点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where substr(object_name,1,1) like :var;


  4. PLAN_TABLE_OUTPUT
  5. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  6. Plan hash value: 747827318

  7. ------------------------------------------------------------------------------
  8. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  9. ------------------------------------------------------------------------------
  10. | 0 | SELECT STATEMENT | | 3640 | 469K| 396 (1)| 00:00:01 |
  11. |* 1 | TABLE ACCESS FULL| SEL_TEST | 3640 | 469K| 396 (1)| 00:00:01 |
  12. ------------------------------------------------------------------------------

  13. Predicate Information (identified by operation id):
  14. ---------------------------------------------------

  15.    1 - filter(SUBSTR("OBJECT_NAME",1,1) LIKE :VAR)

4)列无统计信息,范围条件走索引的选择率是0.009,回表是0.05,和索引列有统计信息,使用绑定变量无具体值的范围查询一样。


将object_name索引加上:

点击(此处)折叠或打开

  1. create index idx_sel_test on sel_test(object_name);

SQL如下:

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name like :var;

  4. select * from table(dbms_xplan.display);

有索引列的范围查询ix_sel=655/72800=0.009,回表选择率=3640/72800=0.05:

点击(此处)折叠或打开

  1. PLAN_TABLE_OUTPUT
  2. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  3. Plan hash value: 310337252

  4. ----------------------------------------------------------------------------------------------------
  5. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  6. ----------------------------------------------------------------------------------------------------
  7. | 0 | SELECT STATEMENT | | 3640 | 469K| 335 (0)| 00:00:01 |
  8. | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| SEL_TEST | 3640 | 469K| 335 (0)| 00:00:01 |
  9. |* 2 | INDEX RANGE SCAN | IDX_SEL_TEST | 655 | | 7 (0)| 00:00:01 |
  10. ----------------------------------------------------------------------------------------------------

  11. Predicate Information (identified by operation id):
  12. ---------------------------------------------------

  13.    2 - access("OBJECT_NAME" LIKE :VAR)
  14.        filter("OBJECT_NAME" LIKE :VAR)

5)列无统计信息,<>条件选择率是5%,和列有统计信息按照1-sel(等值)的不一样。


注意这里<>的选择率不是1-sel(=)的,而是默认5%,和列有统计信息按照1-sel(等值)的不一样。


点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name <> 'A';
  4. select * from table(dbms_xplan.display);

  5. PLAN_TABLE_OUTPUT
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. Plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. ------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 3640 | 469K| 395 (1)| 00:00:01 |
  12. |* 1 | TABLE ACCESS FULL| SEL_TEST | 3640 | 469K| 395 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. Predicate Information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter("OBJECT_NAME"<>'A')

列OBJECT_NAME无统计信息,<>的选择率=5%
   
如果OBJECT_NAME有统计信息,则<>的选择率=1-sel(object_name=),返回行=round(72800*(1-1/60896))=72799.

收集统计信息:

点击(此处)折叠或打开

  1. exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'sel_test',estimate_percent=>dbms_stats.auto_sample_size,-
  2. method_opt=>'for all columns size auto',no_invalidate=>false);

列有统计信息则按照1-sel(等值)计算:

点击(此处)折叠或打开

  1. PLAN_TABLE_OUTPUT
  2. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  3. Plan hash value: 747827318

  4. ------------------------------------------------------------------------------
  5. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  6. ------------------------------------------------------------------------------
  7. | 0 | SELECT STATEMENT | | 72799 | 9384K| 398 (2)| 00:00:01 |
  8. |* 1 | TABLE ACCESS FULL| SEL_TEST | 72799 | 9384K| 398 (2)| 00:00:01 |
  9. ------------------------------------------------------------------------------

  10. Predicate Information (identified by operation id):
  11. ---------------------------------------------------

  12.    1 - filter("OBJECT_NAME"<>'A')

下一篇:ORACLE CBO 默认选择率及其问题解决方案_PART3

阅读(592) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~