Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1390392
  • 博文数量: 172
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 3831
  • 用 户 组: 普通用户
  • 注册时间: 2011-03-30 13:00
个人简介

About me:Oracle ACE pro,optimistic,passionate and harmonious. Focus on ORACLE,MySQL and other database programming,peformance tuning,db design, j2ee,Linux/AIX,Architecture tech,etc

文章分类

全部博文(172)

文章存档

2024年(27)

2023年(28)

2022年(43)

2020年(62)

2014年(3)

2013年(9)

分类: Oracle

2024-04-29 09:14:53

一般情况下LIKE查询有三种情况:
1)%后通配可以走INDEX RANGE SCAM,
2)%前通配不能走INDEX RANGE SCAN,如果数据不能全部从索引里获取,还需要回表,则走全表扫描
3)%前通配不能走INDEX RANGE SCAN,如果数据全部从索引里获取可以走INDEX FAST FULL SCAN


数据库版本:ORACLE 19C
实际上本文问题至少从ORACLE 11G开始持续到19C一直存在。

建表语句如下:

点击(此处)折叠或打开

  1. drop table bind_test;
  2. create table bind_test as select * from dba_objects;
  3. create index idx_bind_test on bind_test(object_name);
  4. exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'bind_test',no_invalidate=>false);

1.正常情况下的LIKE访问路径:

1)like后通配+非绑定变量,走INDEX RANGE SCAN,正常

点击(此处)折叠或打开

  1. select *
  2. from bind_test
  3. where object_name like 'TA%';

  4. 45 rows selected.

  5. Elapsed: 00:00:00.00

  6. Execution Plan
  7. ----------------------------------------------------------
  8. Plan hash value: 293636637

  9. -----------------------------------------------------------------------------------------------------
  10. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  11. -----------------------------------------------------------------------------------------------------
  12. | 0 | SELECT STATEMENT | | 7 | 924 | 7 (0)| 00:00:01 |
  13. | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| BIND_TEST | 7 | 924 | 7 (0)| 00:00:01 |
  14. |* 2 | INDEX RANGE SCAN | IDX_BIND_TEST | 7 | | 3 (0)| 00:00:01 |
  15. -----------------------------------------------------------------------------------------------------

  16. Predicate Information (identified by operation id):
  17. ---------------------------------------------------

  18.    2 - access("OBJECT_NAME" LIKE 'TA%')
  19.        filter("OBJECT_NAME" LIKE 'TA%')


  20. Statistics
  21. ----------------------------------------------------------
  22.           0 recursive calls
  23.           0 db block gets
  24.          37 consistent gets
  25.           0 physical reads
  26.           0 redo size
  27.        8468 bytes sent via SQL*Net to client
  28.         436 bytes received via SQL*Net from client
  29.           4 SQL*Net roundtrips to/from client
  30.           0 sorts (memory)
  31.           0 sorts (disk)
  32.          45 rows processed
2)like前通配+非绑定变量,走全表扫描,正常

点击(此处)折叠或打开

  1. select *
  2. from bind_test
  3. where object_name like '%TA';

  4. Elapsed: 00:00:00.02

  5. Execution Plan
  6. ----------------------------------------------------------
  7. Plan hash value: 1250996780

  8. -------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. -------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 3638 | 468K| 395 (1)| 00:00:01 |
  12. |* 1 | TABLE ACCESS FULL| BIND_TEST | 3638 | 468K| 395 (1)| 00:00:01 |
  13. -------------------------------------------------------------------------------

  14. Predicate Information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter("OBJECT_NAME" LIKE '%TA' AND "OBJECT_NAME" IS NOT NULL)


  17. Statistics
  18. ----------------------------------------------------------
  19.           1 recursive calls
  20.           0 db block gets
  21.        1429 consistent gets
  22.           0 physical reads
  23.           0 redo size
  24.       17156 bytes sent via SQL*Net to client
  25.         546 bytes received via SQL*Net from client
  26.          14 SQL*Net roundtrips to/from client
  27.           0 sorts (memory)
  28.           0 sorts (disk)
  29.         194 rows processed

3)like前通配+非绑定变量,如果不需要回表,数据全部在索引里获取,则走INDEX FAST FULL SCAN,正常

点击(此处)折叠或打开

  1. select object_name
  2. from bind_test
  3. where object_name like '%TA';

  4. Elapsed: 00:00:00.02

  5. Execution Plan
  6. ----------------------------------------------------------
  7. Plan hash value: 2253044292

  8. --------------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. --------------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 3638 | 124K| 128 (1)| 00:00:01 |
  12. |* 1 | INDEX FAST FULL SCAN| IDX_BIND_TEST | 3638 | 124K| 128 (1)| 00:00:01 |
  13. --------------------------------------------------------------------------------------

  14. Predicate Information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter("OBJECT_NAME" LIKE '%TA' AND "OBJECT_NAME" IS NOT NULL)


  17. Statistics
  18. ----------------------------------------------------------
  19.           1 recursive calls
  20.           0 db block gets
  21.         485 consistent gets
  22.           8 physical reads
  23.           0 redo size
  24.        6708 bytes sent via SQL*Net to client
  25.         556 bytes received via SQL*Net from client
  26.          14 SQL*Net roundtrips to/from client
  27.           0 sorts (memory)
  28.           0 sorts (disk)
  29.         194 rows processed

2.like+绑定变量关闭peeking有问题

如果绑定变量peeking关闭有BUG,不管是前通配或后通配,都走INDEX RANGE SCAN
如果绑定变量值是后通配的问题不大,如果是前通配的,会导致严重性能问题
绑定变量peeking打开没有问题。
11g到19c都一样。

绑定变量peeking打开正常:
1)peeking打开,非前通配,正常走INDEX RANGE SCAN

点击(此处)折叠或打开

  1. var objname varchar2(100);
  2. exec :objname := 'TA%';
  3. set serveroutput off
  4. alter session set statistics_level=all;
  5. select *
  6. from bind_test
  7. where object_name like :objname;
  8. select * from table(dbms_xplan.display_cursor(null,null,'allstats last +peeked_binds'));

  9. PLAN_TABLE_OUTPUT
  10. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  11. SQL_ID dmvrs44wk5w89, child number 0
  12. -------------------------------------
  13. select * from bind_test where object_name like :objname

  14. Plan hash value: 293636637

  15. ---------------------------------------------------------------------------------------------------------------
  16. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
  17. ---------------------------------------------------------------------------------------------------------------
  18. | 0 | SELECT STATEMENT | | 1 | | 45 |00:00:00.01 | 37 |
  19. | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| BIND_TEST | 1 | 7 | 45 |00:00:00.01 | 37 |
  20. |* 2 | INDEX RANGE SCAN | IDX_BIND_TEST | 1 | 7 | 45 |00:00:00.01 | 6 |
  21. ---------------------------------------------------------------------------------------------------------------

  22. Peeked Binds (identified by position):
  23. --------------------------------------

  24.    1 - :1 (VARCHAR2(30), CSID=873): 'TA%'

  25. Predicate Information (identified by operation id):
  26. ---------------------------------------------------

  27.    2 - access("OBJECT_NAME" LIKE :OBJNAME)
  28.        filter("OBJECT_NAME" LIKE :OBJNAME)
2)peeking打开,前通配走TABLE ACCESS FULL正常

点击(此处)折叠或打开

  1. select address,hash_value from v$sql where sql_id='dmvrs44wk5w89';
  2. ADDRESS HASH_VALUE
  3. ---------------- ----------
  4. 00007FFE7D7138B8 958591241

  5. 先purge cursor,以防cursor共享导致演示不准:
  6. exec sys.dbms_shared_pool.purge('00007FFE7D7138B8,958591241','C');


  7. var objname varchar2(100);
  8. exec :objname := '%TA';
  9. set serveroutput off
  10. alter session set statistics_level=all;
  11. select *
  12. from bind_test
  13. where object_name like :objname;
  14. select * from table(dbms_xplan.display_cursor(null,null,'allstats last +peeked_binds'));


  15. PLAN_TABLE_OUTPUT
  16. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  17. SQL_ID dmvrs44wk5w89, child number 0
  18. -------------------------------------
  19. select * from bind_test where object_name like :objname

  20. Plan hash value: 1250996780

  21. -----------------------------------------------------------------------------------------
  22. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
  23. -----------------------------------------------------------------------------------------
  24. | 0 | SELECT STATEMENT | | 1 | | 194 |00:00:00.01 | 1429 |
  25. |* 1 | TABLE ACCESS FULL| BIND_TEST | 1 | 3638 | 194 |00:00:00.01 | 1429 |
  26. -----------------------------------------------------------------------------------------

  27. Peeked Binds (identified by position):
  28. --------------------------------------

  29.    1 - :1 (VARCHAR2(30), CSID=873): '%TA'

  30. Predicate Information (identified by operation id):
  31. ---------------------------------------------------

  32.    1 - filter("OBJECT_NAME" LIKE :OBJNAME)

3.绑定变量peeking关闭,like :bind 或like '%'||:bind形式的都走INDEX RANGE SCAN
前面的好理解,因为peeking关闭,like :bind CBO可能把绑定变量当成不带%的计算选择率,这样一般选择率density,density一般很小,走索引
(事实上不是这样,是和like '%'||:bind选择率计算方式一样)。

后面的是like '%'||:bind,这种显式写了前通配,因为peeking关闭导致走INDEX RANGE SCAN,可以认为是BUG(这个BUG貌似一直存在),因为
这种走索引,肯定是从索引{BANNED}最佳左侧扫描全部索引树,除非有rownum<...可能只需要前面少量leaf block。

关闭peeking来演示:
 alter session set "_optim_peek_user_binds"=false;

1)绑定变量peeking关闭,传入的值是前通配,也走INDEX RANGE SCAN。


点击(此处)折叠或打开

  1. var objname varchar2(100);
  2. exec :objname := '%TA';
  3. set serveroutput off
  4. alter session set statistics_level=all;
  5. select *
  6. from bind_test
  7. where object_name like :objname;
  8. select * from table(dbms_xplan.display_cursor(null,null,'allstats last +peeked_binds'));

  9. dingjun123@ORCLPDB> select * from table(dbms_xplan.display_cursor(null,null,'allstats last +peeked_binds'));

  10. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  11. SQL_ID dmvrs44wk5w89, child number 1
  12. -------------------------------------
  13. select * from bind_test where object_name like :objname

  14. Plan hash value: 4236497223

  15. -------------------------------------------------------------------------------------------------------
  16. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
  17. -------------------------------------------------------------------------------------------------------
  18. | 0 | SELECT STATEMENT | | 1 | | 221 |00:00:00.05 | 542 |
  19. | 1 | TABLE ACCESS BY INDEX ROWID| BIND_TEST | 1 | 3840 | 221 |00:00:00.05 | 542 |
  20. |* 2 | INDEX RANGE SCAN | IDX_BIND_TEST | 1 | 691 | 221 |00:00:00.05 | 396 |
  21. -------------------------------------------------------------------------------------------------------

  22. Predicate Information (identified by operation id):
  23. ---------------------------------------------------

  24.    2 - access("OBJECT_NAME" LIKE :OBJNAME)
  25.        filter("OBJECT_NAME" LIKE :OBJNAME)
这种E-Row计算方式和2)like '%'||:bind一样,见下面的分析,与猜想的按照类似等值查询不一样,如果按照等值查询返回行是:
num_rows*density=76807*.00002114=2行。

2)绑定变量peeking关闭,显式编码前通配,使用like '%'||:bind形式,这种按理是和peeking无关的,走不了INDEX RANGE SCAN,
但是执行计划缺走了INDEX RANGE SCAN,这是BUG,可能导致严重性能问题

点击(此处)折叠或打开

  1. var objname varchar2(100);
  2. exec :objname := 'TA';
  3. set serveroutput off
  4. alter session set statistics_level=all;
  5. select *
  6. from bind_test
  7. where object_name like '%'||:objname||'%';
  8. select * from table(dbms_xplan.display_cursor(null,null,'allstats last +peeked_binds'));


  9. PLAN_TABLE_OUTPUT
  10. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  11. SQL_ID bg9sh13chmu99, child number 1
  12. -------------------------------------
  13. select * from bind_test where object_name like '%'||:objname||'%'

  14. Plan hash value: 4236497223

  15. -------------------------------------------------------------------------------------------------------
  16. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
  17. -------------------------------------------------------------------------------------------------------
  18. | 0 | SELECT STATEMENT | | 1 | | 4337 |00:00:00.07 | 3066 |
  19. | 1 | TABLE ACCESS BY INDEX ROWID| BIND_TEST | 1 | 3840 | 4337 |00:00:00.07 | 3066 |
  20. |* 2 | INDEX RANGE SCAN | IDX_BIND_TEST | 1 | 691 | 4337 |00:00:00.06 | 671 |
  21. -------------------------------------------------------------------------------------------------------

  22. Predicate Information (identified by operation id):
  23. ---------------------------------------------------

  24.    2 - access("OBJECT_NAME" LIKE '%'||:OBJNAME||'%')
  25.        filter("OBJECT_NAME" LIKE '%'||:OBJNAME||'%')

上面的like前导%的走INDEX RANGE SCAN,而且ID=2返回行是691比ID=1的3840少很多,这很让人迷惑,ID=1没有条件,按理说ID=1应该

只会比ID=2少,唯一可能的情况是ID=2的选择率计算方式和ID=1的不一样,还是通过10053看:

10053 Access path analysis计算如下:

点击(此处)折叠或打开

  1. ***************************************
  2. BASE STATISTICAL INFORMATION
  3. ***********************
  4. Table Stats::
  5.   Table: BIND_TEST Alias: BIND_TEST
  6.     #Rows: 76807 #Blks: 1120 AvgRowLen: 98.00 ChainCnt: 0.00
  7. Index Stats::
  8.   Index: IDX_BIND_TEST Col#: 2
  9.     LVLS: 2 #LB: 379 #DK: 47304 LB/K: 1.00 DB/K: 1.00 CLUF: 38540.00
  10. Access path analysis for BIND_TEST
  11. ***************************************
  12. SINGLE TABLE ACCESS PATH
  13.   Single Table Cardinality Estimation for BIND_TEST[BIND_TEST]
  14.   Column (#2): OBJECT_NAME(
  15.     AvgLen: 25 NDV: 47304 Nulls: 0 Density: 0.000021
  16.   Table: BIND_TEST Alias: BIND_TEST
  17.     Card: Original: 76807.000000 Rounded: 3840 Computed: 3840.35 Non Adjusted: 3840.35
  18.   Access Path: TableScan
  19.     Cost: 307.14 Resp: 307.14 Degree: 0
  20.       Cost_io: 305.00 Cost_cpu: 45073703
  21.       Resp_io: 305.00 Resp_cpu: 45073703
  22. kkofmx: index filter:"BIND_TEST"."OBJECT_NAME" LIKE '%'||:B1||'%'

  23.   Access Path: index (RangeScan)
  24.     Index: IDX_BIND_TEST
  25.     resc_io: 353.00 resc_cpu: 2949828
  26.     ix_sel: 0.009000 ix_sel_with_filters: 0.009000
  27.     Cost: 304.14 Resp: 304.14 Degree: 1
  28.   Best:: AccessPath: IndexRange
  29.   Index: IDX_BIND_TEST
  30.          Cost: 304.14 Degree: 1 Resp: 304.14 Card: 3840.35 Bytes: 0
         
结论:为什么绑定变量peeking会出现不管%是显式写还是写在绑定变量的值里,都可能走INDEX RANGE SCAN呢?


从10053看,CBO没有考虑显式前导%走不了索引的问题,当然%写在绑定变量值里也不会考虑,
主要是索引访问选择率ix_sel是0.009,相当于0.9%,因为CBO没有考虑%,这可能导致走索引。


这个0.009貌似是个默认值(无peeking的like默认sel):
在10053里回表过滤的选择率也是0.009,但是执行计划显示的回表选择率还是按照5%(实际回表计算的sel和10053显示不一样)


也就是ix_sel实际上是按照0.009,ix_sel_with_filters实际上按照0.05,导致ID=2的返回行计算是691行,然后ID=1回表没有任何过滤条件,
因为回表按照5%计算,导致行数比ID=2的多很多,这是优化器算法层面的BUG。


具体计算过程如下:
ID=2 INDEX RANGE SCAN E-Rows = round(76807*.009) = 691行
ID=1 TABLE ACCESS BY INDEX ROWID E-Rows = round(76807*.05) = 3840行


实际计算ix_sel:0.009,ix_sel_with_filters:0.05
与10053显示的回表过滤选择率ix_sel_with_filters不一样,10053的如下:
ix_sel: 0.009000  ix_sel_with_filters: 0.009000 


一般情况下,绑定变量+无peeking要使用默认选择率,如下所示:

点击(此处)折叠或打开

  1. = 条件选择率:
  2. greatest(1/NDV,DENSITY)*(num_rows-null_rows)/num_rows

  3. >,<,<=,>= 条件选择率,一般情况就用0.05了:
  4. greatest(0.05,1/NDV,DENSITY)*(num_rows-null_rows)/num_rows

  5. like全表扫描选择率或回表选择率:常数是0.05
  6. greatest(0.05,1/NDV,DENSITY)*(num_rows-null_rows)/num_rows

  7. like条件index range scan选择率:常数不是0.05,是0.009
  8. greatest(0.009,1/NDV,DENSITY)*(num_rows-null_rows)/num_rows
为什么公式里还有density,因为如果带直方图的话,density就不是1/NDV,没有直方图density=1/NDV:

为了验证like带绑定变量,无peeking的选择率计算公式,可以找个density比0.009大的验证下:

点击(此处)折叠或打开

  1. select column_name,num_distinct,density,num_nulls from dba_tab_col_statistics where table_name='BIND_TEST';

  2. COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS
  3. ------------------------------ ------------ ---------- ----------
  4. EDITION_NAME 0 0 76807
  5. NAMESPACE 21 .047619048 0
  6. SECONDARY 2 .5 0
  7. GENERATED 2 .5 0
  8. TEMPORARY 2 .5 0
  9. STATUS 2 .5 0
  10. TIMESTAMP 1663 .000601323 0
  11. LAST_DDL_TIME 1583 .000631712 0
  12. CREATED 1529 .000654022 0
  13. OBJECT_TYPE 45 .022222222 0
  14. DATA_OBJECT_ID 10551 .000094778 66206
  15. OBJECT_ID 76807 .00001302 0
  16. SUBOBJECT_NAME 271 .003690037 76154
  17. OBJECT_NAME 47304 .00002114 0
  18. OWNER 36 .027777778 0
可以用owner列,density=.027777778 > 0.009,为了验证,建个索引:

点击(此处)折叠或打开

  1. create index idx_bind_test_owner on bind_test(owner);

  2. var owner varchar2(100);
  3. select * from bind_test
  4. where owner like :owner;

  5. Execution Plan
  6. ----------------------------------------------------------
  7. Plan hash value: 585688886

  8. ---------------------------------------------------------------------------------------------------
  9. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  10. ---------------------------------------------------------------------------------------------------
  11. | 0 | SELECT STATEMENT | | 3840 | 367K| 65 (0)| 00:00:01 |
  12. | 1 | TABLE ACCESS BY INDEX ROWID| BIND_TEST | 3840 | 367K| 65 (0)| 00:00:01 |
  13. |* 2 | INDEX RANGE SCAN | IDX_BIND_TEST_OWNER | 2135 | | 7 (0)| 00:00:01 |
  14. ---------------------------------------------------------------------------------------------------

  15. Predicate Information (identified by operation id):
  16. ---------------------------------------------------

  17.    2 - access("OWNER" LIKE :OWNER)
  18.        filter("OWNER" LIKE :OWNER)
ID=2的选择率按照公式:greatest(0.009,1/NDV,DENSITY)*(num_rows-null_rows)/num_rows=  .027777778
计算的行数  
select 76807 * .027777778 from dual;     
2134 接近2135


然后ID=1还是按照0.05计算=3840行。


另外从12c开始还有几个like :bind写法的BUG也是走index range scan,上述例子在11g里同样存在:
见BUG:Index SCAN Instead of FAST FULL SCAN With COL LIKE '%bind Variable%' (Doc ID 2781949.1)

综上所述,建议如下:
1)影响优化器的默认参数,除非出现重大影响或确定影响生产的BUG的需要关闭,一般都用默认值,特别是
以_optimizer开头的隐含参数,比如:_optimizer_cost_based_transformation、_optimizer_squ_bottomup


2)绑定变量peeking{BANNED}最佳好不要关闭,关闭了影响直方图使用,无法计算真实的数据分布,
如果查询总是查询某个分布很少的值,有了peeking会走索引,没有peeking就按照默认选择率,可能不走索引。
比如status=1,0,其中1很多,0很少,基本按照status='0'查询,这样走索引好,如果
关闭peeking,按照where status = :var查询,选择率是max(1/2,density)则走不了索引.


关闭peeking,可能会导致like :bind或like '%'||:bind形式的条件走INDEX RANGE SCAN,见本篇文章。


如果绑定变量peeking遇到直方图,传入的值对应分布区间变动频繁,不同分布要求不一样执行计划,
ORACLE有adaptive cursor sharding(BUG多,一般关闭)或可以使用BIND_AWARE HINTS+SQL DIRECTIVE/SQL PROFILE绑定解决(建议使用)。




阅读(235) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~