Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1393397
  • 博文数量: 173
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 3841
  • 用 户 组: 普通用户
  • 注册时间: 2011-03-30 13:00
个人简介

About me:Oracle ACE pro,optimistic,passionate and harmonious. Focus on ORACLE,MySQL and other database programming,peformance tuning,db design, j2ee,Linux/AIX,Architecture tech,etc

文章分类

全部博文(173)

文章存档

2025年(1)

2024年(27)

2023年(28)

2022年(43)

2020年(62)

2014年(3)

2013年(9)

分类: Oracle

2014-01-01 18:49:34

    在ORACLE里,一般认为''与NULL是等价的,除了一些特别的语法,比如有id is null,但是没有id is '',以及''作为字符类型的特殊性,在decode等函数要求类型匹配的时候,与NULL可能不等价。本篇文章主要说明下''与NULL在SQL优化器中也有很大的区别,如果不了解这个区别,很可能SQL语句的性能,将要陷入灾难境地。此问题来源于:http://www.itpub.net/thread-1838241-1-1.html 讨论
 
  1.示例分析

create table t1 as select * from dba_objects ;
create table t2 as select * from dba_objects ;
create table t3 as select * from dba_objects ;
/
begin
  for xx in 1 .. 5 loop
    insert into t1 select * from t1;
    insert into t2 select * from t2;
    insert into t3 select * from t3;
    commit;
  end loop;
end;
/
create index i1 on  t1(object_name);
create index i2 on  t2(object_name);
create index i3 on  t3(object_name);
create index i4 on  t1(subobject_name);
create index i5 on  t2(subobject_name);
create index i6 on  t3(subobject_name);

--收集统计信息省略


    针对下列语句,ORACLE对t2,t3走了FULL TABLE SCAN:

dingjun123@ORADB> show rel
release 1102000100

dingjun123@ORADB> set autotrace traceonly exp
dingjun123@ORADB> select *

  2    from (select t1.subobject_name, t1.object_name, t1.object_type
  3            from t1
  4          union all
  5          select '', t2.object_name, t2.object_type
  6            from t2
  7          union all
  8          select t3.subobject_name, '', t3.object_type from t3) t
  9  where (t.subobject_name = 'T' or t.object_name = 'T');
Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 3458763867
-------------------------------------------------------------------------------------------
| Id  | Operation                          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |      | 62890 |  5773K| 18934   (1)| 00:03:48 |
|   1 |  VIEW                              |      | 62890 |  5773K| 18934   (1)| 00:03:48 |
|   2 |   UNION-ALL                        |      |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID     | T1   | 12678 |   433K|    38   (0)| 00:00:01 |
|   4 |     BITMAP CONVERSION TO ROWIDS    |      |       |       |            |          |
|   5 |      BITMAP OR                     |      |       |       |            |          |
|   6 |       BITMAP CONVERSION FROM ROWIDS|      |       |       |            |          |
|*  7 |        INDEX RANGE SCAN            | I4   |       |       |     1   (0)| 00:00:01 |
|   8 |       BITMAP CONVERSION FROM ROWIDS|      |       |       |            |          |
|*  9 |        INDEX RANGE SCAN            | I1   |       |       |     3   (0)| 00:00:01 |
|* 10 |    TABLE ACCESS FULL               | T2   | 24167 |   778K|  9429   (1)| 00:01:54 |
|* 11 |    TABLE ACCESS FULL               | T3   | 26045 |   279K|  9466   (1)| 00:01:54 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   7 - access("T1"."SUBOBJECT_NAME"='T')
   9 - access("T1"."OBJECT_NAME"='T')
  10 - filter(''='T' OR "T2"."OBJECT_NAME"='T')
  11 - filter(''='T' OR "T3"."SUBOBJECT_NAME"='T')

    看表t2、t3对应谓词的实际基数情况:

dingjun123@ORADB>    SELECT COUNT(*) FROM t2 WHERE ''='T' OR "T2"."OBJECT_NAME"='T';
  COUNT(*)
----------
       128
1 row selected.
Elapsed: 00:00:11.50

dingjun123@ORADB>    SELECT COUNT(*) FROM t3 WHERE ''='T' OR "T3"."SUBOBJECT_NAME"='T';
  COUNT(*)
----------
         0
1 row selected.
Elapsed: 00:00:14.85

    CBO优化器估算的对应谓词的基数与实际的技术差别如下:
表名
估算基数 实际基数 差别倍数
T2 24167 128 188.80
T3 26045 0 26045

    很显然,实际基数与估算的基数差别太大,从而优化器选择了错误的执行路径,正确的执行路径应该是走索引的。究其原因,可以分析下,

dingjun123@ORADB>    SELECT COUNT(*) FROM t2 WHERE  "T2"."OBJECT_NAME"='T';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 2583336616
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    24 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |      |     1 |    24 |            |          |
|*  2 |   INDEX RANGE SCAN| I2   |    52 |  1248 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."OBJECT_NAME"='T')


dingjun123@ORADB>         SELECT COUNT(*) FROM t2 WHERE ''='T';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 402395414
--------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |  9411   (1)| 00:01:53 |
|   1 |  SORT AGGREGATE     |      |     1 |            |          |
|*  2 |   FILTER            |      |       |            |          |
|   3 |    TABLE ACCESS FULL| T2   |  2411K|  9411   (1)| 00:01:53 |
--------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(''='T')

  很显然,是由于谓词''='T'造成了优化器的估算错误,''='T'优化器估算100%的选择性,通过OR一合并,针对表T2的最终基数是24167,因此,优化器选择全表扫描。这很显然是错误的,''='T',走FILTER应该转为NULL IS NOT NULL的形式,最终此分支根本无需计算才对。下面换成NULL测试:

dingjun123@ORADB>   SELECT COUNT(*) FROM t2 WHERE NULL='T';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 402395414
--------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     0   (0)|          |
|   1 |  SORT AGGREGATE     |      |     1 |            |          |
|*  2 |   FILTER            |      |       |            |          |
|   3 |    TABLE ACCESS FULL| T2   |  2411K|  9411   (1)| 00:01:53 |
--------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(NULL IS NOT NULL)


  换成NULL='T'后,FILTER自动转为NULL IS NOT NULL,显然,这是个永远不成立的条件,所以,根本无须执行ID=3的操作,最终ID=0的结果COST=0,针对这种FILTER单分支的执行计划注意,虽然子步骤3COST=9411,但是实际可能没有执行,详细的计划可以用DBMS_XPLAN.DISPLAY_CURSOR查看:

dingjun123@ORADB> alter session set statistics_level=all;
Session altered.
Elapsed: 00:00:00.04

dingjun123@ORADB> SELECT COUNT(*) FROM t2 WHERE NULL='T';
  COUNT(*)
----------
         0
1 row selected.
Elapsed: 00:00:00.05

dingjun123@ORADB> @display_cursor
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
SQL_ID  f33mr91wyuq03, child number 0
-------------------------------------
SELECT COUNT(*) FROM t2 WHERE NULL='T'
Plan hash value: 402395414
----------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:00.01 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.01 |
|*  2 |   FILTER            |      |      1 |        |      0 |00:00:00.01 |
|   3 |    TABLE ACCESS FULL| T2   |      0 |   2411K|      0 |00:00:00.01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
--------------------------------------------------
   2 - filter(NULL IS NOT NULL)

20 rows selected.
Elapsed: 00:00:00.74

   通过ID=3步骤的Starts=0,很清晰地看出,最终未执行ID=3的步骤。从这点上说,优化器针对''与NULL的相关计算方式还是有很大区别。

 2.问题解决
     通过以上分析,将''改为NULL,问题得到有效解决或者把谓词写到每个子SQL里面。当然,只要是非''的都是可以的,比如' '(空格)。

dingjun123@ORADB> SELECT *
  2  FROM (SELECT t1.subobject_name, t1.object_name, t1.object_type
  3         FROM t1
  4         UNION ALL
  5         SELECT NULL, t2.object_name, t2.object_type
  6         FROM t2
  7         UNION ALL
  8         SELECT t3.subobject_name, NULL, t3.object_type FROM t3) t
  9  WHERE (t.subobject_name = 'T' OR t.object_name = 'T');

256 rows selected.
Elapsed: 00:00:00.72

Execution Plan
----------------------------------------------------------
Plan hash value: 3324794093
-------------------------------------------------------------------------------------------
| Id  | Operation                          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |      | 12746 |  1170K|    92   (0)| 00:00:02 |
|   1 |  VIEW                              |      | 12746 |  1170K|    92   (0)| 00:00:02 |
|   2 |   UNION-ALL                        |      |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID     | T1   | 12678 |   433K|    38   (0)| 00:00:01 |
|   4 |     BITMAP CONVERSION TO ROWIDS    |      |       |       |            |          |
|   5 |      BITMAP OR                     |      |       |       |            |          |
|   6 |       BITMAP CONVERSION FROM ROWIDS|      |       |       |            |          |
|*  7 |        INDEX RANGE SCAN            | I4   |       |       |     1   (0)| 00:00:01 |
|   8 |       BITMAP CONVERSION FROM ROWIDS|      |       |       |            |          |
|*  9 |        INDEX RANGE SCAN            | I1   |       |       |     3   (0)| 00:00:01 |
|  10 |    TABLE ACCESS BY INDEX ROWID     | T2   |    52 |  1716 |    41   (0)| 00:00:01 |
|* 11 |     INDEX RANGE SCAN               | I2   |    52 |       |     3   (0)| 00:00:01 |
|  12 |    TABLE ACCESS BY INDEX ROWID     | T3   |    16 |   176 |    13   (0)| 00:00:01 |
|* 13 |     INDEX RANGE SCAN               | I6   |    16 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - access("T1"."SUBOBJECT_NAME"='T')
   9 - access("T1"."OBJECT_NAME"='T')
  11 - access("T2"."OBJECT_NAME"='T')
  13 - access("T3"."SUBOBJECT_NAME"='T')

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        241  consistent gets
         99  physical reads
          0  redo size
       4006  bytes sent via SQL*Net to client
        602  bytes received via SQL*Net from client
         19  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        256  rows processed


    ORACLE优化器虽然强大,但是优化器要判断的东西实在太多太复杂,导致ORACLE强大的优化器也不一定是很完善的,有各种各样的BUG,或未完善的地方,值得我们注意,但是,相信各种新的版本上,会有更多惊喜的特性。
   注:以上问题,在RBO中同样存在。

阅读(7653) | 评论(0) | 转发(3) |
给主人留下些什么吧!~~