WINDOWS下的程序员出身,偶尔也写一些linux平台下小程序, 后转行数据库行业,专注于ORACLE和DB2的运维和优化。 同时也是ios移动开发者。欢迎志同道合的朋友一起研究技术。 数据库技术交流群:58308065,23618606
全部博文(599)
分类: Oracle
2011-04-26 23:06:59
很多时候我们可能需要查询某个列的开头字母是A或者前3个字符是ABC类似这样的查询。
往往很多人喜欢采用SUBSTR的形式来查询,遗憾的是即使这个列上有索引,但没函数索引的话,这将会导致全表扫描。
看下面一个简单的例子:
[oracle@dbtest ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Tue Apr 26 22:37:49 2011
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> conn test/test
Connected.
SQL> select tname from tab;
TNAME
------------------------------
BIN$oEDIBO0f/XDgQKjAewAvlQ==$0
BIN$oEDIBO0g/XDgQKjAewAvlQ==$0
GVVEHICLEOWNER_20110304
MLOG$_TEST
MM
MN
MV_TEST
RUPD$_TEST
T
T1
T2
TNAME
------------------------------
TEST
T_EXCHANGE_TEMP
T_LIST
T_PART
15 rows selected.
SQL> drop table test ;
Table dropped.
SQL> create table test as select * from all_objects;
Table created.
SQL> create index idx_test_01 on test(object_name);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'TEST',cascade=>TRUE);
PL/SQL procedure successfully completed.
SQL> set autot traceonly exp
SQL> select object_id,object_name from test where object_name ='TEST';
Execution Plan
----------------------------------------------------------
Plan hash value: 2317948335
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 2 | 60 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TEST_01 | 2 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_NAME"='TEST')
对于OBJECT_NAME='TEST'这样的查询是能走到索引的。
如果我们需要查询OBJECT_NAME列的前3个字母是TES的信息:
很多人可能采用下面的SQL:
SELECT OBJECT_ID,OBJECT_NAME FROM TEST WHERE SUBSTR(OBJECT_NAME,1,3)='TES';
结果是没问题的,但是效率却比较低下:
看一下这个SQL的执行计划:
SQL> SET AUTOT ON
SQL> SELECT OBJECT_ID,OBJECT_NAME FROM TEST WHERE SUBSTR(OBJECT_NAME,1,3)='TES';
OBJECT_ID OBJECT_NAME
---------- ------------------------------
75729 TEST
83221 TEST_DB
88003 TEST
Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 727 | 21810 | 291 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| TEST | 727 | 21810 | 291 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUBSTR("OBJECT_NAME",1,3)='TES')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1041 consistent gets
1 physical reads
0 redo size
686 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
SQL走的是全表扫描,逻辑读是1041。
其实我们完全可以采用下面的SQL来替换上面的SQL达到同样的效果,但是性能却提升很多。
SQL> SELECT OBJECT_ID,OBJECT_NAME FROM TEST WHERE OBJECT_NAME LIKE 'TES%';
OBJECT_ID OBJECT_NAME
---------- ------------------------------
75729 TEST
88003 TEST
83221 TEST_DB
Execution Plan
----------------------------------------------------------
Plan hash value: 2317948335
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 2 | 60 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TEST_01 | 2 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_NAME" LIKE 'TES%')
filter("OBJECT_NAME" LIKE 'TES%')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
686 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
此时可以命中索引,逻辑读是7。
遗憾的是,这种查询方式只能是查询以某个或者某几个字符开头,假设我们要查询的字符不是开头,而是OBJECT_NAME第2个字符到第4个字符是TES的,那么这种方式就不适用了。
此时需要建立函数索引来提升性能了。
SQL> set autot off
SQL> update test set object_name='ATESTYSP' where object_id=1024;
1 row updated.
SQL> commit;
Commit complete.
SQL> create index idx_test_02 on test(substr(object_name,2,3));
Index created.
SQL> set autot on
SQL> SELECT OBJECT_ID,OBJECT_NAME FROM TEST WHERE SUBSTR(OBJECT_NAME,2,3)='TES';
OBJECT_ID OBJECT_NAME
---------- ------------------------------
1024 ATESTYSP
Execution Plan
----------------------------------------------------------
Plan hash value: 1637066475
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 727 | 27626 | 103 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 727 | 27626 | 103 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | IDX_TEST_02 | 291 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(SUBSTR("OBJECT_NAME",2,3)='TES')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
1 physical reads
0 redo size
610 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
在有其他解决方式的前提下,我们尽量避免在表上建立太多的索引,毕竟它会给DML语句带来性能的开销。