NLS_COMP和NLS_SORT参数
Oracle默认是采用binary进行排序,这对于例如中文的排序来说,是不恰当的。
使用这两个参数可以指定排序的方法,比如拼音或是,要注意可能会引起性能问题。
解决方法是使用NLSSORT函数来建立一个函数索引。
NLS_COMP = { BINARY | LINGUISTIC | ANSI }
BINARY
Normally, comparisons in the WHERE clause and in PL/SQL blocks is binary unless you specify theNLSSORT function.
LINGUISTIC
Comparisons for all SQL operations in the WHERE clause and in PL/SQL blocks should use the linguistic sort specified in theNLS_SORT parameter. To improve the performance, you can also define a linguistic index on the column for which you want linguistic comparisons.
ANSI
A setting of ANSI is for backwards compatibility; in general, you should setNLS_COMP to LINGUISTIC
NLS_SORT = { BINARY | linguistic_definition}
NLS_SORT specifies the collating sequence for character value comparison in various SQL operators and clauses, for example, ORDER BY, GROUP BY, comparison conditions (=, <>, <=, >=), IN, BETWEEN, LIKE, MIN/MAX, GREATEST/LEAST, and INSTR.
If the value is BINARY, then comparison is based directly on byte values in the binary encoding of the character values being compared. The ordering depends on the character set of the compared values, which is either the database character set (for VARCHAR2, CHAR, LONG, and CLOB) or the national character set (for NVARCHAR2, NCHAR, and NCLOB).
If the value is a named linguistic sort, then comparison is defined by this sort. A linguistic sort uses various rules to achieve ordering expected by speakers of one or more natural languages. This is usually the same ordering that is used in dictionaries and/or telephone directories in those languages.
The BINARY comparison is faster and uses less resources than any linguistic comparison but for text in a natural language, it does not provide ordering expected by users.
The value of NLS_SORT affects execution plans of queries. Because a standard index cannot be used as a source of values sorted in a linguistic order, an explicit sort operation must usually be performed instead of an index range scan. A functional index on the NLSSORT function may be defined to provide values sorted in a linguistic order and reintroduce the index range scan to the execution plan.
下面做个测试:
首先来看看可用的中文排序方法:
select value from v$nls_valid_values where parameter='SORT' and value like '%SCHINESE%';
返回可用的中文排序方法
SCHINESE_PINYIN_M -- 按照拼音排序
SCHINESE_STROKE_M -- 按照笔划(第一顺序)、部首(第二顺序)排序
SCHINESE_RADICAL_M -- 按照部首(第一顺序)、笔划(第二顺序)排序
下面来做测试。
create index emp_ename_idx on emp(ename);
现在察看select * from emp where ename='Mike';的执行计划,
explain plan for select * from emp where ename='Mike';
select * from table(dbms_xplan.display);
可以看到oracle会走索引emp_ename_idx。
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 46 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 46 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | EMP_ENAME_IDX | 1 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ENAME"=U'Mike')
alter session set nls_comp='LINGUISTIC';
alter session set nls_sort='SCHINESE_PINYIN_M';
之后再查看执行计划,得到结果:
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 46 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMP | 1 | 46 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NLSSORT("ENAME",'nls_sort=''SCHINESE_PINYIN_M''')=HEXTORAW
('0230021B022301FE0000020202020007020202') )
可以看到,现在走全表扫描了。
接下来,如果创建如下的索引
create index emp_ename_idx_nlssort on emp(nlssort(ename,'NLS_SORT=SCHINESE_PINYIN_M'));
再查看执行计划,发现又走索引emp_ename_idx_nlssort了。
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 46 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 46 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | EMP_ENAME_IDX_NLSSORT | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(NLSSORT("ENAME",'nls_sort=''SCHINESE_PINYIN_M''')=HEXTORAW('0230021B022301FE000
0020202020007020202') )