Chinaunix首页 | 论坛 | 博客
  • 博客访问: 289516
  • 博文数量: 72
  • 博客积分: 2387
  • 博客等级: 大尉
  • 技术积分: 720
  • 用 户 组: 普通用户
  • 注册时间: 2010-08-26 10:54
文章分类

全部博文(72)

文章存档

2012年(1)

2011年(1)

2010年(70)

分类: Oracle

2010-08-26 12:20:52

                                      
                               
                               
   相信做oracle开发和管理的朋友对sort肯定不会陌生,大家通常都遇到这样那样的排序性能问题,所以我写这一系列关于sort的文章告诉大家在oracle里面sort是怎么一回事以及如果调整sort获得更好的性能。
  
   首先,我们来回顾一下什么情况下需要sort,当取distinct值的时候需要,当进行merge join的时候也需要,当语句包含group by,order by的时候需要,当创建索引的时候需要等等。那么我们来看一下在oracle里面排序分为哪几种方式呢?
  
   一.SORT UNIQUE
  
   sort unique发生在我们需要取distinct值的时候
  
   SQL> desc test
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                              VARCHAR2(30)
 OBJECT_NAME                                        VARCHAR2(128)
 SUBOBJECT_NAME                                     VARCHAR2(30)
 OBJECT_ID                                          NUMBER
 DATA_OBJECT_ID                                     NUMBER
 OBJECT_TYPE                                        VARCHAR2(18)
 CREATED                                            DATE
 LAST_DDL_TIME                                      DATE
 TIMESTAMP                                          VARCHAR2(19)
 STATUS                                             VARCHAR2(7)
 TEMPORARY                                          VARCHAR2(1)
 GENERATED                                          VARCHAR2(1)
 SECONDARY                                          VARCHAR2(1)
SQL> set autotrace trace exp;
SQL> select distinct owner from test;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=477 Card=15 Bytes=75
          )
   1    0   SORT (UNIQUE) (Cost=477 Card=15 Bytes=75)
   2    1     TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
          s=967440)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2649  consistent gets
          0  physical reads
          0  redo size
        564  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         15  rows processed

   
    也会发生在in的子查询里
   
SQL> select owner from test where object_id in(select object_id from test1 where rownum<10);
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=530 Card=145116 Byte
          s=3047436)
   1    0   HASH JOIN (Cost=530 Card=145116 Bytes=3047436)
   2    1     VIEW OF 'VW_NSO_1' (Cost=256 Card=9 Bytes=117)
   3    2       SORT (UNIQUE)
   4    3         COUNT (STOPKEY)
   5    4           TABLE ACCESS (FULL) OF 'TEST1' (Cost=256 Card=1934
          88 Bytes=580464)
   6    1     TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
          s=1547904)

         
         
 为什么in子查询需要排序呢,因为oracle会把基于in的子查询
  "select owner from test where object_id in(select object_id from test1)"
 转换为类似
 "select a.owner from test a,(select distinct object_id from test1) b where a.object_id=b.object_id"
 如果in子查询里面选出的值可以通过主键或unique索引搜索得到,那么将不会进行sort unique的操作。
 
 但自从10g R2开始,sort unique有了一些变化,sort unique变成了hash unique,采用新的hash算法代替了传统的sort unique,据oracle称在某些情况下排序性能是原来的5倍。

注意,sort的统计值将为0
SQL> select distinct owner from test;

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        692  consistent gets
          0  physical reads
          0  redo size
        781  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         19  rows processed

SQL> select distinct owner from test;
Execution Plan
----------------------------------------------------------
Plan hash value: 2203132549
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    19 |    95 |   164   (5)| 00:00:02 |
|   1 |  HASH UNIQUE       |      |    19 |    95 |   164   (5)| 00:00:02 |
|   2 |   TABLE ACCESS FULL| TEST | 49909 |   243K|   158   (2)| 00:00:02 |
--------------------------------------------------------------------------
 
SQL> select owner from test where object_id in(select object_id from test1 where rownum<10);
Execution Plan
----------------------------------------------------------
Plan hash value: 2420636264
--------------------------------------------------------------------------------
------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost
(%CPU)| Time     |
--------------------------------------------------------------------------------
------------------
|   0 | SELECT STATEMENT            |                    |  5989 |   116K|   284
   (2)| 00:00:04 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TEST               |   665 |  4655 |    14
   (0)| 00:00:01 |
|   2 |   NESTED LOOPS              |                    |  5989 |   116K|   284
   (2)| 00:00:04 |
|   3 |    VIEW                     | VW_NSO_1           |     9 |   117 |   156
   (2)| 00:00:02 |
|   4 |     HASH UNIQUE             |                    |     9 |    18 |
      |          |
|*  5 |      COUNT STOPKEY          |                    |       |       |
      |          |
|   6 |       TABLE ACCESS FULL     | TEST1              | 49909 | 99818 |   156
   (2)| 00:00:02 |
|*  7 |    INDEX RANGE SCAN         | IND_TEST_OBJECT_ID |   665 |       |     2
   (0)| 00:00:01 |
 
二.SORT AGGREGATE
   sort aggregate通常发生在使用一些聚合函数的时候,sum(),avg(),min(),max(),count()等等,实际上sort aggregate不做真正的sort,并不会用到排序空间,而是通过一个全局变量+全表或全索引扫描来实现。伪代码如下
    max_so_far = -INFINITE;
    while (row* r=get_a_row_from_table(test_for_max)) {
    if (r->n > max_so_far) max_so_far=n;
    }
    return max_so_far
   
   
   
    SQL> select count(*) from test;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488)
 
SQL> select sum(object_id) from test;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=3)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
          s=580464)
 

SQL> select min(object_id) from test;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=3)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FULL SCAN (MIN/MAX)) OF 'IND_TEST_OBJECT_ID' (NON
          -UNIQUE) (Cost=257 Card=193488 Bytes=580464)
 

SQL> select max(object_id) from test;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=3)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FULL SCAN (MIN/MAX)) OF 'IND_TEST_OBJECT_ID' (NON
          -UNIQUE) (Cost=257 Card=193488 Bytes=580464)
 

SQL> select avg(object_id) from test;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=3)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
          s=580464)
 
SQL> select count(*) from test;
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2649  consistent gets
          0  physical reads
          0  redo size
        381  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
         
         
三.SORT GROUP BY
SORT GROUP BY会发生在有group by子句的时候
SQL> select owner from test group by owner;
15 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=477 Card=15 Bytes=75
          )
   1    0   SORT (GROUP BY) (Cost=477 Card=15 Bytes=75)
   2    1     TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
          s=967440)
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2649  consistent gets
          0  physical reads
          0  redo size
        564  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         15  rows processed

在10g R2里又有了变化,sort group by被hash group by所代替
SQL> select owner from test group by owner;
19 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1435881708
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    19 |    95 |   164   (5)| 00:00:02 |
|   1 |  HASH GROUP BY     |      |    19 |    95 |   164   (5)| 00:00:02 |
|   2 |   TABLE ACCESS FULL| TEST | 49909 |   243K|   158   (2)| 00:00:02 |
---------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        692  consistent gets
          0  physical reads
          0  redo size
        781  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         19  rows processed
联想到上面提到的sort unique也是被hash unique所代替,我们可以肯定10g R2用了更多基于hash算法的东西,我们也期待在数据库其他方面10g R2也带给我们更多的好处。
 
四.SORT ORDER BY
SORT ORDER BY会发生在有order by子句的时候
SQL> select owner from test order by owner;
193488 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=666 Card=193488 Byte
          s=967440)
   1    0   SORT (ORDER BY) (Cost=666 Card=193488 Bytes=967440)
   2    1     TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
          s=967440)
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2649  consistent gets
          0  physical reads
          0  redo size
    2528735  bytes sent via SQL*Net to client
     142392  bytes received via SQL*Net from client
      12901  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
     193488  rows processed
    
    
    
五.SORT JOIN
SORT JOIN发生在出现merge join的情况下,两张关联的表要各自做sort,然后再merge
SQL> select /*+ use_merge(a b)*/ a.owner from test a,test1 b where a.object_id=b.object_id and rownum<10;
9 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1354 Card=9 Bytes=99
          )
   1    0   COUNT (STOPKEY)
   2    1     MERGE JOIN (Cost=1354 Card=3119800512 Bytes=34317805632)
   3    2       SORT (JOIN) (Cost=737 Card=193488 Bytes=1547904)
   4    3         TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488
          Bytes=1547904)
   5    2       SORT (JOIN) (Cost=617 Card=193488 Bytes=580464)
   6    5         TABLE ACCESS (FULL) OF 'TEST1' (Cost=256 Card=193488
           Bytes=580464)
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5251  consistent gets
          0  physical reads
          0  redo size
        444  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          9  rows processed

第一章主要给大家介绍了一下oracle什么时候会采取sort,下一章将会介绍一些sort相关的oracle参数。

阅读(920) | 评论(0) | 转发(0) |
0

上一篇:sequence

下一篇:sqlldr用法

给主人留下些什么吧!~~