DISTINCT vs GROUP BY-TOMSYAN-ChinaUnix博客

美丽人生yanshoupeng.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

TOMSYAN

博客访问： 2936692
博文数量： 599
博客积分： 16398
博客等级：上将
技术积分： 6875
用户组：普通用户
注册时间： 2009-11-30 12:04

个人简介

WINDOWS下的程序员出身，偶尔也写一些linux平台下小程序，后转行数据库行业，专注于ORACLE和DB2的运维和优化。同时也是ios移动开发者。欢迎志同道合的朋友一起研究技术。数据库技术交流群：58308065，23618606

文章分类

全部博文（599）

移动开发（2）
防火墙（1）
脚本（6）
HADOOP（6）
透明网关（5）
工具（2）
DSI（2）
热点话题讨论（2）
oracle internal（28）
Jonathan Lewis（2）
ASKTOM（12）
OLAP（1）
C/Pro*c/c++（23）
PL/SQL（10）
ORACLE全文索引（5）
DB2（28）
METALINK（27）
AIX（4）
DELPHI（5）
性能优化（19）
分布式数据库管理（1）
书籍笔记（4）
备份与恢复（7）
数据库安装及升级（18）
资料下载（43）
Oracle高可用（32）
Linux（7）
Oracle（244）
未分配的博文（53）

文章存档

2014年（12）

2013年（56）

2012年（199）

2011年（105）

2010年（128）

2009年（99）

我的朋友

相关博文

DISTINCT vs GROUP BY

分类： Oracle

2012-03-31 23:42:10

在测试过程中发现 DISTINCT 和 GROUP BY 性能差不多，本打算把下面的话贴给Tom，打算问2个的主要区别,发现上面已经有人问过了，而且TOM也给了例子

Hi tom:

See the following example:

SQL> create table test as select * from all_objects;

Table created.

SQL> set linesize 300
SQL> set autot traceonly
SQL> select distinct object_type from test;

23 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3279290493

-----------------------------------------------------------------------------------
| Id | Operation          | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   472K| 5076K|       | 3535   (2)| 00:00:43 |
|   1 | HASH UNIQUE       |      |   472K| 5076K|    18M| 3535   (2)| 00:00:43 |
|   2 |   TABLE ACCESS FULL| TEST |   472K| 5076K|       | 1448   (1)| 00:00:18 |
-----------------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
          4 recursive calls
          0 db block gets
       6535 consistent gets
          0 physical reads
          0 redo size
       1079 bytes sent via SQL*Net to client
        503 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         23 rows processed

SQL> select object_type from test group by object_type;

23 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2408827628

---------------------------------------------------------------------------
| Id | Operation          | Name | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   472K| 5076K| 1479   (3)| 00:00:18 |
|   1 | HASH GROUP BY     |      |   472K| 5076K| 1479   (3)| 00:00:18 |
|   2 |   TABLE ACCESS FULL| TEST |   472K| 5076K| 1448   (1)| 00:00:18 |
---------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement

The above two SQL does the same thing，but with different execution plan, One is "HASH UNIQUE",and the other is "HASH GROUP BY" 。
From the above ,the second one seems better due to less logical read。

but i run the above two SQL with many time and find the two SQL have the same logical read .

SQL> select distinct object_type from test;

23 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3279290493

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       6451 consistent gets
          0 physical reads
          0 redo size
       1079 bytes sent via SQL*Net to client
        503 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         23 rows processed

SQL> select object_type from test group by object_type;

23 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2408827628

Note
-----
- dynamic sampling used for this statement

I want to know ,Which one is better for performance?

Thanks in advance.

一下是TOM大师给的例子:

You Asked

Tom,

Just want to know the difference between DISTINCT and GROUP BY in queries where I'm not

using any aggregate functions.

Like for example.

Select emp_no, name from Emp

Group by emo_no, name

And

Select distinct emp_no, name from emp;

Which one is faster and why ?

Thanks

and we said...

they are for all intents and purposes the same... and really easy for you to

evaluate!!

if you run:

set linesize 121

set echo on

drop table t;

create table t

select * from all_objects;

alter table t add constraint t_pk primary key(object_id);

exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );

set autotrace traceonly

select distinct owner, object_name, object_type from t;

select owner, object_name, object_type from t group by owner, object_name, object_type;

set autotrace off

alter session set sql_trace=true;

set autotrace traceonly

select distinct owner, object_name, object_type from t;

select owner, object_name, object_type from t group by owner, object_name, object_type;

set autotrace off

select distinct owner, object_name, object_type from t

call count cpu elapsed disk query current rows

------- ------ -------- ---------- ---------- ---------- ---------- ----------

Parse 3 0.00 0.00 0 0 0 0

Execute 3 0.00 0.00 0 0 0 0

Fetch 9525 0.92 0.83 0 1992 0 142815

------- ------ -------- ---------- ---------- ---------- ---------- ----------

total 9531 0.92 0.83 0 1992 0 142815

Rows Row Source Operation

------- ---------------------------------------------------

47605 SORT UNIQUE (cr=664 pr=0 pw=0 time=177034 us)

47938 TABLE ACCESS FULL T (cr=664 pr=0 pw=0 time=48087 us)

********************************************************************************

select owner, object_name, object_type from t group by owner, object_name, object_type

call count cpu elapsed disk query current rows

------- ------ -------- ---------- ---------- ---------- ---------- ----------

Parse 3 0.00 0.00 0 0 0 0

Execute 3 0.00 0.00 0 0 0 0

Fetch 9525 0.93 0.85 0 1992 0 142815

------- ------ -------- ---------- ---------- ---------- ---------- ----------

total 9531 0.93 0.85 0 1992 0 142815

Rows Row Source Operation

------- ---------------------------------------------------

47605 SORT GROUP BY (cr=664 pr=0 pw=0 time=166792 us)

47938 TABLE ACCESS FULL T (cr=664 pr=0 pw=0 time=48006 us)

以下是问题的地址：

阅读(1432) | 评论(0) | 转发(0) |

上一篇：ORA-00600([ksires_1])错误

下一篇：TOM 大师谈 CONNECT BY

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6