在测试过程中发现 DISTINCT 和 GROUP BY 性能差不多,本打算把下面的话贴给Tom,打算问2个的主要区别,发现上面已经有人问过了,而且TOM也给了例子
Hi tom:
See the following example:
SQL> create table test as select * from all_objects;
Table created.
SQL> set linesize 300
SQL> set autot traceonly
SQL> select distinct object_type from test;
23 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3279290493
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 472K| 5076K| | 3535 (2)| 00:00:43 |
| 1 | HASH UNIQUE | | 472K| 5076K| 18M| 3535 (2)| 00:00:43 |
| 2 | TABLE ACCESS FULL| TEST | 472K| 5076K| | 1448 (1)| 00:00:18 |
-----------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
6535 consistent gets
0 physical reads
0 redo size
1079 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
23 rows processed
SQL> select object_type from test group by object_type;
23 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2408827628
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 472K| 5076K| 1479 (3)| 00:00:18 |
| 1 | HASH GROUP BY | | 472K| 5076K| 1479 (3)| 00:00:18 |
| 2 | TABLE ACCESS FULL| TEST | 472K| 5076K| 1448 (1)| 00:00:18 |
---------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
6535 consistent gets
0 physical reads
0 redo size
1079 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
23 rows processed
The above two SQL does the same thing,but with different execution plan, One is "HASH UNIQUE",and the other is "HASH GROUP BY" 。
From the above ,the second one seems better due to less logical read。
but i run the above two SQL with many time and find the two SQL have the same logical read .
SQL> select distinct object_type from test;
23 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3279290493
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 472K| 5076K| | 3535 (2)| 00:00:43 |
| 1 | HASH UNIQUE | | 472K| 5076K| 18M| 3535 (2)| 00:00:43 |
| 2 | TABLE ACCESS FULL| TEST | 472K| 5076K| | 1448 (1)| 00:00:18 |
-----------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6451 consistent gets
0 physical reads
0 redo size
1079 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
23 rows processed
SQL> select object_type from test group by object_type;
23 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2408827628
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 472K| 5076K| 1479 (3)| 00:00:18 |
| 1 | HASH GROUP BY | | 472K| 5076K| 1479 (3)| 00:00:18 |
| 2 | TABLE ACCESS FULL| TEST | 472K| 5076K| 1448 (1)| 00:00:18 |
---------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6451 consistent gets
0 physical reads
0 redo size
1079 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
23 rows processed
I want to know ,Which one is better for performance?
Thanks in advance.
一下是TOM大师给的例子:
You Asked
Tom,
Just want to know the difference between DISTINCT and GROUP BY in queries where I'm not
using any aggregate functions.
Like for example.
Select emp_no, name from Emp
Group by emo_no, name
And
Select distinct emp_no, name from emp;
Which one is faster and why ?
Thanks
and we said...
they are for all intents and purposes the same... and really easy for you to
evaluate!!
if you run:
set linesize 121
set echo on
drop table t;
create table t
as
select * from all_objects;
alter table t add constraint t_pk primary key(object_id);
exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
set autotrace traceonly
select distinct owner, object_name, object_type from t;
select owner, object_name, object_type from t group by owner, object_name, object_type;
set autotrace off
alter session set sql_trace=true;
set autotrace traceonly
select distinct owner, object_name, object_type from t;
select distinct owner, object_name, object_type from t;
select distinct owner, object_name, object_type from t;
select owner, object_name, object_type from t group by owner, object_name, object_type;
select owner, object_name, object_type from t group by owner, object_name, object_type;
select owner, object_name, object_type from t group by owner, object_name, object_type;
set autotrace off
select distinct owner, object_name, object_type from t
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 3 0.00 0.00 0 0 0 0
Execute 3 0.00 0.00 0 0 0 0
Fetch 9525 0.92 0.83 0 1992 0 142815
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 9531 0.92 0.83 0 1992 0 142815
Rows Row Source Operation
------- ---------------------------------------------------
47605 SORT UNIQUE (cr=664 pr=0 pw=0 time=177034 us)
47938 TABLE ACCESS FULL T (cr=664 pr=0 pw=0 time=48087 us)
********************************************************************************
select owner, object_name, object_type from t group by owner, object_name, object_type
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 3 0.00 0.00 0 0 0 0
Execute 3 0.00 0.00 0 0 0 0
Fetch 9525 0.93 0.85 0 1992 0 142815
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 9531 0.93 0.85 0 1992 0 142815
Rows Row Source Operation
------- ---------------------------------------------------
47605 SORT GROUP BY (cr=664 pr=0 pw=0 time=166792 us)
47938 TABLE ACCESS FULL T (cr=664 pr=0 pw=0 time=48006 us)
以下是问题的地址: