About me:Oracle ACE pro,optimistic,passionate and harmonious. Focus on ORACLE,MySQL and other database programming,peformance tuning,db design, j2ee,Linux/AIX,Architecture tech,etc
全部博文(173)
分类: Oracle
2013-04-25 11:17:00
发表时间:2011-1-31
博文地址:
这个例子是ORACLE ANSI JOIN,ANSIJOIN是ORACLE 9i新特性,ANSI JOIN与ORACLE原来的JOIN语法不同(外连接用+,全完连接用UNION ALL实现),本例通过对比ANSI JOIN与ORACLE原始JOIN语法的区别。
建立的两个表在LMT管理的表空间下、8K block、1M uniform extent和freelist管理。禁用system statistics(CPU COSTING)。
DROP TABLE t1;
create table t1
as
select
rownum -1 id,
mod(rownum - 1,20) n1,
lpad(rownum - 1,10,'0') v1,
rpad('x',100) padding
from
all_objects
where
rownum <=4000
;
DROP TABLE t2;
create table t2
as
select
rownum -1 id,
mod(rownum - 1,20) n1,
lpad(rownum - 1,10,'0') v1,
rpad('x',100) padding
from
all_objects
where
rownum <=4000
;
create index t1_i1 on t1(id);
create index t2_i1 on t2(id);
begin
dbms_stats.gather_table_stats(
ownname => user,
tabname =>'T1',
estimate_percent => 100,
method_opt => 'for allcolumns size 1'
);
dbms_stats.gather_table_stats(
ownname => user,
tabname =>'T2',
estimate_percent => 100,
method_opt => 'for allcolumns size 1'
);
end;
/
使用ANSI JOIN外连接,语句如下:
select
/*+gather_plan_statistics */
t1.id,
t1.n1,
t1.v1,
t2.n1
from
t1
left join
t2
on
t2.id =t1.n1
and t1.n1 in (7, 11, 13)
where
t1.id = 15
;
ID N1 V1 N1
---------- ---------- -------------------- ----------
15 15 0000000015
dingjun123@ORADB>@displan
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
SQL_ID 9xccg0mu53241, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ t1.id, t1.n1,
t1.v1, t2.n1 from t1 left join t2 on t2.id = t1.n1 and
t1.n1 in (7, 11,13) where t1.id = 15
Plan hash value:2591160116
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 4 |
| 1 | NESTED LOOPS OUTER | | 1 | 1 | 1 |00:00:00.01 | 4 |
| 2 | TABLE ACCESS BY INDEX ROWID |T1 | 1 | 1 | 1 |00:00:00.01 | 4 |
|* 3 | INDEX RANGE SCAN |T1_I1 | 1 | 1 | 1 |00:00:00.01 | 3 |
| 4 | VIEW | | 1 | 1 | 0 |00:00:00.01 | 0 |
|* 5 | FILTER | | 1 | | 0 |00:00:00.01 | 0 |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 0 | 1 | 0 |00:00:00.01 | 0 |
|* 7 | INDEX RANGE SCAN | T2_I1| 0 | 1 | 0 |00:00:00.01 | 0 |
--------------------------------------------------------------------------------------------------
PredicateInformation (identified by operation id):
---------------------------------------------------
3 - access("T1"."ID"=15)
5 -filter(("T1"."N1"=7 OR "T1"."N1"=11 OR"T1"."N1"=13))
7 -access("T2"."ID"="T1"."N1")
filter(("T2"."ID"=7 OR"T2"."ID"=11 OR "T2"."ID"=13))
从执行计划上分析,ANSI JOIN的on条件,在计划里显示为“谓词传递”,将t1.n1的条件传递到了t2.id上。语句的含义是ORACLE会找满足t1.id=15的t1的所有行,并且t1.n1 in (7,11,13)的行的时候t2.id和t1.n1相等,其他均为外连接(t2端补null)。
注意FILTER,FILTER类似于NESTED LOOPS( filter),这里的执行计划并不是6,7先执行,而是会先判断是否满足FILTER条件,如果满足执行6,7,否则不执行。因为t1.id=15返回的t1.n1=15,只返回一行,所以不满足谓词 5 - filter(("T1"."N1"=7OR "T1"."N1"=11 OR "T1"."N1"=13))条件,因此6,7没有执行(通过STARTS执行次数可以看出),实际未访问t2索引。
那么用ORACLE原始语法改写,得到的计划是否等效呢?
select
/*+gather_plan_statistics */
t1.id,
t1.n1,
t1.v1,
t2.n1
from
t1,
t2
WHERE
t1.id =15
ANDt2.ID(+)=CASE WHEN t1.n1 IN (7,11,13) THEN t1.n1 END;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
SQL_ID 0xphawu4apgkx, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ t1.id, t1.n1,
t1.v1, t2.n1 from t1, t2 WHERE t1.id = 15 AND
t2.ID(+)=CASE WHENt1.n1 IN (7,11,13) THEN t1.n1 END
Plan hash value:3024481811
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 4 |
| 1 | NESTED LOOPS OUTER | | 1 | 1 | 1 |00:00:00.01 | 4 |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.01 | 4 |
|* 3 | INDEX RANGE SCAN | T1_I1| 1 | 1 | 1 |00:00:00.01 | 3 |
| 4 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 1 | 0 |00:00:00.01 | 0 |
|* 5 | INDEX RANGE SCAN | T2_I1| 1 | 1 | 0 |00:00:00.01 | 0 |
------------------------------------------------------------------------------------------------
PredicateInformation (identified by operation id):
---------------------------------------------------
3 - access("T1"."ID"=15)
5 -access("T2"."ID"=CASE "T1"."N1" WHEN 7THEN "T1"."N1" WHEN 11 THEN "T1"."N1"
WHEN 13 THEN"T1"."N1" END )
上面没有view,虽然第5步访问索引,但是因为access条件CASE语句返回NULL,所以实际虽然访问了一次索引(STARTS=1),但是未访问任何数据块(BUFFERS=0)。实际效率和ANSI JOIN的一样。
鉴于我先前的经验,ORACLE自己的老语法更易于理解,但是客户可能认为ANSI JOIN好,他们写惯了ANSI JOIN。
通过跟踪10053事件,发现上面语句是用 子查询改写的。
select
/*+gather_plan_statistics */
t1.id,
t1.n1,
t1.v1,
t2.n1
from
t1,
lateral (
(
select
t2.n1
from
t2
where
t1.n1 in (7, 11, 13)
and t2.id = t1.n1
)
)(+) t2
where
t1.id = 15
;
lateral (
*
ERROR at line 9:
ORA-00933: SQL command not properly ended
lateral类似于管道函数table,lateral创建一inline view, 允许内部包含子查询,此子查询还可以引用其他对象,外面可以查对应的列。
上面报错,对于lateral view可以使用22829 event来执行:
dingjun123@ORADB> alter sessionset events '22829 trace name context forever';
执行计划和ANSI JOIN一样,省略。
精彩回复:
1)lewis说了lateral类似于table管道函数,那么使用table函数来模仿:
create or replace type t_array as table of number;
/
SELECT/*+gather_plan_statistics*/t1.ID,t1.n1,t1.v1,t2.COLUMN_VALUE
FROM t1,
TABLE(CAST(MULTISET(
SELECT t2.n1
FROM t2
WHERE t2.ID = t1.n1
AND t1.n1 IN(7,11,13)
) AS t_array)
)(+) t2
WHERE t1.ID =15;
Plan hash value:483134546
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 4 |
| 1 | NESTED LOOPS OUTER | | 1 | 8168 | 1 |00:00:00.01 | 4 |
| 2 | TABLE ACCESS BY INDEX ROWID | T1 | 1 | 1 | 1 |00:00:00.01 | 4 |
|* 3 | INDEX RANGE SCAN |T1_I1 | 1 | 1| 1 |00:00:00.01 | 3 |
| 4 | COLLECTION ITERATOR SUBQUERY FETCH| | 1 | 8168 | 0 |00:00:00.01 | 0 |
|* 5 | FILTER | | 1 | | 0 |00:00:00.01 | 0 |
| 6 | TABLE ACCESS BY INDEX ROWID | T2 | 0 | 1 | 0 |00:00:00.01 | 0 |
|* 7 | INDEX RANGE SCAN |T2_I1 | 0 | 1 | 0 |00:00:00.01 | 0 |
-------------------------------------------------------------------------------------------------------
PredicateInformation (identified by operation id):
---------------------------------------------------
3 - access("T1"."ID"=15)
5 - filter((:B1=7 OR :B2=11 OR :B3=13))
7 -access("T2"."ID"=:B1)
这里的第5行,FILTER不满足,还是没有执行6,7步的。使用table管道函数,一般需要创建一个类型,可以基本类型的nested table,也可以是对象类型的(比如需要返回多列),特别是管道函数包装的inline view可以引用其他from里的表,这个比较牛(普通inline views貌似做不到),还记得标量子查询吗?
标量子查询是返回单行单列的子查询,如果我们要返回单行多列呢?
方法1 构造法:
select dname,
ltrim(substr( data, 1, 30 )) ename,
ltrim(substr(data, 31, 10)) job,
to_date(substr( data, 41 ),'yyyymmddhh24miss') hiredate
from (
select dname,
(selectrpad(ename,30)||rpad(job,10)||to_char(hiredate,'yyyymmddhh24miss')
fromemp
whererownum = 1
andemp.deptno = dept.deptno) data
fromscott.dept
);
方法2 TABLE管道函数:
create or replace type o_emp is object
(
EMPNO NUMBER
,ENAME VARCHAR2(10)
,JOB VARCHAR2(10)
, MGR NUMBER
,HIREDATE DATE
, SAL NUMBER
,COMM NUMBER
, DEPTNO NUMBER
)
/
create or replace type c_emp is table ofo_emp;
/
select *
from scott.dept
,table(cast(multiset(
select * from scott.emp where emp.deptno =dept.deptno
) as c_emp)
);
OK,对于TABLE函数我还想说点,看到了吗??估算的cardinality是8168行,这是TABLE函数的默认基数,经常用TABLE函数处理SQL,可能会遇到默认8168造成不正确的执行计划,这时候可以用undocumenthint:cardinality(table/alias,card)来解决,比如cardinality(t,10)就是让t对象的基数估算为10行,那么可能对走索引和nested loops有好处。
关于table管道函数,可以从oracle pl/sql user's guide and ref上获取详细信息,管道函数是PL/SQL的重要特性,可以在一定情况下提高性能。而且还可以用cast+multiset来实现管道函数在SQL里使用,增强复杂SQL的编写。
2)这人喜欢ANSI JOIN,但是他想用+改写下
select
t1.id,
t1.n1,
t1.v1,
t2.n1
from
t1,
(
select
t2.id,
t2.n1
from
t2
where
t2.id in (7, 11, 13)
) t2
where
t1.id = 15
and t2.id (+) = t1.n1
当然他的结果是对的,但是他的SQL却不是对的,why? 因为他的改写是将t1.n1 in (7,11,13)换成了t2.id,构造了个inline view,这是特别的情况,因为这里存在()谓词传递。
换一下他的就不对了,比如:
select
t1.id,
t1.n1,
t1.v1,
t2.n1
from
t1
left join
t2
on
t2.id = t1.n1
and T1.COLX in (7, 11,13)
where
t1.id = 15
;
这里的colx根本与t2.id没有任何关系,所以按照上面的改写肯定不对,ansi join改写成oraclejoin语法(外连接),对这种情况有点复杂,我们需要理解其含义:选择t1.id=15的t1表数据,为基础行集,然后在t1.colx in(7,11,13)的情况下,t2.id=t1.n1做join,如果不满足此条件,那么t2端为外连接(补空)。
改写有很多方法:DECODE/CASEWHEN…
详细参考我的几篇文章:
http://www.itpub.net/thread-1344746-1-1.html
http://www.itpub.net/thread-1410369-1-1.html
3)ANSI JOIN事实上,大部分情况下ORACLE会内部转为ORACLE JOIN语法,少部分不转换如FULL OUTER JOIN/10gPartitioned Outer Join。
看一个一般外连接转为ORACLE风格JOIN的例子:
SELECT * FROM
scott.emp a
LEFT JOIN
scott.dept b
ON a.ename = 'x'
AND a.deptno = b.deptno;
执行计划
----------------------------------------------------------
Plan hash value: 3387915970
---------------------------------------------------------------------------
| Id | Operation | Name |Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 12 | 708 | 7 (15)| 00:00:01 |
|* 1| HASH JOIN OUTER | | 12 | 708 | 7 (15)| 00:00:01 |
| 2| TABLE ACCESS FULL| EMP | 12| 468 | 3 (0)| 00:00:01 |
| 3| TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified byoperation id):
---------------------------------------------------
1 -access("A"."DEPTNO"="B"."DEPTNO"(+) AND"A"."ENAME"=CASE WHEN
("B"."DEPTNO"(+) IS NOT NULL) THEN 'x' ELSE 'x' END)
看到谓词里ORACLE风格的外连接语法+了吧。事实ORACLE内部就是转为ORACLE风格外连接语法的。还有点,有没有注意到,前面的计划里没有+,通过测试,好像是DISPLAY_CURSOR把+给省略了。
对于FULL JOIN、POJ是新语法,不按照原来的+来做,这个有兴趣的自己测试。
查看计划是个良好的习惯,ANSIJOIN实现复杂JOIN比较简单,因为它的限制更少(可以从SQL文档看到,比如上面的用+语法还要写CASE WHEN或DECODE),如果要改写ANSIJOIN为+的语法,对于一般的,如果不会写,可以先看计划里的条件是怎么转的,然后再写。
点评:
1) 学习了ANSI JOIN与普通JOIN的性能区别以及ANSI JOIN内部计划的转换。
2) 学习了查看详细计划的方式:使用gather_plan_statisticshint来让SQL产生额外的行资源信息starts(当前步骤执行次数)、E-ROWS(估算的cardinality)、A-ROWS(实际的cardinality)、A-TIME(实际执行时间)、buffers(logical read)以及其他额外列,具体参考文档。用DBMS_XPLAN.DISPLAY_CURSOR来显示
SELECT * FROM TABLE(dbms_xplan.display_cursor(NULL,NULL,'allstats last'))。
这个hint在查看详细计划做SQL性能分析的时候很有用。
3) 还学习了个undocument函数lateral(在11g里10053未发现这个函数),这个需要event 22829才能执行。然后简单学习了table函数的使用。
4) 知道了原来ORACLE内部对外连接是会转为+语法来处理的,那么在改写复杂ANSI JOIN的时候,查看计划就很有用了,知道谓词是怎么transform的。
、