About me:Oracle ACE pro,optimistic,passionate and harmonious. Focus on ORACLE,MySQL and other database programming,peformance tuning,db design, j2ee,Linux/AIX,Architecture tech,etc
全部博文(173)
分类: Oracle
2020-07-02 16:38:07
有关DBLINK的分布式SQL执行机制及优化方法
分布式查询语句对于远程对象的查询在远程库执行,在远程库可以执行的SQL语句会通过优化器的查询转换,执行的是转换后的语句,然后结果集返回到本地,再与本地表运算。当然,本地操作还是远程操作是相对的,我们可以通过driving_site hint改变主查询计划的执行位置,但是对DML,driving_site是失效的,另外对远程表也可以使用其他hint来控制执行计划。
分布式查询语句中可能有不同远程库的表,优化分布式查询要达到3点目标:
如果能够达到以上3点,一般情况下,分布式查询的效率较高。
优化分布式查询需要从以上3个方面着手。
以下样例中,local_tab
7万多条,remote_big_tab百万条,remote_small_tab
7万多条。
3.1.使用Collocated内联视图
也就是说,SQL要引用不同远程库的表,需要组织好语句结构:将相同库的表放一起组成内联视图,这样ORACLE就很容易知道这个内联视图里的表是在同一远程库作完查询,然后再返回给本地库,这样减少了本地库与远程库的交互次数、传输结果集的数量和次数,从而提高效率。比如查询:
WHERE EXISTS
(SELECT 1 FROM
remote_big_tab@remote b,remote_small_tab@remote c
WHERE b.object_id=c.object_id AND
a.object_type=b.object_type);
执行计划
----------------------------------------------------------
Plan hash value: 49311412
------------------------------------------------------------------------------------------------
| Id | Operation
| Name | Rows | Bytes | Cost
(%CPU)| Time | Inst |IN-OUT|
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 42747 | 4508K| 2152
(4)| 00:00:26 | | |
|* 1 | HASH JOIN
| | 42747 |
4508K| 2152 (4)| 00:00:26 | |
|
| 2 | VIEW
| VW_SQ_1 | 26 | 286 |
1855 (4)| 00:00:23 | |
|
| 3 | REMOTE |
| | |
|
| REMOTE | R->S |
| 4 | TABLE ACCESS FULL|
LOCAL_TAB | 73985 | 7008K| 296 (1)| 00:00:04 |
| |
------------------------------------------------------------------------------------------------
Predicate Information (identified by
operation id):
---------------------------------------------------
1 -
access("A"."OBJECT_TYPE"="ITEM_0")
Remote SQL Information (identified by
operation id):
----------------------------------------------------
3 - EXPLAIN PLAN SET
STATEMENT_ID='PLUS5801659' INTO PLAN_TABLE@! FOR SELECT
DISTINCT
"A2"."OBJECT_TYPE" FROM "REMOTE_BIG_TAB"
"A2","REMOTE_SMALL_TAB" "A1" WHERE
"A2"."OBJECT_ID"="A1"."OBJECT_ID" (accessing 'REMOTE' )
可以看出,在远程库remote上执行的语句是两个远程表关联后,并经过查询转换(全转为大写,自己取了别名A1,A2,ORACLE内部自己构造查询语句SELECT DISTINCT…,之后远程查询结果返回给本地,可以去远程库里查询实际的计划,走的是HASH JOIN。通过以上案例看出,对于一些复杂的查询,对象来源于不同远程库,能够通过SQL改写将相同远程库的表先做JOIN,可以提高效率。
3.2.了解CBO优化器对分布式查询的处理
CBO对分布式查询的处理,也是尽量转为Collocated内联视图,CBO会做如下动作:
1)所有可mergeable的视图会merge。
2 ) CBO会测试Collocated内联视图的query BLOCK。
3 ) 如果可以将相同库的表组合成SQL一起查询,那么就会一起执行。
当然,CBO对分布式查询的处理,可能是不高效的,这时候得用其他的方法,比如使用HINT,改造SQL,改造分布式查询的方法(远程库用视图)等。
特别当分布式查询包含下列情况,CBO可能是不高效的:
1)有分组运算
2)有子查询
3)SQL很复杂
比如下面语句含有子查询:
WHERE b.object_id=c.object_id AND
a.object_type=b.object_type
AND a.object_id IN (SELECT object_id from sub);
执行计划
----------------------------------------------------------
Plan hash value: 252158753
----------------------------------------------------------------------------------------------------------
| Id | Operation
| Name
| Rows | Bytes | Cost (%CPU)| Time | Inst |IN-OUT|
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT
|
| 79M| 20G| 3843 (46)| 00:00:47 |
| |
|* 1 | HASH JOIN
|
| 79M| 20G| 3843
(46)| 00:00:47 | | |
| 2 | REMOTE
| REMOTE_SMALL_TAB | 5320 |
431K| 8 (0)| 00:00:01 | REMOTE | R->S |
|* 3 | HASH JOIN
|
| 172M| 31G| 2978 (31)|
00:00:36 | | |
|* 4 | HASH JOIN
|
| 5260 | 565K| 303 (2)|
00:00:04 | | |
| 5 | SORT UNIQUE
|
| 5320 | 69160 | 5 (0)| 00:00:01 |
| |
| 6 | TABLE
ACCESS FULL| SUB | 5320
| 69160 | 5 (0)| 00:00:01 | |
|
| 7 | TABLE ACCESS
FULL | LOCAL_TAB | 73985 | 7008K| 296
(1)| 00:00:04 | | |
| 8 | REMOTE
| REMOTE_BIG_TAB |
1479K| 119M| 1819 (2)| 00:00:22 | REMOTE | R->S
|
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by
operation id):
---------------------------------------------------
1 - access("B"."OBJECT_ID"="C"."OBJECT_ID")
3 -
access("A"."OBJECT_TYPE"="B"."OBJECT_TYPE")
4 -
access("A"."OBJECT_ID"="OBJECT_ID")
Remote SQL Information (identified by
operation id):
----------------------------------------------------
2 - SELECT "OBJECT_NAME","SUBOBJECT_NAME","OBJECT_ID","DATA_OBJECT_ID","OBJECT_TYPE","CREATED",
"LAST_DDL_TIME","TIMESTAMP","STATUS","TEMPORARY","GENERATED","SECONDARY"
FROM "REMOTE_SMALL_TAB"
"C"
(accessing 'REMOTE' )
8 - SELECT
"OWNER","OBJECT_NAME","SUBOBJECT_NAME","OBJECT_ID","DATA_OBJECT_ID","OBJECT_TYPE","C
REATED","LAST_DDL_TIME","TIMESTAMP","STATUS","TEMPORARY","GENERATED","SECONDARY"
FROM
"REMOTE_BIG_TAB" "B" (accessing 'REMOTE' )
通过计划可以看到REMOTE有两条SQL,两张远程表无法做Collocated inline VIEW运算。
再比如下面的语句,有分组运算:
WHERE b.object_id=c.object_id AND
a.object_type=b.object_type;
执行计划
----------------------------------------------------------
Plan hash value: 2122363341
-----------------------------------------------------------------------------------------------------
| Id | Operation
| Name | Rows | Bytes | Cost
(%CPU)| Time | Inst |IN-OUT|
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 2321K| 431M|
2144 (3)| 00:00:26 | |
|
|* 1 | HASH JOIN
| |
2321K| 431M| 2144 (3)| 00:00:26 |
| |
|* 2 | HASH JOIN
| | 1412 |
135K| 1836 (3)| 00:00:23 | |
|
| 3 | VIEW
|
| 11 | 143 | 9 (12)| 00:00:01 |
| |
| 4 | REMOTE
|
| | |
| |
REMOTE | R->S |
| 5 | REMOTE
| REMOTE_BIG_TAB | 1479K|
119M| 1819 (2)| 00:00:22 | REMOTE | R->S |
| 6 | TABLE ACCESS FULL| LOCAL_TAB
| 73985 | 7008K| 296 (1)| 00:00:04 |
| |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 -
access("A"."OBJECT_TYPE"="B"."OBJECT_TYPE")
2 -
access("B"."OBJECT_ID"="C"."OBJECT_ID")
Remote SQL Information (identified by operation id):
----------------------------------------------------
4 - EXPLAIN PLAN SET STATEMENT_ID='PLUS5801659'
INTO PLAN_TABLE@! FOR SELECT
MAX("A1"."OBJECT_ID") FROM
"REMOTE_SMALL_TAB" "A1" GROUP BY
"A1"."OBJECT_TYPE" (accessing
'REMOTE' )
5 - SELECT
"OWNER","OBJECT_NAME","SUBOBJECT_NAME","OBJECT_ID","DATA_OBJECT_ID","OBJECT_TYP
E","CREATED","LAST_DDL_TIME","TIMESTAMP","STATUS","TEMPORARY","GENERATED","SECONDARY"
FROM
"REMOTE_BIG_TAB"
"B" (accessing 'REMOTE' )
通过计划看出,将远程表进行分组运算后,传输给本地库,然后大表传输给本地库,之后做HASH JOIN,这是不高效的。运行时间:已用时间: 00: 02:
12.22
可以改造分布式查询,手动组织Collocated inline VIEW,在远程库建立view:
AS
SELECT b.* FROM remote_big_tab b,(SELECT
max(object_id) object_id FROM remote_small_tab c GROUP BY c.object_type) c
WHERE b.object_id=c.object_id;
查询改为:
SELECT * FROM local_tab
a,v_remote@remote v WHERE a.object_type=v.object_type;
SQL> SELECT * FROM local_tab a,v_remote@remote v WHERE
a.object_type=v.object_type;
已选择1727104行。
已用时间: 00: 01: 02.81
执行计划
----------------------------------------------------------
Plan hash value: 2216230941
------------------------------------------------------------------------------------------------
| Id | Operation | Name
| Rows | Bytes | Cost (%CPU)| Time | Inst
|IN-OUT|
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
1 | 274 | 299 (2)| 00:00:04 |
| |
|* 1 | HASH JOIN |
| 1 | 274 | 299 (2)|
00:00:04 | | |
| 2 | REMOTE
| V_REMOTE | 1 |
177 | 2 (0)| 00:00:01 | REMOTE | R->S |
| 3 | TABLE ACCESS FULL| LOCAL_TAB | 73985 | 7008K|
296 (1)| 00:00:04 | |
|
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."OBJECT_TYPE"="V"."OBJECT_TYPE")
通过计划可以看出,现在是远程表做整体操作之后才返回到本地了,大大减少远程库查询执行次数以及网络传输量。
SELECT/*+driving_site(b)*/ * FROM local_tab
a,remote_big_tab@remote b,(SELECT max(object_id) object_id
FROM remote_small_tab@remote c GROUP BY c.object_type) c
3.3.使用HINT,特别是driving_site
HINT
对远程表可以使用hint,比如parallel,use_nl,use_hash,FULL等。
driving_site hint能够指定执行计划在远程还是本地做,比如下面使用driving_site(b),那么SQL就是放到远程去执行,也就是原来的远程表就相当于本地表,本地表要传输给remote库,主计划在remote库上执行:
WHERE b.object_id=c.object_id AND
a.object_type=b.object_type;
当然,如果是driving_site(a)那么就是本地驱动的,默认的是本地驱动的。
SQL> SELECT COUNT(*) FROM small_tab_local a,
big_tab_remote@remote b
使用driving_site,特别是本地小结果集,远程大结果集的时候,总体结果集较小,希望计划在远程驱动,这样远程执行完毕,将结果集传输到本地,这样避免大结果集的传输。
例1:小表9998条,大表3169376条记录,远程大表sub_id,acc_id上联合索引
2 WHERE a.sub_id=b.sub_id AND
a.acc_id=b.acc_id;
------------
859
已用时间: 00: 00: 50.76
执行计划
----------------------------------------------------------
Plan hash value: 1507576754
--------------------------------------------------------------------------------------------------------
| Id | Operation
| Name | Rows |
Bytes | Cost (%CPU)| Time | Inst |IN-OUT|
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 1 |
41 | 44 (3)| 00:00:01 |
| |
| 1 | SORT AGGREGATE |
| 1 |
41 | |
| | |
| 2 | MERGE
JOIN |
| 9998 | 400K| 44
(3)| 00:00:01 | | |
| 3 | REMOTE
| BIG_TAB_REMOTE |
6771K| 167M| 26 (0)| 00:00:01 |
REMOTE | R->S |
|* 4 | SORT JOIN
| |
9998 | 146K| 18 (6)| 00:00:01 |
| |
| 5 | TABLE ACCESS FULL|
SMALL_TAB_LOCAL | 9998 | 146K| 17 (0)|
00:00:01 | | |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 -
access("A"."SUB_ID"="B"."SUB_ID" AND
"A"."ACC_ID"="B"."ACC_ID")
filter("A"."ACC_ID"="B"."ACC_ID"
AND "A"."SUB_ID"="B"."SUB_ID")
Remote SQL Information (identified by operation id):
----------------------------------------------------
3 - SELECT
"SUB_ID","ACC_ID" FROM "BIG_TAB_REMOTE"
"B" ORDER BY "SUB_ID","ACC_ID"
(accessing 'REMOTE' )
查询876条数据,耗时50s,显然将大结果集拉到本地做运算是不好的,因为本地表很小,远程大表有索引,如果能在远端执行,并走nl,那么显然效率非常好。使用driving_site hint改造查询如下:
SELECT/*+driving_site(b) ordered use_nl(b)*/ COUNT(*) FROM
small_tab_local a, big_tab_remote@remote b
WHERE a.sub_id=b.sub_id AND a.acc_id=b.acc_id;
计划如下:
---------------------------------------------------------------------------------------------
| Id | Operation
| Name |
Rows | Bytes | Cost | Inst |IN-OUT|
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE|
| 1 |
52 | 10009 | | |
| 1 | SORT AGGREGATE
| |
1 | 52 | |
| |
| 2 | NESTED
LOOPS |
| 681 | 35412 | 10009 |
| |
| 3 | REMOTE
|
| 9998 |
253K| 11 | ! | R->S |
|* 4 | INDEX RANGE SCAN |
IDX_BIG_TAB_REMOTE | 1 | 26 | 1 |
MZT~ | |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 -
access("A2"."SUB_ID"="A1"."SUB_ID"
AND "A2"."ACC_ID"="A1"."ACC_ID")
Remote SQL Information (identified by operation id):
----------------------------------------------------
3 -
SELECT "SUB_ID","ACC_ID" FROM "SMALL_TAB_LOCAL"
"A2" (accessing '!' )
现在主计划是在远端remote上执行的,本地表small_tab_local变成了远程表,ORACLE会将small_tab_local结果集送到远端,只查询了sub_id,acc_id,然后作为驱动表,与远端表做NESTED LOOPS运算,计划里可以看到远端表走索引了,最后将远端执行结果返回给本地。
driving_site hint注意点:
CREATE TABLE test_cnt (cnt NUMBER);
driving_site对dml无效,dml以操作目标表所在库驱动查询。比如下面的driving_site hint失效(其它hint还是有效的)。
INSERT INTO test_cnt
SELECT/*+driving_site(b) ordered use_nl(b)*/
COUNT(*) FROM small_tab_local a, big_tab_remote@remote b
WHERE a.sub_id=b.sub_id AND a.acc_id=b.acc_id;
已用时间: 00: 01: 31.48
执行计划
----------------------------------------------------------
Plan hash value: 259989953
------------------------------------------------------------------------------------------------------------
| Id | Operation
| Name | Rows | Bytes |
Cost (%CPU)| Time | Inst |IN-OUT|
------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT |
| 1 | 41
| 10035 (1)| 00:02:01 | |
|
| 1 | LOAD TABLE CONVENTIONAL | TEST_CNT
| | |
| |
| |
| 2 | SORT AGGREGATE |
| 1 |
41 | |
| | |
| 3 | NESTED LOOPS
|
| 9998 | 400K| 10035
(1)| 00:02:01 | | |
| 4 | TABLE ACCESS FULL | SMALL_TAB_LOCAL |
9998 | 146K| 17 (0)| 00:00:01 |
| |
| 5 | REMOTE
| BIG_TAB_REMOTE | 1 | 26 |
1 (0)| 00:00:01 | REMOTE | R->S |
------------------------------------------------------------------------------------------------------------
Remote SQL Information (identified by operation id):
----------------------------------------------------
5 - SELECT /*+
OPAQUE_TRANSFORM USE_NL ("B") */ "SUB_ID","ACC_ID"
FROM "BIG_TAB_REMOTE" "B"
WHERE :1="SUB_ID"
AND :2="ACC_ID" (accessing 'REMOTE' )
语句执行1分31s,driving_site hint失效,但是后面的NL没有失效,可以从计划中看出类似绑定变量的参数,这实际上是对于每个small_tab_local的结果集的行,将sub_id,acc_id传给远端表big_tab_remote,也就是变量:1,:2,这样本地的表筛选出多少行,远程语句SELECT /*+ OPAQUE_TRANSFORM USE_NL ("B") */ "SUB_ID","ACC_ID" FROM "BIG_TAB_REMOTE" "B"
WHERE :1="SUB_ID" AND :2="ACC_ID" 就执行多少次。
SQL> SELECT
sql_text,executions FROM v$sql WHERE sql_text LIKE '%SELECT /*+ USE_NL
("B") */ "SUB_ID","ACC_ID" FROM
"BIG_TAB_REMOTE"%'
这里本地表9998条,无过滤条件,因此远程表语句运行了9998次,虽然远程查询也是走索引的,但是SQL被执行了9998次,是非常影响性能的。可以去远程库查询下:
2 /
SQL_TEXT
EXECUTIONS
--------------------------------------------------------------------------------
--------------------------------------------------------
SELECT /*+ USE_NL ("B") */
"SUB_ID","ACC_ID" FROM "BIG_TAB_REMOTE"
"B" WHERE :1= 9998
这里driving_site失效,但是后面的nl还有效,远程表执行的次数是small_tab_local表的数量(因为这里没有谓词过滤small_tab_local),可以使用其他hint,比如:
INSERT INTO test_cnt
BEGIN
/
SELECT/*+ordered use_hash(b)*/ COUNT(*) FROM small_tab_local a, big_tab_remote@remote b
WHERE a.sub_id=b.sub_id AND a.acc_id=b.acc_id;
当然效率不一定很好,因为这里由远程驱动效率最好,为了不想driving_site失效,可以使用PL/SQL(这里是只查询数量,如果查询结果集可以使用PL/SQL批处理插入)。
FOR i IN
(SELECT/*+driving_site(b) ordered use_nl(b)*/ COUNT(*) cnt FROM
small_tab_local a, big_tab_remote@remote b
WHERE a.sub_id=b.sub_id AND a.acc_id=b.acc_id)
LOOP
INSERT INTO test_cnt
VALUES(i.cnt);
END LOOP;
COMMIT;
END;
已用时间: 00: 00: 00.89
SELECT *
FROM v_remote WHERE object_id IN (
例2:
查询语句:
SELECT c.object_id FROM c
WHERE c.object_name
IN (SELECT d.object_name FROM d WHERE d.object_id=11)
);
比较慢,返回32行,需要10多秒。其中v_remote是个视图,此视图连接到远程表,其中远程的两张表的object_id都有索引:
CREATE OR
REPLACE VIEW v_remote
AS
SELECT object_name,object_id,object_type
FROM a@remote
UNION ALL
SELECT object_name,object_id,object_type
FROM b@remote;
两表记录数如下:
SQL> SELECT COUNT(*) FROM a;
COUNT(*)
----------
369888
SQL> SELECT COUNT(*) FROM b;
COUNT(*)
----------
5323
c和d是本地表,d.object_id以及c.object_name有索引。单独查询很快,<1s就会返回:
--单独本地语句消耗时间00: 00: 00.01
SQL> SELECT c.object_id FROM c WHERE c.object_name IN (SELECT
d.object_name FROM d WHERE d.object_id=11);
已用时间: 00: 00: 00.01
执行计划
----------------------------------------------------------
Plan hash value: 2528799293
----------------------------------------------------------------------------------------
| Id | Operation
| Name | Rows | Bytes | Cost (%CPU)|
Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT
| | 2 | 94 |
6 (17)| 00:00:01 |
| 1 | NESTED LOOPS
| | |
| |
|
| 2 | NESTED LOOPS
| | 2 | 94 |
6 (17)| 00:00:01 |
| 3 | SORT UNIQUE
| | 1 |
17 | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| D |
1 | 17 | 2 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN
| IDX_D | 1 | | 1
(0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN
| IDX_C | 2 | | 2
(0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID | C |
2 | 60 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("D"."OBJECT_ID"=11)
6 -
access("C"."OBJECT_NAME"="D"."OBJECT_NAME")
--单独远程语句消耗时间 00: 00: 00.06
SQL> SELECT * FROM v_remote WHERE object_id=11;
已选择32行。
已用时间: 00: 00: 00.06
执行计划
----------------------------------------------------------
Plan hash value: 1788691278
--------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)| Inst
|IN-OUT|
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 0
(0)| | |
| 1 | REMOTE
| |
| REMOTE | R->S |
--------------------------------------------------------------
Remote SQL Information (identified by operation id):
----------------------------------------------------
1 - EXPLAIN PLAN SET
STATEMENT_ID='PLUS5821518' INTO PLAN_TABLE@!
FOR SELECT
"A1"."OBJECT_NAME","A1"."OBJECT_ID","A1"."OBJECT_TYPE"
FROM
( (SELECT
"A4"."OBJECT_NAME"
"OBJECT_NAME","A4"."OBJECT_ID"
"OBJECT_ID","A4"."OBJECT_TYPE"
"OBJECT_TYPE" FROM "A" "A4" WHERE
"A4"."OBJECT_ID"=11) UNION ALL (SELECT
"A3"."OBJECT_NAME"
"OBJECT_NAME","A3"."OBJECT_ID" "OBJECT_ID","A3"."OBJECT_TYPE"
"OBJECT_TYPE" FROM "B" "A3" WHERE
"A3"."OBJECT_ID"=11)) "A1" (accessing
'REMOTE' )
--联合查询消耗时间00: 00: 10.95
SQL> SELECT * FROM v_remote WHERE
object_id IN (
2 SELECT c.object_id FROM c
WHERE c.object_name IN (SELECT d.object_name FROM d WHERE d.object_id=11)
3 );
已选择32行。
已用时间: 00: 00: 10.95
执行计划
----------------------------------------------------------
Plan hash value: 2118901120
-------------------------------------------------------------------------------------------------------------
| Id | Operation
| Name
| Rows | Bytes | Cost (%CPU)| Time | Inst
|IN-OUT|
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT
|
| 65 | 6695 | 471 (3)| 00:00:06
| | |
|* 1 | HASH JOIN
|
| 65 | 6695 | 471
(3)| 00:00:06 | | |
| 2 | VIEW
| VW_NSO_1 | 2 | 26 | 6 (17)|
00:00:01 | | |
| 3 | HASH UNIQUE
|
| 2 | 94 |
6 (17)| 00:00:01 | |
|
| 4 | NESTED LOOPS
|
| | |
| |
| |
| 5 | NESTED
LOOPS |
| 2 | 94 | 5
(0)| 00:00:01 | | |
| 6 | TABLE
ACCESS BY INDEX ROWID| D | 1 |
17 | 2 (0)| 00:00:01 |
| |
|* 7 |
INDEX RANGE SCAN | IDX_D
| 1 | | 1 (0)|
00:00:01 | | |
|* 8 | INDEX
RANGE SCAN | IDX_C |
2 | | 2 (0)| 00:00:01 |
| |
| 9 | TABLE
ACCESS BY INDEX ROWID | C | 2 |
60 | 3 (0)| 00:00:01 |
| |
| 10 | VIEW
| V_REMOTE | 375K| 32M| 462 (2)| 00:00:06 |
| |
| 11 | UNION-ALL
|
| | |
| |
| |
| 12 | REMOTE
| A
| 369K| 29M| 454
(2)| 00:00:06 | REMOTE | R->S |
| 13 | REMOTE
| B
| 5323 | 431K| 8
(0)| 00:00:01 | REMOTE | R->S |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by
operation id):
---------------------------------------------------
1 -
access("OBJECT_ID"="OBJECT_ID")
7 -
access("D"."OBJECT_ID"=11)
8 -
access("C"."OBJECT_NAME"="D"."OBJECT_NAME")
Remote SQL Information (identified by
operation id):
----------------------------------------------------
12 - SELECT
"OBJECT_NAME","OBJECT_ID","OBJECT_TYPE" FROM
"A" "A" (accessing 'REMOTE' )
13 - SELECT
"OBJECT_NAME","OBJECT_ID","OBJECT_TYPE" FROM
"B" "B" (accessing 'REMOTE' )
单独查询很快,为什么联合查询就慢了呢?原因在于:
单独执行远程查询 |
本地与远程混合查询 |
直接执行视图,并将OBJECT_ID=11谓词推入到视图中,走索引,最后只将32行结果返回给本地 |
从计划中可以看到,本地查询与远程查询做HASH JOIN,但是访问远程的SQL是没有谓词的,这样必然全表从远程将结果传输到本地,结果集大,所以慢 |
因此,优化此混合查询的语句有多种办法:可以使用PL/SQL拆分本地与远程查询,或者可以使用driving_site hint,将主计划推到远程库去执行,本地的结果集少,推到远程,远程视图走索引,效率高。如下:
--耗时已用时间: 00: 00: 00.08
SQL> SELECT/*+ driving_site(v_remote.a) */ * FROM v_remote WHERE object_id IN (
2 SELECT c.object_id FROM c WHERE
c.object_name IN (SELECT d.object_name FROM d WHERE d.object_id=11
3 );
已选择32行。
已用时间: 00: 00: 00.08
---------------------------------------------------------------------------------------
| Id | Operation
| Name | Rows | Bytes |
Cost | Inst |IN-OUT|
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE |
| 15 | 1425 | 109
| | |
| 1 | NESTED LOOPS
| | 15 |
1425 | 109 | | |
| 2 | SORT UNIQUE
| |
| | |
| |
| 3 | VIEW
| VW_NSO_1 | 3 |
39 | 8 | MZT~ | |
| 4 | REMOTE
|
| |
| | ! | R->S |
| 5 | VIEW
|
| 5 | 410 | 33 |
| |
| 6 | UNION-ALL PARTITION
| | |
| | |
|
| 7 | TABLE ACCESS BY INDEX ROWID| A
| 32 | 960 | 35 |MZT~ |
|
|* 8 | INDEX RANGE SCAN
| IDX_A | 32 | |
3 |MZT~ | |
| 9 | TABLE ACCESS BY INDEX ROWID| B
| 1 | 32 | 2 |
MZT~ | |
|* 10 | INDEX RANGE SCAN
| IDX_B | 1 | |
1 | MZT~ | |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
8 -
access("A6"."OBJECT_ID"="VW_NSO_1"."OBJECT_ID")
10 -
access("A5"."OBJECT_ID"="VW_NSO_1"."OBJECT_ID")
Remote SQL Information (identified by operation id):
----------------------------------------------------
4 - SELECT
/*+ */ "A1"."OBJECT_ID" FROM "D"
"A2","C" "A1" WHERE
"A1"."OBJECT_NAME"="A2"."OBJECT_NAME"
AND "A2"."OBJECT_ID"=11 (accessing '!' )
现在的SQL效率很高,从计划中可以看出,现在计划在远程库上执行,本地的查询传输给远程,并驱动视图查询,视图走索引,然后再将结果返回给本地。另外注意hint 视图的用法是hint(视图名.表名)。
--本地建表和操作表test,driving_site失效
再次说说driving_site失效的情况:
DML,DDL让driving_site失效,driving_site hint会自动被oracle忽略掉,因为此时以目标表所在的库为主计划驱动,相当于driving_site(目标表库)
1)DML,DDL如果是对本地表操作,主计划是在本地做的,远程数据拉到本地,driving_site(remote)失效。
2)DML如果是对远程表DML,主计划是在远程做的,本地数据送到远程,相当于自动driving_site(remote)
CREATE TABLE test AS
SELECT/*+driving_site(v_remote.a)*/ * FROM v_remote
WHERE object_id IN (
SELECT c.object_id FROM c WHERE c.object_name IN
(SELECT d.object_name FROM d WHERE d.object_id=11)
);
--------------------------------------------------------------------------------------------------------------
| Id | Operation
| Name
| Rows | Bytes | Cost (%CPU)| Time | Inst |IN-OUT|
--------------------------------------------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT
| |
65 | 6695 | 1267 (1)| 00:00:16 |
| |
| 1 | LOAD AS SELECT
| TEST |
| |
| | |
|
|* 2 | HASH JOIN
|
| 65 | 6695 | 1266 (1)|
00:00:16 | | |
| 3 | VIEW
|
VW_NSO_1 | 2 | 26 | 7 (15)|
00:00:01 | | |
| 4 | HASH UNIQUE
|
| 2 | 94 | 6
(17)| 00:00:01 | | |
| 5 | NESTED LOOPS
|
| | |
| |
| |
| 6 | NESTED LOOPS
|
| 2 | 94 | 5 (0)|
00:00:01 | | |
| 7 | TABLE ACCESS BY
INDEX ROWID| D | 1 | 17
| 2 (0)| 00:00:01 | |
|
|* 8 | INDEX RANGE
SCAN | IDX_D | 1
| | 1 (0)| 00:00:01 |
| |
|* 9 | INDEX RANGE
SCAN | IDX_C |
2 | | 2 (0)| 00:00:01 |
| |
| 10 | TABLE ACCESS BY INDEX
ROWID | C | 2 | 60 |
3 (0)| 00:00:01 | |
|
| 11 | VIEW
|
V_REMOTE | 375K| 32M| 1257 (1)| 00:00:16 |
| |
| 12 | UNION-ALL
|
| | |
| |
| |
| 13 |
REMOTE
| A | 369K|
29M| 454 (2)| 00:00:06 | REMOTE | R->S |
| 14 | REMOTE
| B | 5323 | 431K| 8
(0)| 00:00:01 | REMOTE | R->S |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 -
access("OBJECT_ID"="OBJECT_ID")
8 -
access("D"."OBJECT_ID"=11)
9 -
access("C"."OBJECT_NAME"="D"."OBJECT_NAME")
Remote SQL Information (identified by operation id):
----------------------------------------------------
13 -
SELECT /*+
*/ "OBJECT_NAME","OBJECT_ID","OBJECT_TYPE"
FROM "A" "A" (accessing 'REMOTE' )
14 - SELECT
"OBJECT_NAME","OBJECT_ID","OBJECT_TYPE" FROM
"B" "B" (accessing 'REMOTE' )
--远程表操作,自动driving_site(remote)
INSERT INTO xm@remote(
object_name,object_id,object_type)
SELECT * FROM v_remote WHERE object_id IN (
SELECT c.object_id FROM c WHERE c.object_name IN
(SELECT d.object_name FROM d WHERE d.object_id=11)
);
---------------------------------------------------------------------------------------
| Id | Operation
| Name | Rows | Bytes |
Cost | Inst |IN-OUT|
---------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT REMOTE |
| 15 | 1425 | 109
| | |
| 1 | NESTED LOOPS
| | 15 |
1425 | 109 | | |
| 2 | SORT UNIQUE
| |
| | |
| |
| 3 | VIEW
| VW_NSO_1 | 3 |
39 | 8 | MZT~ | |
| 4 | REMOTE
| |
| | |
! | R->S |
| 5 | VIEW
|
| 5 | 410 | 33 |
| |
| 6 | UNION-ALL PARTITION
| | |
| | |
|
| 7 | TABLE ACCESS BY INDEX ROWID|
A | 32 | 960 | 35
| MZT~ |
|
|* 8 | INDEX RANGE SCAN
| IDX_A | 32 |
| 3 | MZT~ |
|
| 9 | TABLE ACCESS BY INDEX ROWID|
B | 1 | 32 |
2 | MZT~ |
|
|* 10 | INDEX RANGE SCAN
| IDX_B | 1 |
| 1 | MZT~ | |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
8 -
access("A7"."OBJECT_ID"="VW_NSO_1"."OBJECT_ID")
10 -
access("A6"."OBJECT_ID"="VW_NSO_1"."OBJECT_ID")
Remote SQL Information (identified by operation id):
----------------------------------------------------
4 - SELECT /*+ */
"A1"."OBJECT_ID" FROM "D"
"A2","C" "A1" WHERE
"A1"."OBJECT_NAME"="A2"."OBJECT_NAME"
AND "A2"."OBJECT_ID"=11 (accessing '!' )