分类: DB2/Informix
2010-09-08 14:37:44
l 目的:
找出何种情况适用in还是exists条件为最优。
l 环境描述:
操作系统:linux
数据库:IBM Informix Dynamic Server Version 10.00.UC3R1
表:
Chen建表模式
create table "informix".chen
(
a integer,
b integer,
c char(10)
) extent size 16 next size 16 lock mode page;
revoke all on "informix".chen from "public";
create index "informix".chenona on "informix".chen (a) using btree
in datadbs ;
create index "informix".testf on "informix".chen ("informix".testfunction(b))
using btree in datadbs ;
create table "informix".test
(
a integer,
b integer,
c integer
) extent size 16 next size 16 lock mode page;
revoke all on "informix".test from "public";
create index testona on test(a);
Sql语句:
语句in:
Select * from chen where a in (select a from test);
语句exists :
Select * from chen where exists (select a from test where test.a=chen.a);
数据库为有日志缓存模式
l 测试过程:
A、外表比内表大只差别在数据量不同测试:
1、小量数据
Chen记录为10000
Test 记录为1000
结果输出:
语句in:
QUERY:
------
Select * from chen where a in (select a from test)
Estimated Cost: 87
Estimated # of Rows Returned: 1000
1) informix.chen: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.chen.a = ANY
Subquery:
---------
Estimated Cost: 38
Estimated # of Rows Returned: 1000
1) informix.test: SEQUENTIAL SCAN
语句exists
QUERY:
------
Select * from chen where exists (select a from test where test.a=chen.a)
Estimated Cost: 666
Estimated # of Rows Returned: 1000
1) informix.test: INDEX PATH (Skip Duplicate)
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
2) informix.chen: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.test.a = informix.chen.a
NESTED
因为记录量太少没必要做时间的测试。
结论:
In的成本更低
2、数据量加大十倍第一次
Chen记录为100000
Test 记录为10000
测试输出
语句in:
QUERY:
------
Select * from chen where a in (select a from test)
Estimated Cost: 809
Estimated # of Rows Returned: 10000
1) informix.chen: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.chen.a = ANY
Subquery:
---------
Estimated Cost: 345
Estimated # of Rows Returned: 10000
1) informix.test: INDEX PATH
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
语句exists
QUERY:
------
Select * from chen where exists (select a from test where test.a=chen.a)
Estimated Cost: 12044
Estimated # of Rows Returned: 100000
1) informix.test: INDEX PATH (Skip Duplicate)
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
2) informix.chen: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.test.a = informix.chen.a
NESTED
因为记录量太少没必要做时间的测试。
结论:
数据量增大后仍然是in的效率较高.这里发现了一个变化,在IN语句的子查询里用了索引,而不再是全表扫描。这与TEST表的数据量增加了有关.
3、数据库加大十倍第二次
Chen记录为110W
Test 记录为11W
语句in:
QUERY:
------
Select * from chen where a in (select a from test)
Estimated Cost: 9195
Estimated # of Rows Returned: 110000
1) informix.chen: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.chen.a = ANY
Subquery:
---------
Estimated Cost: 3989
Estimated # of Rows Returned: 110000
1) informix.test: INDEX PATH
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
语句exists:
QUERY:
------
Select * from chen where exists (select a from test where test.a=chen.a)
Estimated Cost: 90416
Estimated # of Rows Returned: 310750
1) informix.test: INDEX PATH (Skip Duplicate)
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
2) informix.chen: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.test.a = informix.chen.a
NESTED
In使用的时间:
real
user
sys
exists使用的时间
real
user
sys
结论:
在这一数据量级别依然是IN成本较低。但使用的时间多了1秒多。约为10%。
3、数据库加大十倍第三次
Chen记录为1010W
Test 记录为101W
语句in:
QUERY:
------
Select * from chen where a in (select a from test)
Estimated Cost: 82140
Estimated # of Rows Returned: 1010000
1) informix.chen: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.chen.a = ANY
Subquery:
---------
Estimated Cost: 35516
Estimated # of Rows Returned: 1010000
1) informix.test: INDEX PATH
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
语句exists
QUERY:
------
Select * from chen where exists (select a from test where test.a=chen.a)
Estimated Cost: 1489105
Estimated # of Rows Returned: 11213197
1) informix.test: INDEX PATH (Skip Duplicate)
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
2) informix.chen: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.test.a = informix.chen.a
NESTED
In使用的时间:
1100000 row(s) retrieved.
Database closed.
real
user
sys
exists使用的时间:
1100000 row(s) retrieved.
Database closed.
real
user
sys
结论:
在千万行级IN的成本低,但使用的时间比EXISTS略高。差11秒左右,约为10%左右。
b、外表比内表小只差别在数据量不同测试:
语句改为:
语句in:
Select * from test where a in (select a from chen);
语句exists:
Select * from test where exists (select a from chen where chen.a=test.a);
1、Chen记录为1010W
Test 记录为101W
结果
语句in:
QUERY:
------
Select * from test where a in (select a from chen)
Estimated Cost: 376981
Estimated # of Rows Returned: 505000
1) informix.test: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.test.a = ANY
Subquery:
---------
Estimated Cost: 355209
Estimated # of Rows Returned: 10100000
1) informix.chen: INDEX PATH
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
语句exists
QUERY:
------
Select * from test where exists (select a from chen where chen.a=test.a)
Estimated Cost: 1368674
Estimated # of Rows Returned: 11213197
1) informix.test: SEQUENTIAL SCAN
2) informix.chen: INDEX PATH (First Row)
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
Lower Index Filter: informix.chen.a = informix.test.a
NESTED
In使用的时间
real
user
sys
exists使用的时间
real
user
sys
结论:
虽然从成本上看是IN的成低本但运行的时间上明显是exists的快很多。
时间上差了1分12秒。约为90%。
2、
Chen记录为100W
Test 记录为10W
In语句
QUERY:
------
Select * from test where a in (select a from chen)
Estimated Cost: 38722
Estimated # of Rows Returned: 50000
1) informix.test: INDEX PATH
(1) Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.test.a = ANY
Subquery:
---------
Estimated Cost: 36497
Estimated # of Rows Returned: 1000000
1) informix.chen: INDEX PATH
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
Exists语句:
QUERY:
------
Select * from test where exists (select a from chen where chen.a=test.a)
Estimated Cost: 71224
Estimated # of Rows Returned: 100000
1) informix.test: SEQUENTIAL SCAN
2) informix.chen: INDEX PATH (First Row)
(1) Index Keys: a (Key-Only) (Serial, fragments: ALL)
Lower Index Filter: informix.chen.a = informix.test.a
NESTED
使用的时间
In语句
real
user
sys
exists 语句
real
user
sys
结论:
EXISTS的执行的快多了,虽然成本较多。
l 总结:
根据以上测试。得出以下结论:
Exists的效率为最高。
不论是外表比内表大,还是外表比内表小。虽然使用的成本较多,但执行的时间短。在外表比内表小时节约的时间更多,约为90%效果也比较明显。
chinaunix网友2010-09-11 10:35:13
很好的, 收藏了 推荐一个博客,提供很多免费软件编程电子书下载: http://free-ebooks.appspot.com
chinaunix网友2010-09-11 10:35:13
很好的, 收藏了 推荐一个博客,提供很多免费软件编程电子书下载: http://free-ebooks.appspot.com