Chinaunix首页 | 论坛 | 博客
  • 博客访问: 50349
  • 博文数量: 14
  • 博客积分: 312
  • 博客等级: 一等列兵
  • 技术积分: 105
  • 用 户 组: 普通用户
  • 注册时间: 2010-09-01 10:42
文章分类

全部博文(14)

文章存档

2012年(12)

2010年(2)

我的朋友

分类: DB2/Informix

2010-09-08 14:37:44

l  目的:

      找出何种情况适用in还是exists条件为最优。

l  环境描述:

操作系统:linux

数据库:IBM Informix Dynamic Server Version 10.00.UC3R1

表:

Chen建表模式

create table "informix".chen

  (

    a integer,

    b integer,

    c char(10)

  )  extent size 16 next size 16 lock mode page;

revoke all on "informix".chen from "public";

 

 

 

create index "informix".chenona on "informix".chen (a) using btree

     in datadbs ;

create index "informix".testf on "informix".chen ("informix".testfunction(b))

using btree  in datadbs ;

 

create table "informix".test

  (

    a integer,

    b integer,

    c integer

  )  extent size 16 next size 16 lock mode page;

revoke all on "informix".test from "public";

create index testona on test(a);

Sql语句:

语句in:

Select * from chen where a in (select a from test);

语句exists :

Select * from chen where exists (select a from test where test.a=chen.a);

 

数据库为有日志缓存模式

l  测试过程:

A、外表比内表大只差别在数据量不同测试:

1、小量数据

Chen记录为10000

Test 记录为1000

 

结果输出:

语句in:

QUERY:

------

Select * from chen where a in (select a from test)

 

Estimated Cost: 87

Estimated # of Rows Returned: 1000

 

  1) informix.chen: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.chen.a = ANY

 

    Subquery:

    ---------

    Estimated Cost: 38

    Estimated # of Rows Returned: 1000

 

      1) informix.test: SEQUENTIAL SCAN

 

语句exists

QUERY:

------

Select * from chen where exists (select a from test where test.a=chen.a)

 

Estimated Cost: 666

Estimated # of Rows Returned: 1000

 

  1) informix.test: INDEX PATH  (Skip Duplicate)

 

    (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

 

  2) informix.chen: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.test.a = informix.chen.a

NESTED LOOP JOIN

因为记录量太少没必要做时间的测试。

结论:

In的成本更低

 

2、数据量加大十倍第一次

Chen记录为100000

Test 记录为10000

测试输出

语句in:

QUERY:

------

Select * from chen where a in (select a from test)

 

Estimated Cost: 809

Estimated # of Rows Returned: 10000

 

  1) informix.chen: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.chen.a = ANY

 

    Subquery:

    ---------

    Estimated Cost: 345

    Estimated # of Rows Returned: 10000

 

      1) informix.test: INDEX PATH

 

        (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

 

语句exists

QUERY:

------

Select * from chen where exists (select a from test where test.a=chen.a)

 

Estimated Cost: 12044

Estimated # of Rows Returned: 100000

 

  1) informix.test: INDEX PATH  (Skip Duplicate)

 

    (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

 

  2) informix.chen: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.test.a = informix.chen.a

NESTED LOOP JOIN

因为记录量太少没必要做时间的测试。

 

结论:

数据量增大后仍然是in的效率较高.这里发现了一个变化,在IN语句的子查询里用了索引,而不再是全表扫描。这与TEST表的数据量增加了有关.

3、数据库加大十倍第二次

Chen记录为110W

Test 记录为11W

语句in:

QUERY:

------

Select * from chen where a in (select a from test)

 

Estimated Cost: 9195

Estimated # of Rows Returned: 110000

 

  1) informix.chen: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.chen.a = ANY

 

    Subquery:

    ---------

    Estimated Cost: 3989

    Estimated # of Rows Returned: 110000

 

      1) informix.test: INDEX PATH

 

        (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

 

语句exists:

 

QUERY:

------

Select * from chen where exists (select a from test where test.a=chen.a)

 

Estimated Cost: 90416

Estimated # of Rows Returned: 310750

 

  1) informix.test: INDEX PATH  (Skip Duplicate)

 

    (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

 

  2) informix.chen: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.test.a = informix.chen.a

NESTED LOOP JOIN

In使用的时间:

 

 

real    0m12.392s

user    0m1.600s

sys     0m4.797s

 

exists使用的时间

real    0m10.255s

user    0m1.631s

sys     0m4.709s

结论:

在这一数据量级别依然是IN成本较低。但使用的时间多了1秒多。约为10%

3、数据库加大十倍第三次

Chen记录为1010W

Test 记录为101W

语句in:

QUERY:

------

Select * from chen where a in (select a from test)

 

Estimated Cost: 82140

Estimated # of Rows Returned: 1010000

 

  1) informix.chen: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.chen.a = ANY

 

    Subquery:

    ---------

    Estimated Cost: 35516

    Estimated # of Rows Returned: 1010000

 

      1) informix.test: INDEX PATH

 

        (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

 

语句exists

QUERY:

------

Select * from chen where exists (select a from test where test.a=chen.a)

 

Estimated Cost: 1489105

Estimated # of Rows Returned: 11213197

 

  1) informix.test: INDEX PATH  (Skip Duplicate)

 

    (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

 

  2) informix.chen: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.test.a = informix.chen.a

NESTED LOOP JOIN

In使用的时间:

 

 

1100000 row(s) retrieved.

 

 

Database closed.

 

 

real    1m39.850s

user    0m16.711s

sys     0m49.068s

exists使用的时间:

 

 

1100000 row(s) retrieved.

 

 

Database closed.

 

 

real    1m28.705s

user    0m17.119s

sys     0m50.854s

结论:

在千万行级IN的成本低,但使用的时间比EXISTS略高。差11秒左右,约为10%左右。

b、外表比内表小只差别在数据量不同测试:

语句改为:

语句in

Select * from test where a in (select a from chen);

 

语句exists:

Select * from test where exists (select a from chen where chen.a=test.a);

1Chen记录为1010W

Test 记录为101W

结果

语句in:

QUERY:

------

Select * from test where a in (select a from chen)

 

Estimated Cost: 376981

Estimated # of Rows Returned: 505000

 

  1) informix.test: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.test.a = ANY

 

    Subquery:

    ---------

    Estimated Cost: 355209

    Estimated # of Rows Returned: 10100000

 

      1) informix.chen: INDEX PATH

 

        (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

语句exists

QUERY:

------

Select * from test where exists (select a from chen where chen.a=test.a)

 

Estimated Cost: 1368674

Estimated # of Rows Returned: 11213197

 

  1) informix.test: SEQUENTIAL SCAN

 

  2) informix.chen: INDEX PATH  (First Row)

 

    (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

        Lower Index Filter: informix.chen.a = informix.test.a

NESTED LOOP JOIN  (Semi Join)

In使用的时间

 

real    2m40.604s

user    0m16.573s

sys     0m46.344s

exists使用的时间

 

real    1m28.609s

user    0m16.602s

sys     0m46.801s

结论:

虽然从成本上看是IN的成低本但运行的时间上明显是exists的快很多。

时间上差了112秒。约为90%

 

2

Chen记录为100W

Test 记录为10W

In语句

QUERY:

------

Select * from test where a in (select a from chen)

 

Estimated Cost: 38722

Estimated # of Rows Returned: 50000

 

  1) informix.test: INDEX PATH

 

    (1) Index Keys: a   (Serial, fragments: ALL)

        Lower Index Filter: informix.test.a = ANY

 

    Subquery:

    ---------

    Estimated Cost: 36497

    Estimated # of Rows Returned: 1000000

 

      1) informix.chen: INDEX PATH

 

        (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

Exists语句:

QUERY:

------

Select * from test where exists (select a from chen where chen.a=test.a)

 

Estimated Cost: 71224

Estimated # of Rows Returned: 100000

 

  1) informix.test: SEQUENTIAL SCAN

 

  2) informix.chen: INDEX PATH  (First Row)

 

    (1) Index Keys: a   (Key-Only)  (Serial, fragments: ALL)

        Lower Index Filter: informix.chen.a = informix.test.a

NESTED LOOP JOIN  (Semi Join)

使用的时间

In语句

real    0m29.853s

user    0m1.599s

sys     0m4.659s

exists 语句

 

 

real    0m9.403s

user    0m1.651s

sys     0m4.650s

结论:

EXISTS的执行的快多了,虽然成本较多。

l  总结:

根据以上测试。得出以下结论:

Exists的效率为最高。

不论是外表比内表大,还是外表比内表小。虽然使用的成本较多,但执行的时间短。在外表比内表小时节约的时间更多,约为90%效果也比较明显。

阅读(841) | 评论(2) | 转发(0) |
给主人留下些什么吧!~~

chinaunix网友2010-09-11 10:35:13

很好的, 收藏了 推荐一个博客,提供很多免费软件编程电子书下载: http://free-ebooks.appspot.com

chinaunix网友2010-09-11 10:35:13

很好的, 收藏了 推荐一个博客,提供很多免费软件编程电子书下载: http://free-ebooks.appspot.com