一、查询要求

Q17语句查询获得比平均供货量的百分之二十还低的小批量订单。对于指定品牌和指定包装类型的零件，决定在一个七年数据库的所有订单中这些订单零件的平均项目数量（过去的和未决的）。如果这些零件中少于平均数20％的订单不再被接纳，那平均一年会损失多少呢？所以此查询可用于计算出如果没有小量订单，平均年收入将损失多少（因为大量商品的货运，将降低管理费用）。

Q17语句的特点是：带有聚集、聚集子查询操作并存的两表连接操作。

二、 Oracle执行

Oracle编写的查询SQL语句如下：

select /*+ parallel(n) */

sum(l_extendedprice) / 7.0 as avg_yearly

from

lineitem,part

where

p_partkey = l_partkey

and p_brand = 'Brand#33'

and p_container = 'LG DRUM'

and l_quantity < (

select

0.2 * avg(l_quantity)

from

lineitem

where

l_partkey = p_partkey

);

其中/*+ parallel(n) */ 是Oracle的并行查询语法，n是并行数。

脚本执行时间，单位：秒

并行数	1	2	4	8	12
Oracle	363	278	230	173	165

三、 SPL优化

这种在子查询中用等值条件与主表关联的情况，都可以转换成JOIN来计算，从而可以利用JOIN的优化技术。

select /*+ parallel(n) */

sum(l_extendedprice) / 7.0 as avg_yearly

from

lineitem,part,

(select l_partkey lp, 0.2*avg(l_quantity) lq

from lineitem

group by l_partkey

) lpq

where

p_partkey = l_partkey

and p_brand = 'Brand#33'

and p_container = 'LG DRUM'

and l_partkey=lp

and l_quantity < lq

这相当于对lineitem表进行两次外键匹配过滤，其中一个外键表是子查询计算出来的中间表lpq，而因为是内连接，lpq中涉及到的partkey也只在part表过滤后的范围内，因此可以复用part表的过滤。

SPL脚本如下：

	A
1	=1
2	=now()
3	>brand="Brand#33"
4	>container="LG DRUM"
5	=file(path+"part.ctx").create().cursor@m(P_PARTKEY;P_BRAND == brand && P_CONTAINER == container;A1).fetch().keys@i(P_PARTKEY)
6	=file(path+"lineitem.ctx").create()
7	=A6.cursor@m(L_PARTKEY,L_QUANTITY;A5.find(L_PARTKEY);A1)
8	=A7.groups@u(L_PARTKEY;avg(L_QUANTITY)*0.2:avg).keys@i(L_PARTKEY)
9	=A6.cursor@m(L_PARTKEY,L_QUANTITY,L_EXTENDEDPRICE;A5.find(L_PARTKEY),L_PARTKEY:A8;A1)
10	=(A9.total(sum(if(L_QUANTITY
11	=now()
12	=interval@s(A2,A11)

脚本执行时间，单位：秒

并行数	1	2	4	8	12
Oracle	363	278	230	173	165
SPL组表	185	101	61	39	25

四、进一步优化

在前面的SPL脚本中，A7和A9中对lineitem表按part表外键过滤遍历了两次。如果服务器内存足够，可以把过滤后的结果保存在内存中，无需第二次遍历读数，性能将会提高不少，因为lineitem表数据量很大。

SPL脚本如下：

	A
1	=1
2	=now()
3	>brand="Brand#33"
4	>container="LG DRUM"
5	=file(path+"part.ctx").create().cursor@m(P_PARTKEY;P_BRAND == brand && P_CONTAINER == container;A1).fetch().keys@i(P_PARTKEY)
6	=file(path+"lineitem.ctx").create().cursor@m(L_PARTKEY,L_QUANTITY,L_EXTENDEDPRICE;A5.find(L_PARTKEY);A1)
7	=A6.fetch()
8	=A7.cursor@m(A1)
9	=A8.groups@u(L_PARTKEY;avg(L_QUANTITY)*0.2:avg)
10	=A7.cursor@m(A1).switch@i(L_PARTKEY,A9:L_PARTKEY)
11	=(A10.total(sum(if(L_QUANTITY
12	=now()
13	=interval@s(A2,A12)

A7.cursor@m(A1)利用SPL提供的内存并行计算技术将内存中的序表A7变成多路游标，可以利用并行计算来提速。

A7中取出过滤后的lineitem表数据，A9中对取出的数据集按L_PARTKEY分组计算每种零件的平均订单量的20%，与原数据集连接后，A11中再重复利用此数据集来计算。

脚本执行时间，单位：秒

并行数	1	2	4	8	12
Oracle	363	278	230	173	165
SPL组表(优化前)	185	101	57	39	25
SPL组表(优化后)	94	49	26	18	17

可见，优化后速度有较大的提升。

一、 查询要求

二、 Oracle执行

三、 SPL优化

四、 进一步优化

一、查询要求

四、进一步优化