DB2 10: Faster SQL PL and PL/SQL-serge

Serge Rielau (瑞赛奇)

首页　| 　博文目录　| 　关于我

serge_relau

博客访问： 128561
博文数量： 6
博客积分： 205
博客等级：入伍新兵
技术积分： 1304
用户组：普通用户
注册时间： 2012-05-25 13:37

文章分类

全部博文（6）

SQL Server（0）
未分配的博文（6）

文章存档

2012年（6）

我的朋友

相关博文

DB2 10: Faster SQL PL and PL/SQL

分类： DB2/Informix

2012-05-29 14:15:57

Background

In January 2011 I posted Inline SQL PL vs compiled SQL PL.

One of the points I made in this entry was that inline SQL PL is preferable for simple SQL PL logic because it is faster.

This implies of course that compiled SQL PL has some headroom when it comes to performance improvements.
There is always room for improvement just about anywhere, so it can be interesting to talk about what motivated investments in this area.
In this case an IBM partner was enabling an Oracle application to DB2.
Most times we find that applications operate at competitive speeds after moving to DB2 thanks to the law of averages.

So DB2 may be slower for some SQL statements, but it will be faster for others.
As long as the fast statements cancel out the slow statements for all the time critical transactions everything is good.

In this case however the partner ended up with one critical transaction that exposed a performance regression.
Some further analysis revealed that there was a single PL/SQL function which consumed a lot more time in DB2.

A look at the function revealed that the function did not contain a single SQL statement.

All the function did was some sophisticated scanning of strings to mine for patterns.

In this post I will describe the problem and how DB2 10 has fixed it.

Compiled SQL PL in DB2 9.7

Naturally I cannot reveal the partner's source code here.
So I have chosen an artificial test to illustrate the problem.
Some of you may know the "3*a+1" sequence.
I don't even recall when I bumped into it the first time.
But it's easy to understand and code:

Take a whole number n
If n is 1 then the sequence finishes
If n odd then multiply n by 3 and add one. This is the new n
If n is even then divide n by 2. This is the new n
Go back to step 2 until the sequence finishes

The question is whether this sequence is always finite.

The following SQL PL function tries out the first n whole numbers and verifies the sequence reaches 1 in each case.
To ensure we are comparing apples to apples we make this test in DB2 10, but restricting it to DB2 9.,7 technology:

点击(此处)折叠或打开

CALL set_routine_opts('PVM_OPTLEVEL 127');
CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)
RETURNS INTEGER SPECIFIC three_a_plus_one
BEGIN
DECLARE len INTEGER;
DECLARE n, a INTEGER DEFAULT 1;
DECLARE stamp TIMESTAMP;
SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE()));
WHILE n < stop DO
SET a = n;
WHILE a <> 1 DO
IF (a / 2) * 2 = a THEN
SET a = a / 2;
ELSE
SET a = 3 * a + 1;
END IF;
END WHILE;
SET n = n + 1;
END WHILE;
RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);
END
/
VALUES three_a_plus_one(100000);
1
-----------
209

This is the baseline to beat.
But where does time go here?

The SQL PL Profiler described previously in this blog can tell us that:

点击(此处)折叠或打开

LINE EXEC TIME(ACT) CPU_TIME TEXT
---- --------- --------- --------- --------------------------------------------------------------------
1 CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)
2 RETURNS INTEGER SPECIFIC three_a_plus_one
3 BEGIN
4 DECLARE len INTEGER;
5 DECLARE n, a INTEGER DEFAULT 1;
6 DECLARE stamp TIMESTAMP;
7 1 0 0 SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE()));
8 WHILE n < stop DO
9 SET a = n;
10 WHILE a <> 1 DO
11 10753712 46283 64847998 IF (a / 2) * 2 = a THEN
12 SET a = a / 2;
13 ELSE
14 SET a = 3 * a + 1;
15 END IF;
16 END WHILE;
17 SET n = n + 1;
18 END WHILE;
19 1 0 0 RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);
20 END

What this is showing is that several of the statements inside of the loop are seen by DB2 as SQL statements.

Most of the math already happens within the Procedure Virtual Machine (PVM).
For example comparison on line 8 or the division in line 12.
It is interesting to see that about 10 million statements are being executed here.

Now let's take a look at the best case.

Inlined SQL PL

The densest way to drive the logic is to inline the SQL PL.
This is done by adding an ATOMIC keyword to the function.
When this is done DB2 will macro expand the logic into the VALUES statement.

点击(此处)折叠或打开

CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)
RETURNS INTEGER SPECIFIC three_a_plus_one
BEGIN ATOMIC
DECLARE len INTEGER;
DECLARE n, a INTEGER DEFAULT 1;
DECLARE stamp TIMESTAMP;
SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE()));
WHILE n < stop DO
SET a = n;
WHILE a <> 1 DO
IF (a / 2) * 2 = a THEN
SET a = a / 2;
ELSE
SET a = 3 * a + 1;
END IF;
END WHILE;
SET n = n + 1;
END WHILE;
RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);
END
/
VALUES three_a_plus_one(100000);
1
-----------
17

Due to inlining there are no SQL statements by definition.
So I cannot show profiler output here.
Instead we would have to look at the optimizer plan which I will skip to not distract from the goal of the post.
Now, let's take a look at normal DB2 10 operation.

Compiled SQL PL in DB2 10

We reset the "magic switch" we used to place DB2 10 into DB2 9.7 mode and recreate the function without the ATOMIC keyword.

点击(此处)折叠或打开

CALL set_routine_opts('');
CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)
RETURNS INTEGER SPECIFIC three_a_plus_one
BEGIN
DECLARE len INTEGER;
DECLARE n, a INTEGER DEFAULT 1;
DECLARE stamp TIMESTAMP;
SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE()));
WHILE n < stop DO
SET a = n;
WHILE a <> 1 DO
IF (a / 2) * 2 = a THEN
SET a = a / 2;
ELSE
SET a = 3 * a + 1;
END IF;
END WHILE;
SET n = n + 1;
END WHILE;
RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);
END
/
VALUES three_a_plus_one(100000);
1
-----------
9

We have reduced the time 20 fold!

Even better we have beaten inline SQL PL by a factor of 2!

A look at the profiler shows why

点击(此处)折叠或打开

LINE EXEC TIME(ACT) CPU_TIME TEX
---- ---------- ---------- ---------- -------------------------------------------------------------
1 CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)
2 RETURNS INTEGER SPECIFIC three_a_plus_one
3 BEGIN
4 DECLARE len INTEGER;
5 DECLARE n, a INTEGER DEFAULT 1;
6 DECLARE stamp TIMESTAMP;
7 1 0 0 SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE()));
8 WHILE n < stop DO
9 SET a = n;
10 WHILE a <> 1 DO
11 IF (a / 2) * 2 = a THEN
12 SET a = a / 2;
13 ELSE
14 SET a = 3 * a + 1;
15 END IF;
16 END WHILE;
17 SET n = n + 1;
18 END WHILE;
19 1 0 0 RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);
20 END

The only remaining SQL statements are the ones used for the timer.
The reason being that DB2 presently cannot execute expressions locally which are non deterministic.
GENERATE_UNIQUE() falls into this category.

Other limiting factors are global variables or user defined function invocations.

Restoring the Honor of inline SQL PL

Beating inline SQL PL was unexpected.
The reason for this is in the IF THEN ELSE statement.
Inline SQL PL has to emulate that in a relational access plan which is not trivial complex.

If we squeeze that IF out we should restore inline SQL PL to the top:

点击(此处)折叠或打开

CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)
RETURNS INTEGER SPECIFIC three_a_plus_one
BEGIN ATOMIC
DECLARE len INTEGER;
DECLARE n, a INTEGER DEFAULT 1;
DECLARE stamp TIMESTAMP;
SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE()));
WHILE n < stop DO
SET a = n;
WHILE a <> 1 DO
SET a = CASE WHEN (a / 2) * 2 = a
THEN a / 2
ELSE 3 * a + 1 END;
END WHILE;
SET n = n + 1;
END WHILE;
RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);
END
/
VALUES three_a_plus_one(100000);
1
-----------
4

That's much better!
But really we are not comparing apples to apples anymore.
Will the same trick help compiled SQL PL as well

点击(此处)折叠或打开

CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)
RETURNS INTEGER SPECIFIC three_a_plus_one
BEGIN
DECLARE len INTEGER;
DECLARE n, a INTEGER DEFAULT 1;
DECLARE stamp TIMESTAMP;
SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE()));
WHILE n < stop DO
SET a = n;
WHILE a <> 1 DO
SET a = CASE WHEN (a / 2) * 2 = a
THEN a / 2
ELSE 3 * a + 1 END;
END WHILE;
SET n = n + 1;
END WHILE;
RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);
END
/
VALUES three_a_plus_one(100000);
1
-----------
7

That's better as well, but doesn't beat inline SQL PL.
To complete the circle let's try the same routine without the optimizations of DB2 10

点击(此处)折叠或打开

CALL set_routine_opts('PVM_OPTLEVEL 127');
CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)
RETURNS INTEGER SPECIFIC three_a_plus_one
BEGIN
DECLARE len INTEGER;
DECLARE n, a INTEGER DEFAULT 1;
DECLARE stamp TIMESTAMP;
SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE()));
WHILE n < stop DO
SET a = n;
WHILE a <> 1 DO
SET a = CASE WHEN (a / 2) * 2 = a
THEN a / 2
ELSE 3 * a + 1 END;
END WHILE;
SET n = n + 1;
END WHILE;
RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);
END
/
VALUES three_a_plus_one(100000);
1
-----------
211

Roughly the same speed.

So with 210s/7s that's a factor of 30 for a well tuned function.

Conclusion

DB2 10 significantly improves SQL PL (and with that also PL/SQL) performance.
In extreme cases this may be a factor of 30!
As SQL statements such as SELECT or UPDATE are added to a routine this relative increase will drop.
But for logic-heavy routines you can expect solid improvements which reduce the need to resort to inline SQL PL.
Note that what we have not measured here was the overhead of invoking a compiled routine over inlining SQL PL.

阅读(5181) | 评论(0) | 转发(0) |

上一篇：The Lighter Side: Meet the IBM Toronto Lab Night Shift

下一篇：DB2 10: Making the move from Oracle even easier

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6