Chinaunix首页 | 论坛 | 博客
  • 博客访问: 2957501
  • 博文数量: 199
  • 博客积分: 1400
  • 博客等级: 上尉
  • 技术积分: 4126
  • 用 户 组: 普通用户
  • 注册时间: 2008-07-06 19:06
个人简介

半个PostgreSQL DBA,热衷于数据库相关的技术。我的ppt分享https://pan.baidu.com/s/1eRQsdAa https://github.com/chenhuajun https://chenhuajun.github.io

文章分类

全部博文(199)

文章存档

2020年(5)

2019年(1)

2018年(12)

2017年(23)

2016年(43)

2015年(51)

2014年(27)

2013年(21)

2011年(1)

2010年(4)

2009年(5)

2008年(6)

分类: Mysql/postgreSQL

2014-12-31 23:40:25

今年的PG用户大会,华为的许中清分享了《PostgreSQL表分区实践》(当时我在另一个会场)。其中最后一页的测试结果表明,100分区的继承分区表的查询更新删除比不分区的普通表慢了2个数量级。虽然PG的手册中也说:

主表的所有分区的所有约束在约束排除中被审查,所以大量的分区将大大增加查询规划时间。 分区使用这些技术或许可以将分区提升到一百个且能很好的工作; 不要试图使用成千上万的分区。
摘自:

但是否真慢到这种程度心中还是存了疑问,故测试验证之。

1.PG的分区表概述

先建立一个小的分区表,简单看看分区表相关的操作。
PG的分区表是通过继承实现的,下面建一个有5个分区的分区表。

  1. create table maintb(id int,name char(10));

  2.  create table childtb_1 (CHECK ( id BETWEEN 1 AND 200)) inherits(maintb);
  3.  create table childtb_2 (CHECK ( id BETWEEN 201 AND 400)) inherits(maintb);
  4.  create table childtb_3 (CHECK ( id BETWEEN 401 AND 600)) inherits(maintb);
  5.  create table childtb_4 (CHECK ( id BETWEEN 601 AND 800)) inherits(maintb);
  6.  create table childtb_5 (CHECK ( id BETWEEN 801 AND 1000)) inherits(maintb);

  7.  CREATE INDEX childtb_idx_1 ON childtb_1 (id);
  8.  CREATE INDEX childtb_idx_2 ON childtb_2 (id);
  9.  CREATE INDEX childtb_idx_3 ON childtb_3 (id);
  10.  CREATE INDEX childtb_idx_4 ON childtb_4 (id);
  11.  CREATE INDEX childtb_idx_5 ON childtb_5 (id);

  12.  CREATE OR REPLACE FUNCTION maintb_insert_trigger()
  13.  RETURNS TRIGGER AS $$
  14.  BEGIN
  15.      IF( NEW.id BETWEEN 1 AND 200 ) THEN
  16.          INSERT INTO childtb_1 VALUES (NEW.*);
  17.      ELSIF( NEW.id BETWEEN 201 AND 400 ) THEN
  18.          INSERT INTO childtb_2 VALUES (NEW.*);
  19.      ELSIF( NEW.id BETWEEN 401 AND 600 ) THEN
  20.          INSERT INTO childtb_3 VALUES (NEW.*);
  21.      ELSIF( NEW.id BETWEEN 601 AND 800 ) THEN
  22.          INSERT INTO childtb_4 VALUES (NEW.*);
  23.      ELSIF( NEW.id BETWEEN 801 AND 1000 ) THEN
  24.          INSERT INTO childtb_5 VALUES (NEW.*);
  25.      ELSE
  26.          RAISE EXCEPTION 'id out of range!';
  27.      END IF;
  28.      RETURN NULL;
  29.  END;
  30.  $$
  31.  LANGUAGE plpgsql;

  32.  CREATE TRIGGER insert_maintb_trigger
  33.      BEFORE INSERT ON maintb
  34.      FOR EACH ROW EXECUTE PROCEDURE maintb_insert_trigger();

往父表插入的数据通过触发器被分发到了子表上

  1. tmpdb=# insert into maintb values(1,'abcde12345');
  2. INSERT 0 0
  3. tmpdb=# select * from only maintb;
  4.  id | name
  5. ----+------
  6. (0 rows)
  7. tmpdb=# select * from childtb_1;
  8.  id | name
  9. ----+------------
  10.   1 | abcde12345
  11. (1 row)

对父表的查询更新和删除操作,PG会根据where条件和子表的CHECK约束条件分发到相应的子表上(这称为约束排除,详见PG手册)。

  1. tmpdb=# explain select * from maintb where id=2;
  2.                            QUERY PLAN
  3. ----------------------------------------------------------------
  4.  Append (cost=0.00..4.50 rows=2 width=32)
  5.    -> Seq Scan on maintb (cost=0.00..0.00 rows=1 width=48)
  6.          Filter: (id = 2)
  7.    -> Seq Scan on childtb_1 (cost=0.00..4.50 rows=1 width=15)
  8.          Filter: (id = 2)
  9. (5 rows)

  10. tmpdb=# explain update maintb set name='xx' where id=2;
  11.                            QUERY PLAN
  12. ----------------------------------------------------------------
  13.  Update on maintb (cost=0.00..4.50 rows=2 width=10)
  14.    -> Seq Scan on maintb (cost=0.00..0.00 rows=1 width=10)
  15.          Filter: (id = 2)
  16.    -> Seq Scan on childtb_1 (cost=0.00..4.50 rows=1 width=10)
  17.          Filter: (id = 2)
  18. (5 rows)

  19. tmpdb=# explain delete from maintb where id=2;
  20.                           QUERY PLAN
  21. ---------------------------------------------------------------
  22.  Delete on maintb (cost=0.00..4.50 rows=2 width=6)
  23.    -> Seq Scan on maintb (cost=0.00..0.00 rows=1 width=6)
  24.          Filter: (id = 2)
  25.    -> Seq Scan on childtb_1 (cost=0.00..4.50 rows=1 width=6)
  26.          Filter: (id = 2)
  27. (5 rows)

由于表中的数据量太小,所以上面的执行计划里没有用索引扫描。
如果试图更新分区键,把记录从一个分区更新到另一个分区。对不起,PG会报CHECK约束违法的错误。

  1. tmpdb=# update maintb set id=300 where id=1;
  2. 错误: 关系 "childtb_1" 的新列违反了检查约束 "childtb_1_id_check"
  3. DETAIL: 失败, 行包含(300, abcde12345).

所以分区键是不能随便更新的,如果非要更新,只能先删后插。
PG如果无法确定要操作哪个或哪些子表,就要在所有子表上都执行一遍。

  1. tmpdb=# explain select * from maintb where id+1=2+1;
  2.                             QUERY PLAN
  3. -------------------------------------------------------------------
  4.  Append (cost=0.00..229.00 rows=56 width=16)
  5.    -> Seq Scan on maintb (cost=0.00..0.00 rows=1 width=48)
  6.          Filter: ((id + 1) = 3)
  7.    -> Seq Scan on childtb_1 (cost=0.00..5.00 rows=1 width=15)
  8.          Filter: ((id + 1) = 3)
  9.    -> Seq Scan on childtb_2 (cost=0.00..5.00 rows=1 width=15)
  10.          Filter: ((id + 1) = 3)
  11.    -> Seq Scan on childtb_3 (cost=0.00..5.00 rows=1 width=15)
  12.          Filter: ((id + 1) = 3)
  13.    -> Seq Scan on childtb_4 (cost=0.00..5.00 rows=1 width=15)
  14.          Filter: ((id + 1) = 3)
  15.    -> Seq Scan on childtb_5 (cost=0.00..209.00 rows=51 width=15)
  16.          Filter: ((id + 1) = 3)
  17. (13 rows)

废话讲完,下面开始测试。

2.测试环境

测试环境为个人PC上的VMware虚拟机
PC
 CPU:Intel Core i5-3470 3.2G(4核)
 MEM:6GB
 SSD:OCZ-VERTEX4 128GB(VMware虚拟机所在磁盘,非系统盘)
 OS:Win7

VMware虚拟机
 CPU:4核
 MEM:1GB
 OS:CentOS 6.5
 PG:PostgreSQL 9.3.4(shared_buffers = 128MB,其他是默认值)


3.测试

3.1测试方法

1,仿照"PG的分区表概述"中的表定义分别建0,1,10,100和1000 5个不同分区数目的数据库。
2,使用批量insert对每个数据库都插入1000w条记录。
3,用pgbench分别以1,10和100并发执行单条记录的insert,select,update和delete,记录平均SQL延迟。
   对分区表的select,update和delete的测试分为覆盖所有分区和只操作1个分区两种情况。

3.2 建数据库

5个数据库分别命名如下:
db0:    0个分区(普通表)
db1:    1个分区
db10:   10个分区
db100:  100个分区
db1000: 1000个分区


为简化建表工作,创建一个用于生成分区表建表SQL的SQL脚本(有点绕口)
createsql.sql:

点击(此处)折叠或打开

  1. \pset linestyle old-ascii
  2. \t on

  3. select 'create sequence seq1;';
  4. select 'create table maintb(id int,name char(10));';

  5. select 'create table childtb_' || id ||' (CHECK ( id BETWEEN ' || (id-1)*(:total/:pnum)+1 ||' AND ' || (id)*(:total/:pnum) || ')) inherits(maintb);' from (select generate_series(1,:pnum)) ids(id);

  6. select 'CREATE INDEX childtb_idx_'|| id || ' ON childtb_'|| id ||' (id);' from (select generate_series(1,:pnum)) ids(id);

  7. select 'CREATE OR REPLACE FUNCTION maintb_insert_trigger()
  8. RETURNS TRIGGER AS $$
  9. BEGIN';

  10. select CASE WHEN id>1 THEN ' ELS' ELSE ' ' END || 'IF( NEW.id BETWEEN ' || (id-1)*(:total/:pnum)+1 ||' AND ' || (id)*(:total/:pnum) || ' ) THEN
  11.         INSERT INTO childtb_'|| id || ' VALUES (NEW.*); ' from (select generate_series(1,:pnum)) ids(id);

  12. select' ELSE
  13.         RAISE EXCEPTION ''id out of range!'';
  14.     END IF;
  15.     RETURN NULL;
  16. END;
  17. $$
  18. LANGUAGE plpgsql;';

  19. select 'CREATE TRIGGER insert_maintb_trigger
  20.     BEFORE INSERT ON maintb
  21.     FOR EACH ROW EXECUTE PROCEDURE maintb_insert_trigger();';

创建db0

  1. -bash-4.1$ createdb db0
  2. -bash-4.1$ psql db0
  3. psql (9.3.4)
  4. Type "help" for help.

  5. db0=# create sequence seq1;
  6. CREATE SEQUENCE
  7. db0=# create table maintb(id int,name char(10));
  8. CREATE TABLE
  9. db0=# CREATE INDEX maintb_idx ON maintb (id);
  10. CREATE INDEX

执行下面的命令,创建db1,db10,db100和db1000

  1. createdb db1
  2. psql -q -v total=10000000 -v pnum=1 -f createsql.sql |psql db1

  3. createdb db10
  4. psql -q -v total=10000000 -v pnum=10 -f createsql.sql |psql db10

  5. createdb db100
  6. psql -q -v total=10000000 -v pnum=100 -f createsql.sql |psql db100

  7. createdb db1000
  8. psql -q -v total=10000000 -v pnum=1000 -f createsql.sql |psql db1000


3.3批量插入数据

对每个db分别用下面的SQL批量插入1000w条数据
insert into maintb select id,'abcde12345' from (select generate_series(1,10000000))ids(id);

点击(此处)折叠或打开

  1. db0=# insert into maintb select id,'abcde12345' from (select generate_series(1,10000000))ids(id);
  2. INSERT 0 10000000
  3. Time: 33521.119 ms
  4. db0=# analyze;
  5. ANALYZE
  6. Time: 637.394 ms

  7. db0=# \c db1
  8. You are now connected to database "db1" as user "postgres".
  9. db1=# insert into maintb select id,'abcde12345' from (select generate_series(1,10000000))ids(id);
  10. INSERT 0 0
  11. Time: 230640.841 ms
  12. db1=# analyze;
  13. ANALYZE
  14. Time: 3354.329 ms

  15. db1=# \c db10
  16. You are now connected to database "db10" as user "postgres".
  17. db10=# insert into maintb select id,'abcde12345' from (select generate_series(1,10000000))ids(id);
  18. INSERT 0 0
  19. Time: 257349.510 ms
  20. db10=# analyze;
  21. ANALYZE
  22. Time: 3979.994 ms

  23. db10=# \c db100
  24. You are now connected to database "db100" as user "postgres".
  25. db100=# insert into maintb select id,'abcde12345' from (select generate_series(1,10000000))ids(id);
  26. INSERT 0 0
  27. Time: 602719.217 ms
  28. db100=# analyze;
  29. ANALYZE
  30. Time: 6750.144 ms

  31. db100=# \c db1000
  32. You are now connected to database "db1000" as user "postgres".
  33. db1000=# insert into maintb select id,'abcde12345' from (select generate_series(1,10000000))ids(id);
  34. INSERT 0 0
  35. Time: 7203528.417 ms
  36. db1000=# analyze;
  37. ANALYZE
  38. Time: 12512.537 ms

分区表的插入要执行触发器,所以比普通表要慢。从上面的数据可以看出:100以下分区时,执行时间随分区数的增长还比较缓慢;但当分区数为1000时,批量插入的时间已经变态到100分区时的12倍了。

3.4全表扫描

简单做个需要全表扫描的查询。下面的查询约束排除和索引都失效,走了全表扫描。因为大部分时间花在表扫描上,不关分区的事,所以执行时间相差不大。


点击(此处)折叠或打开

  1. db0=# select * from maintb where id+1=2;
  2.  id | name
  3. ----+------------
  4.   1 | abcde12345
  5. (1 row)

  6. Time: 2569.647 ms
  7. db1000=# \c db1
  8. You are now connected to database "db1" as user "postgres".
  9. db1=# select * from maintb where id+1=2;
  10.  id | name
  11. ----+------------
  12.   1 | abcde12345
  13. (1 row)

  14. Time: 2479.952 ms
  15. db1=# \c db10
  16. You are now connected to database "db10" as user "postgres".
  17. db10=# select * from maintb where id+1=2;
  18.  id | name
  19. ----+------------
  20.   1 | abcde12345
  21. (1 row)

  22. Time: 2384.198 ms
  23. db10=# \c db100
  24. You are now connected to database "db100" as user "postgres".
  25. db100=# select * from maintb where id+1=2;
  26.  id | name
  27. ----+------------
  28.   1 | abcde12345
  29. (1 row)

  30. Time: 2474.023 ms
  31. db0=# \c db1000
  32. You are now connected to database "db1000" as user "postgres".
  33. db1000=# select * from maintb where id+1=2;
  34.  id | name
  35. ----+------------
  36.   1 | abcde12345
  37. (1 row)

  38. Time: 4065.615 ms

1000个分区时花的时间比较长,应该和文件系统缓存有关。连续执行2次,可以发现第二次执行时,1000个分区和不分区的执行时间差不多。

点击(此处)折叠或打开

  1. b0=# select * from maintb where id+1=2;
  2.  id | name
  3. ----+------------
  4.   1 | abcde12345
  5. (1 row)

  6. Time: 2461.604 ms
  7. db0=# select * from maintb where id+1=2;
  8.  id | name
  9. ----+------------
  10.   1 | abcde12345
  11. (1 row)

  12. Time: 1436.271 ms
  13. db0=# \c db1000
  14. You are now connected to database "db1000" as user "postgres".
  15. db1000=# select * from maintb where id+1=2;
  16.  id | name
  17. ----+------------
  18.   1 | abcde12345
  19. (1 row)

  20. Time: 3977.888 ms
  21. db1000=# select * from maintb where id+1=2;
  22.  id | name
  23. ----+------------
  24.   1 | abcde12345
  25. (1 row)

  26. Time: 1400.637 ms

3.5 并发SQL测试

准备以下pgbench的SQL脚本文件
insert.sql:
  1. \setrandom id 1 10000000
  2. insert into maintb values(:id,'abcd12345');

insert_smallrange.sql:
  1. \setrandom id 1 10000
  2. insert into maintb values(:id,'abcd12345');

select.sql:
  1. \setrandom id 1 10000000
  2. select * from maintb where id=:id;

select_smallrange.sql:
  1. \setrandom id 1 10000
  2. select * from maintb where id=:id;

update.sql:
  1. \setrandom id 1 10000000
  2. update maintb set name = 'aaaaa12345' where id=:id;

update_smallrange.sql:
  1. \setrandom id 1 10000
  2. update maintb set name = 'aaaaa12345' where id=:id;

delete.sql:
  1. \setrandom id 1 10000000
  2. delete from maintb where id=:id;

delete_smallrange.sql:
  1. \setrandom id 1 10000
  2. delete from maintb where id=:id;

然后用pgbench对不同数据库进行不同并发数的测试,比如对db0,测试insert的平均执行时间。

点击(此处)折叠或打开

  1. -bash-4.1$ pgbench -n -r -c 1 -j 1 -T 2 -f insert.sql db0
  2. transaction type: Custom query
  3. scaling factor: 1
  4. query mode: simple
  5. number of clients: 1
  6. number of threads: 1
  7. duration: 2 s
  8. number of transactions actually processed: 7282
  9. tps = 3640.288324 (including connections establishing)
  10. tps = 3647.901830 (excluding connections establishing)
  11. statement latencies in milliseconds:
  12.     0.002014    \setrandom id 1 10000000
  13.     0.270468    insert into maintb values(:id,'abcd12345');

  14. -bash-4.1$ pgbench -n -r -c 10 -j 10 -T 2 -f insert.sql db0
  15. transaction type: Custom query
  16. scaling factor: 1
  17. query mode: simple
  18. number of clients: 10
  19. number of threads: 10
  20. duration: 2 s
  21. number of transactions actually processed: 32075
  22. tps = 15772.769831 (including connections establishing)
  23. tps = 15932.463640 (excluding connections establishing)
  24. statement latencies in milliseconds:
  25.     0.002050    \setrandom id 1 10000000
  26.     0.621524    insert into maintb values(:id,'abcd12345');

  27. -bash-4.1$ pgbench -n -r -c 100 -j 100 -T 2 -f insert.sql db0
  28. transaction type: Custom query
  29. scaling factor: 1
  30. query mode: simple
  31. number of clients: 100
  32. number of threads: 100
  33. duration: 2 s
  34. number of transactions actually processed: 25692
  35. tps = 12085.876901 (including connections establishing)
  36. tps = 22583.213366 (excluding connections establishing)
  37. statement latencies in milliseconds:
  38.     0.001861    \setrandom id 1 10000000
  39.     4.046697    insert into maintb values(:id,'abcd12345');

4.测试结果

1个并发(单位毫秒)
id范 SQL 分区数
0 1 10 100 1000
1~10000000 insert into maintb values(:id,'abcd12345'); 0.34568 0.809927 0.910337 0.967627 4.190777
select * from maintb where id=:id; 0.14852 0.789401 0.90642 2.091848 38.66256
update maintb set name = 'aaaaa12345' where id=:id; 0.48688 1.17653 1.493839 6.046276 598.335
delete from maintb where id=:id; 0.43789 1.110788 1.235515 5.861994 589.4893
1~10000 insert into maintb values(:id,'abcd12345'); 0.26089 0.288702 0.297765 0.295104 0.313431
select * from maintb where id=:id; 0.07621 0.109531 0.208925 1.237369 31.09006
update maintb set name = 'aaaaa12345' where id=:id; 0.20382 0.378703 0.568731 5.161718 590.0673
delete from maintb where id=:id; 0.13484 0.264828 0.464408 5.006479 586.6543


10个并发(单位毫秒)
id范 SQL 分区数
0 1 10 100 1000
1~10000000 insert into maintb values(:id,'abcd12345'); 1.14853 1.563754 1.661851 2.385968 10.7074
select * from maintb where id=:id; 0.5432 1.274031 1.524999 5.950416 111.1581
update maintb set name = 'aaaaa12345' where id=:id; 1.50849 2.740678 3.094661 23.89209 出错(*3)
delete from maintb where id=:id; 1.59584 1.80079 2.727167 22.79166 出错(*3)
1~10000 insert into maintb values(:id,'abcd12345'); 0.6038 0.779655 0.673587 0.662618 0.789707
select * from maintb where id=:id; 0.22318 0.316221 0.597139 4.822255 117.1621
update maintb set name = 'aaaaa12345' where id=:id; 0.85966 0.959858 1.739188 20.3759 出错(*3)
delete from maintb where id=:id; 0.29249 0.407228 1.158087 20.18293 出错(*3)


100个并发(单位毫秒)
id范 SQL 分区数
0 1 10 100 1000
1~10000000 insert into maintb values(:id,'abcd12345'); 6.77161 9.762775 11.93486 21.35377 1037.091
select * from maintb where id=:id; 9.01432 10.91613 17.37906 87.52062 5919.649(*2)
update maintb set name = 'aaaaa12345' where id=:id; 16.0372 21.10411 29.61658 380.3574 出错(*3)
delete from maintb where id=:id; 11.3606 13.64317 28.92108 345.2502 出错(*3)
1~10000 insert into maintb values(:id,'abcd12345'); 5.12748 6.519101 6.270275 6.555714 8.49643
select * from maintb where id=:id; 2.40127(*1) 3.226115 6.332551 71.98606 6258.338(*2)
update maintb set name = 'aaaaa12345' where id=:id; 2.40123(*1)
8.497982 15.57208 368.961 出错(*3)
delete from maintb where id=:id; 2.79931 3.985874 11.96576 289.3604 (*3)


*1)在db0上以100个并发连起来执行小范围的select和update时,有时会发生死锁。
aa.sql:
  1. export jobs=100
  2. export db=db0
  3. pgbench -n -r -c $jobs -j $jobs -T 2 -f select_smallrange.sql $db|tail -1|awk '{print $1}'
  4. pgbench -n -r -c $jobs -j $jobs -T 2 -f update_smallrange.sql $db|tail -1|awk '{print $1}'

  1. -bash-4.1$ export  jobs=100
    -bash-4.1$ export db=db0
    -bash-4.1$ sh aa.sql
  2. 2.368792
    Client 57 aborted in state 1: 错误:  检测到死锁
    DETAIL:  进程14436等待在事务 3145678上的ShareLock; 由进程14470阻塞.
    进程14470等待在事务 3146294上的ShareLock; 由进程14436阻塞.
    HINT:  详细信息请查看服务器日志.
    7.002670
奇怪的是2个pgbench命令分开单独执行不发生过问题。有点怀疑执行update_smallrange.sql时,select_smallrange.sql并没有完全结束,但把-T改成-t依然可能发生,把update和insert放到同一个sql脚本中反而不发生。

*2)在db1000上以100个并发执行select会遭遇资源限制的错误。
出错消息1:
  1. Client 77 aborted in state 1: 错误: 共享内存用尽
  2. HINT: 您可能需要增加参数max_locks_per_transaction.
  3. 警告: 共享内存用尽

出错消息2:

  1. Client 31 aborted in state 1: 错误: 无法打开文件 "base/25376/30776": Too many open files in system
根据消息提示修改postgresql.conf的配置后可以成功执行。
max_locks_per_transaction=64
max_files_per_process = 1000
==》
max_locks_per_transaction=1100
max_files_per_process = 500

*3)在db1000上以10和100个并发执行update,delete老是报错,未能成功。
出错消息:
  1. 警告: 中断联接, 因为其它服务器进程崩溃
  2. DETAIL: Postmaster 命令此服务器进程回滚当前事物并退出, 因为其它服务器进程不正常的退出可能毁坏了共享内存.
  3. HINT: 一会儿你将可以重联接数据库并且重复你的命令.

5.性能分析

插入慢是由于触发器,而且慢的也不算太多,顶多慢2到3倍。有异常的数据是在100并发时往有1000个分区的父表中插入数据,比不分区慢了100多倍。这应该是由于数据被随机写到1000个不同的子表文件中,导致IO的响应很慢。
查询更新和删除慢,是由于查询规划慢,大部分时间都耗在查询规划上,而且分区数越多,慢的越离谱。

点击(此处)折叠或打开

  1. db1000=# update maintb set name = 'aaaaa12345' where id=10000;
  2. UPDATE 9
  3. Time: 610.774 ms
  4. db1000=# explain update maintb set name = 'aaaaa12345' where id=10000;
  5.                                     QUERY PLAN
  6. -----------------------------------------------------------------------------------
  7.  Update on maintb (cost=0.00..44.73 rows=12 width=10)
  8.    -> Seq Scan on maintb (cost=0.00..0.00 rows=1 width=10)
  9.          Filter: (id = 10000)
  10.    -> Bitmap Heap Scan on childtb_1 (cost=4.50..44.73 rows=11 width=10)
  11.          Recheck Cond: (id = 10000)
  12.          -> Bitmap Index Scan on childtb_idx_1 (cost=0.00..4.50 rows=11 width=0)
  13.                Index Cond: (id = 10000)
  14. (7 rows)

  15. Time: 661.814 ms

6.总结

1,正如PG手册上说的,千上万的分区是不太靠谱的。(不论是性能还是稳定性)
2,和许中清他们的测试结果一样,100个分区时,单条记录的select,update和delete操作的执行时间比不分区要慢2个数量级。
3,华为的原生分区方案确实不错,社区版也能提供这个功能就好了。
4,解决PG的继承表在分区较多时执行规划耗时的问题,除了在分区本身上下功夫,还可以通过缓存执行计划达到目的。可惜目前PG的执行计划cache比较弱,很难满足分区表的场景(关于这一点准备之后再写一篇文章详细说明一下)。
5,不要被上面的测试结果吓到,对典型的OLAP复杂查询来说,这点执行规划耗时根本不值得注意,反而处理的总数据量少了,执行性能会大大提升。
6,对OLTP应用则要仔细斟酌一下要不要分区,分多少个区了?显然分区数不宜超过100个,
如果业务是按时间分区,历史数据分区尽量合并减少分区数。

7.参考



http://beigang.iteye.com/blog/1884415
阅读(16157) | 评论(0) | 转发(1) |
给主人留下些什么吧!~~