About the author: half a PostgreSQL DBA with a keen interest in database technology. My slides and other material: https://pan.baidu.com/s/1eRQsdAa https://github.com/chenhuajun https://chenhuajun.github.io
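A simple comparison of query performance in citus, a distributed real-time analytics database

Real-time insert speed alone does not show the value of citus; aggregate query performance matters as well. Below is a simple comparison of query performance between the cluster and a single node.

The environment is the same one used in the earlier insert tests.

Environment

Hardware and software

CentOS release 6.5 x64 physical machines (16C/128G/300GB SSD)
- CPU: 2*8core, 16 cores and 32 threads, Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz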
PostgreSQL 9.6.2
citus 6.1.0
sysbench-1.0.3

Machines

master
- 192.168.0.177
worker (8 nodes)
- 192.168.0.181~192.168.0.188

Installing the software is straightforward; follow the official documentation. The steps are omitted here.

postgresql.conf settings

listen_addresses = '*'
port = 5432
max_connections = 1000
shared_buffers = 32GB
effective_cache_size = 96GB
work_mem = 16MB
maintenance_work_mem = 2GB
min_wal_size = 4GB
max_wal_size = 32GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
shared_preload_libraries = 'citus'
checkpoint_timeout = 60min
wal_level = replica
wal_compression = on
wal_log_hints = on
synchronous_commit = on

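Query SQL

Using oltp_insert.lua from sysbench-1.0.3, 109704106 rows were inserted into the single-node database and into the cluster, and then the following queries were run. The two data sets were loaded separately, so the values returned by the cluster and by the single node differ slightly.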

select count(1) from sbtest1;
select count(1),min(k),max(k),sum(k) from sbtest1;
select count(1),min(k),max(k),sum(k) from sbtest1 where c like '1%';
select id from sbtest1 where c like '1%' order by c offset 100 limit 10;
select id from sbtest1 where c like '1%' order by id offset 100 limit 10;

Note: id is the primary key; there is no index on c.

Queries on the citus cluster

Q1:

dbcitus=# select count(1) from sbtest1;
   count   
-----------
 109704106
(1 row)

Time: 358.383 ms

Q2:

dbcitus=# select count(1),min(k),max(k),sum(k) from sbtest1;
   count   | min  | max  |     sum      
-----------+------+------+--------------
 109704106 | 1054 | 8814 | 549598249903
(1 row)

Time: 537.863 ms

Q3:

dbcitus=# select count(1),min(k),max(k),sum(k) from sbtest1 where c like '1%';
  count   | min  | max  |     sum     
----------+------+------+-------------
 10970072 | 1422 | 8697 | 54957088506
(1 row)

Time: 444.634 ms

Q4:

dbcitus=# select id from sbtest1 where c like '1%' order by c offset 100 limit 10;
     id      
-------------
   660221252
  -156599825
 -1070591685
  1972534048
   273755819
  -322155824
 -1645137219
  1803521703
  1717570691
   469077412
(10 rows)

Time: 502.428 ms

Q5:

dbcitus=# select id from sbtest1 where c like '1%' order by id offset 100 limit 10;
     id      
-------------
 -2147445247
 -2147444600
 -2147444596
 -2147444595
 -2147444054
 -2147443610
 -2147443341
 -2147442900
 -2147442344
 -2147441989
(10 rows)

Time: 69.143 ms

Single-node queries

Q1:

The first run took 919 seconds (this figure looks anomalous).

dbone=# select count(1) from sbtest1;
   count   
-----------
 109704106
(1 row)

Time: 919258.638 ms

The second run took 14 seconds.

dbone=# select count(1) from sbtest1;
   count   
-----------
 109704106
(1 row)

Time: 14003.682 ms

Q2:

dbone=# select count(1),min(k),max(k),sum(k) from sbtest1;
   count   | min | max  |     sum      
-----------+-----+------+--------------
 109704106 | 982 | 8837 | 549600081751
(1 row)

Time: 25070.524 ms

Q3:

dbone=# select count(1),min(k),max(k),sum(k) from sbtest1 where c like '1%';
  count   | min | max  |     sum     
----------+-----+------+-------------
 10967526 | 982 | 8798 | 54945704901
(1 row)

Time: 18543.113 ms

Q4:

dbone=# select id from sbtest1 where c like '1%' order by c offset 100 limit 10;
     id      
-------------
  1737611069
    43961197
  1736349807
 -1126409957
  -814972129
 -1889152976
  1692000262
  1911254584
   104013245
   553339542
(10 rows)

Time: 21845.293 ms

Q5:

dbone=# select id from sbtest1 where c like '1%' order by id offset 100 limit 10;
     id      
-------------
 -2147445924
 -2147445692
 -2147445355
 -2147445018
 -2147444919
 -2147444700
 -2147444202
 -2147444055
 -2147443598
 -2147443359
(10 rows)

Time: 0.642 ms

Single-node parallel queries

With parallel query switched on, up to 8 workers can run at the same time.

dbone=# set max_parallel_workers_per_gather=8;
SET
Time: 0.243 ms
dbone=# explain  select count(1) from sbtest1;
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=3137653.95..3137653.96 rows=1 width=8)
   ->  Gather  (cost=3137653.12..3137653.93 rows=8 width=8)
         Workers Planned: 8
         ->  Partial Aggregate  (cost=3136653.12..3136653.13 rows=1 width=8)
               ->  Parallel Seq Scan on sbtest1  (cost=0.00..3102367.69 rows=13714169 width=0)
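(5 rows)

Time: 0.264 ms

Note: under PostgreSQL's parallel-query policy, the number of parallel workers depends on how many times larger the table is than min_parallel_relation_size (default 8MB); each further factor of 3 adds one worker.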

>24MB : 2
>72MB : 3
>216MB : 4
>648MB : 5
>1.9GB : 6
>5.8GB : 7
>17GB : 8

sbtest1 is 23GB in size, so up to 8 workers are started.

dbone=# \d+
                     List of relations
 Schema |  Name   | Type  |  Owner   | Size  | Description 
--------+---------+-------+----------+-------+-------------
 public | sbtest1 | table | postgres | 23 GB | 
(1 row)

Q1:

dbone=# select count(1) from sbtest1;
   count   
-----------
 109704106
(1 row)

Time: 2313.477 ms

Q2:

dbone=# select count(1),min(k),max(k),sum(k) from sbtest1;
   count   | min | max  |     sum      
-----------+-----+------+--------------
 109704106 | 982 | 8837 | 549600081751
(1 row)

Time: 3734.968 ms

Q3:

dbone=# select count(1),min(k),max(k),sum(k) from sbtest1 where c like '1%';
  count   | min | max  |     sum     
----------+-----+------+-------------
 10967526 | 982 | 8798 | 54945704901
(1 row)

Time: 2664.022 ms

Q4:

dbone=# select id from sbtest1 where c like '1%' order by c offset 100 limit 10;
     id      
-------------
  1737611069
    43961197
  1736349807
 -1126409957
  -814972129
 -1889152976
  1692000262
  1911254584
   104013245
   553339542
(10 rows)

Time: 7073.320 ms

Q5:

Parallel execution was not actually used for this query.

dbone=# select id from sbtest1 where c like '1%' order by id offset 100 limit 10;
     id      
-------------
 -2147445924
 -2147445692
 -2147445355
 -2147445018
 -2147444919
 -2147444700
 -2147444202
 -2147444055
 -2147443598
 -2147443359
(10 rows)

Time: 0.634 ms

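All of the results above were measured with the data already in the OS cache. On a single node a large table may well not be cached, and in that case SQL execution time depends heavily on I/O speed.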
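
To put the single-node timings in perspective, they can be converted into an effective scan rate. The sketch below assumes that the 23 GB reported by \d+ is close enough to the number of bytes a full scan has to read; the function name is illustrative only, and the first-run figure may well include effects other than read I/O.

```python
TABLE_SIZE_GB = 23  # size of sbtest1 as reported by \d+ (rounded by psql)


def scan_rate_mb_per_s(elapsed_ms, table_size_gb=TABLE_SIZE_GB):
    """Effective scan rate in MB/s for a full scan that took elapsed_ms."""
    return table_size_gb * 1024 / (elapsed_ms / 1000.0)


# Elapsed times of "select count(1) from sbtest1" on the single node.
timings_ms = {
    "first run": 919258.638,
    "second run": 14003.682,
    "parallel, 8 workers": 2313.477,
}

for label, ms in timings_ms.items():
    print(f"{label}: {scan_rate_mb_per_s(ms):.0f} MB/s")
```

The second run and the parallel run work out to well over 1 GB/s, which is what one would expect from cached data, while the first run works out to only about 26 MB/s.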

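Summary

The citus cluster's aggregate query performance is far better than the single node's, and also better than the single node with parallel query; however, queries that touch only a small amount of data have noticeably higher latency on the cluster.

Query	Single node (ms)	Single node, parallel (ms)	citus cluster (ms)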
Q1	14003	2313	358
Q2	25070	3734	537
Q3	18543	2664	444
Q4	21845	7073	502
Q5	0.6	0.6	69
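
The ratios behind this table are easy to work out. A small sketch, with the timings copied from the table and a hypothetical helper name:

```python
# Timings in milliseconds, copied from the summary table.
timings_ms = {
    "Q1": {"single": 14003, "parallel": 2313, "citus": 358},
    "Q2": {"single": 25070, "parallel": 3734, "citus": 537},
    "Q3": {"single": 18543, "parallel": 2664, "citus": 444},
    "Q4": {"single": 21845, "parallel": 7073, "citus": 502},
    "Q5": {"single": 0.6, "parallel": 0.6, "citus": 69},
}


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


for query, t in timings_ms.items():
    vs_single = speedup(t["single"], t["citus"])
    vs_parallel = speedup(t["parallel"], t["citus"])
    print(f"{query}: {vs_single:.1f}x vs single node, "
          f"{vs_parallel:.1f}x vs single node parallel")
```

On these numbers the cluster is roughly 39 to 47 times faster than the single node for Q1 to Q4 and roughly 6 to 14 times faster than the single node with parallel query, while Q5 is more than 100 times slower on the cluster.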
