PostgreSQL GiST Indexes, Part 3 - skykiker - ChinaUnix Blog


2.3 Performance Inferences

Based on the GiST full-text search index algorithm described earlier, we can form a rough picture of how the index will perform.

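1) GiST indexes are lossy
The index entries in GiST leaf nodes do not store the actual keyword values. They store only the keywords' Sign values (ARRKEY form) or a signature file built from those Sign values (SIGNKEY form); in other words, a GiST index entry holds a lossily compressed key. The index can therefore return false matches, so for every item found through the index the tuple has to be fetched from the data file and rechecked to determine whether it really matches.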

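2) The more unique lexemes there are, the lower the accuracy of SIGNKEY index entries

Accuracy here means the number of items a given key should match divided by the number of items it actually matches in the index.
A simple estimate is: accuracy = 992 / number of unique lexemes
The manual recommends GiST for cases with fewer than 100,000 unique lexemes, which keeps the accuracy at roughly 1/100 or better.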
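
As a rough illustration of the estimate above, the following Python sketch evaluates 992 / (number of unique lexemes) for a few vocabulary sizes. The function name `signkey_accuracy` and the constant name are invented for this example; 992 is the signature length in bits used throughout this article, and the cap at 1.0 is an added assumption for small vocabularies.

```python
SIGNATURE_BITS = 992  # signature length in bits, as used in this article


def signkey_accuracy(unique_lexemes: int) -> float:
    """Estimated accuracy: true matches / index matches for one keyword."""
    if unique_lexemes <= 0:
        raise ValueError("unique_lexemes must be positive")
    # Cap at 1.0: the simple ratio exceeds 1 when there are fewer
    # lexemes than signature bits.
    return min(1.0, SIGNATURE_BITS / unique_lexemes)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} unique lexemes -> accuracy ~ {signkey_accuracy(n):.4f}")
```

At 100,000 unique lexemes the ratio is 992/100000, which is where the "about 1/100" figure comes from.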

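3) The more lexemes a SIGNKEY index entry contains, the higher its probability of matching
A SIGNKEY index entry with m bits set to 1 matches an arbitrary keyword with probability m/992.
In other words, for any index node, the number of bits set to 1 in the SIGNKEY of the lexeme set it covers determines the probability that the node has to be scanned in a given query.
When there are few lexemes, the number of bits set to 1 is approximately equal to the number of lexemes. As more lexemes are added, the chance that different lexemes map to the same bit grows, so the probability of a node being scanned does not increase linearly with the number of lexemes it contains. See the table below.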

Lexemes contained	Bits set to 1 in the SIGNKEY	Average probability of matching an arbitrary keyword
10	10	1%
20	20	2%
50	49	5%
100	95	10%
200	181	18%
500	393	40%
1000	630	64%

The figures in the table above were computed with the following formula, evaluated as a recursive query.

testdb=# with recursive tx1(n,sign_bits) as
(select 1 n,1.0 sign_bits
union all
select n+1, sign_bits + 1 - sign_bits/992 from tx1 where n<=1000
)
select n,sign_bits,sign_bits/992 scan_probability from tx1 where n in(10,20,50,100,200,500,1000);
n | sign_bits | scan_probability
------+--------------------------+------------------------
10 | 9.95475882520071613263 | 0.01003503913830717352
20 | 19.80962125597813433986 | 0.01996937626610699026
50 | 48.78480462616009595302 | 0.04917823046991945157
100 | 95.17045888698489805581 | 0.09593796258768638917
200 | 181.21045784981729332855 | 0.18267183250989646505
500 | 392.89514681946496009988 | 0.39606365606800903236
1000 | 630.17880533823543738507 | 0.63526089247805991672
(7 rows)
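
The recurrence used in the query, s(n+1) = s(n) + 1 - s(n)/992 with s(1) = 1, has the closed form s(n) = 992 * (1 - (1 - 1/992)^n). The Python sketch below checks that both forms agree; the function names are invented for this example, and the model assumes that each lexeme sets one uniformly random bit.

```python
SIGNATURE_BITS = 992  # signature length in bits, as used in this article


def set_bits_recurrence(n: int, bits: int = SIGNATURE_BITS) -> float:
    """Iterate s(k+1) = s(k) + 1 - s(k)/bits, starting from s(1) = 1."""
    s = 1.0
    for _ in range(n - 1):
        s = s + 1 - s / bits
    return s


def set_bits_closed_form(n: int, bits: int = SIGNATURE_BITS) -> float:
    """Expected number of distinct bits set after n uniform random picks."""
    return bits * (1 - (1 - 1 / bits) ** n)


if __name__ == "__main__":
    for n in (10, 20, 50, 100, 200, 500, 1000):
        s = set_bits_closed_form(n)
        print(f"{n:>5} {s:10.4f} {s / SIGNATURE_BITS:8.4f}")
```

The closed form also makes the saturation visible: as n grows, the expected number of set bits approaches 992 and the scan probability approaches 1.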

I was not entirely sure the formula above is correct, so I also ran a simulation. The results are about the same.

testdb=# select count(DISTINCT sign) from (select (random()*992)::int4 sign from generate_series(1,200))aa;
count
-------
182
(1 row)
testdb=# select count(DISTINCT sign) from (select (random()*992)::int4 sign from generate_series(1,500))aa;
count
-------
397
(1 row)
testdb=# select count(DISTINCT sign) from (select (random()*992)::int4 sign from generate_series(1,1000))aa;
count
-------
643
(1 row)
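
The same simulation can be averaged over many trials outside the database. The sketch below is a minimal Python version with a fixed seed; the function name and the trial count are arbitrary choices made for this example. One small difference from the SQL above: casting a float to int4 in PostgreSQL rounds to the nearest integer, so `(random()*992)::int4` produces 993 possible values (0 to 992), whereas this sketch draws from exactly 992 slots.

```python
import random

SIGNATURE_BITS = 992  # signature length in bits, as used in this article


def simulate_set_bits(n_lexemes: int, trials: int = 200, seed: int = 0) -> float:
    """Average number of distinct bits hit by n_lexemes uniform random picks."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    total = 0
    for _ in range(trials):
        hit = {rng.randrange(SIGNATURE_BITS) for _ in range(n_lexemes)}
        total += len(hit)
    return total / trials


if __name__ == "__main__":
    for n in (200, 500, 1000):
        print(n, round(simulate_set_bits(n), 1))
```

Averaging over repeated trials smooths out the run-to-run variation that a single query like the ones above shows.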

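The figures above assume that the lexemes in a set are random. That holds for SIGNKEY index entries in leaf nodes, but not elsewhere, because when GiST organizes the index it tries to group together lexemes that map to the same bits of the SIGNKEY. To keep the discussion simple, the rest of this article deliberately ignores this.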

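Leaf nodes make up the majority of the index, and a leaf node containing 1,000 or more lexemes can be regarded as a practical limit. At that point each query has to scan 64% or more of the leaf index nodes on average, so the usefulness of the index becomes doubtful.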

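So how can we estimate the average number of lexemes per leaf node?
It is certainly larger than the average number of lexemes per record, and also larger than the total number of lexemes divided by the total number of leaf nodes. One option is to obtain these two values and multiply by some assumed factor, say 5. Under that estimate, once each record is split into 100 or more lexemes on average, a typical query has to scan roughly 40% or more of the leaf index nodes.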
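
The rule of thumb above can be written down as a small Python sketch. The factor of 5 is only the guess made in the text, the function name is invented for this example, and the probability model assumes uniformly random bit assignment.

```python
SIGNATURE_BITS = 992  # signature length in bits, as used in this article


def leaf_scan_probability(avg_lexemes_per_record: float,
                          factor: float = 5.0,
                          bits: int = SIGNATURE_BITS) -> float:
    """Estimated share of leaf nodes scanned per query.

    lexemes per leaf is guessed as avg_lexemes_per_record * factor;
    the factor is a hypothetical value, not something measured.
    """
    lexemes_per_leaf = avg_lexemes_per_record * factor
    return 1 - (1 - 1 / bits) ** lexemes_per_leaf


if __name__ == "__main__":
    for avg in (3, 20, 50, 100, 200):
        print(f"{avg:>4} lexemes/record -> {leaf_scan_probability(avg):.1%}")
```

Changing `factor` shows how sensitive the conclusion is to that guess, which is why the measured numbers in section 2.4 are worth comparing against.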

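However, even if every index node is scanned, using the index may still be faster than a full table scan, for the following reasons.
a) The index may be smaller than the data
b) Matching against index tuples may be cheaper than matching against data tuples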

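However, once the average number of lexemes per record exceeds the threshold for the ARRKEY storage form (probably around 120), the situation changes again.
All index entries in the leaf nodes are then in SIGNKEY form. The index is much smaller than the data, but the probability of having to recheck against the data file rises sharply. With an average of 200 or more lexemes per record, more than 18% of the data records have to be rechecked, at which point the index is essentially useless.
So GiST is at best suited to indexing short pieces of text such as article titles, not entire articles.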

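Note: The efficiency of signature-file indexing is presumably discussed in detail in the academic literature; what I give here is only a superficial analysis.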

2.4 Performance Verification

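A simple performance check was done using the test data from one of my earlier posts. The results are broadly consistent with the description in [2.3 Performance Inferences]; details follow.
http://zzjlzx.blog.chinaunix.net/uid-20726500-id-4824895.html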

1) Environment
CentOS 6.5
PostgreSQL 9.4.0

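2) Test data
I did not look for special test data. I happened to have a copy of the PostgreSQL 9.3 Chinese manual (translation in progress) at hand, so I used that.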

Create the table

create table t1(id serial,c1 text);

Load the data, with each line of the manual becoming one record.

-bash-4.1$ tar xf pg9.3.1.html.tar.gz
-bash-4.1$ find html -name *.html -exec cat {} \;|iconv -f GBK -t UTF8|sed 's\[\]\\g'|psql -p 5433 testdb -c "copy t1(c1) from stdin"

Note: The in-progress translation of the PostgreSQL 9.3 Chinese manual (pg9.3.1.html.tar.gz) can be downloaded from the following location:

Check the data size

testdb=# select count(*) from t1;
count
--------
702476
(1 row)
testdb=# select pg_table_size('t1');
pg_table_size
---------------
37519360
(1 row)

Queries without an index

testdb=# \timing
Timing is on.
testdb=# select * from t1 where to_tsvector('testzhcfg',c1) @@ '重排';
id | c1
--------+----------------------------------------
561249 | >操作可能因为其它的重排列而被引入。</P
(1 row)
Time: 2739.207 ms
testdb=# select * from t1 where c1 like '%重排%';
id | c1
--------+----------------------------------------
561249 | >操作可能因为其它的重排列而被引入。</P
(1 row)
Time: 132.121 ms

Presumably the word segmentation in to_tsvector() is expensive: without an index, the full-text match is about 20 times slower than a plain LIKE match.

Check the total number of lexemes and the average number of lexemes per record

testdb=# select count(*) from (select ts_stat('select to_tsvector(''testzhcfg'', c1) from t1')) aa;
count
-------
30645
(1 row)
testdb=# select avg(length(to_tsvector('testzhcfg', c1))) from t1;
avg
--------------------
2.9314695448670133
(1 row)

Under the earlier assumption, the accuracy of the GiST index is 0.032 (0.032 = 992/30645).

3) Query performance tests

GIN index test:

testdb=# drop index t1_c1_idx_gist;
DROP INDEX
Time: 16.144 ms
testdb=# create index t1_c1_idx_gin on t1 using gin(to_tsvector('testzhcfg',c1));
CREATE INDEX
Time: 4463.896 ms
testdb=# select pg_relation_size('t1_c1_idx_gin');
pg_relation_size
------------------
          8765440
(1 row)
Time: 0.877 ms
testdb=# explain (analyze,buffers) select * from t1 where to_tsvector('testzhcfg',c1) @@ '重排';
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t1 (cost=47.22..4580.24 rows=3512 width=21) (actual time=0.028..0.029 rows=1 loops=1)
   Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''重排'''::tsquery)
   Heap Blocks: exact=1
   Buffers: shared hit=5
   -> Bitmap Index Scan on t1_c1_idx_gin (cost=0.00..46.35 rows=3512 width=0) (actual time=0.019..0.019 rows=1 loops=1)
         Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''重排'''::tsquery)
         Buffers: shared hit=4
Planning time: 0.101 ms
Execution time: 0.064 ms
(9 rows)
Time: 0.637 ms

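Because only one record matches, the ideal number of index pages to scan equals the depth of the index, that is, the leaf page holding the record plus all of its ancestor pages.
GIN scanned only 4 index pages, so the GIN index presumably has 4 levels, and the number of pages scanned matches the ideal value.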

GiST index test:

testdb=# create index t1_c1_idx_gist on t1 using gist(to_tsvector('testzhcfg',c1));
CREATE INDEX
Time: 26349.854 ms
testdb=# select pg_relation_size('t1_c1_idx_gist');
pg_relation_size
------------------
32841728
(1 row)
Time: 5.783 ms
testdb=# explain (analyze,buffers) select * from t1 where to_tsvector('testzhcfg',c1) @@ '重排';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t1 (cost=111.51..4644.52 rows=3512 width=21) (actual time=3.531..3.532 rows=1 loops=1)
Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''重排'''::tsquery)
Heap Blocks: exact=1
Buffers: shared hit=220
-> Bitmap Index Scan on t1_c1_idx_gist (cost=0.00..110.63 rows=3512 width=0) (actual time=3.388..3.388 rows=1 loops=1)
Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''重排'''::tsquery)
Buffers: shared hit=219
Planning time: 0.314 ms
Execution time: 3.574 ms
(9 rows)
Time: 4.861 ms

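GiST scanned 219 index pages, that is, 5.5% of the index pages (5.5% = 219/(32841728/8192)). Working backwards from the earlier assumptions, each leaf node contains roughly 50 lexemes on average.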
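
The "working backwards" step can be made explicit by inverting the scan-probability model p = 1 - (1 - 1/992)^n to get n = ln(1 - p) / ln(1 - 1/992). The Python sketch below does this for the measurement above. The function name is invented for this example, and the calculation treats the fraction of all index pages scanned as if it were the per-leaf scan probability, which is a simplification.

```python
import math

SIGNATURE_BITS = 992  # signature length in bits, as used in this article
PAGE_SIZE = 8192      # page size used in the article's own calculation


def lexemes_for_scan_fraction(fraction: float, bits: int = SIGNATURE_BITS) -> float:
    """Invert p = 1 - (1 - 1/bits)**n to estimate n from a scan fraction p."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    return math.log(1 - fraction) / math.log(1 - 1 / bits)


if __name__ == "__main__":
    index_pages = 32841728 // PAGE_SIZE  # size of t1_c1_idx_gist in pages
    scanned = 219                        # "Buffers: shared hit" of the index scan
    fraction = scanned / index_pages
    print(f"scan fraction: {fraction:.3%}")
    print(f"estimated lexemes per leaf: {lexemes_for_scan_fraction(fraction):.0f}")
```

For this measurement the inversion lands in the 50s, in line with the rough figure of 50 quoted in the text.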

4) GiST index test with 1/5 of the data:
The first 1/5 of the original test data is used for this test.

Create the table and insert the data

testdb=# create table t2 (like t1);
CREATE TABLE
Time: 33.293 ms
testdb=# insert into t2 select * from t1 limit 702476/5;
INSERT 0 140495
Time: 938.910 ms
testdb=# select count(*) from t2;
count
--------
140495
(1 row)
Time: 28.495 ms
testdb=# select pg_relation_size('t2');
pg_relation_size
------------------
7528448
(1 row)
Time: 0.689 ms
testdb=# select * from t2 where to_tsvector('testzhcfg',c1) @@ '验证器';
id | c1
--------+-----------------
115726 | >"验证器"</SPAN
(1 row)
Time: 543.530 ms

The data volume is 1/5 of the original table t1.

Check the total number of lexemes and the average number of lexemes per record

testdb=# select count(*) from (select ts_stat('select to_tsvector(''testzhcfg''::regconfig, c1) from t2'))aa ;
count
-------
13904
(1 row)
Time: 780.354 ms
testdb=# select avg(length(to_tsvector('testzhcfg', c1))) from t2;
avg
--------------------
2.9266593117192783
(1 row)
Time: 665.092 ms

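The total number of lexemes is roughly 1/2 of the original (13904/30645), and the average number of lexemes per record is unchanged.

Create the GiST index and query again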

testdb=# create index t2_c1_idx_gist on t2 using gist(to_tsvector('testzhcfg',c1));
CREATE INDEX
Time: 4709.604 ms
testdb=# select pg_relation_size('t2_c1_idx_gist');
pg_relation_size
------------------
6447104
(1 row)
Time: 0.431 ms
testdb=# select pg_relation_size('t2_c1_idx_gist')/8192;
?column?
----------
787
(1 row)
Time: 0.675 ms
testdb=# explain (analyze,buffers) select * from t2 where to_tsvector('testzhcfg',c1) @@ '验证器';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t2 (cost=21.72..931.18 rows=702 width=21) (actual time=1.036..1.037 rows=1 loops=1)
Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''验证器'''::tsquery)
Heap Blocks: exact=1
Buffers: shared hit=55
-> Bitmap Index Scan on t2_c1_idx_gist (cost=0.00..21.55 rows=702 width=0) (actual time=0.964..0.964 rows=1 loops=1)
Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''验证器'''::tsquery)
Buffers: shared hit=54
Planning time: 0.304 ms
Execution time: 1.123 ms
(9 rows)
Time: 2.235 ms

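GiST scanned 54 index pages, that is, 6.9% of the index pages (6.9% = 54/787). Working backwards from the earlier assumptions, each leaf node contains roughly 70 lexemes on average.
The rise in the share of index pages scanned is probably related to the rise in the ratio "total lexemes / total index pages", which was 7.6 before (30645/(32841728/8192)) and is now 17.7 (13904/787).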

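5) GiST index test after increasing the average number of lexemes per record:
Every 30 records of the original test data are merged into one record for this test.

Create the table and insert the data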

testdb=# create table t3 (like t1);
CREATE TABLE
Time: 8.205 ms
testdb=# insert into t3 select nid,string_agg(c1,' ') from (select id/30 nid,c1 from t1)aa group by nid;
INSERT 0 23418
Time: 1403.488 ms
testdb=# select count(*) from t3;
count
-------
23418
(1 row)
Time: 13.487 ms
testdb=# select pg_relation_size('t3');
pg_relation_size
------------------
13066240
(1 row)
Time: 0.551 ms
testdb=# select * from t3 where c1 like '%重排%';
id |
c1
-------+---------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------
18708 | >子句，则为操作符命名一个选择性限制计算函数(注意这里是一个函数名， CLASS="LITERAL" ><P >如果提供了<TT > >NOT (x = y)</TT >RESTRICT</TT CLASS
="LITERAL" NAME="AEN54157" ><H2 >操作可能因为其它的重排列而被引入。</P >NOT</TT CLASS="LITERAL" CLASS="LITERAL" 这样的表达式简化成<TT >x <> y</
TT >。这样的情况比你想像的要频繁的多， CLASS="SECT2" 因为<TT ><P >否定符对可以用上面交换符对中解释的相同的方法来定义。</P CLASS="SECT2" ><DIV ></DIV
>35.13.3. <TT >RESTRICT</TT ><A ></H2 ></A 而不是一个操作符名)。<TT
(1 row)
Time: 36.146 ms

Check the total number of lexemes and the average number of lexemes per record

testdb=# select count(*) from (select ts_stat('select to_tsvector(''testzhcfg''::regconfig, c1) from t3'))aa ;
count
-------
30634
(1 row)
Time: 3180.648 ms
testdb=# select avg(length(to_tsvector('testzhcfg', c1))) from t3;
avg
---------------------
52.8218464429071654
(1 row)
Time: 2667.861 ms

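The average number of lexemes per record has risen from about 3 to 53. The average number of lexemes per leaf index node can be expected to rise substantially as well.

Create the GiST index and query again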

testdb=# create index t3_c1_idx_gist on t3 using gist(to_tsvector('testzhcfg',c1));
CREATE INDEX
Time: 3672.601 ms
testdb=# select pg_relation_size('t3_c1_idx_gist');
pg_relation_size
------------------
7053312
(1 row)
Time: 0.570 ms
testdb=# select pg_relation_size('t3_c1_idx_gist')/8192;
?column?
----------
861
(1 row)
Time: 0.469 ms
testdb=# analyze ;
ANALYZE
Time: 4000.001 ms
testdb=# explain (analyze,buffers) select * from t3 where to_tsvector('testzhcfg',c1) @@ '重排';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t3 (cost=16.97..305.35 rows=89 width=516) (actual time=15.888..17.608 rows=1 loops=1)
Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''重排'''::tsquery)
Rows Removed by Index Recheck: 14
Heap Blocks: exact=15
Buffers: shared hit=183
-> Bitmap Index Scan on t3_c1_idx_gist (cost=0.00..16.95 rows=89 width=0) (actual time=1.340..1.340 rows=15 loops=1)
Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''重排'''::tsquery)
Buffers: shared hit=168
Planning time: 0.495 ms
Execution time: 17.662 ms
(10 rows)
Time: 20.008 ms

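GiST scanned 168 index pages, that is, 19.5% of the index pages (19.5% = 168/861). Working backwards from the earlier assumptions, each leaf node contains roughly 200 lexemes on average.
Note that the number of rechecked records has gone up: 15 records have to be rechecked, and most of the query time is spent on the recheck.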

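6) GiST index test after increasing the average number of lexemes per record, part 2:
Every 300 records of the original test data are merged into one record for this test.

Create the table and insert the data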

testdb=# create table t4 (like t1);
CREATE TABLE
Time: 10.509 ms
testdb=# insert into t4 select nid,string_agg(c1,' ') from (select id/300 nid,c1 from t1)aa group by nid;
INSERT 0 2344
Time: 1167.627 ms
testdb=# select count(*) from t4;
count
-------
2344
(1 row)
Time: 1.203 ms
testdb=# select pg_relation_size('t4');
pg_relation_size
------------------
950272
(1 row)
Time: 0.540 ms
testdb=# select id,length(c1) from t4 where c1 like '%重排%';
id | length
------+--------
1870 | 4489
(1 row)
Time: 87.053 ms
testdb=# select avg(length(c1)) from t4;
avg
-----------------------
4358.0912969283276451
(1 row)
Time: 126.893 ms

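Each text field averages a little over 4,000 characters, which is close to the length of a typical article body.

Check the total number of lexemes and the average number of lexemes per record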

testdb=# select count(*) from (select ts_stat('select to_tsvector(''testzhcfg''::regconfig, c1) from t4'))aa ;
count
-------
30633
(1 row)
Time: 3213.285 ms
testdb=# select avg(length(to_tsvector('testzhcfg', c1))) from t4;
avg
----------------------
288.1040955631399317
(1 row)
Time: 2890.361 ms

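The average number of lexemes per record has risen from about 3 to 288, so the index entries in the leaf index nodes are now almost all stored in SIGNKEY form.

Create the GiST index and query again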

testdb=# create index t4_c1_idx_gist on t4 using gist(to_tsvector('testzhcfg',c1));
CREATE INDEX
Time: 3026.344 ms
testdb=# select pg_relation_size('t4_c1_idx_gist');
pg_relation_size
------------------
499712
(1 row)
Time: 0.605 ms
testdb=# select pg_relation_size('t4_c1_idx_gist')/8192;
?column?
----------
61
(1 row)
Time: 0.438 ms
testdb=# analyze;
ANALYZE
Time: 6775.215 ms
testdb=# explain (analyze,buffers) select * from t4 where to_tsvector('testzhcfg',c1) @@ '重排';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t4 (cost=4.24..40.84 rows=12 width=361) (actual time=164.155..187.033 rows=1 loops=1)
Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''重排'''::tsquery)
Rows Removed by Index Recheck: 134
Heap Blocks: exact=62
Buffers: shared hit=523
-> Bitmap Index Scan on t4_c1_idx_gist (cost=0.00..4.24 rows=12 width=0) (actual time=0.379..0.379 rows=135 loops=1)
Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''重排'''::tsquery)
Buffers: shared hit=53
Planning time: 0.341 ms
Execution time: 187.089 ms
(10 rows)
Time: 187.954 ms

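GiST scanned 53 index pages, that is, 86.9% of the index pages (86.9% = 53/61). Working backwards from the earlier assumptions, each leaf node should contain more than 2,000 lexemes on average.

The number of records to recheck has now grown to 135, and the recheck takes so long that performance is worse than a full table scan with LIKE (87.053 ms).
The rechecked records are 5.8% of all records (5.8% = 135/2344). By the probability estimate, however, with 288 lexemes per record about 20% of the records should need a recheck on average. I therefore tried a few more keywords; some came out above 20% and some below. The differences appear to come from the records being unevenly distributed over the bits of the SIGNKEY, and the average can still be taken to be about 20%.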

testdb=# explain (analyze,buffers) select * from t4 where to_tsvector('testzhcfg',c1) @@ 'A';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t4 (cost=4.24..40.84 rows=12 width=361) (actual time=1709.532..1709.532 rows=0 loops=1)
Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''A'''::tsquery)
Rows Removed by Index Recheck: 1395
Heap Blocks: exact=110
Buffers: shared hit=3962
-> Bitmap Index Scan on t4_c1_idx_gist (cost=0.00..4.24 rows=12 width=0) (actual time=0.659..0.659 rows=1395 loops=1)
Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''A'''::tsquery)
Buffers: shared hit=61
Planning time: 0.250 ms
Execution time: 1709.586 ms
(10 rows)
Time: 1710.378 ms
testdb=# explain (analyze,buffers) select * from t4 where to_tsvector('testzhcfg',c1) @@ 'B';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t4 (cost=4.24..40.84 rows=12 width=361) (actual time=1877.729..1877.729 rows=0 loops=1)
Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''B'''::tsquery)
Rows Removed by Index Recheck: 1127
Heap Blocks: exact=114
Buffers: shared hit=3076
-> Bitmap Index Scan on t4_c1_idx_gist (cost=0.00..4.24 rows=12 width=0) (actual time=0.462..0.462 rows=1127 loops=1)
Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''B'''::tsquery)
Buffers: shared hit=61
Planning time: 0.208 ms
Execution time: 1877.770 ms
(10 rows)
Time: 1878.462 ms
testdb=# explain (analyze,buffers) select * from t4 where to_tsvector('testzhcfg',c1) @@ 'C';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t4 (cost=4.24..40.84 rows=12 width=361) (actual time=538.436..538.436 rows=0 loops=1)
Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''C'''::tsquery)
Rows Removed by Index Recheck: 342
Heap Blocks: exact=99
Buffers: shared hit=1071
-> Bitmap Index Scan on t4_c1_idx_gist (cost=0.00..4.24 rows=12 width=0) (actual time=0.382..0.382 rows=342 loops=1)
Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''C'''::tsquery)
Buffers: shared hit=60
Planning time: 0.312 ms
Execution time: 538.479 ms
(10 rows)
Time: 539.243 ms
testdb=# explain (analyze,buffers) select * from t4 where to_tsvector('testzhcfg',c1) @@ 'D';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t4 (cost=4.24..40.84 rows=12 width=361) (actual time=445.309..445.309 rows=0 loops=1)
Recheck Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''D'''::tsquery)
Rows Removed by Index Recheck: 331
Heap Blocks: exact=95
Buffers: shared hit=1139
-> Bitmap Index Scan on t4_c1_idx_gist (cost=0.00..4.24 rows=12 width=0) (actual time=0.369..0.369 rows=331 loops=1)
Index Cond: (to_tsvector('testzhcfg'::regconfig, c1) @@ '''D'''::tsquery)
Buffers: shared hit=59
Planning time: 0.207 ms
Execution time: 445.350 ms
(10 rows)
Time: 445.993 ms
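
To put the recheck counts from the five queries on t4 side by side, the Python sketch below converts each "rows" value reported by the Bitmap Index Scan into a share of the 2344 rows in t4, and compares the results with what the uniform-random-bit model gives for a record of 288 lexemes. The variable and function names are invented for this example; the row counts are copied from the plans above.

```python
SIGNATURE_BITS = 992  # signature length in bits, as used in this article
TOTAL_ROWS = 2344     # row count of t4

# Rows returned by the Bitmap Index Scan for each keyword (from the plans above).
index_hits = {"重排": 135, "A": 1395, "B": 1127, "C": 342, "D": 331}


def model_recheck_fraction(lexemes_per_record: float,
                           bits: int = SIGNATURE_BITS) -> float:
    """Share of records whose SIGNKEY matches an arbitrary keyword (uniform model)."""
    return 1 - (1 - 1 / bits) ** lexemes_per_record


fractions = {kw: hits / TOTAL_ROWS for kw, hits in index_hits.items()}
mean_fraction = sum(fractions.values()) / len(fractions)

if __name__ == "__main__":
    for kw, frac in fractions.items():
        print(f"{kw}: {frac:.1%}")
    print(f"mean over these keywords: {mean_fraction:.1%}")
    print(f"uniform model, 288 lexemes: {model_recheck_fraction(288):.1%}")
```

Five keywords are a very small sample, so the mean here only shows that the measured shares scatter widely around the model value rather than pinning down an average.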
