vacuum full搞瘫PostgreSQL主从复制-skykiker-ChinaUnix博客

博客访问： 2973244
博文数量： 199
博客积分： 1400
博客等级：上尉
技术积分： 4126
用户组：普通用户
注册时间： 2008-07-06 19:06

个人简介

半个PostgreSQL DBA，热衷于数据库相关的技术。我的ppt分享https://pan.baidu.com/s/1eRQsdAa https://github.com/chenhuajun https://chenhuajun.github.io

文章分类

全部博文（199）

其他（1）
citus（10）
greenlpum（1）
安全（1）
Pacemaker（3）
MySQL（21）
Symfoware（2）

Native（1）
分布式（0）
C（1）
Solaris（1）
Linux（11）
C#（3）
故障案例（5）
NoSQL（4）
云计算（1）
Windows（3）
Database（13）
PostgreSQL（101）

安装配置（1）

HA（3）

doc（6）

Npgsql（1）

psqlODBC（2）
嵌入式开发（8）
Java开发（2）
生活随笔（3）
未分配的博文（4）

文章存档

2020年（5）

2019年（1）

2018年（12）

2017年（23）

2016年（43）

2015年（51）

2014年（27）

2013年（21）

2011年（1）

2010年（4）

2009年（5）

2008年（6）

我的朋友

现象

1主2从同步复制环境下，在主库上执行了一个vacuum full，结果发现一直无法结束。

[root@db01 data]# ps -ef|grep postgres
postgres 26268 1 0 Jun17 ? 00:00:12 /usr/pgsql-9.5/bin/postgres -D /data/pgsql/sysbenchdb
postgres 26270 26268 0 Jun17 ? 00:00:00 postgres: logger process
postgres 26272 26268 0 Jun17 ? 00:00:03 postgres: checkpointer process
postgres 26273 26268 0 Jun17 ? 00:00:00 postgres: writer process
postgres 26274 26268 0 Jun17 ? 00:00:33 postgres: wal writer process
postgres 26275 26268 0 Jun17 ? 00:00:01 postgres: autovacuum launcher process
postgres 26276 26268 0 Jun17 ? 00:00:02 postgres: stats collector process
postgres 26316 26315 0 Jun17 pts/3 00:00:00 -bash
postgres 26343 26316 0 Jun17 pts/3 00:00:00 psql
postgres 26345 26268 0 Jun17 ? 00:07:10 postgres: postgres sysbench_1_10000w [local] VACUUM waiting for C4/B3CA33E0
root 31697 29068 0 20:48 pts/0 00:00:00 grep postgres
[root@db01 data]# psql -Upostgres
psql (9.5.2)
Type "help" for help.
postgres=# select * from pg_stat_activity;
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start |
query_start | state_change | waiting | state | backend_xid | backend_xmin | query
-------+-------------------+-------+----------+----------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+----
---------------------------+-------------------------------+---------+--------+-------------+--------------+---------------------------------
13241 | postgres | 31762 | 10 | postgres | psql | | | -1 | 2016-06-18 20:50:15.583562+08 | 2016-06-18 20:50:17.407332+08 | 201
6-06-18 20:50:17.407332+08 | 2016-06-18 20:50:17.407337+08 | f | active | | 7676361 | select * from pg_stat_activity;
16386 | sysbench_1_10000w | 26345 | 10 | postgres | psql | | | -1 | 2016-06-17 23:50:47.009872+08 | 2016-06-17 23:51:07.835389+08 | 201
6-06-17 23:51:07.835389+08 | 2016-06-17 23:51:07.835392+08 | f | active | 7676361 | | vacuum full;
(2 rows)

原因

查看复制状态，原来2个Slave的复制都已经断掉了，由于是同步复制，Master一直在等待Slave的应答。

postgres=# select * from pg_stat_replication ;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flush_location | replay_lo
cation | sync_priority | sync_state
-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+--------------+-------+---------------+----------------+----------------+----------
-------+---------------+------------
(0 rows)

为什么Slave的复制会断掉？

分别查看Master和Slave的日志，可以发现vacuum full产生了大量的xlog和频繁的checkpoint。
最终导致还没有来的及传输到Slave上的wal日志被Master删掉了，于是Slave上的复制停止。
这是一个200多GB的库，vacuum full太有杀伤力了。

相关参数如下：
max_wal_size = 8GB
min_wal_size = 2GB
wal_keep_segments=32
full_page_writes = on

主库日志

2016-06-17 23:50:58.172 CST 26345 0STATEMENT: varcuum full;
2016-06-17 23:51:41.896 CST 26272 0LOG: checkpoint starting: xlog
2016-06-17 23:51:47.212 CST 26272 0LOG: checkpoint complete: wrote 34 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=3.426 s, sync=0.490 s, total=5.318 s; sync files=18, longest=0.415 s, average=0.027 s; distance=3342566 kB, estimate=3342566 kB
2016-06-17 23:52:21.904 CST 26272 0LOG: checkpoint starting: xlog
2016-06-17 23:52:23.206 CST 26272 0LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 205 recycled; write=0.009 s, sync=0.693 s, total=1.305 s; sync files=4, longest=0.446 s, average=0.173 s; distance=3358625 kB, estimate=3358625 kB
2016-06-17 23:52:57.883 CST 26272 0LOG: checkpoint starting: xlog
2016-06-17 23:52:59.802 CST 26272 0LOG: checkpoint complete: wrote 1 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 205 recycled; write=0.109 s, sync=0.268 s, total=1.918 s; sync files=4, longest=0.234 s, average=0.067 s; distance=3360452 kB, estimate=3360452 kB
2016-06-17 23:53:38.728 CST 26272 0LOG: checkpoint starting: xlog
2016-06-17 23:53:40.206 CST 26272 0LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 205 recycled; write=0.010 s, sync=0.663 s, total=1.477 s; sync files=4, longest=0.662 s, average=0.165 s; distance=3358649 kB, estimate=3360272 kB
2016-06-17 23:54:17.751 CST 26272 0LOG: checkpoint starting: xlog
2016-06-17 23:54:18.945 CST 26272 0LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 205 recycled; write=0.011 s, sync=0.669 s, total=1.193 s; sync files=4, longest=0.668 s, average=0.167 s; distance=3342837 kB, estimate=3358528 kB
2016-06-17 23:54:57.165 CST 26272 0LOG: checkpoint starting: xlog
2016-06-17 23:54:58.487 CST 26272 0LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 204 recycled; write=0.011 s, sync=0.380 s, total=1.392 s; sync files=5, longest=0.324 s, average=0.076 s; distance=3356734 kB, estimate=3358349 kB
2016-06-17 23:54:58.496 CST 26272 0LOG: checkpoints are occurring too frequently (1 second apart)
2016-06-17 23:54:58.496 CST 26272 0HINT: Consider increasing the configuration parameter "max_wal_size".
2016-06-17 23:54:58.496 CST 26272 0LOG: checkpoint starting: xlog
2016-06-17 23:54:58.874 CST 26272 0LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 176 recycled; write=0.010 s, sync=0.008 s, total=0.377 s; sync files=1, longest=0.008 s, average=0.008 s; distance=56337 kB, estimate=3028147 kB
2016-06-18 00:02:00.536 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:02:32.243 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:03:03.343 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:03:32.654 CST 26272 0LOG: checkpoints are occurring too frequently (29 seconds apart)
2016-06-18 00:03:32.654 CST 26272 0HINT: Consider increasing the configuration parameter "max_wal_size".
2016-06-18 00:03:32.654 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:04:00.859 CST 26272 0LOG: checkpoints are occurring too frequently (28 seconds apart)
2016-06-18 00:04:00.859 CST 26272 0HINT: Consider increasing the configuration parameter "max_wal_size".
2016-06-18 00:04:00.859 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:04:29.940 CST 26272 0LOG: checkpoints are occurring too frequently (29 seconds apart)
2016-06-18 00:04:29.941 CST 26272 0HINT: Consider increasing the configuration parameter "max_wal_size".
2016-06-18 00:04:29.943 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:02:00.536 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:02:32.243 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:03:03.343 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:03:32.654 CST 26272 0LOG: checkpoints are occurring too frequently (29 seconds apart)
2016-06-18 00:03:32.654 CST 26272 0HINT: Consider increasing the configuration parameter "max_wal_size".
2016-06-18 00:03:32.654 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:04:00.859 CST 26272 0LOG: checkpoints are occurring too frequently (28 seconds apart)
2016-06-18 00:04:00.859 CST 26272 0HINT: Consider increasing the configuration parameter "max_wal_size".
2016-06-18 00:04:00.859 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:04:29.940 CST 26272 0LOG: checkpoints are occurring too frequently (29 seconds apart)
2016-06-18 00:04:29.941 CST 26272 0HINT: Consider increasing the configuration parameter "max_wal_size".
2016-06-18 00:04:29.943 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:04:59.536 CST 26272 0LOG: checkpoint starting: xlog
2016-06-18 00:04:59.652 CST 26283 0ERROR: requested WAL segment 00000001000000C40000008C has already been removed
2016-06-18 00:04:59.666 CST 26281 0ERROR: requested WAL segment 00000001000000C4000000A7 has already been removed
2016-06-18 00:05:14.939 CST 26608 0ERROR: requested WAL segment 00000001000000C4000000A7 has already been removed
2016-06-18 00:05:15.003 CST 26609 0ERROR: requested WAL segment 00000001000000C40000008C has already been removed
2016-06-18 00:05:19.779 CST 26611 0ERROR: requested WAL segment 00000001000000C40000008C has already been removed

其中1个从库的日志

2016-06-17 23:48:48.283 CST 17272 0LOG: database system was shut down in recovery at 2016-06-17 14:20:29 CST
2016-06-17 23:48:48.298 CST 17272 0LOG: entering standby mode
2016-06-17 23:48:48.343 CST 17272 0LOG: consistent recovery state reached at BA/53FEACF8
2016-06-17 23:48:48.343 CST 17272 0LOG: invalid resource manager ID 52 at BA/53FEACF8
2016-06-17 23:48:48.344 CST 17270 0LOG: database system is ready to accept read only connections
2016-06-17 23:48:48.373 CST 17276 0LOG: started streaming WAL from primary at BA/53000000 on timeline 1
2016-06-17 23:48:48.530 CST 17272 0LOG: redo starts at BA/53FEACF8
2016-06-17 23:51:47.652 CST 17273 0LOG: restartpoint starting: xlog
2016-06-17 23:52:25.254 CST 17273 0LOG: restartpoint complete: wrote 399635 buffers (76.2%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=36.981 s, sync=0.321 s, total=37.601 s; sync files=21, longest=0.168 s, average=0.014 s; distance=3342566 kB, estimate=3342566 kB
2016-06-17 23:52:25.287 CST 17273 0LOG: recovery restart point at BB/20024600
2016-06-17 23:52:25.287 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-17 23:49:38.964451+08
2016-06-17 23:52:25.288 CST 17273 0LOG: restartpoint starting: xlog
2016-06-17 23:53:02.126 CST 17273 0LOG: restartpoint complete: wrote 382293 buffers (72.9%); 0 transaction log file(s) added, 91 removed, 114 recycled; write=32.611 s, sync=0.126 s, total=36.837 s; sync files=6, longest=0.095 s, average=0.021 s; distance=3358625 kB, estimate=3358625 kB
2016-06-17 23:53:02.126 CST 17273 0LOG: recovery restart point at BB/ED00CC58
2016-06-17 23:53:02.126 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-17 23:49:38.964451+08
2016-06-17 23:53:02.128 CST 17273 0LOG: restartpoint starting: xlog
2016-06-17 23:53:43.815 CST 17273 0LOG: restartpoint complete: wrote 424800 buffers (81.0%); 0 transaction log file(s) added, 96 removed, 109 recycled; write=38.289 s, sync=0.455 s, total=41.686 s; sync files=8, longest=0.152 s, average=0.056 s; distance=3360452 kB, estimate=3360452 kB
2016-06-17 23:53:43.815 CST 17273 0LOG: recovery restart point at BC/BA1BDD58
2016-06-17 23:53:43.815 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-17 23:49:38.964451+08
2016-06-17 23:53:43.818 CST 17273 0LOG: restartpoint starting: xlog
2016-06-17 23:54:21.762 CST 17273 0LOG: restartpoint complete: wrote 418118 buffers (79.7%); 0 transaction log file(s) added, 96 removed, 109 recycled; write=34.681 s, sync=0.115 s, total=37.943 s; sync files=7, longest=0.043 s, average=0.016 s; distance=3358649 kB, estimate=3360272 kB
2016-06-17 23:54:21.762 CST 17273 0LOG: recovery restart point at BD/871AC458
2016-06-17 23:54:21.762 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-17 23:49:38.964451+08
2016-06-17 23:54:21.764 CST 17273 0LOG: restartpoint starting: xlog
2016-06-17 23:54:59.045 CST 17273 0LOG: restartpoint complete: wrote 420441 buffers (80.2%); 0 transaction log file(s) added, 98 removed, 107 recycled; write=34.745 s, sync=0.320 s, total=37.281 s; sync files=7, longest=0.142 s, average=0.045 s; distance=3342837 kB, estimate=3358528 kB
2016-06-17 23:54:59.046 CST 17273 0LOG: recovery restart point at BE/532299A0
2016-06-17 23:54:59.046 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-17 23:49:38.964451+08
2016-06-17 23:54:59.047 CST 17273 0LOG: restartpoint starting: xlog
2016-06-17 23:57:19.392 CST 17273 0LOG: restartpoint complete: wrote 356251 buffers (67.9%); 0 transaction log file(s) added, 68 removed, 136 recycled; write=138.817 s, sync=0.242 s, total=140.345 s; sync files=181, longest=0.111 s, average=0.001 s; distance=3413071 kB, estimate=3413071 kB
2016-06-17 23:57:19.393 CST 17273 0LOG: recovery restart point at BF/2373D968
2016-06-17 23:57:19.393 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-17 23:56:17.751189+08
2016-06-17 23:57:51.516 CST 17273 0LOG: restartpoint starting: time
2016-06-18 00:02:07.732 CST 17273 0LOG: restartpoint complete: wrote 484068 buffers (92.3%); 0 transaction log file(s) added, 98 removed, 110 recycled; write=252.381 s, sync=0.163 s, total=256.216 s; sync files=198, longest=0.111 s, average=0.000 s; distance=3337246 kB, estimate=3405489 kB
2016-06-18 00:02:07.732 CST 17273 0LOG: recovery restart point at BF/EF245338
2016-06-18 00:02:07.732 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-18 00:02:02.385133+08
2016-06-18 00:02:08.632 CST 17273 0LOG: restartpoint starting: xlog
2016-06-18 00:02:46.845 CST 17273 0LOG: restartpoint complete: wrote 389871 buffers (74.4%); 0 transaction log file(s) added, 103 removed, 101 recycled; write=33.994 s, sync=0.238 s, total=38.212 s; sync files=58, longest=0.123 s, average=0.004 s; distance=3352599 kB, estimate=3400200 kB
2016-06-18 00:02:46.845 CST 17273 0LOG: recovery restart point at C0/BBC4B058
2016-06-18 00:02:46.845 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-18 00:02:36.125659+08
2016-06-18 00:02:50.026 CST 17273 0LOG: restartpoint starting: xlog
2016-06-18 00:03:28.288 CST 17273 0LOG: restartpoint complete: wrote 468192 buffers (89.3%); 0 transaction log file(s) added, 130 removed, 74 recycled; write=33.569 s, sync=0.270 s, total=38.262 s; sync files=67, longest=0.061 s, average=0.004 s; distance=3331508 kB, estimate=3393331 kB
2016-06-18 00:03:28.288 CST 17273 0LOG: recovery restart point at C1/871B83C0
2016-06-18 00:03:28.288 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-18 00:03:01.080019+08
2016-06-18 00:03:28.291 CST 17273 0LOG: restartpoint starting: xlog
2016-06-18 00:04:07.549 CST 17273 0LOG: restartpoint complete: wrote 437802 buffers (83.5%); 0 transaction log file(s) added, 140 removed, 64 recycled; write=32.791 s, sync=0.181 s, total=39.257 s; sync files=35, longest=0.094 s, average=0.005 s; distance=3342050 kB, estimate=3388203 kB
2016-06-18 00:04:07.549 CST 17273 0LOG: recovery restart point at C2/53170EF0
2016-06-18 00:04:07.549 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-18 00:03:42.419304+08
2016-06-18 00:04:07.553 CST 17273 0LOG: restartpoint starting: xlog
2016-06-18 00:04:48.479 CST 17273 0LOG: restartpoint complete: wrote 447937 buffers (85.4%); 0 transaction log file(s) added, 157 removed, 47 recycled; write=33.236 s, sync=0.293 s, total=40.925 s; sync files=37, longest=0.160 s, average=0.007 s; distance=3357242 kB, estimate=3385107 kB
2016-06-18 00:04:48.479 CST 17273 0LOG: recovery restart point at C3/1FFFF8F0
2016-06-18 00:04:48.479 CST 17273 0DETAIL: last completed transaction was at log time 2016-06-18 00:04:02.282782+08
2016-06-18 00:04:48.484 CST 17273 0LOG: restartpoint starting: xlog
2016-06-18 00:04:59.698 CST 17276 0FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 00000001000000C40000008C has already been removed
2016-06-18 00:04:59.723 CST 17272 0LOG: invalid magic number 0000 in log segment 00000001000000C40000008C, offset 15958016
2016-06-18 00:05:15.003 CST 17429 0LOG: started streaming WAL from primary at C4/8C000000 on timeline 1

如何回避?
1. 通过wal_keep_segments预留更多的wal。需要留多少一定要算好。
2. 通过slot槽进行复制。如果Slave挂了，Master上日志将一直保留，所以要做好监控，及时处理。
3. 设置full_page_writes=off减少wal的产生量。前提是有其它方式保障数据安全比如备份或底层文件系统的原子写。
4. 做好日志归档并在主从间共享，这样可以让Slave从归档取缺失的日志。
5. 通过HA软件及时将Master切到异步复制，防止业务阻塞。
6. 千万不要对整个大库使用vacuum full(这一条才是重点，类似的除了vacuum full还要小心大量更新的大事务)

阅读(6694) | 评论(0) | 转发(0) |

上一篇：关于PostgreSQL的full_page_writes

下一篇：blockdev --setra预读优化和IO请求的大小

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6