Chinaunix首页 | 论坛 | 博客
  • 博客访问: 2888900
  • 博文数量: 200
  • 博客积分: 2413
  • 博客等级: 大尉
  • 技术积分: 3067
  • 用 户 组: 普通用户
  • 注册时间: 2011-04-01 22:07
文章分类

全部博文(200)

文章存档

2018年(2)

2017年(8)

2016年(35)

2015年(14)

2014年(20)

2013年(24)

2012年(53)

2011年(44)

分类: Mysql/postgreSQL

2016-03-15 11:43:56

二、并行备份及恢复(gp_dump & gp_restore)



1、gp_dump & gp_restore
gp_dump是Greenplum的并行备份工具,Master和segment同时备份,备份保存在每个节点中。备份时间最长的节点决定着整个备份时长。
Segment节点备份的只有数据,而Master备份的是所有DDL命令和一些全局系统表(如gp_segment_configuration)。

Master备份文件格式:

存放CREATE DATABASE语句,用来重建数据库。

gp_dump_status_1__     # 日志文件,存放备份日志

gp_dump_1__            # schema中对象的定义(DDL)文件

gp_dump_1___post_data  # 含有和重建表相关的对象。主要是一些参数。官方解释:contains commands to rebuild objects associated with the tables.

gp_dump_.rpt                 # 备份报告,包含Timestamp Keygp_dump Command LineBackup Type(全备或者增量),备份是否成功等。


执行gp_dump时,只有Segment的primary和active Master运行备份操作,Mirror和stanby Master不参与备份。
gp_dump在每个Segment上运行gp_dump_agent代理进程,将备份状态报告给Master的gp_dump进程。


Segment上的gp_dump_agent进程

# ps -ef|grep gp_dump_agent

gpadmin   9200  9183  0 14:28 ?        00:00:00 sh -c /opt/greenplum/greenplum-db/./bin/gp_dump_agent --gp-k 20160304142801_0_2_ --gp-d /gpbackup/ -p 40000 -U gpadmin  -a testDB 2> /gpbackup/gp_dump_status_0_2_20160304142801  > /gpbackup/gp_dump_0_2_20160304142801

gpadmin   9201  9200  3 14:28 ?        00:00:01 /opt/greenplum/greenplum-db/./bin/gp_dump_agent --gp-k 20160304142801_0_2_ --gp-d /gpbackup/ -p 40000 -U gpadmin -a testDB

root      9212  7002  0 14:28 pts/1    00:00:00 grep gp_dump_agent


Segment备份文件格式:

gp_dump_0__         # 数据备份文件,例如 gp_dump_0_2_20160304111611

gp_dump_status_0__  # 日志文件,例如gp_dump_status_0_2_20160304111611


特别注意的是,14位timestamp时间戳,在gp_restore恢复时必须指定。

参数:

$ gp_dump --help

gp_dump dumps a database as a text file or to other formats.

 

Usage:

  gp_dump [OPTION]... [DBNAME]

 

General options:

  -i, --ignore-version     proceed even when server version mismatches

                           gp_dump version

  -v, --verbose            verbose mode. adds verbose information to the

                           per segment status files

  --help                   show this help, then exit

  --version                output version information, then exit

 

Options controlling the output content:

  -a, --data-only          dump only the data, not the schema

  -c, --clean              clean (drop) schema prior to create

  -d, --inserts            dump data as INSERT, rather than COPY, commands

  -D, --column-inserts     dump data as INSERT commands with column names

  -E, --encoding=ENCODING  dump the data in encoding ENCODING

  -n, --schema=SCHEMA      dump the named schema only

  -N, --exclude-schema=SCHEMA do NOT dump the named schema(s)

  -o, --oids               include OIDs in dump

  -O, --no-owner           do not output commands to set object ownership

                           in plain text format

  -s, --schema-only        dump only the schema, no data

  -S, --superuser=NAME     specify the superuser user name to use in

                           plain text format

  -t, --table=TABLE        dump only matching table(s) (or views or sequences)

  -T, --exclude-table=TABLE   do NOT dump matching table(s) (or views or sequences)

  -x, --no-privileges      do not dump privileges (grant/revoke)

  --disable-triggers       disable triggers during data-only restore

  --incremental            dump only modified patitions

  --use-set-session-authorization

                              use SESSION AUTHORIZATION commands instead of

                              ALTER OWNER commands to set ownership

 

Connection options:

  -h, --host=HOSTNAME      database server host or socket directory

  -p, --port=PORT          database server port number

  -U, --username=NAME      connect as specified database user

  -W, --password           force password prompt (should happen automatically)

 

Greenplum Database specific options:

  --gp-c                  use gzip for in-line compression

  --gp-d=BACKUPFILEDIR    directory where backup files are placed

  --gp-k=TIMESTAMP        timestamp key to be used for the dump files

  --gp-r=REPORTFILEDIR    directory where report file is placed

  --gp-s=BACKUPSET        backup set indicator - (p)rimaries only (default)

                          or (i)ndividual segdb (must be followed with a list of dbids

                          of primary segments to dump. For example: --gp-s=i[10,12,14]

  --rsyncable             pass --rsyncable option to gzip

If no database name is supplied, then the PGDATABASE environment

variable value is used.


gp_restore命令使用gp_dump生成的备份文件恢复数据定义和用户的数据。特别要注意的时,gp_restore时需指定gp_dump备份时的时间戳(在gp_dump_.rpt中)。

执行gp_restore有如下要求:
  • gp_dump做的备份文件,并且备份文件在原始位置。
  • GPDB正常运行
  • GPDB的Segment节点数(包含primary和mirror)和备份时一致
  • 要被恢复的数据库(database)已创建
  • 使用gp_dump备份时指定了参数如-s (schema only) , -a (data only), --gp-c (compressed), --gp-d (alternate dump file location) ,那么使用gp_restore恢复时也要加上

参数:

$ gp_restore --help

gp_restore restores a Greenplum Database database from an archive created by gp_dump.

 

Usage:

  gp_restore [OPTIONS]

 

General options:

  -d, --dbname=NAME        output database name

  -i, --ignore-version     proceed even when server version mismatches

  -v, --verbose            verbose mode. adds verbose information to the

                           per segment status files

  --help                   show this help, then exit

  --version                output version information, then exit

 

Options controlling the output content:

  -a, --data-only          restore only the data, no schema

  -s, --schema-only        restore only the schema, no data

  -P, --post-data-schema-only    restore only the postdataSchema

 

Connection options:

  -h, --host=HOSTNAME      database server host or socket directory

  -p, --port=PORT          database server port number

  -U, --username=NAME      connect as specified database user

  -W, --password           force password prompt (should happen automatically)

 

Greenplum Database specific options:

  --gp-c                  use gunzip for in-line de-compression

  --gp-d=BACKUPFILEDIR    directory where backup files are located

  --gp-i                  ignore error

  --gp-k=KEY              date time backup key from gp_backup run

  --gp-r=REPORTFILEDIR    directory where report file is placed

  --status=STATUSFILEDIR  directory where status file is placed

  --gp-l=FILELOCATIONS    backup files are on (p)rimaries only (default)

                          or (i)ndividual segdb (must be followed with a list of dbid's

                          where backups are located. For example: --gp-l=i[10,12,15]

  --gp-f=FILE             FILE, present on all machines, with tables to include in restore

  --prefix=PREFIX         PREFIX of the dump files to be restored

  
 
<1>表级别备份恢复

$ gp_dump szlsd_db --table=member --gp-d=/gpbackup/ --gp-r=/gpbackup/ 

20160304:11:16:11|gp_dump-[INFO]:-Read params: -t member

20160304:11:16:11|gp_dump-[INFO]:-Command line options analyzed.

20160304:11:16:11|gp_dump-[INFO]:-Connecting to master database on host localhost port 5432 database szlsd_db.

20160304:11:16:11|gp_dump-[INFO]:-Reading Greenplum Database configuration info from master database.

20160304:11:16:11|gp_dump-[INFO]:-Preparing to dump the following segments:

20160304:11:16:11|gp_dump-[INFO]:-Segment 2 (dbid 4)

20160304:11:16:11|gp_dump-[INFO]:-Segment 1 (dbid 3)

20160304:11:16:11|gp_dump-[INFO]:-Segment 0 (dbid 2)

20160304:11:16:11|gp_dump-[INFO]:-Master (dbid 1)

20160304:11:16:11|gp_dump-[INFO]:-Converting tablenames to oids

20160304:11:16:11|gp_dump-[INFO]:-Processing tables in batches of 1000

20160304:11:16:11|gp_dump-[INFO]:-Finished processing batch 1 of tables

20160304:11:16:11|gp_dump-[INFO]:-Starting a transaction on master database szlsd_db.

20160304:11:16:11|gp_dump-[INFO]:-Getting a lock on pg_class in database szlsd_db.

20160304:11:16:11|GetTimestampKey-[INFO]:-Timestamp key is generated as it is not provided by the user.

20160304:11:16:11|gp_dump-[INFO]:-About to spin off 4 threads with timestamp key 20160304111611

20160304:11:16:11|gp_dump-[INFO]:-Creating thread to backup dbid 4: host gp-s3 port 40000 database szlsd_db

20160304:11:16:11|gp_dump-[INFO]:-Creating thread to backup dbid 3: host gp-s2 port 40000 database szlsd_db

20160304:11:16:11|gp_dump-[INFO]:-Creating thread to backup dbid 2: host gp-s1 port 40000 database szlsd_db

20160304:11:16:11|gp_dump-[INFO]:-Creating thread to backup dbid 1: host gp-master port 5432 database szlsd_db

20160304:11:16:11|gp_dump-[INFO]:-Waiting for remote gp_dump_agent processes to start transactions in serializable isolation level

20160304:11:16:11|gp_dump-[INFO]:-Listening for messages from server on dbid 2 connection

20160304:11:16:11|gp_dump-[INFO]:-Listening for messages from server on dbid 4 connection

20160304:11:16:11|gp_dump-[INFO]:-Successfully launched Greenplum Database backup on dbid 2 server

20160304:11:16:11|gp_dump-[INFO]:-Successfully launched Greenplum Database backup on dbid 4 server

20160304:11:16:11|gp_dump-[INFO]:-Listening for messages from server on dbid 3 connection

20160304:11:16:11|gp_dump-[INFO]:-Successfully launched Greenplum Database backup on dbid 3 server

20160304:11:16:11|gp_dump-[INFO]:-Listening for messages from server on dbid 1 connection

20160304:11:16:13|gp_dump-[INFO]:-Successfully launched Greenplum Database backup on dbid 1 server

20160304:11:16:13|gp_dump-[INFO]:-backup succeeded for dbid 1 on host gp-master

20160304:11:16:14|gp_dump-[INFO]:-backup succeeded for dbid 4 on host gp-s3

20160304:11:16:14|gp_dump-[INFO]:-All remote gp_dump_agent processes have began transactions in serializable isolation level

20160304:11:16:14|gp_dump-[INFO]:-Waiting for remote gp_dump_agent processes to obtain local locks on dumpable objects

20160304:11:16:14|gp_dump-[INFO]:-All remote gp_dump_agent processes have obtains the necessary locks

20160304:11:16:14|gp_dump-[INFO]:-Committing transaction on the master database, thereby releasing locks.

20160304:11:16:14|gp_dump-[INFO]:-Waiting for all remote gp_dump_agent programs to finish.

20160304:11:16:14|gp_dump-[INFO]:-backup succeeded for dbid 2 on host gp-s1

20160304:11:16:16|gp_dump-[INFO]:-backup succeeded for dbid 3 on host gp-s2

20160304:11:16:16|gp_dump-[INFO]:-All remote gp_dump_agent programs are finished.

20160304:11:16:16|gp_dump-[INFO]:-Report results also written to /gpbackup/gp_dump_20160304111611.rpt.

 

Greenplum Database Backup Report

Timestamp Key: 20160304111611

gp_dump Command Line: szlsd_db --table=member --gp-d=/gpbackup/ --gp-r=/gpbackup/

Pass through Command Line Options: -t member -t member_1_prt_other -t member_1_prt_boys -t member_1_prt_girls

Compression Program: None

Backup Type: Full

 

Individual Results

        segment 2 (dbid 4) Host gp-s3 Port 40000 Database szlsd_db BackupFile /gpbackup/gp_dump_0_4_20160304111611: Succeeded 

        segment 1 (dbid 3) Host gp-s2 Port 40000 Database szlsd_db BackupFile /gpbackup/gp_dump_0_3_20160304111611: Succeeded 

        segment 0 (dbid 2) Host gp-s1 Port 40000 Database szlsd_db BackupFile /gpbackup/gp_dump_0_2_20160304111611: Succeeded 

        Master (dbid 1) Host gp-master Port 5432 Database szlsd_db BackupFile /gpbackup/gp_dump_1_1_20160304111611: Succeeded 

        Master (dbid 1) Host gp-master Port 5432 Database szlsd_db BackupFile /gpbackup/gp_dump_1_1_20160304111611_post_data: Succeeded 

 

gp_dump utility finished successfully.


根据上面的日志,总结了备份过程:
读备份参数->连接到需要备份的数据库->将表名转成oids->在master开启一个事务->在szlsd_db库中的pg_class表加锁->产生时间戳Timestamp key,保证数据一致性->等待segment上的gp_dump_agent开始一个串行的隔离级别事务->开始备份->备份完成->gp_dump_agent结束串行的隔离级别事务->等待远程的gp_dump_agent获得本地备份文件锁->所有gp_dump_agent获得本地文件锁->在Master节点提交事务,pg_class表锁释放->等待所有远程gp_dump_agent完成工作->所有远程gp_dump_agent工作完成!


恢复:

$ gp_restore --gp-d=/gpbackup/ --gp-r=/gpbackup/ --gp-k=20160304111611

20160304:15:28:59|gp_restore-[INFO]:-Analyzed command line options.

20160304:15:28:59|gp_restore-[INFO]:-Connecting to master segment on host localhost port 5432 database testDB.

20160304:15:28:59|gp_restore-[INFO]:-Reading Greenplum Database configuration info from master segment database.

20160304:15:28:59|gp_restore-[INFO]:-Preparing to restore the following segments:

20160304:15:28:59|gp_restore-[INFO]:-Segment 2 (dbid 4)

20160304:15:28:59|gp_restore-[INFO]:-Segment 1 (dbid 3)

20160304:15:28:59|gp_restore-[INFO]:-Segment 0 (dbid 2)

20160304:15:28:59|gp_restore-[INFO]:-Master (dbid 1)

20160304:15:28:59|gp_restore-[INFO]:-Starting to restore the master database.

20160304:15:28:59|gp_restore-[INFO]:-Creating thread to restore master database: host gp-master port 5432 database testDB

20160304:15:28:59|gp_restore-[INFO]:-Listening for messages from dbid 1 server (source) for dbid 1 restore

20160304:15:28:59|gp_restore-[INFO]:-Successfully launched Greenplum Database restore on dbid 1 to restore dbid 1

20160304:15:29:02|gp_restore-[INFO]:-restore started for source dbid 1, target dbid 1 on host gp-master

20160304:15:29:02|gp_restore-[INFO]:-restore succeeded for source dbid 1, target dbid 1 on host gp-master

20160304:15:29:02|gp_restore-[INFO]:-Successfully restored master database: host gp-master port 5432 database testDB

20160304:15:29:02|gp_restore-[INFO]:-Creating thread to restore dbid 4 (gp-s3:40000) from backup file on dbid 4 (gp-s3:40000)

20160304:15:29:02|gp_restore-[INFO]:-Creating thread to restore dbid 3 (gp-s2:40000) from backup file on dbid 3 (gp-s2:40000)

20160304:15:29:02|gp_restore-[INFO]:-Creating thread to restore dbid 2 (gp-s1:40000) from backup file on dbid 2 (gp-s1:40000)

20160304:15:29:02|gp_restore-[INFO]:-Waiting for all remote gp_restore_agent programs to finish.

20160304:15:29:02|gp_restore-[INFO]:-Listening for messages from dbid 2 server (source) for dbid 2 restore

20160304:15:29:02|gp_restore-[INFO]:-Listening for messages from dbid 3 server (source) for dbid 3 restore

20160304:15:29:02|gp_restore-[INFO]:-Successfully launched Greenplum Database restore on dbid 3 to restore dbid 3

20160304:15:29:02|gp_restore-[INFO]:-Listening for messages from dbid 4 server (source) for dbid 4 restore

20160304:15:29:02|gp_restore-[INFO]:-Successfully launched Greenplum Database restore on dbid 2 to restore dbid 2

20160304:15:29:02|gp_restore-[INFO]:-Successfully launched Greenplum Database restore on dbid 4 to restore dbid 4

20160304:15:29:06|gp_restore-[INFO]:-restore started for source dbid 2, target dbid 2 on host gp-s1

20160304:15:29:06|gp_restore-[INFO]:-restore started for source dbid 3, target dbid 3 on host gp-s2

20160304:15:29:06|gp_restore-[INFO]:-restore started for source dbid 4, target dbid 4 on host gp-s3

20160304:15:29:10|gp_restore-[INFO]:-restore succeeded for source dbid 2, target dbid 2 on host gp-s1

20160304:15:29:12|gp_restore-[INFO]:-restore succeeded for source dbid 4, target dbid 4 on host gp-s3

20160304:15:29:18|gp_restore-[INFO]:-restore succeeded for source dbid 3, target dbid 3 on host gp-s2

20160304:15:29:18|gp_restore-[INFO]:-All remote gp_restore_agent programs are finished.

20160304:15:29:18|gp_restore-[INFO]:-updating Append Only table statistics

20160304:15:29:18|gp_restore-[INFO]:-Starting to restore the master database.

20160304:15:29:18|gp_restore-[INFO]:-Creating thread to restore master database: host gp-master port 5432 database testDB

20160304:15:29:18|gp_restore-[INFO]:-Listening for messages from dbid 1 server (source) for dbid 1 restore

20160304:15:29:18|gp_restore-[INFO]:-Successfully launched Greenplum Database restore on dbid 1 to restore dbid 1

20160304:15:29:20|gp_restore-[INFO]:-restore started for source dbid 1, target dbid 1 on host gp-master

20160304:15:29:20|gp_restore-[INFO]:-restore succeeded for source dbid 1, target dbid 1 on host gp-master

20160304:15:29:20|gp_restore-[INFO]:-Successfully restored master database: host gp-master port 5432 database testDB

20160304:15:29:20|gp_restore-[INFO]:-Report results also written to /gpbackup/gp_restore_20160304152431.rpt.

 

Greenplum Database Restore Report

Timestamp Key: 20160304152431

gp_restore Command Line: --gp-d=/gpbackup/ --gp-r=/gpbackup/ --gp-k=20160304152431

Pass through Command Line Options:  --post-data-schema-only

Compression Program: None

 

Individual Results

        Restore of testDB on dbid 1 (gp-master:5432) from /gpbackup/gp_dump_1_1_20160304152431_post_data: Succeeded

        Restore of testDB on dbid 4 (gp-s3:40000) from /gpbackup/gp_dump_0_4_20160304152431: Succeeded

        Restore of testDB on dbid 3 (gp-s2:40000) from /gpbackup/gp_dump_0_3_20160304152431: Succeeded

        Restore of testDB on dbid 2 (gp-s1:40000) from /gpbackup/gp_dump_0_2_20160304152431: Succeeded

 

gp_restore  utility finished successfully.



<2>database级别备份恢复

$ gp_dump testDB  --gp-d=/gpbackup/ --gp-r=/gpbackup/ 


# 模拟删除testDB


# 恢复前必须手动创建数据库,默认都是小写,如果要大些必须加双引号

test=# create database "testDB" with owner=gpadmin;


# 恢复数据

$ gp_restore  --gp-d=/gpbackup/ --gp-r=/gpbackup/ --gp-k=20160304153252



<3>schema级别备份恢复

$ gp_dump szlsd_db -n public  --gp-d=/gpbackup/ --gp-r=/gpbackup/ 


# 模拟删除schema

testDB=# drop schema public cascade;


#恢复schema前,必须手动创建schema

testDB=# create schema public ;

CREATE SCHEMA


# 恢复数据
#一定要加上--status参数,否则gp_restore的status file找不到写目录,报错20160304:16:34:28|gp_restore-[ERROR]:-Error executing query SELECT * FROM gp_read_backup_file('/gpbackup/', '20160304163404', 2) : ERROR:  Backup File /gpbackup/gp_restore_status_1_1_20160304163404 Type 2 could not be be found

$ gp_restore  --gp-d=/gpbackup/ --gp-r=/gpbackup/ --status=/gpbackup/ --gp-k=20160304163404



<4>根据时间戳备份,用于备份多个database,保证数据一致性

$ gp_dump szlsd_db -n public  --gp-d=/gpbackup/ --gp-r=/gpbackup/  --gp-k=xxxxxxxxxxx



<5>增量备份

$ gp_dump szlsd_db -n public  --gp-d=/gpbackup/ --gp-r=/gpbackup/ --incremental



转载请注明:
十字螺丝钉
http://blog.chinaunix.net/uid/23284114.html

QQ:463725310
E-MAIL:houora#gmail.com(#请自行替换为@)
阅读(16066) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~