2008-05-21 15:28:51

From PgsqlWiki

High Availability, Load Balancing, and Replication

Database servers can work together to allow a second server to take over quickly if the primary server fails (high availability), or to allow several computers to serve the same data (load balancing). Ideally, database servers could work together seamlessly. Web servers serving static web pages can be combined quite easily by merely load-balancing web requests to multiple machines. In fact, read-only database servers can be combined relatively easily too. Unfortunately, most database servers have a read/write mix of requests, and read/write servers are much harder to combine. This is because, although read-only data needs to be placed on each server only once, a write to any server has to be propagated to all servers so that future read requests to those servers return consistent results.

This synchronization problem is the fundamental difficulty for servers working together. Because there is no single solution that eliminates the impact of the sync problem for all use cases, there are multiple solutions. Each solution addresses this problem in a different way, and minimizes its impact for a specific workload.

Some solutions deal with synchronization by allowing only one server to modify the data. Servers that can modify data are called read/write or "master" servers. Servers that can reply to read-only queries are called "slave" servers. Servers that cannot be accessed until they are changed to master servers are called "standby" servers.

Some solutions are synchronous, meaning that a data-modifying transaction is not considered committed until all servers have committed the transaction. This guarantees that a failover will not lose any data and that all load-balanced servers will return consistent results no matter which server is queried. In contrast, asynchronous solutions allow some delay between the time of a commit and its propagation to the other servers, opening the possibility that some transactions might be lost in the switch to a backup server, and that load-balanced servers might return slightly stale results. Asynchronous communication is used when synchronous would be too slow. Solutions can also be categorized by their granularity. Some solutions can deal only with an entire database server, while others allow control at the per-table or per-database level.
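
The difference between synchronous and asynchronous commit can be sketched with a toy model. This is only an illustration: the `Replica` class, the `pending` queue, and the point at which the primary fails are all assumptions made for the example, not part of any real replication tool.

```python
class Replica:
    """A stand-in for a database server; it only records committed transaction ids."""
    def __init__(self):
        self.committed = []

def commit(txid, primary, standby, pending, synchronous):
    """Commit on the primary; replicate now (synchronous) or queue for later (asynchronous)."""
    primary.committed.append(txid)
    if synchronous:
        standby.committed.append(txid)   # the client waits until the standby has it too
    else:
        pending.append(txid)             # shipped later by flush()

def flush(standby, pending):
    """Deliver queued changes to the standby (the asynchronous propagation step)."""
    standby.committed.extend(pending)
    pending.clear()

def lost_on_failover(synchronous):
    primary, standby, pending = Replica(), Replica(), []
    for txid in (1, 2, 3):
        commit(txid, primary, standby, pending, synchronous)
        if txid == 2:
            flush(standby, pending)      # propagation catches up after transaction 2
    # The primary fails here; anything the standby never received is lost.
    return [t for t in primary.committed if t not in standby.committed]

print(lost_on_failover(synchronous=True))   # []
print(lost_on_failover(synchronous=False))  # [3]
```

In the asynchronous run, transaction 3 was committed on the primary but had not yet been propagated, so it is missing after the switch to the standby.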

Performance must be considered in any choice. There is usually a trade-off between functionality and performance. For example, a fully synchronous solution over a slow network might cut performance by more than half, while an asynchronous one might have a minimal performance impact.

The remainder of this section outlines various failover, replication, and load balancing solutions. A glossary is also available.

Shared Disk Failover

Shared disk failover avoids synchronization overhead by having only one copy of the database. It uses a single disk array that is shared by multiple servers. If the main database server fails, the standby server is able to mount and start the database as though it was recovering from a database crash. This allows rapid failover with no data loss.
Shared hardware functionality is common in network storage devices. Using a network file system is also possible, though care must be taken that the file system has full POSIX behavior (see Section 17.2.1). One significant limitation of this method is that if the shared disk array fails or becomes corrupt, the primary and standby servers are both nonfunctional. Another issue is that the standby server should never access the shared storage while the primary server is running.

File System (Block-Device) Replication

A modified version of shared hardware functionality is file system replication, where all changes to a file system are mirrored to a file system residing on another computer. The only restriction is that the mirroring must be done in a way that ensures the standby server has a consistent copy of the file system — specifically, writes to the standby must be done in the same order as those on the master. DRBD is a popular file system replication solution for Linux.

Warm Standby Using Point-In-Time Recovery (PITR)

A warm standby server (see Section 24.4) can be kept current by reading a stream of write-ahead log (WAL) records. If the main server fails, the warm standby contains almost all of the data of the main server, and can be quickly made the new master database server. This is asynchronous and can only be done for the entire database server.
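
As a rough sketch of the idea, the following toy model keeps a standby current by replaying log records shipped from the primary. The record format, the `Primary` and `WarmStandby` classes, and the shipping step are hypothetical simplifications; real WAL records and recovery are far more involved.

```python
def apply(state, record):
    """Replay one log record: ('set', key, value) or ('delete', key)."""
    if record[0] == "set":
        state[record[1]] = record[2]
    elif record[0] == "delete":
        state.pop(record[1], None)

class Primary:
    def __init__(self):
        self.state, self.wal = {}, []

    def execute(self, record):
        self.wal.append(record)      # write-ahead: log first, then change the data
        apply(self.state, record)

class WarmStandby:
    def __init__(self):
        self.state, self.replayed = {}, 0

    def catch_up(self, shipped_wal):
        """Replay every shipped record that has not been applied yet."""
        for record in shipped_wal[self.replayed:]:
            apply(self.state, record)
        self.replayed = len(shipped_wal)

primary, standby = Primary(), WarmStandby()
primary.execute(("set", "a", 1))
primary.execute(("set", "b", 2))
standby.catch_up(list(primary.wal))   # two records shipped so far
primary.execute(("delete", "a"))      # not shipped before the primary fails
print(standby.state)                  # {'a': 1, 'b': 2}
```

The standby holds almost all of the primary's data; only the change that was never shipped is missing, which is the asynchronous behavior described above.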

Master-Slave Replication

A master-slave replication setup sends all data modification queries to the master server. The master server asynchronously sends data changes to the slave server. The slave can answer read-only queries while the master server is running. The slave server is ideal for data warehouse queries.
Slony-I is an example of this type of replication, with per-table granularity and support for multiple slaves. Because it updates the slave server asynchronously (in batches), data loss is possible during failover.

Statement-Based Replication Middleware

With statement-based replication middleware, a program intercepts every SQL query and sends it to one or all servers. Each server operates independently. Read-write queries are sent to all servers, while read-only queries can be sent to just one server, allowing the read workload to be distributed.
If queries are simply broadcast unmodified, functions like random(), CURRENT_TIMESTAMP, and sequences would have different values on different servers. This is because each server operates independently, and because SQL queries are broadcast (and not actual modified rows). If this is unacceptable, either the middleware or the application must query such values from a single server and then use those values in write queries. Also, care must be taken that all transactions either commit or abort on all servers, perhaps using two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED). Pgpool-II and Sequoia are examples of this type of replication.
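
The problem with broadcasting non-deterministic functions can be reproduced in a small sketch. It uses two in-memory SQLite databases from Python's standard library as stand-ins for independent servers (an assumption made so the example runs anywhere; SQLite also happens to provide a `random()` SQL function). The `tokens` table and the `broadcast` helper are hypothetical.

```python
import sqlite3

def make_server():
    """Create one independent 'server' with an empty table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (value INTEGER)")
    return conn

def broadcast(servers, sql, params=()):
    """Send the same statement to every server, as naive middleware would."""
    for conn in servers:
        conn.execute(sql, params)

def stored_values(servers):
    return [conn.execute("SELECT value FROM tokens").fetchone()[0] for conn in servers]

# Unmodified broadcast: each server evaluates random() on its own,
# so the two copies will generally hold different values.
naive = [make_server(), make_server()]
broadcast(naive, "INSERT INTO tokens VALUES (random())")
naive_values = stored_values(naive)

# Workaround from the text: obtain the value from a single server,
# then broadcast a write that carries the literal value.
safe = [make_server(), make_server()]
value = safe[0].execute("SELECT random()").fetchone()[0]
broadcast(safe, "INSERT INTO tokens VALUES (?)", (value,))
safe_values = stored_values(safe)

print(naive_values[0] == naive_values[1])
print(safe_values[0] == safe_values[1])   # True
```

The second variant keeps the copies identical because the servers receive a concrete value rather than an expression they each evaluate independently.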

Asynchronous Multimaster Replication

For servers that are not regularly connected, like laptops or remote servers, keeping data consistent among servers is a challenge. Using asynchronous multimaster replication, each server works independently, and periodically communicates with the other servers to identify conflicting transactions. The conflicts can be resolved by users or conflict resolution rules.
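
A minimal sketch of the periodic reconciliation step follows. The change format (a value paired with a timestamp) and the last-writer-wins rule are assumptions chosen for illustration; a real system might instead hand conflicts to a user or apply other resolution rules.

```python
def find_conflicts(changes_a, changes_b):
    """Return keys that both servers changed since the last sync, to different values."""
    return sorted(
        key for key in changes_a.keys() & changes_b.keys()
        if changes_a[key][0] != changes_b[key][0]
    )

def merge(base, changes_a, changes_b):
    """Merge two sets of changes; each change is (value, timestamp).

    Conflicts are settled by a hypothetical last-writer-wins rule.
    """
    merged = dict(base)
    for key in changes_a.keys() | changes_b.keys():
        candidates = [c[key] for c in (changes_a, changes_b) if key in c]
        merged[key] = max(candidates, key=lambda change: change[1])[0]
    return merged

base = {"price": 10, "stock": 5}
laptop = {"price": (12, 100)}                     # changed while disconnected
office = {"price": (11, 105), "stock": (4, 101)}

print(find_conflicts(laptop, office))   # ['price']
print(merge(base, laptop, office))      # {'price': 11, 'stock': 4}
```

Only `price` was modified on both servers, so it is the only conflict; `stock` merges cleanly because a single server touched it.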

Synchronous Multimaster Replication

In synchronous multimaster replication, each server can accept write requests, and modified data is transmitted from the original server to every other server before each transaction commits. Heavy write activity can cause excessive locking, leading to poor performance. In fact, write performance is often worse than that of a single server. Read requests can be sent to any server. Some implementations use shared disk to reduce the communication overhead. Synchronous multimaster replication is best for mostly read workloads, though its big advantage is that any server can accept write requests — there is no need to partition workloads between master and slave servers, and because the data changes are sent from one server to another, there is no problem with non-deterministic functions like random().
PostgreSQL does not offer this type of replication, though PostgreSQL two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED) can be used to implement this in application code or middleware.
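
The control flow that application code or middleware would need can be sketched as below. The `Participant` class only simulates a server; in a real setup its `prepare` and `commit` steps would correspond to issuing PREPARE TRANSACTION and COMMIT PREPARED on each PostgreSQL server. All names here are hypothetical.

```python
class Participant:
    """A stand-in for one database server taking part in a two-phase commit."""
    def __init__(self, name, can_prepare=True):
        self.name, self.can_prepare = name, can_prepare
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit if asked (the PREPARE TRANSACTION step).
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self):
        # Phase 2: make the prepared work permanent (the COMMIT PREPARED step).
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Commit everywhere or nowhere; return True if the transaction committed."""
    prepared = []
    for p in participants:
        if p.prepare():
            prepared.append(p)
        else:
            for q in prepared:
                q.rollback()      # undo the promises already made
            return False
    for p in participants:
        p.commit()
    return True

healthy = [Participant("a"), Participant("b")]
print(two_phase_commit(healthy), [p.state for p in healthy])

failing = [Participant("a"), Participant("b", can_prepare=False)]
print(two_phase_commit(failing), [p.state for p in failing])
```

The first call commits on both participants; the second aborts on both because one participant could not prepare. The sketch leaves out coordinator crashes and recovery of in-doubt transactions, which a production implementation has to handle.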

Commercial Solutions

Because PostgreSQL is open source and easily extended, a number of companies have taken PostgreSQL and created commercial closed-source solutions with unique failover, replication, and load balancing capabilities.

Table 25-1 summarizes the capabilities of the various solutions listed above.

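Table 25-1. High Availability, Load Balancing, and Replication Feature Matrix

Solutions compared (table columns):
    Shared Disk Failover
    File System Replication
    Warm Standby Using PITR
    Master-Slave Replication
    Statement-Based Replication Middleware
    Asynchronous Multimaster Replication
    Synchronous Multimaster Replication

Features compared (table rows; the per-solution marks are missing from this copy):
    No special hardware required
    Allows multiple master servers
    No master server overhead
    No waiting for multiple servers
    Master failure will never lose data
    Slaves accept read-only queries
    Per-table granularity
    No conflict resolution necessary

Communication method, by solution:
    Shared Disk Failover: shared disk
    File System Replication: disk blocks
    Warm Standby Using PITR: WAL
    Master-Slave Replication: table rows
    Statement-Based Replication Middleware: SQL
    Asynchronous Multimaster Replication: table rows
    Synchronous Multimaster Replication: table rows and row locks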

There are a few solutions that do not fit into the above categories:

Data Partitioning

Data partitioning splits tables into data sets. Each set can be modified by only one server. For example, data can be partitioned by offices, e.g. London and Paris, with a server in each office. If queries combining London and Paris data are necessary, an application can query both servers, or master/slave replication can be used to keep a read-only copy of the other office's data on each server.
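
The London and Paris example can be sketched as a small router. The `OfficeServer` and `PartitionRouter` classes are hypothetical stand-ins; each "server" is just an in-memory list here.

```python
class OfficeServer:
    """A stand-in for the database server located in one office."""
    def __init__(self, office):
        self.office, self.rows = office, []

class PartitionRouter:
    """Route each write to the one server that owns that office's data."""
    def __init__(self, offices):
        self.servers = {office: OfficeServer(office) for office in offices}

    def insert(self, office, row):
        # Raises KeyError for an office that has no server.
        self.servers[office].rows.append(row)

    def query_all(self):
        """Combine data from every partition, as an application querying both servers would."""
        return [(s.office, row) for s in self.servers.values() for row in s.rows]

router = PartitionRouter(["London", "Paris"])
router.insert("London", "order-1")
router.insert("Paris", "order-2")
print(router.query_all())   # [('London', 'order-1'), ('Paris', 'order-2')]
```

Because every row has exactly one owning server, no two servers ever modify the same data, so no synchronization of writes is needed.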

Multiple-Server Parallel Query Execution

Many of the above solutions allow multiple servers to handle multiple queries, but none allow a single query to use multiple servers to complete faster. This solution allows multiple servers to work concurrently on a single query. It is usually accomplished by splitting the data among servers and having each server execute its part of the query and return results to a central server where they are combined and returned to the user. Pgpool-II has this capability. Also, this can be implemented using the PL/Proxy toolset.
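
The split-and-combine pattern described above can be sketched for a simple average. Threads stand in for separate servers, and the shard contents are made up for the example; this is not how Pgpool-II or PL/Proxy are implemented.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_aggregate(shard):
    """Each server computes its share of the query: a partial sum and row count."""
    return sum(shard), len(shard)

def parallel_average(shards):
    """Central step: run the partial queries concurrently and combine the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(partial_aggregate, shards))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count

shards = [[10, 20], [30], [40, 50, 60]]   # data split among three servers
print(parallel_average(shards))           # 35.0
```

Each shard returns a sum and a count rather than its own average, because averaging the per-shard averages would give the wrong answer when shards hold different numbers of rows.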