Category: System Operations

2016-11-15 22:34:19

For some parameters, the detailed explanation is in "Working with DRBD".
For others, it is in "Optimizing DRBD performance".

local-io-error "/usr/lib/drbd/; /usr/lib/drbd/; echo o > /proc/sysrq-trigger";
--on-no-data-accessible ond-policy
This setting controls what happens to I/O requests on a degraded, diskless node (i.e. no data store is reachable).
The available policies are io-error and suspend-io.
Note: "degraded" here is a state in the RAID sense. By default the situation is handled as io-error. A setting in the node section overrides the one in the disk section.

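disk-barrier: not safe since linux-2.6.36 (or 2.6.32 on RHEL6); default is no.
disk-flushes: default is yes; it can be turned off for better performance when the RAID controller has a battery-backed cache.
no-disk-drain: the documentation says not to use this.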

These options affect write performance on the secondary node. max-buffers is the maximum number of buffers DRBD allocates for writing data to disk, while max-epoch-size is the maximum number of write requests permitted between two write barriers. max-buffers must be equal to or greater than max-epoch-size to increase performance. The default for both is 2048; setting it to around 8000 should be fine for most reasonably high-performance hardware RAID controllers.
max-buffers: caps how many buffers DRBD allocates for data waiting to be written to disk.
max-epoch-size: caps how many write requests may accumulate between two write barriers.
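The relationship between the two values can be checked mechanically before a config is rolled out. The following is only a minimal sketch: the function name check_buffer_settings is made up for illustration, and the single rule it applies is the one quoted above (max-buffers should not be smaller than max-epoch-size).

```python
def check_buffer_settings(max_buffers: int, max_epoch_size: int) -> list:
    """Return warnings for a max-buffers / max-epoch-size pair.

    Hypothetical helper, not part of DRBD: it only encodes the rule
    that max-buffers should be equal to or greater than max-epoch-size.
    """
    warnings = []
    if max_buffers < max_epoch_size:
        warnings.append(
            "max-buffers (%d) is smaller than max-epoch-size (%d)"
            % (max_buffers, max_epoch_size)
        )
    return warnings


# Both at the documented default: no warning.
print(check_buffer_settings(2048, 2048))
# Raising only max-epoch-size violates the rule.
print(check_buffer_settings(2048, 8000))
```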

When the number of pending write requests on the standby (secondary) node exceeds the unplug-watermark, DRBD triggers request processing on the backing storage device.
Some controllers perform best when "kicked" frequently, so for these controllers it makes sense to set this fairly low, perhaps even as low as DRBD's allowable minimum (16). Others perform best when left alone; for these controllers a setting as high as max-buffers is advisable.
In other words, once the pending writes on the standby (secondary) node reach the unplug-watermark, the backing storage controller is "kicked" into processing them; the node itself is not dropped.

Configure the hash-based message authentication code (HMAC) or secure hash algorithm to use for peer authentication. The kernel supports a number of different algorithms, some of which may be loadable as kernel modules. See the shash algorithms listed in /proc/crypto. By default, cram-hmac-alg is unset. Peer authentication also requires a shared-secret to be configured.
In short: cram-hmac-alg names the hash-based message authentication code (HMAC) or secure hash algorithm used for peer authentication, and it only works together with shared-secret.

al-extents extents
DRBD automatically maintains a "hot" or "active" disk area likely to be written to again soon based on the recent write activity. The "active" disk area can be written to immediately, while "inactive" disk areas must be "activated" first, which requires a meta-data write. We also refer to this active disk area as the "activity log".
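This hot/active area is what we call the activity log. (If the activity log is too small, extents are moved in and out frequently, and the resulting meta-data writes hurt performance considerably.)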

The activity log saves meta-data writes, but the whole log must be resynced upon recovery of a failed node. The size of the activity log is a major factor of how long a resync will take and how fast a replicated disk will become consistent after a crash.
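The activity log reduces the number of meta-data writes, but when a failed node is recovered, everything covered by the activity log has to be resynced, so its size determines how long recovery takes.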

The activity log consists of a number of 4-Megabyte segments; the al-extents parameter determines how many of those segments can be active at the same time. The default value for al-extents is 1237, with a minimum of 7 and a maximum of 65536.
The activity log tracks data in 4 MB segments; al-extents sets how many segments can be active at the same time. The default is 1237.

Note that the effective maximum may be smaller, depending on how you created the device meta data; see also drbdmeta(8).
The effective maximum is 919 * (available on-disk activity-log ring-buffer area / 4 kB - 1);
the default 32 kB ring-buffer gives a maximum of 6433 (covers more than 25 GiB of data).
We recommend keeping this well within the amount your backend storage and replication link are able to resync inside of about 5 minutes.
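The effective maximum is 919 * (ring-buffer area / 4 kB - 1),
where ring-buffer area = --al-stripes * --al-stripe-size-kB.
With the defaults this is 919 * (1 * 32 kB / 4 kB - 1) = 919 * 7 = 6433 extents, which covers 6433 * 4 MiB = 25732 MiB, about 25.1 GiB of data.
Because the area covered by the activity log has to be resynced after a primary crash, the recommendation is to choose a value that the backend storage and replication link can resync within about 5 minutes.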
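The arithmetic above is easy to get wrong by hand, so here is a small sketch that reproduces it. The constants 919, 4 kB and 4 MiB come from the text above; the function names and the 100 MiB/s resync rate in the usage lines are assumptions chosen only for illustration.

```python
AL_EXTENT_MIB = 4            # each activity-log extent covers 4 MiB
EXTENTS_PER_4K_BLOCK = 919   # factor quoted in the drbd.conf text


def effective_max_al_extents(al_stripes=1, al_stripe_size_kb=32):
    """Effective al-extents ceiling for a given on-disk ring buffer."""
    ring_buffer_kb = al_stripes * al_stripe_size_kb
    return EXTENTS_PER_4K_BLOCK * (ring_buffer_kb // 4 - 1)


def covered_gib(al_extents):
    """Amount of data (GiB) covered by that many active extents."""
    return al_extents * AL_EXTENT_MIB / 1024


def al_extents_for_resync_window(resync_mib_per_s, window_s=300):
    """Largest al-extents whose covered area resyncs within window_s.

    resync_mib_per_s is an assumed sustained resync rate; measure your
    own storage and replication link before relying on it.
    """
    return int(resync_mib_per_s * window_s // AL_EXTENT_MIB)


print(effective_max_al_extents())           # 6433
print(round(covered_gib(6433), 2))          # 25.13
print(round(covered_gib(1237), 2))          # 4.83
print(al_extents_for_resync_window(100))    # 7500
```

With an assumed 100 MiB/s resync rate the 5-minute rule would allow 7500 extents, which is above the 6433 ceiling of the default ring buffer, so in that case the ceiling is the binding limit.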
If the application using DRBD is write intensive in the sense that it frequently issues small writes scattered across the device, it is usually advisable to use a fairly large activity log. Otherwise, frequent metadata updates may be detrimental to write performance.
In that case, too many meta-data updates degrade performance; increasing the activity log size appropriately reduces them effectively.

DRBD's activity log transaction writing makes it possible, that after the crash of a primary node a partial (bit-map based) resync is sufficient to bring the node back to up-to-date. Setting al-updates to no might increase normal operation performance but causes DRBD to do a full resync when a crashed primary gets reconnected. The default value is yes.
Thanks to DRBD's transactional activity-log writes, a crashed primary only needs a partial resync to get back to up-to-date.
With this parameter, the activity log can be turned off entirely (see the al-extents parameter). This will speed up writes because fewer meta-data writes will be necessary, but the entire device needs to be resynchronized upon recovery of a failed primary node. The default value for al-updates is yes.
With this parameter the activity log can be switched off entirely,
which speeds up writes because far fewer meta-data writes are needed.
By analogy: the default yes is like transactional InnoDB, and no is like non-transactional MyISAM.

By default DRBD blocks when the available TCP send queue becomes full. That means it will slow down the application that generates the write requests that cause DRBD to send more data down that TCP connection.
In an environment where the replication bandwidth is highly variable (as would be typical in WAN replication setups), the replication link may occasionally become congested. In a default configuration, this would cause I/O on the primary node to block, which is sometimes undesirable.
Instead, you may configure DRBD to suspend the ongoing replication in this case, causing the Primary's data set to pull ahead of the Secondary. In this mode, DRBD keeps the replication channel open (it never switches to disconnected mode) but does not actually replicate until sufficient bandwidth becomes available again.

c-plan-ahead plan_time, c-fill-target fill_target, c-delay-target delay_target, c-max-rate max_rate, c-min-rate

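connect-int 10 # interval between reconnection attempts; default 10, in seconds
ping-int 8  # keep-alive ping interval; default 10, in seconds
ping-timeout   # keep-alive ping timeout; default 500 ms (value 5, in units of 0.1 seconds)
timeout 30  # if the peer's reply takes longer than this, the peer is considered down; default 60, in units of 0.1 seconds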
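Since these four settings use two different units, a quick conversion helps avoid mistakes. The sketch below is illustrative only: the function name and return format are made up, and the rule that timeout must stay below both connect-int and ping-int is taken from the drbd.conf man page as I recall it, so verify it against the man page for your DRBD version.

```python
def check_net_timeouts(timeout_ds=60, ping_int_s=10,
                       connect_int_s=10, ping_timeout_ds=5):
    """Convert DRBD net timeouts to seconds and flag inconsistencies.

    timeout_ds and ping_timeout_ds are in tenths of a second,
    ping_int_s and connect_int_s are in seconds.
    """
    timeout_s = timeout_ds / 10
    ping_timeout_s = ping_timeout_ds / 10
    problems = []
    # Assumed rule (from the man page): timeout < connect-int and < ping-int.
    if timeout_s >= connect_int_s:
        problems.append("timeout is not lower than connect-int")
    if timeout_s >= ping_int_s:
        problems.append("timeout is not lower than ping-int")
    return {
        "timeout_s": timeout_s,
        "ping_timeout_s": ping_timeout_s,
        "problems": problems,
    }


# The values used in the notes above: timeout 30, ping-int 8, connect-int 10.
print(check_net_timeouts(timeout_ds=30, ping_int_s=8, connect_int_s=10))
```

For the values in these notes, timeout 30 means 3 seconds, which stays below both ping-int (8 s) and connect-int (10 s).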

wfc-timeout time
Wait for connection timeout. The init script drbd(8) blocks the boot process until the DRBD resources are connected. When the cluster manager starts later, it does not see a resource with internal split-brain. In case you want to limit the wait time, do it here. Default is 0, which means unlimited. The unit is seconds.
degr-wfc-timeout time
Wait for connection timeout, if this node was a degraded cluster. In case a degraded cluster (= cluster with only one node left) is rebooted, this timeout value is used instead of wfc-timeout, because the peer is less likely to show up in time, if it had been dead before. Value 0 means unlimited.
outdated-wfc-timeout time
Wait for connection timeout, if the peer was outdated. In case a degraded cluster (= cluster with only one node left) with an outdated peer disk is rebooted, this timeout value is used instead of wfc-timeout, because the peer is not allowed to become primary in the meantime. Value 0 means unlimited.
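How the three timeouts relate can be summarized as a small decision function. This is only a sketch of the selection described above, not DRBD's actual implementation; the function name and the two boolean flags are hypothetical.

```python
def effective_wfc_timeout(wfc_timeout, degr_wfc_timeout,
                          outdated_wfc_timeout, *,
                          was_degraded=False, peer_outdated=False):
    """Pick which wait-for-connection timeout applies, in seconds.

    Returns None when the selected value is 0, i.e. wait without limit.
    """
    if was_degraded and peer_outdated:
        # Degraded cluster whose peer disk was outdated.
        chosen = outdated_wfc_timeout
    elif was_degraded:
        # Degraded cluster (only one node was left) being rebooted.
        chosen = degr_wfc_timeout
    else:
        chosen = wfc_timeout
    return None if chosen == 0 else chosen


# Example values (assumptions): 120 s normal, 60 s degraded, 30 s outdated peer.
print(effective_wfc_timeout(120, 60, 30))
print(effective_wfc_timeout(120, 60, 30, was_degraded=True))
print(effective_wfc_timeout(120, 60, 30, was_degraded=True, peer_outdated=True))
print(effective_wfc_timeout(0, 60, 30))
```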