Timeout Configuration for RHCS and GFS in a Multipath Environment - emailwht - ChinaUnix Blog
Timeout Configuration for RHCS and GFS in a Multipath Environment

Category: Linux

2014-01-08 17:36:29

From: http://blog.chinaunix.net/u2/64483/showart_1985312.html

Applies to: Cluster or GFS on RHEL4 and later
Symptom: the following error appears in the log:
openais[3345]: [CMAN ] lost contact with quorum device

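Whenever a customer has shared storage, configuring a quorum disk is now recommended when deploying Cluster and GFS, so the error above will be familiar to many. It usually occurs because the qdisk process has gone too long without communicating with cman/ais, exceeding the qdisk poll time (quorum_dev_poll), and the node is disconnected as a result. This is especially common during link-failover tests in environments that use multipath software such as multipath or rdac: failover can take a long time, so the quorum disk is declared lost before the path switch completes and the node is rebooted outright, which is certainly not the desired outcome. So how can this be solved?
First, a few basic concepts:
① For the cluster to consider a node healthy, three conditions must hold:
· CMAN considers the node online
· The node can read and write the quorum disk consistently enough
· The node's heuristics yield a sufficient score
② qdisk has two main threads: the main thread runs the loop and performs I/O; the second thread handles the heuristics.
The main thread also tells cman/ais at regular intervals that it is still alive. If qdisk fails to communicate with cman/ais for longer than quorum_dev_poll, cman declares that the node has lost its connection to the quorum disk, and the error above is logged. The default in cman.h is: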
#define DEFAULT_QUORUMDEV_POLL 10000

The unit is ms, so the default is 10 seconds. To change quorum_dev_poll, edit the cman tag in cluster.conf:
<cman quorum_dev_poll="50000"></cman>
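
As a minimal sketch of how the effective value could be checked, the Python snippet below reads quorum_dev_poll from a cluster.conf fragment and falls back to the 10000 ms default quoted above when the attribute is absent. The function name `effective_quorum_dev_poll` and the sample fragment are assumptions made for illustration; they are not part of cman.

```python
import xml.etree.ElementTree as ET

# Default quoted from cman.h in the text above, in milliseconds.
DEFAULT_QUORUMDEV_POLL_MS = 10000


def effective_quorum_dev_poll(cluster_conf_xml: str) -> int:
    """Return quorum_dev_poll in ms, or the default if it is not set.

    Hypothetical helper: it only inspects the XML text it is given.
    """
    root = ET.fromstring(cluster_conf_xml)
    # The <cman> element may be the root of the fragment or nested below it.
    cman = root if root.tag == "cman" else root.find(".//cman")
    if cman is None:
        return DEFAULT_QUORUMDEV_POLL_MS
    value = cman.get("quorum_dev_poll")
    return int(value) if value is not None else DEFAULT_QUORUMDEV_POLL_MS


# Illustrative fragment, not a complete cluster.conf.
sample = '<cluster><cman quorum_dev_poll="50000"></cman></cluster>'
print(effective_quorum_dev_poll(sample))
```

Returning the compiled-in default when the attribute is missing mirrors what the text describes: an unmodified cluster.conf leaves the 10-second value in force.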

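③ What is usually called the qdisk timeout is a continuous period during which every read and write to the quorum disk fails. Suppose cluster.conf contains:
<quorumd device="/dev/sdb1" interval="3" min_score="2" tko="13" votes="2">

where
interval="3"
This is the frequency of read/write cycles, in seconds, i.e. how often the quorum disk is read and written.
tko="13"
This is the number of cycles a node must miss in order to be declared dead, i.e. how many consecutive failures mark the node as dead.

qdisk_timeout = interval × tko
With the values above, qdisk_timeout = 3 × 13 = 39 seconds.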
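
The formula above can be sketched in Python as follows. The function name `qdisk_timeout_seconds` is hypothetical, and the sample element is written self-closed (an assumption) only so that it parses as a standalone fragment.

```python
import xml.etree.ElementTree as ET


def qdisk_timeout_seconds(quorumd_xml: str) -> int:
    """Compute interval x tko from a <quorumd> element.

    Hypothetical helper: both attributes must be present, otherwise
    int(None) raises a TypeError.
    """
    quorumd = ET.fromstring(quorumd_xml)
    interval = int(quorumd.get("interval"))  # seconds per read/write cycle
    tko = int(quorumd.get("tko"))            # missed cycles before "dead"
    return interval * tko


# Same attributes as the example in the text, self-closed for parsing.
sample = ('<quorumd device="/dev/sdb1" interval="3" '
          'min_score="2" tko="13" votes="2"/>')
print(qdisk_timeout_seconds(sample))  # 39
```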

④ Next, how the cman timeout is configured on RHEL5:
token
This timeout specifies in milliseconds until a token loss is declared after not receiving a token. This is the time spent detecting a failure of a processor in the current configuration. Reforming a new configuration takes about 50 milliseconds in addition to this timeout. The default is 1000 milliseconds. In other words, if no token is received for this long, the token is declared lost; the default is 1 second, and forming a new configuration takes about 50 ms on top of that.
retransmits_before_loss
This value identifies how many token retransmits should be attempted before forming a new configuration. If this value is set, retransmit and hold will be automatically calculated from retransmits_before_loss and token. The default is 4 retransmissions. In other words, this is how many consecutive token losses are tolerated before a new cluster configuration is formed (evicting the node that lost the token); the default is 4.
token_retransmit
This timeout specifies in milliseconds after how long before receiving a token the token is retransmitted. This will be automatically calculated if token is modified. It is not recommended to alter this value without guidance from the openais community. The default is 238 milliseconds. In other words, this is the interval between token retransmissions; it is calculated automatically from token and retransmits_before_loss above. As a rough check: (1000 - 50) / 4 ≈ 238 ms.

When the heartbeat token is lost as described above, the log shows the following error:
openais[3345]: [TOTEM] The token was lost in the OPERATIONAL state.
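
Since the two failures leave different messages in the log, a small sketch like the one below could help tell them apart when reviewing a failover test. It matches only the two message texts quoted in this article; the function name `classify_log_line` and the labels "qdisk" and "token" are assumptions for illustration.

```python
# Message texts quoted in this article, mapped to an illustrative label.
LOG_SIGNATURES = {
    "[CMAN ] lost contact with quorum device": "qdisk",
    "[TOTEM] The token was lost in the OPERATIONAL state.": "token",
}


def classify_log_line(line: str):
    """Return "qdisk", "token", or None for an unrelated line."""
    for signature, kind in LOG_SIGNATURES.items():
        if signature in line:
            return kind
    return None


lines = [
    "openais[3345]: [CMAN ] lost contact with quorum device",
    "openais[3345]: [TOTEM] The token was lost in the OPERATIONAL state.",
]
for entry in lines:
    print(classify_log_line(entry), "<-", entry)
```

A "qdisk" hit points at quorum_dev_poll or the qdisk timeout, while a "token" hit points at the totem token settings described above.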

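Note that the unit is milliseconds. The cman tag can also be modified:
Note: RHEL4 does not use the openais architecture, so there the timeout can only be changed through deadnode_timeout.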
With this background, it is easy to see that the timeouts, written T(*), should satisfy the following relation:
T(MPIO) < T(qdisk) < T(cman)

Red Hat officially recommends:
T(qdisk) = T(MPIO) × 1.3
T(cman) = T(MPIO) × 2.7
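
As a worked example of the two multipliers above, the sketch below derives T(qdisk) and T(cman) from a measured multipath failover time. It uses integer milliseconds to avoid floating-point rounding. The function name `recommended_timeouts_ms` and the 30-second failover figure are assumptions for illustration, not values from Red Hat.

```python
def recommended_timeouts_ms(mpio_failover_ms: int) -> dict:
    """Apply the x1.3 and x2.7 multipliers to T(MPIO), in milliseconds."""
    return {
        "mpio": mpio_failover_ms,
        "qdisk": mpio_failover_ms * 13 // 10,  # T(qdisk) = T(MPIO) x 1.3
        "cman": mpio_failover_ms * 27 // 10,   # T(cman)  = T(MPIO) x 2.7
    }


# Hypothetical measurement: multipath failover takes 30 seconds.
timeouts = recommended_timeouts_ms(30000)
print(timeouts)

# Both multipliers exceed 1 and 2.7 > 1.3, so the values come out ordered.
assert timeouts["mpio"] < timeouts["qdisk"] < timeouts["cman"]
```

With a 30-second failover this gives T(qdisk) = 39 seconds, which is the value produced by interval="3" and tko="13" (3 × 13 = 39), and T(cman) = 81 seconds.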
