Category: Servers and Storage

2010-05-14 23:39:39

Solution
What is a mailbox disk?
Mailbox disks store cluster-related data that must persist across reboots. A clustered filer reads this data during the boot process, specifically the cluster state, the state of the mirrors, and disk ownership. If the filer finds a reservation on its disks, it assumes that the partner has taken over: instead of booting normally, it displays "Waiting for giveback" and waits for cf giveback to be performed before continuing a normal boot. If no reservation is in place, the filer boots normally. The filer writes information to the mailbox to show its partner that it is still connected. It also records state information, such as the mirror state, which is used to determine which plex of a mirror is more up to date. The mailbox is a secondary way (besides the interconnect cable) to ensure a heartbeat and avoid a split-brain situation.
It also avoids unnecessary takeovers that could be caused by a disruption of the interconnect cable.
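
The boot decision described above can be summarized in a few lines. This is only an illustrative sketch: the names `BootMode` and `boot_mode` are hypothetical and do not correspond to any Data ONTAP interface.

```python
from enum import Enum


class BootMode(Enum):
    NORMAL = "normal boot"
    WAITING_FOR_GIVEBACK = "Waiting for giveback"


def boot_mode(reservation_on_own_disks: bool) -> BootMode:
    """Return the boot path a clustered filer takes (hypothetical helper)."""
    if reservation_on_own_disks:
        # A reservation means the partner is assumed to have taken over,
        # so the filer waits until cf giveback is performed.
        return BootMode.WAITING_FOR_GIVEBACK
    # No reservation in place: boot normally.
    return BootMode.NORMAL


print(boot_mode(True).value)   # Waiting for giveback
print(boot_mode(False).value)  # normal boot
```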

How does the filer choose a mailbox disk?
Normally, the filer chooses the parity disk and the first data disk of the root volume as its two mailbox disks. If either of these two disks fails, the mailbox role moves to the dparity disk of the root volume.

Data ONTAP can change the mailbox disks. For example, if the parity disk of the root volume fails, Data ONTAP assigns the mailbox role to the dparity disk of the root volume. Once the replacement disk finishes reconstruction, the role returns to the current parity disk and the first data disk of the root volume.
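
As a rough illustration of this selection rule, the sketch below returns which root volume disks hold the mailbox role for a given disk state. The function name and the role labels are assumptions made for this example, and the case where both usual holders fail at once is not covered by this article, so the sketch rejects it.

```python
from typing import List


def choose_mailbox_disks(parity_ok: bool, first_data_ok: bool) -> List[str]:
    """Return the root volume disk roles that hold the mailbox (illustrative)."""
    if not parity_ok and not first_data_ok:
        # This situation is not described in the article.
        raise ValueError("both usual mailbox disks failed: not covered here")
    holders = []
    if parity_ok:
        holders.append("parity")
    if first_data_ok:
        holders.append("first data")
    if len(holders) < 2:
        # One usual holder failed, so the dparity disk takes over its role.
        holders.append("dparity")
    return holders


print(choose_mailbox_disks(True, True))   # ['parity', 'first data']
print(choose_mailbox_disks(False, True))  # ['first data', 'dparity']
```

After the replacement disk has been reconstructed, calling the function with both flags set to `True` again yields the original pair.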


How does the filer access mailbox disks?

Each filer writes information only to its own mailbox disks. It reads the information written by its partner from the partner's mailbox disks, but it never writes anything to the partner's mailbox. So, even when a SCSI-3 reservation is placed on the partner's disks, the local filer can still read data from them. In this way, the filer detects whether the partner head is still alive. The filer tries to access the mailbox disks every 3-5 seconds. If all of the mailbox disks on the local side are removed at the same time, a "permanent error of accessing mailbox disk" error occurs and the system panics.
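
The access pattern above (owner writes, partner only reads) can be modeled with a toy class. This is a sketch under stated assumptions: the `Mailbox` class, the filer names, and the idea of a simple incrementing heartbeat counter are all hypothetical and only stand in for whatever the filer actually writes.

```python
class Mailbox:
    """Toy model of one filer's mailbox (hypothetical, not ONTAP code)."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.heartbeat = 0  # assumed counter standing in for the real data

    def write_heartbeat(self, writer: str) -> None:
        # Only the owner writes; the partner never writes to this mailbox.
        if writer != self.owner:
            raise PermissionError(f"{writer} may not write to {self.owner}'s mailbox")
        self.heartbeat += 1

    def read_heartbeat(self) -> int:
        # Reading is possible for the owner and for the partner.
        return self.heartbeat


def partner_alive(previous: int, current: int) -> bool:
    """Treat the partner as alive if its heartbeat advanced since the last poll."""
    return current > previous


mailbox_b = Mailbox("filer_b")
last_seen = mailbox_b.read_heartbeat()
mailbox_b.write_heartbeat("filer_b")                       # filer_b updates its own mailbox
print(partner_alive(last_seen, mailbox_b.read_heartbeat()))  # True
```

In this model, filer_a would repeat the read once per polling interval (every 3-5 seconds according to the text above) and compare the result with the value it saw last time.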


How does Data ONTAP use mailbox disks to decide when to disable the cluster?
It uses a majority (quorum) rule: the cluster stays stable as long as half of the members are available. If fewer than half of the members are available, in other words, if more than half of the members fail, the cluster is disabled. So whenever one of the two mailbox disks is bad, broken, or unresponsive, a "mailbox uncertain" or "mailbox error detected" message appears, and the cluster is disabled for a while to check whether the situation can recover. If the mailbox role is then successfully moved to another good disk in the RAID group, the cluster is enabled again.

With SyncMirror, there can be 4 mailbox disks on one side, 2 for local and 2 for the partner, if the aggregate or volume containing the mailbox disks is mirrored with SyncMirror. In this case, if one of them fails, the message does not appear because the quorum threshold has not been reached. If two or more fail, the message is displayed.
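
The two examples above (1 failure out of 2 mailbox disks, and 1 or 2 failures out of 4) are consistent with a simple threshold: the message appears once at least half of the mailbox disks have failed. The sketch below encodes that reading. It is an interpretation of the examples in this article, not Data ONTAP's actual logic, and the function name is hypothetical.

```python
def mailbox_message_shown(total_disks: int, failed_disks: int) -> bool:
    """Return True if, per the examples above, the mailbox message appears.

    Assumption: the message is shown once at least half of the
    mailbox disks have failed.
    """
    if total_disks <= 0 or not 0 <= failed_disks <= total_disks:
        raise ValueError("invalid disk counts")
    return 2 * failed_disks >= total_disks


print(mailbox_message_shown(2, 1))  # True  (one of two mailbox disks failed)
print(mailbox_message_shown(4, 1))  # False (SyncMirror, one of four failed)
print(mailbox_message_shown(4, 2))  # True  (SyncMirror, two of four failed)
```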