Sun Cluster 2.x and 3.x: Explanation of the PMF (东方蜘蛛 / EastSpider, ChinaUnix blog)

2006-05-02 20:57:17

Sun Cluster 2.x and 3.x: Explanation of the Process Monitor Facility (PMF)

This document attempts to explain, in plain English, how the Process Monitor Facility, known as pmf, pmfd or rpc.pmfd, exercises control over other system processes.

PMF is a cluster (2.x and 3.x) facility to monitor processes within the cluster framework and provide for restarts of the processes where possible.

PMF is also tied to the failfast driver to alert the failfast driver when a critical cluster process has died.

This functionality has been expanded in Cluster 3.x to provide for monitoring of data services (resources), notably the generic data service.

This has led to an increase in PMF's visibility and an increase in the need for information on its operation.

BASICS

PMF is an RPC (Remote Procedure Call) service called rpc.pmfd. It is located in /opt/SUNWcluster/bin (2.2), /usr/cluster/lib/sc (3.x, 32-bit) or /usr/cluster/lib/sc/sparcv9 (3.x, 64-bit).

In Sun Cluster 2.2, it is started by an rc (run control) script in /etc/rc3.d called S23initpmf.

In Sun Cluster 3.0, this script is called S17initpmf.

During startup, the process (rpc.pmfd) sets aside some memory for itself, puts itself into real-time mode, and waits for processes to be registered through pmfadm (the administrative command).

Cluster processes that need pmf monitoring get started with pmfadm to place them in pmfd's monitoring list.

MONITORING

pmf monitors processes through the use of tags handed to it by pmfadm and by attaching itself to pids (process identifiers) in the /proc filesystem.

Because only one monitoring process can be attached to a given pid in the /proc filesystem at a time, truss cannot be used on a process that was started under pmf control.

pmf uses the /proc filesystem monitoring because one of the options that is available when starting a process under pmf control is to monitor that process' children, or sub-processes. In order to do this, pmf listens to each process (using a method similar to truss) to detect fork() system calls, indicating that a child process is being created. Once a child has been detected, pmf then gathers information about that process and keeps track of it as well.

Different levels of child monitoring can be specified for pmf's behavior. The default is for pmf to monitor a process and all children. In that case, the original process is not restarted until it and all its children have died.
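The bookkeeping described above can be sketched in a few lines of Python. This is a toy model, not pmf's implementation: the class name TagMonitor, its methods, and the convention that the original process sits at level 0 are assumptions made for illustration, and fork/exit events are passed in by hand rather than read from /proc.

```python
from typing import Dict, Optional


class TagMonitor:
    """Toy model of one monitored tag: a process plus its tracked descendants."""

    def __init__(self, root_pid: int, max_level: Optional[int] = None) -> None:
        # max_level=None models the default of tracking every generation.
        self.max_level = max_level
        # Map of live pid -> depth; assumption: the original process is level 0.
        self.levels: Dict[int, int] = {root_pid: 0}

    def on_fork(self, parent_pid: int, child_pid: int) -> bool:
        """Record a fork; return True if the child is now being tracked."""
        if parent_pid not in self.levels:
            return False  # the parent is not one of ours, ignore the event
        depth = self.levels[parent_pid] + 1
        if self.max_level is not None and depth > self.max_level:
            return False  # deeper than the requested level, not tracked
        self.levels[child_pid] = depth
        return True

    def on_exit(self, pid: int) -> bool:
        """Record an exit; return True once nothing tracked is left alive."""
        self.levels.pop(pid, None)
        return not self.levels
```

In this model the tag only becomes eligible for a restart when on_exit() returns True, which mirrors the rule that the original process is not restarted until it and all monitored children have died.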

RESTARTING

The truly interesting behavior of pmf is centered around what to do when a particular process that is being monitored by pmf dies. This action is determined by:

1. Are children being monitored?
2. What action is specified for this process when it dies?
3. How much time has elapsed since it last died?
4. How many times have we tried to restart it?

These are all configurable using pmfadm. The general logic is as follows:

- If a process dies, check to see if we are still monitoring it.
- If we are supposed to stop monitoring it, quit.
- If we are still monitoring the process that died, check to see if we are monitoring its parent or children.
- If we are monitoring the parent or children, check to see if they are running.
- If the parent or children are running, quit.
- If this is the last of the parent/child processes that have died, check to see if there is an action to be performed (specified by the -a option to pmfadm).
- If there is no action, quit.
- If there is an action, execute it.
- If the action is successful, check to see if we have restarted this process too many times (configurable with -n in pmfadm) in the specified time interval (-t option to pmfadm).
- If we have exceeded our restart limits, quit.
- If we have not exceeded our restart limits, restart the process with the original arguments, and count another failure in the timeout period.
- If the action is not successful, remove the process from monitoring.
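The decision list above can be written out as a small Python function. This is only a sketch of the logic as listed here, not pmf's code: the names TagState, failures_in_window and on_death are hypothetical, the action script is stood in for by a callable that returns True on success, and time is passed in as plain seconds so the example stays deterministic.

```python
from typing import Callable, List, Optional

INFINITE = -1  # the value the option list uses for "no limit"


class TagState:
    """Per-tag settings and history (hypothetical stand-in for pmf's state)."""

    def __init__(self, retries: int = 0, period_minutes: int = INFINITE,
                 action: Optional[Callable[[], bool]] = None) -> None:
        self.retries = retries                # models -n
        self.period_minutes = period_minutes  # models -t
        self.action = action                  # models the -a script
        self.monitored = True
        self.failures: List[float] = []       # times (seconds) of counted failures


def failures_in_window(state: TagState, now: float) -> int:
    """Count failures that fall inside the -t window ending at `now`."""
    if state.period_minutes == INFINITE:
        return len(state.failures)
    cutoff = now - state.period_minutes * 60
    return sum(1 for t in state.failures if t > cutoff)


def on_death(state: TagState, relatives_running: bool, now: float) -> str:
    """Walk the decision list once for a process that has just died."""
    if not state.monitored:
        return "quit: no longer monitored"
    if relatives_running:
        return "quit: parent or children still running"
    if state.action is None:
        return "quit: no action"
    if not state.action():
        state.monitored = False               # failed action removes the tag
        return "removed: action failed"
    limited = state.retries != INFINITE
    if limited and failures_in_window(state, now) >= state.retries:
        return "quit: restart limit reached"
    state.failures.append(now)                # count another failure
    return "restart"
```

With retries=2 and a one-minute period, two deaths in quick succession both yield "restart", a third inside the same minute yields "quit: restart limit reached", and a death after the window has passed is restarted again.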

pmfadm -- the administrative interface

Once rpc.pmfd is started, everything else occurs via pmfadm commands.

A simplified argument list follows:

-a action : The name of the action script to execute as part of the restart logic.

-c nametag : Start a process and use nametag as its identifier.

-C level : Keep track of this level of children. Default is all; i.e., children, children's children, and so on.

-e ENV_VAR=env.value : An environment variable in the form ENV_VAR=env.value which is passed to the new process. This option can be repeated.

-E : Pass the whole pmfadm environment to the new process.

NOTE: The -e and -E options are mutually exclusive.

-h host : The name of the host to contact. Default is localhost.

-k nametag signal : Send the specified signal to the processes associated with nametag, including any processes associated with the action program if it is currently running. The default signal, SIGKILL, is sent if none is specified. If the process and its descendants exit, and there are remaining retries available, the process monitor restarts the process. The signal can be given using any of the names recognized by the kill(1) command.

-l nametag : Print out status information about nametag.

-L : Return a list of all running tags that belong to the user who issued the command. If the user is root, all tags running on the server are shown.

-m nametag : Modify the number of retries, or time period over which to observe retries, for nametag.

-n retries : Number of retries allowed within the specified time period. The default value for this field is 0, which means that the process is not restarted once it exits. A value of -1 indicates that the number of retries is infinite.

-q nametag : Indicate whether nametag is registered and running under the process monitor. Returns 0 if it is, 1 if it is not.

-s nametag : Stop restarting the command associated with nametag.

-t period : Minutes over which to count failures. The default value is -1, which equates to infinity.

-w timeout : When used in conjunction with the -s nametag or -k nametag flags, wait up to the specified number of seconds for the processes associated with nametag to exit.
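As an illustration of how these options combine, here is a small Python helper that assembles a pmfadm start command as an argument list. It only builds the list and does not run anything. The function name build_pmfadm_start, the tag name and the command path in the usage note are hypothetical, and placing the command to be monitored after the options is an assumption, since the list above does not show where it goes.

```python
from typing import Dict, List, Optional, Sequence


def build_pmfadm_start(nametag: str, command: Sequence[str],
                       action: Optional[str] = None,
                       retries: Optional[int] = None,
                       period: Optional[int] = None,
                       level: Optional[int] = None,
                       env: Optional[Dict[str, str]] = None,
                       pass_env: bool = False) -> List[str]:
    """Assemble an argv list for starting `command` under tag `nametag`."""
    if env and pass_env:
        # The note in the option list: -e and -E cannot be combined.
        raise ValueError("-e and -E are mutually exclusive")
    argv = ["pmfadm", "-c", nametag]
    if action is not None:
        argv += ["-a", action]
    for name, value in (env or {}).items():
        argv += ["-e", f"{name}={value}"]   # -e may be repeated
    if pass_env:
        argv.append("-E")
    if retries is not None:
        argv += ["-n", str(retries)]        # restarts allowed in the period
    if period is not None:
        argv += ["-t", str(period)]         # minutes over which to count
    if level is not None:
        argv += ["-C", str(level)]
    # Assumption: the monitored command follows the options.
    argv += list(command)
    return argv
```

For example, build_pmfadm_start("demo.tag", ["/bin/sleep", "600"], retries=3, period=10) produces a list that could be handed to subprocess.run() on a cluster node. Options left unset are simply omitted, so pmfadm's own defaults apply.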
