1 First, download the source package. It is available from FTP mirrors, for example:
ftp://ftp.rucus.ru.ac.za/pub/.1/distfiles/
watchdog_5.2.4.orig.tar.gz (124.80 KB, 2004-08-07)
2 Unpack the tarball
$ tar zxvf watchdog_5.2.4.orig.tar.gz
$ cd watchdog-5.2.4.orig
3 Build and install
$ ./configure
$ make
$ make install
Default install locations of the programs:
/usr/bin/install -c watchdog /usr/sbin/watchdog
/usr/bin/install -c wd_keepalive /usr/sbin/wd_keepalive
Default configuration file path:
/etc/watchdog.conf
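A quick way to confirm the install step worked is to check that the binaries are in place and executable. The helper below is a minimal sketch; the function name `check_installed` is made up for this example and is not part of the watchdog package.

```shell
#!/bin/sh
# check_installed PATH...: report whether each PATH exists and is executable.
# Returns 0 only if every path passes. (Hypothetical helper for illustration.)
check_installed() {
    rc=0
    for path in "$@"; do
        if [ -x "$path" ]; then
            echo "ok: $path"
        else
            echo "missing: $path"
            rc=1
        fi
    done
    return $rc
}
```

Usage, with the paths from the install output above: check_installed /usr/sbin/watchdog /usr/sbin/wd_keepalive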
5 Basic configuration
#ping = 172.31.14.1
#ping = 172.26.1.255
#interface = eth0
#file = /var/log/messages
#change = 1407
Limits on system load (load averages over 1, 5, and 15 minutes):
#max-load-1 = 24
#max-load-5 = 18
#max-load-15 = 12
Limit on system memory usage (minimum free memory):
#min-memory = 1
Specify a repair script. It can, for example, restart a monitored program that has been killed:
#repair-binary = /usr/sbin/repair
#test-binary =
#test-timeout =
Watchdog hardware device:
#watchdog-device = /dev/watchdog
# Defaults compiled into the binary
#temperature-device =
#max-temperature = 120
# Defaults compiled into the binary
#admin = root
#interval = 10
#logtick = 1
# This greatly decreases the chance that watchdog won't be scheduled before
# your machine is really loaded
realtime = yes
priority = 1
PID file of a monitored program:
#pidfile = /var/run/syslogd.pid (if the syslogd process stops, the system reboots automatically)
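To make the pidfile option more concrete, the sketch below shows roughly what such a check amounts to: read a PID from a file and probe the process with `kill -0`, which tests for existence without sending a signal. The function name `pid_alive` is invented for this example, and this is only an approximation of the idea, not watchdog's actual code.

```shell
#!/bin/sh
# pid_alive PIDFILE: succeed if PIDFILE holds the PID of a running process.
# Hypothetical helper; run it as root (as watchdog itself runs), because
# kill -0 also fails for processes owned by another user.
pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric contents
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 only checks that the process exists
    kill -0 "$pid" 2>/dev/null
}
```

For example, pid_alive /var/run/syslogd.pid returns 0 while syslogd is running and non-zero once it has stopped or the file is gone.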
Documentation to consult when configuring watchdog:
man 8 watchdog
man 5 watchdog.conf
/usr/share/doc/watchdog/examples
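I only use it to keep sshd, apache, and mysql running reliably.
6 Make sure apache, ssh, and mysql start before watchdog
After a reboot, watchdog looks up the PIDs of these three processes. If one of them is missing, it reboots the machine by default (unless a repair binary is configured). The boot order in the startup scripts must therefore be changed so that watchdog starts only after all three are running.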
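Besides reordering the init scripts, one way to enforce this ordering is a small wrapper that waits for the PID files before launching watchdog. The sketch below is an assumption on my part, not something shipped with watchdog; the function name `wait_for_pidfiles` and the PID file paths in the usage line are hypothetical.

```shell
#!/bin/sh
# wait_for_pidfiles TIMEOUT FILE...: poll once per second until every FILE
# exists and is non-empty, or give up after TIMEOUT seconds.
# Returns 0 when all files are present, 1 on timeout.
wait_for_pidfiles() {
    timeout=$1
    shift
    elapsed=0
    while :; do
        missing=0
        for f in "$@"; do
            [ -s "$f" ] || missing=1
        done
        if [ "$missing" -eq 0 ]; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

Usage (adjust the paths to wherever your distribution writes them): wait_for_pidfiles 60 /var/run/sshd.pid /var/run/apache2.pid /var/run/mysqld/mysqld.pid && /usr/sbin/watchdog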
7 Set up the repair script
I don't want the machine to reboot every time one of these applications has a minor problem, so I wrote a simple script:
root@wlj:/var/run# cat /usr/sbin/repair.sh
#!/bin/sh
# watchdog passes the error code as $1. Exiting with it (non-zero) tells
# watchdog the repair failed; exit 0 means the repair succeeded.

# Restart networking
if [ -x /etc/init.d/networking ]; then
    # Debian
    /etc/init.d/networking restart
elif [ -x /etc/rc.d/init.d/network ]; then
    # Redhat
    /etc/rc.d/init.d/network restart
else
    echo "Couldn't find network script to relaunch networking. Please edit $0" | logger -i -t repair -p daemon.info
    exit $1
fi

# Re-run local startup commands
if [ -x /etc/rc.local ]; then
    /etc/rc.local
else
    echo "rc.local restart error" | logger -i -t repair -p daemon.info
    exit $1
fi

# Restart the monitored services
if [ -x /etc/init.d/apache2 ]; then
    /etc/init.d/apache2 restart
else
    echo "apache restart error" | logger -i -t repair -p daemon.info
    exit $1
fi

if [ -x /etc/init.d/ssh ]; then
    /etc/init.d/ssh restart
else
    echo "sshd restart error" | logger -i -t repair -p daemon.info
    exit $1
fi

if [ -x /etc/init.d/mysql ]; then
    /etc/init.d/mysql restart
else
    echo "mysql restart error" | logger -i -t repair -p daemon.info
    exit $1
fi

exit 0
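For watchdog to call this script, repair-binary in /etc/watchdog.conf has to point at the script's actual path (here /usr/sbin/repair.sh, not the /usr/sbin/repair shown in the commented default), and the script must be executable. A possible fragment is shown below; the three PID file paths are assumptions and differ between distributions. The pidfile option may be given more than once to monitor several servers.

```
# /etc/watchdog.conf (fragment)
repair-binary = /usr/sbin/repair.sh

# Assumed PID file locations; check where your system writes them
pidfile = /var/run/sshd.pid
pidfile = /var/run/apache2.pid
pidfile = /var/run/mysqld/mysqld.pid
```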
8 Test that watchdog is working
Try killing sshd, then check the log in /var/log/syslog.
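To pick the relevant entries out of a busy log, a small filter helps. The sketch below assumes that watchdog logs under the name "watchdog" and relies on the "repair" tag that the script above passes to logger; the function name `watchdog_log_lines` is made up for this example.

```shell
#!/bin/sh
# watchdog_log_lines LOGFILE: print the lines of LOGFILE that mention
# watchdog or the "repair" logger tag. (Hypothetical helper.)
watchdog_log_lines() {
    grep -E 'watchdog|repair' "$1"
}
```

After killing sshd, running watchdog_log_lines /var/log/syslog | tail -n 20 shows the most recent matching entries.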