Category: LINUX
2011-07-15 07:10:32
Auxiliary monitoring for JBoss-based services
The scripts are deployed on 10.247. Their crontab entries:
# Job log update monitor
*/10 * * * * /usr/local/shell/web-tools/service-monitor/update-monitor.sh > /dev/null 2>&1
*/20 * * * * /usr/local/shell/web-tools/service-monitor/spiderjob.sh > /dev/null 2>&1
*/40 * * * * /usr/local/shell/web-tools/service-monitor/expjob.sh > /dev/null 2>&1
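One caveat about the last entry: in cron, a step such as `*/40` in the minute field is applied within the range 0-59, so it matches minutes 0 and 40 of every hour rather than firing once every 40 minutes. The sketch below only illustrates that arithmetic; `step_minutes` is a hypothetical helper and not part of the deployed scripts.

```shell
#!/bin/sh
# step_minutes N: print the minutes of the hour matched by "*/N" in a
# crontab minute field (range 0-59, stepping by N starting from 0).
step_minutes() {
    step=$1
    m=0
    out=""
    while [ "$m" -le 59 ]; do
        out="$out${out:+ }$m"    # separate values with a single space
        m=$((m + step))
    done
    echo "$out"
}

step_minutes 10   # update-monitor.sh
step_minutes 20   # spiderjob.sh
step_minutes 40   # expjob.sh
```

For the `*/40` entry this means expjob.sh runs at :00 and :40, so the gaps between runs alternate between 40 and 20 minutes.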
Monitored hosts:
(1)10.190 10.80 10.82 10.83 10.85 10.87 30.11 30.12 30.101
(2)40.31 40.32
(3)40.72
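The entries above are abbreviated to the last two octets; the scripts prepend `10.10.` when they connect (as in `ssh 10.10.$ip`). A minimal sketch of that expansion, where `full_ip` is a hypothetical helper name used only for illustration:

```shell
#!/bin/sh
# full_ip SUFFIX: expand a two-octet suffix such as 40.31 to the full address.
full_ip() {
    echo "10.10.$1"
}

# Group (2) from the list above.
for ip in 40.31 40.32; do
    full_ip "$ip"
done
```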
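How it works:
Every 10, 20, or 40 minutes, the relevant script checks whether the JBoss log on each monitored host has grown in size, as an indirect indication of whether the JBoss-based service is running normally. This check is not conclusive on its own, but it is a useful supplementary signal of service health.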
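The check described above can be sketched with local files. This is an illustration under stated assumptions and not the deployed script: `check_growth` and the state-file layout are hypothetical, and the size is read locally with `wc -c`, whereas the real scripts obtain it over ssh.

```shell
#!/bin/sh
# check_growth LOGFILE STATEFILE
# Prints "grown" if LOGFILE is larger than at the previous call,
# "stalled" if it is not, and "first" when no previous size is recorded.
check_growth() {
    log=$1
    state=$2
    size=$(wc -c < "$log" | tr -d ' ')      # current size in bytes
    last=""
    [ -f "$state" ] && last=$(tail -1 "$state")
    echo "$size" >> "$state"                # record for the next run
    if [ -z "$last" ]; then
        echo first
    elif [ "$size" -gt "$last" ]; then
        echo grown
    else
        echo stalled
    fi
}

# Demo on temporary files.
demo=$(mktemp -d)
echo "line 1" > "$demo/server.log"
check_growth "$demo/server.log" "$demo/state"    # first
echo "line 2" >> "$demo/server.log"
check_growth "$demo/server.log" "$demo/state"    # grown
check_growth "$demo/server.log" "$demo/state"    # stalled
rm -rf "$demo"
```

A "stalled" result only means that nothing was written to the log between two checks, which is why it serves as a hint and not as proof that the service is down.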
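Alerting:
If the log size at a check is not greater than the size recorded at the previous check, the script connects to the alerting platform on 10.247 and sends an SMS alert. The alert text consists of the IP address, a message, and the alert time:
(1) IP address, "Job log have NOT updated in 10 min", alert time
(2) IP address, "SpiderJob log have NOT updated in 20 min", alert time
(3) IP address, "Job log have NOT updated in 40 min", alert time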
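The alert text can be assembled from its three parts as follows. This is only a sketch of the message format; `alert_msg` is a hypothetical helper, and the actual delivery through `fetion.sh` on 10.247 is not reproduced here.

```shell
#!/bin/sh
# alert_msg IP LABEL MINUTES TIME: build the SMS text in the format
# "<ip>: <label> log have NOT updated in <minutes> min.<time>."
alert_msg() {
    echo "$1: $2 log have NOT updated in $3 min.$4."
}

alert_msg 10.190 Job 10 "$(date +%T)"
alert_msg 40.31 SpiderJob 20 "$(date +%T)"
```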
Script source:
(1)/usr/local/shell/web-tools/service-monitor/update-monitor.sh
#!/bin/sh
# monitor the logs' size
# Shao
time=`date +%T`
hour=`date +%H`
min=`date +%M`
flushtime=$hour$min
echo $flushtime
SMhome="/usr/local/shell/web-tools/service-monitor"
tmpdir="$SMhome"/tmp
jbosslog=/usr/local/jboss/server/default/log/server.log
nginxlog=/usr/local/nginx/logs/
apachelog=/usr/local/httpd/logs/access_log
#for ip in 10.190 10.80 10.82 10.83 10.85 10.87 30.11 30.12 30.101 40.31 40.32 40.33 ; do
for ip in 10.190 10.80 10.82 10.83 10.85 10.87 30.11 30.12 30.101; do
    echo $ip
    jlogtmp=$tmpdir/$ip.jlog.tmp
    #flush
    # NOTE: the original condition is missing from the post; a reset at midnight (00:00) is assumed here
    if [ "$flushtime" = "0000" ]; then
        echo 00 > $tmpdir/$ip.tmp
    fi
    # current size in bytes of the remote JBoss log (field 5 of ls -l)
    size=`ssh 10.10.$ip "ls -l $jbosslog" | awk '{print $5}'`
    # size recorded by the previous run; read before the new size is appended
    lastsize=`tail -1 $jlogtmp 2>/dev/null`
    echo $size >> $jlogtmp
    # log update
    echo $size $lastsize
    # alert only when both sizes are known and the log has not changed
    if [ -n "$size" ] && [ -n "$lastsize" ] && [ "$size" -eq "$lastsize" ]; then
        ssh 10.10.10.247 "fetion.sh phone='13811371488,13811193602' msg='$ip: Job log have NOT updated in 10 min.$time.'" > /dev/null
        #wget --output-document=/dev/null " Job log have NOT updated in 20 min.$time."
    fi
done
#####################################################################
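(2)/usr/local/shell/web-tools/service-monitor/spiderjob.sh
#!/bin/sh
# monitor the logs' size
# Shao
time=`date +%T`
hour=`date +%H`
min=`date +%M`
flushtime=$hour$min
echo $flushtime
SMhome="/usr/local/shell/web-tools/service-monitor"
tmpdir="$SMhome"/tmp
jbosslog=/usr/local/jboss/server/default/log/server.log
nginxlog=/usr/local/nginx/logs/
apachelog=/usr/local/httpd/logs/access_log
for ip in 40.31 40.32; do
    echo $ip
    jlogtmp=$tmpdir/$ip.jlog.tmp
    #flush
    # NOTE: the original condition is missing from the post; a reset at midnight (00:00) is assumed here
    if [ "$flushtime" = "0000" ]; then
        echo 00 > $tmpdir/$ip.tmp
    fi
    # current size in bytes of the remote JBoss log (field 5 of ls -l)
    size=`ssh 10.10.$ip "ls -l $jbosslog" | awk '{print $5}'`
    # size recorded by the previous run; read before the new size is appended
    lastsize=`tail -1 $jlogtmp 2>/dev/null`
    echo $size >> $jlogtmp
    # log update
    echo $size $lastsize
    # alert only when both sizes are known and the log has not changed
    if [ -n "$size" ] && [ -n "$lastsize" ] && [ "$size" -eq "$lastsize" ]; then
        ssh 10.10.10.247 "fetion.sh phone='13811371488,13811193602' msg='$ip: SpiderJob log have NOT updated in 20 min.$time.'" > /dev/null
        #wget --output-document=/dev/null " Job log have NOT updated in 20 min.$time."
    fi
done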
#########################################################################
(3)/usr/local/shell/web-tools/service-monitor/expjob.sh
#!/bin/sh
# monitor the logs' size
# Shao
time=`date +%T`
hour=`date +%H`
min=`date +%M`
flushtime=$hour$min
echo $flushtime
SMhome="/usr/local/shell/web-tools/service-monitor"
tmpdir="$SMhome"/tmp
jbosslog=/usr/local/jboss/server/default/log/server.log
nginxlog=/usr/local/nginx/logs/
apachelog=/usr/local/httpd/logs/access_log
for ip in 40.72; do
#for ip in 10.190 10.80 10.82 10.83 10.85 10.87 30.11 30.12 30.101 40.72; do
    echo $ip
    jlogtmp=$tmpdir/$ip.jlog.tmp
    #flush
    # NOTE: the original condition is missing from the post; a reset at midnight (00:00) is assumed here
    if [ "$flushtime" = "0000" ]; then
        echo 00 > $tmpdir/$ip.tmp
    fi
    # current size in bytes of the remote JBoss log (field 5 of ls -l)
    size=`ssh 10.10.$ip "ls -l $jbosslog" | awk '{print $5}'`
    # size recorded by the previous run; read before the new size is appended
    lastsize=`tail -1 $jlogtmp 2>/dev/null`
    echo $size >> $jlogtmp
    # log update
    echo $size $lastsize
    # alert only when both sizes are known and the log has not changed
    if [ -n "$size" ] && [ -n "$lastsize" ] && [ "$size" -eq "$lastsize" ]; then
        ssh 10.10.10.247 "fetion.sh phone='13811371488,13811193602,15010801866' msg='$ip: Job log have NOT updated in 40 min.$time.'" > /dev/null
        #wget --output-document=/dev/null " Job log have NOT updated in 20 min.$time."
    fi
done
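
The three scripts differ only in the host list, the label in the message, and the interval, so the shared logic could be factored into one function. The following is a sketch under stated assumptions, not the deployed code: `monitor_hosts`, `get_size`, `send_alert`, and `STATE_DIR` are hypothetical names, and the two helpers are local stand-ins so the example runs without ssh or `fetion.sh`.

```shell
#!/bin/sh
# Hypothetical stand-ins: a real deployment would wrap the ssh calls
# used in the scripts above instead of reading and printing locally.
get_size()   { cat "$STATE_DIR/$1.size" 2>/dev/null; }   # $1 = host suffix
send_alert() { echo "ALERT $1"; }                          # $1 = message text

# monitor_hosts LABEL MINUTES HOST...
# Compares each host's current log size with the last recorded one and
# sends an alert when the size is unchanged.
monitor_hosts() {
    label=$1
    minutes=$2
    shift 2
    now=$(date +%T)
    for ip in "$@"; do
        hist="$STATE_DIR/$ip.jlog.tmp"
        size=$(get_size "$ip")
        last=""
        [ -f "$hist" ] && last=$(tail -1 "$hist")
        echo "$size" >> "$hist"
        if [ -n "$size" ] && [ -n "$last" ] && [ "$size" -eq "$last" ]; then
            send_alert "$ip: $label log have NOT updated in $minutes min.$now."
        fi
    done
}

# Demo with fake sizes kept in a temporary directory.
STATE_DIR=$(mktemp -d)
echo 100 > "$STATE_DIR/40.31.size"
echo 100 > "$STATE_DIR/40.32.size"
monitor_hosts SpiderJob 20 40.31 40.32    # first pass: nothing to compare yet
echo 150 > "$STATE_DIR/40.32.size"        # only 40.32 grows
monitor_hosts SpiderJob 20 40.31 40.32    # alerts for 40.31 only
rm -rf "$STATE_DIR"
```

With this layout each cron entry would reduce to one call with its own label, interval, and host list, and a fix to the comparison logic would only need to be made once.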