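Principle:
Implementation:
1. Prepare at least two machines with Nagios installed, including the plugins:
central_server: the central (master) server. Apache and cgi.cfg must be configured here.
dist1_server: a distributed server. Apache and cgi.cfg do not need to be configured here.
2. On central_server, install the server side of the NSCA addon (the nsca daemon), configure nsca.cfg, and start the nsca service.
3. On dist1_server, install the client side of the NSCA addon (send_nsca) and configure send_nsca.cfg. The key point is that the password must be identical to the one on the server side.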
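Since the password on both sides has to agree, it can help to compare the two files before going further. The following is only a minimal sketch, not part of the original setup: the helper names cfg_value and nsca_settings_match are made up for illustration, and it assumes both files use plain key=value lines, with password in both files, decryption_method in nsca.cfg and encryption_method in send_nsca.cfg.

```shell
#!/bin/sh
# cfg_value FILE KEY: print the value of the last active "KEY=value" line in FILE.
# Commented-out lines are ignored because the match is anchored at the key.
cfg_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# nsca_settings_match SERVER_CFG CLIENT_CFG
# Succeed only if the password and the encryption method agree on both sides.
nsca_settings_match() {
    [ "$(cfg_value "$1" password)" = "$(cfg_value "$2" password)" ] &&
    [ "$(cfg_value "$1" decryption_method)" = "$(cfg_value "$2" encryption_method)" ]
}

# Example (the two files live on different machines, so copy one over first):
#   nsca_settings_match nsca.cfg send_nsca.cfg && echo "settings match"
```

The check only compares values; it does not tell you whether the nsca daemon is actually reachable from dist1_server.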
4. On central_server, edit nagios.cfg and set the following parameters to the values shown:
enable_notifications=1
check_external_commands=1
execute_service_checks=0
accept_passive_service_checks=1
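A quick way to confirm that these four values are really active in nagios.cfg is to grep for them. This is a hedged sketch: cfg_has and check_central_cfg are hypothetical helper names, and the path in the example comment is an assumption based on the directory used for send_nsca.cfg later in this post.

```shell
#!/bin/sh
# cfg_has FILE KEY VALUE: succeed if FILE contains an active "KEY=VALUE" line.
cfg_has() {
    grep -Eq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*$3[[:space:]]*$" "$1"
}

# check_central_cfg FILE: print every central_server setting that is missing
# from FILE or set to a different value; return non-zero if there is any.
check_central_cfg() {
    status=0
    for pair in enable_notifications=1 check_external_commands=1 \
                execute_service_checks=0 accept_passive_service_checks=1; do
        key=${pair%%=*}      # text before the first "="
        value=${pair#*=}     # text after the first "="
        if ! cfg_has "$1" "$key" "$value"; then
            echo "missing or different: $pair"
            status=1
        fi
    done
    return $status
}

# Example (path is an assumption):
#   check_central_cfg /usr/local/nagios/etc/nagios.cfg
```

The same idea works for dist1_server by swapping in its own list of key=value pairs.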
5. On dist1_server, edit nagios.cfg as follows:
obsess_over_services=1
enable_notifications=0
check_external_commands=1
execute_service_checks=1
accept_passive_service_checks=1
ocsp_command=submit_check_result
Add the following command to misccommand.cfg:
define command{
    command_name    submit_check_result
    command_line    /usr/local/nagios/libexec/eventhandlers/submit_check_result $HOSTNAME$ '$SERVICEDESC$' $SERVICESTATE$ '$SERVICEOUTPUT$'
    }
In /etc/hosts on dist1_server, add an entry that resolves the name central_server to its IP address.
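The hosts entry can be added idempotently, so that re-running a setup script does not append duplicates. A minimal sketch follows; ensure_hosts_entry is a hypothetical helper, and 192.0.2.10 is a documentation-only placeholder address, not the real address of central_server.

```shell
#!/bin/sh
# ensure_hosts_entry FILE IP NAME
# Append "IP<TAB>NAME" to FILE unless NAME already appears as a host name
# on a non-comment line.
ensure_hosts_entry() {
    if awk -v name="$3" '
            /^[[:space:]]*#/ { next }      # skip comment lines
            { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }' "$1"; then
        return 0                           # NAME is already listed
    fi
    printf '%s\t%s\n' "$2" "$3" >> "$1"
}

# Example (run as root; replace the placeholder address):
#   ensure_hosts_entry /etc/hosts 192.0.2.10 central_server
```

If central_server is already resolvable through DNS, this step is not needed at all.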
Create the script /usr/local/nagios/libexec/eventhandlers/submit_check_result, set its permissions to 755, and give it the following contents:
#!/bin/sh
# Arguments:
# $1 = host_name (Short name of host that the service is
# associated with)
# $2 = svc_description (Description of the service)
# $3 = state_string (A string representing the status of
# the given service - "OK", "WARNING", "CRITICAL"
# or "UNKNOWN")
# $4 = plugin_output (A text string that should be used
# as the plugin output for the service checks)
#
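# Convert the state string to the corresponding return code.
# Valid Nagios return codes are 0-3, so default to 3 (UNKNOWN)
# when the state string is not recognized.
return_code=3
case "$3" in
    OK)
        return_code=0
        ;;
    WARNING)
        return_code=1
        ;;
    CRITICAL)
        return_code=2
        ;;
    UNKNOWN)
        return_code=3
        ;;
esac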
# pipe the service check info into the send_nsca program, which
# in turn transmits the data to the nsca daemon on the central
# monitoring server
/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | /usr/local/nagios/bin/send_nsca central_server -c /usr/local/nagios/etc/send_nsca.cfg
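Before relying on the OCSP command, it may be worth pushing one result by hand to confirm that the path from dist1_server to the nsca daemon works. The sketch below only builds the tab-separated line that the script above pipes into send_nsca. The function name nsca_service_line and the host name in the example comment are hypothetical.

```shell
#!/bin/sh
# nsca_service_line HOST SERVICE CODE OUTPUT
# Print one passive service check result as a single line:
# host<TAB>service<TAB>return code<TAB>plugin output
nsca_service_line() {
    case "$3" in
        0|1|2|3) ;;    # OK, WARNING, CRITICAL, UNKNOWN
        *) echo "return code must be 0-3, got '$3'" >&2; return 1 ;;
    esac
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Manual test from dist1_server (host and service must exist on central_server):
#   nsca_service_line dist-web01 "HTTP" 0 "manual test" |
#       /usr/local/nagios/bin/send_nsca central_server -c /usr/local/nagios/etc/send_nsca.cfg
```

If the result does not show up on central_server, the nsca daemon's log and the password check are the first places to look.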
6. Deploy the monitored services on both central_server and dist1_server.
Add the following to every service template:
check_freshness=1
freshness_threshold=300
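In each concrete service definition in services.cfg, add:
freshness_threshold=300
(freshness_threshold is given in seconds. The value should be about twice the service's check_interval, so adjust 300 to your own setup.)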
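The "twice the check interval" rule of thumb can be computed rather than guessed. This sketch assumes check_interval is expressed in Nagios time units and that interval_length keeps its default of 60 seconds. The function name freshness_threshold is made up for illustration.

```shell
#!/bin/sh
# freshness_threshold CHECK_INTERVAL [INTERVAL_LENGTH]
# Print twice the check interval, converted to seconds.
# INTERVAL_LENGTH is the number of seconds per time unit (default 60).
freshness_threshold() {
    echo $(( $1 * ${2:-60} * 2 ))
}

freshness_threshold 5    # check_interval of 5 minutes, prints 600
```

A threshold that is too close to the check interval causes false "stale" results whenever one submission arrives slightly late, which is why the factor of two is used here.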
That completes the configuration.
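7. Points to note:
(1) The services defined on central_server are effectively the union of the services on all distributed servers, so every time a service is added, the configuration has to be updated on both central_server and the distributed server. A deployment script should be able to take care of this.
(2) central_server runs no active checks of its own, so it has to be monitored by a distributed server.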
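Until such a deployment script exists, configuration drift between the two sides can at least be detected. The following is a rough sketch, with hypothetical helper names list_services and missing_on_central. It compares only service_description values, not host and service pairs, so treat its output as a hint rather than a full check.

```shell
#!/bin/sh
# list_services FILE: print the sorted, unique service_description values in FILE.
list_services() {
    sed -n 's/^[[:space:]]*service_description[[:space:]]\{1,\}//p' "$1" |
        sed 's/[[:space:]]*$//' | sort -u
}

# missing_on_central DIST_CFG CENTRAL_CFG
# Print the service descriptions defined in DIST_CFG but not in CENTRAL_CFG.
missing_on_central() {
    dist_list=$(mktemp) && central_list=$(mktemp) || return 2
    list_services "$1" > "$dist_list"
    list_services "$2" > "$central_list"
    comm -23 "$dist_list" "$central_list"    # lines unique to the first list
    rm -f "$dist_list" "$central_list"
}

# Example (copy the distributed server's services.cfg to central_server first):
#   missing_on_central dist1_services.cfg services.cfg
```

An empty result means every service description known to the distributed server also exists on central_server.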
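8. Future work: high availability (HA)
Run central_server on two machines and have a distributed server monitor central_server. When that check fails, an event handler is triggered so that the switch to the standby central_server happens automatically.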
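The decision part of such an event handler could look like the sketch below. It is only an assumption about how the idea might be wired up: it supposes the handler is passed the $SERVICESTATE$ and $SERVICESTATETYPE$ macros as its first two arguments, and the name should_fail_over is hypothetical. How the actual switch is performed is not covered by this post and is left out.

```shell
#!/bin/sh
# should_fail_over STATE STATE_TYPE
# Succeed only for a confirmed failure: state CRITICAL and state type HARD.
# SOFT states are ignored so that a single failed check does not cause a switch.
should_fail_over() {
    [ "$1" = "CRITICAL" ] && [ "$2" = "HARD" ]
}

# Example inside an event handler script (the switch command is a placeholder):
#   should_fail_over "$1" "$2" && echo "would switch to the standby central_server"
```

Waiting for a HARD state trades a slightly slower switch for fewer spurious failovers.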