Category: System operations
2013-02-01 13:43:31
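(I) Monitoring general system information
This part monitors system information on remote Linux hosts using the NRPE plugin.
1. Install the NRPE plugin first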
# wget
# tar xvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make all
# make install-plugin
# make install-daemon
# make install-daemon-config
# make install-xinetd
# vi /etc/xinetd.d/nrpe
only_from = 192.168.137.89
# /etc/init.d/xinetd restart
# vi /etc/services
nrpe 5666/tcp # nrpe for nagios
# /etc/init.d/xinetd restart
# netstat -nltp
# /usr/local/nagios/libexec/check_nrpe -H 192.168.137.89
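The /etc/services edit above can also be made without opening an editor. The following is a minimal sketch, assuming bash; the function name ensure_nrpe_service is hypothetical and not part of NRPE. It appends the entry only when no nrpe line for 5666/tcp is present, so running it twice does no harm.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with NRPE): append the nrpe entry to a
# services file only if it is not already there.
ensure_nrpe_service() {
  local services_file="$1"
  if ! grep -Eq '^nrpe[[:space:]]+5666/tcp' "$services_file"; then
    printf 'nrpe            5666/tcp                # nrpe for nagios\n' >> "$services_file"
  fi
}

# Usage (as root):
#   ensure_nrpe_service /etc/services
```

xinetd still has to be restarted afterwards, as in the step above.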
-----------------------------------------------------------------------------------------------------------------------------------------------
2. Define check_nrpe in Centreon
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
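To make the macros concrete: when a service uses this command with Args !check_disk, $ARG1$ becomes check_disk and $HOSTADDRESS$ becomes the address of the host. The sketch below imitates that substitution in bash. It is only an illustration, not Nagios or Centreon code; expand_command is a hypothetical name, and $USER1$ (normally the plugin directory) is deliberately left unexpanded.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of macro expansion; not how Nagios implements it.
# $1 = command_line template, $2 = host address, $3 = Args string ("!a!b!c")
expand_command() {
  local line="$1" host="$2" args="$3"
  local parts value i=1
  line=${line//'$HOSTADDRESS$'/$host}
  args=${args#!}                      # drop the leading "!"
  IFS='!' read -r -a parts <<< "$args"
  for value in "${parts[@]}"; do
    line=${line//"\$ARG${i}\$"/$value}  # replace $ARG1$, $ARG2$, ...
    i=$((i + 1))
  done
  printf '%s\n' "$line"
}

# Usage:
#   expand_command '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$' 192.168.137.213 '!check_disk'
#   prints: $USER1$/check_nrpe -H 192.168.137.213 -c check_disk
```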
----------------------------------------------------------------------------------------------------------------------------------------------
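3. Define the monitoring commands used by check_nrpe in nrpe.cfg
Because this host is the monitoring server itself, this step is optional and can be skipped.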
# vi /usr/local/nagios/etc/nrpe.cfg
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 250 -c 300
command[check_cpu]=/usr/local/nagios/libexec/check_cpu.sh -w 70 -c 80
command[check_mem]=/usr/local/nagios/libexec/check_mem.sh --raw -w 85 -c 95
command[check_uptime]=/usr/local/nagios/libexec/check_uptime
----------------------------------------------------------------------------------------------------------------------------------------------
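4. Define templates
Turn the commonly used checks into templates so that services do not have to be added one by one each time a new host is monitored.
The checks turned into templates here are:
CPU, memory, process, disk and uptime (which covers load and users).
Repeat the same steps for each remaining check, adding !check_disk, !check_mem, !check_uptime, !check_total_procs and !check_zombie_procs in turn.
With !check_uptime added, the list of templates is complete (screenshot not included).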
----------------------------------------------------------------------------------------------------------------------------------------------
5. Link the NRPE service templates to the host template
----------------------------------------------------------------------------------------------------------------------------------------------
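6. The monitored host
1) Open a terminal on the other Linux host, the one to be monitored
2) Install nagios-plugins and the NRPE plugin as described above
3) Define the NRPE monitoring commands
# vi /usr/local/nagios/etc/nrpe.cfg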
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 250 -c 300
command[check_cpu]=/usr/local/nagios/libexec/check_cpu.sh -w 70 -c 80
command[check_mem]=/usr/local/nagios/libexec/check_mem.sh --raw -w 85 -c 95
command[check_uptime]=/usr/local/nagios/libexec/check_uptime
4) Every plugin referenced in nrpe.cfg must exist on this host; copy or download any that are missing
On the Centreon monitoring server, run:
# cd /usr/local/nagios/libexec
# scp check_cpu.sh root@192.168.137.213:/usr/local/nagios/libexec/
# scp check_mem.sh root@192.168.137.213:/usr/local/nagios/libexec/
On the monitored host, run:
# cd /usr/local/nagios/libexec
# cp /usr/bin/uptime check_uptime
# chown nagios:nagios check_cpu.sh
# chown nagios:nagios check_mem.sh
# chown nagios:nagios check_uptime
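A quick way to confirm that nothing is missing is to walk through the command[...] lines and test each plugin path. This is a minimal sketch, assuming the plugin path is the first word after the equals sign and contains no spaces; missing_plugins is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every plugin referenced by a command[...] line
# in the given nrpe.cfg that is missing or not executable.
missing_plugins() {
  local cfg="$1" line plugin
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      command\[*\]=*) ;;      # only command definitions are of interest
      *) continue ;;
    esac
    plugin=${line#*=}         # strip "command[name]="
    plugin=${plugin%% *}      # keep the path, drop the arguments
    [ -x "$plugin" ] || printf '%s\n' "$plugin"
  done < "$cfg"
}

# Usage:
#   missing_plugins /usr/local/nagios/etc/nrpe.cfg
```

No output means every referenced plugin is present and executable.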
5) Edit the service configuration so that only the Centreon server may connect, then restart xinetd for the change to take effect
# vi /usr/local/nagios/etc/nrpe.cfg
server_address=192.168.137.213
allowed_hosts=192.168.137.213,192.168.137.89
# vi /etc/xinetd.d/nrpe
only_from = 192.168.137.213 192.168.137.89
# /etc/init.d/xinetd restart
# netstat -nltp
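Test the connection from both the Centreon server and the monitored host, for example with check_nrpe as in step 1:
# /usr/local/nagios/libexec/check_nrpe -H 192.168.137.213
Continue with the following steps only after the test succeeds.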
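If the test is refused, the access lists are the first thing to check. The sketch below looks up an address in the allowed_hosts line of nrpe.cfg; host_allowed is a hypothetical helper, and it only does an exact match on comma-separated addresses. As far as the comments in the sample nrpe.cfg go, allowed_hosts is ignored when NRPE runs under xinetd, in which case only_from in /etc/xinetd.d/nrpe is the setting that decides, so treat this as a check of one of the two lists.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given IP appears literally in the
# allowed_hosts list of the given nrpe.cfg (no CIDR or hostname handling).
host_allowed() {
  local cfg="$1" ip="$2" hosts
  hosts=$(sed -n 's/^allowed_hosts=//p' "$cfg" | tail -n 1 | tr -d '[:space:]')
  case ",$hosts," in
    *",$ip,"*) return 0 ;;
  esac
  return 1
}

# Usage:
#   host_allowed /usr/local/nagios/etc/nrpe.cfg 192.168.137.89 && echo allowed
```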
----------------------------------------------------------------------------------------------------------------------------------------------
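7. Monitor basic information on the remote Linux host
7.1 On the Centreon server, add the monitored host:
7.2 Check the result: when the web host was added, the services to monitor were added automatically, because the templates were defined and linked earlier.
7.3 Activate the configuration:
If the Graphs icon does not appear in the monitoring view, restart the centstorage service: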
# /etc/init.d/centstorage restart
----------------------------------------------------------------------------------------------------------------------------------------------
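(II) Monitoring other items on remote Linux hosts
web (Nginx_status, tomcat, php), mysql, mysql master-slave status, ssh, ntp, ftp, rsync
1. Monitor nginx status
Use the check_nginx.sh plugin.
Download:
1.1 Define the command
On the monitored host, add a definition for check_nginx.sh: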
# vi /usr/local/nagios/etc/nrpe.cfg
command[check_nginx]=/usr/local/nagios/libexec/check_nginx.sh -w 50 -c 100
# /etc/init.d/xinetd restart
1.2 Adjust the variables in the script to match the local installation:
# vi /usr/local/nagios/libexec/check_nginx.sh
ST_OK=0
ST_WR=1
ST_CR=2
ST_UK=3
hostname="localhost"
port=80
path_pid=/usr/local/webserver/nginx/logs/
name_pid="nginx.pid"
status_page="nginx_status"
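The four ST_ variables are the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and the -w and -c options supply the thresholds that select between them. The sketch below shows that mapping in isolation. It is not taken from check_nginx.sh; threshold_status is a hypothetical helper, and it assumes an integer value where higher is worse.

```shell
#!/usr/bin/env bash
ST_OK=0
ST_WR=1
ST_CR=2
ST_UK=3

# Hypothetical helper: map a measured integer to a plugin exit status,
# assuming higher values are worse.
# $1 = measured value, $2 = warning threshold, $3 = critical threshold
threshold_status() {
  local value="$1" warn="$2" crit="$3"
  case $value in
    ''|*[!0-9]*) echo "UNKNOWN - not a number: $value"; return "$ST_UK" ;;
  esac
  if [ "$value" -ge "$crit" ]; then
    echo "CRITICAL - value=$value"; return "$ST_CR"
  elif [ "$value" -ge "$warn" ]; then
    echo "WARNING - value=$value"; return "$ST_WR"
  fi
  echo "OK - value=$value"; return "$ST_OK"
}

# Usage, with the thresholds from the nrpe.cfg line above (-w 50 -c 100):
#   threshold_status 75 50 100    # prints "WARNING - value=75", status 1
```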
1.3 Edit the nginx configuration file and reload nginx:
# vi nginx.conf
location /nginx_status {
stub_status on;
access_log off;
allow 127.0.0.1;
}
# ps aux | grep nginx
# kill -HUP 26219
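Once nginx has reloaded, the status page can be read by hand to confirm that it works before the plugin is wired in. A minimal sketch, assuming curl is installed on the monitored host; active_connections is a hypothetical helper that pulls the first figure out of the stub_status text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read stub_status output on stdin and print the
# number of active connections.
active_connections() {
  awk '/^Active connections:/ { print $3 }'
}

# Usage on the monitored host (allowed because the request comes from 127.0.0.1):
#   curl -s http://localhost/nginx_status | active_connections
```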
1.4 Add the service:
Check_command check_nrpe
Args !check_nginx
----------------------------------------------------------------------------------------------------------------------------------------------
2. Monitor the status of the Apache service
Use the check_apache.sh plugin.
Download:
# chmod +x check_apache.sh
# chown nagios.nagios check_apache.sh
2.1 On the monitored host:
# vi /etc/httpd/conf/httpd.conf
ExtendedStatus On
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from 127.0.0.1 192.168.137.89
</Location>
# /etc/init.d/httpd restart
2.2 On the Centreon server:
Define the command:
Command_name check_apache
Command_line $USER1$/check_apache.sh -H $HOSTADDRESS$ -P $ARG1$ -b $ARG2$ -p $ARG3$ -n $ARG4$ -wr $ARG5$ -cr $ARG6$
2.3 Add the service:
Check_command check_apache
Args !80!/usr/sbin!/var/run/!httpd.pid!100!250
3. Monitor Tomcat
3.1 On the monitored host:
# vi test.jsp
tomcat UP <%= new java.util.Date() %>
On the Centreon server
Define the command:
Command_name check_tomcat
Command_line $USER1$/check_http -H $ARG1$ -p $ARG2$ --url=$ARG3$ --onredirect=critical --string=UP
3.2 Add the service:
Check_command check_tomcat
Args !!8080!/test.jsp
------------------------------------------------------------------------------------------------------------------------------------------------------------------
4. Monitor HTTP
4.1 Define the command:
Command_name check_http
Command_line $USER1$/check_http -H $ARG1$ -p $ARG2$ -w $ARG3$ -c $ARG4$
4.2 Add the service:
Check_command check_http
Args !!80!5!10
------------------------------------------------------------------------------------------------------------------------------------------------------------------
5. Monitor MySQL
Basic monitoring:
5.1 On the monitored host:
mysql> grant usage ON *.* to nagios@'192.168.137.89' identified by 'nagiospw';
Query OK, 0 rows affected (0.00 sec)
5.2 Define the command:
command_name check_mysql
command_line $USER1$/check_mysql -H $HOSTADDRESS$ -P $ARG1$ -u $ARG2$ -p $ARG3$
5.3 Add the service:
check_command check_mysql
Args !3306!nagios!nagiospw
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Master-slave (replication) monitoring:
5.4 On the monitored host:
mysql> grant usage,replication client ON *.* to nagios@'192.168.137.89' identified by 'nagiospw';
5.5 On the Centreon server:
Download:
# wget
# tar xvf check_mysql_health-2.1.3.tar.gz
# cd check_mysql_health-2.1.3
# ./configure --prefix=/usr/local/nagios \
--with-nagios-user=nagios --with-nagios-group=nagios \
--with-perl
# make && make install
# chown nagios.nagios /usr/local/nagios/libexec/check_mysql_health
# vi /usr/local/nagios/libexec/check_mysql_health
5.6 Define the command:
command_name check_mysql_health
command_line $USER1$/check_mysql_health --hostname $ARG1$ --port $ARG2$ --username $ARG3$ --password $ARG4$ --mode $ARG5$
5.7 Add the service:
check_command check_mysql_health
Args !192.168.137.203!3306!nagios!nagiospw!slave-io-running
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
6. Monitor SSH
6.1 Define the command:
command_name check_ssh
command_line $USER1$/check_ssh -H $HOSTADDRESS$ -p $ARG1$
6.2 Add the service:
check_command check_ssh
Args !22
------------------------------------------------------------------------------------------------------------------------------------------------------------------
7. Monitor FTP
7.1 Define the command:
command_name check_ftp
command_line $USER1$/check_ftp -H $HOSTADDRESS$ -p $ARG1$ -w $ARG2$ -c $ARG3$
7.2 Add the service:
check_command check_ftp
Args !21!10!15
------------------------------------------------------------------------------------------------------------------------------------------------------------------
8. Monitor NTP
8.1 Define the command:
command_name check_ntp
command_line $USER1$/check_ntp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -j $ARG3$ -k $ARG4$
8.2 Add the service:
check_command check_ntp
Args !1!3!-1:100!-1:200
------------------------------------------------------------------------------------------------------------------------------------------------------------------
9. Monitor rsync
Download:
9.1 Define the command:
command_name check_rsync
command_line $USER1$/check_rsync2.pl -H $HOSTADDRESS$
9.2 Add the service:
check_command check_rsync