Category: System Operations
2017-02-20 17:52:30
Server-side installation
Check the server environment (LAMP):
# rpm -qa | grep httpd
# rpm -qa | grep php
If they are missing, install them:
# yum -y install gcc glibc glibc-common gd gd-devel php openssl-devel httpd
Create the user:
# useradd -m -s /bin/bash nagios
# usermod -G nagios nagios
# vi /etc/passwd
nagios:x:500:500::/home/nagios:/sbin/nologin
Change it to:
nagios:x:500:500::/home/nagios:/bin/bash
Create a group named nagcmd, used to execute external commands from the web interface, and add both the nagios user and the Apache user to it. The CGI-based web monitoring panel needs this group so that the CGIs can run the relevant commands.
# /usr/sbin/groupadd nagcmd
# /usr/sbin/usermod -a -G nagcmd nagios
# /usr/sbin/usermod -a -G nagcmd daemon (Apache was compiled from source here, so it runs as the daemon user by default)
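After the users and groups are created, it can help to confirm the memberships before going further. The following is a minimal sketch, not part of the original procedure; the function names has_group and check_member are hypothetical helpers, and the sketch assumes `id -nG` prints space-separated group names without embedded spaces.

```shell
# has_group LIST NAME: succeed if NAME appears in the space-separated LIST.
has_group() {
    local g
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# check_member USER GROUP: report whether USER belongs to GROUP,
# based on the group names printed by `id -nG USER`.
check_member() {
    if has_group "$(id -nG "$1" 2>/dev/null)" "$2"; then
        echo "$1 is in $2"
    else
        echo "$1 is NOT in $2"
    fi
}
```

For example, `check_member nagios nagcmd` and `check_member daemon nagcmd` should both report membership once the usermod commands above have run.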
Download the required packages.
The server needs the following three packages; a client needs only the last two (the plugin packages):
[root@server ~]# cd /usr/local/src/
[root@server src]#
[root@server tarbag]# wget
[root@server tarbag]# wget
Extract, compile, and install Nagios:
# tar xvzf nagios-3.4.3.tar.gz
# cd nagios
Run the Nagios configure script, using the user and group created earlier:
# ./configure --prefix=/usr/local/nagios --with-command-group=nagcmd
Compile the Nagios source:
# make all
Install the binaries, the init script, and the sample configuration files, and set permissions on the runtime directory:
# make install
# make install-init //installs the startup script in /etc/rc.d/init.d
# make install-config //installs the sample configuration files in /usr/local/nagios/etc
# make install-commandmode //sets the directory permissions
# ls /usr/local/nagios/
bin etc libexec sbin share var
III. Configure web access to Nagios
Configure httpd by generating the Apache configuration file for Nagios:
# cd /usr/local/src/nagios
# make install-webconf
/usr/bin/install -c -m 644 sample-config/httpd.conf /etc/httpd/conf.d/nagios.conf
# cd sample-config
Using sample-config/httpd.conf as a reference, add its contents to Apache's httpd.conf.
Create a nagiosadmin user for logging in to Nagios through the Apache interface. Note the password you set; you will need it shortly.
# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Password: nagiosadmin
Restart Apache so the settings take effect, then open the page to check that it works.
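IV. Configure Nagios
The sample configuration files are installed in /usr/local/nagios/etc by default. They are enough to get Nagios running; only one simple change is needed...
Open /usr/local/nagios/etc/objects/contacts.cfg in your preferred editor and, in the nagiosadmin contact definition, change the email address to your own so that you receive the alerts.
# vi /usr/local/nagios/etc/objects/contacts.cfg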
1. Install the Nagios plugins
# cd ../
# tar zxvf nagios-plugins-1.4.16.tar.gz
# cd nagios-plugins-1.4.16
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios/ //sets the install directory, user, and group
# make && make install

Install the NRPE plugin. To get more detailed information from client machines, NRPE must be installed on both the server and the clients.
# cd ..
# tar zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios/
# make all
# make install-plugin && make install-daemon && make install-daemon-config
# ls /usr/local/nagios/libexec/
check_apt check_ftp check_mailq check_overcr check_tcp .......
Verify the sample Nagios configuration:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If no errors are reported, Nagios can be started.
Start httpd and nagios and verify:
# chkconfig --add nagios //enable nagios and httpd at boot
# chkconfig nagios on
# chkconfig httpd on
# service nagios start
# service httpd start
2. Client installation
# useradd -s /sbin/nologin nagios //add the nagios user
Install nagios-plugins:
# tar -zxvf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --prefix=/usr/local/nagios
# make
# make install
# chown nagios:nagios /usr/local/nagios/
# chown -R nagios:nagios /usr/local/nagios/libexec/
Install the NRPE plugin:
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --prefix=/usr/local/nagios/
# make all
# make install-plugin (installs the check_nrpe plugin)
# make install-daemon (installs the daemon)
# make install-daemon-config (installs the configuration file)
If configure fails with: checking for SSL headers... configure: error: Cannot find ssl headers
# rpm -qa | grep openssl
openssl-devel-0.9.8e-12.el5_4.6
openssl-0.9.8e-12.el5_4.6
# yum install openssl-devel
Or download OpenSSL and build it from source:
# tar zxvf openssl-1.0.0a.tar.gz
# cd openssl-1.0.0a
# ./config
# make
# make test
# make install
Edit the client configuration file:
# vi /usr/local/nagios/etc/nrpe.cfg
server_port=5666
allowed_hosts=127.0.0.1,192.168.1.95
allowed_hosts lists the Nagios monitoring hosts, separated by commas. The second address here is the Nagios server, which has to be added; only the listed addresses may query this machine through NRPE's port 5666.
Then edit the command section of nrpe.cfg.
Start the NRPE daemon:
# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
To start it automatically at boot, add the command to /etc/rc.local:
# echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.local
# netstat -utpln | grep nrpe //check that the nrpe process has started
# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.14 //this reply means NRPE is working
Then test from the Nagios monitoring server:
# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.77 //IP of the monitored host
The reply is the NRPE version installed on the monitored host: NRPE v2.12
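When check_nrpe is refused from the server, the first thing to check is usually the allowed_hosts line. The sketch below reads that directive from a configuration file and tests whether an address is listed. It is only an illustration: allowed_hosts_of and is_allowed are hypothetical helper names, and the match is a plain string comparison (no spaces around commas, no network ranges or host names).

```shell
# allowed_hosts_of FILE: print the value of the last allowed_hosts= line in FILE.
allowed_hosts_of() {
    sed -n 's/^[[:space:]]*allowed_hosts[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# is_allowed FILE IP: succeed if IP is listed verbatim in allowed_hosts.
is_allowed() {
    local entry
    local IFS=','
    for entry in $(allowed_hosts_of "$1"); do
        [ "$entry" = "$2" ] && return 0
    done
    return 1
}
```

On the client, `is_allowed /usr/local/nagios/etc/nrpe.cfg 192.168.1.95 && echo listed` would confirm that the server address from this section is present.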
3. Define what to monitor
# vi /usr/local/nagios/etc/nrpe.cfg //define the checks on the monitored server
# number of logged-in users
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
# CPU load
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
# disk usage; sda2 must be a real partition (list partitions with fdisk -l)
command[check_sda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2
# swap space
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20 -c 10
# zombie processes
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
# total number of processes
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
Note: the text in square brackets after command is the name being defined. Any name will do, as long as it matches the one used in the server's configuration.
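Because the names in square brackets have to match what the server asks for, it is handy to list them. The following sketch prints each defined name next to its command line; nrpe_commands is a hypothetical helper, and the " => " separator is only for display.

```shell
# nrpe_commands FILE: print "name => command line" for every
# uncommented command[name]=... entry in FILE.
nrpe_commands() {
    sed -n 's/^[[:space:]]*command\[\([^]]*\)\][[:space:]]*=[[:space:]]*\(.*\)$/\1 => \2/p' "$1"
}
```

Running `nrpe_commands /usr/local/nagios/etc/nrpe.cfg` on the client gives the names that can be passed to `check_nrpe -c` on the server.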
4. Automatically adding hosts and services
V. Errors encountered during installation
[root@localhost nagios]# make all
cd ./base && make
make[1]: Entering directory '/tmp/nagios/base'
make[1]: *** No rule to make target '/include/locations.h', needed by 'broker.o'. Stop.
make[1]: Leaving directory '/tmp/nagios/base'
make: *** [all] Error 2
[root@localhost nagios]#
Solution:
Install perl:
# yum -y install perl
Then rebuild (run ./configure and make all again).
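I. Nagios server installation
1. Install the required dependencies
2. Add the users and groups Nagios needs
3. Compile and install Nagios and create the user for logging in to the Nagios web interface
4. Nagios plugins
5. Configure the service to start at boot
II. Monitoring Windows hosts with Nagios over NRPE
1. Monitored host
Install: NSClient++-0.3.9-x64
2. Monitoring server
1. Test connectivity to the monitored host
2. Define the command, host, and services on the monitoring server
3. Add the finished template to nagios.cfg
4. Restart the service
III. Monitoring Linux hosts over NRPE
1. Monitored host:
1. Add the user
2. Install the nagios-plugins-1.4.15 plugins
3. Install NRPE
4. Edit the NRPE configuration file
# vi /usr/local/nagios/etc/nrpe.cfg
5. Create the nrpe startup script and make it executable
6. Enable start at boot
7. Start the service
2. Configure the monitoring server:
1. Install NRPE
After installation, check_nrpe is available; use this plugin to test the monitored host
2. Define the command
3. Define the host and services
4. Add the path of the finished linhost.cfg to /usr/local/nagios/etc/nagios.cfg
5. Test the configuration: /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
6. Restart the service
7. Check host status in the web interface
****** For monitoring Windows hosts over NRPE, consult other resources online ******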
I. Install and configure Nagios
1. Resolve the dependencies for installing Nagios:
# yum -y install httpd gcc glibc glibc-common gd gd-devel php php-mysql mysql mysql-devel mysql-server
2. Add the user and groups Nagios runs as:
# groupadd nagcmd
# useradd -G nagcmd nagios
# passwd nagios
# usermod -a -G nagcmd apache
3. Compile and install Nagios:
# tar zxf nagios-3.3.1.tar.gz
# cd nagios-3.3.1
# ./configure --with-command-group=nagcmd --enable-event-broker
# make all
# make install
# make install-init
# make install-commandmode
# make install-config
# vi /usr/local/nagios/etc/objects/contacts.cfg
email nagios@localhost #this is the default setting
# make install-webconf
# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
# service httpd restart
4. Compile and install nagios-plugins
# tar zxf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make
# make install
5. Configure and start Nagios
#vi /usr/local/nagios/etc/nagios.cfg
# chkconfig --add nagios
# chkconfig nagios on
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# service nagios start
6. Configure SELinux
#getenforce
#setenforce 0
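7. View Nagios through the web interface:
Log in with the web authentication account and password set earlier.
II. Configuration files
Main Nagios configuration file:
/usr/local/nagios/etc/nagios.cfg
Nagios object template directory:
/usr/local/nagios/etc/objects
Directory of the check commands (plugins): /usr/local/nagios/libexec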
III. Monitoring a remote Windows host with NSClient++
1. Install and configure the monitored host
Install NSClient++-0.3.9-x64
2. Test connectivity
# cd /usr/local/nagios/libexec
# ./check_nt -H 192.168.1.119 -v UPTIME -p 12489
If a password is set: # ./check_nt -H 192.168.1.119 -v UPTIME -p 12489 -s luoxj,123
3. Configure the monitoring server
&&& Define the command in commands.cfg
# cd /usr/local/nagios/etc/objects/
# vi commands.cfg
define command{
    command_name check_nt
    command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
}
&&& Define the host and services
# vi windows.cfg
define host{
    use windows-server
    host_name winhost
    alias My Windows Host
    address 192.168.1.119
}
define service{
    use generic-service
    host_name winhost
    service_description NSClient++ Version
    check_command check_nt!CLIENTVERSION
}
Adjust the service definitions to suit your environment. To rename the host throughout the file, use a substitution in vim: :.,$s@winserver@winhost@g
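The same rename can be done outside the editor. This is a minimal sketch using sed with the same @ delimiter as the vim command; rename_host is a hypothetical helper, and it assumes the names contain no @ and no regular-expression metacharacters.

```shell
# rename_host OLD NEW: copy stdin to stdout, replacing every OLD with NEW.
rename_host() {
    sed "s@$1@$2@g"
}
```

For example, `rename_host winserver winhost < windows.cfg > windows.cfg.new` writes a renamed copy that can be reviewed before it replaces the original.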
&&& Enable the new file by adding its path
# vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
&&& Verify that the configuration has no errors
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# service nagios restart
IV. Monitoring a remote Linux host with NRPE
1. Install and configure the monitored host
1) Add the nagios user first
# useradd -s /sbin/nologin nagios
2) NRPE depends on nagios-plugins, so install that first
# tar zxf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make all
# make install
3) Install NRPE
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nrpe-user=nagios \
--with-nrpe-group=nagios \
--with-nagios-user=nagios \
--with-nagios-group=nagios \
--enable-command-args \
--enable-ssl
# make all
# make install-plugin
# make install-daemon
# make install-daemon-config
4) Configure NRPE
# vim /usr/local/nagios/etc/nrpe.cfg
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_address=172.16.100.11
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=172.16.100.1
command_timeout=60
connection_timeout=300
debug=0
&&&&&&&& Define the commands for the monitored objects &&&&&&&&&
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
5) Start NRPE
# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
To make NRPE easier to start, save the following as the /etc/init.d/nrped script:
# vi /etc/init.d/nrped
#!/bin/bash
# chkconfig: 2345 88 12
# description: NRPE DAEMON
NRPE=/usr/local/nagios/bin/nrpe
NRPECONF=/usr/local/nagios/etc/nrpe.cfg
case "$1" in
    start)
        echo -n "Starting NRPE daemon..."
        $NRPE -c $NRPECONF -d
        echo " done."
        ;;
    stop)
        echo -n "Stopping NRPE daemon..."
        pkill -u nagios nrpe
        echo " done."
        ;;
    restart)
        $0 stop
        sleep 2
        $0 start
        ;;
    *)
        echo "Usage: $0 start|stop|restart"
        ;;
esac
exit 0
# chmod +x /etc/init.d/nrped
# service nrped start
# netstat -tnlp ## check that NRPE is listening on port 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 17282/nrpe
# service iptables stop
# setenforce 0
6) Define the objects that remote hosts may monitor
On the monitored host, every service or resource that can be checked through NRPE must be defined as a command in nrpe.cfg. The syntax for defining a command is: command[
command[check_rootdisk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 40% -c 20%
command[check_sensors]=/usr/local/nagios/libexec/check_sensors
command[check_users]=/usr/local/nagios/libexec/check_users -w 10 -c 20
command[check_load]=/usr/local/nagios/libexec/check_load -w 10,8,5 -c 20,18,15
command[check_zombies]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_all_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
2. Configure the monitoring server
1) Install NRPE
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nrpe-user=nagios \
--with-nrpe-group=nagios \
--with-nagios-user=nagios \
--with-nagios-group=nagios \
--enable-command-args \
--enable-ssl
# make all
# make install-plugin
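After installation the plugin is available at /usr/local/nagios/libexec/check_nrpe; use it to test whether the client is working:
# cd /usr/local/nagios/libexec/
# ./check_nrpe -H 192.168.1.124
NRPE v2.12
2) Define how to monitor the remote host and its services:
Remote Linux hosts are monitored over NRPE with the check_nrpe plugin, whose syntax is:
check_nrpe -H
Define the command for monitoring remote Linux hosts:
# vi /usr/local/nagios/etc/objects/commands.cfg (add the nrpe command)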
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
************ Create the template file ******************
# cd /usr/local/nagios/etc/objects/
# vim linhost.cfg or # cp localhost.cfg linhost.cfg
*** Define the remote Linux host:
define host{
use linux-server
host_name linhost
alias my Linux Host
address 192.168.1.124
}
Comment out the host group if you do not need it. Services can be added by following the commands defined on the monitored host in /usr/local/nagios/etc/nrpe.cfg:
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
*** Define the remote Linux services in detail (arguments can also be appended to tune a check):
# Change the host_name to match the name of the host you defined above
# Logged-in users
define service{
    use generic-service
    host_name linhost
    service_description check_users
    check_command check_nrpe!check_users
}
# CPU load
define service{
    use generic-service
    host_name linhost
    service_description load
    check_command check_nrpe!check_load
}
# Disk usage of /dev/sda1
define service{
    use generic-service
    host_name linhost
    service_description sda1
    check_command check_nrpe!check_sda1
}
# Zombie processes
define service{
    use generic-service
    host_name linhost
    service_description Zombie
    check_command check_nrpe!check_zombie_procs
}
# Total number of processes
define service{
    use generic-service
    host_name linhost
    service_description Total procs
    check_command check_nrpe!check_total_procs
}
3) Add the finished linhost.cfg to /usr/local/nagios/etc/nagios.cfg
# vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/linhost.cfg
4) Test the configuration
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
5) Restart the service
# service nagios restart
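#1. Nagios monitoring modes and how to choose between them
1.1. Active mode: the Nagios server sends the request itself and gets the data by probing, so no plugin has to be installed on the client (suited to monitoring ports, URLs, http, ssh, mysql, rsync, and so on). A check done in active mode can also be configured in passive mode.
1.2. Semi-passive mode: load, memory, disk, swap, disk I/O, temperature, fans, and other local resources are generally monitored in semi-passive mode (by calling nrpe or snmp).
1.3. Passive mode
Active mode: NRPE is not involved; the server obtains the information directly with its own local plugins.
Passive mode: the main program uses the check_nrpe plugin to talk to the nrpe process on the client, which calls its local plugins to collect the data.
#2. Configure the server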
[root@nagios tools]# ll /usr/local/nagios/
total 32
drwxrwxr-x 2 nagios nagios 4096 Jul 14 23:25 bin #commands
drwxrwxr-x 3 nagios nagios 4096 Jul 14 23:25 etc #configuration files
drwxr-xr-x 2 root root 4096 Jul 14 23:24 include
drwxrwxr-x 2 nagios nagios 4096 Jul 14 23:25 libexec #plugins
drwxr-xr-x 5 root root 4096 Jul 14 23:24 perl
drwxrwxr-x 2 nagios nagios 4096 Jul 14 23:21 sbin #CGI programs
drwxrwxr-x 11 nagios nagios 4096 Jul 14 23:24 share #web files: the PHP pages of the Nagios interface
drwxrwxr-x 5 nagios nagios 4096 Jul 16 10:03 var #logs and data
[root@nagios tools]# cd /usr/local/nagios/etc
[root@nagios etc]# ls -l
total 76
-rw-rw-r-- 1 nagios nagios 11669 Jul 14 23:21 cgi.cfg
-rw-r--r-- 1 root root 21 Jul 14 23:22 htpasswd.users #password file for web authentication
-rw-rw-r-- 1 nagios nagios 44710 Jul 14 23:21 nagios.cfg #main Nagios configuration file
-rw-r--r-- 1 nagios nagios 7207 Jul 14 23:25 nrpe.cfg
drwxrwxr-x 2 nagios nagios 4096 Jul 14 23:21 objects
-rw-rw---- 1 nagios nagios 1340 Jul 14 23:21 resource.cfg
#Create hosts.cfg
[root@nagios etc]# cd objects/
[root@nagios objects]# head -51 localhost.cfg >hosts.cfg
[root@nagios objects]# chown nagios:nagios /usr/local/nagios/etc/objects/hosts.cfg
#Create services.cfg
[root@nagios objects]# touch services.cfg
[root@nagios objects]# chown nagios:nagios /usr/local/nagios/etc/objects/services.cfg
[root@nagios objects]# ll
total 52
-rw-rw-r-- 1 nagios nagios 7716 Jul 14 23:21 commands.cfg #Nagios command definitions, linking Nagios commands to Linux system commands
-rw-rw-r-- 1 nagios nagios 2166 Jul 14 23:21 contacts.cfg #alert contact definitions
-rw-r--r-- 1 nagios nagios 1870 Jul 16 12:00 hosts.cfg #new: definitions of the monitored hosts
-rw-rw-r-- 1 nagios nagios 5403 Jul 14 23:21 localhost.cfg
-rw-rw-r-- 1 nagios nagios 3124 Jul 14 23:21 printer.cfg
-rw-r--r-- 1 nagios nagios 0 Jul 16 12:03 services.cfg #new: definitions of the monitored services
-rw-rw-r-- 1 nagios nagios 3293 Jul 14 23:21 switch.cfg
-rw-rw-r-- 1 nagios nagios 10812 Jul 14 23:21 templates.cfg #template definitions
-rw-rw-r-- 1 nagios nagios 3208 Jul 14 23:21 timeperiods.cfg #alert time periods and related settings
-rw-rw-r-- 1 nagios nagios 4019 Jul 14 23:21 windows.cfg
#Before editing nagios.cfg, back up the etc directory in case of mistakes
[root@nagios etc]# cd ..
[root@nagios nagios]# tar zcvf etc.tar.gz ./etc/
./etc/
./etc/nagios.cfg
./etc/cgi.cfg
./etc/nrpe.cfg
./etc/htpasswd.users
./etc/objects/
./etc/objects/printer.cfg
./etc/objects/localhost.cfg
./etc/objects/contacts.cfg
./etc/objects/windows.cfg
./etc/objects/timeperiods.cfg
./etc/objects/switch.cfg
./etc/objects/commands.cfg
./etc/objects/templates.cfg
./etc/resource.cfg
[root@nagios nagios]# cd etc
[root@nagios etc]# vi nagios.cfg +34
Add three lines and comment out one:
# You can specify individual object config files as shown below:
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
# Add these two lines
cfg_file=/usr/local/nagios/etc/objects/services.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
# Comment out this line; it monitors the local machine
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
# directive as shown below:
# Add this line (used for active checks); it includes the services directory
cfg_dir=/usr/local/nagios/etc/services
#cfg_dir=/usr/local/nagios/etc/servers # servers
#cfg_dir=/usr/local/nagios/etc/printers # printers
#cfg_dir=/usr/local/nagios/etc/switches # switches
#cfg_dir=/usr/local/nagios/etc/routers # routers
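After editing nagios.cfg it is easy to lose track of which object files are still active. The sketch below lists every uncommented cfg_file= and cfg_dir= value; active_object_paths is a hypothetical helper, not a Nagios command.

```shell
# active_object_paths FILE: print the value of every uncommented
# cfg_file= and cfg_dir= line in FILE, in file order.
active_object_paths() {
    grep -E '^[[:space:]]*cfg_(file|dir)=' "$1" | cut -d= -f2-
}
```

With the edits from this section applied, `active_object_paths /usr/local/nagios/etc/nagios.cfg` should list services.cfg, hosts.cfg, and the services directory, and should no longer list localhost.cfg.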
#Create the services directory and set its ownership
[root@nagios etc]# cd /usr/local/nagios/etc
[root@nagios etc]# mkdir services
[root@nagios etc]# chown -R nagios:nagios services/
[root@nagios etc]# ll
total 80
-rw-rw-r-- 1 nagios nagios 11669 Jul 14 23:21 cgi.cfg
-rw-r--r-- 1 root root 21 Jul 14 23:22 htpasswd.users
-rw-rw-r-- 1 nagios nagios 44852 Jul 16 11:55 nagios.cfg
-rw-r--r-- 1 nagios nagios 7207 Jul 14 23:25 nrpe.cfg
drwxrwxr-x 2 nagios nagios 4096 Jul 16 12:03 objects
-rw-rw---- 1 nagios nagios 1340 Jul 14 23:21 resource.cfg
drwxr-xr-x 2 nagios nagios 4096 Jul 16 11:56 services #new: holds the active-check definitions
#Configure the server to monitor the clients
[root@nagios etc]# cd objects/
[root@nagios objects]# vi hosts.cfg
# Define a host for the local machine
define host{
use linux-server
host_name 1.3-samba
alias 1.3-samba
address 10.89.1.3
}
define host{
use linux-server
host_name 1.2-nagios
alias 1.2-nagios
address 10.89.1.2
}
define host{
use linux-server
host_name 1.34-web-lnmp
alias 1.34-web-lnmp
address 10.89.1.34
}
define host{
use linux-server
host_name 1.34-web
alias 1.34-web
address 10.89.1.34
}
# Define an optional hostgroup for Linux machines
define hostgroup{
hostgroup_name linux-servers ; The name of the hostgroup
alias Linux Servers ; Long name of the group
members 1.3-samba,1.2-nagios,1.34-web-lnmp,1.34-web
}
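Save and exit.
Before checking the syntax, edit the nagios init script so that error messages are displayed:
[root@nagios objects]# vim /etc/init.d/nagios +183
Change
$NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
to:
$NagiosBin -v $NagiosCfgFile
Save and exit.
Check the syntax. If it reports the following error: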
[root@nagios objects]# /etc/init.d/nagios checkconfig
Checking services...
Error: There are no services defined!
Checked 0 services.
Solution:
[root@nagios objects]# vi services.cfg
define service {
use generic-service
host_name 1.3-samba
service_description Disk Partition
check_command check_nrpe!check_disk
}
Check the syntax again:
[root@nagios objects]# /etc/init.d/nagios checkconfig
Total Warnings: 1
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
OK.
[root@nagios objects]#
If the following error is reported:
Error: Service check command 'check_nrpe' specified in service 'Disk Partition' for
host '10-client01' not defined anywhere!
edit commands.cfg:
[root@nagios objects]# vi commands.cfg
and append the following at the end:
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
Note: command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ amounts to probing with this command:
[root@nagios objects]# /usr/local/nagios/libexec/check_nrpe -H 10.89.1.3 -c check_disk
Save and exit.
Check the syntax again:
[root@nagios objects]# /etc/init.d/nagios checkconfig
If there are no errors:
[root@nagios objects]# /etc/init.d/nagios reload
Running configuration check...done.
Reloading nagios configuration...done
[root@nagios objects]#
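To see how a check_command such as check_nrpe!check_disk turns into the command line noted above, the macro substitution can be imitated in a few lines. This is only an illustration of the idea, not how Nagios implements it; expand_macros is a hypothetical helper, and the value /usr/local/nagios/libexec for $USER1$ is an assumption based on the plugin directory used throughout this article.

```shell
# expand_macros TEMPLATE USER1 HOSTADDRESS ARG1: print TEMPLATE with the three
# macros replaced. The values must not contain "|" or "&".
expand_macros() {
    printf '%s\n' "$1" | sed \
        -e 's|\$USER1\$|'"$2"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$3"'|g' \
        -e 's|\$ARG1\$|'"$4"'|g'
}
```

For example, `expand_macros '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$' /usr/local/nagios/libexec 10.89.1.3 check_disk` prints the check_nrpe command that can be run by hand to reproduce the check.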
If the following error appears after logging in:
It appears as though you do not have permission to view information for any of the hosts you requested...
If you believe this is an error, check the HTTP server authentication requirements for accessing this CGI
and check the authorization options in your CGI configuration file.
edit cgi.cfg:
[root@nagios objects]# cd ../
[root@nagios etc]# vi cgi.cfg
# PHYSICAL HTML PATH
# This is the path where the HTML files for Nagios reside. This
# value is used to locate the logo images needed by the statusmap
# and statuswrl CGIs.
physical_html_path=/usr/local/nagios/share
:g/nagiosadmin/s//alvin/g #replace nagiosadmin with alvin, the user I created during installation
Save and exit.
[root@nagios objects]# /etc/init.d/nagios reload
1. Active monitoring mode
Monitor the LNMP web service on the client.
On the server:
[root@nagios]# cd /usr/local/nagios/etc/objects
[root@nagios objects]# vi commands.cfg
#Append at the end:
# 'check_weburl' command definition
define command{
command_name check_weburl
command_line $USER1$/check_http $ARG1$ -w 10 -c 30
}
Save and exit.
[root@nagios objects]# cd /usr/local/nagios/etc/services
Create webzd.cfg, the configuration file for active-mode checks:
[root@nagios services]# vi webzd.cfg
define service {
use generic-service
host_name 1.34-web
service_description blog_ip
check_command check_weburl! -I 10.89.1.34
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
process_perf_data 1
}
define service {
use generic-service
host_name 1.34-web
service_description blog_url
check_command check_http! -H bolg.etiantian.org
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
define service {
use generic-service
host_name 1.34-web
service_description blog_port_80
check_command check_tcp!80
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
define service {
use generic-service
host_name 1.34-web
service_description ssh_port
check_command check_tcp! 22
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
define service {
use generic-service
host_name 1.34-web
service_description mysql_port
check_command check_tcp!3306
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
define service {
use generic-service
host_name 1.34-web
service_description rsync
check_command check_tcp!873
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
Save and exit.
Check the syntax:
[root@nagios services]# /etc/init.d/nagios checkconfig
If there are no errors:
[root@nagios services]# /etc/init.d/nagios reload
------------------------------------------------------------------
Custom plugin: detect changes to the password file
Test on the client:
Generate the MD5 checksum of /etc/passwd:
[root@weblnmp ~]# md5sum /etc/passwd
5e2ebd59c3ebb7bd3c4b09b0674ca746 /etc/passwd
Save it to /etc/alvin.md5 (any file name and location will do):
[root@weblnmp ~]# md5sum /etc/passwd >/etc/alvin.md5
Check whether the checksum has changed; "OK" means it has not:
[root@weblnmp ~]# md5sum -c /etc/alvin.md5
/etc/passwd: OK
In practice:
1. Add the custom script on the client
# cd /usr/local/nagios/libexec
# cat check_passwd
#!/bin/bash
# Count the "OK" lines reported by md5sum; exactly one means /etc/passwd is unchanged.
char=$(md5sum -c /etc/alvin.md5 2>/dev/null | grep -c "OK")
if [ "$char" -eq 1 ]; then
    echo "passwd is ok"
    exit 0
else
    echo "passwd is changed"
    exit 2
fi
#Make it executable
[root@weblnmp libexec]# chmod +x check_passwd
[root@weblnmp libexec]# ll check_passwd
-rwxr-xr-x 1 root root 166 Jul 22 21:33 check_passwd
2. Add the command on the client and restart nrpe for it to take effect
[root@weblnmp libexec]# vim /usr/local/nagios/etc/nrpe.cfg
Add the check_passwd command definition:
command[check_passwd]=/usr/local/nagios/libexec/check_passwd
[root@weblnmp libexec]# pkill nrpe
#Start nrpe again
[root@weblnmp libexec]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
#Check that it is running
[root@weblnmp libexec]# ps -ef | grep nrpe
nagios 64672 1 0 Dec22 ? 00:00:21 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root 72255 72099 0 11:48 pts/0 00:00:00 grep nrpe
3. Test from the server
[root@nagios services]# /usr/local/nagios/libexec/check_nrpe -H 10.89.1.34 -c check_passwd
passwd is ok
Add the service definition:
[root@nagios ~]# cd /usr/local/nagios/etc/objects/
[root@nagios objects]# vi services.cfg
#Append at the end:
define service{
use generic-service
host_name 1.34-web-lnmp
service_description check_passwd
check_command check_nrpe!check_passwd
}
Check the syntax and reload:
[root@nagios objects]# /etc/init.d/nagios checkconfig
If there are no errors:
[root@nagios objects]# /etc/init.d/nagios reload
4. Test a change: add a user on the client
[root@weblnmp ~]# useradd jack
Then run on the server:
[root@nagios services]# /usr/local/nagios/libexec/check_nrpe -H 10.89.1.34 -c check_passwd
passwd is changed
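The check_passwd script relies on the plugin exit-code convention: 0 means OK, 1 WARNING, 2 CRITICAL, and 3 UNKNOWN. For numeric checks with -w and -c thresholds, such as check_users -w 5 -c 10, the mapping can be sketched as follows. nagios_state is a hypothetical helper written for this illustration; it handles non-negative integers only and treats a value at or above a threshold as having reached it, which may differ from how an individual plugin compares.

```shell
# nagios_state VALUE WARN CRIT: print the state name and return the
# matching plugin exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
nagios_state() {
    local n
    for n in "$1" "$2" "$3"; do
        case "$n" in
            ''|*[!0-9]*) echo "UNKNOWN"; return 3 ;;
        esac
    done
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL"; return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING"; return 1
    fi
    echo "OK"; return 0
}
```

A custom plugin can end with `nagios_state "$value" 5 10; exit $?` so that NRPE passes both the text and the exit status back to the server.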
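nagios: monitoring Redis with check_redis
#!/bin/bash
# Nagios plugin: check that the Redis master and slave ports answer PING.
redis_bin='/home/app/redis/src'
redis_ip=(192.168.1.161 192.168.1.162 192.168.1.163 192.168.1.164)
redis_master_port='6379'
redis_slave_port='6380'
# The loop bound is 1, so only the first address in redis_ip is checked.
for (( i = 0; i < 1; i++ )); do
    alive_master=$("$redis_bin/redis-cli" -h "${redis_ip[$i]}" -p "$redis_master_port" ping 2>/dev/null)
    alive_slave=$("$redis_bin/redis-cli" -h "${redis_ip[$i]}" -p "$redis_slave_port" ping 2>/dev/null)
    # Quote the replies so an empty answer does not break the test.
    if [ "$alive_master" = "PONG" ] && [ "$alive_slave" = "PONG" ]; then
        echo "redis ${redis_ip[$i]} is healthy."
        exit 0
    else
        echo "the redis ${redis_ip[$i]} 6379 or 6380 is down."
        exit 2   # CRITICAL (exit 1 would be reported as WARNING)
    fi
done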