
Category: System Operations

2017-02-20 17:52:30

Server-side installation

Check the server environment (LAMP):
# rpm -qa | grep httpd
# rpm -qa | grep php

If they are not installed, install them:
# yum -y install gcc glibc glibc-common gd gd-devel php openssl-devel httpd

Create the user:
# useradd -m -s /bin/bash nagios
# usermod -G nagios nagios
# vi /etc/passwd
nagios:x:500:500::/home/nagios:/sbin/nologin
Change it to:
nagios:x:500:500::/home/nagios:/bin/bash

Create a group named nagcmd, which is used to execute external commands from the web interface, and add both the nagios user and the Apache user to it. The CGI-based web monitoring panel relies on this group to run its commands.
# /usr/sbin/groupadd nagcmd
# /usr/sbin/usermod -a -G nagcmd nagios
# /usr/sbin/usermod -a -G nagcmd daemon    (Apache was compiled from source here, so by default it runs as the daemon user)
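The group memberships created above can be confirmed by inspecting the output of `id -nG`. The following is a minimal sketch; `in_group` is a hypothetical helper introduced only for illustration.

```shell
#!/bin/bash
# in_group GROUP "LIST": succeed if GROUP appears in the space-separated LIST.
# in_group is a hypothetical helper; LIST is normally the output of `id -nG USER`.
in_group() {
    local group=$1 list=$2 g
    for g in $list; do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# Report the membership only if a nagios user exists on this machine.
if id nagios >/dev/null 2>&1; then
    if in_group nagcmd "$(id -nG nagios)"; then
        echo "nagios is a member of nagcmd"
    else
        echo "nagios is NOT a member of nagcmd"
    fi
fi
```

The same call with `daemon` or `apache` in place of `nagios` checks the web server account.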

Download the required packages.

The server needs the following three packages; the clients only need the last two (the plugin packages).

[root@server ~]# cd /usr/local/src/
[root@server src]#
[root@server tarbag]# wget
[root@server tarbag]# wget

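Extract, compile, and install Nagios:
# tar xvzf nagios-3.4.3.tar.gz
# cd nagios

Run the Nagios configure script, using the user and group created earlier:
# ./configure --prefix=/usr/local/nagios --with-command-group=nagcmd

Compile the Nagios source:
# make all

Install the binaries, the init script, and the sample configuration files, and set permissions on the runtime directories:
# make install
# make install-init           // installs the init script in /etc/rc.d/init.d
# make install-config         // installs the sample configuration files in /usr/local/nagios/etc
# make install-commandmode    // sets the directory permissions
# ls /usr/local/nagios/
bin etc libexec sbin share var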

III. Configure web access to Nagios

Configure httpd by generating the Apache configuration file for Nagios:
# cd /usr/local/src/nagios
# make install-webconf
  /usr/bin/install -c -m 644 sample-config/httpd.conf /etc/httpd/conf.d/nagios.conf
# cd sample-config
Using sample-config/httpd.conf as a reference, add its contents to Apache's httpd.conf.

Create a nagiosadmin user for logging in to the Nagios web interface through Apache. Note the password you set; you will need it shortly.
# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Password: nagiosadmin

Restart Apache so the settings take effect, then open the page to check that it works.

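IV. Configure Nagios

The sample configuration files are installed in /usr/local/nagios/etc by default. They are enough to get Nagios running; only one small change is needed.

Open /usr/local/nagios/etc/objects/contacts.cfg in your preferred editor and change the email address in the nagiosadmin contact definition to your own address so that you receive alerts.
# vi /usr/local/nagios/etc/objects/contacts.cfg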

1. Install the Nagios plugins
# cd ../
# tar zxvf nagios-plugins-1.4.16.tar.gz
# cd nagios-plugins-1.4.16
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios/   // sets the install prefix, user, and group
# make; make install

Install the NRPE plugin. To collect more detailed information from client machines, NRPE must be installed on both the server and the clients.
# cd ..
# tar zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios/
# make all
# make install-plugin; make install-daemon; make install-daemon-config
# ls /usr/local/nagios/libexec/
check_apt check_ftp check_mailq check_overcr check_tcp .......

Verify the sample Nagios configuration:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If no errors are reported, Nagios can be started.

Start httpd and nagios, enable both at boot, and verify:
# chkconfig --add nagios
# chkconfig nagios on
# chkconfig httpd on
# service nagios start
# service httpd start

2. Client installation
# useradd -s /sbin/nologin nagios    // add the nagios user

Install nagios-plugins:
# tar -zxvf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --prefix=/usr/local/nagios
# make
# make install
# chown nagios.nagios /usr/local/nagios/
# chown -R nagios.nagios /usr/local/nagios/libexec/

Install the NRPE plugin:
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --prefix=/usr/local/nagios/
# make all
# make install-plugin          // installs the check_nrpe plugin
# make install-daemon          // installs the daemon
# make install-daemon-config   // installs the configuration file

If configure fails with "checking for SSL headers... configure: error: Cannot find ssl headers", check the installed OpenSSL packages:
# rpm -qa | grep openssl
openssl-devel-0.9.8e-12.el5_4.6
openssl-0.9.8e-12.el5_4.6
If openssl-devel is missing, install it:
# yum install openssl-devel
Or download the source and build it:
# tar zxvf openssl-1.0.0a.tar.gz
# cd openssl-1.0.0a
# ./config
# make
# make test
# make install

Edit the client configuration file:
# vi /usr/local/nagios/etc/nrpe.cfg
server_port=5666
allowed_hosts=127.0.0.1,192.168.1.95   // add the Nagios server's IP address
allowed_hosts names the Nagios monitoring hosts; separate multiple IPs with commas. The address after 127.0.0.1 is the Nagios server's IP, so only the listed addresses may query this machine through NRPE on port 5666.
Then edit the command section of nrpe.cfg.

Start the NRPE daemon:
# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
To start it automatically at boot, append the command to /etc/rc.local:
# echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.local
# netstat -utpln | grep nrpe    // check that nrpe has started and is listening
# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.12    // this output means NRPE is working

Then test from the Nagios monitoring server:
# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.77    // IP of the monitored host
The reply is the NRPE version installed on the monitored host: NRPE v2.12
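When check_nrpe is refused from the monitoring server, a common first step is to confirm that the server's IP really appears in allowed_hosts. The following is a minimal sketch; `is_allowed_host` is a hypothetical helper, and it only does an exact string match on a comma-separated list, so it is an illustration rather than a reimplementation of NRPE's own access check.

```shell
#!/bin/bash
# is_allowed_host FILE IP: succeed if IP is listed in the allowed_hosts line of FILE.
# Hypothetical helper: exact match only, last allowed_hosts line wins.
is_allowed_host() {
    local file=$1 ip=$2 list
    list=$(sed -n 's/^[[:space:]]*allowed_hosts=//p' "$file" | tail -n 1 | tr -d '[:space:]')
    case ",$list," in
        *",$ip,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (path as used in this article):
#   is_allowed_host /usr/local/nagios/etc/nrpe.cfg 192.168.1.95 && echo "listed"
```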

3. Define what to monitor
# vi /usr/local/nagios/etc/nrpe.cfg    // define the checks on the monitored host
# number of logged-in users
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
# CPU load
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
# disk usage; sda2 must be a real partition on the host (use fdisk -l to list partitions)
command[check_sda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2
# swap space
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20 -c 10
# zombie processes
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
# total number of processes
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
Note: the text inside the brackets after command is the command name. It can be anything, as long as it matches the name used in the server's configuration.
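Because the names in brackets must match what the server requests, it helps to list exactly which names a client defines. The following is a minimal sketch; `list_nrpe_commands` is a hypothetical helper, not part of NRPE.

```shell
#!/bin/bash
# list_nrpe_commands FILE: print the command names defined in an nrpe.cfg-style file,
# one per line, in file order. Commented-out definitions are ignored.
list_nrpe_commands() {
    sed -n 's/^[[:space:]]*command\[\([^]]*\)\]=.*/\1/p' "$1"
}

# Usage (path as used in this article):
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```

Each printed name is what goes after the `!` in a `check_nrpe!<name>` service definition on the server.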

4. Automatically adding hosts and services

 

 

 

 

V. Errors encountered during installation

[root@localhost nagios]# make all
cd ./base && make
make[1]: Entering directory '/tmp/nagios/base'
make[1]: *** No rule to make target '/include/locations.h', needed by 'broker.o'. Stop.
make[1]: Leaving directory '/tmp/nagios/base'
make: *** [all] Error 2
[root@localhost nagios]#

Solution: install perl, then recompile.
# yum -y install perl




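I. Nagios server installation
1. Install the required dependency packages
2. Add the users and groups Nagios needs
3. Compile and install Nagios, and create the user for logging in to the Nagios web interface
4. Nagios plugins
5. Enable the services at boot

II. Monitoring Windows hosts with Nagios through NRPE
    1. Monitored host
        Install NSClient++-0.3.9-x64
    2. Monitoring server
        1. Test connectivity to the monitored host
        2. Define commands, hosts, and services on the monitoring server
        3. Add the defined template to nagios.cfg
        4. Restart the service

III. Monitoring Linux hosts through NRPE
    1. Monitored host:
        1. Add the user
        2. Install the nagios-plugins-1.4.15 plugins
        3. Install NRPE
        4. Edit the NRPE configuration file
                # vi /usr/local/nagios/etc/nrpe.cfg
        5. Create the nrpe init script and make it executable
        6. Enable it at boot
        7. Start the service
    2. Monitoring server:
        1. Install NRPE
            Installation produces check_nrpe; use this plugin to test the monitored host
        2. Define the command
        3. Define the host and services
        4. Add the path of the finished linhost.cfg file to /usr/local/nagios/etc/nagios.cfg
        5. Test the configuration:  /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
        6. Restart the service
        7. Check host status in the web interface

****** For monitoring Windows hosts through NRPE, consult other resources online ******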



I. Install and configure Nagios

1. Install the dependencies Nagios needs:

yum -y install httpd gcc glibc glibc-common gd gd-devel php php-mysql mysql mysql-devel mysql-server

2. Add the user and groups Nagios runs under:
# groupadd nagcmd
# useradd -G nagcmd nagios
# passwd nagios
# usermod -a -G nagcmd apache

3. Compile and install Nagios:
# tar zxf nagios-3.3.1.tar.gz
# cd nagios-3.3.1
# ./configure --with-command-group=nagcmd --enable-event-broker
# make all
# make install
# make install-init
# make install-commandmode
# make install-config
# vi /usr/local/nagios/etc/objects/contacts.cfg
email nagios@localhost       # this is the default setting
# make install-webconf
# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
# service httpd restart

4. Compile and install nagios-plugins
# tar zxf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make
# make install

5. Configure and start Nagios
# vi /usr/local/nagios/etc/nagios.cfg
# chkconfig --add nagios
# chkconfig nagios on
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# service nagios start

6. Configure SELinux
# getenforce
# setenforce 0

7. View Nagios through the web interface:
Log in with the web authentication account and password set earlier.


II. Configuration files

Main Nagios configuration file:
 /usr/local/nagios/etc/nagios.cfg
Nagios template (object) configuration directory:
/usr/local/nagios/etc/objects
Directory of the check commands (plugins): /usr/local/nagios/libexec


III. Monitoring remote Windows hosts with NSClient++

1. Install and configure the monitored host
    Install NSClient++-0.3.9-x64

2. Test connectivity
# cd /usr/local/nagios/libexec
# ./check_nt -H 192.168.1.119 -v UPTIME -p 12489
If a password is set: # ./check_nt -H 192.168.1.119 -v UPTIME -p 12489 -s luoxj,123

3. Configure the monitoring server

&&& Define the command in commands.cfg
 # cd /usr/local/nagios/etc/objects/
 # vi commands.cfg

define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }

&&& Define the host and services
# vi windows.cfg
define host{
        use             windows-server
        host_name       winhost
        alias           My Windows Host
        address         192.168.1.119
        }

define service{
        use                     generic-service
        host_name               winhost
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }

Adjust the service definitions to your environment. To change the host name throughout the file, use a substitution in vim:
:.,$s@winserver@winhost@g

&&& Enable the new file by adding its path
# vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/windows.cfg

&&& Test the configuration to make sure it has no errors
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# service nagios restart


IV. Monitoring remote Linux hosts through NRPE

1. Install and configure the monitored host

    1) Add the nagios user first
    # useradd -s /sbin/nologin nagios

    2) NRPE depends on nagios-plugins, so install that first
    # tar zxf nagios-plugins-1.4.15.tar.gz
    # cd nagios-plugins-1.4.15
    # ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    # make all
    # make install

    3) Install NRPE
    # tar -zxvf nrpe-2.12.tar.gz
    # cd nrpe-2.12
    # ./configure --with-nrpe-user=nagios \
    --with-nrpe-group=nagios \
    --with-nagios-user=nagios \
    --with-nagios-group=nagios \
    --enable-command-args \
    --enable-ssl
    # make all
    # make install-plugin
    # make install-daemon
    # make install-daemon-config

    4) Configure NRPE
    # vim /usr/local/nagios/etc/nrpe.cfg

    log_facility=daemon

    pid_file=/var/run/nrpe.pid

    server_address=172.16.100.11

    server_port=5666

    nrpe_user=nagios

    nrpe_group=nagios

    allowed_hosts=172.16.100.1

    command_timeout=60

    connection_timeout=300


    debug=0

    &&&&&&&& Command definitions for the monitored objects &&&&&&&&&

    command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10

    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

    command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1

    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z

    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200 

    5) Start NRPE
    # /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

    To make the NRPE service easier to start and stop, save the following as the /etc/init.d/nrped script:
    # vi /etc/init.d/nrped

    #!/bin/bash

    # chkconfig: 2345 88 12

    # description: NRPE DAEMON

NRPE=/usr/local/nagios/bin/nrpe

NRPECONF=/usr/local/nagios/etc/nrpe.cfg

case "$1" in

       start)

              echo -n "Starting NRPE daemon..."

              $NRPE -c $NRPECONF -d

              echo " done."

              ;;

       stop)

              echo -n "Stopping NRPE daemon..."

              pkill -u nagios nrpe

              echo " done."

       ;;

       restart)

              $0 stop

              sleep 2

              $0 start

              ;;

       *)

              echo "Usage: $0 start|stop|restart"

              ;;

       esac


exit 0

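    # chmod +x /etc/init.d/nrped
    # service nrped start
    # netstat -tnlp    ## check that NRPE is listening on port 5666
        tcp        0      0 0.0.0.0:5666                0.0.0.0:*                   LISTEN      17282/nrpe
    # service iptables stop
    # setenforce 0

    6) Define the objects that remote hosts are allowed to monitor
    On the monitored host, every service or resource that can be checked through NRPE must be defined as a command in nrpe.cfg. The syntax is: command[<command_name>]=<command_to_execute>. For example: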

    command[check_rootdisk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /

    command[check_swap]=/usr/local/nagios/libexec/check_swap -w 40% -c 20%

    command[check_sensors]=/usr/local/nagios/libexec/check_sensors

    command[check_users]=/usr/local/nagios/libexec/check_users -w 10 -c 20

    command[check_load]=/usr/local/nagios/libexec/check_load -w 10,8,5 -c 20,18,15

    command[check_zombies]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z

    command[check_all_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
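The -w and -c options in these definitions set the warning and critical thresholds. For check_users and check_procs as used here, a value above -w yields WARNING and a value above -c yields CRITICAL; check_disk with percentages works the other way round and alerts when free space falls below the threshold. The following is a minimal sketch of the "upper bound" case only; `classify_upper` is a hypothetical helper for illustration, not the plugins' actual code.

```shell
#!/bin/bash
# classify_upper VALUE WARN CRIT: print the Nagios state for an integer metric
# that alerts when it rises above a threshold, and return the matching exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL).
classify_upper() {
    local value=$1 warn=$2 crit=$3
    if [ "$value" -gt "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$value" -gt "$warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}

# Usage, mirroring "check_users -w 10 -c 20" with 12 users logged in:
#   classify_upper 12 10 20
```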

2. Configure the monitoring server

1) Install NRPE
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nrpe-user=nagios \
--with-nrpe-group=nagios \
--with-nagios-user=nagios \
--with-nagios-group=nagios \
--enable-command-args \
--enable-ssl
# make all
# make install-plugin

After installation, the check_nrpe plugin is available at /usr/local/nagios/libexec/check_nrpe. Use it to test whether the client is working:
# cd /usr/local/nagios/libexec/
# ./check_nrpe -H 192.168.1.124
NRPE v2.12

2) Define how to monitor the remote host and its services:

Remote Linux hosts are monitored through NRPE with the check_nrpe plugin. Its syntax is:
check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]

Define the command for monitoring remote Linux hosts:
# vi /usr/local/nagios/etc/objects/commands.cfg    (add the nrpe command)


define command{

        command_name    check_nrpe

        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$

}

************ Create the template file ******************
# cd /usr/local/nagios/etc/objects/
# vim linhost.cfg    or    # cp localhost.cfg linhost.cfg
 *** Define the remote Linux host:

define host{

        use                     linux-server       

        host_name         linhost

        alias                   my Linux Host

        address              192.168.1.124

        }

Comment out the host group if you do not need it. Services can be added by following the command definitions in /usr/local/nagios/etc/nrpe.cfg on the monitored host:

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

*** Define the remote Linux services in detail (arguments can also be appended to tune each check):

# Change the host_name to match the name of the host you defined above
define service{
        use                     generic-service
        host_name               linhost
        service_description     check_users
        check_command           check_nrpe!check_users
        }

define service{
        use                     generic-service
        host_name               linhost
        service_description     load
        check_command           check_nrpe!check_load
        }

define service{
        use                     generic-service
        host_name               linhost
        service_description     sda1
        check_command           check_nrpe!check_sda1
        }

define service{
        use                     generic-service
        host_name               linhost
        service_description     Zombie
        check_command           check_nrpe!check_zombie_procs
        }

define service{
        use                     generic-service
        host_name               linhost
        service_description     Total procs
        check_command           check_nrpe!check_total_procs
        }


3) Add the finished linhost.cfg file to /usr/local/nagios/etc/nagios.cfg
# vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/linhost.cfg

4) Test the configuration
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

5) Restart the service
# service nagios restart




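#1. Nagios monitoring modes and how to choose between them

1.1. Active mode: the Nagios server sends the request itself and gets the data by probing directly, so no plugin has to be installed on the client. It suits checks on ports, URLs, and services such as http, ssh, mysql, and rsync. Checks that run in active mode can also be configured to run in passive mode.

1.2. Semi-passive mode: local resources such as load, memory, disk, swap, disk I/O, temperature, and fans are usually monitored in semi-passive mode (by calling NRPE or SNMP).

1.3. Passive mode

Active mode: NRPE is not involved; the server obtains the information directly with its own local plugins.

Passive mode: the main Nagios process uses the check_nrpe plugin to talk to the nrpe process on the client, which runs its local plugins to collect the data.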

#2. Configure the server

[root@nagios tools]# ll /usr/local/nagios/

total 32

drwxrwxr-x  2 nagios nagios 4096 Jul 14 23:25 bin            # executables
drwxrwxr-x  3 nagios nagios 4096 Jul 14 23:25 etc            # configuration files
drwxr-xr-x  2 root   root   4096 Jul 14 23:24 include
drwxrwxr-x  2 nagios nagios 4096 Jul 14 23:25 libexec        # plugins
drwxr-xr-x  5 root   root   4096 Jul 14 23:24 perl
drwxrwxr-x  2 nagios nagios 4096 Jul 14 23:21 sbin           # CGI programs
drwxrwxr-x 11 nagios nagios 4096 Jul 14 23:24 share          # web front end: the PHP pages of the Nagios interface
drwxrwxr-x  5 nagios nagios 4096 Jul 16 10:03 var            # logs and data


[root@nagios tools]# cd /usr/local/nagios/etc

[root@nagios etc]# ls -l

total 76

-rw-rw-r-- 1 nagios nagios 11669 Jul 14 23:21 cgi.cfg

-rw-r--r-- 1 root   root      21 Jul 14 23:22 htpasswd.users  # password file for web authentication
-rw-rw-r-- 1 nagios nagios 44710 Jul 14 23:21 nagios.cfg      # main Nagios configuration file

-rw-r--r-- 1 nagios nagios  7207 Jul 14 23:25 nrpe.cfg

drwxrwxr-x 2 nagios nagios  4096 Jul 14 23:21 objects

-rw-rw---- 1 nagios nagios  1340 Jul 14 23:21 resource.cfg


# Generate the hosts.cfg file

[root@nagios etc]# cd objects/

[root@nagios objects]# head -51 localhost.cfg >hosts.cfg


[root@nagios objects]# chown nagios.nagios /usr/local/nagios/etc/objects/hosts.cfg 


# Generate the services.cfg file

[root@nagios objects]# touch services.cfg

[root@nagios objects]# chown nagios.nagios /usr/local/nagios/etc/objects/services.cfg 

[root@nagios objects]# ll

total 52

-rw-rw-r-- 1 nagios nagios  7716 Jul 14 23:21 commands.cfg      # Nagios command definitions, mapping Nagios commands to system commands
-rw-rw-r-- 1 nagios nagios  2166 Jul 14 23:21 contacts.cfg      # alert contact definitions
-rw-r--r-- 1 nagios nagios  1870 Jul 16 12:00 hosts.cfg         # new: definitions of the monitored hosts
-rw-rw-r-- 1 nagios nagios  5403 Jul 14 23:21 localhost.cfg
-rw-rw-r-- 1 nagios nagios  3124 Jul 14 23:21 printer.cfg
-rw-r--r-- 1 nagios nagios     0 Jul 16 12:03 services.cfg      # new: definitions of the monitored services
-rw-rw-r-- 1 nagios nagios  3293 Jul 14 23:21 switch.cfg
-rw-rw-r-- 1 nagios nagios 10812 Jul 14 23:21 templates.cfg     # template definitions
-rw-rw-r-- 1 nagios nagios  3208 Jul 14 23:21 timeperiods.cfg   # time periods for checks and alerts

-rw-rw-r-- 1 nagios nagios  4019 Jul 14 23:21 windows.cfg


# Before editing nagios.cfg, back up the etc directory in case of mistakes


[root@nagios etc]# cd ..

[root@nagios nagios]# tar zcvf etc.tar.gz ./etc/

./etc/

./etc/nagios.cfg

./etc/cgi.cfg

./etc/nrpe.cfg

./etc/htpasswd.users

./etc/objects/

./etc/objects/printer.cfg

./etc/objects/localhost.cfg

./etc/objects/contacts.cfg

./etc/objects/windows.cfg

./etc/objects/timeperiods.cfg

./etc/objects/switch.cfg

./etc/objects/commands.cfg

./etc/objects/templates.cfg

./etc/resource.cfg

[root@nagios nagios]# cd etc

[root@nagios etc]# vi nagios.cfg +34


# Add 3 lines and comment out 1 line:

     # You can specify individual object config files as shown below:
      cfg_file=/usr/local/nagios/etc/objects/commands.cfg
      cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
      cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
      cfg_file=/usr/local/nagios/etc/objects/templates.cfg
# Add these 2 lines
      cfg_file=/usr/local/nagios/etc/objects/services.cfg
      cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
# Comment out 1 line; this is the monitoring of the local host
      #cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

# directive as shown below:
# Add 1 line (used for active monitoring) so that the services directory is included
     cfg_dir=/usr/local/nagios/etc/services

    #cfg_dir=/usr/local/nagios/etc/servers  # servers
    #cfg_dir=/usr/local/nagios/etc/printers # printers
    #cfg_dir=/usr/local/nagios/etc/switches # switches
    #cfg_dir=/usr/local/nagios/etc/routers  # routers

# Create the services directory and set its ownership

[root@nagios etc]#cd /usr/local/nagios/etc

[root@nagios etc]# mkdir services

[root@nagios etc]# chown -R nagios.nagios services/
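After adding cfg_file and cfg_dir entries, it is easy to forget to create one of the referenced paths. The following is a minimal sketch that lists the active entries whose path does not exist yet; `missing_cfg_paths` is a hypothetical helper and is no substitute for the `nagios -v` pre-flight check.

```shell
#!/bin/bash
# missing_cfg_paths FILE: print every active cfg_file= / cfg_dir= path in FILE
# that does not exist on disk. Lines commented out with '#' are ignored.
missing_cfg_paths() {
    local path
    sed -nE 's/^[[:space:]]*cfg_(file|dir)=([^[:space:]#]*).*/\2/p' "$1" |
    while IFS= read -r path; do
        [ -e "$path" ] || echo "$path"
    done
}

# Usage (path as used in this article):
#   missing_cfg_paths /usr/local/nagios/etc/nagios.cfg
```

An empty result means every referenced file and directory is present.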


[root@nagios etc]# ll

total 80

-rw-rw-r-- 1 nagios nagios 11669 Jul 14 23:21 cgi.cfg

-rw-r--r-- 1 root   root      21 Jul 14 23:22 htpasswd.users

-rw-rw-r-- 1 nagios nagios 44852 Jul 16 11:55 nagios.cfg

-rw-r--r-- 1 nagios nagios  7207 Jul 14 23:25 nrpe.cfg

drwxrwxr-x 2 nagios nagios  4096 Jul 16 12:03 objects

-rw-rw---- 1 nagios nagios  1340 Jul 14 23:21 resource.cfg

drwxr-xr-x 2 nagios nagios  4096 Jul 16 11:56 services          # new: holds the active-monitoring checks


# Configure the server to monitor the clients


[root@nagios etc]# cd objects/


[root@nagios objects]# vi hosts.cfg

# Define a host for the local machine


define host{

        use                   linux-server

        host_name               1.3-samba

        alias                   1.3-samba

        address                 10.89.1.3

        }



define host{

       use                      linux-server

       host_name                1.2-nagios

       alias                    1.2-nagios

       address                  10.89.1.2

       }



define host{

       use                      linux-server

       host_name                1.34-web-lnmp

       alias                    1.34-web-lnmp

       address                  10.89.1.34

       }


define host{

       use                      linux-server

       host_name                1.34-web

       alias                    1.34-web

       address                  10.89.1.34

       }


# Define an optional hostgroup for Linux machines


define hostgroup{

        hostgroup_name  linux-servers ; The name of the hostgroup

        alias           Linux Servers ; Long name of the group

        members         1.3-samba,1.2-nagios,1.34-web-lnmp,1.34-web

        }
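Every name in the hostgroup's members list must match a host_name defined in a host block, otherwise the configuration check fails. The following is a minimal sketch of that cross-check; `undefined_members` is a hypothetical helper and assumes the layout used in this file (one directive per line, members separated by commas without spaces).

```shell
#!/bin/bash
# undefined_members FILE: print hostgroup members that have no matching
# host_name definition in FILE.
undefined_members() {
    local file=$1 hosts member
    hosts=$(awk '$1 == "host_name" { print $2 }' "$file")
    awk '$1 == "members" { print $2 }' "$file" | tr ',' '\n' |
    while IFS= read -r member; do
        [ -n "$member" ] || continue
        printf '%s\n' "$hosts" | grep -qxF -- "$member" || echo "$member"
    done
}

# Usage (path as used in this article):
#   undefined_members /usr/local/nagios/etc/objects/hosts.cfg
```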


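Save and exit.

Check the syntax. First edit the nagios init script so that error messages are shown:

[root@nagios objects]# vim /etc/init.d/nagios +183

Change
 $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
to:
 $NagiosBin -v $NagiosCfgFile

Save and exit.

Check the syntax. If it reports the following error: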

[root@nagios objects]# /etc/init.d/nagios checkconfig

Checking services...

Error: There are no services defined!

        Checked 0 services.


Solution:
[root@nagios objects]# vi services.cfg

define service {
        use                     generic-service
        host_name               1.3-samba
        service_description     Disk Partition
        check_command           check_nrpe!check_disk
}

Check the syntax again:
[root@nagios objects]# /etc/init.d/nagios checkconfig


Total Warnings: 1

Total Errors:   0


Things look okay - No serious problems were detected during the pre-flight check

 OK.

[root@nagios objects]# 


If the following error is reported:


Error:Service check command 'check_nrpe' specified in service 'Disk Partition' for

 host '10-client01' not defined anywhere!

then edit commands.cfg:
[root@nagios objects]# vi commands.cfg

Append the following at the end:


# 'check_nrpe' command definition

define command{

        command_name    check_nrpe

        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$

        }


Note: command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ is in effect the same as probing with this command:
[root@nagios objects]# /usr/local/nagios/libexec/check_nrpe -H 10.89.1.3 -c check_disk
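To make the mapping from a service's check_command to the final command line concrete, the following minimal sketch performs the macro substitution by hand. `expand_check` is a hypothetical helper for illustration only; it handles just $USER1$, $HOSTADDRESS$, $ARG1$, and $ARG2$, and assumes $USER1$ points at /usr/local/nagios/libexec as in this installation.

```shell
#!/bin/bash
# expand_check COMMAND_LINE CHECK_COMMAND HOSTADDRESS [USER1]
# CHECK_COMMAND has the form name!arg1!arg2; the parts after each '!' become
# $ARG1$ and $ARG2$ in COMMAND_LINE.
expand_check() {
    local line=$1 check=$2 addr=$3 user1=${4:-/usr/local/nagios/libexec}
    local m_user1='$USER1$' m_addr='$HOSTADDRESS$' m_arg1='$ARG1$' m_arg2='$ARG2$'
    local -a parts
    IFS='!' read -r -a parts <<< "$check"
    # Quoting the pattern variables makes bash match them literally.
    line=${line//"$m_user1"/$user1}
    line=${line//"$m_addr"/$addr}
    line=${line//"$m_arg1"/${parts[1]:-}}
    line=${line//"$m_arg2"/${parts[2]:-}}
    printf '%s\n' "$line"
}

expand_check '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$' 'check_nrpe!check_disk' 10.89.1.3
```

With the values above, the sketch prints the same check_nrpe command that the note describes.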


Save and exit.

Check the syntax again:
[root@nagios objects]# /etc/init.d/nagios checkconfig

If there are no errors:


[root@nagios objects]# /etc/init.d/nagios reload

Running configuration check...done.

Reloading nagios configuration...done

[root@nagios objects]# 





If the following error appears after logging in:

It appears as though you do not have permission to view information for any of the hosts you requested...



If you believe this is an error, check the HTTP server authentication requirements for accessing this CGI

and check the authorization options in your CGI configuration file.


Edit cgi.cfg:

[root@nagios objects]# cd ../

[root@nagios etc]# vi cgi.cfg


# PHYSICAL HTML PATH

# This is the path where the HTML files for Nagios reside.  This

# value is used to locate the logo images needed by the statusmap

# and statuswrl CGIs.


physical_html_path=/usr/local/nagios/share


:g/nagiosadmin/s//alvin/g           # replace nagiosadmin with alvin, the user I created during installation

Save and exit.

[root@nagios objects]# /etc/init.d/nagios reload



1. Active monitoring mode

Monitor the LNMP web services on a client.

On the server:
[root@nagios]# cd /usr/local/nagios/etc/objects
[root@nagios objects]# vi commands.cfg
# Append at the end:


# 'check_weburl' command definition

define command{

        command_name    check_weburl

        command_line    $USER1$/check_http $ARG1$ -w 10 -c 30

        }

Save and exit.

[root@nagios objects]# cd /usr/local/nagios/etc/services

Create the active-mode monitoring file webzd.cfg:

[root@nagios services]# vi webzd.cfg

define service {

        use               generic-service

        host_name           1.34-web

        service_description    blog_ip

        check_command        check_weburl! -I 10.89.1.34

        max_check_attempts     3

        normal_check_interval   2

        retry_check_interval    1

        check_period         24x7

        notification_interval   30

        notification_period    24x7

        notification_options    w,u,c,r

        contact_groups        admins

        process_perf_data      1

}


define service {

        use               generic-service

        host_name           1.34-web

        service_description     blog_url

        check_command        check_http! -H bolg.etiantian.org 

        max_check_attempts     3

        normal_check_interval   2

        retry_check_interval    1

        check_period         24x7

        notification_interval   30

        notification_period    24x7

        notification_options    w,u,c,r

        contact_groups        admins


}



define service {

        use               generic-service

        host_name           1.34-web

        service_description    blog_port_80

        check_command        check_tcp!80

        max_check_attempts     3

        normal_check_interval   2

        retry_check_interval    1

        check_period         24x7

        notification_interval   30

        notification_period    24x7

        notification_options   w,u,c,r

        contact_groups       admins


}



define service {

        use               generic-service

        host_name           1.34-web

        service_description    ssh_port

        check_command        check_tcp! 22

        max_check_attempts     3

        normal_check_interval   2

        retry_check_interval    1

        check_period         24x7

        notification_interval   30

        notification_period    24x7

        notification_options    w,u,c,r

        contact_groups        admins


}

define service {

        use               generic-service

        host_name           1.34-web

        service_description    mysql_port

        check_command        check_tcp!3306

        max_check_attempts     3

        normal_check_interval   2

        retry_check_interval    1

        check_period         24x7

        notification_interval   30

        notification_period     24x7

        notification_options    w,u,c,r

        contact_groups        admins


}


define service {

        use               generic-service

        host_name           1.34-web

        service_description    rsync

        check_command        check_tcp!873

        max_check_attempts     3

        normal_check_interval   2

        retry_check_interval   1

        check_period         24x7

        notification_interval   30

        notification_period    24x7

        notification_options    w,u,c,r

        contact_groups        admins


}

Save and exit.

Check the syntax:
[root@nagios services]# /etc/init.d/nagios checkconfig

If there are no errors:

[root@nagios services]# /etc/init.d/nagios reload

------------------------------------------------------------------


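Custom plugin: monitor whether the password file has changed

Test on the client:

Generate the MD5 checksum of /etc/passwd:
[root@weblnmp ~]# md5sum /etc/passwd
5e2ebd59c3ebb7bd3c4b09b0674ca746  /etc/passwd

Save it to /etc/alvin.md5 (the file name and location are arbitrary):
[root@weblnmp ~]# md5sum /etc/passwd >/etc/alvin.md5

Check whether the checksum has changed; "OK" means it has not:
[root@weblnmp ~]# md5sum -c /etc/alvin.md5
/etc/passwd: OK

In practice:

1. Add the custom script on the client
cd /usr/local/nagios/libexec

cat check_passwd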

#!/bin/bash

char=`md5sum -c /etc/alvin.md5 2>/dev/null|grep "OK"|wc -l`

if [ $char -eq 1 ];

then

    echo "passwd is ok"

    exit 0

else

    echo "passwd is changed"

    exit 2

fi


# Make the script executable
[root@weblnmp libexec]# chmod +x check_passwd
[root@weblnmp libexec]# ll check_passwd
-rwxr-xr-x 1 root root 166 Jul 22 21:33 check_passwd
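Nagios decides a check's state from the plugin's exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The following is a minimal sketch of a slightly more defensive variant of the script above, which reports UNKNOWN when the baseline file is missing instead of claiming the file has changed. `check_md5_baseline` is a hypothetical function name, and the baseline path is passed in rather than hard-coded.

```shell
#!/bin/bash
# check_md5_baseline BASELINE: verify the files listed in an md5sum baseline
# and report the result with Nagios exit codes.
check_md5_baseline() {
    local baseline=$1
    if [ ! -r "$baseline" ]; then
        echo "UNKNOWN: baseline $baseline is missing or unreadable"
        return 3
    fi
    if md5sum -c "$baseline" >/dev/null 2>&1; then
        echo "OK: files match $baseline"
        return 0
    fi
    echo "CRITICAL: files differ from $baseline"
    return 2
}

# Usage as a plugin body (baseline path as used in this article):
#   check_md5_baseline /etc/alvin.md5; exit $?
```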

2. Add the command on the client and restart nrpe so that it takes effect
[root@weblnmp libexec]# vim /usr/local/nagios/etc/nrpe.cfg
Add the check_passwd command definition:
command[check_passwd]=/usr/local/nagios/libexec/check_passwd

[root@weblnmp libexec]# pkill nrpe
# Start nrpe again
[root@weblnmp libexec]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
# Check that it is running
[root@weblnmp libexec]# ps -ef | grep nrpe

nagios    64672      1  0 Dec22 ?        00:00:21 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

root      72255  72099  0 11:48 pts/0    00:00:00 grep nrpe

3. Test from the server

[root@nagios services]# /usr/local/nagios/libexec/check_nrpe -H 10.89.1.34 -c check_passwd

passwd is ok


Add the service definition:
[root@nagios ~]# cd /usr/local/nagios/etc/objects/
[root@nagios objects]# vi services.cfg
# Append at the end:

define service{

        use               generic-service

        host_name           1.34-web-lnmp

        service_description    check_passwd

        check_command        check_nrpe!check_passwd

}


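Check the syntax and reload:
[root@nagios objects]# /etc/init.d/nagios checkconfig
If there are no errors:
[root@nagios objects]# /etc/init.d/nagios reload

4. Change test: on the client, run a command that adds a user
[root@weblnmp ~]# useradd jack
Then run on the server: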

[root@nagios services]# /usr/local/nagios/libexec/check_nrpe -H 10.89.1.34 -c check_passwd

passwd is changed


         nagios check_redis: monitoring Redis

#!/bin/bash
# Nagios plugin: check that the Redis master and slave ports answer PING on every host.

redis_bin='/home/app/redis/src'
redis_ip=(192.168.1.161 192.168.1.162 192.168.1.163 192.168.1.164)
redis_master_port='6379'
redis_slave_port='6380'

for ip in "${redis_ip[@]}"; do
        # redis-cli prints PONG when the instance is reachable
        alive_master=$("$redis_bin/redis-cli" -h "$ip" -p "$redis_master_port" ping 2>/dev/null)
        alive_slave=$("$redis_bin/redis-cli" -h "$ip" -p "$redis_slave_port" ping 2>/dev/null)

        if [ "$alive_master" != "PONG" ] || [ "$alive_slave" != "PONG" ]; then
                echo "the redis $ip $redis_master_port or $redis_slave_port is down."
                exit 2    # CRITICAL
        fi
done

echo "redis is healthy on all hosts."
exit 0

