Category: System Operations

2016-01-13 00:27:17

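I. Nagios Overview
Nagios is an open-source monitoring tool that runs on Linux and UNIX platforms. It monitors host status, network status, a range of system problems, and log anomalies, and it can raise alerts through three channels: the web interface, email, and SMS.
Nagios consists of two parts, the core and the plugins. The core provides only a small part of the monitoring functionality; the plugins provide most of it.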
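
To make the split between core and plugins concrete: a plugin is simply an executable whose exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and output line are read by the core. The following is a minimal sketch, not one of the official plugins; the function name check_value and its four arguments are made up for illustration.

```shell
#!/bin/bash
# Hypothetical plugin sketch (not an official Nagios plugin).
# Compares an integer value with warning and critical thresholds, prints a
# status line with performance data after the "|", and returns the exit
# status that the Nagios core interprets: 0 OK, 1 WARNING, 2 CRITICAL.
check_value() {
    local label=$1 value=$2 warn=$3 crit=$4
    local perf="$label=$value;$warn;$crit"
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - $label is $value | $perf"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - $label is $value | $perf"
        return 1
    fi
    echo "OK - $label is $value | $perf"
    return 0
}
```

Called as check_value users 7 5 10, the sketch prints a WARNING line and returns 1, which is the status the core would record for that service.
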

II. Installing the Nagios Server (192.168.1.125)
1. Package download locations
Nagios server:


Chinese localization patch:
Monitored Linux hosts:


Monitored Windows hosts:


2. Install the prerequisite packages
    Since Nagios 3.1.x, the web monitoring interface requires PHP.

    yum -y install httpd gcc glibc glibc-common gd gd-devel php
    service httpd start
    chkconfig --level 35 httpd on

3. Configure Apache
In /etc/httpd/conf/httpd.conf, change the user and group that the Apache processes run as to nagios.

    User apache
    Group apache

Change these to:

    User nagios
    Group nagios

Likewise, change

    DirectoryIndex index.html index.html.var

to:

    DirectoryIndex index.html index.php

Then add the following line:

    AddType application/x-httpd-php .php

For security, access to the Nagios web interface should require authentication. Add the following to the end of httpd.conf:

    #setting for nagios
    ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
    <Directory "/usr/local/nagios/sbin">
         AuthType Basic
         Options ExecCGI
         AllowOverride None
         Order allow,deny
         Allow from all
         AuthName "Nagios Access"
         AuthUserFile /usr/local/nagios/etc/htpasswd
         Require valid-user
    </Directory>

    Alias /nagios "/usr/local/nagios/share"
    <Directory "/usr/local/nagios/share">
         AuthType Basic
         Options None
         AllowOverride None
         Order allow,deny
         Allow from all
         AuthName "nagios Access"
         AuthUserFile /usr/local/nagios/etc/htpasswd
         Require valid-user
    </Directory>

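4. Create the Apache authentication file and the account used to log in to the Nagios web pages (the password is typed at the prompts; htpasswd does not echo it):

    htpasswd -c /usr/local/nagios/etc/htpasswd sxm
    New password: sxm123
    Re-type new password: sxm123
    Adding password for user sxm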

Apache startup error and fix:

When Apache starts it may report:

    httpd: Could not reliably determine the server's fully qualified domain name

Open the configuration file:

    vim /etc/httpd/conf/httpd.conf

and change

    #ServerName

to

    ServerName localhost:80

5. Install Nagios

    /usr/sbin/useradd -s /sbin/nologin nagios
    mkdir /usr/local/nagios
    chown -R nagios:nagios /usr/local/nagios
    tar -zxvf nagios-3.4.3.tar.gz
    cd nagios
    ./configure --prefix=/usr/local/nagios    # install Nagios under /usr/local/nagios
    make all
    make install                              # install the main program, CGIs and HTML files
    make install-init                         # create the nagios init script under /etc/rc.d/init.d/
    make install-commandmode                  # set permissions on the external command directory
    make install-config                       # install the sample configuration files into /usr/local/nagios/etc
    chkconfig --add nagios
    chkconfig --level 35 nagios on            # start nagios at boot
    service nagios start

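After a successful installation, /usr/local/nagios contains six directories; two subdirectories of var are also listed here:
bin                 Nagios executables
etc                 Nagios configuration files
libexec             Nagios external plugins
sbin                Nagios CGI files, including the files needed to execute external commands
share               Nagios web page files
var                 Nagios log files, lock files, and similar
var/archives        automatically archived Nagios logs
var/rw              directory holding the external command file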

Startup error after creating the nagios user, and its fix:

Because the nagios user was created with /sbin/nologin as its login shell, starting the service reports "This account is currently not available.":

    [root@nagios conf]# service nagios restart
    Running configuration check...done.
    Stopping nagios: done.
    Starting nagios:This account is currently not available.
     done.

Open /etc/passwd:

    vim /etc/passwd

and change

    nagios:x:500:500::/home/nagios:/sbin/nologin

to

    nagios:x:500:500::/home/nagios:/bin/bash

Then restart nagios:

    [root@nagios conf]# service nagios restart
    Running configuration check...done.
    Stopping nagios: done.
    Starting nagios: done.

The binary can also be run directly:

    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg   # verify the Nagios configuration
    /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg   # start Nagios in daemon mode
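
The restart output above shows the init script running a configuration check of its own, but it only prints "done". Running the -v check by hand shows the details. A minimal sketch that combines the two steps follows; the function name reload_if_valid and the two environment variables are hypothetical, and the sketch assumes the check exits with a non-zero status when it finds errors.

```shell
#!/bin/bash
# Paths as used in this article; override them through the environment.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Hypothetical helper: restart only when the configuration check passes,
# and print the full check report when it does not.
reload_if_valid() {
    local report
    if report=$("$NAGIOS_BIN" -v "$NAGIOS_CFG" 2>&1); then
        service nagios restart
    else
        echo "$report" >&2
        echo "configuration check failed; nagios was not restarted" >&2
        return 1
    fi
}
```

Run reload_if_valid as root after each configuration change; a broken file then never reaches the running daemon.
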



 
6. Install the Nagios plugins

    tar -zxvf nagios-plugins-2.1.1.tar.gz
    cd nagios-plugins-2.1.1
    ./configure --prefix=/usr/local/nagios
    make
    make install

 
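7. Configure Nagios
Configuration files under /usr/local/nagios/etc:
cgi.cfg                     controls access to the CGIs
nagios.cfg                  main Nagios configuration file
resource.cfg                resource file for macro definitions; macros defined here, for example $USER1$, can be referenced from the other configuration files
objects                     directory holding many template configuration files used to define Nagios objects
objects/commands.cfg        command definitions; commands defined here can be referenced from the other configuration files
Example: add a command based on check_http that monitors a web page.
check_http parameters: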
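Usage:
 check_http -H <vhost> | -I <IP-address> [-u <uri>] [-p <port>]
       [-J <client certificate file>] [-K <private key>]
       [-w <warn time>] [-c <critical time>] [-t <timeout>] [-L] [-E] [-a auth]
       [-b proxy_auth] [-f <ok|warning|critical|follow|sticky|stickyport>]
       [-e <expect>] [-d string] [-s string] [-l] [-r <regex> | -R <case-insensitive regex>]
       [-P string] [-m <min_pg_size>:<max_pg_size>] [-4|-6] [-N] [-M <age>]
       [-A string] [-k string] [-S <version>] [--sni] [-C <warn_age>[,<crit_age>]]
       [-T <content-type>] [-j method]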

7.1 Define the check_http-based command in commands.cfg

    # -I host IP address, -u URL, -p port, -s string expected in the page content
    define command{
            command_name check_http_word
            command_line $USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ -s $ARG3$
            }

7.2 Test the command

Check whether the page contains the keyword "welcome":

    # /usr/local/nagios/libexec/check_http -I 192.168.1.124 -u /index.html -p 80 -s "welcome"
    HTTP OK: HTTP/1.1 200 ok - 280 bytes in 0.005 second response time |time=0.004636s;;;0.000000 size=280B;;;0
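
The part of the output after the "|" is performance data in label=value;warn;crit;min;max form. As a rough illustration of how to read it, the sketch below extracts one value from such a line. The function name perfdata_value is hypothetical, and the sketch assumes labels without spaces, as in the output above.

```shell
#!/bin/bash
# Hypothetical helper: print the value of one performance-data label.
# $1 = complete plugin output line, $2 = label to look up
perfdata_value() {
    local line=$1 label=$2 perf item
    case $line in
        *'|'*) perf=${line#*|} ;;   # keep only the text after the first "|"
        *) return 1 ;;              # the line carries no performance data
    esac
    for item in $perf; do           # items are separated by spaces
        if [ "${item%%=*}" = "$label" ]; then
            item=${item#*=}         # drop "label="
            echo "${item%%;*}"      # drop the warn/crit/min/max fields
            return 0
        fi
    done
    return 1
}
```

For the check_http line shown above, perfdata_value "$line" time prints 0.004636s and perfdata_value "$line" size prints 280B.
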
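7.3 Define the service in services.cfg

    define service{
            use local-service
            # host_name must name a defined host; web2 is 192.168.1.124 in hosts.cfg
            host_name web2
            service_description http_word
            check_command check_http_word!/index.html!80!welcome
            }

7.4 Restart nagios, then look on the Services page of the Nagios web interface for the newly added check_http_word check, listed as http_word.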

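The remaining files under objects:
objects/contacts.cfg        defines contacts and contact groups
objects/localhost.cfg       defines monitoring of the local host
objects/printer.cfg         template for monitoring printers; not enabled by default
objects/switch.cfg          template for monitoring routers; not enabled by default
objects/templates.cfg       host and service templates that the other configuration files can reference
objects/timeperiods.cfg     defines the time periods used for monitoring
objects/windows.cfg         template for monitoring Windows hosts; not enabled by default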

Create a new file, hosts.cfg:

    define host{
            use linux-server
            host_name web1
            alias sxm-web1
            address 192.168.1.123
            }
    define host{
            use linux-server
            host_name web2
            alias sxm-web2
            address 192.168.1.124
            }
    define host{
            use linux-server
            host_name mysql
            alias sxm-mysql
            address 192.168.1.126
            }
    define hostgroup{
            hostgroup_name sxm-nagios
            alias sxm nagios
            members web1,web2,mysql
            }

Create a new file, services.cfg:

    ################# web #####################
    define service{
            use local-service
            host_name web1
            service_description PING
            check_command check_ping!100.0,20%!500.0,60%
            }
    define service{
            use local-service
            host_name web1
            service_description SSH
            check_command check_ssh
            }
    define service{
            use local-service
            host_name web1
            service_description ftp
            check_command check_tcp!21
            }
    define service{
            use local-service
            host_name web1
            service_description http
            check_command check_http
            }
    define service{
            use local-service
            host_name web2
            service_description PING
            check_command check_ping!100.0,20%!500.0,60%
            }
    define service{
            use local-service
            host_name web2
            service_description SSH
            check_command check_ssh
            }
    define service{
            use local-service
            host_name web2
            service_description ftp
            check_command check_tcp!21
            }
    define service{
            use local-service
            host_name web2
            service_description http
            check_command check_tcp!80
            }
    ################# mysql #####################
    define service{
            use local-service
            host_name mysql
            service_description PING
            check_command check_ping!100.0,20%!500.0,60%
            }
    define service{
            use local-service
            host_name mysql
            service_description SSH
            check_command check_ssh
            }
    define service{
            use local-service
            host_name mysql
            service_description ftp
            check_command check_ftp
            }
    define service{
            use local-service
            host_name mysql
            service_description mysqlport
            check_command check_tcp!3306
            }
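
The blocks in services.cfg differ only in three fields, so they can be generated rather than typed. The following is one possible sketch; the function name service_block is hypothetical, and the output is meant to be reviewed before it is pasted into services.cfg.

```shell
#!/bin/bash
# Hypothetical generator for the repetitive blocks in services.cfg.
# $1 = host_name, $2 = service_description, $3 = check_command
service_block() {
    printf 'define service{\n'
    printf '        %-21s%s\n' use local-service
    printf '        %-21s%s\n' host_name "$1"
    printf '        %-21s%s\n' service_description "$2"
    printf '        %-21s%s\n' check_command "$3"
    printf '        }\n'
}

# Emit the PING and SSH checks for every host defined in hosts.cfg.
for host in web1 web2 mysql; do
    service_block "$host" PING 'check_ping!100.0,20%!500.0,60%'
    service_block "$host" SSH  'check_ssh'
done
```

Nagios itself also accepts a comma-separated list in host_name, so a check that is identical on several hosts can be written as a single block with host_name web1,web2,mysql.
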
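Create a new file, servicegroup.cfg. The members directive is a list of host,service pairs; the users, load, disk, and swap services on web2 are the NRPE checks defined in step 9.

    define servicegroup{
            servicegroup_name servicegroup
            alias service_group
            members web1,PING,web1,SSH,web1,http,web2,PING,web2,SSH,web2,http,web2,users,web2,load,web2,disk,web2,swap
            }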
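
A long members line is easy to get wrong, because every host must be followed by exactly one service description. The sketch below splits such a line into pairs for a quick visual check; the function name members_pairs is hypothetical, and it does not verify that the hosts or services actually exist.

```shell
#!/bin/bash
# Hypothetical helper: print a servicegroup members list as host -> service
# pairs, and fail if a host is left without a service.
members_pairs() {
    local IFS=','
    local -a parts
    local i
    read -r -a parts <<< "$1"
    if [ $(( ${#parts[@]} % 2 )) -ne 0 ]; then
        echo "odd number of fields: every host needs a service" >&2
        return 1
    fi
    for (( i = 0; i < ${#parts[@]}; i += 2 )); do
        echo "${parts[i]} -> ${parts[i+1]}"
    done
}
```

For example, members_pairs 'web1,PING,web1,SSH' prints two lines, "web1 -> PING" and "web1 -> SSH".
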


Edit cgi.cfg:

    default_user_name=sxm
    authorized_for_system_information=nagiosadmin,sxm
    authorized_for_configuration_information=nagiosadmin,sxm
    authorized_for_system_commands=sxm
    authorized_for_all_services=nagiosadmin,sxm
    authorized_for_all_hosts=nagiosadmin,sxm
    authorized_for_all_service_commands=nagiosadmin,sxm
    authorized_for_all_host_commands=nagiosadmin,sxm

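Edit nagios.cfg so that it loads the object files. Nagios only reads object files that are named by a cfg_file entry or found under a cfg_dir, so servicegroup.cfg needs an entry of its own as well if it is to take effect.

    cfg_file=/usr/local/nagios/etc/hosts.cfg
    cfg_file=/usr/local/nagios/etc/services.cfg
    cfg_file=/usr/local/nagios/etc/objects/commands.cfg
    cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
    cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
    cfg_file=/usr/local/nagios/etc/objects/templates.cfg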


8. Install the NRPE add-on on the Nagios server
    NRPE is an extension to Nagios. The NRPE daemon and the Nagios plugins installed on a remote server report that server's local state, such as CPU load, memory usage, and disk usage, back to the Nagios server.

    [root@nagios tmp]# tar -zxvf nrpe-2.13.tar.gz
    [root@nagios tmp]# cd nrpe-2.13
    [root@nagios nrpe-2.13]# ./configure
    [root@nagios nrpe-2.13]# make all
    [root@nagios nrpe-2.13]# make install-plugin
    [root@nagios nrpe-2.13]# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.124
    NRPE v2.13

The last command only answers once the NRPE daemon on the client (section III) is running.

Building NRPE may fail with "checking for SSL headers... configure: error: Cannot find ssl headers".
The cause is a missing openssl-devel package. Install it with:

    yum -y install openssl-devel

9. Define a check_nrpe command
In /usr/local/nagios/etc/objects/commands.cfg:

    define command{
            command_name check_nrpe
            command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
            }

Add the checks for the remote host. The argument after the "!" must match a command[...] name in the client's nrpe.cfg.
In /usr/local/nagios/etc/services.cfg:

    define service{
           use                             local-service
           host_name                       web2
           service_description             users
           check_command                   check_nrpe!check_users
           }

    define service{
           use                             local-service
           host_name                       web2
           service_description             load
           check_command                   check_nrpe!check_load
           }

    define service{
           use                             local-service
           host_name                       web2
           service_description             disk
           check_command                   check_nrpe!check_sda3
           }

    define service{
           use                             local-service
           host_name                       web2
           service_description             swap
           check_command                   check_nrpe!check_swap
           }





III. Configuring the Nagios Client (192.168.1.124)
1. Install the Nagios plugins

    [root@web2 tmp]# useradd -s /sbin/nologin nagios
    [root@web2 tmp]# yum -y install gcc glibc glibc-common gd gd-devel openssl-devel
    [root@web2 tmp]# tar -zxvf nagios-plugins-2.1.1.tar.gz
    [root@web2 tmp]# cd nagios-plugins-2.1.1
    [root@web2 nagios-plugins-2.1.1]# ./configure
    [root@web2 nagios-plugins-2.1.1]# make
    [root@web2 nagios-plugins-2.1.1]# make install
    [root@web2 ~]# chown nagios:nagios /usr/local/nagios
    [root@web2 ~]# chown -R nagios:nagios /usr/local/nagios/libexec
2. Install NRPE

    [root@web2 tmp]# tar -zxvf nrpe-2.13.tar.gz
    [root@web2 tmp]# cd nrpe-2.13
    [root@web2 nrpe-2.13]# ./configure
    [root@web2 nrpe-2.13]# make all
    [root@web2 nrpe-2.13]# make install-plugin
    [root@web2 nrpe-2.13]# make install-daemon
    [root@web2 nrpe-2.13]# make install-daemon-config

3. Configure NRPE
In /usr/local/nagios/etc/nrpe.cfg, change

    allowed_hosts=127.0.0.1

to the following, where 192.168.1.125 is the address of the Nagios monitoring server:

    allowed_hosts=127.0.0.1,192.168.1.125
4. Start the NRPE daemon

    [root@web2 nrpe-2.13]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
    [root@web2 nrpe-2.13]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
    NRPE v2.13

A working daemon answers with its version, as shown.

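5. Define what is monitored on this server
The commands are defined in /usr/local/nagios/etc/nrpe.cfg. Each name in square brackets must match the argument that the server passes to check_nrpe. The daemon started in step 4 reads nrpe.cfg at startup, so restart it after changing these definitions.

    # number of logged-in users on the remote server
    command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
    # CPU load on the remote server
    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
    # disk usage on the remote server; named check_sda3 to match the service
    # defined on the server, so adjust the device to the partition being monitored
    command[check_sda3]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda3
    # zombie processes on the remote server
    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
    # total number of processes on the remote server
    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
    # free swap space on the remote server, in percent
    command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%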
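
Since the command names on the server and on the client have to agree, a quick cross-check can save a round of debugging. The sketch below lists every name used with check_nrpe! in a services.cfg that has no matching command[...] line in an nrpe.cfg. The function name missing_nrpe_commands is hypothetical, and because the two files live on different machines, one of them has to be copied across first.

```shell
#!/bin/bash
# Hypothetical helper: list NRPE command names that services.cfg asks for
# but nrpe.cfg does not define.
# $1 = services.cfg (from the server), $2 = nrpe.cfg (from the client)
missing_nrpe_commands() {
    local name
    grep -o 'check_nrpe![A-Za-z0-9_]*' "$1" | cut -d'!' -f2 | sort -u |
    while read -r name; do
        # A definition looks like: command[name]=/path/to/plugin args
        grep -q "^command\[$name\]=" "$2" || echo "$name"
    done
}
```

An empty result means every remote check has a definition on the client; any name printed would make check_nrpe report that the command is not defined.
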







 
