Category: LINUX

2011-10-21 22:38:15

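Nagios is an application for monitoring systems and networks. It watches the hosts and services you specify under the conditions you define, and sends alerts both when their state gets worse and when it recovers.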

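A brief summary of Nagios features:

Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.);
Monitoring of host resources (processor load, disk usage, etc.);

A simple plugin design that lets users easily add their own service checks;

Parallelized service checks;

The ability to define a network hierarchy using "parent" hosts to express the relationships between hosts, which is used to detect and distinguish between hosts that are down and hosts that are unreachable;

Notifications sent to contacts when service or host problems occur and when they are resolved (by email, SMS, or a user-defined method);

Event handlers, which run when a host or service event occurs and can help pinpoint the problem;

Automatic log file rotation;

Support for redundant monitoring hosts;

An optional web interface for viewing current network status, notification and problem history, log files, and so on.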


nagios-3.0.6.tar.gz             main program
nagios-plugins-1.4.13.tar.gz    plugins
nrpe_2.8.1.tar.gz               needed to monitor Linux hosts
nsclient 0.3.5                  needed to monitor Windows hosts

Nagios server: 192.168.1.176
Monitored Linux host: 192.168.1.175

I. Installing Nagios: server-side configuration

1. Prepare the prerequisite packages (I took the lazy route and used yum)
yum install httpd
yum install gcc
yum install glibc glibc-common
yum install gd gd-devel
yum install mysql mysql-server mysql-devel
yum install gnutls


2. Create the user
useradd nagios
passwd nagios

Create a group named nagcmd, which lets the web interface execute external commands, and add both the nagios user and the apache user to it:
groupadd nagcmd
usermod -a -G nagcmd nagios
usermod -a -G nagcmd apache

3. Download Nagios and the plugin packages
wget

wget

wget


4. Install Nagios
tar xzf nagios-3.0.6.tar.gz
cd nagios-3.0.6
Run the Nagios configure script, passing the user and groups created earlier:

./configure --with-nagios-group=nagios --with-nagios-user=nagios --with-command-group=nagcmd --with-gd-lib=/usr/lib --with-gd-inc=/usr/include

Compile the Nagios source:

make all

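Install the binaries, the init script, and the sample configuration files, and set the permissions on the external command directory: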

make install
make install-init
make install-config
make install-commandmode

5. Set the mailbox that receives alert emails
vi /usr/local/nagios/etc/objects/contacts.cfg
In the nagiosadmin contact definition, change the email address to your own so that you receive the alerts.
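
If you would rather script this edit than open vi, something along these lines may work. It is only a sketch: set_contact_email is a hypothetical helper (not part of Nagios), admin@example.com is a placeholder, and the demo runs against a throwaway file whose layout is assumed to resemble the sample contacts.cfg.

```shell
# Hypothetical helper: replace the value of the "email" directive in a
# contacts.cfg-style file, keeping indentation and any trailing ";" comment.
set_contact_email() {
    cfg=$1
    new_email=$2
    tmp=$(mktemp) || return 1
    sed "s|^\([[:space:]]*email[[:space:]][[:space:]]*\)[^[:space:];]*|\1${new_email}|" "$cfg" > "$tmp" &&
        cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Demo on a temporary file (assumed layout, not the real contacts.cfg)
demo_contacts=$(mktemp)
cat > "$demo_contacts" <<'EOF'
define contact{
        contact_name    nagiosadmin
        use             generic-contact
        alias           Nagios Admin
        email           nagios@localhost ; change this
        }
EOF
set_contact_email "$demo_contacts" "admin@example.com"
grep email "$demo_contacts"
```

On the real server the first argument would be /usr/local/nagios/etc/objects/contacts.cfg; keep a backup copy before rewriting it.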

6. Configure the web interface

Install the Nagios web configuration file into the Apache conf.d directory:

make install-webconf

Create a nagiosadmin user for logging in to the Nagios web interface. Note the password you set; you will need it shortly.

htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
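Enter a password when prompted (remember it: nagiosadmin and this password are what you will use to log in to the Nagios web pages).

Restart the Apache service so the settings take effect: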

service httpd restart

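chown -R nagios:nagios /usr/local/nagios/etc/htpasswd.users
(This change is essential. If the file is not owned by nagios, many of the web pages fail with permission errors; it took me a long time to track this down.)

Edit the httpd.conf configuration file:
vi /etc/httpd/conf/httpd.conf
Append the following to the end of the file: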
ScriptAlias "/nagios/cgi-bin" "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
   Options ExecCGI
   AllowOverride None
   Order allow,deny
   Allow from all
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/local/nagios/etc/htpasswd.users
   Require valid-user
</Directory>

Alias /nagios "/usr/local/nagios/share"
<Directory "/usr/local/nagios/share">
   Options None
   AllowOverride None
   Order allow,deny
   Allow from all
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/local/nagios/etc/htpasswd.users
   Require valid-user
</Directory>

Restart Apache:
killall httpd
service httpd restart
[root@duoduo-test /]# service httpd restart
Stopping httpd:                                            [  OK  ]
Starting httpd: [Fri Mar 26 00:51:01 2010] [warn] The ScriptAlias directive in /etc/httpd/conf/httpd.conf at line 992 will probably never match because it overlaps an earlier ScriptAlias.
[Fri Mar 26 00:51:01 2010] [warn] The Alias directive in /etc/httpd/conf/httpd.conf at line 1003 will probably never match because it overlaps an earlier Alias.
httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
                                                          [  OK  ]
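Restarting httpd prints these warnings, but they do not affect Nagios. I searched online for a long time without finding a definitive fix. The likely cause is that make install-webconf has already installed the same ScriptAlias and Alias directives in the Apache conf.d directory, so the copies appended to httpd.conf overlap them.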
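
To see which Alias and ScriptAlias directives the warnings refer to, a small script can list URL paths that are defined more than once across a set of Apache config files. This is a rough sketch: list_duplicate_aliases is a hypothetical helper, it assumes the directives are spelled with exactly this capitalization, and the demo uses temporary files rather than the real configuration.

```shell
# Hypothetical helper: print every "Alias <path>" / "ScriptAlias <path>"
# pair that appears more than once in the given files.
list_duplicate_aliases() {
    awk '
        $1 == "Alias" || $1 == "ScriptAlias" {
            path = $2
            gsub(/"/, "", path)      # strip optional quotes around the URL path
            seen[$1 " " path]++
        }
        END {
            for (key in seen)
                if (seen[key] > 1) print key
        }
    ' "$@" | sort
}

# Demo: two temporary files standing in for conf.d/nagios.conf and httpd.conf
conf_a=$(mktemp)
conf_b=$(mktemp)
cat > "$conf_a" <<'EOF'
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
Alias /nagios "/usr/local/nagios/share"
EOF
cat > "$conf_b" <<'EOF'
ScriptAlias "/nagios/cgi-bin" "/usr/local/nagios/sbin"
Alias /nagios "/usr/local/nagios/share"
Alias /icons/ "/var/www/icons/"
EOF
list_duplicate_aliases "$conf_a" "$conf_b"
```

On the server itself the arguments would be /etc/httpd/conf/httpd.conf followed by the files in the Apache conf.d directory.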

7. Compile and install the Nagios plugins
tar -zxvf nagios-plugins-1.4.13.tar.gz
cd nagios-plugins-1.4.13
./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios
make && make install

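8. Verify the sample Nagios configuration files
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Output like the following means there are no errors. If there are errors, Nagios reports which configuration file and which line is at fault, and you only need to fix that line.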
[root@duoduo-test local]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios 3.0.6
Copyright (c) 1999-2008 Ethan Galstad (
)
Last Modified: 12-01-2008
License: GPL

Reading configuration data...

Running pre-flight check on configuration data...

Checking services...
        Checked 19 services.
Checking hosts...
        Checked 2 hosts.
Checking host groups...
        Checked 1 host groups.
Checking service groups...
        Checked 0 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 25 commands.
Checking time periods...
        Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
[root@duoduo-test local]#
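
When the check is run from a script, the "Total Errors" line of this report can be read mechanically. The sketch below is one way to do it; preflight_errors is a hypothetical helper, and the demo feeds it the two summary lines from the output above rather than running nagios.

```shell
# Hypothetical helper: read "nagios -v" output on stdin and print the
# number that follows "Total Errors:".
preflight_errors() {
    awk -F: '/^Total Errors:/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}

# Demo input: the summary lines from the report shown above
sample_output='Total Warnings: 0
Total Errors:   0'

errors=$(printf '%s\n' "$sample_output" | preflight_errors)
if [ "$errors" = "0" ]; then
    echo "pre-flight check passed"
else
    echo "pre-flight check reported $errors error(s)"
fi
```

In real use the input would come from /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg piped into preflight_errors.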

chkconfig --add nagios
chkconfig nagios on

If no errors were reported, start the Nagios service:
service nagios start

9. Disable SELinux
vi /etc/sysconfig/selinux

SELINUX=disabled

Set SELINUX to disabled, then reboot the system for the change to take effect.
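
The same edit can be made without an editor. This is a minimal sketch on a temporary file: set_selinux_mode is a hypothetical helper, and the demo file only imitates the layout of /etc/sysconfig/selinux.

```shell
# Hypothetical helper: rewrite the SELINUX= line of a selinux config file.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is untouched.
set_selinux_mode() {
    cfg=$1
    mode=$2
    tmp=$(mktemp) || return 1
    sed "s/^SELINUX=.*/SELINUX=${mode}/" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Demo on a throwaway file (assumed layout)
demo_selinux=$(mktemp)
cat > "$demo_selinux" <<'EOF'
# This file controls the state of SELinux on the system.
SELINUX=enforcing
SELINUXTYPE=targeted
EOF
set_selinux_mode "$demo_selinux" disabled
cat "$demo_selinux"
```

Until the reboot, running setenforce 0 as root switches SELinux to permissive mode, and getenforce shows the current mode.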

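10. Log in to the web interface to view Nagios

Enter the nagiosadmin user name and the password you just set, and you are in.
I ran into two problems during configuration:

1) CGI permission errors
Change the owner and group of /usr/local/nagios to nagios.
2) Pages failing to display

Edit /usr/local/nagios/etc/cgi.cfg:
use_authentication=1
and change the value to 0 (this turns off authentication for the CGIs).

Installation complete!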

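II. Configuring monitoring for a Linux system
1. Configure the monitored host (192.168.1.175). It needs nrpe_2.8.1.tar.gz and the plugin package nagios-plugins-1.4.13.tar.gz.

useradd nagios    (create the nagios user)
passwd nagios     (set its password)
wget
(download the Nagios plugins)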
tar -zxvf nagios-plugins-1.4.13.tar.gz
cd nagios-plugins-1.4.13
./configure
make
make install
After the build, two directories, libexec and share, are created under /usr/local/nagios/; check that they exist.

chown -R nagios:nagios /usr/local/nagios    (change the directory owner)

2. Install nrpe
tar -zxvf nrpe_2.8.1.tar.gz
cd nrpe_2.8.1
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config

vi /usr/local/nagios/etc/nrpe.cfg
Change allowed_hosts=127.0.0.1 to allowed_hosts=192.168.1.176 (my Nagios server).

Use the IP address of your own Nagios server.
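
allowed_hosts accepts a comma-separated list, so an alternative to replacing 127.0.0.1 is to append the server address and keep local testing possible. The following is a sketch only: add_allowed_host is a hypothetical helper, it assumes the existing allowed_hosts value is not empty, and it is demonstrated on a temporary file.

```shell
# Hypothetical helper: append an address to allowed_hosts in an
# nrpe.cfg-style file unless it is already listed.
add_allowed_host() {
    cfg=$1
    ip=$2
    tmp=$(mktemp) || return 1
    awk -v ip="$ip" '
        /^allowed_hosts=/ {
            n = split(substr($0, index($0, "=") + 1), hosts, ",")
            found = 0
            for (i = 1; i <= n; i++)
                if (hosts[i] == ip) found = 1
            if (!found) $0 = $0 "," ip
        }
        { print }
    ' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Demo on a throwaway file; calling it twice shows the edit is idempotent
demo_nrpe_cfg=$(mktemp)
printf 'server_port=5666\nallowed_hosts=127.0.0.1\n' > "$demo_nrpe_cfg"
add_allowed_host "$demo_nrpe_cfg" 192.168.1.176
add_allowed_host "$demo_nrpe_cfg" 192.168.1.176
cat "$demo_nrpe_cfg"
```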

Start nrpe:
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

Check that port 5666 is listening, and open port 5666 in the firewall:
netstat -antl | grep 5666

The checks that nrpe exposes are listed in the same file:
vi /usr/local/nagios/etc/nrpe.cfg
# The following examples use hardcoded command arguments...
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
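
The Nagios server can only call command names that appear in this file, so it can help to compare the names you plan to use against the ones actually defined. A possible sketch follows; nrpe_command_names is a hypothetical helper, the demo file repeats the five definitions above, and check_swap is included in the wanted list as an example of a name this listing does not define.

```shell
# Hypothetical helper: print the names inside command[...] definitions.
nrpe_command_names() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demo file containing the five definitions listed above
demo_nrpe=$(mktemp)
cat > "$demo_nrpe" <<'EOF'
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
EOF

defined=$(nrpe_command_names "$demo_nrpe")
missing=""
for name in check_swap check_load check_hda1 check_zombie_procs check_users check_total_procs; do
    printf '%s\n' "$defined" | grep -qx "$name" || missing="$missing $name"
done
echo "not defined in nrpe.cfg:$missing"
```

Any name reported as missing needs its own command[...] line in nrpe.cfg on the monitored host before the server can call it.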

Configuration on the Nagios server (192.168.1.176)
1. Install nrpe
tar -zxvf nagios-nrpe_2.12.tar.gz
cd nagios-nrpe_2.12
./configure
make all
make install-plugin

Test connectivity:
/usr/local/nagios/libexec/check_nrpe -H <IP of the monitored host>
[root@duoduo-test local]# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.175
NRPE v2.8.1

If the nrpe version number comes back, everything is working.
If the connection is refused, first try telnet <ip> 5666, then check the iptables rules.
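
Besides the printed text, Nagios plugins report their result through the exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), which is handy when testing from the shell. A small sketch, with plugin_status as a hypothetical helper and a simulated exit status standing in for a real check:

```shell
# Hypothetical helper: map a plugin exit status to its Nagios state name.
plugin_status() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "out of range ($1)" ;;
    esac
}

# Demo: "sh -c 'exit 2'" stands in for a check that ended in a critical state
sh -c 'exit 2' || plugin_status $?
```

With a real check, run the check_nrpe command shown above and then call plugin_status $? on the next line.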

3. Modify the configuration files
1) Define the nrpe command
Because nrpe is an external add-on, its check command has to be defined in commands.cfg.

[root@duoduo-test local]# vi /usr/local/nagios/etc/objects/commands.cfg
Append at the bottom of the file:

#check nrpe
define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
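
To make the macros concrete: a service whose check_command is check_nrpe!check_load passes the text after the "!" as $ARG1$, and $HOSTADDRESS$ comes from the host definition. The sketch below imitates that expansion in plain shell. expand_check_nrpe is a hypothetical helper, and /usr/local/nagios/libexec is assumed as the value of $USER1$, which is what a default source install sets in resource.cfg.

```shell
# Hypothetical helper: show the command line Nagios would build from the
# check_nrpe definition above for a given host address and check_command.
expand_check_nrpe() {
    host_address=$1
    check_command=$2
    user1=${3:-/usr/local/nagios/libexec}   # assumed value of $USER1$
    arg1=${check_command#*!}                # text after the first "!" is $ARG1$
    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$host_address" "$arg1"
}

expand_check_nrpe 192.168.1.175 'check_nrpe!check_load'
```

Running the printed command by hand on the Nagios server is a quick way to test a single remote check before adding it as a service.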

2) Define the configuration file for the monitored objects
vi /usr/local/nagios/etc/nagios.cfg
Add:
cfg_file=/usr/local/nagios/etc/objects/linuxserver.cfg
The file name linuxserver.cfg can be changed, but it should keep the .cfg suffix.
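
If this step is scripted, it is worth making sure the cfg_file line is not added twice on repeated runs. A minimal sketch, where add_cfg_file is a hypothetical helper and the demo uses a temporary stand-in for nagios.cfg:

```shell
# Hypothetical helper: append a cfg_file= line only if it is not present yet.
add_cfg_file() {
    main_cfg=$1
    object_cfg=$2
    grep -qxF "cfg_file=$object_cfg" "$main_cfg" ||
        printf 'cfg_file=%s\n' "$object_cfg" >> "$main_cfg"
}

# Demo on a throwaway file; the second call changes nothing
demo_main=$(mktemp)
printf 'cfg_file=/usr/local/nagios/etc/objects/commands.cfg\n' > "$demo_main"
add_cfg_file "$demo_main" /usr/local/nagios/etc/objects/linuxserver.cfg
add_cfg_file "$demo_main" /usr/local/nagios/etc/objects/linuxserver.cfg
cat "$demo_main"
```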

Create linuxserver.cfg:
vi /usr/local/nagios/etc/objects/linuxserver.cfg
Add:

define host{
        use                     linux-server
        host_name               aiyo-mailserver
        alias                   aiyo-mailserver
        address                 210.51.47.213
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     HTTP
        check_command           check_http
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     FTP
        check_command           check_ftp
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     SSH
        check_command           check_ssh
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     SMTP
        check_command           check_smtp
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     POP3
        check_command           check_pop
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     check-swap
        check_command           check_nrpe!check_swap
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     check-load
        check_command           check_nrpe!check_load
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     check-disk
        check_command           check_nrpe!check_hda1
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     zombie_procs
        check_command           check_nrpe!check_zombie_procs
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     check-users
        check_command           check_nrpe!check_users
        }

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     total_procs
        check_command           check_nrpe!check_total_procs
        }
Save and exit.

This configuration file defines the host and its services.
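
Since the NRPE-based services differ only in their description and remote command name, they could also be generated rather than typed by hand. This is just an illustration: nrpe_service is a hypothetical helper, and the description:command pairs in the loop are examples taken from the definitions in this post.

```shell
# Hypothetical helper: print one NRPE-based service definition.
nrpe_service() {
    host=$1
    description=$2
    remote_command=$3
    printf 'define service{\n'
    printf '        use                     generic-service\n'
    printf '        host_name               %s\n' "$host"
    printf '        service_description     %s\n' "$description"
    printf '        check_command           check_nrpe!%s\n' "$remote_command"
    printf '        }\n\n'
}

# Each pair is "service description:remote command name"
for pair in check-load:check_load check-users:check_users total_procs:check_total_procs; do
    nrpe_service aiyo-mailserver "${pair%%:*}" "${pair#*:}"
done
```

The output can be redirected into the object file (for example with >> linuxserver.cfg) and then checked with nagios -v as usual.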

3) Verify the configuration
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Restart Nagios and check the web page.

4. Monitoring MySQL
Define the mysql command:
vi /usr/local/nagios/etc/objects/commands.cfg
Append at the end:
# 'check_mysql' command definition
define command{
        command_name check_mysql
        command_line $USER1$/check_mysql -H $HOSTADDRESS$ -u nagios -d nagdb
        }

vi /usr/local/nagios/etc/objects/linuxserver.cfg
Add the mysql service. The host_name must match a host that is already defined:

define service{
        use                     generic-service
        host_name               aiyo-mailserver
        service_description     mysql
        check_command           check_mysql
        }

Verify the configuration:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Restart Nagios:
killall nagios
service nagios restart

PS:
To start nagios and nrpe automatically at boot:

echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.local
echo "/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg" >> /etc/rc.local

 
