Category: System Operations

2013-10-08 14:41:27

Nagios + Centreon Deployment and Usage Guide

 

I. System Environment

1. LAMP (see http://blog.chinaunix.net/uid-8319462-id-3406527.html)

After installing MySQL, create a database whose db_name is nagios.

2. Nagios + NRPE (see http://blog.chinaunix.net/uid-8319462-id-3416628.html)

3. Ndoutils

Ndoutils uses the nagios database created above.

4. Rrdtool

5. Centreon
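Item 1 above calls for a database named nagios. As a minimal sketch, the helper below only prints the SQL statement so it can be reviewed before being passed to the mysql client. The helper name create_db_sql and the utf8 character set are my own assumptions, not part of the original setup.

```shell
# Print the SQL that creates the database used later by Ndoutils.
# "nagios" is the db_name from this guide; the utf8 character set
# is an assumption added for illustration.
create_db_sql() {
    printf 'CREATE DATABASE IF NOT EXISTS `%s` DEFAULT CHARACTER SET utf8;\n' "$1"
}

# Typical use (prompts for the MySQL root password):
#   mysql -u root -p -e "$(create_db_sql nagios)"
```

Printing the statement first keeps the step repeatable: IF NOT EXISTS makes a second run harmless.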

 

II. Installing Nagios

 

1. Add the nagios user

groupadd nagios

useradd -g nagios -d /usr/local/nagios -s /bin/false nagios

 

-d: sets the login (home) directory

-s: sets the shell used after login

 

2. Build and install

tar zxvf nagios-3.2.0.tar.gz

cd nagios-3.2.0

 

./configure --prefix=/usr/local/nagios

make all

make install

make install-init

make install-commandmode

make install-config

 

3. Install the Nagios plugins

tar zxvf nagios-plugins-1.4.14.tar.gz

 

cd nagios-plugins-1.4.14

./configure --prefix=/usr/local/nagios/

(On AS4, add the option --enable-redhat-pthread-workaround)

make

make install

 

 

chown -R nagios:nagios /usr/local/nagios

chmod 755 /usr/local/nagios

 

4. Integrate with Apache

To enable authentication, append the following to the end of httpd.conf:

Alias /nagios/cgi-bin/images/ "/usr/local/nagios/share/images/"

<Directory "/usr/local/nagios/share/images/">

    AllowOverride None

    Options None

    Order allow,deny

    Allow from all

    AuthName "Nagios Access"

    AuthType Basic

    AuthUserFile /usr/local/nagios/etc/htpasswd

    Require valid-user

</Directory>

 

ScriptAlias /nagios/cgi-bin/ "/usr/local/nagios/sbin/"

<Directory "/usr/local/nagios/sbin/">

    AllowOverride None

    Options None

    Order allow,deny

    Allow from all

    AuthName "Nagios Access"

    AuthType Basic

    AuthUserFile /usr/local/nagios/etc/htpasswd

    Require valid-user

</Directory>

 

Alias /nagios/ "/usr/local/nagios/share/"

<Directory "/usr/local/nagios/share/">

    AllowOverride None

    Options None

    Order allow,deny

    Allow from all

    AuthName "Nagios Access"

    AuthType Basic

    AuthUserFile /usr/local/nagios/etc/htpasswd

    Require valid-user

</Directory>

 

/usr/local/apache/bin/htpasswd -c /usr/local/nagios/etc/htpasswd nagios   (nagios is the login account)

You will be prompted to enter the password twice.

 

 

6. Fix permissions

Check the ownership of the htpasswd file and change it to user nagios, group nagios.

 

chmod 755 /usr/local/nagios

Otherwise, opening the page in a browser returns:

You don't have permission to access

 

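Run /usr/local/apache/bin/apachectl -t to check the syntax of httpd.conf; once it reports OK, restart Apache.

 

Log in using the domain name and enter the username and password in the dialog that appears (when accessing the site by IP, the authentication prompt does not appear).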

 

 

cd /usr/local/nagios/etc

 

vi nagios.cfg

vi resource.cfg

 

vi cgi.cfg

In cgi.cfg, change use_authentication=1 to use_authentication=0, which disables authentication. Otherwise some pages are not displayed.

 

7. Configure Nagios

vi commands.cfg

The key parts are:

# 'notify-host-by-email' command definition

define command{

        command_name    notify-host-by-email

        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/local/bin/sendEmail -f vip@east.net -t $CONTACTEMAIL$ -s smtp.east.net -u "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" -m "Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState:$HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" -xu vip -xp 13579

        }

 

# 'notify-service-by-email' command definition

define command{

        command_name    notify-service-by-email

        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/local/bin/sendEmail -f vip@east.net  -t lvbin@east.net -s  smtp.east.net -u "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" -m "Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState:$HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" -xu vip -xp 13579

        }

 

# 'notify-host-by-sms' command definition

define command{

        command_name notify-host-by-sms

        command_line /usr/local/nagios/fetion2009/fetion --config=/soft/install/login.conf --index=1 --to=13426201234 --msg-utf8="HOST $HOSTADDRESS$ $HOSTSTATE$"

}

 

# 'notify-service-by-sms' command definition

define command{

        command_name notify-service-by-sms

        command_line /usr/local/nagios/fetion2009/fetion --config=/soft/install/login.conf --index=1 --to=$CONTACTPAGER$ --msg-utf8="$SERVICEDESC$ $HOSTADDRESS$ $SERVICESTATE$"

}

 

vi hosts.cfg

vi services.cfg

 

8. Start automatically at boot

vi /etc/rc.d/rc.local

Add:

/usr/local/apache/bin/apachectl start

/etc/init.d/nagios start

 

 

9. Start Nagios

/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

/usr/local/etc/rc.d/nagios start
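Before starting the daemon it can help to run Nagios's own configuration check (the -v option) so that a broken object file does not take monitoring down. The wrapper below is only a sketch: the function name start_if_valid and the two variables are hypothetical, and the paths are the ones used in this guide.

```shell
# Paths from this guide; override them in the environment if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Start the daemon only if "nagios -v" accepts the configuration.
start_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        "$NAGIOS_BIN" -d "$NAGIOS_CFG"
    else
        echo "configuration check failed; run $NAGIOS_BIN -v $NAGIOS_CFG for details" >&2
        return 1
    fi
}

# Usage: start_if_valid
```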

 

10. Add the Fetion (SMS) service

tar zxvf fetion2009.tar.gz

tar zxvf library_linux.tar.gz

cp library_linux/*.* fetion2009/lib/

 

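Test (when running this by hand, substitute a real mobile number for the $CONTACTPAGER$ macro): /usr/local/nagios/fetion2009/fetion --config=/soft/install/login.conf --index=1 --to=$CONTACTPAGER$ --msg-utf8="test"

 

For automatic Fetion alerts to work, the Fetion directory must be owned by nagios:nagios.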

 

 

11. Install NRPE

tar zxvf nrpe-2.12.tar.gz

cd  nrpe-2.12

./configure --prefix=/usr/local/nrpe

make all

make install-plugin

make install-daemon

make install-daemon-config

make install-xinetd

 

 

When building NRPE, configure may stop with the following message:

checking for SSL headers... configure: error: Cannot find ssl headers

The cause is the missing openssl-devel package. To fix it:

yum -y install openssl-devel

 

 

The NRPE versions on the server and the client must match for data to be collected correctly; otherwise errors like the following appear:

CHECK_NRPE: Socket timeout after 10 seconds

Connection refused or timed out

 

 

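After NRPE is installed, its directory /usr/local/nrpe/libexec contains only one file, check_nrpe, and that file is missing from the Nagios plugin directory, so it has to be copied there. Conversely, the plugins that NRPE calls, such as check_disk, are not in NRPE's own directory but do exist among the Nagios plugins, so they have to be copied over from the Nagios directory as well: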

cp /usr/local/nrpe/libexec/check_nrpe  /usr/local/nagios/libexec

cp /usr/local/nagios/libexec/check_disk  /usr/local/nrpe/libexec

cp /usr/local/nagios/libexec/check_load  /usr/local/nrpe/libexec

cp /usr/local/nagios/libexec/check_ping  /usr/local/nrpe/libexec

cp /usr/local/nagios/libexec/check_procs  /usr/local/nrpe/libexec
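The four copies above can also be written as a loop that reports any plugin that is missing instead of failing silently. This is a sketch; the function name copy_plugins is hypothetical, and the directories are the ones from this guide.

```shell
# copy_plugins SRC_DIR DST_DIR PLUGIN...
# Copies each named plugin, keeping its mode, and reports missing ones.
copy_plugins() {
    cp_src=$1
    cp_dst=$2
    shift 2
    cp_status=0
    for cp_name in "$@"; do
        if [ -f "$cp_src/$cp_name" ]; then
            cp -p "$cp_src/$cp_name" "$cp_dst/" || cp_status=1
        else
            echo "missing plugin: $cp_src/$cp_name" >&2
            cp_status=1
        fi
    done
    return $cp_status
}

# Usage:
#   copy_plugins /usr/local/nagios/libexec /usr/local/nrpe/libexec \
#       check_disk check_load check_ping check_procs
```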

 

vi /etc/services

Add: nrpe                  5666/tcp           # NRPE
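If this setup is scripted, the /etc/services entry should be added only once. A minimal sketch, assuming a hypothetical helper named add_nrpe_service that appends the line only when no nrpe 5666/tcp entry exists yet:

```shell
# add_nrpe_service [FILE]
# Appends the NRPE entry to FILE (default /etc/services) unless present.
add_nrpe_service() {
    svc_file=${1:-/etc/services}
    if ! grep -Eq '^nrpe[[:space:]]+5666/tcp' "$svc_file"; then
        printf 'nrpe\t\t5666/tcp\t\t\t# NRPE\n' >> "$svc_file"
    fi
}

# Usage (as root): add_nrpe_service
```

Running it a second time leaves the file unchanged.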

 

 

vi /etc/sysconfig/network

HOSTNAME=192.168.0.7   (set HOSTNAME to the IP address)

 

service xinetd restart

netstat -at | grep nrpe

 

12. Test NRPE

/usr/local/nagios/libexec/check_nrpe -H localhost

If it prints NRPE v2.12, the installation succeeded.

 

vi /etc/sysconfig/iptables

Insert: -I RH-Firewall-1-INPUT -m tcp -p tcp --dport 5666 -j ACCEPT

service iptables restart

 

vi /usr/local/nagios/etc/nrpe.cfg

 

13. Start NRPE

/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg --daemon

 

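The Nagios service is now installed and can be used to monitor hosts and the services running on them.

 

How to monitor Windows hosts

Download NSClient++-Win32-0.3.5.zip

Install NSClient++-Win32-0.3.5.zip on the Windows host:

Unzip it, rename the folder to NSClient, and move it to the root of drive C.

 

Open a DOS (command) prompt:

nsclient++ /install

nsclient++ SysTray   # if this reports an error, ignore it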

 

Edit NSC.ini

In the [modules] section,

remove the comment marker (;) from every line except

CheckWMI.dll and RemoteConfiguration.dll

 

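Set allowed_hosts=210.x.x.x (the IP of the Nagios server).

If you also change the password at this step, commands.cfg on the Nagios server must be changed to match.

 

In the [NSClient] section, uncomment port=12489.

NSClient++ listens on port 12489, so the firewall must allow this port.

 

Then start NSClient: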

nsclient++ /start

 

Configure nagios.cfg

 

vi /usr/local/nagios/etc/nagios.cfg

#cfg_file=/usr/local/nagios/etc/objects/windows.cfg   (uncomment this line)

 

To monitor several hosts, add a configuration file entry for each one, for example:

#cfg_file=/usr/local/nagios/etc/objects/eastnt14.cfg
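Uncommenting a cfg_file line can be scripted when there are many hosts. The sketch below rewrites the file through a temporary copy and touches only the line that matches exactly; the function name enable_cfg_file is hypothetical.

```shell
# enable_cfg_file NAGIOS_CFG OBJECT_FILE
# Removes the leading "#" from the line "#cfg_file=OBJECT_FILE".
enable_cfg_file() {
    ecf_cfg=$1
    ecf_want="cfg_file=$2"
    ecf_tmp=$(mktemp) || return 1
    awk -v want="$ecf_want" '
        {
            stripped = $0
            sub(/^#[ \t]*/, "", stripped)   # drop a leading comment marker
            if (stripped == want) print stripped
            else print $0                   # every other line is kept as is
        }' "$ecf_cfg" > "$ecf_tmp" && cat "$ecf_tmp" > "$ecf_cfg"
    ecf_rc=$?
    rm -f "$ecf_tmp"
    return $ecf_rc
}

# Usage:
#   enable_cfg_file /usr/local/nagios/etc/nagios.cfg \
#       /usr/local/nagios/etc/objects/windows.cfg
```

Writing back with cat rather than mv keeps the original file's owner and permissions.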

 

Configure windows.cfg

 

vi /usr/local/nagios/etc/objects/windows.cfg

 

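define host{

use windows-server

host_name winserver

alias My Windows Server

address <IP of the monitored host>

}

 

Change host_name and address; this is important.

Most of the definitions that follow can be left unchanged. See the official documentation for what each one means.

In every definition below, change host_name to your own value; it must match exactly.

 

Save and exit.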

 

vi  /usr/local/nagios/etc/cgi.cfg

Change

use_authentication=1

to

use_authentication=0

 

Restart Nagios

/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

/usr/local/etc/rc.d/nagios restart

 

If an error occurs at this point, try to telnet to port 12489 on the Windows server's IP.

 

 

Monitoring port 80 (HTTP service)

 

/usr/local/nagios/libexec/check_http -H test.east.net -p 80 -I 192.168.0.17

Returns: HTTP OK: HTTP/1.1 200 OK - 2095 bytes in 0.007 second response time |time=0.007139s;;;0.000000 size=2095B;;;0
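Everything after the | in the plugin output is performance data in label=value;warn;crit;min;max form. As a rough sketch, a value can be pulled out of such a line with awk; the function name perf_value is hypothetical.

```shell
# perf_value "PLUGIN OUTPUT" LABEL
# Prints the value recorded for LABEL in the performance data.
perf_value() {
    printf '%s\n' "$1" | awk -F'|' -v label="$2" '{
        n = split($2, items, " ")          # one item per metric
        for (i = 1; i <= n; i++) {
            split(items[i], kv, "=")       # kv[1] = label, kv[2] = value;warn;crit;...
            if (kv[1] == label) {
                split(kv[2], fields, ";")
                print fields[1]
            }
        }
    }'
}

# Usage:
#   out=$(/usr/local/nagios/libexec/check_http -H test.east.net -p 80 -I 192.168.0.17)
#   perf_value "$out" time
```

With the output line shown above, the time label yields 0.007139s and the size label yields 2095B.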

 

 

Host groups

 

1. Add the host group name to templates.cfg

vi templates.cfg

 

Append the following at the end of the HOST TEMPLATES section:

define host{

        name                    Daiwei          ; The name of this host template

        use                     generic-host    ; Inherit default values from the generic-host template

        check_period            24x7            ; By default, switches are monitored round the clock

        check_interval          2               ; Hosts are checked every 2 minutes

        retry_interval          1               ; Schedule host check retries at 1 minute intervals

        max_check_attempts      10              ; Check each switch 10 times (max)

        check_command           check-host-alive        ; Default command to check if routers are "alive"

        notification_period     24x7            ; Send notifications at any time

        notification_interval   10              ; Resend notifications every 10 minutes

        notification_options    d,r             ; Only send notifications for specific host states

        contact_groups          admins          ; Notifications get sent to the admins by default

        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE

        }

 

 

To group all monitored hosts, add the following to the configuration file of just one of the hosts:

 

vi localhost.cfg

 

Append at the end:

define hostgroup{

        hostgroup_name  linux-server ; The name of the hostgroup

        alias           Linux Servers ; Long name of the group

        members         localhost,web3     ; Comma separated list of hosts that belong to this group

        }

 

vi skysymbol_com_cn.cfg

 

Set use to the host group's name (the Daiwei template defined above):

 

define host{

        use             Daiwei           ; Inherit default values from a template

        host_name       skysymbol.com.cn ; The name we're giving to this host

        alias           skysymbol.com.cn ; A longer name associated with the host

        address         211.100.28.224   ; IP address of the host

        }

 

define service{

        use                     generic-service

        host_name               skysymbol.com.cn

        service_description     Http

        check_command           check_http!-H

        }

 

Finally, append:

 

define hostgroup{

        hostgroup_name  Daiwei          ; The name of the hostgroup

        alias           Daiwei          ; Long name of the group

        members         skysymbol.com.cn,citizen.com.cn

        }

 

 

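Add the host_name of every other group member to members, and make sure each host_name is spelled correctly. A given define hostgroup may appear in only one host configuration file; defining it more than once causes an error when the Nagios configuration is checked.

 

So if you want three groups, group1, group2 and group3, pick one configuration file among each group's member hosts, add the define hostgroup block to it, and list the host_name of every member in members (host1, host2, ...). The Host Groups section of the Nagios status page then shows the grouping, for example: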

 

group1     group2

 

host1        host4

host2        host5

host3        host6
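Because a hostgroup defined in two files breaks the configuration check, it can be worth scanning the object directory first. The sketch below lists any hostgroup_name that appears in more than one define hostgroup block; the function name dup_hostgroups is hypothetical, and it assumes the object files end in .cfg and sit in one directory.

```shell
# dup_hostgroups OBJECT_DIR
# Prints each hostgroup_name that is defined more than once.
dup_hostgroups() {
    awk '
        /^[ \t]*define[ \t]+hostgroup[ \t]*\{/ { in_hg = 1; next }
        /^[ \t]*\}/                            { in_hg = 0 }
        # Only count hostgroup_name inside a hostgroup block, not the
        # hostgroup_name directive used by service definitions.
        in_hg && $1 == "hostgroup_name"        { print $2 }
    ' "$1"/*.cfg | sort | uniq -d
}

# Usage: dup_hostgroups /usr/local/nagios/etc/objects
```

An empty result means every hostgroup is defined exactly once.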

 

 

Adjusting the notification frequency:

vi escalations.cfg

 

# Host notifications

define hostescalation{

    host_name BACKEND_10.75.1.109,BACKEND_10.75.1.108,BACKEND_10.75.1.91,BACKEND_10.69.3.176,BACKEND_10.73.14.229,BACKEND_10.75.1.61,BACKEND_10.54.40.27,BACKEND_10.73.14.45,BACKEND_10.54.40.32,BACKEND_10.75.1.80,BACKEND_10.81.11.27,BACKEND_10.75.1.88

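    first_notification          3 ; change the notification interval from the third notification on

    last_notification           0 ; notification number at which the escalation ends; 0 means it never ends

    notification_interval       120 ; once escalated, notify every 120 minutes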

    contact_groups              test_admin

    }

 

# Service notifications

define serviceescalation{

    host_name BACKEND_10.75.1.109,BACKEND_10.75.1.108,BACKEND_10.75.1.91,BACKEND_10.69.3.176,BACKEND_10.73.14.229,BACKEND_10.75.1.61,BACKEND_10.54.40.27,BACKEND_10.73.14.45,BACKEND_10.54.40.32,BACKEND_10.75.1.80,BACKEND_10.81.11.27,BACKEND_10.75.1.88

    service_description         PING

    first_notification          3

    last_notification           0

    notification_interval       120

    contact_groups              test_admin

    }
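To see what these escalations do to the timing, the arithmetic can be sketched in shell. The assumptions here are mine: the problem stays unresolved and unacknowledged, the hosts' normal notification_interval is 10 minutes (as in the Daiwei template above), and the interval that follows each notification is the escalated one as soon as its number reaches first_notification. The function name notification_times is hypothetical.

```shell
# notification_times COUNT BASE_INTERVAL FIRST_ESCALATED ESCALATED_INTERVAL
# Prints the offset in minutes at which each notification would go out.
notification_times() {
    nt_count=$1
    nt_base=$2
    nt_first=$3
    nt_escalated=$4
    nt_t=0
    nt_n=1
    while [ "$nt_n" -le "$nt_count" ]; do
        echo "notification $nt_n at +${nt_t}min"
        if [ "$nt_n" -ge "$nt_first" ]; then
            nt_t=$((nt_t + nt_escalated))   # escalation is in effect
        else
            nt_t=$((nt_t + nt_base))        # normal interval still applies
        fi
        nt_n=$((nt_n + 1))
    done
}

notification_times 5 10 3 120
```

Under those assumptions the first three notifications go out at 0, 10 and 20 minutes, and the following ones at 140 and 260 minutes.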

 

 

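III. Problems encountered while installing and using Nagios:

1. The command names referenced in contacts.cfg must match the names defined in commands.cfg.

2. The hostgroup referenced in timeperiods.cfg must match the hostgroup_name defined in hosts.cfg.

3. In a host configuration file (for example eastnt.cfg), the use directive of a host definition must name an existing host template such as windows-server or linux-server; otherwise startup fails with an error.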

 

# Definitions for monitoring the local (Linux) host

cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

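4. The line above is commented out by default. Be sure to remove the comment marker; otherwise checking the Nagios configuration reports an error.

5. Add an entry here for each host to be monitored, for example cfg_file=/usr/local/nagios/etc/objects/windows.cfg, and then create windows.cfg under objects. The host is monitored only when the two match one-to-one.

 

6. If the following errors appear, check whether timeperiods.cfg has been disabled in nagios.cfg. That file defines the time periods used for service checks.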

Error: Check period '24x7' specified for service 'Total Processes' on host 'localhost' is not defined anywhere!

Error: Notification period '24x7' specified for service 'Total Processes' on host 'localhost' is not defined anywhere!

 

 

7. If the following error appears, search for generic-contact to see whether other files define the same item. Only one definition may remain.

Duplicate definition found for contact 'generic-contact' (config file '/usr/local/nagios/etc/objects/templates.cfg', starting on line 28)

 

8. If the following error appears, check whether commands.cfg defines the command that the service in services.cfg refers to.

Error: Service check command 'check_nrpe' specified in service 'Root Partition' for host 'ChongQing-SERVER-160' not defined anywhere!

 

For example, if services.cfg contains:

 

define service{

    hostgroup_name                  test-hosts

    service_description             Root Partition

    check_period                    24x7

    max_check_attempts              4

    normal_check_interval           3

    retry_check_interval            2

    contact_groups                  admins

    notification_interval           10

    notification_period             24x7

    notification_options            w,u,c,r

    check_command                   check_nrpe!check_disk1

    }

 

then commands.cfg must define the check_nrpe command:

 

define command{

        command_name check_nrpe

        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p 7877 -c $ARG1$ -t 20

        }

 

9. If the following error appears, reinstall NRPE and add the --enable-command-args option when running configure:

$ /usr/local/nagios/libexec/check_nrpe -H 192.168.1.20 -c check_disk -a 60 80 /dev/sdb1

CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.

 

./configure --prefix=/usr/local/nagios --enable-command-args

 

 

If running /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_disk fails with NRPE: Unable to read output,

run:

$ chmod 755 /usr/local/nagios

 

10. If the error CHECK_NRPE: Error - Could not complete SSL handshake. appears, check allowed_hosts in nrpe.cfg, add the Nagios server's public IP to that whitelist, and restart the NRPE service.
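A quick way to confirm the whitelist on the monitored host is to test whether a given address appears in allowed_hosts. This is a sketch that handles a plain comma-separated list of addresses only; the function name nrpe_allows is hypothetical.

```shell
# nrpe_allows NRPE_CFG IP
# Succeeds if IP is listed in the allowed_hosts line of NRPE_CFG.
nrpe_allows() {
    sed -n 's/^allowed_hosts=//p' "$1" |
        tr ',' '\n' |       # one address per line
        tr -d ' ' |         # tolerate spaces after the commas
        grep -Fxq "$2"
}

# Usage:
#   nrpe_allows /usr/local/nagios/etc/nrpe.cfg 192.168.0.7 || echo "not whitelisted"
```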

 

IV. Installing Ndoutils

1. Install

tar zxf ndoutils-1.5.2.tar.gz

cd ndoutils-1.5.2

./configure --prefix=/usr/local/nagios --with-mysql-lib=/usr/lib/mysql --with-mysql-inc=/usr/include/mysql

make

cp src/ndo2db-3x src/file2sock src/log2ndo src/ndomod-3x.o /usr/local/nagios/bin/

cp config/ndo2db.cfg-sample /usr/local/nagios/etc/ndo2db.cfg

cp config/ndomod.cfg-sample /usr/local/nagios/etc/ndomod.cfg

chown nagios.nagios -R /usr/local/nagios/bin /usr/local/nagios/etc/ndo

 

2. Configure Ndoutils and create the database tables

vi /usr/local/nagios/etc/ndo2db.cfg

Change the following:

db_host=localhost

db_name=nagios

db_prefix=nagios_

db_user=root

db_pass=123456

 

cd db/

mysql -u root -p123456 nagios < mysql.sql

vi /usr/local/nagios/etc/nagios.cfg

Add the following:

event_broker_options=-1

broker_module=/usr/local/nagios/bin/ndomod-3x.o config_file=/usr/local/nagios/etc/ndomod.cfg

 

3. Start ndo2db

/usr/local/nagios/bin/ndo2db-3x -c /usr/local/nagios/etc/ndo2db.cfg

 

4. Restart Nagios

/etc/init.d/nagios restart

 

V. Installing Rrdtool

tar zxvf rrdtool-1.4.7.tar.gz

cd rrdtool-1.4.7

./configure --prefix=/usr/local/rrdtool

make && make install

 

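VI. Installing Centreon

1. Install

tar zxf centreon-2.2.2.tar.gz

cd centreon-2.2.2;./install.sh -i

Follow the prompts to configure it. The details are not covered here because plenty of material is available online; I will add screenshots the next time I install it.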

 

2. To reinstall, first delete the following directories and files to avoid problems:

rm -rf /usr/local/centreon /etc/centreon /var/lib/centreon /etc/httpd/conf.d/centreon.conf

 

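3. Log in

Enter the Centreon address in a browser (the exact address depends on how Apache is configured), then continue configuring Centreon from the page that opens.

 

Once configuration is complete, the login page is displayed and the Centreon deployment is finished. I still need to work through the rest in practice; configuration and usage, bulk-adding hosts and services, email and SMS alerts, distributed deployment and more will be added over time.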
