Category: System Operations

2013-05-10 16:28:06

1. Install the Nagios server:

1.1. Introduction:

1.2. Install the GD library

Add the USE flags:

vi /etc/portage/package.use

media-libs/gd jpeg png

uuwatch_liwei ~ # emerge gd
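The package.use entry above can also be added from a script. The following is a minimal sketch, assuming /etc/portage/package.use is a single file as in this post (on some systems it is a directory). The helper name ensure_line is made up for illustration; it appends the line only when it is not already there, so running it twice does no harm.

```shell
# Append LINE to FILE only if that exact line is not already present.
# Usage: ensure_line FILE LINE   (hypothetical helper, not part of Portage)
ensure_line() {
    file=$1
    line=$2
    touch "$file"
    # -x: match whole lines, -F: treat the line as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example: ensure_line /etc/portage/package.use "media-libs/gd jpeg png"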

1.3. Install Nagios

uuwatch_liwei ~ # emerge nagios

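After a successful install, a new user named nagios is created automatically.

Edit this user's comment field; alert emails sent by this user will then no longer include the comment text.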

nano /etc/passwd           

In the nagios line, delete the comment text "added by portage for nagios-plugins"

and save the line as nagios:x:101:101::/var/nagios/home:/bin/bash
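The same edit can be previewed without opening an editor. This is only a sketch: clear_gecos is a hypothetical helper that prints a passwd-format file with the fifth (comment) field of one user emptied. It writes to standard output and does not modify the file, so the result can be inspected first.

```shell
# Print a passwd-format FILE with the comment (5th) field of USER emptied.
# Usage: clear_gecos USER FILE   (hypothetical helper; output goes to stdout)
clear_gecos() {
    awk -F: -v OFS=: -v user="$1" '
        $1 == user { $5 = "" }   # blank the comment field for the matching user only
        { print }                # every line is printed, changed or not
    ' "$2"
}
```

For example: clear_gecos nagios /etc/passwd | grep '^nagios:'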

 

Add an entry for nagios so that this user can send mail:

nano /etc/ssmtp/revaliases

nagios:deamon@uuwatch.com:smtp.uuwatch.com

Edit the configuration file:

uuwatch_liwei ~ # vi /etc/nagios/objects/contacts.cfg

Set the email field to the address that should receive alerts.

1.4. Install the nagios-plugins package

uuwatch_liwei ~ # emerge nagios-plugins

1.5. Check the configuration file and start Nagios

uuwatch_liwei ~ # nagios -v /etc/nagios/nagios.cfg

uuwatch_liwei ~ # /etc/init.d/nagios start
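Since the check and the start are always run as a pair, they can be chained so that the service is only touched when the check passes. A minimal sketch follows; the function name checked_restart and the three override variables are assumptions for illustration, with defaults taken from the paths used in this post.

```shell
# Defaults follow the paths used in this post; override them if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/etc/nagios/nagios.cfg}
NAGIOS_INIT=${NAGIOS_INIT:-/etc/init.d/nagios}

# Restart Nagios only if "nagios -v" accepts the configuration.
checked_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" > /dev/null; then
        "$NAGIOS_INIT" restart
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}
```

Calling checked_restart after each configuration change avoids restarting into a broken configuration.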

1.6. Edit the Apache configuration (create a Nagios virtual host, copying the template from the source package):

uuwatch_liwei ~ # cp nagios-3.2.3/sample-config/httpd.conf /etc/apache2/vhosts.d/nagios.uuwatch.com.include

Edit the contents as follows:

ScriptAlias /nagios/cgi-bin "/etc/nagios/sbin"

ServerName  test.nagios.uuwatch.com

<Directory "/etc/nagios/sbin">
   Options ExecCGI
   AllowOverride None
   Order allow,deny
   Allow from all
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/lib/nagios/htpasswd.user
   Require valid-user
</Directory>

Alias /nagios "/etc/nagios/share"

DocumentRoot "/etc/nagios/share"

<Directory "/etc/nagios/share">
   Options None
   AllowOverride None
   Order allow,deny
   Allow from all
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/lib/nagios/htpasswd.user
   Require valid-user
</Directory>

1.7. Create a login account

uuwatch_liwei ~ # htpasswd -c /usr/lib/nagios/htpasswd.user  nagios

If the following message appears after logging in:

It appears as though you do not have permission to view information for any of the hosts you requested...

If you believe this is an error, check the HTTP server authentication requirements for accessing this CGI
and check the authorization options in your CGI configuration file.

Solution:

uuwatch_liwei ~ # vi /etc/nagios/cgi.cfg

 

default_user_name=nagios

authorized_for_system_information=nagiosadmin,nagios

authorized_for_configuration_information=nagiosadmin,nagios

authorized_for_system_commands=nagiosadmin,nagios

authorized_for_all_services=nagiosadmin,nagios

authorized_for_all_hosts=nagiosadmin,nagios

authorized_for_all_service_commands=nagiosadmin,nagios

authorized_for_all_host_commands=nagiosadmin,nagios

Or:

uuwatch_liwei ~ # cp -r /etc/nagios/ /etc/nagios_bak

uuwatch_liwei ~ # sed 's/nagiosadmin/nagiosadmin,nagios/g' /etc/nagios_bak/cgi.cfg | grep -v "#" | grep -v "^$" > /etc/nagios/cgi.cfg

uuwatch_liwei ~ #  /etc/init.d/nagios restart
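A global search and replace adds the user a second time if it is run twice. As a rough alternative, the sketch below only touches the authorized_for_* lines and only when the user is missing from the list. The helper name grant_cgi_user is hypothetical, and the result is printed rather than written back, so it can be reviewed before replacing cgi.cfg.

```shell
# Print FILE with USER appended to every authorized_for_* list that lacks it.
# Usage: grant_cgi_user USER FILE   (hypothetical helper; output goes to stdout)
grant_cgi_user() {
    awk -v user="$1" '
        /^authorized_for_[a-z_]+=/ {
            value = substr($0, index($0, "=") + 1)
            n = split(value, names, ",")
            present = 0
            for (i = 1; i <= n; i++)
                if (names[i] == user) present = 1
            if (!present) {
                sep = (value == "") ? "" : ","   # no leading comma on an empty list
                $0 = $0 sep user
            }
        }
        { print }   # comments and other settings pass through untouched
    ' "$2"
}
```

For example: grant_cgi_user nagios /etc/nagios_bak/cgi.cfg > /etc/nagios/cgi.cfg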

Chinese localization:

uuwatch_liwei ~ # tar vjxf nagios-cn-3.2.3.tar.bz2

uuwatch_liwei ~ # cd nagios-cn-3.2.3

uuwatch_liwei nagios-cn-3.2.3 # ./configure

uuwatch_liwei nagios-cn-3.2.3 # make && make install

Problem: Whoops! Error: Could not open CGI config file '/etc/nagios/cgi.cfg'

Solution: usermod -a -G nagios apache

1.8. Enable PHP support in Apache

uuwatch_liwei ~ # vi /etc/conf.d/apache2

APACHE2_OPTS="-D PHP5 -D DEFAULT_VHOST -D INFO -D SSL -D SSL_DEFAULT_VHOST -D LANGUAGE"

1.9. Install and configure NRPE

uuwatch_liwei ~ # emerge nrpe

uuwatch_liwei ~ # emerge openrc

Edit the Nagios configuration so that it can use the check_nrpe plugin:

uuwatch_liwei ~ # vi /etc/nagios/objects/commands.cfg

Add the following lines:

#'check_nrpe' command definition from liwei at 2013/04/17

define command{

        command_name    check_nrpe

        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$

        }
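To see what this definition turns into at run time, it helps to substitute the macros by hand. The sketch below is only an illustration of that substitution, not how Nagios itself is implemented; expand_check_nrpe is a made-up helper, and the plugin directory passed for $USER1$ is an assumption (the real value comes from resource.cfg).

```shell
# Show the command line that results from filling in the three macros.
# Usage: expand_check_nrpe USER1_DIR HOST_ADDRESS ARG1   (hypothetical helper)
expand_check_nrpe() {
    printf '%s\n' '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$' |
        awk -v user1="$1" -v host="$2" -v arg1="$3" '{
            gsub(/\$USER1\$/, user1)        # plugin directory from resource.cfg
            gsub(/\$HOSTADDRESS\$/, host)   # address of the host being checked
            gsub(/\$ARG1\$/, arg1)          # text after the "!" in check_command
            print
        }'
}
```

For example, expand_check_nrpe /usr/lib64/nagios/plugins 192.168.0.88 check_users prints a command that can be pasted into a shell to test the check by hand.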

The server side is now fully installed.

2. Configure the Nagios server:

2.1. Overview of the default configuration files

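cgi.cfg                     Controls CGI access (for example, points to the main Nagios configuration file and grants users access rights)
nagios.cfg                  Main Nagios configuration file
resource.cfg                Variable definitions, also called the resource file; variables defined here, such as $USER1$, can be referenced from other configuration files
objects                     A directory holding many configuration file templates that define Nagios objects
objects/commands.cfg        Command definitions; commands defined here can be referenced from other configuration files
objects/contacts.cfg        Contact and contact group definitions
objects/localhost.cfg       Definitions for monitoring the local host
objects/printer.cfg         Template for monitoring printers; not enabled by default
objects/switch.cfg          Template for monitoring routers; not enabled by default
objects/templates.cfg       Host and service templates that other files can reference
objects/timeperiods.cfg     Time periods during which Nagios monitors
objects/windows.cfg         Template for monitoring Windows hosts; not enabled by default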

2.2. Configuration example

A custom hosts.cfg holds the hosts and host groups to monitor, and services.cfg holds the services to check on those hosts:

hosts.cfg:

define host{

        use             linux-server

        host_name       localhost

        alias           localhost

        address         127.0.0.1

}

 

define host{

        use             linux-server

        host_name       db2

        alias           db2

        address         192.168.0.88

}

define hostgroup{

        hostgroup_name  sa-server

        alias           sa server

        members         localhost,db2

}

services.cfg:

define service{

        use                     local-service

        host_name               localhost,db2

        service_description     PING

        check_command           check_ping!100.0,20%!500.0,60%

}

 

define service{

        use                     local-service

        host_name               localhost,db2

        service_description     SSH

        check_command           check_ssh

}

 

define service{

        use                     local-service

        host_name               localhost,db2

        service_description     SSHD

        check_command           check_tcp!22

}

 

define service{

        use                     local-service

        host_name               localhost,db2

        service_description     http

        check_command           check_http

}

Then edit the main configuration file nagios.cfg and add the following two lines:

cfg_file=/etc/nagios/objects/hosts.cfg

cfg_file=/etc/nagios/objects/services.cfg
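A mistyped path in a cfg_file line is an easy mistake to make here. As a quick sanity check, the sketch below lists every cfg_file entry whose target does not exist; missing_cfg_files is a hypothetical helper and only looks at uncommented lines that begin with cfg_file=.

```shell
# Print each cfg_file= path from NAGIOS_CFG that does not exist as a file.
# Usage: missing_cfg_files NAGIOS_CFG   (hypothetical helper)
missing_cfg_files() {
    sed -n 's/^cfg_file=//p' "$1" | while IFS= read -r path; do
        [ -f "$path" ] || printf '%s\n' "$path"
    done
}
```

For example: missing_cfg_files /etc/nagios/nagios.cfg (no output means every listed file was found).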

Edit the contacts.cfg configuration file:

define contact{

        contact_name                    nagiosadmin            

        use                             generic-contact        

        alias                           Nagios Admin           

        email                           liwei@uuwatch.com    

        }

define contactgroup{

        contactgroup_name       admins

        alias                   Nagios Administrators

        members                 nagiosadmin

        }

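Problem encountered: the monitored hosts keep changing state, sometimes up and sometimes down.

Solution: run ps -ef | grep nagios to check whether several nagios processes are running. Kill them all, then restart Nagios.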
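Counting by eye in ps output is error prone, because Nagios also forks short-lived children to run checks. One rough way to tell them apart, assuming daemonized instances are reparented to PID 1 (which may not hold under every init system), is to count only the nagios processes whose parent is PID 1. The helper name below is made up for illustration.

```shell
# Count lines of "PPID COMMAND" input where the parent is PID 1 and the
# command name is exactly "nagios".
# Usage: ps -eo ppid=,comm= | count_nagios_daemons   (hypothetical helper)
count_nagios_daemons() {
    awk '$1 == 1 && $2 == "nagios" { n++ } END { print n + 0 }'
}
```

A result above 1 suggests that more than one daemon instance is running.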

3. Client configuration

3.1. Install NRPE:

uuwatch_client ~ # emerge nrpe

uuwatch_client ~ # emerge openrc (without it, a "function" error is reported)

3.2. Edit the NRPE configuration file

uuwatch_client ~ # vi /etc/nagios/nrpe.cfg

Optionally add the nrpe service to /etc/services: nrpe   5666/tcp    #nrpe

Change allowed_hosts=127.0.0.1 to allowed_hosts=127.0.0.1,192.168.0.45 (the server's IP).

3.3. Start NRPE

uuwatch_client ~ # /etc/init.d/nrpe start

uuwatch_client ~ # netstat -auntp | grep nrpe

tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN      24070/nrpe
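For use in a script, the same check can be turned into a yes/no answer. This is a small sketch that reads netstat-style lines on standard input and succeeds when one of them is a LISTEN socket on the given port; port_listening is a hypothetical helper and assumes the column layout shown above.

```shell
# Exit 0 if the netstat output on stdin has a LISTEN socket on PORT, else 1.
# Usage: netstat -ltn | port_listening PORT   (hypothetical helper)
port_listening() {
    awk -v port="$1" '
        # column 4 is the local address, column 6 the socket state
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

For example: netstat -ltn | port_listening 5666 && echo "nrpe port is open"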

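4. Client configuration example

Use check_nrpe to monitor the client's disk status, number of logged-in users, swap usage, CPU load, and so on.

4.1. Edit the client's nrpe.cfg as follows: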

log_facility=daemon

pid_file=/var/run/nrpe.pid

server_port=5666

server_address=192.168.0.88 (the client's own address)

nrpe_user=nagios

nrpe_group=nagios

allowed_hosts=127.0.0.1,192.168.0.45 (append the Nagios server's IP address)

 

dont_blame_nrpe=0

debug=0

command_timeout=60

connection_timeout=300

command[check_users]=/usr/lib64/nagios/plugins/check_users -w 5 -c 10

command[check_load]=/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 30,25,20

command[check_md125]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/md125

command[check_zombie_procs]=/usr/lib64/nagios/plugins/check_procs -w 5 -c 10 -s Z

command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 150 -c 200

command[check_swap]=/usr/lib64/nagios/plugins/check_swap -w 20 -c 10

command[check_md127]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/md127

Each command[xxxx] entry gives the command and arguments used to check the monitored service.
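With several similar disk checks it is easy to reuse a command[...] name by accident, which leaves it unclear which definition is used. A minimal sketch for spotting that is shown below; duplicate_nrpe_commands is a hypothetical helper that prints every name defined more than once.

```shell
# Print each command[NAME] that appears more than once in an nrpe.cfg file.
# Usage: duplicate_nrpe_commands NRPE_CFG   (hypothetical helper)
duplicate_nrpe_commands() {
    # keep only the NAME from lines of the form command[NAME]=...
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1" | sort | uniq -d
}
```

For example: duplicate_nrpe_commands /etc/nagios/nrpe.cfg (no output means every name is unique).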

4.2. Edit services.cfg on the server and add the following:

define service{

        use                     local-service

        host_name               db2

        service_description     users

        check_command           check_nrpe!check_users

}

 

define service{

        use                     local-service

        host_name               db2

        service_description     load

        check_command           check_nrpe!check_load

}

 

define service{

        use                     local-service

        host_name               db2

        service_description     disk

        check_command           check_nrpe!check_md127

}

 

define service{

        use                     local-service

        host_name               db2

        service_description     swap

        check_command           check_nrpe!check_swap

}

define servicegroup{

        servicegroup_name       servergroup

        alias                   server-group

        members                 db2,PING,db2,SSH,db2,SSHD,db2,http,db2,users,db2,disk,db2,swap,db2,load,localhost,PING,localhost,SSH,localhost,SSHD

}

The xxxx in "check_command check_nrpe!xxxx" in services.cfg must match the xxxx of a command[xxxx] entry in the client's nrpe.cfg.
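That correspondence can be checked mechanically. The sketch below assumes a copy of the client's nrpe.cfg is available next to services.cfg (in practice the two files live on different machines), and missing_nrpe_commands is a made-up helper name. It prints every name used with check_nrpe! that has no matching command[...] entry.

```shell
# Print names referenced as check_nrpe!NAME in SERVICES_CFG that have no
# command[NAME] entry in NRPE_CFG.
# Usage: missing_nrpe_commands SERVICES_CFG NRPE_CFG   (hypothetical helper)
missing_nrpe_commands() {
    wanted=$(sed -n 's/.*check_nrpe!\([A-Za-z0-9_]*\).*/\1/p' "$1" | sort -u)
    defined=$(sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$2" | sort -u)
    for name in $wanted; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        printf '%s\n' "$defined" | grep -qxF -- "$name" || echo "$name"
    done
}
```

No output means every check referenced on the server is defined on the client.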

Restart the Nagios server: /etc/init.d/nagios restart

 

5. Alert configuration

5.1. Email alerts

5.1.1. Install sendEmail

uuwatch_liwei ~ # wget

uuwatch_liwei ~ # tar vzxf sendEmail-v1.56.tar.gz

uuwatch_liwei ~ # cp -a sendEmail-v1.56/sendEmail /usr/local/bin/

uuwatch_liwei ~ # chmod 755 /usr/local/bin/sendEmail

5.1.2. Edit the Nagios configuration file commands.cfg

uuwatch_liwei ~ # vi /etc/nagios/objects/commands.cfg

# 'notify-host-by-email' command definition ----------modify by liwei at 2013/04/17

define command{

        command_name    notify-host-by-email

 

        command_line    /usr/bin/printf "%b" "Host: $HOSTNAME$\nNotification: $HOSTNOTIFICATIONNUMBER$\nCommand: $HOSTCHECKCOMMAND$\nDatetime: $LONGDATETIME$\n\nInfo: $HOSTOUTPUT$\n$LONGHOSTOUTPUT$" | /usr/local/bin/sendEmail -f liwei_linux@hotmail.com -t $CONTACTEMAIL$ -s smtp.live.com -u "Host $HOSTSTATE$: $HOSTADDRESS$" -xu liwei_linux@hotmail.com -xp xxxxxxx -o message-content-type=html -o message-charset=utf8

        }

 

# 'notify-service-by-email' command definition-------modify by liwei at 2013/04/17

define command{

        command_name    notify-service-by-email

        command_line    /usr/bin/printf "%b" "Host: $HOSTALIAS$\nNotifyTimes: $SERVICENOTIFICATIONNUMBER$\nCommand: $SERVICECHECKCOMMAND$\nDatetime: $LONGDATETIME$\n\nAdditional Info:\n$SERVICEOUTPUT$\n$LONGSERVICEOUTPUT$" | /usr/local/bin/sendEmail -f liwei_linux@hotmail.com -t $CONTACTEMAIL$ -s smtp.live.com -u "Service $SERVICESTATE$: $HOSTADDRESS$ | $SERVICEDESC$" -xu liwei_linux@hotmail.com -xp xxxx -o message-content-type=html -o message-charset=utf8

        }

After editing, restart the Nagios server, then stop nrpe on the client to test whether an email arrives.
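The message body in these definitions is built by printf "%b", which expands backslash escapes such as \n in its argument. To preview the layout without waiting for a real alert, something like the sketch below can be run in a shell; format_host_alert is a hypothetical helper and its three fields are a shortened stand-in for the macros used above.

```shell
# Print a sample alert body the same way the notification command builds it.
# Usage: format_host_alert HOSTNAME STATE OUTPUT   (hypothetical helper)
format_host_alert() {
    # %b makes printf turn each \n in the argument into a real line break
    printf '%b' "Host: $1\nState: $2\n\nInfo: $3\n"
}
```

For example, format_host_alert db2 DOWN "CRITICAL - Host Unreachable" prints the body, which can then be piped to sendEmail by hand.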


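5.2. SMS alerts (China Mobile subscribers only):

5.2.1. Install fetion

Download the package:

uuwatch_liwei ~ # wget

Download the latest fetion main program:

Look for fetion under "nagios所需" (Nagios requirements), or download the Linux build of the fetion program.

Download the robot support libraries:

uuwatch_liwei ~ # wget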

uuwatch_liwei ~ # tar vzxf fetion20091117-linux.tar.gz

uuwatch_liwei ~ # mv fx/ /etc/fetion

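Extract linuxso_20101113.rar on a Windows machine and upload the contents to /usr/local/fetion/ on the server, or simply extract the archive I repacked myself, fetion-so.tar.gz:

uuwatch_liwei ~ # tar vzxf fetion-so.tar.gz -C /usr/local/fetion

This places the files from fetion-so in /usr/local/fetion.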

uuwatch_liwei ~ # chmod 755 /usr/local/fetion/fetion

uuwatch_liwei ~ # cp /usr/local/fetion/libcrypto.so.4 /usr/local/fetion/libssl.so.4 /usr/lib/

uuwatch_liwei ~ # cp /usr/local/fetion/libACE* /usr/lib/

uuwatch_liwei ~ # vi /etc/ld.so.conf

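Add:

/etc/fetion

uuwatch_liwei ~ # ldd /usr/local/fetion/fetion (checks that the libraries resolve; the setup is correct if nothing is reported as "not found")

Test whether an SMS can be sent:

uuwatch_liwei ~ # /usr/local/fetion/fetion --mobile=13661133354 --pwd=password --to=13661133354 --msg-utf8="test" --debug

A CAPTCHA appears at this point. To get past it, find the image in /usr/local/fetion, download it to a Windows client to read the code, and type the code in. It only has to be entered once and is not asked for again unless fetion is reinstalled.

5.2.2. Edit the Nagios configuration file commands.cfg to define the fetion send commands: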

uuwatch_liwei ~ # vi /etc/nagios/objects/commands.cfg

#'notify-host-by-sms' command definition-------- from liwei at 2013/04/17

define command{

        command_name    notify-host-by-sms

        command_line /usr/local/fetion/fetion  --mobile=13661133354 --pwd=passwd --to=$CONTACTPAGER$ --msg-utf8="Host $HOSTSTATE$ alert for $HOSTNAME$! on '$DATETIME$'"

}

#'notify-service-by-sms' command definition-------- from liwei at 2013/04/17

define command{

        command_name    notify-service-by-sms

        command_line /usr/local/fetion/fetion  --mobile=13661133354 --pwd=passwd --to=$CONTACTPAGER$ --msg-utf8="$HOSTADDRESS$ $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$"

}
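If a contact has no pager value, $CONTACTPAGER$ expands to nothing and fetion is called with an empty --to. One possible guard is a small wrapper that refuses to send unless the recipient looks like an 11-digit mobile number. This is only a sketch: send_sms and FETION_BIN are names made up here, and the fetion options are the ones already used in this post.

```shell
# Path to the fetion binary as installed in this post; override if needed.
FETION_BIN=${FETION_BIN:-/usr/local/fetion/fetion}

# Send MESSAGE to RECIPIENT only if RECIPIENT is exactly 11 digits.
# Usage: send_sms SENDER PASSWORD RECIPIENT MESSAGE   (hypothetical wrapper)
send_sms() {
    sender=$1
    password=$2
    recipient=$3
    message=$4
    case $recipient in
        ''|*[!0-9]*)
            echo "invalid pager number: '$recipient'" >&2
            return 2 ;;
    esac
    if [ "${#recipient}" -ne 11 ]; then
        echo "pager number must have 11 digits: '$recipient'" >&2
        return 2
    fi
    "$FETION_BIN" --mobile="$sender" --pwd="$password" --to="$recipient" --msg-utf8="$message"
}
```

A command_line could then call this wrapper from a script file instead of invoking fetion directly.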

5.2.3. Edit /etc/nagios/objects/templates.cfg

define contact{

        name                            generic-contact        

        service_notification_period     24x7                   

        host_notification_period        24x7                  

        service_notification_options    w,u,c,r,f,s            

        host_notification_options       d,u,r,f,s             

        service_notification_commands notify-service-by-email,notify-service-by-sms   

        host_notification_commands      notify-host-by-email,notify-host-by-sms

        register                        0                     

        }

5.2.4. Edit /etc/nagios/objects/contacts.cfg

uuwatch_liwei ~ # vi /etc/nagios/objects/contacts.cfg

define contact{

        contact_name                    nagiosadmin          

        use                             generic-contact    

        alias                           Nagios Admin           

        email                           liwei@uuwatch.com         

        pager                           13661133354

        }

 

 

 
