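I. Nagios Overview
Nagios is open-source monitoring software that runs on Linux and UNIX platforms. It monitors host status, network status, a range of system problems, log anomalies, and more, and it offers three alerting channels: the web interface, email, and SMS.
Nagios consists of two main parts: the core and the plugins. The core provides only a small part of the monitoring functionality; the plugins provide most of it.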
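II. Nagios Server Installation (192.168.1.125)
1. Package download locations
Nagios server:
Chinese localization patch:
Monitored Linux host:
Monitored Windows host:
2. Prepare the software packages
Since version 3.1.x, Nagios requires PHP to serve the web monitoring interface.
    yum -y install httpd gcc glibc glibc-common gd gd-devel php
    service httpd start
    chkconfig --level 35 httpd on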
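3. Configure Apache
In /etc/httpd/conf/httpd.conf, change the user and group that the Apache processes run as to nagios. The nagios account itself is created in step 5, so restart Apache only after that step. Change
    User apache
    Group apache
to:
    User nagios
    Group nagios
Change
    DirectoryIndex index.html index.html.var
to:
    DirectoryIndex index.html index.php
Then add the following line:
    AddType application/x-httpd-php .php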
For security, access to the Nagios web interface should require authentication. Add the following settings at the end of httpd.conf:
    #setting for nagios
    ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
    <Directory "/usr/local/nagios/sbin">
        AuthType Basic
        Options ExecCGI
        AllowOverride None
        Order allow,deny
        Allow from all
        AuthName "Nagios Access"
        AuthUserFile /usr/local/nagios/etc/htpasswd
        Require valid-user
    </Directory>
    Alias /nagios "/usr/local/nagios/share"
    <Directory "/usr/local/nagios/share">
        AuthType Basic
        Options None
        AllowOverride None
        Order allow,deny
        Allow from all
        AuthName "nagios Access"
        AuthUserFile /usr/local/nagios/etc/htpasswd
        Require valid-user
    </Directory>
4. Create the Apache authentication file and the account used to log in to the Nagios web pages:
    htpasswd -c /usr/local/nagios/etc/htpasswd sxm
    New password: sxm123
    Re-type new password: sxm123
    Adding password for user sxm
Apache startup message and fix:
When Apache starts it may report: httpd: Could not reliably determine the server's fully qualified domain name
    vim /etc/httpd/conf/httpd.conf
Change
    #ServerName
to
    ServerName localhost:80
5. Install Nagios
    /usr/sbin/useradd -s /sbin/nologin nagios
    mkdir /usr/local/nagios
    chown -R nagios:nagios /usr/local/nagios
    tar -zxvf nagios-3.4.3.tar.gz
    cd nagios
    ./configure --prefix=/usr/local/nagios   # install Nagios under /usr/local/nagios
    make all
    make install               # install the main program, CGIs, and HTML files
    make install-init          # create the nagios init script under /etc/rc.d/init.d/
    make install-commandmode   # set permissions on the external command directory
    make install-config        # install the sample configuration files into /usr/local/nagios/etc
    chkconfig --add nagios
    chkconfig --level 35 nagios on   # start nagios at boot
    service nagios start
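After a successful installation, /usr/local/nagios contains six directories (two subdirectories of var are listed as well):
bin            Nagios executables
etc            Nagios configuration files
libexec        Nagios plugins
sbin           Nagios CGI files, including those needed to run external commands
share          Nagios web page files
var            Nagios log files, lock files, and similar
var/archives   automatically archived Nagios logs
var/rw         the external command file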
Startup error after creating the nagios user, and the fix:
With the newly created user, starting Nagios reports "Starting nagios:This account is currently not available."
    [root@nagios conf]# service nagios restart
    Running configuration check...done.
    Stopping nagios: done.
    Starting nagios:This account is currently not available.
    done.
The message comes from the /sbin/nologin shell that was assigned to the account in step 5. Edit /etc/passwd:
    vim /etc/passwd
Change
    nagios:x:500:500::/home/nagios:/sbin/nologin
to
    nagios:x:500:500::/home/nagios:/bin/bash
Restart Nagios:
    [root@nagios conf]# service nagios restart
    Running configuration check...done.
    Stopping nagios: done.
    Starting nagios: done.
Two commands that are useful from here on:
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg   # verify the Nagios configuration files
    /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg   # start Nagios in daemon mode
6. Install the Nagios plugins
    tar -zxvf nagios-plugins-2.1.1.tar.gz
    cd nagios-plugins-2.1.1
    ./configure --prefix=/usr/local/nagios
    make
    make install
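7. Configure Nagios
Configuration files under /usr/local/nagios/etc:
cgi.cfg                  controls access to the CGIs
nagios.cfg               main Nagios configuration file
resource.cfg             resource file; macros defined here, such as $USER1$, can be referenced from other configuration files
objects                  directory holding many configuration file templates used to define Nagios objects
objects/commands.cfg     command definitions; the commands defined here can be referenced from other configuration files
Example: add a command, built on check_http, that monitors a web page
check_http parameters: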
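    Usage: check_http -H <vhost> | -I <IP-address> [-u <uri>] [-p <port>]
           [-J <client certificate file>] [-K <private key>]
           [-w <warn time>] [-c <critical time>] [-t <timeout>] [-L] [-E] [-a auth]
           [-b proxy_auth] [-f <ok|warning|critical|follow|sticky|stickyport>]
           [-e <expect>] [-d string] [-s string] [-l] [-r <regex> | -R <case-insensitive regex>]
           [-P string] [-m <min_pg_size>:<max_pg_size>] [-4|-6] [-N] [-M <age>]
           [-A string] [-k string] [-S <version>] [--sni] [-C <warn_age>[,<crit_age>]]
           [-T <content-type>] [-j method]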
7.1. Define the check_http_word command in commands.cfg
    # -I host IP address, -u URL, -p port, -s keyword expected in the page
    define command{
        command_name check_http_word
        command_line $USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ -s $ARG3$
    }
7.2. Test the command by hand
Check whether the page contains the keyword "welcome":
    # /usr/local/nagios/libexec/check_http -I 192.168.1.124 -u /index.html -p 80 -s "welcome"
    HTTP OK: HTTP/1.1 200 ok - 280 bytes in 0.005 second response time |time=0.004636s;;;0.000000 size=280B;;;0
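The output line above has two parts: the text before the "|" is the status message, and the text after it is performance data in label=value;warn;crit;min;max form. The following is a minimal sketch of how the two parts could be separated in a script. The function names plugin_status and plugin_perfdata are made up for this illustration, and labels that contain spaces or quotes are not handled.

```shell
# Split one line of plugin output into status text and performance data.
# plugin_status and plugin_perfdata are hypothetical helpers for illustration.

plugin_status() {
    # Keep everything before the first "|", then trim trailing spaces.
    printf '%s\n' "${1%%|*}" | sed 's/[[:space:]]*$//'
}

plugin_perfdata() {
    # Print each perfdata item as "label value", one per line.
    case $1 in
        *'|'*) ;;
        *) return 0 ;;          # no perfdata in this line
    esac
    set -f                      # do not glob-expand the items
    for item in ${1#*|}; do
        label=${item%%=*}       # text before "="
        rest=${item#*=}         # value;warn;crit;min;max
        value=${rest%%;*}       # value only
        printf '%s %s\n' "$label" "$value"
    done
    set +f
}

line='HTTP OK: HTTP/1.1 200 ok - 280 bytes in 0.005 second response time |time=0.004636s;;;0.000000 size=280B;;;0'
plugin_status "$line"
plugin_perfdata "$line"
```

For the sample line this prints the status text followed by "time 0.004636s" and "size 280B".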
7.3. Define the service in services.cfg
host_name must match a host defined in hosts.cfg; web2 is the host at 192.168.1.124.
    define service{
        use                  local-service
        host_name            web2
        service_description  http_word
        check_command        check_http_word!/index.html!80!welcome
    }
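In check_command, the "!" characters separate the command name from its arguments, and the arguments fill $ARG1$, $ARG2$, $ARG3$ in the command_line from 7.1. The sketch below imitates that substitution so the mapping is easy to see. It is only an illustration: expand_check_command is a hypothetical helper, not something Nagios ships, and it assumes the arguments contain no "|" or "&" characters. $USER1$ and $HOSTADDRESS$ are left untouched because Nagios fills them from resource.cfg and the host definition.

```shell
# Imitate how $ARGn$ macros are filled from a check_command value.
# expand_check_command is a hypothetical helper for illustration only.
expand_check_command() {
    template=$1          # command_line from the command definition
    check_command=$2     # e.g. check_http_word!/index.html!80!welcome

    # Drop the command name (text before the first "!"), keep the arguments.
    case $check_command in
        *'!'*) args=${check_command#*!} ;;
        *)     args= ;;
    esac

    # Split the arguments on "!" into the positional parameters.
    old_ifs=$IFS
    IFS='!'
    set -f
    set -- $args
    set +f
    IFS=$old_ifs

    # Replace the literal text $ARG1$, $ARG2$, ... in order.
    n=1
    for arg in "$@"; do
        template=$(printf '%s\n' "$template" | sed "s|\\\$ARG$n\\\$|$arg|g")
        n=$((n + 1))
    done
    printf '%s\n' "$template"
}

expand_check_command \
    '$USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ -s $ARG3$' \
    'check_http_word!/index.html!80!welcome'
```

The result ends in "-u /index.html -p 80 -s welcome", which matches the command run by hand in 7.2.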
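7.4. Restart Nagios, then look for the newly added http_word service (which runs check_http_word) on the Services page of the Nagios web interface.
The remaining files under /usr/local/nagios/etc/objects:
objects/contacts.cfg     defines contacts and contact groups
objects/localhost.cfg    defines the checks for the local host
objects/printer.cfg      template for monitoring printers; not enabled by default
objects/switch.cfg       template for monitoring routers and switches; not enabled by default
objects/templates.cfg    host and service templates that other configuration files can reference
objects/timeperiods.cfg  defines the time periods used for monitoring
objects/windows.cfg      template for monitoring Windows hosts; not enabled by default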
Create a new file, hosts.cfg:
    define host{
        use        linux-server
        host_name  web1
        alias      sxm-web1
        address    192.168.1.123
    }

    define host{
        use        linux-server
        host_name  web2
        alias      sxm-web2
        address    192.168.1.124
    }

    define host{
        use        linux-server
        host_name  mysql
        alias      sxm-mysql
        address    192.168.1.126
    }

    define hostgroup{
        hostgroup_name  sxm-nagios
        alias           sxm nagios
        members         web1,web2,mysql
    }
Create a new file, services.cfg:
    ################# web #####################
    define service{
        use                  local-service
        host_name            web1
        service_description  PING
        check_command        check_ping!100.0,20%!500.0,60%
    }

    define service{
        use                  local-service
        host_name            web1
        service_description  SSH
        check_command        check_ssh
    }

    define service{
        use                  local-service
        host_name            web1
        service_description  ftp
        check_command        check_tcp!21
    }

    define service{
        use                  local-service
        host_name            web1
        service_description  http
        check_command        check_http
    }

    define service{
        use                  local-service
        host_name            web2
        service_description  PING
        check_command        check_ping!100.0,20%!500.0,60%
    }

    define service{
        use                  local-service
        host_name            web2
        service_description  SSH
        check_command        check_ssh
    }

    define service{
        use                  local-service
        host_name            web2
        service_description  ftp
        check_command        check_tcp!21
    }

    define service{
        use                  local-service
        host_name            web2
        service_description  http
        check_command        check_tcp!80
    }

    ################# mysql #####################
    define service{
        use                  local-service
        host_name            mysql
        service_description  PING
        check_command        check_ping!100.0,20%!500.0,60%
    }

    define service{
        use                  local-service
        host_name            mysql
        service_description  SSH
        check_command        check_ssh
    }

    define service{
        use                  local-service
        host_name            mysql
        service_description  ftp
        check_command        check_ftp
    }

    define service{
        use                  local-service
        host_name            mysql
        service_description  mysqlport
        check_command        check_tcp!3306
    }
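Most of the definitions above differ only in host_name, service_description, and check_command, so the repeated blocks can be generated instead of typed. The following is one possible sketch rather than part of the original setup. emit_service is a hypothetical helper, and the output is written to standard output so it can be reviewed before being appended to services.cfg.

```shell
# Emit one service definition in the same layout as services.cfg.
# emit_service is a hypothetical helper; local-service is the template
# name already used throughout services.cfg.
emit_service() {
    printf 'define service{\n'
    printf '    use                  local-service\n'
    printf '    host_name            %s\n' "$1"
    printf '    service_description  %s\n' "$2"
    printf '    check_command        %s\n' "$3"
    printf '}\n\n'
}

# The PING and SSH checks are identical on every host.
for host in web1 web2 mysql; do
    emit_service "$host" PING 'check_ping!100.0,20%!500.0,60%'
    emit_service "$host" SSH  'check_ssh'
done
```

Nagios object definitions also accept a comma-separated host_name list or a hostgroup_name directive in a service definition, which avoids the repetition without any generator; the loop is shown here only because it keeps one block per host, as in the file above.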
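Create a new file, servicegroup.cfg. The members value is a flat list of host,service pairs; the users, load, disk, and swap services on web2 are the ones defined in step 9.
    define servicegroup{
        servicegroup_name  servicegroup
        alias              service_group
        members            web1,PING,web1,SSH,web1,http,web2,PING,web2,SSH,web2,http,web2,users,web2,load,web2,disk,web2,swap
    }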
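Because members alternates host and service names in a single comma-separated list, a dropped entry shifts every pair after it. A small sketch such as the following can print the pairs for a quick visual check before Nagios is restarted. list_member_pairs is a hypothetical helper written for this illustration; it does not check that the hosts or services actually exist.

```shell
# Print the host,service pairs encoded in a servicegroup "members" value.
# list_member_pairs is a hypothetical helper for checking the list by eye.
list_member_pairs() {
    # Split the value on commas into the positional parameters.
    old_ifs=$IFS
    IFS=','
    set -f
    set -- $1
    set +f
    IFS=$old_ifs

    # The list must alternate host, service, host, service, ...
    if [ $(( $# % 2 )) -ne 0 ]; then
        echo "odd number of entries: a host or service name is missing" >&2
        return 1
    fi
    while [ $# -gt 0 ]; do
        printf '%s -> %s\n' "$1" "$2"
        shift 2
    done
}

list_member_pairs 'web1,PING,web1,SSH,web2,http'
```

Each output line shows one host and the service paired with it, for example "web1 -> PING".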
Edit cgi.cfg:
    default_user_name=sxm
    authorized_for_system_information=nagiosadmin,sxm
    authorized_for_configuration_information=nagiosadmin,sxm
    authorized_for_system_commands=sxm
    authorized_for_all_services=nagiosadmin,sxm
    authorized_for_all_hosts=nagiosadmin,sxm
    authorized_for_all_service_commands=nagiosadmin,sxm
    authorized_for_all_host_commands=nagiosadmin,sxm
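Edit nagios.cfg so that it loads the object files. servicegroup.cfg is assumed here to sit next to hosts.cfg and services.cfg; without a cfg_file line for it, the service group is never loaded.
    cfg_file=/usr/local/nagios/etc/hosts.cfg
    cfg_file=/usr/local/nagios/etc/services.cfg
    cfg_file=/usr/local/nagios/etc/servicegroup.cfg
    cfg_file=/usr/local/nagios/etc/objects/commands.cfg
    cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
    cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
    cfg_file=/usr/local/nagios/etc/objects/templates.cfg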
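A cfg_file entry that points at a file that does not exist makes the configuration check fail, so it can help to list the entries and confirm each file is present before running nagios -v. The following is a minimal sketch under that assumption. check_cfg_files is a hypothetical helper; it only tests for existence, skips commented-out lines, and does not handle paths that contain spaces.

```shell
# List the cfg_file entries of a main config file and flag missing files.
# check_cfg_files is a hypothetical helper; it does not validate contents.
check_cfg_files() {
    main_cfg=$1
    missing=0
    # Only lines that start with cfg_file= are read, so "#cfg_file=" is skipped.
    for path in $(sed -n 's/^[[:space:]]*cfg_file=//p' "$main_cfg"); do
        if [ -f "$path" ]; then
            echo "ok       $path"
        else
            echo "missing  $path"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, assuming the install prefix used in this article:
#   check_cfg_files /usr/local/nagios/etc/nagios.cfg
```

The function returns a non-zero status when at least one file is missing, so it can be chained in front of the real verification with nagios -v.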
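8. Install the NRPE add-on on the Nagios server
NRPE is an extension to Nagios. The NRPE daemon and the Nagios plugins installed on a remote server report that server's local state, such as CPU load, memory usage, and disk usage, back to the Nagios server.
    [root@nagios tmp]# tar -zxvf nrpe-2.13.tar.gz
    [root@nagios tmp]# cd nrpe-2.13
    [root@nagios nrpe-2.13]# ./configure
    [root@nagios nrpe-2.13]# make all
    [root@nagios nrpe-2.13]# make install-plugin
    [root@nagios nrpe-2.13]# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.124
    NRPE v2.13
The last command returns the version only once the NRPE daemon is running on 192.168.1.124 (see section III).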
If building NRPE fails with "checking for SSL headers... configure: error: Cannot find ssl headers", the openssl-devel package is missing. Install it:
    yum -y install openssl-devel
9. Define a check_nrpe command
In /usr/local/nagios/etc/objects/commands.cfg:
    define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }
Add the checks for the remote host
In /usr/local/nagios/etc/services.cfg. The argument after check_nrpe! must match the name of a command[...] entry in nrpe.cfg on the monitored host:
    define service{
        use                  local-service
        host_name            web2
        service_description  users
        check_command        check_nrpe!check_users
    }

    define service{
        use                  local-service
        host_name            web2
        service_description  load
        check_command        check_nrpe!check_load
    }

    define service{
        use                  local-service
        host_name            web2
        service_description  disk
        check_command        check_nrpe!check_sda3
    }

    define service{
        use                  local-service
        host_name            web2
        service_description  swap
        check_command        check_nrpe!check_swap
    }
III. Configure the Nagios Client (192.168.1.124)
1. Install the Nagios plugins
    [root@web2 tmp]# useradd -s /sbin/nologin nagios
    [root@web2 tmp]# yum -y install gcc glibc glibc-common gd gd-devel openssl-devel
    [root@web2 tmp]# tar -zxvf nagios-plugins-2.1.1.tar.gz
    [root@web2 tmp]# cd nagios-plugins-2.1.1
    [root@web2 nagios-plugins-2.1.1]# ./configure
    [root@web2 nagios-plugins-2.1.1]# make
    [root@web2 nagios-plugins-2.1.1]# make install
    [root@web2 ~]# chown nagios:nagios /usr/local/nagios
    [root@web2 ~]# chown -R nagios:nagios /usr/local/nagios/libexec
2. Install NRPE
    [root@web2 tmp]# tar -zxvf nrpe-2.13.tar.gz
    [root@web2 tmp]# cd nrpe-2.13
    [root@web2 nrpe-2.13]# ./configure
    [root@web2 nrpe-2.13]# make all
    [root@web2 nrpe-2.13]# make install-plugin
    [root@web2 nrpe-2.13]# make install-daemon
    [root@web2 nrpe-2.13]# make install-daemon-config
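3. Configure NRPE
In /usr/local/nagios/etc/nrpe.cfg, change
    allowed_hosts=127.0.0.1
to the following, keeping the comment on its own line so it does not become part of the value:
    # 192.168.1.125 is the Nagios monitoring server
    allowed_hosts=127.0.0.1,192.168.1.125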
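If check_nrpe run from the Nagios server is refused, a first thing to confirm is that the server's address really is in allowed_hosts. The sketch below does that with plain string matching. host_is_allowed is a hypothetical helper for this illustration; it compares literal addresses only and does not interpret network ranges or hostnames.

```shell
# Report whether an address appears literally in allowed_hosts.
# host_is_allowed is a hypothetical helper; it does plain string matching.
host_is_allowed() {
    cfg=$1
    addr=$2
    # Take the last allowed_hosts= line and remove any spaces.
    list=$(sed -n 's/^allowed_hosts=//p' "$cfg" | tail -n 1 | tr -d ' ')
    # Wrap both sides in commas so 192.168.1.12 does not match 192.168.1.125.
    case ",$list," in
        *",$addr,"*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example call, assuming the paths and addresses used in this article:
#   host_is_allowed /usr/local/nagios/etc/nrpe.cfg 192.168.1.125 && echo allowed
```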
4. Start the NRPE daemon
    [root@web2 nrpe-2.13]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
    [root@web2 nrpe-2.13]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
    NRPE v2.13    # expected result
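5. Define what to monitor on the server
Define the following commands in /usr/local/nagios/etc/nrpe.cfg. The names in brackets must match the arguments passed to check_nrpe on the Nagios server, and the NRPE daemon has to be restarted after the file is edited.
    # number of users currently logged in to the remote server
    command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
    # CPU load of the remote server
    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
    # disk usage of the remote server; set the device to the partition you want to watch
    command[check_sda3]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda3
    # zombie processes on the remote server
    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
    # total number of processes on the remote server
    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
    # free swap space on the remote server
    command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%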
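Each command above passes a warning threshold (-w) and a critical threshold (-c) to a plugin, and the plugin reports its result to Nagios through its exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. The following minimal sketch shows that convention for a "higher is worse" value such as the user count. check_threshold is a hypothetical helper written for this illustration, not one of the shipped plugins, and it accepts non-negative integers only.

```shell
# Compare a value with -w/-c style thresholds and return the exit code
# Nagios expects: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
# check_threshold is a hypothetical helper, not a shipped plugin.
check_threshold() {
    label=$1
    value=$2
    warn=$3
    crit=$4

    # Anything that is not a non-negative integer is reported as UNKNOWN.
    for n in "$value" "$warn" "$crit"; do
        case $n in
            ''|*[!0-9]*)
                echo "$label UNKNOWN - expected integer arguments"
                return 3 ;;
        esac
    done

    if [ "$value" -gt "$crit" ]; then
        echo "$label CRITICAL - $value"
        return 2
    elif [ "$value" -gt "$warn" ]; then
        echo "$label WARNING - $value"
        return 1
    fi
    echo "$label OK - $value"
    return 0
}

# Seven users against -w 5 -c 10 is above the warning threshold only.
check_threshold USERS 7 5 10 || echo "exit code $?"
```

For checks of free space, such as check_disk and check_swap with -w 20% -c 10%, the comparison runs the other way: the alert is raised when the free amount falls below the threshold.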