
2015-12-02 18:05:08

A High-Availability Web Cluster Based on LVS + keepalived

1. Environment
To keep the setup simple, no extra virtual machines are used as web servers; instead, each web server runs on the same host as its master/backup node:
node1: 192.168.85.144 (MASTER, httpd)
node2: 192.168.85.145 (BACKUP, httpd)
VIP:   192.168.85.128

2. Install keepalived, httpd, and ipvsadm on node1 and node2 (ipvsadm is only needed to get a better view of the cluster state):
[root@node1 ~]# yum install keepalived httpd ipvsadm -y

3. Provide a web test page on each node:
[root@node2 ~]# cat /var/www/html/index.html ; ssh node1 'cat /var/www/html/index.html'
node2.a.com
node1.a.com

For now, stop the httpd service on both nodes;

4. Configure keepalived.conf on node1:
[root@node1 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     root@localhost
   }
   notification_email_from keepalived@localhost
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

# Health check: exit 0 (healthy) while httpd's pid file exists
vrrp_script chk_httpd {
        script "if [ -f /var/run/httpd/httpd.pid ]; then exit 0; else exit 1; fi"
        interval 2        # run the check every 2 seconds
        fall 2            # 2 consecutive failures mark the script as failed
        rise 1            # 1 success marks it healthy again
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Run a notify script on each state transition (the scripts are shown below)
    notify_master "/etc/keepalived/master.sh"
    notify_backup "/etc/keepalived/backup.sh"
    notify_fault "/etc/keepalived/fault.sh"
   
    track_script {
    chk_httpd
  }
    virtual_ipaddress {
        192.168.85.128
    }
}

# LVS virtual server for the VIP
virtual_server 192.168.85.128 80 {
    delay_loop 6               # health-check interval in seconds
    lb_algo rr                 # round-robin scheduling
    lb_kind DR                 # direct-routing forwarding
    nat_mask 255.255.255.0
    persistence_timeout 50     # keep a client on the same real server for 50s
    protocol TCP

    real_server 192.168.85.144 80 {
        weight 3
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.85.145 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}

Create three .sh scripts so that keepalived's role transitions can be tracked:
[root@node1 keepalived]# cat master.sh ; cat backup.sh ;cat fault.sh
#!/bin/bash
# Runs on transition to MASTER
logfile=/var/log/keepalived-state.log
echo "[Master]" >> $logfile
date >> $logfile

#!/bin/bash
# Runs on transition to BACKUP
logfile=/var/log/keepalived-state.log
echo "[Backup]" >> $logfile
date >> $logfile

#!/bin/bash
# Runs on transition to FAULT
logfile=/var/log/keepalived-state.log
echo "[Fault]" >> $logfile
date >> $logfile
Copy these three scripts and the keepalived configuration file to the backup node, e.g. with scp;
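For example (assuming both nodes keep these files under /etc/keepalived, and that node2 resolves the hostname as shown earlier):
[root@node1 keepalived]# scp keepalived.conf master.sh backup.sh fault.sh node2:/etc/keepalived/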

5. Configure keepalived on node2
Only two changes are needed: set state to BACKUP instead of MASTER, and use a priority lower than 100; everything else stays the same. A sketch of the changed lines follows.
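A minimal sketch of node2's vrrp_instance, assuming a priority of 99 (any value below node1's 100 works):
vrrp_instance VI_1 {
    state BACKUP              # node1 uses MASTER
    interface eth0
    virtual_router_id 51      # must match node1
    priority 99               # anything lower than node1's 100
    advert_int 1
    # authentication, notify_*, track_script and virtual_ipaddress
    # blocks identical to node1
}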

6. Startup analysis
6.1 Start both services on node1 first:
[root@node1 keepalived]# /etc/init.d/httpd start
Starting httpd: [  OK  ]
[root@node1 keepalived]# /etc/init.d/keepalived start
Starting keepalived: [  OK  ]

Once keepalived is running normally there are three keepalived processes: a parent process that supervises the other two (the vrrp child and the healthcheckers child).
Watch the state log:
[root@node1 keepalived]# tail -f /var/log/keepalived-state.log
[Master]
Wed Dec  2 15:34:41 CST 2015
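A quick way to list those three processes is sketched below; the two child PIDs are taken from the log excerpts further down, while the parent PID is purely illustrative:
[root@node1 keepalived]# ps -C keepalived -o pid,ppid,comm
  PID  PPID COMMAND
28959     1 keepalived    <- parent (supervisor)
28960 28959 keepalived    <- healthcheckers child
28962 28959 keepalived    <- vrrp child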

6.2 Start both services on the backup node as well:
[root@node2 keepalived]# /etc/init.d/httpd start
Starting httpd: [  OK  ]
[root@node2 keepalived]# /etc/init.d/keepalived start
Starting keepalived: [  OK  ]

Watch its state log file:
[root@node2 ~]# tail -f /var/log/keepalived-state.log 
[Backup]
Wed Dec  2 15:40:19 CST 2015

Log messages from both nodes:
node1:
[root@node1 keepalived]# tail -f /var/log/messages
Dec  2 15:34:40 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Transition to MASTER STATE
Dec  2 15:34:40 localhost Keepalived_healthcheckers[28960]: TCP connection to [192.168.85.145]:80 failed !!!
Dec  2 15:34:40 localhost Keepalived_healthcheckers[28960]: Removing service [192.168.85.145]:80 from VS [192.168.85.128]:80
Dec  2 15:34:40 localhost Keepalived_healthcheckers[28960]: Remote SMTP server [127.0.0.1]:25 connected.
Dec  2 15:34:40 localhost Keepalived_healthcheckers[28960]: SMTP alert successfully sent.
Dec  2 15:34:41 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Entering MASTER STATE
Dec  2 15:34:41 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) setting protocol VIPs.
Dec  2 15:34:41 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.85.128
Dec  2 15:34:41 localhost Keepalived_healthcheckers[28960]: Netlink reflector reports IP 192.168.85.128 added
Dec  2 15:34:46 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.85.128
Dec  2 15:40:16 localhost Keepalived_healthcheckers[28960]: TCP connection to [192.168.85.145]:80 success.
Dec  2 15:40:16 localhost Keepalived_healthcheckers[28960]: Adding service [192.168.85.145]:80 to VS [192.168.85.128]:80
Dec  2 15:40:16 localhost Keepalived_healthcheckers[28960]: Remote SMTP server [127.0.0.1]:25 connected.
Dec  2 15:40:17 localhost Keepalived_healthcheckers[28960]: SMTP alert successfully sent.

node2:
[root@node2 keepalived]# tail -f /var/log/messages
Dec  2 15:40:19 localhost Keepalived_vrrp[28336]: Opening file '/etc/keepalived/keepalived.conf'.
Dec  2 15:40:19 localhost Keepalived_healthcheckers[28335]: Configuration is using : 14376 Bytes
Dec  2 15:40:19 localhost Keepalived_vrrp[28336]: Configuration is using : 39867 Bytes
Dec  2 15:40:19 localhost Keepalived_vrrp[28336]: Using LinkWatch kernel netlink reflector...
Dec  2 15:40:19 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) Entering BACKUP STATE
Dec  2 15:40:19 localhost Keepalived_vrrp[28336]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Dec  2 15:40:19 localhost Keepalived_healthcheckers[28335]: Using LinkWatch kernel netlink reflector...
Dec  2 15:40:19 localhost Keepalived_healthcheckers[28335]: Activating healthchecker for service [192.168.85.144]:80
Dec  2 15:40:19 localhost Keepalived_healthcheckers[28335]: Activating healthchecker for service [192.168.85.145]:80
Dec  2 15:40:19 localhost Keepalived_vrrp[28336]: VRRP_Script(chk_httpd) succeeded

The logs show the following:
After the master node starts, the VRRP_Script module first runs the chk_httpd check and finds httpd healthy; the healthchecker detects that the other node is not yet reachable and removes it from the virtual server; the instance then transitions to the MASTER role and finally adds the VIP to the system, completing master startup.

After keepalived starts on the backup node, the instance first enters the BACKUP state because of its configured role; it then also runs the VRRP_Script check against httpd and logs "succeeded" if httpd is healthy.

The cluster state at this point:
[root@node2 ~]# ipvsadm -l -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.85.128:80 rr persistent 50
  -> 192.168.85.144:80            Route   3      0          0         
  -> 192.168.85.145:80            Local   1      0          0         

The VIP at this point:
[root@node1 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:97:1c:35 brd ff:ff:ff:ff:ff:ff
    inet 192.168.85.144/24 brd 192.168.85.255 scope global eth0
    inet 192.168.85.128/32 scope global eth0
    inet6 fe80::20c:29ff:fe97:1c35/64 scope link 
       valid_lft forever preferred_lft forever
You can test it in a browser;
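Or from the command line; since persistence_timeout is set to 50, repeated requests from the same client should stick to one real server within that window (this assumes a separate client machine on the 192.168.85.0/24 network, and which test page comes back depends on scheduling):
[root@client ~]# curl http://192.168.85.128
node1.a.com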
7. Analyzing the keepalived failover process

7.1 Stop the httpd service on node1, then check the log:
Dec  2 15:55:51 localhost Keepalived_vrrp[28962]: VRRP_Script(chk_httpd) failed
Dec  2 15:55:51 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Entering FAULT STATE
Dec  2 15:55:51 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) removing protocol VIPs.
Dec  2 15:55:51 localhost Keepalived_healthcheckers[28960]: Netlink reflector reports IP 192.168.85.128 removed
Dec  2 15:55:51 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Now in FAULT state
Dec  2 15:55:53 localhost Keepalived_healthcheckers[28960]: TCP connection to [192.168.85.144]:80 failed !!!
Dec  2 15:55:53 localhost Keepalived_healthcheckers[28960]: Removing service [192.168.85.144]:80 from VS [192.168.85.128]:80
Dec  2 15:55:53 localhost Keepalived_healthcheckers[28960]: Remote SMTP server [127.0.0.1]:25 connected.
Dec  2 15:55:53 localhost Keepalived_healthcheckers[28960]: SMTP alert successfully sent.

As the log shows: after the service is stopped, VRRP_Script detects the failure, the instance enters the FAULT state, and finally the VIP is removed from the node;
[root@node1 ~]# ip addr 
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:97:1c:35 brd ff:ff:ff:ff:ff:ff
    inet 192.168.85.144/24 brd 192.168.85.255 scope global eth0
    inet6 fe80::20c:29ff:fe97:1c35/64 scope link 
       valid_lft forever preferred_lft forever

7.2 The log on node2 at the same time:
Dec  2 15:55:50 localhost Keepalived_healthcheckers[28335]: TCP connection to [192.168.85.144]:80 failed !!!
Dec  2 15:55:50 localhost Keepalived_healthcheckers[28335]: Removing service [192.168.85.144]:80 from VS [192.168.85.128]:80
Dec  2 15:55:50 localhost Keepalived_healthcheckers[28335]: Remote SMTP server [127.0.0.1]:25 connected.
Dec  2 15:55:50 localhost Keepalived_healthcheckers[28335]: SMTP alert successfully sent.
Dec  2 15:55:52 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) Transition to MASTER STATE
Dec  2 15:55:53 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) Entering MASTER STATE
Dec  2 15:55:53 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) setting protocol VIPs.
Dec  2 15:55:53 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.85.128
Dec  2 15:55:53 localhost Keepalived_healthcheckers[28335]: Netlink reflector reports IP 192.168.85.128 added
Dec  2 15:55:58 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.85.128

After the MASTER node fails, the backup node detects it, removes the failed node from the virtual server, enters the MASTER state itself, and takes over the VIP from the original MASTER;
[root@node2 ~]# ipvsadm -l -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.85.128:80 rr persistent 50
  -> 192.168.85.145:80            Local   1      0          0         

[root@node2 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:62:eb:c2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.85.145/24 brd 192.168.85.255 scope global eth0
    inet 192.168.85.128/32 scope global eth0
    inet6 fe80::20c:29ff:fe62:ebc2/64 scope link 
       valid_lft forever preferred_lft forever

8. Analyzing failure recovery
Start the httpd service on node1 again and watch the log:
Dec  2 16:02:05 localhost Keepalived_healthcheckers[28960]: TCP connection to [192.168.85.144]:80 success.
Dec  2 16:02:05 localhost Keepalived_healthcheckers[28960]: Adding service [192.168.85.144]:80 to VS [192.168.85.128]:80
Dec  2 16:02:05 localhost Keepalived_healthcheckers[28960]: Remote SMTP server [127.0.0.1]:25 connected.
Dec  2 16:02:05 localhost Keepalived_healthcheckers[28960]: SMTP alert successfully sent.
Dec  2 16:02:05 localhost Keepalived_vrrp[28962]: VRRP_Script(chk_httpd) succeeded
Dec  2 16:02:06 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) prio is higher than received advert
Dec  2 16:02:06 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Transition to MASTER STATE
Dec  2 16:02:06 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
Dec  2 16:02:07 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Entering MASTER STATE
Dec  2 16:02:07 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) setting protocol VIPs.
Dec  2 16:02:07 localhost Keepalived_healthcheckers[28960]: Netlink reflector reports IP 192.168.85.128 added
Dec  2 16:02:07 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.85.128
Dec  2 16:02:12 localhost Keepalived_vrrp[28962]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.85.128

After httpd is started on node1 again, the node automatically switches back to the MASTER state, reclaims the cluster resources, and binds the VIP to eth0 once more;
[root@node1 ~]# ipvsadm -l -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.85.128:80 rr persistent 50
  -> 192.168.85.144:80            Local   3      1          0         
  -> 192.168.85.145:80            Route   1      0          0         

[root@node1 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:97:1c:35 brd ff:ff:ff:ff:ff:ff
    inet 192.168.85.144/24 brd 192.168.85.255 scope global eth0
    inet 192.168.85.128/32 scope global eth0
    inet6 fe80::20c:29ff:fe97:1c35/64 scope link 
       valid_lft forever preferred_lft forever

The log on node2:
Dec  2 16:02:06 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) Received higher prio advert
Dec  2 16:02:06 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) Entering BACKUP STATE
Dec  2 16:02:06 localhost Keepalived_vrrp[28336]: VRRP_Instance(VI_1) removing protocol VIPs.
Dec  2 16:02:06 localhost Keepalived_healthcheckers[28335]: Netlink reflector reports IP 192.168.85.128 removed
Dec  2 16:02:08 localhost Keepalived_healthcheckers[28335]: TCP connection to [192.168.85.144]:80 success.
Dec  2 16:02:08 localhost Keepalived_healthcheckers[28335]: Adding service [192.168.85.144]:80 to VS [192.168.85.128]:80
Dec  2 16:02:08 localhost Keepalived_healthcheckers[28335]: Remote SMTP server [127.0.0.1]:25 connected.
Dec  2 16:02:09 localhost Keepalived_healthcheckers[28335]: SMTP alert successfully sent.

Once node2 sees that the master node has recovered, it releases the cluster resources and returns to the BACKUP state; the whole system is back to its normal master/backup operation;
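Incidentally, this automatic failback is keepalived's default preemption behavior. If the VIP should instead stay on node2 until node2 itself fails, keepalived's nopreempt option can be used; a sketch (nopreempt only takes effect when state is BACKUP on both nodes, with priority still deciding the initial owner):
vrrp_instance VI_1 {
    state BACKUP          # both nodes start as BACKUP when using nopreempt
    nopreempt             # a recovering higher-priority node will not preempt
    priority 100          # keep the higher value on the preferred node
    # remaining settings as in the configuration above
}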

For reference, the output produced by the notify scripts over the whole process:
[root@node1 keepalived]# cat /var/log/keepalived-state.log 
[Master]
Wed Dec  2 15:34:41 CST 2015
[Fault]
Wed Dec  2 15:55:51 CST 2015
[Master]
Wed Dec  2 16:02:07 CST 2015

[root@node2 ~]# cat /var/log/keepalived-state.log 
[Backup]
Wed Dec  2 15:40:19 CST 2015
[Master]
Wed Dec  2 15:55:53 CST 2015
[Backup]
Wed Dec  2 16:02:06 CST 2015

9. Monitoring cluster resources with vrrp_script
The vrrp_script block is dedicated to monitoring resources in the cluster (it is used together with the track_script block). It can reference a monitoring script, a command combination, or a shell statement, which makes it possible to monitor services, ports, and more. The track_script block is what actually pulls a vrrp_script definition into a VRRP instance so that keepalived performs the check;
A vrrp_script block can also set parameters such as the check interval and a weight. Combining vrrp_script with track_script lets keepalived monitor a resource and adjust the instance priority accordingly, which in turn drives the master/backup switchover (a weight sketch follows);
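For instance, a sketch of the weight mechanism: with a negative weight, a failing check lowers this node's effective priority instead of forcing the FAULT state, so a backup whose priority is now higher takes over:
vrrp_script chk_httpd {
    script "killall -0 httpd"
    interval 2
    weight -30    # subtract 30 from the priority while the check fails
}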

9.1 Probing service state with killall
This method relies on the killall command. killall sends a signal to every process running the named command; if no signal name is given it sends SIGTERM (signal 15), which ends the program normally. Here we use signal 0 instead, which does not terminate anything: it merely probes the process's state. If the process has exited or died abnormally, killall returns exit status 1; if the process is running normally, it returns exit status 0;
For example:
vrrp_script check_httpd {
    script "killall -0 httpd"
    interval 2
}
track_script {
    check_httpd
}
This defines a monitor named check_httpd whose check is "killall -0 httpd"; interval is the time between checks in seconds;
When httpd is running normally:
[root@localhost ~]# killall -0 httpd
[root@localhost ~]# echo $?
0
When httpd is stopped:
[root@localhost ~]# killall -0 httpd
httpd: no process killed
[root@localhost ~]# echo $?
1

9.2 Checking a port
To probe a port on the local machine:
vrrp_script check_httpd {
    script "</dev/tcp/127.0.0.1/80"
    interval 2
    fall 2
    rise 1
}
track_script {
    check_httpd
}
In this example, the shell redirection </dev/tcp/127.0.0.1/80 defines a check of port 80 on the local machine (opening the pseudo-device fails if nothing is listening). fall means the node is considered faulty after 2 consecutive failed checks, which triggers a switchover; rise means a single successful check is enough to consider the node's resource healthy again;
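The redirection can be tried directly at the prompt; note that /dev/tcp is a bash feature, so the assumption here is that keepalived invokes the script through a shell that supports it. With httpd listening on port 80, the exit status is 0:
[root@localhost ~]# </dev/tcp/127.0.0.1/80
[root@localhost ~]# echo $?
0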

9.3 Monitoring state with a shell statement
A shell statement can be referenced directly in the vrrp_script block:
vrrp_script check_httpd {
    script "if [ -f /var/run/httpd/httpd.pid ]; then exit 0; else exit 1; fi"
    interval 2
    fall 1
    rise 1
}
track_script {
    check_httpd
}
In this example a shell if-statement checks whether the httpd.pid file exists; if it does (httpd is running), the state is considered normal, otherwise it is considered faulty;

9.4 Monitoring a service with a script
vrrp_script chk_mysqld {
    script "/etc/keepalived/mysqld.sh"
    interval 2
}
track_script {
    chk_mysqld
}
The script itself:
#!/bin/bash
# Check MySQL health by logging in and running a simple query
mysql=/usr/bin/mysql
mysql_host=localhost
mysql_user=root
mysql_password='redhat'

# Note: no space after -p, otherwise mysql prompts for a password and
# treats $mysql_password as a database name
$mysql -h $mysql_host -u $mysql_user -p$mysql_password -e "show status;" > /dev/null 2>&1
if [ $? -eq 0 ]; then
    mysql_status=0
else
    mysql_status=1
fi
exit $mysql_status
This shell script checks whether MySQL is running normally by logging into the database and executing a query. If the check succeeds the script exits with status 0, otherwise with status 1;
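One step worth calling out (an assumption; the original does not show it): make the script executable and give it a dry run before wiring it into keepalived, which should print 0 while MySQL is up:
[root@node1 ~]# chmod +x /etc/keepalived/mysqld.sh
[root@node1 ~]# /etc/keepalived/mysqld.sh; echo $?
0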

Summary: as you can see, the vrrp_script block does not care how the monitoring script or command is implemented; it only looks at the exit status to decide whether the cluster service is healthy. When writing your own monitoring script, this is the only convention you need to follow;

A detailed write-up of keepalived itself is at: http://blog.chinaunix.net/uid-30212356-id-5548409.html
Reference:
高性能Linux服务器构建实战 (高俊峰), Chapter 11